Efficient common-core lossless and lossy image coder based on integer wavelets


Signal Processing 81 (2001) 403-408

Marco Grangetto, Enrico Magli, Gabriella Olmo*
Dipartimento di Elettronica, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy
Received 6 December 1999; received in revised form 13 July 2000

Abstract

The integer wavelet transform (IWT), applied to lossy image compression, yields rate-distortion performance that is generally slightly inferior to that of the discrete wavelet transform (DWT). In this paper, we propose a simple method which, based on an analysis of the IWT signal representation mechanism, is able to raise the IWT performance up to the DWT level. The method can be profitably employed for the implementation of a truly common-core lossless and lossy compression algorithm, with negligible loss of performance with respect to the classical, real-valued transform. © 2001 Elsevier Science B.V. All rights reserved.

Keywords: Image coding; Integer wavelets; Lifting scheme

1. Introduction

Recently, the integer wavelet transform (IWT) has received considerable attention from the image processing community. The IWT can be obtained through a straightforward modification of the discrete wavelet transform (DWT) implemented by means of the well-known lifting scheme (LS) [17]. Besides the reduced computational complexity, the LS exhibits the major advantage of allowing the design of a unified lossless/lossy compression algorithm.



The authors are with the Signal Analysis and Simulation group at Politecnico di Torino. URL: www.helinet.polito.it/sasgroup. * Corresponding author. Tel.: +39-011-564-4094; fax: +39-011-564-4149. E-mail addresses: [email protected] (M. Grangetto), [email protected] (E. Magli), [email protected] (G. Olmo).

The lossy and lossless performances of IWT-based algorithms have already been studied in the literature [1,3-5,7,10,11,16], showing that the IWT can be profitably exploited for image compression. In [11] it is shown that, by applying proper context modelling in the IWT domain, the compression performance of a well-known lossless coder such as CALIC can be approached. In [14,16] the JPEG2000 VM 0 bitplane coder and the popular SPIHT algorithm are compared with LOCO-I and CALIC, showing that progressive transmission up to the lossless level can be achieved at the expense of a very slight impairment in the compression ratio. It is worth noting that both of these papers address filters which are now recognized not to be optimal; a margin for further gain is therefore available simply by changing the IWT filters. In [1] a thorough analysis of many filters for both lossless and lossy compression is presented. In the lossy case,

0165-1684/01/$ - see front matter © 2001 Elsevier Science B.V. All rights reserved. PII: S0165-1684(00)00216-4


a comparison between the DWT and the IWT shows that the IWT exhibits slightly worse performance in terms of peak signal-to-noise ratio (PSNR) on a set of standard test images. This fact has both theoretical and practical implications: it suggests that the DWT and the IWT are ruled by different mechanisms, which deserve further study.

In the present paper, a modification of the IWT is proposed, based on an understanding of its signal representation mechanism, which allows the design of a common-core algorithm with lossy compression performance almost equal to that of the DWT. In Section 2, the IWT signal representation mechanism is addressed, and a discussion of the iterated graphic function is carried out, leading to a method able to improve the IWT compression performance up to the DWT level. In Section 3, the proposed common-core algorithm based on the IWT is described, and simulation results are presented along with a discussion of the possible applications and advantages. In Section 4, some implementation issues are dealt with, and in Section 5, conclusions are drawn.

2. Analysis of the IWT signal representation

The LS is based on a factorization of the polyphase matrix of the analysis and synthesis filters [6]. This factorization leads to an implementation consisting of a cascade of blocks, each equivalent to a single step of the filter bank scheme; reconstruction is achieved by performing the same steps in reverse order. Moreover, truncating each filter output to an integer value just before adding or subtracting yields a truly reversible integer transform [4].

In the following, we analyse the IWT signal representation mechanism, so as to gain a better insight into the shortcomings of the transform and to propose suitable countermeasures. To this end, let us recall the i-times iterated graphic function $\psi^{(i)}(t)$ related to the synthesis bandpass filter $G^{(i)}(z)$ equivalent to i levels of decomposition [12,19]:

$$\psi^{(i)}(t) = 2^{i/2}\, g^{(i)}[n] \quad \text{for} \quad \frac{n}{2^i} \le t < \frac{n+1}{2^i}.$$

This function can be generated by feeding a unit impulse $h[n] = \delta[n]$ into the synthesis scheme; it is known that $\psi^{(i)}(t)$ converges to the synthesis wavelet $\psi(t)$ as $i \to \infty$. In the case of the filter bank scheme, in the limit for $i \to \infty$, the iterated impulse responses generate the synthesis basis, and the recovered signal is built through a linear combination of these basis functions weighted by the wavelet coefficients:

$$x(t) = \sum_{m} \sum_{n} w_{m,n}\, \psi_{m,n}(t) \qquad (1)$$

with $\psi_{m,n}(t) = 2^{-m/2}\, \psi(2^{-m} t - n)$. The regularity of the basis functions is a requirement for a good coding scheme [2]. In the LS framework this still holds for the real-valued transform, which is equivalent to the filter bank scheme.

Due to the truncation operation, the IWT cannot be thoroughly described in terms of its iterated graphic functions $\psi^{(i)}(t)$, as the latter depend on the amplitude of the input signal. As the difference between the DWT and the IWT is due to the truncation error, which at each step is bounded by ±0.5, a comparison between the real-valued and integer transforms can be made in terms of the iterated responses of the respective synthesis banks to an amplified input impulse $h_A[n] = A\,\delta[n]$. The graphic functions $\psi_{D,A}^{(i)}(t)$ and $\psi_{I,A}^{(i)}(t)$, obtained as the i-times iterated responses of the DWT and IWT synthesis banks, respectively, to $h_A[n]$, can be defined; the equality $\psi_{D,A}^{(i)}(t) = A\,\psi^{(i)}(t)$ holds for the real-valued transform, which is taken as the term of comparison for the integer one. This can be formalized by introducing the following $L^2$ norm:

$$E_A^{(i)} = \left\| A\,\psi^{(i)}(t) - \psi_{I,A}^{(i)}(t) \right\|_{L^2}$$

which represents the MSE between the amplified iterated function $A\,\psi^{(i)}(t)$ and the IWT response $\psi_{I,A}^{(i)}(t)$ to an amplified input impulse, and can be taken as a measure of "how much non-linear" an IWT is.

The behaviour of $E_A^{(i)}$ has been analysed for several values of i and A, and for all of the most popular wavelet filters for image compression; the results reported in this paper are obtained with the DB(9,7) wavelet [18], also selected for the JPEG2000 compression standard [8]. It has turned out that, for small values of A, $E_A^{(i)}$ is large, i.e. the functions $\psi_{I,A}^{(i)}(t)$ are highly oscillating and non-smooth; this is represented in Fig. 1(a), where the iterated graphic functions are reported for the DWT, and


Fig. 1. (a) Iterated graphic functions (i = 3) for the DWT (solid) and the IWT (low input amplification of 10, dashdot); (b) DWT (solid) and IWT (high input amplification of 64, dashdot).

for the IWT with low amplification (A = 10). This behaviour is due to the fact that, with small A, the filter outputs before truncation have values comparable with the truncation error (i.e. ±0.5), and this error propagates through subsequent lowpass iterations. Although this has no consequences for perfect reconstruction, as the non-linear operations are exactly compensated for at the synthesis side, it leads to the poorer compression performance in the lossy case noticed in [1,16]. The quantization of the $w_{m,n}$ coefficients of Eq. (1) produces artifacts, whose effects are particularly harmful on regular images. This behaviour is depicted in Fig. 2, and will be further commented on in Section 3.

Smoother $\psi_{I,A}^{(i)}(t)$ functions are achieved for increasing values of A, as in Fig. 1(b), where the iterated graphic functions are reported for the DWT and for the IWT with high amplification (A = 64). In general, it has turned out that increasing A makes the set of synthesis functions converge to the synthesis basis of the real-valued transform; in other words, $E_A^{(i)} \to 0$ for $A \to \infty$. With this procedure, the relative truncation errors between the polyphase representations of the IWT and the DWT are made arbitrarily low, i.e. the IWT is lifted towards a quasi-linear working point, and the integer and the real-valued versions of the transform tend to coincide. Consequently, the compression performance of the IWT tends to reach


Fig. 2. (a) Original synthetic image; (b) reconstruction from the DWT coefficients (PSNR = 32.62 dB at 0.75 bpp); (c) reconstruction from the IWT (PSNR = 18.65 dB at 0.75 bpp); (d) reconstruction from the IWT with 64 times amplified input (PSNR = 32.41 dB at 0.75 bpp).

asymptotically that of the DWT, as further discussed in Section 3. The IWT and the DWT coincide only asymptotically for filters with irrational coefficients, e.g. DB(9,7). In the case of rational filters, for example those of the interpolating family, there is indeed a finite value of A allowing formal equality between the DWT and the IWT; in this case, by choosing a value of A equal to the least common denominator of all the filter taps, the filtered coefficients are integers and the IWT is no longer affected by rounding operations.
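The behaviour described in this section can be reproduced numerically on a small example. The Python sketch below is only illustrative and rests on assumptions that are not part of the paper: it uses the rational (5,3) lifting filter with periodic extension in place of DB(9,7), rounds to the nearest integer with ties rounded up, and normalises the error by A. All function names are hypothetical.

```python
import numpy as np

def round_half_up(v):
    """Rounding used by the integer transform (assumed: nearest integer, ties up)."""
    return np.floor(v + 0.5)

def identity(v):
    """No rounding: the real-valued lifting transform."""
    return v

def inverse_level(s, d, rnd):
    """One level of inverse (5,3) lifting with periodic extension."""
    even = s - rnd((np.roll(d, 1) + d) / 4)        # undo the update step
    odd = d + rnd((even + np.roll(even, -1)) / 2)  # undo the predict step
    x = np.empty(2 * len(s))
    x[0::2] = even
    x[1::2] = odd
    return x

def iterated_response(amplitude, levels, rnd, n0=8):
    """Response of `levels` synthesis stages to an impulse in the detail band."""
    s = np.zeros(n0)
    d = np.zeros(n0)
    d[n0 // 2] = amplitude
    x = inverse_level(s, d, rnd)
    # Further iterations feed the result into the lowpass branch only.
    for _ in range(levels - 1):
        x = inverse_level(x, np.zeros_like(x), rnd)
    return x

def relative_error(A, levels=3):
    """Norm of A*psi - psi_I, divided by A (the normalisation is an assumption)."""
    linear = A * iterated_response(1.0, levels, identity)
    integer = iterated_response(float(A), levels, round_half_up)
    return np.linalg.norm(linear - integer) / A

for A in (1, 10, 64):
    print(A, relative_error(A))
```

Because the (5,3) taps are dyadic rationals, the error in this sketch vanishes once A is a large enough power of two (A = 64 suffices for three levels), which mirrors the remark on rational filters; for a filter with irrational coefficients such as DB(9,7) the error would only be expected to decrease.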

3. The proposed scheme and its compression performance

The previous discussion also suggests the path to be followed in order to design an efficient IWT-based image coder. The key observation is that the IWT image representation, which depends on the dynamic range of the input image, can be made more regular and input-independent by means of proper amplification of the input image. From this consideration stems the idea of an encoder capable of both the lossless and lossy options, which shares, as a common core, the IWT processor followed by a suitable encoder. The proposed scheme, represented in Fig. 3, embeds both the DWT and IWT capabilities without directly


Fig. 3. Proposed lossless/lossy compression scheme. (a): lossless; (b): lossy.

employing the floating point transform. In fact, by means of input amplification, the IWT is able to achieve the same performance as the DWT. On the other hand, the IWT processor exhibits a noticeable advantage in terms of memory requirements with respect to the DWT, as partial transform coefficients are integers and can be stored in a limited number of bits. Moreover, the filtering operations can be performed with faster and more efficient routines or architectures, since the filter input samples are always integers [9]. The IWT processor represents a flexible and efficient core which can be exploited in several ways:

• Followed by a suitable encoder, it is able to yield progressive transmission up to the lossless level at the expense of a performance loss which, as already mentioned, can be kept very limited by means of a proper choice of filters. In this framework, the input amplification is not used, as it would increase the number of bits needed for lossless encoding by a factor $\lceil \log_2 A \rceil$.

• If the lossless option is not required, the performance loss with respect to the DWT can be avoided by means of a proper input amplification. Obviously, on increasing the amplification, the computational advantages of the IWT in terms of memory and complexity are progressively reduced. However, the real gain of the proposed architecture is represented by the possibility of employing the same computational core for both the IWT and a sufficiently accurate DWT, according to the application requirements.

• Another interesting application is represented by those situations in which an a priori selection among different trade-offs between quality levels and memory/computational requirements can be made (e.g. wireless multimedia access or compression tools for digital cameras). In this case, a preselection of the input amplification can be performed and the algorithm can be inexpensively adapted to the particular situation.

The above-discussed concepts are represented in Fig. 3, where the common core consists of the IWT processor followed by the popular SPIHT encoder [13]. The latter is a modified version of Shapiro's popular zerotree encoder [15] and has been selected as it is simple and efficient and allows for both lossless and lossy progressive transmission of the wavelet coefficients. However, it is worth remarking that the choice of SPIHT is not crucial; it has been made in order to obtain meaningful results. The performance of the same scheme, employing the DWT, has been used as the term of comparison.

Two situations have been considered, namely the IWT performance with natural and with synthetic images. Simulations have been carried out on a large image set in order to evaluate the performance of the amplified IWT. In Fig. 4 the peak signal-to-noise ratio (PSNR) achieved at a bit rate of 0.5 bpp in the case of a natural image is plotted as a function of A, and compared to the PSNR achieved by the DWT; it can be seen that the curves tend to coincide asymptotically. This behaviour confirms that the IWT performance can be made arbitrarily close to that of the DWT by amplifying the input signal. It is worth noting that the values of A needed to fill the gap between the IWT and the DWT are typically between 1 and 20, depending on the filter employed.

Another case of interest arises in the compression of low-entropy images. In Fig. 2 the images recovered from the IWT and DWT coefficients are reported for a synthetic image with low entropy. It can be seen that input amplification yields a


Fig. 4. Comparison of PSNR for the DWT (solid) and the IWT (dashdot) at 0.5 bpp.

significant increase in image quality, which is in this case far more evident than for natural images. This shows that, besides filling the moderate gap between the IWT and the DWT in the compression of natural images, the proposed method can avoid the dramatic loss of performance for low-entropy images. In this case, typical values of A range from 32 to 64.
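The lossy branch of the scheme (amplify the input, apply the IWT, encode, decode, invert, and divide by A) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the coder used for the reported results: it works on 1-D signals whose length is a multiple of 2^levels, uses the (5,3) lifting filter with periodic extension rather than DB(9,7), replaces SPIHT with a plain uniform quantizer, and scales the quantizer step with A so that different amplifications are roughly comparable. The names `code_decode`, `iwt` and `iiwt` are hypothetical.

```python
import numpy as np

def _round(v):
    # Round to nearest integer, ties up (assumed rounding rule).
    return np.floor(v + 0.5)

def forward_level(x):
    """One level of integer (5,3) lifting with periodic extension."""
    even, odd = x[0::2], x[1::2]
    d = odd - _round((even + np.roll(even, -1)) / 2)   # predict step
    s = even + _round((np.roll(d, 1) + d) / 4)         # update step
    return s, d

def inverse_level(s, d):
    """Exact inverse of forward_level: same steps in reverse order."""
    even = s - _round((np.roll(d, 1) + d) / 4)
    odd = d + _round((even + np.roll(even, -1)) / 2)
    x = np.empty(2 * len(s))
    x[0::2], x[1::2] = even, odd
    return x

def iwt(x, levels):
    s, details = np.asarray(x, dtype=float), []
    for _ in range(levels):
        s, d = forward_level(s)
        details.append(d)
    return s, details

def iiwt(s, details):
    for d in reversed(details):
        s = inverse_level(s, d)
    return s

def code_decode(x, A=1, step=1, levels=3):
    """Amplify by A, transform, quantize (stand-in for the encoder), invert."""
    s, details = iwt(A * np.asarray(x, dtype=float), levels)
    q = step * A  # assumption: the step follows the amplification
    quant = lambda c: q * np.round(c / q)
    return iiwt(quant(s), [quant(d) for d in details]) / A

def psnr(x, y, peak=255.0):
    mse = np.mean((np.asarray(x, dtype=float) - y) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)

x = np.arange(64.0)
print(np.array_equal(code_decode(x, A=1, step=1), x))   # lossless path
for A in (1, 16, 64):
    print(A, psnr(x, code_decode(x, A=A, step=8)))       # lossy path
```

With A = 1 and a unit step the quantizer leaves the integer coefficients untouched, so the same core gives exact reconstruction; the lossy figures printed by the loop depend on the test signal and on the quantizer assumption, and are not meant to reproduce Fig. 4.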


4. Implementation issues

As already stated, the major advantage of the proposed scheme is the possibility of employing the same algorithm core in practical situations with different features and constraints. Moreover, the common core, represented by the IWT processor, presents many advantages when practical realizations are addressed. In the following, some promising implementation issues are discussed for both software and architectural design, and the previously discussed advantages are exploited for DSP and VLSI implementations.

The IWT's limited memory requirement for the storage of partial results greatly reduces hardware resources. This advantage is especially relevant for VLSI implementation, where the design aims at the optimization of resources. Another already mentioned feature is the possibility of designing fast filtering operations. In fact, the IWT can be decomposed into a cascade of FIR filters with real coefficients and integer input samples. This allows for the design of routines optimized in terms of execution time and memory requirements. This consideration also heavily impacts VLSI architectures. In this case, the filters must be mapped onto multiply and accumulate (MAC) units that, given the particular kind of multiplications to be performed, can achieve good performance in terms of area, latency and speed.

Finally, besides impacting very positively on compression performance, the input amplification opens promising implementation perspectives, which are discussed in the following. The IWT-based algorithm embeds the DWT capabilities, provided that a proper amplification is selected. This possibility represents an efficient way of implementing a DWT/IWT transform stage that does not require any branches (the rounding operations are always performed). This choice, in fact, overcomes the natural implementation, which would require branches (i.e. lower efficiency) in order to round the filter output. The advantage of using an algorithm without conditional statements is even more evident in the case of parallel or pipelined architectures, where code jumps limit the efficiency of the processor architecture.

5. Conclusion

In this paper we have investigated the properties of integer wavelets related to lossy image compression. We have shown that the IWT signal representation can be improved by means of an amplification of the input signal; this enables the design of an image coder capable of filling the performance gap between the IWT and the DWT. In particular, we have shown that the proposed scheme, fully based on the IWT, nearly achieves the performance of the DWT in lossy compression, for both natural and synthetic images. Implementation topics and the possibility of exploiting the error norm $E_A^{(i)}$ for the problem of finding optimum factorizations of the polyphase matrix of wavelet filters are currently under research by the authors.
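As a side note to Section 4, a lifting step with a real coefficient, integer input samples and branch-free rounding can be sketched as follows. This is a minimal illustration under assumptions that do not come from the paper: the coefficient is stored in fixed point with 12 fractional bits (a hypothetical choice), rounding is to the nearest integer with ties rounded up, and the value of ALPHA is the commonly quoted first lifting coefficient of the 9/7 filter. The name `mac_round` is hypothetical.

```python
import math

FRACTION_BITS = 12                 # assumed fixed-point precision
ALPHA = -1.586134342               # commonly quoted first 9/7 lifting coefficient
ALPHA_FIXED = round(ALPHA * (1 << FRACTION_BITS))  # real coefficient as an integer

def mac_round(left, right, coef_fixed=ALPHA_FIXED, k=FRACTION_BITS):
    """Rounded lifting contribution coef * (left + right) on integer samples.

    One multiply-accumulate followed by add-half and arithmetic shift;
    no conditional statement is needed, also for negative values.
    """
    acc = coef_fixed * (left + right)
    return (acc + (1 << (k - 1))) >> k   # equals floor(acc / 2**k + 0.5)

# Example: contribution of two neighbouring even samples to an odd sample.
print(mac_round(100, 104))
print(math.floor(ALPHA_FIXED * (100 + 104) / 2 ** FRACTION_BITS + 0.5))
```

Both printed values coincide, since the shift of a Python integer rounds towards minus infinity; in a hardware or DSP implementation the same add-and-shift would rely on an arithmetic right shift.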


References

[1] M.D. Adams, F. Kossentini, Reversible integer-to-integer wavelet transforms for image compression: performance evaluation and analysis, IEEE Trans. Image Process. 9, 1972-1977.
[2] M. Antonini, M. Barlaud, P. Mathieu, I. Daubechies, Image coding using wavelet transform, IEEE Trans. Image Process. 1 (April 1992) 205-230.
[3] A. Bilgin, P. Sementilli, F. Sheng, M. Marcellin, Scalable image coding using reversible integer wavelet transforms, IEEE Trans. Image Process. 9, 1010-1024.
[4] R.C. Calderbank, I. Daubechies, W. Sweldens, B. Yeo, Wavelet transforms that map integers to integers, Appl. Comput. Harmonic Anal. 5 (3) (1998) 332-369.
[5] H. Chao, P. Fisher, Z. Hua, An approach to integer wavelet transformations for lossless image compression, http://www.infinop.com/infinop/html/whitepaper.html.
[6] I. Daubechies, W. Sweldens, Factoring wavelet transforms into lifting steps, J. Fourier Anal. Appl. 4 (3) (1998) 247-269.
[7] S. Dewitte, J. Cornelis, Lossless integer wavelet transform, IEEE Signal Process. Lett. 4 (6) (June 1997) 158-160.
[8] Document ISO/IEC CD 15444-1: 1999 (V1.0, 9 December 1999), available at URL www.jpeg.org.
[9] M. Grangetto, E. Magli, M. Martina, G. Olmo, Optimization and implementation of the integer wavelet transform for image coding, IEEE Trans. Image Process. (February 2000) submitted for publication.
[10] C. Lin, B. Zhang, Y. Zheng, Packed integer wavelet transform constructed by lifting scheme, Proceedings of ICASSP 1999.
[11] N. Memon, X. Wu, B.-L. Yeo, Entropy coding techniques for lossless image compression with reversible integer wavelet transform, Proceedings of ICIP 1998, Chicago, IL, 1998.
[12] O. Rioul, M. Vetterli, Wavelets and signal processing, IEEE Signal Process. Mag. 8 (4) (October 1991) 14-38.
[13] A. Said, W.A. Pearlman, A new, fast, and efficient image codec based on set partitioning in hierarchical trees, IEEE Trans. Circuits Systems Video Technol. 6 (3) (June 1996) 243-250.
[14] A. Said, W.A. Pearlman, An image multiresolution representation for lossless and lossy compression, IEEE Trans. Image Process. 5 (September 1996) 1303-1310.
[15] J.M. Shapiro, Embedded image coding using zerotrees of wavelet coefficients, IEEE Trans. Signal Process. 41 (12) (1993) 3445-3462.
[16] F. Sheng, A. Bilgin, P.J. Sementilli, M.W. Marcellin, Lossy and lossless image compression using reversible integer wavelet transforms, Proceedings of ICIP 1998, Chicago, IL, 1998.
[17] W. Sweldens, The lifting scheme: a construction of second generation wavelets, SIAM J. Math. Anal. 29 (2) (1997) 511-546.
[18] J.D. Villasenor, B. Belzer, J. Liao, Wavelet filter evaluation for image compression, IEEE Trans. Image Process. 4 (8) (August 1995) 1053-1060.
[19] M. Vetterli, J. Kovačević, Wavelets and Subband Coding, Prentice-Hall PTR, Englewood Cliffs, NJ, 1995.