Image and Vision Computing 14 (1996) 365-371

Image compression using non-stationary and inhomogeneous multiresolution analyses

Andreas Uhl*

Research Institute for Softwaretechnology, Hellbrunnerstr. 34, A-5020 Salzburg, Austria
Received 22 July 1994; revised 14 November 1995
Abstract

New methods for lossy image compression based on wavelet theory are introduced. Unlike the classical wavelet decomposition scheme, it is possible to use different scaling and wavelet filters at every scale by means of non-stationary multiresolution analyses. For the bidimensional case, inhomogeneous multiresolution analyses using different wavelet filters for the two variables are introduced. Furthermore, these two methods are combined. All this freedom is used for compact image coding: the idea is to build, out of the filters in a library, the particular non-stationary and/or inhomogeneous multiresolution analysis that is best suited for a given image in the context of compact coding (in the sense of optimizing certain cost functions). To identify the optimal filter combination, an adaptive algorithm of low complexity is developed. The usefulness of these new schemes for image compression is demonstrated.

Keywords: Image compression; Wavelet theory; Multiresolution analysis
* Email: [email protected]

1. Introduction

Image compression methods that use wavelet transforms (which are based on multiresolution analysis (MRA)) have been successful in providing high rates of compression while maintaining good image quality. In the classical wavelet decomposition (and also subband coding) scheme, one uses a set of well chosen filters to perform a convolution followed by a decimation from fine to coarse scales. Commonly, these filters do not change from one scale level to the next. Since all the transformations at each level are performed independently, it is possible to use different filters (which correspond to different scaling and wavelet functions) at every scale. This non-stationary MRA (NSMRA) was recently introduced [1] in a more theoretical context (where it is called non-stationary multiscale analysis). For the bidimensional case several extensions are presented: (1) The straightforward two-dimensional version of a NSMRA. (2) Inhomogeneous multiresolution analysis (IMRA), in
which different filters are used for the two variables of the coordinate axes. (3) A very natural extension of both schemes is a simple combination of (1) and (2), resulting in an inhomogeneous non-stationary multiresolution analysis (INSMRA). All this freedom in choosing different filters is exploited to adapt the filter combination in an optimal way for image compression. At this point the question arises of how to choose the best suited filters for the different scales and directions. We want to choose the filters out of a given library that are best suited for a given image in the context of compression by minimizing certain cost functions. Among other cost functions, information cost functions are used, by analogy to those for best basis selection in the wavelet packet case [2-4]. A fast algorithm of complexity O(N) for choosing the (for our purposes) optimal filters and performing the decomposition is introduced; we call this algorithm 'best filter selection', because it is an adaptive algorithm. This search procedure is the reverse of that in the wavelet packet case (we search top-down in the wavelet decomposition scheme). Finally, we present the results of some experiments
using our concepts for a lossy image compression scheme, and compare them with the results obtained by a classical wavelet decomposition scheme. For completeness, we mention the work in Ref. [5]: it is related in using different filters at each scale for a wavelet-like image compression scheme, but our approach offers a much lower computational complexity and a cheaper coding of the filters.

2. Wavelet image coding

Wavelet theory may be explained from a purely mathematical point of view, where the main emphasis lies on the underlying MRA [6]. Despite using the MRA terminology, we describe wavelet techniques from a mainly signal and image processing point of view. A wide variety of wavelet-based image compression schemes have been reported in the literature, ranging from simple entropy coding to more complex techniques such as vector quantization [7], adaptive transforms [5,8], tree encoding [9] and edge-based coding [10]. In all these schemes (including that introduced here), compression is accomplished by applying a wavelet transform to decorrelate the image data, quantizing the resulting transform coefficients (this is where the actual lossy compression takes place), and coding the quantized values. In this paper, we restrict our attention to the optimization of the transform part.

The fast wavelet transform (which is used in signal and image processing) can be efficiently implemented by a pair of appropriately designed Quadrature Mirror Filters (QMF). Therefore, wavelet-based image compression can be viewed as a form of subband coding [11]. A 1D wavelet transform of a signal s is performed by convolving s with both QMFs and downsampling by 2; since s is finite, one must make some choice about what values to pad the extensions with [12]. This operation decomposes the original signal into two frequency bands (called subbands), which are often denoted coarse scale approximation and detail signal. The same procedure is then applied recursively to the coarse scale approximations several times (see Fig. 1). The classical 2D transform is performed by two separate 1D transforms along the rows and the columns of the image data, resulting at each decomposition step in a low pass image (the coarse scale approximation) and three detail images (see Fig. 1). For more details see, for example, Ref. [13].

Fig. 1. 1D and 2D wavelet decomposition.

3. Generalized 2D wavelet decompositions
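As a concrete reference before the generalizations are introduced, the 1D and 2D analysis steps of Section 2 can be sketched as follows. This is a minimal illustration, not the paper's implementation; the Haar filterpair and the periodic boundary extension are arbitrary choices here (the paper only requires that some extension choice is made, cf. Ref. [12]).

```python
from math import sqrt

def analyze_1d(signal, h, g):
    # One 1D analysis step: convolution with the low pass filter h and the
    # high pass filter g, followed by downsampling by 2.  Periodic extension
    # is just one possible boundary choice.
    n = len(signal)
    approx, detail = [], []
    for k in range(n // 2):
        approx.append(sum(c * signal[(2 * k + i) % n] for i, c in enumerate(h)))
        detail.append(sum(c * signal[(2 * k + i) % n] for i, c in enumerate(g)))
    return approx, detail

def transpose(block):
    return [list(col) for col in zip(*block)]

def analyze_2d(image, h, g):
    # One 2D step: 1D transforms along the rows, then along the columns,
    # yielding the coarse approximation S and three detail images D1-D3.
    rows = [analyze_1d(row, h, g) for row in image]
    rows_lo = [lo for lo, _ in rows]
    rows_hi = [hi for _, hi in rows]
    def along_columns(block):
        cols = [analyze_1d(col, h, g) for col in transpose(block)]
        return transpose([lo for lo, _ in cols]), transpose([hi for _, hi in cols])
    S, D1 = along_columns(rows_lo)
    D2, D3 = along_columns(rows_hi)
    return S, D1, D2, D3

# Haar filterpair (the 2-tap member of the Daubechies family [16])
h = [1 / sqrt(2), 1 / sqrt(2)]
g = [1 / sqrt(2), -1 / sqrt(2)]
```

On a constant image all three detail images vanish and only the coarse approximation carries energy, which is exactly the decorrelation property the coder exploits.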
The classical 2D wavelet decomposition is implemented by first convolving the rows of the low pass image S_{j+1} (or of the original image in the first decomposition step) with the QMF filterpair G and H (a high pass and a low pass filter, respectively), retaining every other row, then convolving the columns of the resulting images with the same filterpair and retaining every other column. The same procedure is applied to the coarse scale approximation S_j and to all subsequent approximations (see Fig. 2). Since all the convolutions at different scale (or resolution) levels and image directions are performed independently, we can define generalized decompositions as follows:

- A NSMRA wavelet decomposition is obtained by using different filterpairs for different scale levels of the decomposition (e.g. Fig. 3: filterpair G, H at scale level j+1, filterpair A, B at scale level j).
- An IMRA wavelet decomposition is obtained by using different filterpairs for the rows and the columns of the image (e.g. Fig. 4: filterpair A, B is used for the decomposition along the rows, filterpair G, H for the decomposition along the columns).
- Finally, an INSMRA wavelet decomposition is obtained by using different filterpairs for both scale levels and coordinate axes of the image (e.g. Fig. 5: filterpair C, D for the rows at scale level j+1, G, H for the columns at scale level j+1, E, F for the rows at scale level j and A, B for the columns at scale level j).

Fig. 2. Classical 2D wavelet decomposition.
Fig. 3. 2D NSMRA wavelet decomposition.

4. Best filter selection

The question to answer now is how this freedom in using different wavelet filters can be exploited efficiently for an adaptive image coding scheme. Suppose we are given a filter library containing l pairs of different wavelet filters and a fixed maximal decomposition depth m. With the filters contained in this library it is possible to build l^2 different IMRA, l^m different NSMRA and l^{2m} different INSMRA wavelet decompositions (always including the classical ones). We now introduce an algorithm that identifies good filter combinations in this big set of possible ones.

4.1. The algorithm

The underlying idea is the following: we consider the different scale levels and the coordinate axes as being independent concerning the choice of an optimal filter (contained in the library). For a NSMRA decomposition this means, e.g., that we first search for the optimal filterpair for the detail images of the first scale level; once this is fixed, the search is continued for the optimal filterpair for the detail images of the second scale level, and so on. This reduces the number of filter combinations to examine from l^2 to 2l (IMRA), from l^m to ml (NSMRA) and from l^{2m} to 2ml (INSMRA). The justification for this procedure is the following: we want to find the combination of filters which is best suited for compression purposes. Since the detail images of the first scale level have the biggest compression potential, it is reasonable to optimize and fix the filters for the first scale level first. The algorithm cannot guarantee to find the best possible combination, but it does find a good one; in our experiments we also carried out some full searches, and the algorithm always found the optimal combination. It remains to specify how the optimal filters for the detail images of the several scale levels and axes are chosen. This is done by minimizing certain cost functions on the detail images for each scale and coordinate axis separately. Among these cost functions, the information cost functions introduced in Ref. [3] for wavelet packets are applicable.
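To make the search-space reduction concrete, a toy count may help; the values l = 10 and m = 6 are arbitrary illustration values, not taken from the paper.

```python
# Candidate decompositions for a library of l filterpairs and maximal
# decomposition depth m, versus the number of filterpair evaluations the
# greedy search performs (l and m are arbitrary illustration values).
l, m = 10, 6

full_search = {
    "IMRA": l ** 2,          # one filterpair per coordinate axis
    "NSMRA": l ** m,         # one filterpair per scale level
    "INSMRA": l ** (2 * m),  # one filterpair per axis and per level
}
greedy_search = {
    "IMRA": 2 * l,           # rows, then columns
    "NSMRA": m * l,          # one scale level after another
    "INSMRA": 2 * m * l,     # rows and columns at every scale level
}

print(full_search["INSMRA"], greedy_search["INSMRA"])  # 1000000000000 vs 120
```

With these values the INSMRA search space shrinks from 10^12 candidate decompositions to 120 filterpair evaluations, which is why the overall algorithm stays O(N).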
Fig. 4. 2D IMRA wavelet decomposition.
Fig. 5. 2D INSMRA wavelet decomposition.

The cost functions needed for our purposes should give large values when the data values to compress are roughly of the same size (hard to compress), and small values when all but a few data values are negligible (easy to compress). Since we are not restricted to additive cost functions (as we are for wavelet packet best basis selection), statistical measures of concentration are applicable, as well as other non-additive cost functions introduced in Ref. [14] besides the classical (additive) ones. The two examples presented here are the classical additive information cost functions introduced for best basis selection in the wavelet packet method [2,4], which are widely used in the wavelet community. For a given library, the most efficient representation of an image is selected by minimizing the information content of the detail images. Let X = (x_0, x_1, x_2, ...) be the (normalized) sequence of the detail coefficients.

(1) Entropy: An additive analogon to the Shannon-Weaver entropy is

    lambda(X) = - sum_j |x_j|^2 log(|x_j|^2);

minimizing the latter minimizes the former.

(2) Number above a threshold: Set an arbitrary threshold epsilon and count the elements in the sequence X whose absolute value exceeds epsilon (#{x_j : |x_j| > epsilon}).
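Both cost functions can be sketched as follows. One detail is a reading on my part: the paper calls X a "normalized" sequence without spelling the normalization out, so the sketch normalizes the squared coefficients to unit energy explicitly.

```python
from math import log

def entropy_cost(detail):
    # Additive entropy-like cost lambda(X) = -sum_j |x_j|^2 log(|x_j|^2),
    # evaluated on the detail coefficients normalized to unit energy.
    # Zero coefficients contribute nothing (x^2 log x^2 -> 0 as x -> 0).
    energy = sum(x * x for x in detail)
    if energy == 0.0:
        return 0.0
    squares = [x * x / energy for x in detail]
    return -sum(s * log(s) for s in squares if s > 0.0)

def threshold_cost(detail, eps):
    # Number of coefficients whose absolute value exceeds the threshold eps.
    return sum(1 for x in detail if abs(x) > eps)
```

A concentrated sequence such as (1, 0, 0, 0) has entropy cost 0, while a maximally flat sequence of length n has cost log n, matching the intuition that concentrated detail images are easy to compress.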
We now describe the algorithm for best filter selection for an INSMRA decomposition in more detail. Since this is the most general case, the algorithms for NSMRA and IMRA decompositions can easily be derived from it.

First scale level:
-> Apply a wavelet transform with each of the filterpairs in the library to the rows of the original image
-> Calculate and store the cost function of the resulting detail image for each filterpair
-> Determine the filterpair producing the lowest cost function
-> Apply a wavelet transform with each of the filterpairs in the library to the columns of the original image
-> Calculate and store the cost function of the resulting detail image for each filterpair
-> Determine the filterpair producing the lowest cost function
-> Apply a 2D wavelet transform with the fixed filterpairs to the original image (which results in a low pass image and three detail images)
-> Proceed to the next scale level
Second (and all subsequent) scale level(s):
-> Apply a wavelet transform with each of the filterpairs in the library to the rows of the low pass image of the previous scale level
-> Calculate and store the cost function of the resulting detail image for each filterpair
-> Determine the filterpair producing the lowest cost function
-> Apply a wavelet transform with each of the filterpairs in the library to the columns of the low pass image of the previous scale level
-> Calculate and store the cost function of the resulting detail image for each filterpair
-> Determine the filterpair producing the lowest cost function
-> Apply a 2D wavelet transform with the fixed filterpairs to the low pass image of the previous scale level (which results in a low pass image and three detail images)
-> Proceed to the next scale level
Finally:
-> Code the filter choice
-> Quantize and code the transform coefficients

Let N be the number of datapoints to be transformed. The time complexity of the classical wavelet transform is O(N) [15]; therefore our algorithm also has complexity O(N), although the constant is of course bigger than in the wavelet case and depends on the size of the library, the number of decomposition levels and on whether NSMRA, IMRA or INSMRA is used. The computation of the cost functions of the detail spaces is an additional O(N) procedure; its constant depends on the cost function being used.
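The greedy INSMRA search above can be sketched compactly. This is not the author's code: the detail images are discarded rather than quantized and coded, the l^1 norm stands in for the information cost functions of Section 4.1, and the 2-tap Haar and 4-tap Daubechies D4 filterpairs [16] form a deliberately tiny library.

```python
from math import sqrt

def step_1d(signal, h, g):
    # One 1D analysis step (periodic extension), then downsampling by 2.
    n = len(signal)
    lo = [sum(c * signal[(2 * k + i) % n] for i, c in enumerate(h)) for k in range(n // 2)]
    hi = [sum(c * signal[(2 * k + i) % n] for i, c in enumerate(g)) for k in range(n // 2)]
    return lo, hi

def rows_step(image, h, g):
    # Transform every row; returns the row low pass and row detail images.
    pairs = [step_1d(row, h, g) for row in image]
    return [p[0] for p in pairs], [p[1] for p in pairs]

def transpose(block):
    return [list(col) for col in zip(*block)]

def cost(detail):
    # Stand-in information cost: the l1 norm of the detail coefficients.
    return sum(abs(v) for row in detail for v in row)

def best_filter_selection(image, library, depth):
    # Greedy INSMRA search as described above: at every scale level, pick the
    # row and column filterpairs minimizing the cost of the corresponding
    # detail images, fix them, apply the 2D step and recurse on the low pass.
    chosen, current = [], image
    for _ in range(depth):
        row_pair = min(library, key=lambda fp: cost(rows_step(current, *fp)[1]))
        col_pair = min(library, key=lambda fp: cost(rows_step(transpose(current), *fp)[1]))
        chosen.append((row_pair, col_pair))
        lo, _ = rows_step(current, *row_pair)                        # rows, fixed pair
        current = transpose(rows_step(transpose(lo), *col_pair)[0])  # then columns
    return chosen, current

# A two-entry library: Haar (2 taps) and Daubechies D4 (4 taps).
s2, s3 = sqrt(2.0), sqrt(3.0)
haar = ([1 / s2, 1 / s2], [1 / s2, -1 / s2])
d4_lo = [(1 + s3) / (4 * s2), (3 + s3) / (4 * s2), (3 - s3) / (4 * s2), (1 - s3) / (4 * s2)]
d4 = (d4_lo, [d4_lo[3], -d4_lo[2], d4_lo[1], -d4_lo[0]])
library = [haar, d4]
```

Each level costs 2l filterpair evaluations on data whose size halves per dimension at every step, which is where the overall O(N) bound comes from.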
Table 1
Results for image Lena in SNR

Rate   MRA     NSMRA   IMRA    INSMRA
5      26.71   27.66   26.91   27.86
10     23.11   24.23   23.25   24.61
20     19.81   20.98   19.95   21.38

Table 2
Results for image Lung in SNR

Rate   MRA     NSMRA   IMRA    INSMRA
5      30.15   31.83   30.56   32.32
10     22.89   24.22   23.12   24.74
20     18.15   19.47   18.72   20.03
5. Experimental results

5.1. Description of methods

The experiments are carried out using a standard wavelet image coding scheme with uniform quantization and entropy coding, plus an additional filter-index coding for the generalized wavelet decompositions. We use a library containing Daubechies' compactly supported wavelets [16] from two (Haar filter) to 20 taps, the entropy lambda(X) as the cost function, and a fixed decomposition depth of six scale levels. We compare the compression results of the generalized wavelet decompositions produced by the best filter selection algorithm with the result of a classical wavelet decomposition built out of the most appropriate filter in the library (in the sense of maximizing the signal to noise ratio (SNR) after coding) for the image considered (denoted MRA in Tables 1-3 and Fig. 7). The compression rate is defined by

    rate = (number of bits of the original image) / (number of bits of the compressed image).

(The considered rates 5, 10 and 20 correspond in our case to bit rates of 1.6 bpp, 0.8 bpp and 0.4 bpp, respectively.) SNR is defined (as in Ref. [17]) as follows, and measured in decibels (dB):

    SNR = 10 log_10 (sigma^2 / e^2_ms),    (4)

where sigma^2 is the variance of the original image and e^2_ms is the average sample mean-square error between f(i,j) and f̂(i,j), which represent the N x N original and the reproduced images, respectively. All the computations have been performed on several DEC 3000 AXP/400 of the Research Institute for Software Technology. Fig. 6 shows three images (with 256 grey levels and 512 x 512 pixels) for which detailed results are presented.

Fig. 6. Test images Lena, Lung and Artificial.

5.2. Results

We gain about 4-8% of SNR with the best of the three generalized decompositions in comparison to the best wavelet in the library. For almost all the images considered there is a fixed ranking: INSMRA is best, followed by NSMRA; both are clearly superior to the best wavelet. IMRA is much superior to the wavelet only when applied to the image Artificial, which shows a very different smoothness behaviour in the horizontal and vertical directions. Fig. 7 shows CPU timings for a classical non-adaptive wavelet coder (wav), for searching the best wavelet contained in the library (MRA), and for best filter selection (IMRA, NSMRA, INSMRA).

5.3. Discussion

The proposed method has, of course, a slower encoding
Table 3
Results for image Artificial in SNR

Rate   MRA     NSMRA   IMRA    INSMRA
5      18.95   19.92   19.87   20.28
10     10.50   11.15   11.12   11.31
20      7.20    7.63    7.59    7.79
than decoding procedure; decoding is as fast as a classical wavelet decoding procedure. Concerning the relation between reconstruction quality and CPU time, NSMRA performs best (see Fig. 7), whereas the 'inhomogeneous' methods generally do not justify the bigger computational effort. The IMRA decomposition in particular turned out to be significantly superior to classical wavelet methods only in the case of the image Artificial (SNR gain of 4.85% to 5.9%), whereas for the other two images (and also different ones not reported here) the gain in SNR is only about 0.5% to 3%. The reason for this seems clear: the IMRA method does not offer enough adaptivity for usual real-world images. The top gain in SNR of about 10% is achieved for the Lung image at a compression rate of 20 with the INSMRA decomposition; NSMRA typically gains 4-7%. A generalization can be derived from the data: the higher the compression rate, the better the results of the adaptive methods.

Fig. 7. CPU-timings (in seconds) for the different methods.

We could not observe any rule in the filter choice (such as longer filters being preferable for the first decomposition steps, or vice versa), but our results can contribute to the discussion of whether filter regularity is an important property in wavelet coding schemes or not [18,19]. The filters chosen by our search algorithms are completely mixed in length, and there is no trend towards preferably choosing longer (and more regular) filters. This corresponds to other observations, where four tap filters can sometimes perform much better than 18 or 20 tap ones. Concerning the decomposition depth, it turned out that a six level transform is the best compromise between encoding time and achieved adaptivity (and image quality) for a 512 x 512 image.

When discussing coding methods, one of the most difficult issues is to have a good measure of quality (in the sense of similarity to the human visual system). The l^1-norm, the l^2-norm and SNR have been used as objective measures, but since the results are very similar, only numbers for SNR are reported here (it seems to be one of the most accepted measures). Concerning the subjective visible distortion, the generalized decompositions show the same type of distortions as wavelet methods (since wavelet filters are used), but less pronounced.

The main reason why we use a dictionary approach instead of an optimization procedure as in Ref. [5] is that the library does not have to be coded with each image separately; it is sufficient to code an index, which can be coded easily, and our method is much faster (as long as a library of moderate size is used). There is one more possible solution for the coding of the filters: if lots of similar images are to be coded (e.g. fingerprints), a few searches may be performed and then the filters that turned out to be useful are fixed. On the one hand, this makes the decomposition as fast as the wavelet decomposition; on the other hand, the filter indices do not need to be stored with each image separately. Similar techniques are used for the best basis choice in wavelet packet methods [20]. Future research in this direction will be done by using parallel computers for this or similar algorithms in order to speed up execution times, and by using different types of optimization procedures without the assumption of independence between different scale levels and orientations.
6. Conclusion

In this paper, possible generalizations of two-dimensional wavelet decompositions are presented. Different filters can be used for different scale levels and orientations of the wavelet decomposition scheme; based on this fact, an adaptive algorithm for filter selection based on (information) cost functions is introduced. This adaptive method clearly outperforms non-adaptive (classical) wavelet methods, which are among the most popular coding methods today. Since execution time becomes less and less important as computers get faster, we claim a big potential for adaptive coding methods like the one proposed here.
References

[1] A. Cohen, Non-stationary multiscale analysis, in C.K. Chui et al. (eds.), Wavelets: Theory, Algorithms and Applications, Academic Press, San Diego, CA, 1994, pp. 3-12.
[2] R.R. Coifman and M.V. Wickerhauser, Entropy based methods for best basis selection, IEEE Trans. Inf. Theory, 38(2) (1992) 713-718.
[3] R.R. Coifman, Y. Meyer, S. Quake and M.V. Wickerhauser, Signal processing and compression with wavelet packets, in Y. Meyer and S. Roques (eds.), Progress in Wavelet Analysis and Applications (Proc. Int. Conf. 'Wavelets and Applications', Toulouse, France, 1992), Editions Frontieres, 1993, pp. 77-93.
[4] M.V. Wickerhauser, INRIA lectures on wavelet packet algorithms, Lecture notes, INRIA, 1991.
[5] P. Desarte, B. Macq and D.T.M. Slock, Signal-adapted multiresolution transform for image coding, IEEE Trans. Inf. Theory, 38(2) (1992) 897-904.
[6] I. Daubechies, Ten Lectures on Wavelets, CBMS-NSF Series in Applied Mathematics 61, SIAM, Philadelphia, 1992.
[7] M. Antonini, M. Barlaud, P. Mathieu and I. Daubechies, Image coding using wavelet transform, IEEE Trans. Image Process., 1(2) (1992) 205-220.
[8] A. Uhl, Compact image coding using wavelets and wavelet packets based on non-stationary and inhomogeneous multiresolution analyses, in A.F. Laine and M. Unser (eds.), Mathematical Imaging: Wavelet Applications in Signal and Image Processing II, Proc. SPIE 2303, 1994, pp. 378-388.
[9] J.M. Shapiro, Embedded image coding using zerotrees of wavelet coefficients, IEEE Trans. Signal Process., 41(12) (1993) 3445-3462.
[10] J. Froment and S. Mallat, Second generation compact image coding, in Wavelets: A Tutorial in Theory and Applications, Academic Press, San Diego, CA, 1992, pp. 655-678.
[11] J. Woods and S.D. O'Neil, Subband coding of images, IEEE Trans. Acoust. Speech Signal Process., 34(5) (1986) 1278-1288.
[12] C. Taswell and K.C. McGill, Wavelet transform algorithms for finite-duration discrete-time signals, ACM Trans. Math. Software, 20(3) (1994) 398-412.
[13] S. Mallat, A theory for multiresolution signal decomposition: the wavelet representation, IEEE Trans. Patt. Anal. Mach. Intell., 11(7) (1989) 674-693.
[14] C. Taswell, Near-best basis selection algorithms with non-additive information cost functions, in M. Amin (ed.), Proc. IEEE Int. Symposium on Time-Frequency and Time-Scale Analysis, IEEE Press, 1994.
[15] G. Strang, Wavelet transforms versus Fourier transforms, Bull. Amer. Math. Soc., 28(2) (1993) 288-305.
[16] I. Daubechies, Orthonormal bases of compactly supported wavelets, Comm. Pure Appl. Math., 41 (1988) 909-996.
[17] A.K. Jain, Image data compression: a review, Proc. IEEE, 69(3) (1981) 349-389.
[18] O. Rioul, Regular wavelets: a discrete-time approach, IEEE Trans. Signal Process., 41(12) (1993) 3572-3579.
[19] F. Hartung and J.H. Husoy, Wavelet and subband coding of images - a comparative study, in A.F. Laine (ed.), Mathematical Imaging: Wavelet Applications in Signal and Image Processing, Proc. SPIE 2034, 1993, pp. 242-253.
[20] T. Hopper, Compression of gray-scale fingerprint images, in H.H. Szu (ed.), Wavelet Applications, Proc. SPIE 2242, 1994, pp. 180-187.