Signal Processing: Image Communication 14 (1999) 195—208
Variable tree size fractal compression for wavelet pyramid image coding
Ying Zhang, Lai-Man Po*
CityU Image Processing Laboratory, Department of Electronic Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
Received 25 October 1996
Abstract
Pyramidal wavelet decomposition provides a hierarchical data structure for image representation which is suitable for further quantization and compression. Through the discrete wavelet transform, image signals are decomposed into multiresolution and multi-frequency subbands with a set of tree-structured coefficients. The coefficients which have the same spatial location but different resolutions and orientations can be organized into a wavelet subtree. This efficient representation of image signals has achieved superior coding performance in wavelet-based image compression. In this paper, a novel variable tree partition algorithm is introduced which can efficiently split the wavelet subtrees according to the local details. Then a new variable size wavelet-subtree-based fractal coding algorithm is proposed to obtain a good trade-off between image quality and compression ratio. The self-similarities among wavelet subtrees are exploited in the fractal coding method by predicting the coefficients at finer scales from those at coarser scales through a proper affine transformation. Experimental results show a gain over JPEG of 5–6 dB in PSNR at an even slightly higher compression ratio (around 60 : 1 to 70 : 1). A slight gain in coding efficiency is also achieved when compared to a method proposed by Davis, which also applies variable tree size wavelet fractal coding, while the complexity is significantly reduced by avoiding iterative optimization procedures. © 1999 Elsevier Science B.V. All rights reserved.
Keywords: Wavelet transform; Fractal coding; Hybrid coding
* Corresponding author. Tel.: +852 2788-7779; fax: +852 2788-7791; e-mail: [email protected].
0923-5965/99/$ – see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0923-5965(98)00008-3

1. Introduction
Wavelet coding [1,3,4,12,15] and fractal coding [2,8–10] have recently attracted a lot of attention in image compression and have become two important branches of image coding methods. Algorithms developed on wavelet coding or fractal coding have shown better performance than standard block-transform-based algorithms such as those built on the DCT. However, research on combining wavelet and fractal coding in one image compression scheme is still at an early stage. Aiming at superior performance through wavelet-based fractal coding or fractal-based wavelet coding, several new approaches [5–7,11,14] have been presented recently and the simulation tests have shown
Y. Zhang, L.-M. Po / Signal Processing: Image Communication 14 (1999) 195—208
promising coding results. In this paper, we propose a novel fractal image compression scheme based on a wavelet framework with a hierarchical tree representation of wavelet coefficients.

Wavelets are a family of functions derived from translations and dilations of a single function, the 'mother' wavelet. Daubechies et al. showed how to design orthogonal and biorthogonal bases of wavelets [1,4], and Mallat proposed a fast algorithm to compute the discrete wavelet transform (DWT) and its inverse (IDWT) [12]. The basic idea of applying the DWT to image coding is that of successive approximation of an input image signal at a reduced resolution together with the added detail signal at that resolution. This wavelet representation provides a multiresolution and multi-frequency expression of nonstationary image signals with localization in both the time and frequency domains. This property is desirable for image coding because the multiresolution subbands become relatively more stationary and easier to code than the original picture in the spatial domain.

Wavelet transform coding has conventionally been treated as subband coding [17], which has been successfully used in image compression for more than a decade. Subband coding yields better subjective quality because it significantly reduces the 'blocking effects' compared with other transform coding methods, e.g. the DCT, at roughly the same PSNR and compression ratio. However, this approach in general demands the design of sophisticated bandpass filters to minimize aliasing effects. Thus, quadrature mirror filters (QMF) were introduced, which allow alias-free reconstruction of a one-dimensional signal in the absence of quantization errors and can be extended to multi-dimensional signals. The process of wavelet coding is the same as subband coding except that the analysis/synthesis filters (QMFs) must be constructed to satisfy an additional regularity condition.
The wavelet pyramid decomposition, providing a hierarchical tree-structured image representation, results in a good trade-off between spatial and frequency resolution with promising application in image compression. Therefore, most recent wavelet-based coding algorithms concentrate on encoding a set of tree-structured wavelet coefficients instead of quantizing each wavelet coefficient independently. Compression schemes using different wavelet bases and quantization methods for coding still images can be found in [1,3,15].

Fractal coding, first presented by Barnsley based on the iterated function system (IFS) [2], is an alternative tool for compressing image and video signals. Jacquin [9,10] developed the first automatic fractal image compression algorithm, which can be regarded as a partitioned iterated function system (PIFS). Jacquin's algorithm has been extensively studied and used to compress still images. All these existing schemes utilize an affine transformation which maps one part of an image to another part of the same image within a certain approximation error. The optimal parameters of the affine transformation are recorded as fractal codes for decoding. At the decoder, the transformations described by these fractal codes are applied iteratively, starting from an arbitrary image. An approximation of the encoded image is obtained if this decoding process converges. Since some form of self-similarity does exist in many natural images, a high compression ratio can be achieved with acceptable image quality [8]. In addition, the fast decoding of fractal compression schemes is a particular advantage for software-only decoding. However, fractal coding suffers from heavy encoding complexity and annoying blocking artifacts in the reconstructed images. Such disadvantages restrict its practical applications.

Recently, Davis [5,6] and Krupnik et al. [11] have independently generalized fractal coding from the spatial domain to the wavelet domain. By constructing wavelet subtrees with coefficients at different resolutions and orientations but with the same spatial location, Davis demonstrated that the conventional fractal block codec is a form of Haar wavelet subtree quantization scheme. If smooth wavelet bases are employed, the blocking artifacts are dramatically reduced.
In this wavelet-based fractal coding algorithm, the coefficients at finer scales are predicted from those at coarser scales, and its effectiveness is due to the ability of fractal coding to exploit the correlation between different resolutions and to represent wavelet zerotrees efficiently. This wavelet framework not only gives new insight into the convergence properties of fractal block coders but also leads to the
development of an unconditionally convergent fractal block coder with a fast decoding algorithm. Davis [7] further proposed an adaptive wavelet-transform-based coding algorithm using jointly optimized scalar quantization and fractal compression, which outperforms state-of-the-art fractal-based coding schemes. The blocking effects are reduced substantially, and high-quality reconstructed images are obtained at very low bit-rates. However, the computational requirement is very high because an iterative optimization algorithm is used in his scheme to find the optimal partition of a wavelet subtree.

To tackle this problem, a non-iterative variable tree partition algorithm is presented in this paper to reduce the computational complexity while maintaining the coding performance. This efficient partition algorithm for wavelet subtrees is then employed in the wavelet-based fractal codec. Theoretical analysis and simulation results will show that the encoding complexity of the proposed method is significantly reduced compared with that of Davis' algorithm at nearly the same compression ratio and image quality.

The rest of the paper is organized as follows. Section 2 gives an overview of the discrete wavelet transform with application to pyramidal image decomposition, followed by a discussion on constructing wavelet subtrees. In Section 3, the fixed-size wavelet-subtree-based fractal coding algorithm is introduced as an extension of fractal coding from the spatial domain to the wavelet domain. In Section 4, a variable tree partition algorithm is investigated, and then the variable size wavelet-subtree-based fractal coding is described in detail. The simulation results for standard test images are presented in Section 5 and, finally, Section 6 gives the summary and conclusions.
2. Hierarchical wavelet subtree structure
2.1. Wavelet decomposition and reconstruction
Wavelet decomposition is a process of successive projection of a given signal onto spaces of signals at different resolutions via the discrete wavelet transform (DWT). The fast algorithm for computing the DWT and the inverse DWT was given by Mallat [12]. The
Fig. 1. One stage in multiresolution wavelet decomposition of an image.
block diagram of one-stage wavelet decomposition for an image block $W^{LL}_{j+1}$ is illustrated in Fig. 1. In this 2D DWT, a pair of lowpass and highpass filters, $\tilde{H}$ and $\tilde{G}$, is applied to the image signal first along the image columns and then along the image rows, producing four subbands as the output. Each subband has its resolution reduced by a factor of 4 (2 in the horizontal direction and 2 in the vertical direction) by sub-sampling. The subband $W^{LL}_{j}$ is the multiresolution approximation of $W^{LL}_{j+1}$ at a reduced resolution, which corresponds to the low frequencies of the image, and the other three, $W^{LH}_{j}$, $W^{HL}_{j}$ and $W^{HH}_{j}$, are detail signals at that resolution and give the high frequencies in the horizontal, vertical and diagonal directions, respectively. The low-frequency subband $W^{LL}_{j}$ can be further decomposed into four subbands at the next coarser scale. Such repeated decomposition of the low-frequency subband provides a wavelet pyramid structure similar to the Laplacian pyramid. An example of a four-level wavelet pyramid decomposition of the image Lenna is shown in Fig. 2, in which there are a total of 13 subbands, with three subimages at each of the first three levels and four at the top, including the lowest-frequency subband.

One stage of the inverse DWT is illustrated in Fig. 3, in which the low-frequency subband $W^{LL}_{j+1}$ of the finer scale is reconstructed from the four subbands $W^{LL}_{j}$, $W^{LH}_{j}$, $W^{HL}_{j}$ and $W^{HH}_{j}$ of the coarser scale. The reconstruction is accomplished by inserting zeros between the samples of each of the four subbands and then convolving the resulting signals, first along image columns and then along image rows, with the synthesis filters $H$ and $G$. The output signals are summed to obtain the reconstructed image.
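To make the one-stage decomposition concrete, the following is a minimal sketch in plain Python. It is not the filter bank used in the paper: it assumes an unnormalised Haar pair (pairwise average and pairwise half-difference) so that the arithmetic stays exact, and the function names (`haar_1d`, `filter_rows`, `dwt2_one_stage`) are hypothetical. It only illustrates the column-then-row filtering with sub-sampling by 2 in each direction.

```python
def haar_1d(signal):
    """Split an even-length sequence into lowpass and highpass halves.

    Assumption: unnormalised Haar averaging/differencing, chosen so the
    arithmetic stays exact; the paper's filters may be scaled differently.
    """
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def filter_rows(matrix):
    """Filter and sub-sample every row; return (lowpass, highpass) matrices."""
    pairs = [haar_1d(row) for row in matrix]
    return [low for low, _ in pairs], [high for _, high in pairs]


def dwt2_one_stage(image):
    """One decomposition stage: filter along columns first, then along rows.

    Returns four subbands, each holding a quarter of the input samples.
    The result names record (column filter, row filter); which of the two
    mixed subbands is called LH or HL is a labelling assumption.
    """
    col_low, col_high = filter_rows(transpose(image))    # along columns
    col_low, col_high = transpose(col_low), transpose(col_high)
    low_low, low_high = filter_rows(col_low)              # then along rows
    high_low, high_high = filter_rows(col_high)
    return low_low, low_high, high_low, high_high
```

Applying `dwt2_one_stage` again to the returned `low_low` subband yields the next coarser level of the pyramid, which is how the multi-level decomposition of Fig. 2 is built up.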
Fig. 2. A four-level wavelet pyramid image decomposition of the 512×512 test image Lenna. The wavelet coefficients are quantized to 8 bits in the range 0–255.
Fig. 3. One stage in multiresolution wavelet reconstruction of an image.
Daubechies [4] designed orthonormal compactly supported wavelet bases corresponding to FIR filters with strictly perfect reconstruction. The same pair of filters is used for reconstruction and for decomposition. The major disadvantage of compactly supported wavelets is their asymmetry, which induces nonlinear phase in the associated FIR filters. Except for the Haar wavelet with 2-tap filters, symmetry and orthonormality cannot be obtained simultaneously, since there are no orthonormal linear phase FIR filters enabling exact reconstruction, regardless of regularity. Nevertheless, the highly important linear phase constraint corresponding to symmetrical wavelets, together with compact support, can be maintained by relaxing the orthonormality constraint, which leads to the use of biorthogonal wavelets. The analysis filter pair, $\tilde{H}$ and $\tilde{G}$, is no longer orthogonal. However, the analysis filters are orthogonal to the synthesis filter pair, $H$ and $G$, respectively. The perfect reconstruction property is preserved, and Mallat's fast algorithm can still be used. The Haar wavelets and the biorthogonal wavelets with less dissimilar filter lengths, '9–7', referred to as B97 [1], are used for the simulation tests in this paper.
2.2. Construction of wavelet subtree
Wavelet pyramid decomposition provides a hierarchical data structure for image representation which has been successfully employed in most recent wavelet coding schemes to improve the coding performance [15]. Davis coined a new term, wavelet subtree, to represent such a data structure used in a wavelet-based fractal codec. A wavelet subtree consists of the wavelet coefficients that have the same spatial location but different resolutions and orientations. Suppose an image is transformed to the wavelet domain by a four-stage pyramidal decomposition as shown in Fig. 4. In this hierarchical subband representation, each coefficient (a parent node) of a high-frequency subband at a coarser scale relates to 2×2 coefficients (children nodes) at the next finer scale of the same orientation. Note that the coefficients at the finest scale have no children. With the exception of the lowest-frequency subband, all parents have four children at the next finer scale. For the lowest-frequency subband, each parent node has only three children, one from each of the three high-frequency subbands at the same scale level. The coefficients of the lowest-frequency subband are not considered in the construction of a wavelet subtree. As illustrated in Fig. 4, the three coefficients with the same spatial location from each of the three high-frequency subbands at the coarsest scale, together with their children and grandchildren (2×2, 4×4 and 8×8 coefficients at successively finer scales), are highly correlated. Thus, these coefficients can be organized into a hierarchical data
Fig. 5. Arrangement and scanning order of wavelet coefficients to form a wavelet subtree. The shaded area represents one of the four children range subtrees of the pruned parent range subtree.
Fig. 4. Construction of wavelet subtrees with a four-level wavelet decomposition. The domain and range subtrees consist of the triangular and square pixels located in same spatial position but with different resolutions and orientations, respectively. The shaded square pixels represent one of the children range subtrees.
structure, the wavelet subtree. The three coefficients at the highest tree layer are called root nodes, and the coefficients at the lowest tree layer are called leaf nodes. This wavelet subtree is denoted as $D_p(x, y)$, where $p = 4$ is the scale level of the root nodes, and $x$ and $y$ are the coordinates of the root node within its own subband. The tree size of $D_p(x, y)$ is expressed as $L(D_p(x, y))$ and given by

$$L(D_p(x, y)) = 3 \sum_{i=0}^{p-1} \left( 2^i \times 2^i \right) = 2^{2p} - 1. \tag{1}$$

The wavelet subtree can also have root nodes starting from a high-frequency subband at a finer scale. A wavelet subtree with root nodes at scale level 3 is also depicted in Fig. 4 and denoted as $R_q(x, y)$, where $q = 3$ is the scale level of the root nodes, while $x$ and $y$ are the coordinates of the root node within its own subband. Similarly, the size of $R_q(x, y)$ is $L(R_q(x, y)) = 2^{2q} - 1$. Each wavelet subtree, $R_q(x, y)$ and $D_p(x, y)$, can form a square of corresponding dimension by scanning the coefficients. The scanning order of the wavelet coefficients for $R_q(x, y)$ to construct an 8×8 square block is shown in Fig. 5. No child node is scanned before its parent, and the scan is performed from left to right, then from top to bottom. By analogy with spatial domain blockwise fractal coding, $R_q(x, y)$ and $D_p(x, y)$ are called range and domain subtrees, respectively. In this case, it is clear that $p = q + 1$.
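The tree-size formula of Eq. (1) can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the parent-to-child index rule `(2x, 2y) ... (2x+1, 2y+1)` is the usual dyadic convention, assumed here rather than taken from the paper.

```python
def subtree_size(root_level):
    """Number of coefficients in a wavelet subtree whose three root nodes
    sit at scale level `root_level` (level 1 is the finest scale)."""
    # Three orientations, and 2^i x 2^i coefficients per orientation at the
    # i-th layer below the roots.
    return 3 * sum((2 ** i) * (2 ** i) for i in range(root_level))


def children(x, y):
    """Coordinates of the 2x2 children of node (x, y) at the next finer
    scale of the same orientation (assumed dyadic indexing)."""
    return [(2 * x + dx, 2 * y + dy) for dy in (0, 1) for dx in (0, 1)]
```

With a four-level decomposition, `subtree_size(4)` gives 255 for the domain subtree and `subtree_size(3)` gives 63 for the range subtree; the 63 coefficients of the range subtree plus one unused position account for the 8×8 block of Fig. 5.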
3. Fixed-size wavelet-subtree-based fractal coding
The main idea of wavelet-based fractal coding is to approximate each range subtree by a domain subtree through a fractal transformation, as is done in the spatial domain [2,8–10]. Basically, this process can be explained as a gradual prediction of coefficients from coarser scales to finer scales. It can be shown that the conventional fractal coding algorithm is a particular case of this wavelet subtree quantization scheme if Haar wavelets are chosen as the wavelet basis. Such wavelet-based fractal coding exhibits unconditional convergence, which guarantees a fast decoding algorithm. A detailed proof can be found in [7]; it is beyond the scope of this paper. Here, we give the algorithm for applying fractal coding in the wavelet domain. The block diagram of the fixed-size wavelet-subtree-based fractal coding algorithm is shown in Fig. 6. In general, it can be summarized as follows:
Fig. 6. The block diagram of the fixed-size wavelet-subtree-based fractal coding algorithm.
Step 1. Pyramidal wavelet decomposition. The original input image is decomposed into multiresolution and multi-frequency subbands by a multi-stage DWT as described in Section 2.1.
Step 2. Scalar quantization of the four subbands at the coarsest scale. In order to give an initial condition for prediction, the coefficients of the four subbands at the coarsest scale level are uniformly scalar quantized within each subband independently.
Step 3. Construction of range and domain subtrees. The range and domain subtrees, denoted as $R_q(x, y)$, $q = 3$, and $D_p(x, y)$, $p = 4$, are constructed as described in Section 2.2. All the domain subtrees form a searching pool $S$ in which the domain subtrees do not overlap each other.
Step 4. Fractal coding of range subtree. For each range subtree $R_q(x, y)$, all the domain subtrees $D_p(x, y)$ in the searching pool are processed to find the best-matched one through an affine transformation. Owing to the special structure of the wavelet subtree, the affine transformation used for wavelet subtree mapping is different from that used in conventional spatial domain fractal coding. Let $\tau$ denote the affine transformation, which is composed of the following three parts:
1. Geometric part $\Gamma$, which is a truncation and scaling-down operator. It truncates all the leaf nodes of a domain subtree to match the tree size of a range subtree, followed by multiplication by a factor of 1/2 of each of the remaining nodes of the pruned domain subtree.
2. Shuffle part $\Pi$, which is similar to the isometry operation defined in conventional fractal coding schemes [2,9,10], but with an additional transfer of subband coefficients. For this isometry operation, the horizontal, vertical and diagonal reflections of wavelet coefficients are carried out within each subband separately. Similarly, the rotation operation is also done within each subband and is followed by switching the $W^{LH}_{j}$ coefficients with the $W^{HL}_{j}$ coefficients. Therefore, the wavelet analog of the block symmetry operation permutes wavelet coefficients within each resolution.
3. Massic part $\Omega$, which operates directly on the coefficients of a wavelet subtree. Since the wavelet subtree does not have a DC component, the shift factor is unnecessary. Thus, $\Omega$ is defined as

$$\Omega(D_p) = \alpha D_p, \tag{2}$$

where $\alpha$ is the scaling factor.
Therefore, the affine transformation $\tau$ for wavelet subtrees is the composition of the geometric ($\Gamma$), shuffle ($\Pi$) and massic ($\Omega$) parts, which can be expressed as

$$\tau(D_p) = \Omega \Pi \Gamma (D_p) = \alpha\, \Pi(\Gamma(D_p)). \tag{3}$$

For a given range subtree, the main task of fractal coding is to find the best-matched domain subtree, which approximates the range subtree with the minimum distortion through a certain affine transformation. If the mean-squared error (MSE) is used as the distortion measure, the range-domain comparison can be formulated as

$$\varepsilon = \min_{D_p \in S} \left[ \left\| R_q - \tau(D_p) \right\|^2 \right], \quad p = 4,\ q = 3. \tag{4}$$

The scaling factor $\alpha$ is obtained where the minimum error occurs, that is, where

$$\frac{\partial \varepsilon}{\partial \alpha} = 0. \tag{5}$$
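As a worked illustration of Eq. (5): if the distortion is the squared error between the range subtree and the scaled, transformed domain subtree, setting the derivative with respect to the scaling factor to zero gives the usual least-squares solution, namely the inner product of the two coefficient vectors divided by the energy of the transformed domain vector. The sketch below assumes that form of the distortion. It represents a subtree as a list of layers (roots first), omits the shuffle part, and uses hypothetical function names; the 5-bit quantization of the scaling factor is not modelled.

```python
def geometric_part(domain_layers):
    """Truncate the leaf layer of a domain subtree and halve the rest.

    `domain_layers` lists the coefficients layer by layer, roots first.
    """
    return [[0.5 * c for c in layer] for layer in domain_layers[:-1]]


def flatten(layers):
    return [c for layer in layers for c in layer]


def best_scale(range_coeffs, domain_coeffs):
    """Least-squares scaling factor: <R, V> / <V, V>, from d(error)/d(scale) = 0."""
    energy = sum(v * v for v in domain_coeffs)
    if energy == 0:
        return 0.0  # an all-zero domain subtree cannot be scaled to fit
    return sum(r * v for r, v in zip(range_coeffs, domain_coeffs)) / energy


def squared_error(range_coeffs, domain_coeffs, scale):
    return sum((r - scale * v) ** 2 for r, v in zip(range_coeffs, domain_coeffs))
```

In a full search, these two quantities would be evaluated for every domain subtree in the pool (and every isometry), keeping the candidate with the smallest error.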
After a full search, the best-matched domain subtree is found and the fractal transformation parameters are recorded as fractal codes regardless of the distortion. In the simulation of fixed-size wavelet subtree fractal coding, the four subbands at the coarsest scale are uniformly quantized to 7 bits. The fractal transformation parameters include the scaling factor $\alpha$, the index of the isometry operator and the location of the best-matched domain subtree. As there are only eight different isometries used in the fractal transformation, 3 bits are enough to specify this operation. The scaling factor $\alpha$ is restricted to the range −2.0 to 2.0 and is uniformly quantized to 5 bits. The bits assigned to these two parameters are the same for all fractal encoded range subtrees, here and in the following experimental tests. However, the number of bits needed to address the best-matched domain subtree depends on the size of the searching pool. For example, with a five-level pyramidal wavelet decomposition of an image of size 512×512, there are a total of 256 domain subtrees. Thus, we need 8 bits to locate a domain subtree within the searching pool. Table 1 gives the results of fixed-size wavelet-subtree-based fractal coding with two different multilevel wavelet decompositions. It is observed that B97 can obtain a higher PSNR value
Table 1
Simulation results of fixed-size wavelet subtree fractal coding for the 512×512 test images Lenna and Pepper in terms of compression ratio and PSNR (dB)

Pyramid level   Compression ratio   Lenna, Haar   Lenna, B97   Pepper, Haar   Pepper, B97
5               89.04 : 1           25.77         26.67        25.74          27.03
4               20.48 : 1           30.33         31.47        30.49          31.37
and better visual quality than the Haar wavelet, owing to the significant reduction of blocking artifacts. Thus, wavelet-based fractal coding exploits the self-similarities of wavelet subtrees across scales and greatly reduces the blocking effects when a smoothly decaying mother wavelet is chosen. It can be interpreted as extrapolating the wavelet coefficients from coarse scales to fine scales. However, because of the restriction to fixed-size wavelet subtrees, this coding process incurs a large reconstruction error, which leads to very poor image quality, especially at high compression ratios. In order to achieve a good trade-off between bit-rate and image fidelity, a new variable size wavelet subtree fractal coding algorithm, based on variable tree partition and using joint scalar quantization and fractal compression, is presented in the next section.
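The bit budget described in this section can be tallied directly. The sketch below is an accounting exercise under stated assumptions: no entropy coding, no side information, one fractal code per range subtree, and range roots one level finer than the coarsest scale. The function names are hypothetical. Under these assumptions the count gives 89.04 : 1 for five levels and 20.48 : 1 for four levels on a 512×512, 8-bit image.

```python
def fixed_size_bits(image_side, levels, coarse_bits=7, scale_bits=5, isometry_bits=3):
    """Total bits for fixed-size wavelet-subtree fractal coding (no entropy coding)."""
    coarse_side = image_side >> levels           # side length of a coarsest subband
    num_domain = coarse_side ** 2                # one domain subtree per coarsest position
    num_range = (2 * coarse_side) ** 2           # range roots sit one level finer
    address_bits = (num_domain - 1).bit_length() # bits to index the searching pool
    scalar_part = 4 * coarse_side ** 2 * coarse_bits  # four coarsest subbands
    fractal_part = num_range * (scale_bits + isometry_bits + address_bits)
    return scalar_part + fractal_part


def compression_ratio(image_side, levels, bits_per_pixel=8):
    original_bits = image_side ** 2 * bits_per_pixel
    return original_bits / fixed_size_bits(image_side, levels)
```

For the five-level case this amounts to 7168 bits for the scalar-quantized coarsest subbands plus 1024 range subtrees at 16 bits each, or 23552 bits in total against 2097152 bits for the original image.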
4. Variable size wavelet-subtree-based fractal coding
Davis proposed a jointly optimized scalar/subtree quantization coding algorithm in [7] and achieved fairly good image quality at compression ratios as high as 64.4 : 1. The basic idea is to partition a wavelet subtree into two parts: separate nodes and several children subtrees. The separate nodes are coarse-scale wavelet coefficients and are scalar-quantized independently. The children subtrees are quantized by the fractal coding method. Hence, it is crucial to find the optimal partition of the wavelet subtree for the best coding performance. However, the optimization process of Davis' method is based on an iterative optimized partition algorithm carried out on each node of
a wavelet subtree from bottom to top (from children nodes up to their parents). At each step of iteration, a constructed Lagrangian cost is computed which determines whether the subtree node is a pruned node (the root of a fractal coded range subtree) or an unpruned node with children. The partition of the wavelet subtree is modified accordingly. At the same time, another step of quantizer optimization for each subband is performed to reduce the total Lagrangian cost of the quantized coefficients and fractal quantized subtrees. This optimization process is repeated until all nodes have been checked for several cycles. Although the iterative algorithm can obtain the optimal partition of a wavelet subtree, the heavy computation complexity is a burden for practical applications. Thus, we have proposed a new variable tree partition algorithm for wavelet subtrees. This does not require any iteration but is capable of achieving nearly optimal partition.
4.1. Variable tree partition
Variable block size coding based on quadtree partition [8] is widely used in spatial domain fractal coding to adaptively separate a range block into quadrant sub-blocks according to local image complexity. Its superior performance over fixed-size blockwise fractal coding suggests the development of an adaptive variable size wavelet-subtree-based fractal coding algorithm. Similarly, if a range subtree cannot find a well-matched domain subtree within the given distortion, it is divided into smaller subtrees which are coded separately. Such range subtrees usually contain more details; thus more bits should be allocated to them instead of encoding them with the same number of bits as the others. It is observed that the coefficients at coarser scales usually possess more energy than those at finer scales. This is the reason why the three root nodes of the range subtree are more important for reconstruction than their children. A natural extension of quadtree partition is to remove the three root nodes from the range subtree and then split the pruned range subtree into four quadrant children subtrees. Such a modified quadtree partition is shown in Fig. 4 where the shaded
square/triangular pixels represent one of the four children subtrees. These children subtrees are encoded in the same way as their parent by fractal coding. Further splitting is performed as before if a children subtree cannot find a matched domain subtree within the given distortion in the corresponding searching pool. Although this partition algorithm is easy to implement, it is not an efficient representation of the tree structure. In fact, because the four children range subtrees are close to each other in both the wavelet and the spatial domain, they are highly correlated and can be combined into a new range subtree. If such a combined range subtree can find a well-matched domain subtree within a predefined distortion threshold, a reduction of nearly 75% in bit allocation is obtained compared with coding four separate children range subtrees. The combined range subtree is not split into the four quadrants of which it is composed unless it cannot be fractal coded within the given distortion. The proposed variable tree partition algorithm for coding a given range subtree $R_q$ can be summarized as follows.
1. Set a distortion threshold $T_q$ and check whether the initial range subtree $R_q$ can be fully fractal approximated by a domain subtree $D_{q+1}$ within the given distortion.
2. If the range subtree can be fractal coded, the coding is finished and the next range subtree is processed. Otherwise, the three root nodes are removed from $R_q$ and the remaining nodes are combined into one entire range subtree $\bar{R}_{q-1}$ (the bar is used here to mark a subtree whose three root nodes have been removed, so that its highest layer lies at scale level $q-1$). The three root nodes are scalar quantized independently.
3. A new distortion threshold $\bar{T}_{q-1}$ is set and the newly formed range subtree $\bar{R}_{q-1}$ is encoded in the same way as its parent, using the corresponding pruned domain subtrees $\bar{D}_{q}$ to form the searching pool. If the distortion between the best-matched domain subtree and this combined range subtree is below $\bar{T}_{q-1}$, the coding is also finished. Otherwise, if the distortion exceeds $\bar{T}_{q-1}$, $\bar{R}_{q-1}$ is divided into four children subtrees $R_{q-1}$.
4. Each of the four children subtrees $R_{q-1}$ is encoded in the same way as its parent. If further splitting is allowed, let $q = q - 1$ and go to step 2 to repeat the above variable tree size partition.
Note that the distortion thresholds are adjusted at each partition level. This variable size wavelet subtree fractal coding is repeated until all range subtrees have found their best-matched domain subtrees within the given distortion. The partition also stops when the minimum allowed dimension of a range subtree has been reached: such a subtree cannot be split further, and its best-matched domain subtree is used regardless of the distortion.
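The control flow of the four steps above can be sketched as a short top-down recursion. This is an illustration only, not the authors' implementation: the domain search is abstracted into a hypothetical callback `match(kind, level, index)` that returns the distortion of the best-matched domain subtree, where `kind` is `'full'` for a subtree with three root nodes and `'pruned'` for a combined subtree with the roots removed, and `level` is the scale level of its highest layer. The thresholds are passed as dictionaries keyed by level.

```python
def partition(level, index, match, thresholds, pruned_thresholds, min_level):
    """Return the list of coding decisions for one range subtree.

    `index` is a tuple that identifies the subtree (the path of child numbers
    from the initial range subtree). Subtrees at `min_level` are fractal coded
    regardless of distortion.
    """
    # Step 1: try to code the whole subtree with a single fractal code.
    if level == min_level or match('full', level, index) <= thresholds[level]:
        return [('fractal', 'full', level, index)]

    # Step 2: scalar-quantize the three roots, keep the rest as one pruned subtree.
    decisions = [('scalar_roots', level, index)]

    # Step 3: try to code the combined pruned subtree with a single fractal code.
    if match('pruned', level - 1, index) <= pruned_thresholds[level - 1]:
        decisions.append(('fractal', 'pruned', level - 1, index))
        return decisions

    # Step 4: split into four children subtrees and repeat one level finer.
    for child in range(4):
        decisions += partition(level - 1, index + (child,), match,
                               thresholds, pruned_thresholds, min_level)
    return decisions
```

Each subtree is visited once on the way down, so no decision is revisited; this is the sense in which the partition is non-iterative.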
4.2. Description of the new coder
Suppose the test image is of size 512×512 and a 6-level wavelet pyramid decomposition is adopted in the simulation. The block diagram of the proposed variable size wavelet subtree fractal coding algorithm is illustrated in Fig. 7. The distortion measure $d(\cdot)$ in Fig. 7 is the commonly used mean-squared error (MSE). The first three steps are the same as those of the fixed-size wavelet subtree fractal coding algorithm. The initial range and domain subtrees, $R_5(x, y)$ and $D_6(x, y)$, are of size 1023 and 4095, respectively. By simple computation, we find that there are altogether 256 $R_5(x, y)$ and 64 $D_6(x, y)$. For each range subtree $R_5(x, y)$, a full search is employed to find the best-matched domain subtree $D_6(x, y)$, which has the minimum approximation distortion $\varepsilon_5$ through fractal transformation. In fixed-size wavelet subtree fractal coding, the coding would finish here regardless of the distortion. However, in the proposed variable size wavelet subtree fractal coding, $\varepsilon_5$ is compared with a predefined threshold $T_5$ to see whether the matching distortion is small enough. The range subtree can be 'fractal-coded' within the given distortion if $\varepsilon_5 \le T_5$. In this case, the number of bits allocated for $R_5(x, y)$ reaches its minimum. Note that an additional bit is assigned to indicate that $R_5(x, y)$ is encoded by fractal quantization and further splitting is not necessary. On the contrary, if $\varepsilon_5 > T_5$, which means a good match cannot be found, the proposed variable tree size partition algorithm is employed to segment the range subtree $R_5(x, y)$ into several parts which are coded separately to reduce the overall distortion.
Fig. 7. The block diagram of the proposed variable size wavelet subtree-based fractal coding algorithm.
The badly matched range subtree R (x, y) is divided into two parts, three root nodes and the pruned subtree R (x, y) containing 1020 nodes. The three root nodes are scalar quantized independently within their own subbands. The pruned subtree R (x, y) differs from R (x, y) in having 12 root nodes instead of 3 at the highest tree layer. Similarly, the corresponding domain subtrees D (x, y) can be obtained by removing the three root nodes from D (x, y) so that the searching pool still con tains 64 disjoint domain subtrees D (x, y). It is because an enlarged searching pool usually can improve the matching probability in search of the best-matched domain subtree for a given range subtree. Then, the domain subtrees D (x, y) are constructed with one coefficient overlap between the neighboring domain subtrees in both horizontal and vertical direction within the three highfrequency subbands at scale level 5. The tree nodes of D (x, y) at the successive scale levels have the overlaps of 2, 4, 8 and 16. This means that the searching pool is now formed by 256 D (x, y) of size 4092. The effectiveness of using overlapped domain subtrees is confirmed by experimental results. As before, the best-matched domain subtree D (x, y) is found and the distortion e between the matched subtrees is calculated. If e is below a predefined threshold ¹ , R (x, y) is coded suc cessfully by fractal transformation and further splitting is not needed. On the contrary, if e exceeds ¹ , R (x, y) is divided into four children subtrees R (x, y) of size 255. Note that an additional bit is required to indicate whether R (x, y) is split or not. The corresponding domain subtrees D (x, y) of size 1023 with root nodes at scale level 5 are constructed to form the searching pool. This construction is very simple because the domain subtrees D (x, y) are, in fact, the range subtrees R (x, y). Therefore, there are altogether 256 non-overlapped D (x, y) considered during fractal coding. 
Each of the four children range subtrees is coded in the same way as its parent. First, the best-matched domain subtree D(x, y), i.e. the one with the minimum distortion, is found. When a children range subtree R(x, y) cannot be coded within the given distortion threshold T, the variable tree partition algorithm is applied again to divide it into smaller subtrees. At this stage, the pruned children subtrees R(x, y) and the corresponding domain subtrees D(x, y) are of size 252 and 1020, respectively. We again consider domain subtrees that overlap by one root node, so that the searching pool contains 1024 different domain subtrees. Similarly, each R(x, y) finds a matched D(x, y) with the minimum distortion e. If the distortion is below a threshold T, the subtree is coded by fractal quantization. Otherwise, it is further partitioned into four quadrant subtrees R(x, y) of size 63. Since the minimum subtree dimension allowed in the simulation is 63, each of these subtrees is fractal-coded regardless of the distortion.
The major advantage of the proposed variable tree partition algorithm is that no computationally intensive iteration process is needed, in contrast to Davis’ iterative optimized partition method. The proposed algorithm proceeds ‘top-to-bottom’ instead of ‘bottom-to-top’, so fewer nodes need to be checked to determine the final subtree partition. In addition, the heavy computation of the Lagrangian sum is avoided. The encoder structure is therefore simpler and the computational requirement lower than in Davis’ method, which makes the proposed method more suitable for practical applications. Experimental results show that a nearly optimal wavelet subtree partition is obtained with the new variable tree partition algorithm, and that the coding performance in terms of PSNR and compression ratio is comparable to or even slightly better than that of Davis’ method.
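The top-down procedure described in this section can be summarized as a short recursion. The following is a minimal sketch, not the authors' implementation: the names `partition`, `best_distortion` and `threshold` are hypothetical, the domain search is abstracted into a callback that returns the minimum distortion e for a subtree of a given size, and the split flag bits and the scalar quantization of the root nodes are not modeled.

```python
from typing import Callable, List, NamedTuple, Tuple

MIN_SIZE = 63  # smallest subtree allowed in the paper's simulation
ROOTS = 3      # one root node per orientation subband


class Coded(NamedTuple):
    path: Tuple[int, ...]  # quadrant indices leading from the top-level subtree
    size: int              # nodes covered by one fractal transformation
    pruned: bool           # True if the three root nodes were set aside


def partition(size: int,
              best_distortion: Callable[[Tuple[int, ...], int], float],
              threshold: Callable[[int], float],
              path: Tuple[int, ...] = ()) -> List[Coded]:
    """Top-down partition of one range subtree with `size` = 4**k - 1 nodes."""
    if size <= MIN_SIZE:
        # Minimum dimension reached: fractal-code regardless of distortion.
        return [Coded(path, size, False)]
    # Step 1: match the full subtree against the disjoint domain pool.
    if best_distortion(path, size) < threshold(size):
        return [Coded(path, size, False)]
    # Step 2: set the three root nodes aside and match the pruned subtree
    # against the enlarged (overlapped) domain pool.
    pruned = size - ROOTS
    if best_distortion(path, pruned) < threshold(pruned):
        return [Coded(path, pruned, True)]
    # Step 3: split the pruned subtree into four children and recurse.
    child = pruned // 4
    pieces: List[Coded] = []
    for quadrant in range(4):
        pieces += partition(child, best_distortion, threshold, path + (quadrant,))
    return pieces


if __name__ == "__main__":
    # Toy distortion model (hypothetical): only subtrees of at most 255 nodes match well.
    toy = lambda path, size: 5.0 if size <= 255 else 50.0
    print([p.size for p in partition(1023, toy, lambda size: 10.0)])
```

Each subtree is visited at most once on the way down, which is the source of the complexity saving over a bottom-up, iteratively optimized partition. In the worst case a subtree of size 1023 ends up as 16 pieces of size 63, with the remaining 3 + 12 root nodes scalar quantized.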
5. Experimental results
To evaluate the performance of the new variable size wavelet-subtree-based fractal coding scheme, experimental results are presented in this section. The test images are the 512×512 gray-level pictures Lenna and Pepper with 8 bits per pixel. Haar wavelets and the biorthogonal wavelets with the less dissimilar filter lengths ‘9-7’, referred to as B97, are employed in the wavelet transform. Note that a full search of the domain subtrees is used in the coding scheme and, thus, the number of bits needed to address the best-matched domain subtree depends on the size of the corresponding searching pool. The most commonly
used measure, PSNR of the reconstructed image, is given as an indication of the image quality. It is defined as

$$\mathrm{PSNR} = 10 \log_{10} \frac{255 \times 255}{\frac{1}{MN} \sum_{i} \sum_{j} \left[ I(i, j) - \hat{I}(i, j) \right]^{2}} \tag{6}$$

where $I$ and $\hat{I}$ denote the original and decoded images, respectively, and $M$ and $N$ are the vertical and horizontal dimensions of the image. PSNR is a mathematical measure of the difference between the original and the reconstructed image; a reconstructed image with better subjective quality usually has a higher PSNR value.

The distortion thresholds that determine the wavelet subtree splitting are adjusted for the different sizes of range subtrees, and their values are chosen experimentally. It is very difficult to derive the optimal setting from theoretical analysis because the distribution of wavelet coefficients differs between subbands and depends on the image content and the wavelet basis. In general, the higher the threshold, the less splitting occurs and the lower the bit-rate, at the cost of poorer image quality. Conversely, the lower the threshold, the more splitting occurs and the higher the bit-rate, with an improved PSNR value as well as better subjective quality. Table 2 lists the distortion thresholds used in the simulation tests for Haar and B97 wavelets, respectively.

Table 2
Multilevel distortion threshold settings in the experiments. The listed values are root-mean-squared (RMS) errors

Wavelets   T      T      T      T
Haar       9.4    10.4   12.4   13.4
B97        7.2    8.2    10.2   11.2

At the end of the encoder, adaptive arithmetic coding [16] is applied to the scalar quantized coefficients, fractal transformation parameters and tree node symbols to generate the output bit stream. The reconstructed images are shown in Fig. 8 for the test image Lenna at a compression ratio (CR) of about 64 : 1. The program was compiled and executed on a Cyrix 166+ PC running Linux. The encoding time depends closely on the image content, wavelet basis and distortion thresholds; in the simulation it was about 1290 and 1285 CPU seconds for Haar and B97 wavelets, respectively. The number of iterations needed for reconstruction is five [7]. As Haar wavelets are discontinuous, they induce discontinuities into the reconstructed image and, hence, perform poorly in image quality because of obvious blocking artifacts. Choosing a smoothly decaying mother wavelet yields better performance by significantly reducing blockiness. From the pictures, it is clear that B97 wavelets obtain a substantial improvement in perceived image quality over Haar wavelets: the blocking artifacts have been completely eliminated and the PSNR has increased by about 1.2 dB.

To show the coding performance, the results of the proposed algorithm are compared with those of JPEG [13] and Davis’ method in terms of PSNR and compression ratio in Table 3. Although JPEG is the standard for still image compression, it is very difficult for it to obtain an acceptable image quality at very low bit-rates. The proposed algorithm achieves a 5.4 dB improvement in PSNR over JPEG, and its visual quality is also much better (see Fig. 8). In addition, the experimental results are slightly better than those of Davis’ algorithm [7] in terms of PSNR value and compression ratio, as shown in Table 3. Thus, the proposed variable tree partition algorithm obtains a nearly optimal partition of wavelet subtrees, and its implementation is simpler because of its lower computational requirement compared with Davis’ method. The proposed variable size wavelet subtree fractal coding is therefore more convenient for practical applications, since the encoding complexity is reduced while the coding performance is maintained.

Further, to show the robustness of the proposed algorithm, the coding results for the test image Pepper under the same experimental conditions are shown in Fig. 9. A PSNR of nearly 30 dB is obtained at a compression ratio of 67 : 1 when using B97 wavelets. The JPEG coding results are also given in Fig. 9 and it
Fig. 8. Coding results for test image Lenna of size 512×512. (a) Original image, (b) JPEG, CR = 60.2 : 1 with PSNR = 24.24 dB, (c) encoded using Haar wavelets, CR = 64.45 : 1 with PSNR = 28.47 dB, (d) encoded using biorthogonal wavelets ‘B97’, CR = 64.69 : 1 with PSNR = 29.62 dB.
Table 3
Comparison of coding results for test image Lenna in terms of PSNR (dB) and compression ratio

Method                 Compression ratio   PSNR (dB)
JPEG                   60.2 : 1            24.24
Davis [7]              64.4 : 1            29.6
The proposed method    64.69 : 1           29.62
is very clear that JPEG cannot achieve an acceptable reconstructed image at such a high compression ratio.
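Eq. (6) and the compression ratios reported in this section can be evaluated with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, images are represented as plain lists of rows, and the conversion from compression ratio to bit-rate assumes the 8 bits per pixel source images used in the experiments.

```python
import math
from typing import Sequence


def psnr(original: Sequence[Sequence[float]],
         decoded: Sequence[Sequence[float]],
         peak: float = 255.0) -> float:
    """PSNR in dB as in Eq. (6): 10 * log10(peak**2 / MSE)."""
    rows, cols = len(original), len(original[0])
    squared_error = sum(
        (original[i][j] - decoded[i][j]) ** 2
        for i in range(rows)
        for j in range(cols)
    )
    mse = squared_error / (rows * cols)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)


def bits_per_pixel(compression_ratio: float, source_bpp: float = 8.0) -> float:
    """Bit-rate implied by a compression ratio for a `source_bpp` image."""
    return source_bpp / compression_ratio


if __name__ == "__main__":
    a = [[10, 20], [30, 40]]
    b = [[11, 19], [31, 41]]  # every pixel is off by one, so MSE = 1
    print(round(psnr(a, b), 2))
    print(bits_per_pixel(64.0))
```

For an 8 bits per pixel image, a compression ratio of 64 : 1 corresponds to 0.125 bits per pixel, which is the very low bit-rate regime discussed in this section.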
6. Conclusions
In this paper, a variable size wavelet subtree fractal coding scheme is proposed for still image compression, especially at very low bit-rates. Based
Fig. 9. Coding results for test image Pepper of size 512×512. (a) Original image, (b) JPEG, CR = 61.3 : 1 with PSNR = 24.58 dB, (c) encoded using Haar wavelets, CR = 64.19 : 1 with PSNR = 28.77 dB, (d) encoded using biorthogonal wavelets ‘B97’, CR = 67.15 : 1 with PSNR = 29.95 dB.
on the variable tree partition algorithm, a range subtree is adaptively segmented into regions of varying detail according to local complexity, resulting in a nearly optimal partition. The domain subtrees are constructed with some overlapping areas to enlarge the searching pool when the pruned subtrees are matched. The range subtrees and their children are approximated by the corresponding domain subtrees through fractal transformation. This wavelet subtree based fractal coding successfully exploits the correlation across scales by predicting the coefficients at finer scales from those at coarser scales. Experimental results show that the proposed hybrid image coding algorithm obtains a much better reconstructed image than JPEG at very low bit-rates, in terms of PSNR as well as subjective quality. In addition, its performance is slightly better than that of Davis’ iterative optimized partition based coding algorithm. The most important advantage of the proposed variable size wavelet subtree fractal coding scheme over Davis’ method is its low encoding complexity, because the new
variable tree partition algorithm is non-iterative and simple to implement.

Acknowledgements
The authors would like to express their gratitude to the anonymous referees for their many valuable comments and suggestions for improving the technical presentation of this paper.

References
[1] M. Antonini, M. Barlaud, P. Mathieu, I. Daubechies, Image coding using wavelet transform, IEEE Trans. Image Process. 1 (2) (April 1992) 205–220.
[2] M.F. Barnsley, L.P. Hurd, Fractal Image Compression, AK Peters, Wellesley, MA, 1993.
[3] J. Chen, S. Itoh, T. Hashimoto, Scalar quantization noise analysis and optimal bit allocation for wavelet pyramid image coding, IEICE Trans. Fundam. E76-A (9) (September 1993) 1502–1514.
[4] I. Daubechies, Orthonormal bases of compactly supported wavelets, Commun. Pure Appl. Math. 41 (1988) 909–996.
[5] G. Davis, Self-quantization of wavelet subtrees, Proc. SPIE Wavelet Applications II, Orlando, Vol. 2491, April 1995, pp. 141–152.
[6] G. Davis, Self-quantized wavelet subtrees: a wavelet-based theory for fractal image compression, in: J. Storer, M. Cohn (Eds.), Proc. DCC, March 1995, pp. 232–241.
[7] G. Davis, A wavelet-based analysis of fractal image compression, IEEE Trans. Image Process., accepted.
[8] Y. Fisher, Fractal image compression with quadtrees, in: Y. Fisher (Ed.), Fractal Compression: Theory and Application to Digital Images, Springer, New York, 1994.
[9] A. Jacquin, Fractal image coding based on a theory of iterated contractive image transformations, Proc. SPIE Visual Comm. and Image Proc., Vol. 1360, 1990, pp. 227–239.
[10] A. Jacquin, Image coding based on a fractal theory of iterated contractive image transformations, IEEE Trans. Image Process. 1 (1) (January 1992) 18–30.
[11] H. Krupnik, D. Malah, E. Karnin, Fractal representation of images via the discrete wavelet transform, in: IEEE 18th Conf. of EE in Israel, Tel-Aviv, March 1995.
[12] S.G. Mallat, A theory for multiresolution signal decomposition: The wavelet representation, IEEE Trans. Pattern Anal. Machine Intell. 11 (7) (July 1989) 674–693.
[13] W.B. Pennebaker, J.L. Mitchell, JPEG Still Image Data Compression Standard, Van Nostrand Reinhold, New York, 1993.
[14] R. Rinaldo, G. Calvagno, Image coding by block prediction of multiresolution subimages, IEEE Trans. Image Process. 4 (7) (July 1995) 909–920.
[15] J. Shapiro, Embedded image coding using zerotrees of wavelet coefficients, IEEE Trans. Signal Process. 41 (12) (December 1993) 3445–3462.
[16] I. Witten, R. Neal, J. Cleary, Arithmetic coding for data compression, Commun. ACM 30 (6) (1987) 520–540.
[17] J.W. Woods, S.D. O’Neil, Subband coding of images, IEEE Trans. Acoust. Speech, Signal Process. ASSP-34 (5) (October 1986) 1278–1288.