Pattern Recognition 32 (1999) 435—441
Lossless compression of pre-press images using a novel colour decorrelation technique Steven Van Assche*, Wilfried Philips, Ignace Lemahieu Department of Electronics and Information Systems, University of Ghent, St.-Pietersnieuwstraat 41, B-9000 Gent, Belgium Received 19 October 1997
Abstract Most existing lossless image compression schemes operate on grey-scale images. However, in the case of colour images, higher compression ratios can be achieved by exploiting inter-colour redundancies. A new lossless colour transform is presented, based on the KLT. This transform removes redundancies in the colour representation of each pixel and can be combined with many existing compression schemes. In this paper it is combined with a prediction scheme that exploits spatial redundancies. The results proposed in this paper show that the colour transform typically saves about a half to two bits per pixel, compared to a purely predictive scheme. 1999 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved. Keywords: Lossless image compression; Colour decorrelation
1. Introduction Nowadays, in pre-press companies, documents (e.g. magazines, brochures and advertising posters) are usually composed electronically. These documents contain images with high spatial resolution (typically, 2000; 2000 pixels) and high colour resolution (24 bit per pixel for RGB-images and 32 bit per pixel for CMYK-images, which are more commonly used). Hence, such images occupy considerable amounts of storage space (typically, 16 Mbyte/image), and also pose transmission problems (e.g. transmitting images from a pre-press house to a printing company over ISDN-lines takes 15—30 min). Clearly, data compression can significantly alleviate these problems. Because of the high-quality requirements imposed by the customers, the pre-press industry is reluctant to adopt compression techniques (i.e. they do not
* Corresponding author. E-mail:
[email protected].
accept any degradation in image quality) and only lossless compression is acceptable (at least today). As far as lossless compression is concerned, many techniques are available, at least for grey-scale images. Some of the current state-of-the-art lossless compression techniques for contone grey-scale images are lossless JPEG [1] (the current standard), BTPC [2] (a binary piramid coder), FELICS [3] (a ‘‘fast efficient lossless image coder’’), the S#P-transform [4—6] (an integer wavelet coder) and CALIC [7] (a highly optimized technique with a complex prediction scheme). According to our experiments [8], CALIC yields the highest lossless compression ratios (of the order of 2), but it is also the slowest of the techniques mentioned. In the case of colour images, the above techniques usually process each of the colour components independently, i.e. the gray-scale images obtained after colour separation are encoded separately. In this case, the Kcomponent of a CMYK-image usually compresses better than the other components (factor of 3-4 instead
0031-3203/99/$ — See front matter 1999 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved. PII: S 0 0 3 1 - 3 2 0 3 ( 9 8 ) 0 0 0 3 4 - X
436
S.V. Assche et al. / Pattern Recognition 32 (1999) 435—441
of 2), but the mean compression ratio for all components remains of the order of 2. It is clear that higher compression ratios can be achieved by exploiting intercomponent redundancies. For instance, it is well known that for CMYK-images the K-component is highly correlated with the three other components. Nowadays, only few proposed techniques exploit such colour redundancies. Two noteworthy exceptions operate on RGB colour images [9,10]. Both techniques first compress the red component and then use information derived from the red component to predict the green component; subsequently, the green component is used in the same manner to predict the blue component. Unfortunately, this approach provides only a slight improvement in compression ratio (about 5%). This paper proposes a new technique for exploiting inter-component redundancies. The technique is based on a modified Karhunen-Loe`ve transform (KLT) scheme in combination with a novel quantization scheme that guarantees losslessness. The KLT decorrelates the colour components. It is recomputed on a block-by-block basis and is therefore spatially adaptive. It can be combined with several spatial prediction techniques. In this paper we investigate the combinations with a LJPG-predictor and with a CALIC-predictor. Section 2 describes how the modified KLT is able to achieve both a lossless transform of the image components and a good compression ratio, which the usual KLT, and in fact all floating-point transforms, are incapable of achieving (when they achieve losslessness, the compression ratio drops to about 1). Section 3 gives an overview of the complete compression scheme. Section 4 presents experimental results on many pre-press images. These results show that an increase of more than 10% in compression ratio is possible over pure gray-scale techniques.
2. Lossless transform coding In lossy transform coding a data vector d, consisting of sample values of a signal or image, is multiplied by an orthogonal transform matrix ¹, which results in a coefficient vector a"¹d. In the following, we assume that the components of d are integers. A ‘‘good’’ transform yields coefficients a which are very little correlated. This G is important because the rate-distortion theory predicts that the rate-distortion bound can be reached by coding each of the coefficients independently of the others, which greatly simplifies the coding problem. In practical transform coders, the coefficients a are quantized, i.e. conH verted into integers a , which are then entropy coded. In H the simplest case (uniform quantization), a "[a /*], H H where [x] denotes the integer nearest to x. In lossy compression, * is used to control the compression ratio (e.g. increasing * causes a decrease in bit rate at the
expense of larger coding errors). Decompression involves transforming the quantized coefficients into an approximation d"[*¹Ra] of the original data vector d. In principle, all transform coding techniques can be made lossless by sufficiently reducing the quantization step *. Indeed, let N be the dimension of d. It is easily shown that d"d for all possible data vectors d when *)1/(N; this is because #d!d#"#*a!a# and because "*a!a "(*/2, which implies that "d!d ") H H H H #d!d#((N*/2(. It can also be shown that, in general, a transform coding technique cannot be lossless when *'1/(N. Unfortunately, the compression ratio that corresponds to the minimum value of * is usually lower than 1, which means that lossless transform coding is useless in practice, at least in the manner described. In principle, transform coding techniques can also be made lossless by using higher values of * and coding the residual error. However, this approach does not work either, because the residual image is noise-like and cannot be compressed well. We now present a novel technique for turning a lossy transform coding scheme into a successful lossless scheme. The procedure is based on the fact that any matrix ¹ (even a non-orthogonal one) can be well approximated by two prediction operators P [.] and P [.] which map integers into integers, fol lowed by a floating point scaling step: a"¹d+DP [P [d]], where D is a diagonal matrix, e.g. for N"3:
D D" 0 0
0 D 0
(1)
0 0 . D
An important property of the prediction operators P [.] and P [.] is that they are invertible when applied to integer vectors. Now, consider the transform a"P[d], where P[d]"P [P [d]]. This operator maps integer data vectors d into integer coefficient vectors a and is lossless. Furthermore, in view of equation (1), a+D\a. Therefore, the statistical properties of the coefficients a are similar to those of the corresponding H coefficients a in an ‘‘ordinary’’ transform scheme, but H with a quantization step * "1/D , which differs from H H coefficient to coefficient. Note that simply quantizing the coefficients a using the quantization steps * , does not H H produce quantized coefficients a from which the integer H data vector d can be recovered error-freely. The experimental results in Section 4 clearly demonstrate that the lossless transform that we have just defined effectively compresses real data. In the following, we present some qualitative arguments to show why this is so. In lossy transform coding it is usually assumed that the coefficients a are Gaussian variables with zero mean. H A crude approximation for the entropy of the quantized
S.V. Assche et al. / Pattern Recognition 32 (1999) 435—441
437
Fig. 1. The encoder of the proposed scheme.
coefficient a is then given by H"0.5#log ((2np )! H H H log (* ), where * "D\. Therefore, the total number of H H H bits required to entropy-code a equals H"H ! log (%, * ), where H is the part of H which is indepen G H dent of the quantization steps * . Similarly, for the ‘‘ordiH nary’’ lossy transform coding scheme, the corresponding bit rate is H"H !N log *. As ¹ is an orthogonal matrix, "det(¹)""1 which implies that "%, D "" G H "det(¹)""1. Therefore, H"H . On the other hand, if *"1/(N (which is required for lossless coding), then H"H #N log (N)/2. This shows that, at least under these very crude assumptions, the proposed new lossless scheme saves about log (N)/2 bits per transform coefficient.
3. An overview of the lossless compression scheme Fig. 1 displays an overview of the proposed coder and Fig. 2 of the corresponding decoder. In the following paragraphs, we briefly describe the different steps of the coding process. We will not discuss the decoder in detail because it is basically the inverse of the coder. In the proposed scheme, spatial redundancies are removed using a spatial prediction scheme, while colour redundancy is removed using the lossless KLT (the KLT may be applied before or after the prediction step). Finally, the resulting coefficients are presented to an arithmetic coder. 3.1. The lossless Karhunen—Loe% ve transform The Karhunen—Loe[ ve transform [11] is defined as the transform which decorrelates the components of a random vector. In classical compression schemes decorrelation is usually applied in the spatial domain and is usually replaced by the DCT which is close to optimal, in practice. In any case, the KLT can be well approximated by a lossless operator, as explained in the previous section. We will refer to this operator as the lossless KLT.
Fig. 2. The decoder of the proposed scheme.
In the proposed scheme the lossless KLT is applied in the colour domain, i.e. to transform the colour values of a pixel into decorrelated numbers. In fact, the KLT is computed for a block of pixels and is therefore block dependent. This is to exploit the fact that the colour statistics in an image can vary strongly from region to region. As the lossless KLT for a block depends on the colour statistics in that block, which are unknown to the receiver, side information must be transmitted so that the receiver can reconstruct the lossless KLT itself. For this purpose, we use the following scheme: we start by computing the orthogonal KLT matrix ¹ for a given block; next, we expand ¹ as the product of rotation matrices, each of which is completely specified by one rotation angle. These rotation angles are quantized to 8 bit, and are used to construct an (exactly orthogonal) approximation ¹ of ¹. The lossless KLT is then derived from ¹ (instead of from ¹). The quantized rotation angles are sent to the receiver which reconstructs the lossless KLT from them. The involved overhead is negligible in practice because each rotation angle requires only 8 bit and because the number of rotation angles is small (i.e. 3 for N"3 and 6 for N"4).
The KLT implementation we used is part of the Meschachlibrary, which can be found at http: //www.netlib.org/c/ meschach/.
438
S.V. Assche et al. / Pattern Recognition 32 (1999) 435—441
3.2. Spatial prediction and arithmetic coding In many images spatial redundancies are much higher than inter-colour redundancies. In any case, spatial redundancies must be removed in order to maximize the compression ratio. In principle, any spatial decorrelation scheme may be combined with the above lossless KLT colour transform to further remove (spatial) redundancies. Currently, we have implemented two spatial predictors. The first predictor is a simple linear predictor, i.e. predictor 7 as described in the lossless JPEG standard [1]. If I and I are the intensities of the pixel above and ? J the pixel to the left of the pixel to be coded, then the prediction is W(I #I )/2X, where W x X denotes the largest ? J integer not greater than x. The second predictor is the very sophisticated predictor incorporated in the CALIC [7] scheme. Prediction is performed in two steps. First, a gradientadjusted prediction IQ for the pixel to be coded I is calculated. Depending on the local gradients in pixel intensities, an appropriate predictor is selected. This way, the sharpness and the direction of edges are detected and taken into account. In the second prediction step context modeling is used to make an even better prediction I® . In the context spatial texture patterns and the energy of past prediction errors are included. This two-step prediction scheme is able to greatly reduce spatial redundancies. Note that the CALIC-predictor does not incorporate all features of the full CALIC technique. More specifically, the predictor does not include the binary mode and the arithmetic coder. CALIC normally operates in a continuous-tone mode (in which the predictor I® is in use), but in uniform image areas CALIC triggers its binary mode, which yields higher compression efficiency because of its small symbol set. This feature greatly enhances coding performance on binary-type images, text-image combinations, and images with large uniform areas. We opted for only implementing the predictor and not the full CALIC scheme because we are mainly interested in evaluating the performance of our colour decorrelation technique on pure continuous-tone images and because of implementational choices (we opted for a clear separation between predictor and KLT). The fact that we did not include the arithmetic coder (mainly due to lack of time) is not really a drawback either, because the entropy figures we compute instead adequately approximate real arithmetic encoding figures and because the additional context-modeling included in the arithmetic coder cannot be used in our scheme any way. The context is based on information from the original image and is useless after the transform.
Both the specification and an executable were downloaded from ftp://ftp.csd.uwo.ca/pub/from —wu/.
In the following, we will refer to the original CALIC implementation from Wu and Memon simply by ‘‘CALIC’’, while our implementation of the CALIC-predictor (without binary mode and arithmetic coder) by CALICP. The combination of this predictor and the lossless KLT will thus be CALICP#KLT. The data which remains after the (spatial and colour) decorrelation is very non-uniformly distributed. Therefore, it should be entropy-coded. In principle, both Huffman and arithmetic coders can be used. However, as arithmetic coders can easily adapt to varying data statistics and therefore usually produce higher data compressions, we have opted for an arithmetic coder in the proposed scheme. As mentioned before, we did not implement an arithmetic coder for the CALIC#KLT scheme. For the LJPG#KLT scheme we used the ‘‘bitplus-follow’’ version of the coder described in Reference 12. In the current implementation of our codec, coder statistics are reinitialized in each block.
4. Experimental results We have implemented the proposed colour decorrelation scheme and tested it on several (mainly pre-press) RGB- and CMYK-images. In each case we reconstructed the image and verified the losslessness of the proposed scheme. In this section, we first describe the results of some preliminary experiments aimed at optimizing the proposed scheme. Next, we present compression results obtained on a large set of images. In the first experiment we evaluated the colour decorrelation technique by itself (i.e. not in combination with a spatial decorrelation technique) on the CMYK image ‘‘musicians’’, which is representative for pre-press applications. When applying only the KLT a bit rate of 19.86 bpp (bit per pixel) is achieved. It is interesting to compare this result to bit rates obtained with purely grey-scale techniques, which compress the colour separations independently. With the LJPG-based technique (which operates on the same blocks as the KLT-only technique) a bit rate of 19.10 bpp is obtained, while CALICP (which operates on the whole image rather than on blocks) achieves a much lower bit rate of 17.92 bpp. These results show that the KLT effectively compresses the image data, but they also demonstrate that spatial correlations are more important than inter-colour dependencies. In the second experiment, we investigated the optimal order in which to combine the KLT inter-colour redundancy removal and the spatial prediction (we only tested
The coder can be found at ftp://ftp.cpsc.ucalgary.ca/pub/ projects/ar.cod/.
S.V. Assche et al. / Pattern Recognition 32 (1999) 435—441
439
Fig. 3. Influence of the block size on the compression of ‘‘musicians’’
Table 1 Bit rate (bpp) for some typical continuous-tone pre-press colour images
Musicians Scid0 Scid3 Woman* Bike* Cafe* Water* Cats*
Number of pixels
Orig.
LJPG only
1853;2103 2048;2560 2048;2560 2048;2560 2048;2056 2048;2056 3072;2048 3072;2048
32 32 32 32 32 32 24 24
19.10 16.00 17.02 17.78 16.67 22.38 5.56 8.24
LJPG and KLT
CALICP and KLT
CALICP only
CALIC
FELICS
S#P
16.67 14.95 15.84 16.33 15.31 21.48 5.14 7.66
16.08 14.60 15.12 15.48 14.41 19.67 5.03 7.40
17.92 15.86 15.99 18.09 15.42 21.38 7.21 10.42
16.58 14.16 13.85 16.16 14.04 18.82 5.21 7.54
18.56 16.15 16.16 18.30 15.86 21.80 7.08 9.90
17.28 15.69 16.13 17.06 15.53 21.11 7.07 9.23
* Found at ft://ipl.rpi.edu/pub/image2/jpeg—cont—tone/.
LJPG-prediction in this preliminary experiment). When the KLT is applied first, followed by LJPG-prediction, the bit rate is 17.84 bpp. However, higher gains are obtained when applying the lossless KLT step after the spatial prediction step, as a kind of error modeling. In this case a bit rate of 16.67 bpp is achieved, which is a 15% improvement compared to the purely spatial scheme. In the following we always performed the lossless KLT step after the spatial prediction step. In the third experiment, the influence of the block size is investigated for the same image, see Fig. 3. As expected, for the purely predictive technique (LJPG-based) large blocks yield a better compression, because the arithmetic coder is then able to build up more reliable statistics. On the other hand, the results show that the colour decorrelation is more efficient at smaller block sizes. For ‘‘musicians’’, the optimal compromise is a block size of 30;30. Our experiments have shown that the optimal
block size strongly depends on the image (optimal block sizes vary from 10;10 to 100;100); however, the bit rate does not vary rapidly as a function of the block size. In conclusion, the preliminary experiments above show that it is best to apply the KLT after prediction and to use a block size of 10;10 till 100;100, depending on the image. In the following evaluation, we compressed the images with their optimal block size. Tables 1 and 2 show compression results for some 32 and 24 bpp colour images, including the well-known ‘‘lena’’ and ‘‘peppers’’ images. In each case, the bit rate is listed for the LJPG and LJPG#KLT schemes, for the CALICP and CALICP#KLT schemes, and for the purely grey-scale schemes CALIC, FELICS [3] and
FELICS is part of the Managing Gigabytes-suite, which can be found at ftp://128.250.1.21/pub/mg/.
440
S.V. Assche et al. / Pattern Recognition 32 (1999) 435—441
Table 2 Bit rate (bpp) for some less typical colour images
Koekjes1 koekjes2 ToolsR Timep—xfld Cmpnd2* Graphic** Chart—s* Lena Peppers
Number of pixels
Orig.
LJPG only
827;591 839;638 1524;1200 339;432 1024;1400 2644;3046 1688;2347 512;512 512;512
32 32 32 32 24 24 24 24 24
18.71 17.88 23.36 18.71 5.66 8.35 11.30 15.00 15.79
LJPG and KLT
CALICP and KLT
CALICP only
CALIC
FELICS
S#P
18.08 17.49 23.53 19.05 5.59 7.88 10.97 14.20 15.48
17.00 16.62 22.09 18.15 6.23 7.20 9.59 13.45 14.69
18.24 17.82 21.68 19.28 7.82 7.19 10.51 13.72 14.51
16.00 15.76 19.75 16.58 3.72 6.78 7.99 13.19 13.87
18.05 17.58 21.66 18.64 7.19 8.55 10.31 14.52 15.43
17.57 17.13 21.77 18.59 8.75 7.64 10.45 13.70 14.90
* Found at ftp://ipl.rpi.edu/pub/image2/jpeg—cont—tone/. R Graphics is an image represented in the l*a*b* colour-space.
S#P-transform. [4—6] The FELICS technique is a fast image compressor, which predicts the current pixel’s value I based upon the values of the two nearest already known pixels I and I ; the smallest one being denoted as ? J ¸ and the largest one as H. Depending on the position of I in respect to the [¸, H] interval, an adaptive binary coding is used (if I3[¸, H]) or a fast exponential Rice coding (if I,[¸, H]). The S#P-transform (scaling#prediction) performs a kind of subband-decomposition and can be seen as a reversible (i.e. lossless) wavelet transform. At each decomposition step, the S-transform converts the latest calculated low-resolution image into four new images which are the combinations of high- and low-resolution versions in the horizontal and vertical directions. Then the P-transform (prediction) is performed on the highpass versions. In our S#P-transform implementation entropy-coding is performed by an arithmetic coder. Table 1 shows results for some typical continuous-tone pre-press images for which our KLT technique was intended, while Table 2 lists results for less typical colour images, mainly images containing line-art or text (on which the KLT is expected to perform rather poorly). The results in Tables 1 and 2 show that the KLT-step leads to a considerable decrease in bit rate (0.5—2 bpp) on most images compared with the purely spatial techniques LJPG and CALICP, even on the non-typical images. As expected, our CALICP#KLT scheme outperforms the LJPG#KLT scheme, but compared to the original CALIC itself the compression gains on the typical images are quite moderate ((0.5 bpp) and on the non-typical images, our method performs worse than CALIC. As
Downloaded from http://ips.rpi.edu/.
mentioned before some features of the CALIC specification are not implemented in our technique, explaining some of the difference. Especially, the images which contain line-art and text (such as ‘‘cmpnd2’’, ‘‘graphic’’ and ‘‘chart—s’’) are compressed better by CALIC because of its binary mode. Compared to FELICS and the S#Ptransform, our technique yields better results, except for the image ‘‘tools’’, which also poses problems to other compression algorithms (e.g. see the bit rate for CALIC).
5. Conclusion The present paper has introduced a new lossless colour transform, based on the KLT. The transform removes redundancies in the colour representation of each pixel but not inter-pixel redundancies and is therefore combined with a prediction scheme that exploits spatial redundancies. The proposed scheme is tested on several images and the results show that it typically saves about a half to two bits per pixel, compared to a purely predictive scheme.
Acknowledgements This work was financially supported by the Belgian National Fund for Scientific Research (NFWO) through a mandate of ‘‘postdoctoral research fellow’’ and through the projects 39.0051.93 and 31.5831.95, and by the Flemish Institute for the Advancement of ScientificTechnological Research in Industry (IWT) through the projects Tele-Visie (IWT 950202) and Samset (IWT 950204).
S.V. Assche et al. / Pattern Recognition 32 (1999) 435—441
References [1] T.I. Telegraph, T.C.C. (CCITT), eds, Digital Compression and Coding of Continuous-Tone Still Images, Recommendation T.81, 1992. [2] J.A. Robinson, Efficient general-purpose image compression with binary tree predictive coding, IEEE Trans. Image Process. 6 (1997) 601—608. [3] P.G. Howard, The Design and Analysis of Efficient Lossless Data Compression Systems, Ph.D. Thesis, Department of Computer Science, Brown University, Providence, Rhode Island, June 1993. [4] A. Said, W.A. Pearlman, An Image Multiresolution Representation for Lossless and Lossy Compression, SPIE Sympo. on Visual Communications and Image Processing, Cambridge, MA, November 1993. [5] A. Said, W.A. Pearlman, Image Compression via Multiresolution Representation and Predictive Coding, Visual Communications and Image Processing, no. 2094, in SPIE, November 1993, pp. 664—674. [6] S. Dewitte, J. Cornelius, Lossless Integer Wavelet Transform, IEEE Signal Processing Letters. 4 (1997), 158—160. [7] X. Wu, N. Memon, CALIC — a context based adaptive lossless image codec, IEEE Int. Conf. on Acoustics,
[8]
[9] [10]
[11] [12] [13]
441
Speech, & Signal Pocessing, Vol. 4, May 1996, pp. 1890—1893. K. Denecker, J. Van Overloop, I. Lemahieu, An Experimental Comparison of several Lossless Image Coders for Medical Images, Proc. Data Compression Industry Workshop, S. Hagerty and R. Renner, eds., (Snowbird, Utah, U.S.A.), pp. 67—76. Ball Aerospace & Technologies Corp. (March 1997). N.D. Memon, K. Sayood, Lossless Compression of RGB Color Images, Opt. Engng, 34(6) (1995), pp. 1711—1717. K. Denecker, I. Lemahieu, Lossless Colour Image Compression using Inter-colour Error Prediction, Proc. PRORISC IEEE Benelux Workshop on Circuits, Systems and Signal Processing, J.-P., Veen, ed., Mierlo, Nederland, pp. 95—100. STW Technology Foundation, November 1996. W.K. Pratt, Digital Image Processing, 2nd edn. WileyInterscience, New York, 1991. P.G. Howard, J.S. Vitter, Arithmetic Coding for Data Compression, Proc. IEEE 82 (1994) 857—865. W. Philips, K. Denecker, A New Embedded Lossless/ quasi-lossless Image Coder Based on the Hadamard Transform, Proc. IEEE Int. Conf. on Image Processing (ICIP97). 1 (1997) pp. 667—670. Santa Barbara, California, USA.
About the Author—STEVEN VAN ASSCHE was born in Gent, Belgium, on 26 February 1973. In 1996 he received a Degree in electrical engineering from the University of Gent, Belgium. Since September 1996 he is working as a research assistant at the Department of Electronics and Information Systems (ELIS) at the same university. His main research topics are lossless compression of halftone and colour images.