Signal Processing 65 (1998) 391—401
An adaptive decimation and interpolation scheme for low complexity image compression P.W.M. Tsang*, W.T. Lee Department of Electronic Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon Tong, Hong Kong Received 8 August 1996; revised 30 October 1997
Abstract

In this paper, a low complexity image compression scheme based on adaptive decimation is reported. Reconstruction of the encoded image is performed with a two-dimensional adaptive interpolation algorithm which is capable of maintaining a reasonable coding fidelity, as well as a smooth and natural visual effect for the object contours contained in the picture. The method involves only a small amount of computation and can be implemented to operate in real time with simple hardware and a small amount of memory storage. Results obtained by applying the proposed scheme to encode images at low bit-rates of around 0.3 bpp are presented to demonstrate the feasibility of the approach and its potential in practical applications. © 1998 Elsevier Science B.V. All rights reserved.

Keywords: Adaptive interpolative vector quantization; Adaptive decimation and interpolation; Blocking effect
* Corresponding author. Tel.: (852) 2788 7763; fax: (852) 2788 7791; e-mail: [email protected].
0165-1684/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S0165-1684(97)00235-1
1. Introduction

In recent years, image compression techniques for transmitting and storing massive amounts of video data have become increasingly popular in multimedia, teleconferencing and virtual reality applications. The heavy demand has led to the development of a wide variety of compression algorithms that trade off price against quality to suit different constraints and markets. Amongst existing schemes, the JPEG and MPEG standards dominate the video broadcasting industry as well as advanced video recording and playback systems. Although hardware solutions are available for implementing these standards, the technology as a whole is often too expensive to be incorporated into low cost products that are mainly targeted at the vast domestic market. This has created a niche for less sophisticated schemes, such as Vector Quantization (VQ) [3,6] and Block Truncation Coding (BTC) [13], which provide image encoding of lower quality at a more competitive price. Research in VQ, in particular, has received much attention in the past decade on account of the simplicity of the decoders, which are basically constructed with lookup tables (codebooks). However, the encoders are complicated and the quality of the compressed images depends on the size and contents of the codebooks. Besides, images compressed with VQ (as well as BTC) are sometimes blocky in appearance, with noticeable discontinuities in edge contours, although some of these problems have been alleviated by more advanced algorithms such as Predictive Vector Quantization [2,5,7,9]. In fact, this blocking effect is common to many encoding schemes that operate on non-overlapping, rectangular partitions of the source picture. The continuity of edges and shading across the boundary of two adjacent partitions is difficult to preserve in the decoded image, which therefore looks as if it were constructed from rectangular tiles arranged in a uniform grid.

The blocking effect has been overcome with multirate Wavelet and Sub-band Coding (SBC) [1] techniques, which use special filters to decompose the entire source image into a set of orthogonal components (or sub-images). Compression is achieved by discarding or suppressing the less relevant components and retaining the essential information (which usually makes up a small proportion of the set of sub-images) to reconstruct the decoded image. As with the JPEG and MPEG standards, however, the complexity and memory storage involved in multirate techniques are too high for low cost products.

A compromise between cost and quality in image compression was made possible by Interpolative Vector Quantization (IVQ) [4], which provides satisfactory coding fidelity at 1 bpp. Along this direction, Adaptive Interpolative Vector Quantization (AIVQ) [8,10,11] was later developed to improve the coding fidelity at reduced bit-rate. In AIVQ, a one-pass subsampling process is employed to encode the intensity profiles of pictures along the horizontal direction. Images recovered by interpolating the subsampled points exhibit good visual quality and coding fidelity. The method, however, is not applicable in the vertical direction, and fixed decimation is therefore performed along this dimension to further reduce the overall bit-rate. The approximated image decoded by interpolating the subsampled rows of pixels is of poor quality, with prominent jagged edges, so VQ is employed to encode the residual (error) image separately. In the decoder, the residual is reconstructed and added back to the interpolated image to compensate for some of the losses. The use of VQ naturally brings in its inherent disadvantages and restrictions, as well as a considerable increase in the bit-rate.

Human perception has been found to be more sensitive to the integrity and smoothness of edge contours than to how accurately they match a reference image on a quantitative basis.
As a consequence, further bit-rate reduction has been made possible in recent years by removing information in a picture that is of less relevance to our visual system. On this basis, and in view of the shortcomings of existing VQ techniques, a low complexity image compression scheme known as the Adaptive Boundary Search (ABS) algorithm has been developed and is reported in this paper. The method is based on [8,10,11], but instead of employing VQ to encode the residual component for compensating the error in the approximated image, a prediction algorithm is developed to synthesize the edge contours between two subsampled rows of pixels. The encoded images are pleasant in appearance with acceptable coding fidelity.

The paper is organized as follows. Section 2 describes the adaptive decimation and interpolation process in AIVQ. The proposed scheme and experimental results are presented in Sections 3 and 4, respectively. Section 5 concludes with a summary of the essential findings.
2. Adaptive decimation and interpolation (ADI)

The adaptive decimator and interpolator reported in [8,10-12] is shown in Fig. 1. The source picture $X(x,y)$ is first subsampled in the vertical direction by a factor of $K$ (i.e., only one out of every $K$ rows is retained) to form an image $\bar{X}(x,y)$. A horizontal adaptive decimator is then applied to extract, on each row of pixels in $\bar{X}(x,y)$, the sampling points which would induce considerable aliasing errors if they were excluded from the interpolation process. Prior to that, it is necessary to reject small noisy fluctuations which, if included, would result in superfluous samples and an unnecessary increase in the bit-rate. For this purpose, the intensity curve is first filtered with a lowpass Gaussian function $G$ to give a smoothed image $\bar{X}_S(x,y)$.

Suppose $I_g = \{i_1, i_2, \ldots, i_N\}$ is the intensity profile of the $g$th row of pixels in $\bar{X}_S(x,y)$. The task of the adaptive decimator is to determine the smallest set $P_g = \{p_1, p_2, \ldots, p_{M(g)}\}$ such that, when the smoothed section between any two adjacent sampling points is approximated by a straight line, the overall error is limited to within a fixed threshold. Mathematically, let $S(p_j, p_{j+1})$ and $L(p_j, p_{j+1})$ be the curve segment and the straight line bounded by two adjacent sampling points with intensities $i_n$ and $i_m$. The maximum distance (or error) between them is given by

$$\mathrm{dist}(m,n) = \max_{n \le k \le m} \left\{ \mathrm{abs}\left[ i_k - i_n - \frac{i_m - i_n}{m - n}\,(k - n) \right] \right\}, \qquad 0 < j < M(g), \tag{1}$$

where $n = p_j$, $m = p_{j+1}$, and $M(g)$ is the total number of sampling points in $I_g$. The set $P_g$ should satisfy the criterion

$$\mathrm{dist}(p_{j+1}, p_j) < T \quad \text{for } 0 < j < M(g), \tag{C1}$$

where $T$ is the fixed threshold. The start point in $I_g$ is always included as the first sampling point $p_1$. The next element $p_2$ is the nearest point from $p_1$ which does not satisfy the above criterion. The process is repeated to determine the entire sequence in $P_g$. The collection of the sampling point sets for every row in $\bar{X}_S(x,y)$ forms the encoded data $Y$.

In the interpolation process, each row of intensity profile encoded in $Y$ is reconstructed by joining adjacent sampling points with straight lines, giving an image $\tilde{X}(x,y)$. $\tilde{X}(x,y)$ is then upsampled to construct the approximated image $\hat{X}(x,y)$ using vertical bilinear interpolation. In AIVQ, the difference between $X(x,y)$ and $\hat{X}(x,y)$ is encoded with VQ and transmitted to the decoder to compensate for the errors in the approximated image.

Fig. 1. Adaptive decimation and interpolation (ADI) in AIVQ.
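As an illustration of Eq. (1) and criterion (C1), the horizontal decimation of one smoothed row and its straight-line reconstruction might be sketched as below. This is a minimal sketch rather than the implementation of [8,10-12]: the function names are hypothetical, positions are 0-based, the Gaussian pre-filter is assumed to have been applied already, and two choices are assumptions of the sketch (each new sampling point is placed one pixel before the first point that violates (C1), and the last pixel of the row is always kept).

```python
from typing import List, Sequence


def chord_error(row: Sequence[float], n: int, m: int) -> float:
    """Eq. (1): largest deviation of row[n..m] from the straight line n -> m."""
    slope = (row[m] - row[n]) / (m - n)
    return max(abs(row[k] - row[n] - slope * (k - n)) for k in range(n, m + 1))


def adaptive_decimate(row: Sequence[float], threshold: float) -> List[int]:
    """One-pass choice of sampling positions (0-based) for a smoothed row."""
    if not row:
        return []
    points = [0]                       # the start point is always kept
    n = 0
    for m in range(2, len(row)):       # neighbouring pixels always give zero error
        if chord_error(row, n, m) >= threshold:
            n = m - 1                  # assumption: back off one pixel so (C1) holds
            points.append(n)
    if points[-1] != len(row) - 1:
        points.append(len(row) - 1)    # assumption: the end point is also kept
    return points


def reconstruct_row(points: Sequence[int], values: Sequence[float],
                    length: int) -> List[float]:
    """Decoder side: join adjacent sampling points with straight lines."""
    out = [0.0] * length
    for idx in range(len(points) - 1):
        a, b = points[idx], points[idx + 1]
        va, vb = values[idx], values[idx + 1]
        for k in range(a, b + 1):
            out[k] = va + (vb - va) * (k - a) / (b - a)
    if len(points) == 1:
        out[points[0]] = float(values[0])
    return out


row = [0, 0, 0, 0, 10, 20, 30, 40, 40, 40]
pts = adaptive_decimate(row, threshold=2.0)
rebuilt = reconstruct_row(pts, [row[p] for p in pts], len(row))
print(pts, rebuilt)
```

For this piecewise-linear test row the sampling points fall on the two corners of the ramp, so the reconstruction is exact; on real image rows the threshold $T$ controls how many points are kept and hence the bit-rate.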
3. Adaptive boundary search (ABS) algorithm

In AIVQ, the major error in the approximated image $\hat{X}(x,y)$ is attributed to the irrecoverable loss of information in vertical subsampling. The error not only decreases the coding fidelity in terms of PSNR, but also imposes a prominent degradation on the edge contours. Although the use of a VQ-encoded residual can partially eliminate some of these defects, it also increases the bit-rate as well as the complexity of the encoder. The ABS algorithm offers an alternative solution that predicts the missing edge information, and hence provides a better visual effect and a reduction in the coding error. The method is described as follows.

Suppose $I_g$ and $I_{g+K}$ are two consecutive rows of pixels in $\tilde{X}_S(x,y)$, and $P_g$ is the corresponding sampling point set of $I_g$. Since each element $p_j$, $0 < j < M(g)$, in $P_g$ reflects a discontinuity in the intensity curve, it is highly likely that there exists an edge that begins at $p_j$ and terminates at a pixel $b_j$ in $I_{g+K}$, as shown in Fig. 2. The sequence of ordered pairs $E = [(p_j, b_j)]$, $0 < j < M(g)$, forms a collection of matched point sets which define the edge contours between $I_g$ and $I_{g+K}$. The degree of matching between the $m$th pixel in $I_{g+K}$ and a point $p_j$ in $P_g$ is determined with the difference measure

$$\mathrm{id}(p_j, m) = \frac{1}{w} \sum_{k=-w/2}^{w/2} \left| i(g, p_j + k) - i(g + K, m + k) \right|, \tag{2}$$

where $i(r,s)$ is the intensity of the $s$th pixel at the $r$th row of the image $\bar{X}_S(x,y)$, and $w$ is a window of about 4 pixels wide. The sequence $E$ is constructed with the procedure listed in Table 1.

Fig. 2. Predicted edge segments between two rows of pixels.

Table 1. The adaptive boundary search (ABS) algorithm

| Step | Phase | Operation |
|---|---|---|
| 1 | Forward search | Initially, define $b_j = p_j$ for $0 < j < M(g)$. Set count = 0. |
| 2 | Forward search | Set $j = 1$, fscore = bscore = 0. |
| 3 | Forward search | $j = j + 1$. |
| 4 | Forward search | Apply Eq. (2) to compute the intensity difference id between $p_j$ and each of the pixels in $I_{g+K}$ within the interval bounded by $b_{j-1}$ and $b_{j+1}$, and find the location $c_j$ of the pixel that exhibits the closest match with $p_j$. |
| 5 | Forward search | Set $b_j = c_j$, fscore = fscore + $\mathrm{id}(p_j, b_j)$. |
| 6 | Forward search | Repeat steps 3 to 6 until $j = M(g)$. |
| 7 | Backward search | $j = j - 1$. |
| 8 | Backward search | Repeat step 4. |
| 9 | Backward search | Set $b_j = c_j$, bscore = bscore + $\mathrm{id}(p_j, b_j)$. |
| 10 | Backward search | Repeat steps 7 to 10 until $j = 0$. |
| 11 | Backward search | count = count + 1. |
| 12 | Backward search | The process terminates if count > 4, or if the difference between fscore and bscore is less than 10%. Otherwise repeat steps 2 through 12. |

The encoder and decoder of the compression scheme are depicted in Fig. 3(a,b). Comparing with Fig. 1, it can be seen that the encoder is similar to that in AIVQ, but the use of VQ in residual compensation has been discarded. In the decoder, the ABS edge predictor is first applied to estimate the edge points between the subsampled image lines, and bilinear interpolation is then performed to reconstruct the pixel intensities in the following manner.

Fig. 3. (a) Encoder based on adaptive decimator. (b) Decoder formed by adaptive interpolation and ABS.

Suppose $i(g,n)$ and $i(g+K,n)$ are the $n$th pixels in two consecutive subsampled lines $I_g$ and $I_{g+K}$, respectively. If $i(g+j,n)$ is not an edge point for $0 < j < K$, its value is determined by

$$i(g+j, n) = \frac{i(g,n) \times (K - j) + i(g+K, n) \times j}{K}. \tag{3}$$

Otherwise, if an edge point is located at the $L$th row by the ABS algorithm, the interpolation process is split into two parts as given by

$$i(g+j, n) = \frac{i(g,n) \times (L - j) + i(g+L, n) \times j}{L} \quad \text{for } 0 < j < L, \tag{4a}$$

and

$$i(g+j, n) = \frac{i(g+L, n) \times (K - j) + i(g+K, n) \times (j - L)}{K - L} \quad \text{for } L < j < K. \tag{4b}$$
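The matching measure of Eq. (2) and the forward and backward passes of Table 1 might be sketched as follows. This is only a sketch under stated assumptions, not the authors' implementation: the names `intensity_diff` and `boundary_search` are hypothetical, positions are 0-based, window indices are clamped at the row borders, the two ends of the row bound the search interval of the first and last points, ties are broken in favour of the pixel nearest to $p_j$, and "less than 10%" is interpreted relative to the larger of the two scores.

```python
from typing import List, Sequence, Tuple


def intensity_diff(upper: Sequence[float], lower: Sequence[float],
                   p: int, m: int, w: int = 4) -> float:
    """Eq. (2): windowed absolute difference between pixel p (row g) and m (row g+K)."""
    half = w // 2
    total = 0.0
    for k in range(-half, half + 1):
        u = upper[min(max(p + k, 0), len(upper) - 1)]   # assumption: clamp at the borders
        v = lower[min(max(m + k, 0), len(lower) - 1)]
        total += abs(u - v)
    return total / w


def boundary_search(upper: Sequence[float], lower: Sequence[float],
                    points: Sequence[int], w: int = 4,
                    max_passes: int = 5, tol: float = 0.10) -> List[Tuple[int, int]]:
    """Return the matched pairs (p_j, b_j) between row g and row g+K."""
    b = list(points)                          # step 1: b_j = p_j
    last = len(lower) - 1

    def closest_match(j: int) -> Tuple[int, float]:
        # Step 4: search the interval bounded by the neighbouring matches.
        lo = b[j - 1] if j > 0 else 0         # assumption: row ends bound the outer intervals
        hi = b[j + 1] if j + 1 < len(b) else last
        lo, hi = min(lo, hi), max(lo, hi)
        # assumption: ties are broken in favour of the pixel nearest to p_j
        c = min(range(lo, hi + 1),
                key=lambda m: (intensity_diff(upper, lower, points[j], m, w),
                               abs(m - points[j])))
        return c, intensity_diff(upper, lower, points[j], c, w)

    for _ in range(max_passes):               # steps 11-12: at most five passes (count > 4)
        fscore = bscore = 0.0
        for j in range(len(b)):               # forward search, steps 3-6
            b[j], d = closest_match(j)
            fscore += d
        for j in reversed(range(len(b))):     # backward search, steps 7-10
            b[j], d = closest_match(j)
            bscore += d
        # assumption: "less than 10%" is taken relative to the larger score
        if fscore == bscore or abs(fscore - bscore) < tol * max(fscore, bscore):
            break
    return list(zip(points, b))


upper = [10] * 5 + [200] * 7      # step edge at pixel 5 in row g
lower = [10] * 7 + [200] * 5      # the same edge shifted to pixel 7 in row g+K
pairs = boundary_search(upper, lower, [0, 5, 11])
print(pairs)
```

In this toy case the edge point at pixel 5 of the upper row is paired with pixel 7 of the lower row, which is the slanted edge segment that the decoder would then use when splitting the vertical interpolation.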
It was found that, in practice, a good compromise between bit-rate and visual quality can be achieved with the vertical subsampling factor $K$ set to 4. Consequently, the interpolation can be implemented with simple integer additions and look-up tables.
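The vertical interpolation of Eqs. (3), (4a) and (4b) can be illustrated for a single pixel column with the sketch below. The function name and its parameters are hypothetical, and the intensity at the edge row, $i(g+L,n)$, is assumed to be supplied by the caller because the text does not state how it is obtained.

```python
from typing import List, Optional


def interpolate_column(top: float, bottom: float, K: int = 4,
                       edge_row: Optional[int] = None,
                       edge_value: Optional[float] = None) -> List[float]:
    """Return the K-1 missing values between two retained rows for one column.

    top and bottom are i(g, n) and i(g+K, n); edge_row is L (0 < L < K) and
    edge_value is i(g+L, n), an input assumed to be known to the decoder.
    """
    out = []
    for j in range(1, K):
        if edge_row is None:
            v = (top * (K - j) + bottom * j) / K                      # Eq. (3)
        elif j < edge_row:
            v = (top * (edge_row - j) + edge_value * j) / edge_row    # Eq. (4a)
        elif j > edge_row:
            v = (edge_value * (K - j)
                 + bottom * (j - edge_row)) / (K - edge_row)          # Eq. (4b)
        else:
            v = edge_value                                            # the edge point itself
        out.append(v)
    return out


print(interpolate_column(0, 40))                                # no edge between the rows
print(interpolate_column(0, 100, edge_row=2, edge_value=80))    # edge located at L = 2
```

With $K = 4$ the weights in Eq. (3) are multiples of 1/4, which is consistent with the remark that the interpolation reduces to integer additions and table look-ups.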
Fig. 4. Original images. (a) Lenna, (b) Peppers, (c) Tiffany, (d) Monument.
Table 2. Coding fidelity (PSNR in dB) of images encoded with plain VQ, the ADI algorithm and the proposed scheme at around 0.3 bpp

| Image | Plain VQ PSNR (dB) | Plain VQ bpp | ADI PSNR (dB) | ADI bpp | Proposed scheme PSNR (dB) | Proposed scheme bpp |
|---|---|---|---|---|---|---|
| Lenna | 28.23 | 0.31 | 28.82 | 0.30 | 29.06 | 0.30 |
| Peppers | 28.26 | 0.31 | 29.07 | 0.29 | 29.21 | 0.29 |
| Tiffany | 30.16 | 0.31 | 30.09 | 0.29 | 30.24 | 0.29 |
| Monument | 26.08 | 0.31 | 25.75 | 0.28 | 25.86 | 0.28 |
Fig. 5. Images encoded with plain VQ. (a) Lenna: 28.23 dB, (b) Peppers: 28.26 dB, (c) Tiffany: 30.16 dB, (d) Monument: 26.08 dB.
4. Results

The performance of the proposed scheme is demonstrated with the set of images Lenna, Peppers, Tiffany and Monument [14] shown in Fig. 4(a-d). All pictures are digitized at a resolution of 512×512 pixels. The images decoded with the conventional plain VQ method, using an in-training-set codebook of 32 codevectors (i.e., a codebook generated directly from the image to be encoded), and with the ADI algorithm [8,10,11] are shown in Fig. 5(a-d) and Fig. 6(a-d), respectively. It can be seen that the majority of edge contours in the decoded images are poorly constructed and take the form of staircases. This is mainly caused by the errors in vertical interpolation, which rigidly impose false boundaries between two subsampled rows of pixels. Images decoded with the proposed scheme are shown in Fig. 7(a-d). It can be observed that the edge contours have been reconstructed by the ABS algorithm, giving a more pleasant visual effect. The enhancement of edge contours by the proposed scheme is further shown with the enlarged images in Fig. 8(a-d). The coding fidelity of the above encoded images is listed in Table 2; at identical bit-rates, the proposed scheme improves on ADI by 0.11 to 0.24 dB for all four images. A comparison of the coding fidelity of the ADI method and the proposed ABS method at different bit-rates is given in Fig. 9.

Fig. 6. Images encoded with the ADI algorithm. (a) Lenna: 28.82 dB, (b) Peppers: 29.07 dB, (c) Tiffany: 30.09 dB, (d) Monument: 25.75 dB.
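The paper does not spell out how the PSNR and bit-rate figures are computed. Assuming the usual definition for 8-bit images, PSNR = 10 log10(255² / MSE), and a bit-rate equal to the total number of coded bits divided by the number of pixels, the two measures might be evaluated as in the sketch below; the function names and the list-of-rows image representation are choices made for this example.

```python
import math
from typing import Sequence


def psnr(original: Sequence[Sequence[float]],
         decoded: Sequence[Sequence[float]], peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB (assumed standard 8-bit definition)."""
    squared_error = 0.0
    count = 0
    for row_o, row_d in zip(original, decoded):
        for a, b in zip(row_o, row_d):
            squared_error += (a - b) ** 2
            count += 1
    if squared_error == 0:
        return math.inf                      # identical images
    return 10.0 * math.log10(peak * peak / (squared_error / count))


def bits_per_pixel(total_bits: int, width: int, height: int) -> float:
    """Bit-rate of a coded image in bits per pixel (bpp)."""
    return total_bits / (width * height)


# A 512x512 image coded in about 78,643 bits corresponds to roughly 0.3 bpp.
print(round(bits_per_pixel(78643, 512, 512), 2))
```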
5. Conclusion

The development of a low complexity image compression scheme is reported in this paper. The encoder is an adaptive decimator that decomposes each row of pixels in the vertically down-sampled source image into a small set of dominant points. In the decoder, these sets of points are used to recover the image, and the missing information between the vertically down-sampled lines is reconstructed with the Adaptive Boundary Search (ABS) algorithm and simple bilinear interpolation. Experimental results show that the method is capable of providing a good estimate of edge contours even at a low bit-rate of around 0.3 bpp. The amount of computation in the encoder and decoder is small, and the scheme can be implemented to operate in real time at low hardware cost.
Fig. 7. Images encoded with the ADI and ABS algorithms. (a) Lenna: 29.06 dB, (b) Peppers: 29.21 dB, (c) Tiffany: 30.24 dB, (d) Monument: 25.86 dB.
Fig. 8. Enlarged images encoded by the existing schemes and the proposed scheme. (a) Original Peppers, (b) coded by the plain VQ method, (c) coded by the ADI coding scheme, (d) coded by the proposed coding scheme.
Fig. 9. Comparison of coding fidelity of decoded images (Peppers, 512×512 in size, 8 bits per pixel) employing the conventional ADI method and the method proposed in this paper.
Acknowledgements

The research work reported in this paper is supported by the UPGC Competitive Earmarked Research Grant (CERG).
References

[1] A.N. Akansu, R.A. Haddad, Multiresolution Signal Decomposition: Transforms, Subbands, and Wavelets, Academic Press, New York, 1992.
[2] R.A. Cohen, J.W. Woods, Entropy-constrained SBPVQ for image coding, in: ICASSP-90, 1990, pp. 2269-2272.
[3] A. Gersho, R.M. Gray, Vector Quantization and Signal Compression, Kluwer Academic Publishers, Dordrecht, 1992.
[4] H.M. Hang, B.G. Haskell, Interpolative vector quantization of color images, IEEE Trans. Commun. 36 (April 1988) 465-469.
[5] H.M. Hang, J.W. Woods, Predictive vector quantization of image, IEEE Trans. Commun. COM-33 (11) (November 1985).
[6] Y. Linde, A. Buzo, R.M. Gray, An algorithm for vector quantizer design, IEEE Trans. Commun. COM-28 (January 1980) 84-95.
[7] H. Sun, C.N. Manikopoulus, Predictive vector quantization based upon first-order Markov model for image transmission, in: ISCAS-89, 1989, pp. 1516-1519.
[8] P.W.M. Tsang, A high quality image compression technique for low cost multimedia applications, IEEE Trans. Consumer Electronics 41 (1) (February 1995) 140-149.
[9] P.W.M. Tsang, W.T. Lee, A low complexity predicted vector quantization scheme, IEEE Trans. Consumer Electronics (November 1995) 1108-1117.
[10] P.W.M. Tsang, W.H. Tsang, An improved interpolative vector quantization scheme using non-recursive adaptive decimation, Pattern Recognition Lett. 16 (1995) 1043-1050.
[11] W.M. Tsang, W.T. Lee, Small codebook interpolative vector quantization system, in: 4th Internat. Symp. on Signal Process. Appl., 1996, accepted.
[12] W.M. Tsang, P.C. Yuen, F.K. Lam, Contour approximation, in: Proc. 3rd Internat. Symp. Signal Process. Appl., 1992, pp. 97-100.
[13] K.S. Wang, J.O. Normile, H. Wu, Software decodable video compression algorithm based on vector quantization and classification, in: IEEE Workshop on Visual Signal Processing and Communication, Melbourne, Australia, 1993, pp. 129-132.
[14] S. Wolf, Monument (Cologne, Germany), KODACOLOR Gold 100 (35 mm), Ref. dJN0291, Kodak Photo CD Access Software Photo Sampler, Part No. 15-1132-01, 1991.