Neurocomputing 69 (2006) 2327–2335. www.elsevier.com/locate/neucom

An image coding scheme using SMVQ and support vector machines

Chin-Chen Chang^a, Chia-Te Liao^b

a Department of Information Engineering and Computer Science, Feng Chia University, Taichung, Taiwan 40724, ROC
b Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan 300, ROC

E-mail addresses: [email protected] (C.-C. Chang), [email protected] (C.-T. Liao).

Received 17 December 2003; received in revised form 22 February 2005; accepted 24 February 2005. Available online 24 January 2006. Communicated by T. Heskes.
Abstract

To compress images efficiently, exploiting the spatial relations among image blocks to achieve low bit-rate compression has been widely adopted in recent years. An example of using such spatial relations is side-match vector quantization (SMVQ). In this paper, we propose a new technique that utilizes support vector machines (SVM) to enhance the quality of images compressed by an SMVQ-based algorithm. We incorporate SVM into the encoder/decoder to detect edges across block boundaries and thereby improve the accuracy of the block-prediction phase, while the set of codewords available in the state codebooks can be enlarged at the same time without increasing the bit-rate. Our simulations confirm the superiority of the proposed scheme, with reasonable improvements in the resulting images.

© 2006 Elsevier B.V. All rights reserved.

Keywords: Side-match vector quantization; Discrete wavelet transform; Support vector machines
1. Introduction

Due to the rapid growth of demand for video communication and multimedia applications in today's digitalized world, image compression has become one of the most important research topics, reducing both storage size and transmission bit-rate. Vector quantization (VQ) is a competitive and simple lossy image compression technique that introduces only slight image quality degradation, and its efficiency has been proven in [5]. In addition, because of its superior rate-distortion characteristics and simpler coding scheme compared with conventional compression techniques [5,15,18], VQ is nowadays widely used as a fundamental technique in image/video/voice compression across various industries. Generally, a VQ encoder maps a given set of data vectors, $O = \{o_i \mid o_i \in \mathbb{R}^k,\ i = 1, 2, \ldots, m\}$, into a finite subset $C = \{c_j \mid c_j \in \mathbb{R}^k,\ j = 1, 2, \ldots, n\}$ composed of $n$ codewords in the $k$-dimensional Euclidean space $\mathbb{R}^k$, where $m$ stands for the number of data vectors.
In other words, the VQ coding function replaces a data vector $o_i \in O$ with the most similar codeword $c_j$ in codebook $C$ such that

$\mathrm{VQ}(o_i) = c_j$,  (1)

where $c_j$ achieves the requirement

$\mathrm{ED}(o_i, c_j) = \min_{c_z \in C} \mathrm{ED}(o_i, c_z)$, for $z = 1, 2, \ldots, n$.  (2)

Here, the notation ED denotes the (squared) Euclidean distance and is defined by

$\mathrm{ED}(a, b) = \|a - b\|^2 = \sum_{i=1}^{k} [a(i) - b(i)]^2$,  (3)
where $a$ and $b$ are arbitrary $k$-dimensional vectors. Thus, an input data vector $o_i$ can be quantized by the closest codeword $c_j$ in codebook $C$. For an image to be compressed, the VQ encoder first divides the image into $m$ non-overlapping blocks, which form the set of data vectors $O$. Then, after the mapping procedure is performed on each block, the only information we have to keep is the indices of the mapped codewords in $C$. Decoding is performed simply by using these indices as addresses to retrieve the corresponding codewords from the same codebook used in encoding, and by appropriately piecing them together into the resulting image.
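To make the encode/decode loop concrete, the following minimal NumPy sketch quantizes an image block by block against a master codebook; the 4 × 4 block size and all function names are our own illustrative choices, not part of the paper.

```python
import numpy as np

def vq_encode(image, codebook, k=4):
    """Plain VQ: map each k x k block to the index of its nearest
    codeword under the squared Euclidean distance of Eq. (3)."""
    h, w = image.shape
    indices = []
    for r in range(0, h, k):
        for c in range(0, w, k):
            block = image[r:r + k, c:c + k].astype(float).ravel()
            dists = np.sum((codebook - block) ** 2, axis=1)  # Eq. (3)
            indices.append(int(np.argmin(dists)))            # Eqs. (1)-(2)
    return indices

def vq_decode(indices, codebook, h, w, k=4):
    """Rebuild the image by pasting the addressed codewords back."""
    out = np.zeros((h, w))
    it = iter(indices)
    for r in range(0, h, k):
        for c in range(0, w, k):
            out[r:r + k, c:c + k] = codebook[next(it)].reshape(k, k)
    return out
```

Only the index stream needs to be stored or transmitted; the decoder holds the same codebook.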
There are many VQ-based methods proposed in recent research to further lower the bit-rate of the compressed image. Finite state vector quantization (FSVQ), discussed in [5], exploits the correlations among previously encoded neighboring blocks and is able to reduce the bit-rate significantly beyond traditional VQ. FSVQ takes advantage of previously encoded state information to predict the current state with a next-state function. That is, by using the correlation of previously encoded blocks, it constructs for every block being encoded a much smaller sub-codebook (state codebook) from the original codebook (master codebook), and uses it in the coding procedure. Low bit-rate compression is therefore achieved by encoding each block of the image with its own distinct, smaller state codebook, while the quality of the encoded image can still be retained because the state codebook is constructed from precise predictions out of the master codebook. Side-match vector quantization (SMVQ), proposed in [11], is a popular and typical class of FSVQ, and much research on SMVQ-based image compression has appeared in recent years. In [3], Chen and Chang presented a novel variable-rate SMVQ algorithm named NewSMVQ that incorporates three techniques (diagonal sampling, principal component analysis (PCA), and rechecking) to improve both the encoded image quality and the encoding time. Wei et al. [20] proposed a hierarchical three-sided side-match finite-state vector quantization method (TSMVQ); they use a three-sided prediction and a sampling arrangement to effectively enhance the hit ratio of the prediction and, by regularly refreshing the used codewords, avoid the derailment problem. In [13], by considering both the mean-value transitions and the boundary-pixel transitions, Lee et al. exploited the smooth transition on the boundary of two neighboring blocks, saving 80–90% of the computation time under the same image-quality criterion. Pan et al. [16] proposed an efficient scheme named new CSMVQ that benefits from a two-level block classifier for the variances among blocks and two separate master codebooks. Yang and Tseng [21] presented the smooth side-match classified vector quantizer (SSMCVQ), which combines three different techniques. In [8], Huang and Chang used a multilayer perceptron neural network, with its non-linear property, to further improve the accuracy of SMVQ's prediction; its architecture is also simple and efficient for hardware design. In this paper, we give a general framework based on the SMVQ method for fixed bit-rate image compression. The good generalization properties of SVM, together with the image processing technique of the discrete wavelet transform (DWT), are applied to identify edge blocks; for every block being encoded, two different state codebooks will concurrently be available to improve the quality of the encoded images without increasing the bit-rate.
The organization of this paper is as follows. In Section 2, we briefly review some background theories related to our scheme. In Section 3, we describe the proposed scheme in detail. The experimental results are shown in Section 4. Finally, we conclude with some remarks, together with a discussion of future directions of our work, in Section 5.

2. Review of background

Some related theories are briefly reviewed in this section. We discuss the conventional SMVQ method in Section 2.1 and the basic concept of DWT in Section 2.2. Finally, the notions of SVM are given in Section 2.3.

2.1. SMVQ

Since SMVQ is a typical kind of FSVQ, the key question is how it uses the previously encoded blocks as information to generate the state codebook for encoding the current block. Because natural images exhibit high spatial correlation, the SMVQ method exploits the fact that the transition across the boundary between two neighboring VQ blocks is usually very smooth. In other words, SMVQ tries to make the distortion on the boundaries between two adjacent VQ blocks as small as possible. Based on this idea, SMVQ initially encodes the first column and the first row of image blocks by the exhaustive-search VQ method (ESVQ) using the master codebook $C$. Then, as Fig. 1 depicts, for every input data vector $x$, the boundary pixels of the upper block $u$ and the left block $l$ that were previously encoded are used as the information to construct the desired state codebook $SC_x$.
Fig. 1. Blocks and their pixel indices in SMVQ. The boundary pixels of the upper block u and the left block l are used as the state information to construct the state codebook.
Formally speaking, suppose we have the state space $S = \{s_i \mid i = 0, 1, \ldots, u \cdot l\}$, where $u$ and $l$ denote the codewords of the upper and the left neighboring blocks of $x$, respectively. Note that $S$ is the set of all current states, and it can be understood as a record of the information needed to determine the next state. The blocks $u$ and $l$ offer the vertical and horizontal correlations as the state information to decide the next state $s_x$ for encoding $x$. In order to construct the state codebook $SC_x$ for encoding $x$, we use the notation $D$ to express the distortion between the input block $x$ and a compared codeword in $C$. Assume $c_z \in C$ is a codeword. Then $D$ is defined by

$D(x, c_z) = D_v(x, c_z) + D_h(x, c_z)$,  (4)

where

$D_v(x, c_z) = \sum_{i=0}^{w-1} [x(0,i) - c_z(0,i)]^2$  (5)

and

$D_h(x, c_z) = \sum_{i=0}^{h-1} [x(i,0) - c_z(i,0)]^2$, $c_z \in C$ for $z = 1, 2, \ldots, n$.  (6)

Finally, the SMVQ encoder picks the $N_s$ codewords from the master codebook with the smallest distortion $D$ to construct the desired $SC_x$. By selecting these $N_s$ nearest codewords, we build the state codebook $SC_x$ and apply it to encode the input block $x$, achieving the goal of low bit-rate image compression.
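As a concrete illustration, the sketch below builds $SC_x$ by ranking the master codebook by side-match distortion. It follows the common side-match convention of matching each codeword's top row and left column against the decoded boundary pixels of $u$ and $l$; the function name and the 4 × 4 block size are our own assumptions.

```python
import numpy as np

def side_match_state_codebook(upper, left, codebook, ns=32, k=4):
    """Return the indices of the ns master-codebook codewords with the
    smallest side-match distortion D = D_v + D_h (Eqs. (4)-(6)).
    upper/left are the decoded k x k neighbor blocks; codebook rows
    are k*k codewords stored in row-major order."""
    cw = codebook.reshape(-1, k, k).astype(float)
    # D_v: top row of each codeword vs. bottom row of the upper block.
    dv = np.sum((cw[:, 0, :] - upper[-1, :]) ** 2, axis=1)
    # D_h: left column of each codeword vs. right column of the left block.
    dh = np.sum((cw[:, :, 0] - left[:, -1]) ** 2, axis=1)
    order = np.argsort(dv + dh)       # ascending side-match distortion
    return order[:ns]                 # state codebook SC_x (as indices)
```

Because both encoder and decoder see the same decoded neighbors, they can rebuild the identical $SC_x$ and only $\log_2 N_s$ bits per block need to be sent.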
2.2. Discrete wavelet transform

DWT is a frequently used image processing technique that transforms images from the spatial domain into the frequency domain. By applying DWT, we can decompose an image into its sub-bands with their relative DWT coefficients. Basically, the low-frequency parts of the transformed image can be viewed as its most important parts; on the other hand, since the high-frequency parts of images are not very perceptible to human eyes, they may be treated as a kind of redundancy from this viewpoint. Several works apply these characteristics to different applications, such as digital image watermarking and image compression. For example, JPEG2000 is a famous image compression technique using wavelet coding. It is likewise common to hide digital watermarks in the high-frequency part of images or to use the low-frequency components for image retrieval.

Due to the orthonormality of its wavelet basis functions and its simplicity of implementation, the proposed method uses the DWT based on the Haar function. The basis for what we nowadays call Haar wavelets was first given by [7]; it decomposes an image into the sub-bands LL, LH, HL, and HH, and can be realized with the following equations [9]:

$ll_1(x, y) = \frac{1}{4} \sum_{i=0}^{1} \sum_{j=0}^{1} p(2x+i, 2y+j)$,  (7)

$lh_1(x, y) = \frac{1}{4} \sum_{i=0}^{1} p(2x+i, 2y) - \frac{1}{4} \sum_{i=0}^{1} p(2x+i, 2y+1)$,  (8)

$hl_1(x, y) = \frac{1}{4} \sum_{j=0}^{1} p(2x, 2y+j) - \frac{1}{4} \sum_{j=0}^{1} p(2x+1, 2y+j)$,  (9)

$hh_1(x, y) = \frac{1}{4} \{p(2x, 2y) + p(2x+1, 2y+1) - p(2x+1, 2y) - p(2x, 2y+1)\}$.  (10)

The notations $ll_1(x,y)$, $lh_1(x,y)$, $hl_1(x,y)$, and $hh_1(x,y)$ refer to the components of the four corresponding first-level sub-bands LL1, LH1, HL1, and HH1, respectively, where LL1 is the component of approximation coefficients at level one. Here, $p(x,y)$ denotes the pixel value at location $(x,y)$ in the DWT-processed image of size $M \times N$ ($0 \le x \le N/2$, $0 \le y \le M/2$). Note that the same procedure can be applied recursively to the approximation-coefficient component (LL) to obtain the next-level wavelet coefficients. Fig. 2 shows an example image derived from a third-level Haar wavelet transform along with its conceptual diagram.
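The following short NumPy sketch implements Eqs. (7)-(10) for one decomposition level; the function name is ours, and applying it again to the returned LL band yields the next level, as noted above.

```python
import numpy as np

def haar_level1(p):
    """One level of the Haar DWT, Eqs. (7)-(10): split an even-sized
    image (or block) p into its LL, LH, HL and HH sub-bands."""
    a = p[0::2, 0::2].astype(float)   # p(2x, 2y)
    b = p[0::2, 1::2]                 # p(2x, 2y+1)
    c = p[1::2, 0::2]                 # p(2x+1, 2y)
    d = p[1::2, 1::2]                 # p(2x+1, 2y+1)
    ll = (a + b + c + d) / 4.0        # approximation, Eq. (7)
    lh = (a + c - b - d) / 4.0        # detail, Eq. (8)
    hl = (a + b - c - d) / 4.0        # detail, Eq. (9)
    hh = (a + d - b - c) / 4.0        # diagonal detail, Eq. (10)
    return ll, lh, hl, hh
```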
2.3. Support vector machines

Recently, more and more research motivated by support vector machines (SVM) has appeared in fields such as text categorization, handwritten pattern recognition, and bioinformatics, and most of it has attained successful results [6,10,12]. In general, SVM are learning machines with non-linear approximation abilities induced by kernels; at the same time, optimization theory provides efficient training procedures, and generalization theory gives the insights to control the complexity and consistency of the hypotheses [4]. We can formulate an SVM problem as follows. Given a training set $S = \{(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)\} \subseteq (X \times Y)^l$, each $(x_i, y_i) \in S$ is an example instance with $x_i \in X$ and $y_i \in Y$. Here, $X \subseteq \mathbb{R}^d$ is the $d$-dimensional input space, $Y$ is the output domain, and $l$ is the number of instances in $S$. The vector $x_i$ holds the describing features of the corresponding instance.
Fig. 2. The concept of the three-level discrete wavelet transform: (a) a derived example image; (b) the representation of the corresponding sub-band components.
Fig. 4. Example blocks of different types: (a) edge block; (b) edge block; (c) non-edge block; (d) non-edge block.
Fig. 3. The optimal separating hyperplane and its margin.
For cases of binary classification, we have $Y = \{-1, 1\}$; for cases of $m$-class classification, we have $Y = \{1, 2, \ldots, m\}$; and for cases of regression, $Y \subseteq \mathbb{R}$. Note that SVM were originally designed as binary classifiers, so one needs to apply techniques such as "one against one" [17] or "one against the others" [1] when multi-class problems are dealt with in real situations. SVM are learning methodologies that synthesize solutions from examples. In other words, we try to find a hyperplane $f$ that classifies the examples such that all instances with the same label stand on the same side in classification, or a linear function that best interpolates the set of examples in regression. That is, we look for a hyperplane $f$ subject to

$y_i (w \cdot x_i + b) > 0$, for $i = 1, 2, \ldots, l$,  (11)

where $w$ and $b$ are the weight and the bias, respectively. If such a hyperplane $f$ satisfying Eq. (11) exists, it can be scaled to satisfy

$y_i (w \cdot x_i + b) \ge 1$, for $i = 1, 2, \ldots, l$.  (12)

The quantity $2/\|w\|$ is called the margin, as shown in Fig. 3. In a sense, it captures the generalization ability of a hyperplane $f$: the larger the margin, the better the generalization ability [19]. Therefore, by minimizing the quantity $\|w\|^2$, we obtain the optimal separating hyperplane (OSH) with the best generalization capability. Assume now that we have found the best solution $f^*$ (the OSH) in the form of $(w^*, b^*)$ for the training set $S$. Then the decision function $h(x)$ is

$h(x) = \mathrm{sign}(w^* \cdot x + b^*)$.  (13)

Thus, to handle an unclassified data instance $z \in X$, we can employ the decision function $h(z)$ to determine which class it belongs to.
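As a toy illustration of Eqs. (11)-(13), the sketch below fits a hard-margin linear SVM on a tiny separable set using scikit-learn (whose SVC class wraps LIBSVM) and reads off $w^*$, $b^*$, and the margin $2/\|w\|$; the data points and the large C value are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC  # scikit-learn's SVC is built on LIBSVM

# A tiny linearly separable training set S = {(x_i, y_i)}.
X = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0], [4.0, 4.0]])
y = np.array([-1, -1, 1, 1])

clf = SVC(kernel='linear', C=1e6).fit(X, y)   # huge C ~ hard margin
w, b = clf.coef_[0], clf.intercept_[0]        # (w*, b*) of the OSH
margin = 2.0 / np.linalg.norm(w)              # the quantity 2/||w||

def h(z):
    """Decision function of Eq. (13)."""
    return int(np.sign(np.dot(w, z) + b))

print(w, b, margin, h(np.array([2.5, 2.5])))  # z lands on the +1 side
```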
Moreover, just as we select an architecture for a neural network, kernel representations are often adopted to increase the computational power of linear SVM [4]. By projecting the data instances into a more meaningful, higher-dimensional space, kernel representations provide the non-linear approximation abilities needed to handle problems that are usually not easy to solve in real life.

3. The proposed method

In this section, we describe the proposed image coding scheme in detail. Owing to its good generalization performance [4], SVM is used here as the block-type (edge/non-edge) identifier, making the recognition task more feasible and practical.
Fig. 6. Blocks and the transformed input pattern of DWT coefficients. The most descriptive features are condensed into the upper-left corner, represented by the relative DWT coefficients in the gray lattices.
Fig. 5. The flowchart of the proposed coding system. A well-trained SVM identifier is incorporated to generate an appropriate state codebook corresponding to the classified adjacent block types.
Some example blocks of the different types are shown in Fig. 4 for clarity. As Fig. 5 depicts, the proposed scheme divides into three phases: (1) feature extraction, (2) SVM training, and (3) the image encoding/decoding phase. First, to make the input patterns suitable for the SVM identifier, we transform the necessary image blocks into the corresponding feature vectors in $X$ together with their block types in $Y$; DWT is utilized for this block-transforming task to extract features as inputs for the SVM. After the training procedure on the example blocks is completed, we obtain an SVM model for block-type identification. Finally, according to the result of SVM classification, for every block we generate a state codebook corresponding to its type and use it for encoding/decoding to accomplish image compression. Detailed explanations are given in the following three subsections.

3.1. Feature extraction

This subsection focuses on how to use DWT to condense the data from 16-dimensional vectors into smaller input patterns for SVM training and classification.
In fact, for each block, only four DWT coefficients are finally taken as the feature vector, because we trade a tiny loss of precision for a significant reduction in the complexity of SVM training and classification. First, we apply DWT to each image block. After the blocks are transformed into the frequency domain, their essence is concentrated in the upper-left corner: the most important descriptive features of a transformed block are represented by the relative DWT coefficients there. Therefore, to capture the most important essence of each block, we take only the four corresponding DWT coefficients as the features used in SVM training and classification. Fig. 6 illustrates the relationship between the original data blocks and their transformed patterns used for SVM; the gray lattices with indices 0, 1, 2, and 4 are the positions of the four DWT coefficients that we take.

3.2. SVM training

Suppose we have selected a set of example blocks in advance. After applying DWT to the example blocks as discussed in Section 3.1, the model used for SVM classification can now be trained. For each block, we take the four DWT coefficients as the descriptive features plus one target label expressing its type, "edge" or "non-edge," to assemble a training instance. That is, after the transformation procedure is applied to each block, we obtain a set of example training instances $S = \{(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)\} \subseteq (X \times Y)^l$. Here, $l$ denotes the number of example instances for training and $(x_i, y_i) \in S$ is an example instance. For $i = 1, 2, \ldots, l$, $x_i \in \mathbb{R}^4$ is the feature vector composed of the four corresponding DWT coefficients extracted from block $i$, and $y_i \in \{-1, 1\}$ is the label representing its type (edge/non-edge). This is the simplest case of binary classification. Note that one could define one's own meaning of the terms "edge" and "non-edge" according to the requirements. Finally, after feeding these example training instances into the SVM for training, we obtain the desired model for this classification problem.
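A minimal sketch of this training pipeline is given below, reusing the haar_level1 function from the sketch in Section 2.2. The variance-threshold labeling rule, the synthetic example blocks, the gamma value, and all names are illustrative assumptions standing in for whatever edge definition and SVM parameters one actually chooses.

```python
import numpy as np
from sklearn.svm import SVC  # wraps LIBSVM, which the experiments use

def block_features(block):
    """Condense a 4x4 block into 4 DWT coefficients (cf. Fig. 6):
    here, one representative coefficient per first-level sub-band."""
    ll, lh, hl, hh = haar_level1(block)   # sketch from Section 2.2
    return np.array([ll[0, 0], lh[0, 0], hl[0, 0], hh[0, 0]])

# Stand-in example blocks: 128 smooth blocks and 128 blocks with a
# strong vertical edge, labeled by a simple variance threshold (any
# user-chosen edge definition would do).
rng = np.random.default_rng(0)
smooth = rng.normal(128.0, 2.0, size=(128, 4, 4))
edgy = np.tile(np.array([0.0, 255.0, 0.0, 255.0]), (128, 4, 1))
blocks = np.concatenate([smooth, edgy])

X = np.array([block_features(b) for b in blocks])           # l x 4 features
y = np.array([1 if b.std() > 40 else -1 for b in blocks])   # edge / non-edge
model = SVC(kernel='rbf', gamma=0.5).fit(X, y)  # K(x,y) = exp(-gamma||x-y||^2)
```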
3.3. Image encoding/decoding phase

In this subsection, we focus on how to encode/decode an image with the proposed scheme and enhance its quality using SVM. For every SMVQ-based image compression algorithm, derailment is caused mainly by (1) obvious variations among the neighboring blocks and (2) the fact that a single prediction error will more or less propagate into further derailment. Here, we apply SVM, with its good generalization properties, to tackle this phenomenon. To do so, we first feed the previously encoded blocks adjacent to the block currently being encoded into the SVM to estimate the variation in the neighboring area. When the smoothness among blocks is no longer retained, conventional SMVQ-based encoders/decoders are very likely to make prediction errors; intense variations among adjacent blocks are precisely the signs of such erroneous predictions. Therefore, as Fig. 5 depicts, with a well-trained SVM model for edge-block classification, we can decide which type of state codebook should be generated for coding the current block. For each block being encoded, either an "edge"-type state codebook (in an area with large variance) or a "non-edge"-type state codebook (in a smooth area) is generated, depending on the result of the SVM estimation. Note that since edges crossing block boundaries are the most common cause of blocks with large variance, we treat such blocks as type "edge," and vice versa. The trick used to vary the two types of state codebooks is to arrange a different number of "jumping codewords" into the corresponding state codebook. These jumping codewords are codewords that have larger side-match distances to the neighboring blocks; in other words, the main difference between the two types of state codebooks is the number of jumping codewords they contain. The jumping codewords are chosen carefully from the master codebook, and they serve to correct the erroneous predictions often made by conventional SMVQ when the adjacent blocks have large variance. Accordingly, it is natural to place more jumping codewords in a state codebook typed "edge" than in one typed "non-edge," since the original SMVQ method already predicts well in smooth areas. In short, by incorporating SVM to make two separate types of state codebooks concurrently available, we can detect and effectively mitigate the block-prediction errors often made by SMVQ-based image compression algorithms.
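A sketch of this two-type state-codebook construction follows. The split between near and jumping codewords, the jump counts, and all names are illustrative assumptions, since the paper leaves the exact arrangement open.

```python
import numpy as np

def typed_state_codebook(upper, left, codebook, is_edge, ns=32,
                         n_jump_edge=8, n_jump_smooth=2, k=4):
    """Build an 'edge' or 'non-edge' state codebook of ns codewords:
    most entries are the smallest side-match-distortion codewords, and
    the rest are 'jumping' codewords with large side-match distance
    that can rescue mispredictions across an edge."""
    cw = codebook.reshape(-1, k, k).astype(float)
    d = (np.sum((cw[:, 0, :] - upper[-1, :]) ** 2, axis=1)     # D_v
         + np.sum((cw[:, :, 0] - left[:, -1]) ** 2, axis=1))   # D_h
    order = np.argsort(d)                    # ascending distortion
    n_jump = n_jump_edge if is_edge else n_jump_smooth
    near = order[:ns - n_jump]               # conventional side-match picks
    jump = order[-n_jump:]                   # far codewords, across the edge
    return np.concatenate([near, jump])      # indices into master codebook
```

Both codebook types keep the same size (32 entries in the experiments), so every block is still coded with 5 bits and the bit-rate is unchanged.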
4. Experimental results

In this section, we demonstrate the experimental results of the proposed scheme. For comparison, the results of the conventional SMVQ method are presented as well. The Library of Support Vector Machines (LIBSVM), which can be downloaded from [2], is an efficient and easy-to-use implementation of support vector learning; we use LIBSVM as the tool for the classification procedure in our simulations. The peak signal-to-noise ratio (PSNR) is used to evaluate the quality of the resulting images and is defined as

$\mathrm{PSNR} = 10 \log_{10} \frac{(2^n - 1)^2}{\mathrm{MSE}}$,  (14)

where $n$ is the number of bits per pixel. The term MSE refers to the mean square error, given by

$\mathrm{MSE} = \frac{1}{uv} \sum_{i}^{u} \sum_{j}^{v} (p_{ij} - p'_{ij})^2$.  (15)

The notations $u$ and $v$ are, respectively, the height and width of the images, $p_{ij}$ denotes the $(i,j)$th pixel value of the original image, and $p'_{ij}$ denotes the $(i,j)$th pixel value of the recovered image. In the experiments, all test images are of size 512 × 512 with 8-bit gray-level resolution per pixel. The master codebook we adopted was trained by the LBG algorithm [14] from five 512 × 512 gray-level images: AIRPLANE, BOAT, LENNA, SAILBOAT, and TOYS. Moreover, every state codebook in the coding procedure has 32 codewords; that is, every block being encoded is recorded with 5 bits. Note that the proposed method can be viewed as a module that can be mounted on an arbitrary SMVQ-based image compression algorithm to further improve its compressed image quality without increasing the bit-rate, by utilizing the good generalization properties of SVM. On the other hand, since this technique has to cooperate with an SMVQ-based image compression method, the compression ratio also depends on the method it works with. Because we adopted the conventional SMVQ image compression method in the experiments to compare against and improve upon, the compression ratio equals that of SMVQ, which is about 3.91%. In the experiments, the simplest one-level DWT was used in the feature extraction phase, while in the SVM training phase support vector classification with a radial basis function (RBF) kernel of the form $K(x, y) = e^{-\gamma \|x - y\|^2}$ was exploited. In the training phase we took only a total of 256 blocks as training examples; nevertheless, the improvements obtained on the evaluated images are remarkable. The experimental results are shown in Fig. 7. Fig. 7(a) is the original source image TOYS, while the image compressed by the conventional SMVQ method is shown in Fig. 7(b) with an associated PSNR value of 27.15 dB.
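For reference, a direct transcription of Eqs. (14)-(15) in NumPy (function name ours):

```python
import numpy as np

def psnr(original, recovered, n_bits=8):
    """Peak signal-to-noise ratio of a recovered image, Eqs. (14)-(15)."""
    diff = original.astype(float) - recovered.astype(float)
    mse = np.mean(diff ** 2)               # Eq. (15)
    peak = (2.0 ** n_bits - 1.0) ** 2      # (2^n - 1)^2 for n-bit pixels
    return 10.0 * np.log10(peak / mse)     # Eq. (14)
```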
Fig. 7. (a) Original TOYS image, (b) SMVQ version, (c) enhanced version, (d) original PEPPER image, (e) SMVQ version, (f) enhanced version.
Table 1
Comparison of SMVQ/ESVQ for other host images (PSNR in dB)

Image      SMVQ    Our scheme   ESVQ
AIRPLANE   28.27   29.07        30.62
BABOON     22.69   22.98        24.13
BARBARA    23.94   24.46        25.84
BOAT       27.22   27.90        29.42
GIRL       30.00   30.19        31.21
GOLD       29.05   29.33        30.25
LENA       27.25   27.97        29.45
LENNA      28.68   29.54        31.41
PEPPER     28.53   29.34        30.76
SAILBOAT   26.74   27.38        28.66
TIFFANY    30.05   30.05        30.35
TOYS       27.15   28.53        29.95
ZELDA      32.17   32.30        33.41
Fig. 7(c) is the resulting image compressed by the proposed method, with an improved PSNR value of 28.53 dB. In addition, another experimental result on the image PEPPER, which was not included in the master-codebook training procedure, is also given. Fig. 7(d) is the original source image PEPPER. Fig. 7(e) presents the SMVQ-compressed version of PEPPER, with an associated PSNR value of 28.53 dB. Without increasing the compression bit-rate, a better-quality compressed image is obtained when the proposed technique is applied, as shown in Fig. 7(f), with an associated PSNR value of 29.34 dB. In Table 1, we also give the experimental results for other source images, together with their theoretically optimal qualities encoded by the exhaustive-search vector quantization method (ESVQ).
5. Conclusions

As a typical class of FSVQ, SMVQ successfully uses the spatial correlations among neighboring blocks to make precise predictions and obtains excellent results in low bit-rate image compression. DWT, in turn, is frequently used in image processing to divide images into frequency-domain sub-bands, and much work based on the properties of the different sub-bands has been developed, such as watermarking and image retrieval. In addition, thanks to their good generalization properties, SVM have entered the world of machine learning and perform very successfully in fields such as image recognition, handwritten digit recognition, text recognition, and bioinformatics. Motivated by these good results, we combine these classic techniques and present a novel image coding scheme in this paper. The main contribution of this scheme, which collaborates with an SMVQ-based image compression algorithm, is to offer a module that improves the quality of compressed images without increasing the bit-rate at all. The available state codebooks are enlarged in order to obtain better compression quality while the number of bits used for coding is retained, by reassembling the state codebooks according to the types classified by the SVM. The experimental results in Section 4 confirm this superiority: better image compression quality with no increase in bit-rate. Note that this flexibility matters: since SVM is a system of learning machines, the scheme can be retrained
with other definitions given to the types "edge" and "non-edge"; the method is thus very flexible and easy to apply to other FSVQ-based image compression methods. In the future, there are still some issues to work on. First, because the proposed mechanism has to cooperate with an SMVQ-based algorithm, combining it with a different, carefully selected outstanding algorithm could yield a markedly better image compression scheme. Second, using an additional master codebook especially designed for our type definitions would make the constructed state codebooks fit much better after SVM classification. In addition, we took only 256 blocks into the SVM training procedure in the experiments; with more carefully chosen example instances taken into training, even better block-recognition performance can reasonably be expected. Finally, when this scheme cooperates with different FSVQ-based image compression methods under varied applications and pattern definitions, other type classifiers, such as Bayesian classifiers, Gaussian mixture models, or neural networks, may prove more suitable than SVM.

Acknowledgment

The authors would like to thank Chih-Chung Chang and Chih-Jen Lin for making the LIBSVM software available on the website for researchers.

References

[1] V. Blanz, B. Scholkopf, H. Bulthoff, C. Burges, V. Vapnik, T. Vetter, Comparison of view-based object recognition algorithms using realistic 3D models, in: Proceedings of the Sixth International Conference on Artificial Neural Networks, Bochum, Germany, July 1996, pp. 251–256.

[2] C.C. Chang, C.J. Lin, LIBSVM: a library for support vector machines, 2001. [Online]. Available: http://www.csie.ntu.edu.tw/~cjlin/libsvm.

[3] T.S. Chen, C.C. Chang, A new image coding algorithm using variable-rate side-match finite-state vector quantization, IEEE Trans. Image Process. 6 (8) (1997) 1185–1187.

[4] N. Cristianini, J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Cambridge University Press, Cambridge, 2000.

[5] A. Gersho, R.M. Gray, Vector Quantization and Signal Compression, Kluwer Academic Publishers, Boston, MA, 1992.

[6] I. Guyon, J. Weston, S. Barnhill, V. Vapnik, Gene selection for cancer classification using support vector machines, Mach. Learn. 46 (2002) 389–422.

[7] A. Haar, Zur Theorie der orthogonalen Funktionensysteme, Math. Ann. 69 (1910) 331–371.

[8] Y.L. Huang, R.F. Chang, A new side-match finite-state vector quantization using neural networks for image coding, J. Visual Commun. Image Represent. 13 (3) (2002) 335–347.

[9] M. Iwata, A. Shiozaki, Index data embedding method utilizing quantitative relation of wavelet coefficients, IEICE Trans. Fund. Electron. Commun. Comput. Sci. E84-A (10) (2001) 2508–2513.

[10] T. Joachims, Text categorization with support vector machines: learning with many relevant features, in: Proceedings of the 10th European Conference on Machine Learning, Chemnitz, Germany, April 1998, pp. 137–142.
[11] T. Kim, Side match and overlap match vector quantizers for images, IEEE Trans. Image Process. 1 (2) (1992) 170–185.

[12] Y. LeCun, L.D. Jackel, L. Bottou, A. Brunot, C. Cortes, J.S. Denker, H. Drucker, I. Guyon, U.A. Muller, E. Sackinger, P. Simard, V. Vapnik, Comparison of learning algorithms for handwritten digit recognition, in: Proceedings of the International Conference on Artificial Neural Networks, Paris, France, October 1995, pp. 53–60.

[13] A.M.Y. Lee, J. Feng, K.T. Lo, J.H.T. Tang, New fast search algorithm for image vector quantization, in: Proceedings of the Fifth International Conference on Signal Processing, IEEE, Beijing, China, October 2000, pp. 1069–1072.

[14] Y. Linde, A. Buzo, R.M. Gray, An algorithm for vector quantizer design, IEEE Trans. Commun. 28 (1) (1980) 84–95.

[15] N.M. Nasrabadi, R.A. King, Image coding using vector quantization: a review, IEEE Trans. Commun. 36 (8) (1988) 957–971.

[16] J.S. Pan, Z.M. Lu, S.H. Sun, Image coding using SMVQ with two-level block classifier, in: Proceedings of the International Symposium on Multimedia Software Engineering, IEEE, Taipei, Taiwan, December 2000, pp. 276–279.

[17] M. Pontil, A. Verri, Support vector machines for 3D object recognition, IEEE Trans. Pattern Anal. Mach. Intell. 20 (6) (1998) 637–646.

[18] G. Shen, M.L. Liou, An efficient codebook post-processing technique and a window-based fast-search algorithm for image vector quantization, IEEE Trans. Circuits Syst. Video Technol. 10 (6) (2000) 990–997.

[19] V. Vapnik, Statistical Learning Theory, Wiley, Chichester, 1998.

[20] H.C. Wei, P.C. Tsai, J.S. Wang, Three-sided side match finite-state vector quantization, IEEE Trans. Circuits Syst. Video Technol. 10 (1) (2000) 51–58.

[21] S.B. Yang, L.Y. Tseng, Smooth side-match classified vector quantizer with variable block size, IEEE Trans. Image Process. 10 (5) (2001) 677–685.

Professor C.C. Chang was born in Taichung, Taiwan on November 12, 1954. He obtained his Ph.D. degree in computer engineering from National Chiao Tung University. His first degree was a Bachelor of Science in Applied Mathematics and his master's degree a Master of Science in Computer and Decision Sciences, both awarded by National Tsing Hua University. Dr. Chang served at National Chung Cheng University from 1989 to 2005, and since February 2005 has been Chair Professor in the Department of Information Engineering and Computer Science, Feng Chia University. Prior to joining Feng Chia University, Professor Chang was an associate professor at Chiao Tung University, a professor at National Chung Hsing University, and a chair professor at National Chung Cheng University. He has also been a Visiting Researcher and Visiting Scientist at Tokyo University and Kyoto University, Japan. During his service at Chung Cheng, Professor Chang served as Chairman of the Institute of Computer Science and Information Engineering, Dean of the College of Engineering, Provost and then Acting President of Chung Cheng University, and Director of the Advisory Office in the Ministry of Education, Taiwan. Professor Chang's specialties include, but are not limited to, data engineering, database systems, computer cryptography, and information security. A researcher of acclaimed and distinguished service and contributions to his country, who has advanced human knowledge in the field of information science, Professor Chang has won many research awards and honorary positions from prestigious organizations both nationally and internationally.
He is currently a Fellow of IEEE and a Fellow of IEE, UK. Since the early years of his career, he has consecutively won the Outstanding Youth Award of the ROC, Outstanding Talent in Information Sciences of the ROC, the AceR Dragon Award of the Ten Most Outstanding Talents, the Outstanding Scholar Award of the ROC, the Outstanding Engineering Professor Award of the ROC,
the Chung-Shan Academic Publication Awards, the Distinguished Research Awards of the National Science Council of the ROC, the Outstanding Scholarly Contribution Award of the International Institute for Advanced Studies in Systems Research and Cybernetics, Top Fifteen Scholars in Systems and Software Engineering of the Journal of Systems and Software, and so on. On numerous occasions he has been invited to serve as Visiting Professor, Chair Professor, Honorary Professor, Honorary Director, Honorary Chairman, Distinguished Alumnus, Distinguished Researcher, and Research Fellow by universities and research institutes. He has published over 850 papers in the information sciences. In the meantime, he participates actively in international academic organizations and performs advisory work for government agencies and academic organizations.
C.T. Liao was born in Kaohsiung, Taiwan on October 12, 1979. He received a BS degree in computer science and information engineering from National Chi Nan University, and an MS degree in computer science and information engineering from National Chung Cheng University. He is currently a Ph.D. student in the Department of Computer Science, National Tsing Hua University. His research interests include machine learning and pattern recognition.