A comparison of clustering algorithms applied to color image quantization


Pattern Recognition Letters 18 (1997) 1379–1384

P. Scheunders *

Vision Lab, Department of Physics, RUCA University of Antwerp, Groenenborgerlaan 171, Antwerpen 2020, Belgium

Abstract

In this paper color image quantization by clustering is discussed. A clustering scheme based on competitive learning is constructed and compared to the well-known C-means clustering algorithm. It is demonstrated that both perform equally well, but that the former is superior to the latter with respect to computing time. However, both depend on the initial conditions and may end up in local optima. Based on these findings, a hierarchical competitive learning scheme is constructed which is completely independent of initial conditions. The hierarchical approach is a hybrid structure between competitive learning and splitting of the color space. For comparison, a genetic approach is applied, which is a hybrid structure between a genetic algorithm and C-means clustering. The latter was demonstrated in the past to obtain globally optimal results, but with a high computational load. The hierarchical clustering scheme is shown to obtain near-globally optimal results with a low computational load. © 1997 Elsevier Science B.V.

Keywords: Color image quantization; C-means clustering algorithm; Competitive learning; Hierarchical clustering; Genetic algorithm

1. Introduction

In this paper the problem of color image quantization is discussed. Color quantization consists of two steps: palette design, in which a reduced number of palette colors (typically 8–256) is specified, and pixel mapping, in which each color pixel is assigned to one of the colors in the palette. From the pattern recognition point of view, color quantization can be regarded as an unsupervised classification of the (3D) color space, each class being represented by one palette color. Since an RGB image can contain up to 256³ distinct colors, the classification problem involves a large number of datapoints in a low-dimensional space.

Several techniques exist for color quantization. First, there is the class of splitting algorithms that divide the color space into disjoint regions by consecutively splitting up the space. From each region a color is chosen to represent the region in the color palette. Two algorithms of this class which are regularly applied are the median-cut algorithm (MCA) (Heckbert, 1982) and the variance-based algorithm (VBA) (Wan, 1990). The latter will be used in this work for comparison. Other splitting algorithms, which incorporate techniques to take the human visual system into account, have been introduced (Balasubramanian and Allebach, 1990; Balasubramanian et al., 1994; Wu, 1987). In general, splitting algorithms are fast. The disadvantage is that generally no global optima are obtained, because a decision made for splitting at one level cannot be undone at a later level.

* Corresponding author. E-mail: [email protected].

0167-8655/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved. PII S0167-8655(97)00116-5

Another class of quantization techniques performs clustering of the color space, and cluster representatives are chosen as palette colors. A frequently used clustering algorithm is the C-means clustering algorithm (CMA) (Shafer, 1987; Celenk, 1990). Here, an iterative updating of the cluster representatives and an assignment of color pixels to clusters takes place. Other clustering algorithms have been proposed: fuzzy C-means clustering (Lim, 1990), learning vector quantization (Kotropoulos, 1992) and the self-organizing map (SOM) (Kohonen, 1995; Dekker, 1994) have been applied to color quantization. Clustering algorithms are commonly accepted as optimal quantization approaches, but are also known to be very time consuming. Moreover, although optimal, the above clustering algorithms suffer from their dependence on initial conditions. In most applications one specific initial condition is chosen to present the results. However, using other initial conditions can change the performance of the algorithm dramatically. We have demonstrated the severity of this effect for CMA on grey-level as well as color image quantization, and we have proposed a genetic approach to solve the problem, its main drawback being its computational requirements (Scheunders, 1996, 1997).

In this paper, the problem of local optima in color image quantization is studied by applying several clustering techniques. First of all, CMA is compared to a competitive learning technique (CL). CL is very similar to CMA, in the sense that it minimizes the same objective function. The main difference is that with CL, cluster centers are updated sequentially instead of in parallel. An extra dependence is introduced, namely the order in which the color pixels are presented to the algorithm.
This dependence will be shown to have only small effects on the results. The main advantage of CL is its reduced computation time. To solve the problem of local optima, a hybrid approach combining splitting and competitive learning clustering is proposed, which is independent of the initial conditions. It is a hierarchical approach which gradually splits up the complete

dataset into clusters until the requested number of clusters is obtained. This scheme will be compared to CMA, CL and the genetic approach, with respect to optimality and computation time.

2. Clustering algorithms

2.1. The C-means clustering algorithm

Suppose a 3-dimensional dataset of points x = (x_R, x_G, x_B) contains the red, green and blue components of a color pixel x. This space contains the color histogram of an image and is called the color space. After a linear transformation, another color space can be obtained, which can be, e.g., more adapted to the human visual system. In this paper, RGB space is used. Color quantization is performed by clustering the color space into a given number C of clusters S_k. Each cluster is represented by one representative color z_k = (v_kR, v_kG, v_kB). The set of cluster representatives {z_k}_k defines the color palette of the quantized image. During the pixel mapping, each color pixel is assigned to one of the clusters and is replaced by the representative of that cluster. Optimal quantization is obtained by minimizing the following objective function:

    σ = Σ_{k=1}^{C} Σ_{x∈S_k} (x − v_k)².    (1)

This function is the mean squared error (MSE) which is made when replacing each color pixel by its representative. Minimizing the MSE with respect to v_k leads to the following conditions:

    v_k = ( Σ_{x∈S_k} x ) / ( Σ_{x∈S_k} 1 ),    (2)

i.e., the representatives are positioned at the center of mass of the colors belonging to the cluster. The second set of conditions states that a color should be associated with the closest cluster representative:

    x ∈ S_j  with  j = argmin_k (x − v_k)².    (3)

A set of cluster centers which satisfies Eqs. (2) and (3) satisfies the minimal objective function conditions and is called a local optimum. One way to satisfy both sets of equations is by using CMA. Here, one starts with an initial set of cluster centers {z_1, ..., z_C}, after which Eqs. (3) and (2) are applied alternately until convergence.
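The alternating scheme described above is the classical Lloyd/k-means iteration. A minimal NumPy sketch follows; this is a hedged illustration, and the function and parameter names are mine, not the paper's:

```python
import numpy as np

def cma(colors, C, iters=20, init=None, seed=0):
    """C-means clustering: alternate Eq. (3) (assign each color to its
    nearest representative) and Eq. (2) (move each representative to the
    center of mass of its cluster)."""
    rng = np.random.default_rng(seed)
    if init is None:
        # random initial set of cluster centers, as in the first experiment
        centers = colors[rng.choice(len(colors), C, replace=False)].astype(float)
    else:
        centers = np.asarray(init, dtype=float).copy()
    for _ in range(iters):
        # Eq. (3): squared distances to all representatives, pick the closest
        d2 = ((colors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Eq. (2): center of mass of each cluster (empty clusters are left alone)
        for k in range(C):
            members = colors[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    d2 = ((colors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    mse = d2.min(axis=1).mean()   # objective of Eq. (1), averaged per pixel
    return centers, d2.argmin(axis=1), mse
```

Pixel mapping then simply replaces each pixel by `centers[labels]`.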


2.2. The competitive learning algorithm

Another way to minimize Eq. (1) is by using a steepest-descent type of approach. Here, colors are presented sequentially; the order in which they are presented is randomly chosen from the color image. A step size or learning rate α(t) is defined. If color x ∈ S_j is presented, then

    v_i(t+1) = v_i(t) + α(t) δ_ij [ x(t) − v_i(t) ],    (4)

i.e., the closest cluster center is moved towards x while all other cluster centers remain at their positions. In practice, α(t) is chosen to be equal to one at the start and decreases monotonically thereafter. This guarantees a global ordering of the cluster centers initially and a fine adjustment afterwards. This algorithm is called competitive learning (CL). The required computation time depends on the total number of color pixels that is presented. It is observed that, in the problem of color quantization, presenting all color pixels once yields a near-global optimum. When all color pixels are presented once, the cpu-time is comparable to one iteration step of CMA.

In general both algorithms should obtain an optimal result. However, they both depend upon the initial positions of the cluster centers. Moreover, CL depends on the order in which colors are presented to the algorithm. Which algorithm is more appropriate depends on the application and on the quality of the available initial conditions. Because of the steepest-descent character of CL, it is expected to be more appropriate when near-optimal initial conditions are available. In this paper, it will be demonstrated that CL and CMA perform equally well on the problem of color quantization.

2.3. The hierarchical competitive learning algorithm (HCL)

Since splitting algorithms do not need initial conditions, a way to overcome the problem of local optima is by combining clustering and splitting techniques. Several strategies are possible. We propose the following algorithm.
Motivated by its speed, the clustering part of the algorithm is performed by CL. The splitting part is performed by the following steps:
1. Start with one cluster representative at a random position and apply CL, so that the complete color space can be regarded as one cluster around the representative.
2. Degenerate (split) this center into two independent ones, and apply CL again to both centers.
3. Repeat step 2 for all obtained cluster centers, until the desired number of clusters is obtained.
If the desired number of clusters differs from a power of two, the splitting in the last step is performed on a limited number of centers. In Fig. 1, HCL is presented schematically. HCL is completely independent of the initial position of the cluster center; it only depends on the order in which the color pixels are presented. This effect will be shown to be small in the case of color quantization.

Fig. 1. Schematic presentation of the HCL algorithm for C = 4. The three iteration steps are numbered. The color pixels are denoted as white dots, the obtained cluster centers as black dots.

2.4. The genetic C-means clustering algorithm (GCMA)

Another way of avoiding local optima is by optimizing Eq. (1) in a global way. In the past, we have developed a genetic approach, combining genetic algorithms with CMA. This technique will be employed for comparison.

3. Experiments and discussion

In this section several experiments are discussed to demonstrate the performance of the different clustering algorithms CMA, CL, HCL and GCMA. The images used are RGB color images of 256 × 256 pixels. A set of five commonly used images, ‘‘Lena’’,


‘‘Peppers’’, ‘‘Airplane’’, ‘‘Mandrill’’ and ‘‘Sailboat’’, are taken from a standard library (e.g., WWW at site http://vision.ce.pusan.ac.kr). To be able to compare results with the splitting algorithm VBA, images are prereduced to 5 bits/color. All images are quantized to 16, 32 and 64 colors.

In the first experiment the dependence on initial conditions is investigated. Several strategies are possible to obtain an initial set of palette colors for starting a clustering algorithm. An obvious choice is a random initial set. Applying the quantizer to different initial sets independently allows one to study in a statistical way the influence of the initial conditions on the behaviour of the algorithm. A statistically representative number of initial sets is constructed and the algorithm is applied to each set independently. The distribution of obtained MSEs is then a statistical representation of the distribution of local optima obtained by the quantizer. In Fig. 2(a)–(c) such distributions are shown for the color image ‘‘Lena’’, quantized to 16 colors, after 1000 independent runs using random initial conditions, after applying CMA, CL and HCL, respectively. In Fig. 2(a), a few discrete local optima are clearly visible, which indicates that CMA converges to a local optimum. Moreover, the distance between different local optima is large: best and worst case differ by almost a factor of 2. In Fig. 2(b), a continuous distribution of local optima is visible. This agrees with the fact that CL is a steepest-descent type of approach, which converges to the nearest local optimum. The distribution, however, is as broad as the distribution of CMA. Finally, in Fig. 2(c), a narrow distribution on the left-hand side demonstrates that the hierarchical approach, which is independent of initial conditions, converges to a solution near the true global optimum. The width of this distribution reflects the dependence of the competitive learning algorithm on the order in which the color pixels are presented. The effect of this dependence is only a few percent.

Table 1
Obtained MSEs after applying CMA and CL, respectively, using the result of VBA as initial conditions. The results of HCL and GCMA are also given.

Image      C     VBA   CMA   CL    HCL   GCMA
Lena       16    576   377   368   370   351
           32    304   217   206   207   201
           64    169   131   126   126   130
Peppers    16   1370   602   504   505   499
           32    519   347   295   295   286
           64    247   181   171   174   175
Airplane   16    419   273   248   265   228
           32    239   132   122   131   117
           64    101    73    69    74    71
Mandrill   16   1971   676   637   637   628
           32    775   418   372   373   374
           64    456   267   235   234   244
Sailboat   16    463   342   338   339   328
           32    275   229   210   210   213
           64    175   149   137   138   141

Fig. 2. Distribution of MSEs for ‘‘Lena’’, quantized to 16 colors, obtained after applying 1000 independent runs using random initial conditions. (a) CMA; (b) CL; (c) HCL.
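The CL update of Eq. (4) and the hierarchical splitting of Section 2.3 can be sketched as follows. This is a hedged illustration on synthetic data: the function names, the linearly decaying learning rate, and the random perturbation used to split a center are my own choices, not necessarily those of the paper.

```python
import numpy as np

def cl(colors, centers, lr0=1.0, rng=None):
    """Competitive learning, Eq. (4): colors are presented in random order
    and only the closest center is moved towards each presented color,
    with a learning rate decaying monotonically from lr0."""
    if rng is None:
        rng = np.random.default_rng(0)
    centers = np.asarray(centers, dtype=float).copy()
    order = rng.permutation(len(colors))
    for t, idx in enumerate(order):
        x = colors[idx]
        i = ((centers - x) ** 2).sum(axis=1).argmin()  # winning center
        lr = lr0 * (1.0 - t / len(order))              # decaying alpha(t)
        centers[i] += lr * (x - centers[i])            # Eq. (4)
    return centers

def hcl(colors, C, rng=None):
    """Hierarchical CL: start from a single center, then repeatedly split
    every center into two perturbed copies and re-run CL, until C centers
    are reached (C is assumed to be a power of two for brevity)."""
    if rng is None:
        rng = np.random.default_rng(0)
    centers = colors[rng.choice(len(colors), 1)].astype(float)
    centers = cl(colors, centers, rng=rng)
    while len(centers) < C:
        eps = rng.normal(scale=1.0, size=centers.shape)
        centers = np.vstack([centers - eps, centers + eps])  # split step
        centers = cl(colors, centers, rng=rng)
    return centers

def mse(colors, centers):
    """Eq. (1) per pixel: squared error of mapping each color to its
    nearest palette entry."""
    d2 = ((colors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean()
```

Re-running `hcl` with different random presentation orders mimics the experiment behind Fig. 2(c).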

Table 2
CPU-times in seconds (on an HP-9000/730 workstation) of the algorithms VBA, CMA, CL, HCL and GCMA.

C     VBA   CMA   CL     HCL   GCMA
16    1.1    34   5.5    20    145
32    1.5    89   7.5    29    284
64    1.7   267   12.0   39    679
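For comparison, the genetic C-means hybrid of Section 2.4 can be sketched in the same spirit. This is a loose reconstruction for illustration only: the population size, the uniform crossover on palette rows and the Gaussian mutation are my own assumptions, not the operators of Scheunders (1996, 1997).

```python
import numpy as np

def gcma(colors, C, pop=6, gens=5, seed=0):
    """Sketch of a genetic C-means hybrid: a population of palettes is
    evolved by crossover and mutation, and every individual is refined
    by a few C-means steps before being scored by the MSE of Eq. (1)."""
    rng = np.random.default_rng(seed)

    def refine(centers, iters=3):
        # a few C-means steps: Eq. (3) assignment, then Eq. (2) update
        centers = centers.copy()
        for _ in range(iters):
            d2 = ((colors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            for k in range(C):
                members = colors[labels == k]
                if len(members):
                    centers[k] = members.mean(axis=0)
        return centers

    def score(centers):
        # fitness: per-pixel MSE of Eq. (1), lower is better
        d2 = ((colors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return d2.min(axis=1).mean()

    # initial population of refined random palettes
    population = [refine(colors[rng.choice(len(colors), C, replace=False)]
                         .astype(float)) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=score)        # selection: keep the fittest half
        population = population[:pop // 2]
        while len(population) < pop:
            a, b = rng.choice(pop // 2, 2, replace=False)
            mask = rng.random(C) < 0.5    # uniform crossover on palette rows
            child = np.where(mask[:, None], population[a], population[b])
            child += rng.normal(scale=2.0, size=child.shape)  # mutation
            population.append(refine(child))
    return min(population, key=score)
```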

In the second experiment, the four algorithms CMA, CL, HCL and GCMA are compared. CMA and CL are applied using fixed initial conditions, which were generated by the splitting algorithm VBA. Notice that CL and HCL depend on the order of the presented pixels, an effect which was shown to be small in the previous experiment. In order to deal with this dependence, the algorithms are applied 20 times independently, and the average result is shown. Observed variances were only a few percent. In Table 1 all results are shown. All clustering algorithms improve the results obtained by the splitting algorithm. In all but a few cases, the results of CL are statistically significantly better than those of CMA. The results of HCL are very close to the results obtained by CL. Only on the image ‘‘Airplane’’ are differences significantly observable, probably due to the large amount of white in that image. In Table 2, the average CPU-times are shown. The splitting algorithms are very fast. CL is 10 to 20 times faster than CMA, and HCL needs about 3 times more cpu-time than CL, while GCMA is clearly the most time-consuming approach. From these experiments we can conclude:
- CL and CMA are equally dependent on initial conditions;
- CL and CMA generate comparable results, but CL converges much faster than CMA, which makes it a useful fast alternative for optimal color quantization;
- both HCL and GCMA are almost insensitive to the applied initial conditions;
- GCMA obtains the best results, but at high computational cost;
- HCL is much faster and generates near-optimal results.

Discussion

Loew: I just would like to observe that with respect to colour, it may be worthwhile to consider that


human visual perception of colour is far less dependent on resolution in colour than in intensity. If you represent colour as intensity, hue and saturation, it has been shown that far more bits are required in the intensity channel than in the hue or saturation channels. You may be able to take advantage of this for some of your work.

Scheunders: Maybe I can comment on this. I used the RGB space and, of course, I could use another space. There are better spaces than the RGB space. This was, however, not the main issue that I wanted to discuss. By taking, for example, another error function, you could take this problem into account as well. I did check the same thing on other spaces, linear as well as non-linearly transformed spaces, and they gave similar results: the broad distribution of local optima and the problem that the convergence speed can differ a lot between different algorithms.

Loew: So it may be that the straight mean-square error may be handicapping yourself unnecessarily. That is to say, not all errors are the same.

Scheunders: Yes, that is true, that could be correct. However, by using another error function you would probably have the same problems.

Raghavan: I was wondering if all of the work you did, all the results you reported, are just based on this one image?

Scheunders: The table that I showed was based on five test images. I did not use a large set of images, I just tested it on a few images.

Raghavan: In some situations, one needs to get the quantization done in the context of a large set of images, so that the quantization is the same for maybe one or two hundred images. Have you thought about that?

Scheunders: I know of a paper where they used the competitive learning approach for an image series; they start by taking the result of one image as the initial condition for the next image.


References

Balasubramanian, R., Allebach, J.P., 1990. A new approach to palette selection for color images. J. Imag. Technol. 17, 284–290.
Balasubramanian, R., Allebach, J.P., Bouman, C.A., 1994. Color-image quantization with use of a fast binary splitting technique. J. Opt. Soc. Amer. A 11 (11), 2777–2786.
Celenk, M., 1990. A color clustering technique for image segmentation. Comput. Vision Graphics Image Process. 52, 145–170.
Dekker, A.H., 1994. Kohonen neural networks for optimal color quantization. Network: Computation in Neural Systems 5, 351–367.
Heckbert, P., 1982. Color image quantization for frame buffer display. Comput. Graphics 16 (3), 297–307.
Kohonen, T., 1995. Self-Organizing Maps. Springer Series in Information Sciences, Vol. 30. Springer, New York.
Kotropoulos, C., Augé, E., Pitas, I., 1992. Two-layer learning vector quantizer for color image quantization. In: Vandewalle, J., Boite, R., Moonen, M., Oosterlinck, A. (Eds.), Signal Processing IV: Theories and Applications. Elsevier, Amsterdam, pp. 1177–1180.
Lim, Y.W., Lee, S.U., 1990. On the color image segmentation algorithm based on the thresholding and the fuzzy C-means techniques. Pattern Recognition 23 (9), 935–952.
Scheunders, P., 1996. A genetic Lloyd-Max image quantization algorithm. Pattern Recognition Letters 17, 547–556.
Scheunders, P., 1997. A genetic C-means clustering algorithm applied to color image quantization. Pattern Recognition 30 (6).
Shafer, S.A., Kanade, T., 1987. Color vision. In: Shapiro, S.C., Eckroth, D. (Eds.), Encyclopedia of Artificial Intelligence. Wiley, New York, pp. 124–131.
Wan, S.J., Prusinkiewicz, P., Wong, S.K.M., 1990. Variance-based color image quantization for frame buffer display. Color Res. Appl. 15, 52–58.
Wu, X., 1987. Color quantization by dynamic programming and principal analysis. ACM Trans. Graphics 11 (4), 348–372.