Automatic cluster evolution using gravitational search algorithm and its application on image segmentation


Engineering Applications of Artificial Intelligence. Journal homepage: www.elsevier.com/locate/engappai

Vijay Kumar a,*, Jitender Kumar Chhabra b, Dinesh Kumar c

a CSE Department, JCDMCOE, Sirsa, Haryana, India
b Computer Engineering Department, NIT, Kurukshetra, Haryana, India
c CSE Department, GJUS&T, Hisar, Haryana, India

Article info

Article history: Received 12 July 2013; received in revised form 16 October 2013; accepted 18 November 2013.

Abstract

In real-life problems, prior information about the number of clusters is not known. In this paper, an attempt is made to determine the number of clusters using automatic clustering using gravitational search algorithm (ACGSA). Based on the statistical properties of datasets, two new concepts are proposed to efficiently find the optimal number of clusters. Within ACGSA, a variable chromosome representation is used to encode cluster centers with different numbers of clusters. In order to refine the cluster centroids, two new operations, namely threshold setting and weighted cluster centroid computation, are also introduced. Finally, a new fitness function is proposed to make the search more efficient. The proposed technique is compared with recently developed automatic clustering techniques. It is further applied to automatic segmentation of both grayscale and color images, and its performance is compared with that of other techniques. Experimental results demonstrate the efficiency and efficacy of the proposed clustering technique over existing techniques.

Keywords: Gravitational search algorithm; Clustering; Image segmentation

1. Introduction

Data clustering is an unsupervised technique that distributes unlabeled data into groups based on dissimilarity measures (Jain et al., 1999; Bezdek, 1981). Clustering techniques are applied to a wide variety of applications such as image segmentation, data compression, pattern recognition, and machine learning (Jain et al., 1999, 2000; Ball and Hall, 1967; Das et al., 2009a). They can be classified into two main categories: hierarchical and partitional clustering. The former constructs a tree-like, nested partitioning of the dataset (Xu and Wunsch, 2009); however, it is computationally expensive and fails to separate overlapping clusters (Das et al., 2009a). The latter, partitional clustering, divides the data points into a pre-specified number of clusters without building a hierarchical structure (Xu and Wunsch, 2009). Partitional clustering is more widely used in pattern recognition than hierarchical clustering, as reported in the literature (Jain et al., 2000). In many clustering problems, the number of clusters may be unknown or difficult to estimate. Recently, researchers have used metaheuristic techniques to solve the partitional

* Corresponding author. E-mail addresses: [email protected] (V. Kumar), [email protected] (J.K. Chhabra), [email protected] (D. Kumar).

clustering problem. A comprehensive review of metaheuristic algorithms for clustering problems can be found in Abraham et al. (2007) and Hruschka et al. (2009). However, most clustering techniques based on metaheuristic algorithms require the number of clusters to be specified as an input prior to running the algorithm; we would like to determine the number of clusters at run time. The main contribution of this paper is a new approach, called Automatic Clustering using Gravitational Search Algorithm (ACGSA), for automatically finding the number of clusters in a given dataset. It simultaneously finds both the number of clusters and the corresponding partitioning. Two new concepts, threshold setting and weighted cluster centroid computation, are introduced for finding the optimal number of clusters accurately and efficiently. A new fitness function is also proposed to guide the search accurately, as the fitness function plays a vital role in the performance of metaheuristic-based clustering techniques. The performance of ACGSA has been evaluated on real-life datasets with respect to three clustering metrics, namely the number of clusters and the inter-cluster and intra-cluster distances. The results have been compared with state-of-the-art clustering techniques such as Automatic Clustering using Modified Differential Evolution (ACDE) (Das et al., 2008b), Dynamic Clustering using Particle Swarm Optimization (DCPSO) (Omran et al., 2006) and Genetic Clustering with an unknown number of clusters (GCUK) (Bandyopadhyay and Maulik, 2002). ACGSA has further been


applied to five well-known grayscale and color images for segmentation. The experimental results demonstrate that the number of clusters determined by ACGSA is almost equal to the number of actual classes present in the real-life datasets, and the inter-cluster and intra-cluster distance values are better than those obtained with other existing techniques.

The rest of the paper is organized as follows. Section 2 briefly describes related work in the field of metaheuristic-based partitional clustering. Section 3 presents the basics of clustering and the gravitational search algorithm. Section 4 presents the motivation and mathematical foundation of the proposed work. Section 5 presents the real-life datasets, the experimental setup and the experimental results. Section 6 covers the application of the proposed approach to image pixel classification, followed by conclusions in Section 7.

2. Related works

Finding the optimal number of clusters in a given dataset is a challenging task. Recently, researchers have focused on this problem and tried to develop dynamic clustering techniques. Lee and Antonsson (2000) used an evolutionary strategy to cluster a dataset dynamically; they implemented variable-length genomes to search for both the centroids and the number of clusters. Bandyopadhyay and Maulik (2002) proposed a variable string length genetic algorithm to solve the clustering problem using a single fitness function; their algorithm was called Genetic Clustering for Unknown K (GCUK). Jarboui et al. (2007) developed a discrete PSO algorithm for partitional clustering problems. Omran et al. (2006) developed an automatic hard clustering scheme called DCPSO. It starts by partitioning the dataset into a relatively large number of clusters in order to reduce the effects of initial conditions, and uses PSO to select the optimal number of clusters. Ye and Chen (2005) used a hybridization of PSO and the K-means algorithm to automatically detect the cluster centers of datasets with geometrical structure. Abdule-Wahab et al. (2006) used scatter search for automatic clustering. Das et al. (2008b) presented a new DE-based strategy, named Automatic Clustering using DE (ACDE), for hard partitioning problems. Das et al. (2008a) used the same concept as ACDE except that multi-elitist PSO was used instead of DE. Das and Konar (2009) added fuzzy concepts to ACDE for image segmentation, and Das and Sil (2010) extended ACDE with a kernel-induced similarity measure for image segmentation. Quadfel et al. (2010) used the same concept as ACDE except that PSO was used in place of DE. Pan and Cheng (2007) proposed a framework for automatic clustering using tabu search; they treated the number of clusters as a variable and evolved it to an optimal value. Saha et al. proposed a technique for automatic evolution of clusters: a variable string length genetic algorithm based clustering technique that utilizes symmetry-based distances (Bandyopadhyay and Saha, 2008; Saha and Bandyopadhyay, 2010; Saha and Maulik, 2011). Das et al. (2009b) used a bacterial evolutionary algorithm for automatic clustering. Lee and Chen (2010) proposed an improved differential evolution algorithm with cluster number oscillation for automatic crisp clustering. Cai and Gong (2011) used differential evolution with a modified point-symmetry-based cluster validity index for automatic clustering. Hatamlou et al. (2012) used gravitational search for data clustering with a fixed number of clusters. Recently, Masoud et al. (2013) used combinatorial PSO for dynamic clustering, and Sarkar and Das (2013) presented a 2-D histogram based multi-level thresholding approach for image segmentation. Hybridizations of metaheuristic algorithms have also been used for clustering problems. Niknam and Amiri (2010) proposed a

cluster optimization algorithm based on the combination of PSO, ant colony optimization and K-means, and Supratid and Kim (2009) did likewise with a combination of GA, ACO and fuzzy C-means. Other techniques, such as the cuckoo search (Gandomi et al., 2013b), bat (Gandomi et al., 2013a), krill herd (Gandomi and Alavi, 2012) and firefly (Gandomi et al., 2011) algorithms, could also be used for data clustering. However, the gravitational search algorithm is yet to be applied to automatic clustering of real-life datasets as well as image segmentation.

3. Background

This section describes the clustering problem and the gravitational search algorithm.

3.1. Clustering problem

Clustering algorithms aim to minimize the within-cluster variation, called the intra-cluster distance, and maximize the between-cluster variation, called the inter-cluster distance. The mathematical description of partitional clustering is as follows. Let the set of n input data points be X = \{x_1, x_2, \ldots, x_n\}, where x_j = (x_{j1}, x_{j2}, \ldots, x_{jd}) \in \mathbb{R}^d and each component x_{ji} represents a feature. Clustering algorithms try to find K partitions of X, C = \{C_1, C_2, \ldots, C_K\}, such that (Jain et al., 1999; Xu and Wunsch, 2009):

C_i \neq \emptyset, \quad i = 1, 2, \ldots, K \qquad (1)

\bigcup_{i=1}^{K} C_i = X \qquad (2)

C_i \cap C_j = \emptyset, \quad i, j = 1, 2, \ldots, K, \; i \neq j \qquad (3)

In general, data points are assigned to clusters based on some similarity measure. The most commonly used similarity measure is the Euclidean distance between a data point x_j and the cluster center m_i of cluster C_i, defined as

d_{ij} = \sqrt{\sum_{k=1}^{d} (x_{jk} - m_{ik})^2} \qquad (4)
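To illustrate Eq. (4), the nearest-center assignment step can be sketched as below. This is a minimal sketch in Python with NumPy, not the authors' implementation; the array names `X` and `centers` are our own.

```python
import numpy as np

def assign_to_clusters(X, centers):
    """Assign each data point to its nearest cluster center using Eq. (4).

    X       : (n, d) array of data points
    centers : (K, d) array of cluster centers
    Returns : (n,) array of cluster labels in [0, K)
    """
    # d_ij: Euclidean distance between every point and every center
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.argmin(dists, axis=1)
```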

3.2. Gravitational search algorithm

The gravitational search algorithm (GSA), first introduced by Rashedi et al. (2009), is inspired by the laws of gravitation and motion. In this algorithm, the agents are considered as objects and their performance is measured by their masses. All objects attract each other by the gravitational force, which causes a global movement of all objects towards the objects with heavier masses; the heavy masses correspond to good solutions of the problem. In GSA, each mass has four specifications: position, inertial mass, active gravitational mass and passive gravitational mass. The position of a mass corresponds to a solution of the problem, and its gravitational and inertial masses are determined using the fitness function (Cai and Gong, 2011). During the search process, the agents are moved according to the following equations:

x_i^{t+1} = x_i^t + v_i^{t+1} \qquad (5)

v_i^{t+1} = rand_i \times v_i^t + a_i^t \qquad (6)


where x_i^t and v_i^t represent the current position and velocity of the ith agent at iteration t, respectively, and a_i^t denotes its acceleration at iteration t, defined as

a_i^t = \frac{F_i^t}{M_{ii}^t} \qquad (7)

Here M_{ii}^t and F_i^t denote the inertial mass of and the total force acting on the ith agent at iteration t, respectively. The total force acting on the ith agent is a weighted sum of the forces exerted by the Kbest agents (i.e., the first K agents with the best fitness):

F_i^t = \sum_{j \in Kbest, \, j \neq i} rand_j \times F_{ij}^t \qquad (8)

where F_{ij}^t represents the force acting on agent i from agent j at iteration t and rand_j is a random number in the interval [0, 1]. F_{ij}^t is given by

F_{ij}^t = G(t) \, \frac{M_{pi}^t \times M_{aj}^t}{R_{ij}^t + \varepsilon} \, (x_j^t - x_i^t) \qquad (9)

where M_{pi} and M_{aj} are the passive and active gravitational masses related to agents i and j respectively, G(t) is the gravitational constant at time t, \varepsilon is a small constant, and R_{ij}^t is the Euclidean distance between the two agents x_i and x_j at iteration t. The gravitational constant G is a function of its initial value G_0 and the iteration t:

G(t) = G_0 \exp\left(1 - \frac{t}{T}\right) \qquad (10)

where G_0 is a constant and T is the total number of iterations. The gravitational and inertial masses are calculated through fitness function evaluation. In this paper, we assume that the gravitational and inertial masses are equal and compute them from the map of fitness (Rashedi et al., 2010, 2009). After updating the velocity and position of an agent, the termination criterion is checked (i.e., the number of iterations or an adequate fitness function value). If the criterion is satisfied, the best solution is retained; otherwise the operations of Eqs. (5)-(9) are repeated. The main strength of GSA lies in the fact that it is memory-less: only the current positions of the agents participate in the update procedure, in contrast to other metaheuristic algorithms such as PSO and GA (Rashedi et al., 2009).
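A minimal sketch of one GSA iteration built from Eqs. (5)-(10) follows. The masses-from-fitness mapping follows common GSA practice (Rashedi et al., 2009); the fixed `kbest` (often decreased over iterations) and all names here are our own illustrative choices, not the authors' code.

```python
import numpy as np

def gsa_step(pos, vel, fitness, t, T, G0=100.0, kbest=5, eps=1e-9):
    """One GSA iteration (Eqs. (5)-(10)) for a maximization problem."""
    N, _ = pos.shape
    G = G0 * np.exp(1.0 - t / T)                  # Eq. (10)
    # Masses from fitness: the best agent becomes the heaviest
    worst, best = fitness.min(), fitness.max()
    m = (fitness - worst) / (best - worst + eps)
    M = m / (m.sum() + eps)
    # Total force from the kbest heaviest agents (Eqs. (8)-(9));
    # the j == i term vanishes because pos[j] - pos[i] is zero
    F = np.zeros_like(pos)
    for j in np.argsort(M)[-kbest:]:
        R = np.linalg.norm(pos - pos[j], axis=1)  # distances R_ij to agent j
        coef = G * M * M[j] / (R + eps)           # passive x active mass term
        F += np.random.rand(N, 1) * coef[:, None] * (pos[j] - pos)
    a = F / (M[:, None] + eps)                    # Eq. (7)
    vel = np.random.rand(N, 1) * vel + a          # Eq. (6)
    return pos + vel, vel                         # Eq. (5)
```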

4. Proposed work

This section first describes the motivation and mathematical foundation of the proposed work, followed by the proposed automatic clustering approach.

4.1. Motivation

The major contribution of this paper is a novel approach for automatic clustering. A variable string length based clustering technique is proposed that automatically finds the optimal number of clusters and the corresponding partitions. The variable string encodes the centers of the clusters together with the corresponding threshold values. For this, a new method of setting the threshold value of each cluster centroid is proposed; it enables the proposed algorithm to take the variance of the given dataset into account, which has not so far been considered in other contemporary techniques. The weighted Euclidean distance is preferred over the plain Euclidean distance for assigning data points to clusters, so that the proposed algorithm includes the effect of the threshold values corresponding to the cluster centers in the distance computation, for all types of clusters. A new


fitness function is proposed and used for computing the fitness of the agents. The GSA metaheuristic is preferred because it requires few parameters for fine-tuning: differential evolution (DE), for example, requires parameters such as the crossover and mutation probabilities, whereas in GSA only one constant has to be set. ACGSA is inspired by ACDE (Das et al., 2008b). ACDE has two major shortcomings. First, it uses a fixed cutoff threshold value for all datasets, whereas this value is in fact dataset dependent and should change according to the dataset. Second, the threshold value is not taken into account during the distance computation. These two weaknesses of ACDE motivated the new approach, explained below.

Issue 1: Threshold calculation and cutoff value. Das et al. (2008b) fixed the cutoff threshold value at 0.5, but, as said earlier, this value is dataset dependent and has to change accordingly. The behavior of a dataset largely depends on its number of features and data points. Using this fact, a novel approach for threshold setting is proposed: the threshold value of each cluster center is set to the corresponding within-cluster variation, and the cutoff threshold value is set to the mean standard deviation of the given dataset instead of being fixed at a single value for all datasets. This threshold setting exploits the variation of the given dataset, which helps in effectively computing the optimal number of clusters.

Issue 2: Weighted cluster centroid calculation. There are many ways to define the similarity between a cluster centroid and data points, such as the correlation coefficient or the Euclidean distance. Das et al. (2008b) used the Euclidean distance as the similarity measure, without taking the threshold of the corresponding cluster centroid into account. To include it, our approach uses the weighted Euclidean distance, which does consider the effect of the threshold during the computation. This distance is mathematically formulated as

d_{ij}^w = \sqrt{\sum_{l=1}^{d} w_i^2 \, (x_{jl} - m_{il})^2} \qquad (11)

where w_i is the threshold assigned to cluster center m_i. The larger the value of w_i, the better the ith cluster center is for clustering.
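A small sketch of the weighted distance of Eq. (11), under our own naming; `w` holds the thresholds assigned to the active cluster centers.

```python
import numpy as np

def weighted_distance(x, centers, w):
    """Weighted Euclidean distance of Eq. (11) from point x to each center.

    x       : (d,) data point
    centers : (K, d) active cluster centers
    w       : (K,) thresholds w_i assigned to the centers
    """
    diff2 = (x[None, :] - centers) ** 2        # (x_jl - m_il)^2 per feature
    return np.sqrt(np.sum(w[:, None] ** 2 * diff2, axis=1))
```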

4.2. Automatic clustering using gravitational search algorithm

The proposed approach of automatic clustering using gravitational search algorithm (ACGSA) consists of the following four steps.

Algorithm
Step 1. Initialize the algorithm parameters: the number of agents (population size), the maximum number of iterations, the maximum number of clusters (K_max), and the parameters of the gravitational constant equation.
Step 2. Initialize each agent so that it contains K_max randomly selected cluster centers and the corresponding activation threshold values (see Section 4.2.1).
Step 3. Repeat the following steps until the maximum number of iterations is reached:
(a) Find the active cluster centers in each agent (see Section 4.2.1).
(b) For each data point x_i, compute its weighted Euclidean distance from all the active cluster centers of each agent.
(c) Assign x_i to the cluster center for which this distance is minimum.
(d) If the number of data points belonging to any cluster is less than two, reinitialize the cluster centers of the agent by the average computation described in Section 4.2.3.
(e) Update the agents using the GSA procedure outlined in Section 3.2. The fitness of the agents is used to guide the search process (see Section 4.2.2).
Step 4. The best agent at the final iteration yields the optimal cluster centers and the optimal number of clusters.

The main concepts of ACGSA and its steps are described in the following subsections.

4.2.1. Cluster encoding in agents

ACGSA uses cluster centroid-based encoding to represent clustering solutions. In the proposed method, for n data points, each having d dimensions, and a user-specified maximum number of clusters K_max, an agent is a vector of real numbers of dimension K_max + K_max × d. The first K_max entries are positive real numbers in the range [0, 1], each of which controls whether the corresponding cluster is activated during clustering. The remaining entries hold the K_max cluster centers, each having d dimensions. The vector V_i(t) of agent i is illustrated in Fig. 1, where m_{i,j} is the jth cluster center of the ith agent and Th_{i,j} is the threshold value of the corresponding cluster centroid m_{i,j}. The Th_{i,j} are the selection thresholds for the active cluster centers; each is assigned to its corresponding cluster center, and its value is set to the corresponding intra-cluster variation. Threshold_Specified denotes the specified threshold or cutoff value, which is based on the mean standard deviation of the given dataset. The rule for selecting the clusters encoded by an agent is as follows:

If Th_{i,j} > Threshold_Specified then m_{i,j} is active;
else m_{i,j} is inactive.

That is, when Th_{i,j} exceeds the cutoff, the jth cluster center of the ith agent is active and is selected for partitioning the dataset. For example, for two-dimensional data, the agent in Fig. 1 encodes the centers of five clusters. Assume the threshold cutoff value is 0.4. According to the rule above, only two thresholds are higher than 0.4 (i.e., 0.8 and 0.6), so the corresponding first (8.6, 9.7) and fourth (2.2, 1.9) cluster centers are selected for partitioning the dataset. When the agents are updated during the GSA search process, it may happen that none of the thresholds is greater than the cutoff value. In such a situation, two thresholds are selected at random and reinitialized to values greater than the cutoff, ensuring a minimum of two clusters.

[Fig. 1. Agent encoding scheme for a 2-D dataset: five selection thresholds (0.8, 0.3, 0.2, 0.6, 0.1) followed by five cluster centers (8.6, 9.7), (6.0, 5.4), (3.7, 7.4), (2.2, 1.9), (4.5, 0.9).]
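The selection rule can be sketched as follows. This is an illustrative decoding under our own layout assumption (K_max thresholds followed by the flattened centers), not the authors' code; the reinitialization branch implements the minimum-of-two-clusters rule described above.

```python
import numpy as np

def active_centers(agent, k_max, d, cutoff):
    """Select the active cluster centers encoded by an agent.

    agent  : 1-D array of length k_max + k_max*d, holding k_max selection
             thresholds followed by k_max flattened cluster centers
    cutoff : Threshold_Specified, here the mean standard deviation of the
             dataset features
    """
    thresholds = agent[:k_max]
    centers = agent[k_max:].reshape(k_max, d)
    active = thresholds > cutoff
    if active.sum() < 2:
        # Fewer than two thresholds exceed the cutoff: reinitialize two
        # random thresholds above it so at least two clusters stay active
        idx = np.random.choice(k_max, size=2, replace=False)
        thresholds[idx] = cutoff + 1e-6 + np.random.rand(2) * max(1.0 - cutoff, 0.1)
        active = thresholds > cutoff
    return centers[active], thresholds[active]
```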

4.2.2. Fitness evaluation

The partitioning of a dataset is achieved by optimizing a specified clustering criterion. A large number of clustering criteria have been reported in the literature (Jain et al., 1999; Xu and Wunsch, 2009), most of them based on the within-cluster and between-cluster scatter matrices. This paper uses trace(S_W^{-1} S_B). The within-cluster variation S_W measures how scattered the data points are around their cluster centroids and is defined as

S_W = \sum_{j=1}^{K} \sum_{i=1}^{n} \nu_{ij} \, (X_i - m_j)(X_i - m_j)^T \qquad (12)

where \nu_{ij} is a partition matrix, with \nu_{ij} = 1 if X_i belongs to cluster j and zero otherwise. The between-cluster variation S_B measures how scattered the cluster centroids are from the mean \bar{X} of the whole dataset and is defined as

S_B = \sum_{i=1}^{K} n_i \, (m_i - \bar{X})(m_i - \bar{X})^T \qquad (13)

In trace(S_W^{-1} S_B), S_B is normalized by S_W, and larger values of the criterion correspond to higher-quality clustering solutions. Sheng et al. (2008) suggested a penalty term for this criterion, since it is biased towards increasing the number of clusters; their penalty term included the user-specified maximum number of clusters. The penalty function has been modified in this paper and is mathematically reformulated as

Fit = trace(S_W^{-1} S_B) \times \frac{1}{K - 1} \qquad (14)

where 1/(K − 1) is the penalty function and K is the number of clusters. The penalty function has been introduced to ensure that at least two clusters exist in the given dataset.

4.2.3. Cluster center validation

It is possible that fewer than two data points are assigned to a cluster center. This may happen when the selected cluster center(s) lie outside the boundary of the distribution of the data points. To cope with this problem, the cluster center positions of such agents are reinitialized by an average computation method (Das et al., 2008b).
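Before moving to the complexity analysis, here is a sketch of the fitness of Eqs. (12)-(14). The names are ours, and the pseudo-inverse is a guard we add for a possibly singular S_W, which the paper does not discuss.

```python
import numpy as np

def acgsa_fitness(X, labels, centers):
    """Fitness of Eq. (14): trace(S_W^{-1} S_B) * 1 / (K - 1), with K >= 2."""
    d = X.shape[1]
    K = centers.shape[0]
    mean = X.mean(axis=0)                  # mean of the whole dataset
    S_W = np.zeros((d, d))
    S_B = np.zeros((d, d))
    for j in range(K):
        pts = X[labels == j]
        diff = pts - centers[j]
        S_W += diff.T @ diff               # Eq. (12): scatter within cluster j
        b = (centers[j] - mean)[:, None]
        S_B += len(pts) * (b @ b.T)        # Eq. (13): scatter between clusters
    # pinv guards against a singular S_W (our addition, not in the paper)
    return np.trace(np.linalg.pinv(S_W) @ S_B) / (K - 1)
```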

4.3. Complexity analysis

This section presents the complexity analysis of ACGSA; both the time and space complexities of the proposed technique are analyzed.

4.3.1. Time complexity


1. Initialization of ACGSA needs O(agent_size × string_length) time, where agent_size and string_length denote the number of agents and the length of each encoded agent, respectively. Here string_length is O(K_max + K_max × d), where d is the dimension of the dataset and K_max is the maximum number of clusters.


2. The active cluster extraction step of ACGSA requires O(agent_size × K_max) time.
3. Fitness computation is composed of three basic steps: (a) the assignment of data points to clusters requires O(n^2 × K_max) per agent; (b) the cluster center update requires O(K_max); (c) the fitness function itself takes O(n × K_max).
4. Step 3 is repeated for all agents, i.e., agent_size times, with the three sub-steps performed in sequence. Hence the total complexity of the fitness evaluation is agent_size × (n^2 × K_max + K_max + n × K_max) = O(agent_size × n^2 × K_max).
5. The mass calculation, acceleration calculation, and velocity and position updates of ACGSA each require O(agent_size × string_length) time.

Summing the complexities of all the above steps and considering that string_length ≪ n, the total time complexity becomes O(n^2 × K_max × agent_size) per generation, and O(n^2 × K_max × agent_size × Maxgen) over Maxgen generations, where Maxgen denotes the maximum number of generations.

4.3.2. Space complexity

The major space requirement of the ACGSA clustering technique is due to its number of agents (agent_size). Thus, the total space complexity of ACGSA is O(agent_size × string_length).

5. Experimentation and results

To evaluate the performance of ACGSA, the five real-life datasets described in Section 5.1 have been used for experimentation. The performance of ACGSA is compared with three well-established automatic clustering algorithms: ACDE, GCUK and DCPSO. The results are evaluated using accepted cluster quality measures: the optimal number of clusters found, and the inter-cluster and intra-cluster distances (Jain et al., 1999). The inter-cluster distance measures the separation between clusters, whereas the intra-cluster distance measures the homogeneity within a cluster. For good clustering, the values of the

Table 1
Description of UCI datasets.

Dataset          No. of data points   No. of features   No. of classes
Iris             150                  4                 3
Wine             178                  13                3
Glass            214                  9                 6
Breast Cancer    683                  9                 2
Vowel            871                  3                 6


intra-cluster and inter-cluster distances must be as small as possible and as large as possible, respectively.

5.1. Real-life datasets used

The clustering techniques have been tested on five real-life datasets from the UCI database (Blake and Merz, 1998). Table 1 presents the details of these datasets.

5.2. Experimental setup

The experiment was run for different values of the parameters of the proposed approach (number of agents, maximum number of iterations, K_max, and G_0), which were then fixed as follows: the number of agents and K_max are set to 30 and 15 respectively, the maximum number of iterations is fixed at 200, and G_0 is set to 100. For each dataset, each clustering algorithm is run 40 times for the comparison.

5.3. Implementation results

The performance of ACGSA is compared with ACDE (Das et al., 2008b), DCPSO (Omran et al., 2006), GCUK (Bandyopadhyay and Maulik, 2002), and classical DE (Das et al., 2008b) using three quality metrics: the optimal number of clusters obtained, and the inter-cluster and intra-cluster distances. The results are reported as mean and standard deviation over 40 independent runs in each case.

Table 2 shows the optimal number of clusters obtained. The results reveal that DCPSO, GCUK and classical DE produce two clusters for the Iris dataset, whereas ACGSA and ACDE both produce three. For the Wine dataset, ACGSA produces three clusters in every run. For the Glass dataset, ACGSA, ACDE and DCPSO produce six clusters in almost every run. For the Breast Cancer dataset, ACGSA, ACDE and GCUK find two clusters in almost every run. For the Vowel dataset, ACGSA produces six clusters in every run; ACDE produces a near-optimal number of clusters, but not in every run, and DCPSO, GCUK and classical DE do not produce the optimal number. Hence, ACGSA performs better than the other well-known techniques with respect to the optimal number of clusters.

Unpaired t-tests are used to compare the results produced by the best and second-best clustering algorithms, with 40 as the sample size. Table 3 shows the results of the unpaired t-tests based on the numbers of clusters presented in Table 2. For all datasets except Glass, the advantage of ACGSA over the other clustering techniques is statistically significant.

Tables 4 and 5 report the intra-cluster and inter-cluster distances obtained by the clustering algorithms. For all datasets except Breast Cancer, ACGSA provides more compact clusters than the other algorithms; for Breast Cancer, ACDE produces the clusters with the minimum distance between data points. Table 5 shows that for the Iris, Wine and Breast Cancer datasets ACGSA creates better-separated clusters than the other techniques, while for the Glass and Vowel datasets classical DE and ACDE, respectively, generate the best-separated clusters.
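The unpaired t-tests above can be reproduced with a standard two-sample test. The sketch below uses placeholder per-run cluster counts (not the paper's raw results), assuming the 40 per-run values of the two best algorithms are available as arrays.

```python
import numpy as np
from scipy import stats

# Hypothetical per-run cluster counts over 40 runs for the two best
# algorithms on one dataset (placeholder data, not the paper's raw results)
rng = np.random.default_rng(0)
acgsa_runs = rng.normal(2.95, 0.220, 40)   # e.g. ACGSA on Iris (Table 2)
acde_runs = rng.normal(3.25, 0.038, 40)    # e.g. ACDE on Iris (Table 2)

# Unpaired (independent two-sample) t-test, two-tailed
t_stat, p_value = stats.ttest_ind(acgsa_runs, acde_runs)
print(f"t = {t_stat:.4f}, two-tailed P = {p_value:.4g}")
```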

Table 2
Mean and standard deviation over 40 runs of the number of clusters obtained on five real-life datasets.

Dataset          ACGSA          ACDE            DCPSO          GCUK           Classical DE
Iris             2.95 ± 0.220   3.25 ± 0.038    2.23 ± 0.044   2.35 ± 0.098   2.50 ± 0.047
Wine             3.00 ± 0.000   3.25 ± 0.039    3.05 ± 0.035   2.95 ± 0.011   3.50 ± 0.014
Glass            6.02 ± 0.619   6.05 ± 0.015    5.95 ± 0.075   5.85 ± 0.035   5.60 ± 0.075
Breast Cancer    2.00 ± 0.000   2.00 ± 0.000    2.25 ± 0.063   2.00 ± 0.008   2.25 ± 0.026
Vowel            6.00 ± 0.000   5.75 ± 0.0751   7.25 ± 0.018   5.05 ± 0.007   7.50 ± 0.056



Table 3
Unpaired t-test between the best and second-best performing algorithms for each dataset, based on the data presented in Table 2.

Dataset          Standard error   t         95% Confidence interval   Two-tailed P   Significance
Iris             0.035            8.4985    −0.37028 to −0.22972      <0.0001        Extremely significant
Wine             0.006            40.5420   −0.26228 to −0.23772      <0.0001        Extremely significant
Glass            0.098            0.3064    −0.22491 to 0.16491       0.7601         Not significant
Breast Cancer    0.004            60.8130   −0.25818 to −0.24182      <0.0001        Extremely significant
Vowel            0.012            21.0538   0.22626 to 0.27364        <0.0001        Extremely significant

Table 4
Mean and standard deviation over 40 runs of the intra-cluster distance on five real-life datasets.

Dataset          ACGSA             ACDE               DCPSO              GCUK               Classical DE
Iris             1.3596 ± 0.189    3.116 ± 0.033      3.652 ± 1.195      3.567 ± 2.792      3.944 ± 1.874
Wine             1.061 ± 0.000     4.046 ± 0.002      4.851 ± 0.184      4.163 ± 1.929      4.949 ± 1.232
Glass            14.440 ± 1.722    563.247 ± 134.2    599.535 ± 10.34    594.673 ± 30.62    608.837 ± 20.92
Breast Cancer    6.096 ± 0.125     4.244 ± 0.143      4.851 ± 0.373      4.994 ± 0.904      4.694 ± 0.654
Vowel            173.124 ± 2.866   1412.63 ± 0.792    1482.51 ± 3.973    1495.13 ± 12.334   1493.72 ± 10.833

Table 5
Mean and standard deviation over 40 runs of the inter-cluster distance on five real-life datasets.

Dataset          ACGSA             ACDE              DCPSO             GCUK              Classical DE
Iris             4.341 ± 0.559     2.593 ± 0.027     2.210 ± 0.773     2.505 ± 1.409     2.115 ± 1.089
Wine             3.506 ± 0.000     3.148 ± 0.078     2.611 ± 1.637     2.805 ± 1.365     2.611 ± 1.384
Glass            22.828 ± 2.832    853.62 ± 9.044    889.32 ± 4.233    869.93 ± 1.789    891.82 ± 4.945
Breast Cancer    6.729 ± 0.192     3.257 ± 0.138     2.361 ± 0.021     2.394 ± 1.744     2.964 ± 1.464
Vowel            620.896 ± 1.754   2724.85 ± 0.124   1923.93 ± 1.154   1944.38 ± 0.747   2434.45 ± 1.213

Fig. 2. Comparison of ACGSA with other existing clustering techniques in terms of (a) the ratio of inter- to intra-cluster distance; (b) fitness function evaluations.

The results also show that the clusters formed by ACGSA are well separated and cohesive, as demonstrated above. The inter-cluster distance obtained from ACGSA is almost three times the intra-cluster distance. None of the existing clustering algorithms generates clusters as well separated and compact as those produced by the proposed algorithm. Tables 4 and 5 also indicate that the clusters formed by ACGSA exhibit an important property of clustering on these datasets: the intra-cluster distance is not larger than the inter-cluster distance. The other techniques violate this property. Fig. 2(a) shows the ratio of inter-cluster to intra-cluster distance for the five datasets; the results indicate that ACGSA produces more compact and better-separated clusters than the other existing clustering techniques. The number of fitness function evaluations is used to compare the speed of the clustering techniques; the counts for ACDE, DCPSO, GCUK and classical DE are quoted from Das et al. (2008b).


Fig. 3. (a) Variation of number of clusters with number of agents; (b) variation of intra-cluster distance with number of agents; (c) variation of inter-cluster distance with number of agents.

Fig. 4. (a) Variation of number of clusters with K_max; (b) variation of intra-cluster distance with K_max; (c) variation of inter-cluster distance with K_max.

Fig. 2(b) shows the performance of ACGSA relative to the other clustering techniques in terms of function evaluations; the results indicate that ACGSA is faster than the other techniques. Besides fitness function evaluations, we have also measured the execution time for all datasets: the mean execution times for the Iris, Wine, Glass, Breast Cancer and Vowel datasets are 17.3337 s, 23.1006 s, 27.9689 s, 81.0526 s and 92.8483 s, respectively.

5.4. Sensitivity analysis of parameters for ACGSA

ACGSA has three control parameters that affect the performance of the clustering technique: the number of agents, the maximum number of user-specified clusters (K_max), and G_0. Their impact is discussed below.

(1) Number of agents: To investigate the effect of the number of agents on all datasets, ACGSA was executed with 10, 30, 50, 70, and 100 agents, keeping the other parameters fixed as specified in Section 5.2. Fig. 3(a) shows the number of clusters evaluated over all datasets. The results demonstrate that the number of clusters increases with the number of agents for the Iris and Vowel datasets. For the Wine and Breast Cancer datasets, changing the number of agents has no effect on the number of

clusters, while for the Glass dataset the number of clusters decreases as the number of agents increases. Fig. 3(b) and (c) shows the variation of the intra-cluster and inter-cluster distances with the number of agents. For the Iris dataset, the inter-cluster distance increases and the intra-cluster distance decreases as the number of agents increases. There is no effect on the inter-cluster and intra-cluster distances for the Wine dataset. For Breast Cancer, the intra-cluster distance first increases with the number of agents and thereafter decreases. For the Glass and Breast Cancer datasets, the inter-cluster distance decreases as the number of agents increases. For most of the datasets, keeping the number of agents at 30 provides reasonable inter-cluster and intra-cluster distances and a reasonable number of clusters.

(2) Maximum number of user-specified clusters (K_max): Keeping the other parameters fixed as specified in Section 5.2, ACGSA was run for different values of K_max (5, 10, 15, 20, and 25). Fig. 4(a) shows the effect of K_max on the number of clusters. The results reveal that the number of clusters decreases with increasing K_max for the Iris dataset. For the Wine and Breast Cancer datasets, changing K_max has no effect on the number of clusters, while the number of clusters increases with K_max for the Glass and Vowel datasets. Fig. 4(b) and (c) shows how the inter-cluster and intra-cluster distances vary over different values of K_max. For the Iris dataset,


Fig. 5. (a) Variation of number of clusters with G0 ; (b) variation of intra-cluster distance with G0 ; (c) variation of inter-cluster distance with G0 .

Fig. 6. (a) The original Cloud image; (b) labeled image obtained from ACGSA(K ¼3); (c) the original peppers image; (d) labeled image obtained from ACGSA(K¼ 7); (e) the original Science Magazine image; (f) labeled image obtained from ACGSA (K¼ 2); (g) the original Mumbai city image; (h) labeled image obtained from ACGSA(K¼ 4); (i) the original robot image; (j) labeled image obtained from ACGSA(K ¼3).

the intra-cluster distance increases and the inter-cluster distance decreases as K_max increases. The inter-cluster and intra-cluster distances are least affected by K_max for the Wine dataset. For the Glass, Breast Cancer and Vowel datasets the intra-cluster distance decreases as K_max increases, and the inter-cluster distance increases with K_max for the Glass and Breast Cancer datasets. ACGSA performs best with K_max = 15 for most of the datasets.

(3) The constant G_0: ACGSA was run for different values of G_0 (35, 50, 100, 150, and 200), keeping the other parameters fixed as specified in Section 5.2. Fig. 5(a) shows that the number of clusters decreases as G_0 increases for the Iris, Glass and Vowel datasets, while for the Wine and Breast Cancer datasets the number of clusters remains the same as G_0 changes. Fig. 5(b) and (c) illustrates the effect of G_0 on the inter-cluster and intra-cluster distances. These distances are least affected by G_0 for the Wine dataset. For the other datasets, the values of the inter-

cluster and intra-cluster distances fluctuate over the range of G_0. It was observed that G_0 = 100 yields better clustering results for most of the datasets.

6. Application to image segmentation

Image segmentation can be defined as the process of dividing an image into constituent regions such that each region has similar features, where the similarity is measured using some image property (e.g., pixel intensity). Image segmentation can thus be modeled as a clustering problem, and several clustering techniques have been successfully applied to it (Jain et al., 1999). Here each pixel corresponds to a pattern and each image region corresponds to a cluster. Mathematically, let I be the set of all image pixels. Applying segmentation to I forms different non-overlapping regions R_1, R_2, \ldots, R_n


Table 6
Mean and standard deviation over 40 runs of automatic clustering using ACGSA on five grayscale images.

Image      Optimal cluster range   ACGSA          ACDE           DCPSO          GCUK           Classical DE
Clouds     3–4                     3.23 ± 0.696   4.15 ± 0.211   4.50 ± 0.132   4.75 ± 0.432   3.00 ± 0.000
Peppers    4–8                     7.00 ± 0.000   7.05 ± 0.038   6.85 ± 0.064   3.90 ± 0.447   8.50 ± 0.067
Magazine   2–4                     2.00 ± 0.000   4.05 ± 0.772   3.25 ± 0.082   6.35 ± 0.093   3.50 ± 0.059
Mumbai     3–6                     4.61 ± 1.203   6.10 ± 0.079   4.65 ± 0.674   7.45 ± 0.043   5.25 ± 0.007
Robot      3–4                     3.00 ± 0.000   4.25 ± 0.428   2.30 ± 0.012   3.35 ± 0.982   3.00 ± 0.004

Table 7
Unpaired t-test between the best and second-best performing algorithms for each image, based on Table 6.

Image      Standard error   t         95% Confidence interval   Two-tailed P   Significance
Clouds     0.115            8.0005    −1.14893 to −0.69107      <0.0001        Extremely significant
Peppers    0.006            8.3218    −0.06196 to −0.03804      <0.0001        Extremely significant
Magazine   0.013            96.4109   −1.27581 to −1.22419      <0.0001        Extremely significant
Mumbai     0.218            0.1835    −0.47406 to 0.39406       0.8549         Not significant
Robot      0.155            2.2542    −0.65911 to −0.04089      0.0270         Significant

Fig. 7. (a) The original Lena image; (b) labeled image by ACGSA (K = 6); (c) the original Mandrill image; (d) labeled image by ACGSA (K = 6); (e) the original Jet image; (f) labeled image by ACGSA (K = 5); (g) the original Peppers image; (h) labeled image by ACGSA (K = 7).

such that

\bigcup_{i=1}^{n} R_i = I \quad \text{where} \quad R_i \cap R_j = \varphi \; (i \neq j) \qquad (15)

Every pixel of the image must belong to one and only one segmented region. The proposed automatic clustering technique has been applied to both grayscale and color images for segmentation.

Table 8
Mean and standard deviation over 20 runs of automatic clustering using ACGSA on four color images.

Image      Optimal cluster range   ACGSA          DCPSO
Lena       5–10                    5.75 ± 0.772   6.85 ± 0.477
Mandrill   5–10                    6.00 ± 0.917   6.25 ± 0.433
Jet        5–7                     4.90 ± 0.911   5.30 ± 0.459
Peppers    6–10                    7.00 ± 0.000   6.00 ± 0.000

6.1. Implementation 1: grayscale image segmentation

Five grayscale images, each of size 256 × 256, are used for experimentation. For clustering, the intensity of each pixel is taken as a feature, so each image yields 65,536 data points. The same parameter settings as in Section 5.2 are used for image segmentation, except that the user-specified maximum number of clusters, K_max, is set to 10, as is common practice.
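For grayscale segmentation, each pixel's intensity is the single clustering feature. Below is a sketch of the data preparation and relabeling steps, using Pillow for image I/O (our choice; the paper does not specify tooling) and placeholder file names, labels and centers.

```python
import numpy as np
from PIL import Image

# Each pixel intensity is a 1-D feature, so a 256 x 256 image gives
# 65,536 data points ("clouds.png" is a placeholder file name)
img = np.asarray(Image.open("clouds.png").convert("L"), dtype=float)
X = img.reshape(-1, 1)

# After clustering, per-pixel labels are mapped back to an image whose
# regions are painted with their cluster-center intensity
labels = np.zeros(X.shape[0], dtype=int)           # placeholder labels
centers = np.array([[60.0], [128.0], [200.0]])     # placeholder centers
segmented = centers[labels, 0].reshape(img.shape).astype(np.uint8)
Image.fromarray(segmented).save("clouds_labeled.png")
```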

The results are compared with ACDE, DCPSO, GCUK and classical DE. Fig. 6 shows the original images and the segmented images obtained with ACGSA. The results, in terms of the mean and standard deviation of the number of clusters over 40 runs, are tabulated in


Table 9
Unpaired t-test between the best and second-best performing algorithms for the images presented in Table 8.

Image      Standard error   t        95% Confidence interval   Two-tailed P   Significance
Lena       0.203            5.4209   −1.51079 to −0.68921      <0.0001        Extremely significant
Mandrill   0.097            2.5821   −0.44601 to −0.05399      0.0138         Significant
Jet        0.228            1.7536   −0.86177 to 0.06177       0.0876         Not significant
Peppers    –                –        –                         –              –

Table 6. For the Clouds image, DCPSO and GCUK produce about five clusters, which is not within the optimal range; ACGSA produces three clusters, and the corresponding segmented image is shown in Fig. 6(b). ACGSA produces a number of clusters within the optimal range in almost every run for the Peppers, Magazine and Robot images; the corresponding segmented images are shown in Fig. 6(d), (f) and (j) respectively. For the Mumbai image, ACGSA likewise produces a number of clusters in the optimal range, with the segmented image shown in Fig. 6(h). The results in Table 6 therefore show that ACGSA finds numbers of clusters for grayscale images within the optimal ranges. Table 7 presents unpaired t-tests between the two best algorithms based on the numbers of clusters in Table 6; the results reveal that ACGSA is statistically significantly better than the other techniques for image segmentation.

6.2. Implementation 2: color image segmentation

The proposed technique has also been applied to four well-known color images: Lena, Mandrill, Jet and Peppers, each of size 256 × 256, giving 65,536 data points per image. The RGB color scheme is used, and the experimental setup is the same as for the grayscale images. The original images and the labeled images formed from the cluster labels are shown in Fig. 7. The optimal ranges for the number of clusters for these images are specified in Celenk (1990) and Turi (2001). The results of ACGSA are shown in Table 8. For the Lena and Mandrill images ACGSA produces six clusters, and it produces five and seven clusters for the Jet and Peppers images respectively. The results are compared with DCPSO; both algorithms produce results in the optimal range. The significance of ACGSA is further supported by the unpaired t-tests in Table 9; the t-test is not feasible for the Peppers image, as there is no variation in the number of clusters. These results show the supremacy of ACGSA over DCPSO: ACGSA finds the number of clusters in the optimal range.

7. Conclusions

In this paper, a GSA-based automatic clustering technique has been proposed. The approach uses two new concepts, namely dynamic threshold setting and weighted cluster centroid computation. The experimental results clearly show that the proposed ACGSA is able to determine the correct number of clusters, and the newly proposed optimization criterion helps in finding the optimal number of clusters and the best partitions of a given dataset. The clusters produced by ACGSA are well separated and more compact, as evidenced by the inter-cluster and intra-cluster distances. The proposed algorithm has further been applied to five grayscale and four color images for segmentation. Comparing the results of the proposed technique with recently developed techniques shows that ACGSA outperforms the recently proposed automatic clustering techniques. On the basis of the results obtained,

we can conclude that the proposed algorithm is well suited to real-life datasets as well as grayscale and color images.

References

Abdule-Wahab, R.S., Monmarché, N., Slimane, M., Fahdil, M.A., Saleh, H.H., 2006. A scatter search algorithm for the automatic clustering problem. In: Industrial Conference on Data Mining, pp. 350–364.
Abraham, A., Das, S., Roy, S., 2007. Swarm intelligence algorithms for data clustering. In: Maimon, O., Rokach, L. (Eds.), Soft Computing for Knowledge Discovery and Data Mining. Springer Verlag, Germany, pp. 279–313.
Ball, G., Hall, D., 1967. A clustering technique for summarizing multivariate data. J. Behav. Sci. 12, 153–155.
Bandyopadhyay, S., Maulik, U., 2002. Genetic clustering for automatic evolution of clusters and application to image segmentation. J. Pattern Recognit. 35, 1197–1208.
Bandyopadhyay, S., Saha, S., 2008. A point symmetry based clustering technique for automatic evolution of clusters. IEEE Trans. Knowl. Data Eng. 20 (11), 1–17.
Bezdek, J., 1981. Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum, New York.
Blake, C.L., Merz, C.J., 1998. UCI Repository of Machine Learning. http://www.ics.uci.edu/~mlearn/databases/.
Cai, Z., Gong, W., 2011. A point symmetry-based clustering approach using differential evolution. J. Inf. Comput. Sci. 8 (9), 1593–1608.
Celenk, M., 1990. A color clustering technique for image segmentation. J. Comput. Vis. Graph. Image Process. 52, 145–170.
Das, S., Konar, A., 2009. Automatic image pixel clustering with an improved differential evolution. Appl. Soft Comput. 9, 226–236.
Das, S., Sil, S., 2010. Kernel-induced fuzzy clustering of image pixels with an improved differential evolution algorithm. Inf. Sci. 180, 1237–1256.
Das, S., Abraham, A., Konar, A., 2008a. Automatic kernel clustering with a multi-elitist particle swarm optimization algorithm. Pattern Recognit. Lett. 29, 688–699.
Das, S., Abraham, A., Konar, A., 2008b. Automatic clustering using an improved differential evolution algorithm. IEEE Trans. Syst. Man Cybern. Part A 38, 218–237.
Das, S., Abraham, A., Konar, A., 2009a. Metaheuristic Clustering, first ed. Springer-Verlag, Berlin, Heidelberg.
Das, S., Chowdhury, A., Abraham, A., 2009b. A bacterial evolutionary algorithm for automatic data clustering. In: IEEE Congress on Evolutionary Computation, pp. 2403–2410.
Gandomi, A.H., Alavi, A.H., 2012. Krill Herd: a new bio-inspired optimization algorithm. Commun. Nonlinear Sci. Numer. Simul. 17, 4831–4845.
Gandomi, A.H., Yang, X.-S., Alavi, A.H., 2011. Mixed variable structural optimization using firefly algorithm. Comput. Struct. 89, 2325–2336.
Gandomi, A.H., Yang, X.-S., Alavi, A.H., Talatahari, S., 2013a. Bat algorithm for constrained optimization tasks. Neural Comput. Appl. 22, 1239–1255.
Gandomi, A.H., Yang, X.-S., Alavi, A.H., 2013b. Cuckoo search algorithm: a metaheuristic approach to solve structural optimization problems. Eng. Comput. 29, 17–35.
Hatamlou, A., Abdullah, S., Nezamabadi-Pour, H., 2012. A combined approach for clustering based on K-means and gravitational search algorithms. Swarm Evol. Comput. 6, 47–52.
Hruschka, E.R., Campello, R.J.G.B., Freitas, A.A., Carvalho, A.C.P.L.F., 2009. A survey of evolutionary algorithms for clustering. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 39 (2), 133–155.
Jain, A.K., Murty, M.N., Flynn, P.J., 1999. Data clustering: a review. J. ACM Comput. Surv. 31, 264–323.
Jain, A.K., Duin, R.P.W., Mao, J., 2000. Statistical pattern recognition: a review. IEEE Trans. Pattern Anal. Mach. Intel. 22, 4–37.
Jarboui, B., Cheikh, M., Sarry, P., Rebai, A., 2007. Combinatorial particle swarm optimization (CPSO) for partitional clustering problem. Appl. Math. Comput. 192, 337–345.
Lee, C.Y., Antonsson, E.K., 2000. Dynamic partitional clustering using evolutionary strategies. In: Asia–Pacific Conference on Simulated Evolution and Learning. IEEE Press, Nagoya, Japan.
Lee, W.P., Chen, S.W., 2010. Automatic clustering with differential evolution using cluster number oscillation method. In: International Workshop on Intelligent Systems and Applications, pp. 1–4.
Masoud, H., Jalili, S., Hasheminejad, S.M.H., 2013. Dynamic clustering using combinatorial particle swarm optimization. Appl. Intell. 38, 289–314.


Niknam, T., Amiri, B., 2010. An efficient hybrid approach based on PSO, ACO and K-Means for cluster analysis. Appl. Soft Comput. 10 (1), 183–197.
Omran, M.G.H., Engelbrecht, A.P., Salman, A., 2006. Dynamic clustering using particle swarm optimization with application in image segmentation. Pattern Anal. Appl. 8, 332–344.
Pan, S.M., Cheng, K.S., 2007. Evolution-based tabu search approach to automatic clustering. IEEE Trans. Syst. Man Cybern. Part C: Appl. Rev. 37 (5), 817–838.
Quadfel, S., Batouche, M., Taleb-Ahmed, A., 2010. A modified particle swarm optimization algorithm for automatic image clustering. In: Proceedings of the IEEE International Conference on Digital Information Management, pp. 546–551.
Rashedi, E., Pour, H.N., Saryazdi, S., 2009. GSA: a gravitational search algorithm. J. Inf. Sci. 179, 2232–2248.
Rashedi, E., Pour, H.N., Saryazdi, S., 2010. BGSA: binary gravitational search algorithm. Nat. Comput. 9, 727–745.
Saha, S., Bandyopadhyay, S., 2010. A symmetry based multiobjective clustering technique for automatic evolution of clusters. Pattern Recognit. 43, 738–751.


Saha, S., Maulik, U., 2011. A new line symmetry distance based automatic clustering technique: application to image segmentation. Imag. Syst. Technol. 21 (1), 86–100.
Sarkar, S., Das, S., 2013. Multi-level image thresholding based on two dimensional histogram and maximum Tsallis entropy: a differential evolution approach. IEEE Trans. Image Process.
Sheng, W., Liu, X., Fairhurst, M., 2008. A niching memetic algorithm for simultaneous clustering and feature selection. Knowl. Data Eng. 20, 868–879.
Supratid, S., Kim, H., 2009. Modified fuzzy ants clustering approach. Appl. Intell. 31 (2), 122–134.
Turi, R.H., 2001. Clustering-based Colour Image Segmentation (Ph.D. Thesis). Monash University, Australia.
Xu, R., Wunsch II, D.C., 2009. Clustering. John Wiley and Sons, USA.
Ye, F., Chen, C., 2005. Alternative KPSO-clustering algorithm. J. Sci. Eng. 8 (2), 165–174.
