Multi-elitist immune clonal quantum clustering algorithm

Neurocomputing 101 (2013) 275–289
http://dx.doi.org/10.1016/j.neucom.2012.08.022

Shuiping Gou*, Xiong Zhuang, Yangyang Li, Cong Xu, Licheng Jiao
Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, Institute of Intelligent Information Processing, Xidian University, Xi'an 710071, PR China
* Corresponding author. E-mail address: [email protected] (S. Gou).

Article info

Article history: Received 26 July 2011; received in revised form 19 August 2012; accepted 19 August 2012; available online 23 September 2012. Communicated by T. Heskes.

Keywords: Quantum clustering; Clonal selection; Multi-elitist co-evolution; Adaptive mutation; Image segmentation

Abstract

The quantum clustering (QC) algorithm suffers from getting stuck in local extrema and from a computational bottleneck when handling large-size image segmentation. By embedding the potential evolution formula into the affinity function calculation of multi-elitist immune clonal optimization, and by updating the cluster centers based on the distance matrix, the multi-elitist immune clonal quantum clustering algorithm (ME-ICQC) is proposed in this paper. In the proposed framework, the elitist population is composed of the individuals with high affinity, which play a dominant role in the evolutionary process and help to find the global optimal solution or a near-optimal solution for most tested tasks. The diversity of the population is maintained by the evolution of the general subgroup of ME-ICQC. These different functions are implemented by dissimilar mutation strategies and crossover operators. The two subgroups exchange the information of excellent antibodies using the hypercube co-evolution operation. Compared with existing algorithms, ME-ICQC achieves improved clustering accuracy with more stable convergence, though it is not significantly better than other optimization techniques combined with QC. The experimental results also show that our algorithm performs well on multi-class, parameter-sensitive and large-size datasets.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

Clustering analysis is an important branch of unsupervised statistical pattern recognition. Without any a priori knowledge about the samples, it divides unlabeled samples into several subsets according to some criteria, so that similar samples are grouped into the same class while dissimilar samples are partitioned into different categories. Existing clustering algorithms include partition clustering, hierarchical clustering, density-based clustering, grid-based clustering, model-based clustering, as well as clustering technologies combined with fuzzy theory [1], graph theory [2] and so on. Traditional clustering algorithms, like K-means clustering, have been shown to be sensitive to initialization and noise, and the number of clusters has to be predetermined [3]. Thus, the authors in [4,5] proposed a novel quantum clustering (QC) algorithm, which describes the distribution of samples in a Hilbert space with a nonlinear Gaussian wave function. By solving the Schrödinger equation, the resulting potential function has minima that correspond to the cluster centers. The idea of QC stems from scale-space clustering [6] and support vector clustering [7], and it is essentially a type of nonparametric partition clustering. Besides, QC has the advantage of discovering


the inherent structures of data. It can be used in the fields of pattern recognition, bio-information mining, robot control and so on [8–10]. However, the classical quantum clustering algorithm has the following known drawbacks: it is quite sensitive to the selection of the scale parameter; the clustering result is prone to getting stuck in local extrema; and its slow convergence has limited its application to large-size datasets. Many improved algorithms have been proposed. Zhang et al. substituted an exponent distance measure for the Euclidean distance [11]. Nasios and Bors [12] used the K-nearest-neighbor statistical distribution to estimate the kernel parameter and obtained the final partition by combining the Hessian matrix with a region growing algorithm. Li and Wang [13] constructed a uniform framework for QC and the fuzzy C-means clustering algorithm (FCM). In 2009, Weinstein and Horn explored dynamic quantum clustering methods for the visual exploration of structures in data [14]. A parameter-estimated quantum clustering algorithm was proposed in [15].

Recent research has shown that the immune clonal selection algorithm [16], inspired by biological mechanisms, has the capability to find the global optimum. Compared with existing computational intelligence methods such as genetic algorithms, simulated annealing, and so on, the immune clonal algorithm has the following advantages: the mutation operator is applied to the memory units rather than to all individuals, which quickly yields the global optimal solution; and the diversity of the immune system, maintained through the affinity computation, helps overcome the


"prematurity" problem. Furthermore, the co-evolutionary algorithm can be used to solve high-dimension numerical optimization problems well [17,18].

Based on these findings, a quantum clustering algorithm with multi-elitist immune clonal optimization (ME-ICQC) is proposed in this paper. The proposed algorithm overcomes the parameter sensitivity issue and converges more quickly than QC by using the immune clonal selection strategy. We divide the original population into two subgroups (an elite subgroup and a general subgroup) that co-evolve in order to deal with high-dimension data. In the elite subgroup, an adaptive cloud-model based mutation operator [19,20] is designed to guarantee a quick local search. The general subgroup adopts a non-directional uniform hypermutation and all-interference recombination [21] to extend the search space. After the mutual communication of the two subgroups, the multi-elitist preservation mechanism ensures the right direction of evolution. In addition, a simple but effective affinity formula based on the quantum potential function is designed. The continuous updating of the cluster centers ensures good stability and improves the convergence speed of our algorithm. The experimental results on public clustering and image segmentation datasets show the computational precision and efficiency of our algorithm and its capability of dealing with large-scale datasets.

The remainder of this paper is organized as follows: Section 2 describes the quantum clustering algorithm. Section 3 gives the details of the proposed algorithm, including the design of the algorithm, the motivation analysis of the population partition, the geometric significance of the immune operators, and the parameter analysis. Section 4 presents the time complexity and convergence of the algorithm. Section 5 shows the experimental results on dataset clustering together with texture image and medical image segmentation. Finally, we draw conclusions.

2. Related work

2.1. Quantum clustering

Quantum physics estimates the locations of particles given the energy levels. Quantum clustering can be understood as the inverse of this problem: knowing the locations of the data samples, it calculates the states of the samples under certain constraints. The quantum clustering algorithm [4,5] uses the Parzen-window method [22] and sums Gaussians centered at all of the N data points (a set of basis functions) to estimate the probability distribution function $\psi(x)$ (for simplicity of algebraic manipulation; in quantum mechanics the probability amplitude that determines the probability distribution is $|\psi|^2$). This can be represented as

$$\psi(x) = \sum_i e^{-(x-x_i)^2/2\sigma^2} \qquad (1)$$

where the $x_i$ are the data points and $\sigma$ is the scale parameter, also called the bandwidth of the probability density estimator. $\psi(x)$ is also called the Gaussian wave function; it is taken as the ground state of the Schrödinger equation and defines a map from the nonlinear space to a Hilbert space. According to the fifth postulate of quantum mechanics, the evolution of a quantum state follows the Schrödinger equation. Requiring that $\psi(x)$ be one of its solutions, the time-independent Schrödinger equation is given by

$$H\psi \equiv \left(-\frac{\sigma^2}{2}\nabla^2 + V(x)\right)\psi = E\psi(x) \qquad (2)$$

where H is the Hamiltonian operator, E is the energy eigenvalue of H, $\nabla^2$ is the Laplacian operator, and V denotes the potential function, the minima of which determine the locations of the cluster centers. Given $\psi(x)$, we can solve Eq. (2) to get the general expression of the potential function as

$$V(x) = E + \frac{\sigma^2}{2}\frac{\nabla^2\psi}{\psi} = E - \frac{d}{2} + \frac{1}{2\sigma^2\psi}\sum_i (x-x_i)^2 e^{-(x-x_i)^2/2\sigma^2} \qquad (3)$$

Let V be non-negative, that is, $\min(V) = 0$; then E can be defined as

$$E = -\min\left(\frac{\sigma^2}{2}\frac{\nabla^2\psi}{\psi}\right) \qquad (4)$$

Since $\psi(x)$ is positive definite and normalized, it attracts the data points toward the minima. The Laplacian operator $\nabla^2$ is also positive definite and diffuses the potential surface, making the data points leave the minima. Together they balance the effects of the Laplacian's diffusion and the potential function's attraction, making up the complete distribution of particles. In the case of a single-point clustering problem, it is easy to obtain $V(x) = \frac{1}{2\sigma^2}(x-x_1)^2$. Besides, the energy eigenvalue of H is $E = d/2$ (d, which can be taken as the dimensionality of the data samples, gives the smallest possible eigenvalue of H [23]), corresponding to the resonance state in quantum mechanics. Since the ground-state potential is bounded by the single-point case, it follows that

$$0 < E \le \frac{d}{2} \qquad (5)$$

Once the potential function of the initial data samples is calculated, QC uses a gradient descent formula for iteration. Defining $y_i(0) = x_i$, with $\eta(t)$ the iteration speed and $\nabla V$ the gradient of the potential function, the cluster center updating formula toward the potential minima is

$$y_i(t+\Delta t) = y_i(t) - \eta(t)\nabla V(y_i(t)) \qquad (6)$$
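To make Eqs. (1), (3) and (6) concrete, the following minimal Python/NumPy sketch evaluates the potential and runs the gradient-descent iteration; the constant step size eta, the step count and the finite-difference width h are illustrative choices, not values taken from the paper.

```python
import numpy as np

def qc_potential(points, x, sigma=0.5):
    """V(x) - E of Eq. (3): potential at x induced by all data points."""
    d2 = np.sum((x - points) ** 2, axis=1)        # (x - x_i)^2 for every sample
    w = np.exp(-d2 / (2.0 * sigma ** 2))          # Gaussian terms of psi(x), Eq. (1)
    d = points.shape[1]                           # data dimensionality
    return -d / 2.0 + np.sum(d2 * w) / (2.0 * sigma ** 2 * np.sum(w))

def qc_descent(points, sigma=0.5, eta=0.1, steps=30, h=1e-4):
    """Move a replica y_i of every sample downhill on V, Eq. (6),
    using a central-difference estimate of grad V."""
    y = points.astype(float).copy()
    for _ in range(steps):
        for i in range(len(y)):
            grad = np.empty_like(y[i])
            for k in range(y.shape[1]):
                e = np.zeros_like(y[i]); e[k] = h
                grad[k] = (qc_potential(points, y[i] + e, sigma)
                           - qc_potential(points, y[i] - e, sigma)) / (2 * h)
            y[i] -= eta * grad                    # gradient descent step, Eq. (6)
    return y                                      # replicas settle near potential minima
```

Replicas that settle into the same minimum can then be merged, and the nearest-neighbor rule yields the partition, as described next.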

Then, the nearest-neighbor rule is used to obtain the partition of each cluster: the data points nearest in Euclidean distance to each cluster center are partitioned into the same class. Because the scale parameter, which determines the extent of diffusion on the potential surface, directly influences the final number of cluster centers, the classical quantum clustering algorithm is quite sensitive to the scale parameter. Secondly, the speed of the gradient descent method is hard to control, and the iteration is slow. Thirdly, directly calculating the gradient of the potential function makes the method inclined to fall into local extrema.

2.2. Varietal quantum clustering

In order to cluster high-dimensional data, David Horn and Inon Axel proposed a novel clustering algorithm for microarray expression data in a truncated SVD space [5]. Their quantum clustering method involves compression of dimensionality achieved by applying SVD to the gene–sample matrix in microarray problems, and it has one free scale parameter. Good clustering results were obtained on AML/ALL data. Nasios et al. explored kernel-based classification using quantum mechanics [12], following a nonparametric estimation approach. Kernel density estimation associates a function with each data sample. Their approach assumes that each data sample is associated with a quantum-physics particle that has a radial activation field around it. In their work, the location of each data sample is considered and the corresponding probability density function exploits the analogy with the quantum potential function. The kernel scale is estimated from distributions of K-nearest-neighbor statistics. This algorithm was used on


artificial data and for the topography segmentation of radar images. Subsequently, Zhang et al. replaced the Euclidean distance with an exponent distance measure, in an algorithm named EDQC [11]. It improved the iterative procedure of the QC algorithm and used an exponent distance formula to measure the distance between data points and the cluster centers. Their experimental results demonstrate that EDQC outperforms QC, and that the exponent distance formula used in the clustering process is a better choice than the Euclidean distance in data pre-processing. Lately, the dynamic quantum clustering (DQC) method has been presented in Physical Review [14]. DQC drops the probabilistic interpretation of $\psi$ and replaces it by that of a probability amplitude, as customary in quantum mechanics. DQC is set up to associate data points with cluster centers in a natural fashion. As a result, this method gains more information from the full wave function of a data point and can handle a large number of features easily. Overall, many improved QC algorithms have been proposed, but they still suffer from the bottleneck problems of QC. This paper aims to solve the scale-parameter sensitivity issue and to improve the accuracy and time complexity.

3. Proposed multi-elitist immune clonal quantum clustering (ME-ICQC) algorithm

Motivated by the above, we design a framework of quantum clustering with multi-elitist immune clonal optimization (ME-ICQC) to improve the effectiveness and efficiency of QC and to make it applicable to large-size data, such as image segmentation data. Two complementary mutation strategies give the algorithm good capability for both local and global search. The multi-elitist preservation maintains the population diversity, and the cluster center updating mechanism speeds up convergence and improves stability. In addition, the antibody repression operator and the clonal death operator improve convergence by injecting new antibodies into the antibody population. We will show that by merging these ideas from physics principles and biological mechanisms, the proposed clustering algorithm performs better than the classical QC algorithm.

3.1. Immune clonal selection

The artificial immune system (AIS) is an intelligent information processing method inspired by the biological immune system, with learning mechanisms of self-teaching, self-organization and storage

memory. Among them, one of the most vital theories is the clonal selection theory proposed by Burnet [16]. Its main ideas are shown in Fig. 1.

Fig. 1. Clonal selection theory of Burnet.

The antibody is a native product that spreads on the cell surface in the form of peptides, where antigens can bind specifically with it. When the immune system encounters antigens that have not been identified, it produces an initial response. The lymphocyte can not only proliferate and differentiate into plasma cells, but also differentiate into B-memory cells with a longer life. The information encoded in the memory cells can learn and memorize the structure of the protein; when the antigen is re-encountered, they are quickly activated and proliferate into effector cells, triggering the secondary immune response. The basic definitions of the immune clonal algorithm (ICA) are as follows:

1) Antigen: in AIS, the antigen usually represents the problem and its constraints. Particularly, in the problem of clustering, the antigen denotes the inputs of the clustering algorithm, namely, datasets with certain spatial structure and inner attributes, or images for segmentation.

2) Antibody: antibodies indicate the candidate solutions of the problem. Especially in immune-optimization-based clustering, the antibodies are the candidate cluster centers of the input data.

3) Antibody–antigen affinity: this reflects the binding strength between antigens and antibodies. In the following ME-ICQC algorithm, we design a novel and effective affinity function based on the characteristics of quantum mechanics.

The state transfer of the antibody population is

$$C_{pop}: A(t) \xrightarrow{\text{clone}} A_c(t) \xrightarrow{\text{mutation}} A_m(t) \xrightarrow{\text{selection}} A(t+1)$$
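A minimal sketch of this state-transfer chain for real-coded antibodies follows (a simplified monoclonal version with a fixed clone scale; the full operators of Section 3.2 refine each stage, and the affinity callback is whatever fitness function the problem supplies):

```python
import numpy as np

def clonal_step(pop, affinity, n_clones=3, scale=0.1, rng=np.random.default_rng()):
    """One A(t) -> A_c(t) -> A_m(t) -> A(t+1) transfer: clone each antibody,
    hypermutate the clones, then keep the fittest of {parent, clones}."""
    out = []
    for ab in pop:
        clones = ab + scale * rng.standard_normal((n_clones, ab.size))  # mutation
        cand = np.vstack([ab[None, :], clones])
        fit = np.array([affinity(c) for c in cand])
        out.append(cand[np.argmax(fit)])                                # selection
    return np.vstack(out)
```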

3.2. Multi-elitist immune quantum clustering algorithm

A clustering algorithm, such as the QC method, divides a dataset into a specified number of clusters by minimizing an objective function. Searching for the minimal objective function can be treated as an optimization problem, which can be solved by global optimization techniques such as the genetic clustering algorithm [24], the ant colony clustering algorithm [25], the particle swarm optimization (PSO) clustering algorithm [26], and so on. We propose a multi-elitist immune clonal optimization framework in this paper, and the ME-ICQC algorithm is described as follows.


3.2.1. Antibody encoding

We treat the clustering problem as a combinatorial optimization. Without loss of generality, each antibody is a sequence of real numbers; for N-dimensional space and K cluster centers, the length of the antibody is N × K, within which the first N words (genes) represent the first cluster center, the second N words represent the second cluster center, and so on. As an illustrative example, consider a dataset with 4 samples in 3-dimensional space, which can be expressed as a 4 × 3 matrix. Suppose that it can be partitioned into 2 clusters, i.e., N = 3, K = 2, so the length of an antibody is 3 × 2 = 6. Randomly select 2 sample points from the dataset as the initial cluster centers, e.g., (1,3,5) and (2,4,6); then the antibody is encoded as (1 3 5 2 4 6).

3.2.2. Population initialization

Randomly select L sample points from the given dataset as the initial antibody a1(0) ∈ A(0), encode them, and repeat M times, where M denotes the population size. According to the following motivation analysis of subgroup partition, the initial population is divided into an elite subgroup and a general subgroup: the affinity values of all antibodies are calculated and sorted, the first M1 = pM individuals are regarded as the elite subgroup, and the remainder forms the general subgroup, where p is a predetermined percentage. Here, the term "multi-elitist" has two meanings: preserving elites in the elite subgroup and the general subgroup separately, and preserving multiple elites with optimal affinities within one subgroup.

3.2.3. Affinity function design

In order to handle high dimensionality, we compute the potential function of ME-ICQC between single data points only, which uses the distance matrix to calculate V instead of the pattern matrix in Eq. (3). On most occasions only the distances between an antibody and the data points need to be calculated, which reduces the computational cost. Defining $D_{ij} = \|x_i - x_j\|$, we naturally reach

$$V_i = E - \frac{d}{2} + \frac{1}{2\sigma^2} \frac{\sum_j D_{ij}^2 \exp(-D_{ij}^2/2\sigma^2)}{\sum_j \exp(-D_{ij}^2/2\sigma^2)} \qquad (7)$$

On this basis, by minimizing the values of the potential function, the K cluster centers {Ci, i = 1, ..., K} and the partition sets of each class Ci are obtained using the rule that a data sample belongs to the nearest center in Euclidean distance. Let ni be the number of samples in Ci; by averaging all the samples in each Ci, the new cluster center is

$$C_i^{new} = \frac{1}{n_i} \sum_{x_j \in C_i} x_j, \quad i = 1, \ldots, K, \ j = 1, \ldots, n \qquad (8)$$

Let $\|\cdot\|$ be the Euclidean distance operator; then the affinity function is defined as

$$Affinity = \left(1 + \sum_{i=1}^{K}\sum_{x_j \in C_i} \|x_j - C_i\|\right)^{-1} \qquad (9)$$

Through cluster center updating, the typical samples of each class serve as the current antibodies to minimize the within-class distance, which improves efficiency greatly and maintains the stability of the iterations. As a result, the convergence speed improves so that the affinity value stabilizes at its maximum after a few iterations.
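The three formulas above translate directly into NumPy; a sketch, assuming an antibody has already been decoded into a (K, d) array of centers (empty clusters are not handled here):

```python
import numpy as np

def potential_per_point(data, sigma=0.5):
    """V_i of Eq. (7) (up to E), computed from pairwise distances only."""
    D2 = np.sum((data[:, None, :] - data[None, :, :]) ** 2, axis=2)  # D_ij^2
    W = np.exp(-D2 / (2.0 * sigma ** 2))
    return -data.shape[1] / 2.0 + (D2 * W).sum(1) / (2.0 * sigma ** 2 * W.sum(1))

def update_centers(data, centers):
    """Nearest-center partition, then the mean update of Eq. (8)."""
    dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    labels = dist.argmin(axis=1)
    new = np.vstack([data[labels == k].mean(axis=0) for k in range(len(centers))])
    return new, labels

def affinity(data, centers):
    """Affinity of Eq. (9): inverse of one plus the within-cluster distance sum."""
    dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    return 1.0 / (1.0 + dist.min(axis=1).sum())
```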

3.2.4. Immune operators

The substance of the clonal selection algorithm (CSA) is that each generation produces a group of mutated solutions close to the candidate solutions according to the affinity values; this is a stochastic antibody mapping induced by the affinity. It enlarges the search scope while preventing trapping into local minima [27]. Following the description in biology of the information-exchange diversity of monoclonal and polyclonal antibodies, we adopt the polyclonal algorithm with a mutation operator and a recombination operator, which increases the diversity of the population.

a) Clone: given the total clone scale nc, the clone operation is defined as

$$A^c(t) = T_c(A(t)) = [T_c(a_1(t)),\ T_c(a_2(t)),\ \ldots,\ T_c(a_M(t))]^T$$

For the elite subgroup, the clone scale of a single antibody is adaptively adjusted based on the affinity values f(ai):

$$q_i = \mathrm{Int}\left(n_c \cdot \frac{f(a_i)}{\sum_{j=1}^{M} f(a_j)}\right), \quad i = 1, 2, \ldots, M \qquad (10)$$

where Int(·) denotes taking the integer value; i.e., the smaller the repression between antibodies and the larger the antigen stimulation, the larger the clone scale. For the other subgroup, the excellent antibodies are reproduced with a fixed scale Nc.

b) Mutation: immunology considers that affinity maturation and the maintenance of antibody diversity rely mainly on high-frequency mutation. Thus, differing from general evolutionary algorithms, in which crossover is the main operator, here mutation is the main operator and crossover the assistant. Two complementary mutation strategies are used in ME-ICQC.

For the general subgroup, random hypermutation is adopted with mutation probability 1, following a uniform distribution: we randomly generate δ in the range (0,1) and adjust the corresponding gene l to l + 2δ − 1. The uniform hypermutation is non-directional and can move a mutant far away from the original antibody, thus maintaining the population diversity.

For the elite subgroup, an adaptive cloud-model based mutation operator is adopted. The adaptability is reflected in the calculation of the mutation probability Pm. As shown in Eq. (11), η is a constant associated with precision, t is the iteration number, and f, fmax and fave are the affinity value, the maximal affinity value and the average affinity value, respectively. Pmin and Pmax are the lower and upper bounds of the mutation probability; a and b are weight coefficients that decide the importance of P1m and P2m within Pm, with a + b = 1. Generally we set a = b = 0.5.

$$P_m^1 = \exp(-\eta t)$$

$$P_m^2 = \begin{cases} P_{min} + \dfrac{P_{max} - P_{min}}{1 + \exp(-(f - f_{ave})/(f_{max} - f_{ave}))} & f \ge f_{ave} \\[2mm] P_{max} & f < f_{ave} \end{cases}$$

$$P_m = a P_m^1 + b P_m^2 \qquad (11)$$

This decreasing mutation probability takes into account both the evolutionary generation and the antibody's affinity. In the early stage of evolution, all antibodies take a relatively large mutation probability, which maintains the population diversity. The mutation probability then decreases gradually under the adjustment of the evolutionary generation and the antibodies' affinities, which lets the algorithm search deeply around local extrema and approach the optimal solutions with high accuracy while simultaneously protecting the elites. Furthermore, as shown in Fig. 2, following a sigmoid curve, the mutation probabilities are nonlinearly adjusted based on the average affinity fave and


maximal affinity fmax of the antibodies. When most of the antibodies in the population have similar affinities and the average affinity is close to the maximal affinity, the increase of the mutation probability is larger than under a linear adjustment. At the same time, the mutation probabilities near the antibodies with maximal affinities are weighted down, so that the elite individuals are preserved as much as possible.

Fig. 2. Adaptive mutation probability.

The cloud model [19,20] is an uncertain transition model between a qualitative concept and its quantitative representation using linguistic knowledge, which reflects the fuzziness and randomness of concepts in the objective world and in human cognition. Among cloud models, the normal cloud model has a random yet stable orientation; it is described by the expected value Ex, the entropy En and the hyper-entropy He, following the normal random distribution. When applied to the mutation operation, its stable orientation protects the elites better and makes the global positioning more sound. The normal cloud-model based adaptive mutation operation is described as follows. Compute the initial certainty according to the affinity values:

$$\mu = P_{max} - (P_{max} - P_{min})(f_{max} - f)/(f_{max} + f_{min})$$

where fmin is the minimal affinity at each iteration. Let Ex be the antibody before mutation, En = 0.1s (s is the standard deviation of each dimension of the data variable) and He = 0.1En. Generate a number randomly from the normal distribution with expected value En and variance He. Randomly generate a number r in (0,1); for r < Pm, the antibody after mutation is

$$A_m'(t) = Ex \pm En\sqrt{-2\ln(\mu)}$$

c) Recombination: for the elite subgroup, we do not use the recombination operation, so as to protect the elitists. For the general subgroup, we introduce all-interference recombination [21] based on quantum interference. It avoids the shortcoming that conventional recombination does not work when two antibodies are identical, and it allows more antibody information to be involved in the recombination. Additionally, for the antibodies that fall beyond the effective range of the variables after mutation and recombination, the antibody repression operation is introduced to eliminate invalid genes, which are replaced with random numbers within the range of the variable values.

d) Selection: this operation selects excellent antibodies from each subgroup after cloning to form the new population. The best antibody in each subgroup,

$$b_i(t) = \{a_{ij}(t) \mid f(a_{ij}(t)) = \max f(A_m(t)),\ j = 1, 2, \ldots, q_i\},$$

replaces $A_i(t) \in A(t)$ with probability Ps:

$$P_s = \begin{cases} 1 & f(A_i(t)) < f(b_i(t)) \\[1mm] \exp\left(-\dfrac{f(A_i(t)) - f(b_i(t))}{g}\right) & f(A_i(t)) \ge f(b_i(t)),\ A_i(t)\ \text{is not the current best} \\[1mm] 0 & f(A_i(t)) \ge f(b_i(t)),\ A_i(t)\ \text{is the current best} \end{cases} \qquad (12)$$

where g > 0 is a parameter related to the antibody population diversity; the larger it is, the better the diversity.

e) Co-evolution: after the clonal selection operation, the current best antibodies cbest1 and cbest2 (corresponding to the maximal affinities of the elite subgroup (fitb1) and the general subgroup (fitb2)) are compared to obtain the global optimal antibody gbest, which is taken as the cluster centers to output the partition according to the nearest-neighbor principle. Meanwhile, a hypercube co-evolution operation is conducted between the elite subgroup and the general subgroup. For fitb1 ≥ fitb2, recombination is performed between the antibodies of the elite subgroup and the M1 antibodies with minimal affinities in the general subgroup; for fitb1 < fitb2, recombination is performed between the M1 antibodies with maximal affinities in the general subgroup and the antibodies of the elite subgroup. The recombination here works as follows: for two parent genes xk and yk, let $l_{min} = \min(x_k, y_k)$, $l_{max} = \max(x_k, y_k)$ and $\delta = l_{max} - l_{min}$; then the hypercube co-evolution operator is defined as

$$x_{k+1} = \begin{cases} \mathrm{unifrnd}(l_{min} - \omega\delta,\ l_{max} + \omega\delta) & r < P_c \\ x_k & \text{else} \end{cases}$$

$$y_{k+1} = \begin{cases} \mathrm{unifrnd}(l_{min} - \omega\delta,\ l_{max} + \omega\delta) & r < P_c \\ y_k & \text{else} \end{cases} \qquad (13)$$

where r is a random number, Pc is the recombination probability, unifrnd(·) takes a random number uniformly in the given range, and ω = 0.2 is the spatial stretch factor of the hypercube.

f) Clonal death: when the change of the affinity values stays below a threshold for four successive generations, the clonal death operation is employed, which injects several fresh antibodies to renew the population. This perturbation can enhance the evolutionary process and escape local extrema.
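A sketch of the normal cloud mutation for one antibody follows. Placing the cloud drop at Ex ± En'·sqrt(−2 ln μ) with En' drawn from N(En, He) is our reading of the standard cloud generator (the text writes En in the drop formula but also generates a sample around En); p_max and p_min are illustrative bounds.

```python
import numpy as np

def cloud_mutate(antibody, f, f_max, f_min, data_std,
                 p_max=0.9, p_min=0.1, rng=np.random.default_rng()):
    """Normal cloud-model mutation of Section 3.2.4(b).

    antibody: 1-D gene array (the expected value Ex); f: its affinity;
    data_std: scalar (or per-gene) standard deviation of the data."""
    mu = p_max - (p_max - p_min) * (f_max - f) / (f_max + f_min)  # initial certainty
    en = 0.1 * data_std                                           # entropy En
    he = 0.1 * en                                                 # hyper-entropy He
    en_p = rng.normal(en, he)                                     # En' ~ N(En, He)
    sign = rng.choice([-1.0, 1.0], size=antibody.shape)
    return antibody + sign * en_p * np.sqrt(-2.0 * np.log(mu))    # cloud drop
```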

3.2.5. The partition of the multi-elitist subgroups

Suppose that the size of the antibody population is M, the elite subgroup size is M1, the general subgroup size is M2, and M = M1 + M2. Let x* be the global optimal solution of the proposed algorithm and ε the convergence precision. The normal cloud mutation meets the


distribution $N(0, \sigma^2)$ with probability density $f_G(x)$. The random mutation of the general subgroup meets the uniform distribution with probability density $f_U(x) = 1/2$, $x \in (-1, 1)$. After mutation, the solution falls into the interval $[x^* - \varepsilon, x^* + \varepsilon]$ with probability

$$P = \left(M_1\int_{x^*-\varepsilon}^{x^*+\varepsilon} f_G(x)\,dx + M_2\int_{x^*-\varepsilon}^{x^*+\varepsilon} f_U(x)\,dx\right)\Big/ M \qquad (14)$$

a) For $[x^*-\varepsilon, x^*+\varepsilon] \cap [-3\sigma, 3\sigma] = \emptyset$, we have

$$\int_{3\sigma}^{+\infty} f_G(x)\,dx + \int_{-\infty}^{-3\sigma} f_G(x)\,dx \approx 0$$

so Eq. (14) can be written as

$$P_1 = \frac{M_2}{M}\int_{x^*-\varepsilon}^{x^*+\varepsilon} f_U(x)\,dx = \frac{M_2}{M}\,\varepsilon \qquad (15)$$

b) For $[x^*-\varepsilon, x^*+\varepsilon] \cap [-3\sigma, 3\sigma] \ne \emptyset$, Eq. (14) turns into

$$P_2 = \frac{M_1}{M}\int_{x^*-\varepsilon}^{x^*+\varepsilon} \frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{x^2}{2\sigma^2}\right)dx + \frac{M_2}{M}\,\varepsilon$$

When $\sigma$ is small enough,

$$P_2 \approx \frac{M_1}{M}\int_{x^*-\varepsilon}^{x^*+\varepsilon} \frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{x^2}{2\sigma^2}\right)dx \qquad (16)$$
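Before drawing conclusions from these expressions, a quick numeric check of Eqs. (14) and (15) may help; the interval, σ and subgroup sizes below are arbitrary illustration values.

```python
from math import erf, sqrt

def hit_probability(m1, m2, eps, sigma, x_star):
    """P of Eq. (14): probability that a mutated gene lands in [x*-eps, x*+eps];
    Gaussian part from N(0, sigma^2), uniform part f_U = 1/2 on (-1, 1)."""
    cdf = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))
    p_gauss = cdf((x_star + eps) / sigma) - cdf((x_star - eps) / sigma)
    return (m1 * p_gauss + m2 * eps) / (m1 + m2)

# x* far outside [-3 sigma, 3 sigma]: P collapses to (M2/M) * eps, Eq. (15)
print(hit_probability(m1=10, m2=30, eps=0.05, sigma=0.1, x_star=1.0))  # ~0.0375
```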

It can be seen from Eq. (15) that, for a given antibody population and convergence precision, M2 should be the larger subgroup in order to reach a better solution after mutation. Since M = M1 + M2, a large M2 implies a small M1, and under the condition of Eq. (16), P then decreases [28]. Consequently, we draw the conclusion that the partition of the multi-elitist subgroups affects the performance of the proposed algorithm. Generally speaking, the elite subgroup is smaller than the general subgroup. After extensive verification, we take the applicable value M1/M2 = 1/3 in the following experiments. The first generation of the general subgroup could consist of the M − M1 antibodies of the initial population remaining after elitist antibody selection; for better population diversity, however, the first generation of the general subgroup is renewed randomly here. The main procedure of ME-ICQC is summarized in Fig. 3.

Fig. 3. ME-ICQC algorithm.

3.3. Geometric significance of operators and parameters analysis

3.3.1. Normal cloud model based adaptive mutation

In the normal cloud model based adaptive mutation operator, the expected value Ex is the typical sample of the discourse domain, the most representative of the concept, corresponding to the center of gravity of the cloud. The entropy En is the synthetic measure of the randomness and fuzziness of the qualitative concept. The hyper-entropy He measures the uncertainty of the entropy and can be considered the entropy of En. As shown in Fig. 4, for the 1-dimensional normal cloud model C(0, En, He) with expected value 0, the larger En is, the greater the coverage of the cloud drops; the larger He is, the more scattered the cloud drops. Evidently, the expected value reflects the stability of the mutation, the entropy reflects its span, and the hyper-entropy reflects its accuracy. The larger the certainty μ, the closer the cloud drops are to the top, the narrower the search scope of the

variable, and the finer the mutation. As shown above, since

$\mu = P_{max} - (P_{max} - P_{min})(f_{max} - f)/(f_{max} + f_{min})$, the affinity-based, linearly adjusted formula for the initial certainty narrows the search scope of the antibodies with large affinities, which benefits the preservation of excellent antibodies.

3.3.2. All-interference recombination

The general recombination operator is confined to two antibodies. In the worst case, when all antibodies are identical, it loses its effectiveness, but all-interference recombination is still able to generate new antibodies to impel the evolution. For easy understanding, let the antibody number be 5 and the antibody length be 8; the new recombination operator is described in Table 1. Practically, we randomly generate a recombination point within the range of the antibody and select the corresponding genes continuously until a new antibody is obtained. The operator stems from quantum theory, helps prevent prematurity, and overcomes the locality and one-sidedness of classical recombination.

3.3.3. Hypercube co-evolution

The hypercube co-evolution operator in Eq. (13) is used to achieve co-evolution between the elite subgroup and the general subgroup, and the elitists lead the general antibodies toward the optimal solution. As shown in Fig. 5(a), for the 1-dimensional case, the search space of hypercube co-evolution extends line segment AB into line segment CD; as shown in Fig. 5(b), for the 2-dimensional case, the search space extends the plane; as shown in Fig. 5(c), for the 3-dimensional case, the search space consists of the inner cuboid and the outer extended space. For more than three dimensions, the search space extends the hypercube in the corresponding dimensionality. Thus, the operator can effectively increase the population diversity.

3.3.4. Parameters analysis

ME-ICQC uses immune optimization iteration to avoid computing the n × n distance matrix (n is the size of the data), which allows solving large-scale problems. However, the characteristics of quantum clustering are reflected only through the affinity function and have not been integrated into the entire implementation. Therefore, in contrast to the quantum ideas, the immune clonal part impacts the algorithm performance greatly. Next, we focus on the key parameters in the immune clonal optimization.

The immune clonal optimization in ME-ICQC mainly includes the clone operation, the immune genic operation, the clonal selection operation and the hypercube co-evolution operation. The immune genic operation includes the adaptive cloud mutation, the uniform hypermutation and the all-interference recombination of the general subgroup. As the antibody repression and clonal death make only fine adjustments of precision, they affect the algorithm less. Thus, we are concerned with the following parameters: the iteration number Nmax, the scale parameter σ, the population scale M, the proportion of the elite subgroup p, the clone scale Nc, the precision parameter η, the diversity control parameter g, and the probability of hypercube co-evolution Pc. Among them, the value of p has been discussed in the previous section. Once p is chosen in an appropriate range, η and g hardly affect the algorithm's performance, hence we use empirical values. The work in [29] proved that, for function optimization, the influence of nc is small within [20, 200]. The clone scale together with the population scale determines the size of the candidate solution set in the search space, and in theory the bigger the better. Actually, once their values become too large, the oversized search space causes high computational complexity. It is experimentally observed that the optimal solution


usually exists near local extrema, so a too small search space could cause repeated searching. In addition, ME-ICQC embeds the minimization of the potential function into the evolutionary iteration of the random search, which dilutes the dependence on parameter selection, so the scale parameter can take values in a wide range; σ = 0.5 is simply used. For the recombination probability Pc of hypercube co-evolution, a larger value corresponds to a wider random search space and expedites the global convergence, while a smaller value retains more of the parent antibodies. Taking into account both population diversity and the maintenance of excellence, we choose Pc = 0.5.
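For completeness, a gene-by-gene sketch of the hypercube operator of Eq. (13) with the chosen Pc = 0.5 and ω = 0.2:

```python
import random

def hypercube_recombine(x, y, pc=0.5, omega=0.2):
    """Hypercube co-evolution, Eq. (13): each gene pair spans an interval
    [l_min, l_max] that is stretched by omega on both sides before sampling."""
    cx, cy = list(x), list(y)
    for k, (xk, yk) in enumerate(zip(x, y)):
        lmin, lmax = min(xk, yk), max(xk, yk)
        d = lmax - lmin
        lo, hi = lmin - omega * d, lmax + omega * d
        if random.random() < pc:
            cx[k] = random.uniform(lo, hi)
        if random.random() < pc:
            cy[k] = random.uniform(lo, hi)
    return cx, cy
```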

4. Complexity and convergence analysis 4.1. Computational complexity Assume that the size of the data samples is S, dimensionality of data samples is N, the cluster number is K, the number of

iterations is T; then the complexity of QC is O(S² × N × K × T). The population size is M, the sizes of the elite subgroup and the general subgroup are M1 and M2 (M1 < M2), respectively, and the overall size after cloning is Sc. The time complexity of each iteration of ME-ICQC includes the clone operation O(Sc × N × K), the immune genic operation O(Sc × N × K), the clonal selection operation O(M1 × N² × K × S) + O(M2 × N² × K × S), and the hypercube co-evolution operation O(M1 × N × K). Hence, in the worst case, the overall time complexity is 2O(Sc × N × K) + 2O(M1 × N² × K × S) + O(M2 × N² × K × S). Since the overall scale after cloning exceeds the population scale, we make a reduction based on the rules of time-complexity arithmetic: the complexity of ME-ICQC is O(Sc × N² × K × S). The complexity of K-means is O(N × K × T × S), and that of ICQC is O(Sc × N² × K × S). QC combined with the genetic algorithm (GA) and with particle swarm optimization (PSO) is compared with our method: the complexity of GA–QC is O(M × N² × K × S) and that of PSO–QC is O(M′ × N² × K × S), where M′ is the particle number.


Fig. 4. Numeric features of cloud mutation: 1-D normal clouds C(0, 0.5, 0.1), C(0, 0.5, 0.2), C(0, 1, 0.1) and C(0, 1, 0.2).

Table 1. All-interference recombination operator.

No. | Antibody (before recombination)
1 | A(1) A(2) A(3) A(4) A(5) A(6) A(7) A(8)
2 | B(1) B(2) B(3) B(4) B(5) B(6) B(7) B(8)
3 | C(1) C(2) C(3) C(4) C(5) C(6) C(7) C(8)
4 | D(1) D(2) D(3) D(4) D(5) D(6) D(7) D(8)
5 | E(1) E(2) E(3) E(4) E(5) E(6) E(7) E(8)

No. | Antibody (after recombination)
1 | A(1) E(2) D(3) C(4) B(5) A(6) E(7) D(8)
2 | B(1) A(2) E(3) D(4) C(5) B(6) A(7) E(8)
3 | C(1) B(2) A(3) E(4) D(5) C(6) B(7) A(8)
4 | D(1) C(2) B(3) A(4) E(5) D(6) C(7) B(8)
5 | E(1) D(2) C(3) B(4) A(5) E(6) D(7) C(8)

Fig. 5. Hypercube co-evolution operator. (a) 1-D recombination, (b) 2-D recombination and (c) 3-D recombination.

4.2. Convergence analysis

Theorem 1. Let the set I be the antibody space, t the number of iterative generations, and the antibody population A = (a1, a2, ..., aM) with scale M an element of the antibody population space $I^M = \{A \mid A = (a_1, a_2, \ldots, a_M),\ a_i \in I,\ 1 \le i \le M\}$. Then the antibody population series {At, t ≥ 0} of ME-ICQC is a finite homogeneous Markov chain.

Proof. ME-ICQC uses cluster center coding and adopts random mutation based on the initial cluster centers; thus, in theory, the state space of the multi-elitist antibody population is infinite. Actually, for antibodies of K cluster centers and N dimensions, A is finite in precision, and the state space of the population is $d^{KNM}$. Therefore, the antibody population series is finite. Note that the clone operation, the immune genic operation and the clonal selection operation ensure that the child population At+1 is related only to the parent population At; i.e., {At, t ≥ 0} is a finite homogeneous Markov chain [29].

Assume that fit is the affinity function of A; then the global optimum of the clustering problem is

$$B^* = \{b \mid b \in I^M,\ fit(b) = fit^* = \max(fit(b')),\ b' \in I^M\}$$

Definition 1. For any initial antibody population $A_0 \in I^M$, if

$$\lim_{t \to \infty} P\{A(t) \in B^* \mid A(0) = A_0\} = 1$$

holds, then we say that the algorithm converges to the optimal antibody population with probability 1. In other words, once the number of iterations is large enough, the probability that the antibody population contains global optimal antibodies is close to 1.

Theorem 2. The quantum clustering algorithm with multi-elitist immune clonal optimization converges with probability 1.

Proof. According to Theorem 1, the state transfer of our algorithm is described as a Markov chain. Let $P\{A(t) \in B^* \mid A(0) = A_0\}$ be $p_t$ and $P\{A(t) = i \mid A(0) = A_0\}$ be $p_i(t)$; then $p_t = \sum_{i \in B^*} p_i(t)$. The state transfer probability of the random process {At, t ≥ 0} is

$$p_{ij}(t) = P\{A(t) = j \mid A(0) = i\} \qquad (17)$$

From the property of transfer probabilities, $\sum_{j \in B^*} p_{ij}(t) + \sum_{j \notin B^*} p_{ij}(t) = 1$, so

$$p_t = \sum_{i \in B^*} p_i(t) = \sum_{i \in B^*} p_i(t)\left(\sum_{j \in B^*} p_{ij}(t) + \sum_{j \notin B^*} p_{ij}(t)\right) = \sum_{i \in B^*}\sum_{j \in B^*} p_i(t)p_{ij}(t) + \sum_{i \in B^*}\sum_{j \notin B^*} p_i(t)p_{ij}(t) \qquad (18)$$

Due to the adoption of the multi-elitist preservation strategy, the quantum potential function Vi in the affinity function always decreases along the evolution; namely, the particles' tracks in the space head toward the potential wells corresponding to the cluster centers. This ensures $p_{ij}(t) = 0$ for $i \in B^*$, $j \notin B^*$; i.e., once the optima appear in the parents, no matter how many generations evolve, the optima will not degrade. That is to say,

$$\sum_{i \in B^*}\sum_{j \notin B^*} p_i(t)p_{ij}(t) = 0 \qquad (19)$$

From the property of Markov chains,

$$p_{t+1} = \sum_i \sum_{j \in B^*} p_i(t)p_{ij}(t) = \sum_{i \in B^*}\sum_{j \in B^*} p_i(t)p_{ij}(t) + \sum_{i \notin B^*}\sum_{j \in B^*} p_i(t)p_{ij}(t) \qquad (20)$$

By combining Eqs. (18)–(20), we obtain

$$p_{t+1} = p_t + \sum_{i \notin B^*}\sum_{j \in B^*} p_i(t)p_{ij}(t) \ge p_t \qquad (21)$$

Since $1 \ge p_{t+1} \ge p_t \ge p_{t-1} \ge \ldots$,

$$\lim_{t \to \infty} p_t = 1 \qquad (22)$$

According to Definition 1, ME-ICQC converges with probability 1.

5. Experimental results

Our experimental environment is: MATLAB 7.8 (R2009a), Intel(R) Pentium(R) 4 CPU 3.2 GHz, Windows XP Professional. We use C-MEX mixed programming for the subprogram of potential function computation. The parameter settings of ME-ICQC are as follows: the maximal iteration number Nmax = 20, the scale parameter σ = 0.5, the population scale M = 40, the proportion of the elite subgroup p = 0.25, the clone scale of a single antibody Nc = 3, the overall clone scale nc = 30, the precision parameter η = 0.05, the diversity control parameter g = 0.3, and the probability of hypercube co-evolution Pc = 0.5. In GA–QC, the crossover probability is set to 0.8 and the mutation probability to 0.1. The parameter settings of PSO follow [18], and the base r of NMF is set to the number of categories of the dataset.

5.1. Numerical datasets clustering

5.1.1. Test on sigma parameter sensitivity

In order to test the robustness of ME-ICQC on dataset clustering, we choose 7 datasets from the UCI standard database [30]. Table 2 gives the information of the datasets used in the experiments. As the QC algorithm always requires pre-processing of the input data, the whitened PCA and SVD methods are used in this experiment [26]. QC combined with the standard genetic algorithm (GA) and with particle swarm optimization (PSO) is compared with our method. Because evolutionary computation is a random search, the clustering results may be influenced by the initialization; for this reason, we compare the evaluation indexes of these algorithms using the average and maximum of 20 runs. To measure clustering inconsistency, the Jaccard index is used to evaluate the efficiency of a clustering algorithm in [4]; it is defined as

$$Jaccard = \frac{n_{11}}{n_{11} + n_{01} + n_{10}} \qquad (23)$$

where n11 denotes the number of sample pairs that match in both the real labels and the clustering result, and n01 and n10 denote the pairs matching in only one of the two. The other index is the clustering accuracy, defined as

$$Rate = \frac{1}{N}\sum_{i=1}^{K}\sum_{j=1, i \ne j}^{K} Correct(i,j) \qquad (24)$$

where Correct(i,j) represents the number of instances matching the ideal clustering result, N is the total number of instances, and K is the class number. For a completely correct clustering, both the Jaccard index and the Rate index equal 1; hence they can be used to evaluate the clustering results, as shown in Table 3.

Table 2. Datasets used in the experiments.

Datasets | Number of samples | Number of dimensions | Class
Iris  | 150 | 4  | 3
Wine  | 178 | 14 | 3
Heart | 303 | 13 | 2
Sonar | 208 | 60 | 2
Wpbc  | 198 | 33 | 2
Vote  | 435 | 16 | 2
Liver | 345 | 6  | 2

Table 3 presents the evaluation indexes of the different clustering methods, where the PSO–QC, GA–QC and ME-ICQC methods use the same sigma (0.5). The results of the four methods are reported as the average correct rate and standard deviation over 20 runs, along with the Jaccard index.
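The two indexes can be sketched as follows. The pairwise counting for Eq. (23) is standard; for Eq. (24) we read Correct(i,j) as the per-class hit count under the best one-to-one matching of clusters to classes, which is our interpretation, implemented by brute force and therefore suitable only for small K.

```python
from itertools import combinations, permutations

def jaccard_index(y_true, y_pred):
    """Jaccard index of Eq. (23) over all sample pairs."""
    n11 = n01 = n10 = 0
    for i, j in combinations(range(len(y_true)), 2):
        st, sp = y_true[i] == y_true[j], y_pred[i] == y_pred[j]
        n11 += st and sp          # pair matched in both labelings
        n01 += st and not sp      # matched only in the true labels
        n10 += sp and not st      # matched only in the clustering
    return n11 / (n11 + n01 + n10)

def accuracy_rate(y_true, y_pred, K):
    """Rate of Eq. (24): best fraction of correctly assigned samples
    over all one-to-one cluster-to-class matchings (brute force)."""
    best = 0
    for perm in permutations(range(K)):
        hits = sum(perm[p] == t for p, t in zip(y_pred, y_true))
        best = max(best, hits)
    return best / len(y_true)
```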

Table 3. The clustering results of different clustering methods.

Datasets | QC: Jaccard / Avg rate / Sigma | PSO+QC: Jaccard / Avg rate / Std | GA+QC: Jaccard / Avg rate / Std | ME-ICQC: Jaccard / Avg rate / Std
Iris  | 0.9007 / 0.9733 / 0.30 | 0.6674 / 0.7618 / 0.2021 | 0.9239 / 0.9733 / 0.0000 | 0.9239 / 0.9800 / 0.0000
Wine  | 0.7984 / 0.9438 / 0.88 | 0.4962 / 0.7079 / 0.1431 | 0.7869 / 0.9213 / 0.0036 | 0.8013 / 0.9449 / 0.0078
Heart | 0.4425 / 0.7096 / 0.66 | 0.3959 / 0.6233 / 0.0777 | 0.4115 / 0.6964 / 0.0036 | 0.5166 / 0.7965 / 0.0045
Sonar | 0.3434 / 0.5721 / 0.60 | 0.3680 / 0.5279 / 0.0659 | 0.3537 / 0.6288 / 0.0153 | 0.3486 / 0.5988 / 0.0136
Wpbc  | 0.4350 / 0.6162 / 0.60 | 0.4565 / 0.6165 / 0.0886 | 0.3916 / 0.5468 / 0.0147 | 0.3905 / 0.5391 / 0.0304
Vote  | 0.6267 / 0.8621 / 0.60 | 0.4718 / 0.7249 / 0.0928 | 0.6005 / 0.8259 / 0.0086 | 0.6126 / 0.8480 / 0.0158
Liver | 0.4109 / 0.5014 / 0.30 | 0.3904 / 0.5320 / 0.0474 | 0.3774 / 0.5167 / 0.0000 | 0.3771 / 0.5362 / 0.0041

Fig. 6. Correct rates of 20 independent experiments (ME-ICQC, KM, GAC). (a) Wine and (b) Heart.

It is shown that the classical QC algorithm is quite sensitive to the scale parameter; the sigma values of QC listed in Table 3 are the optimal ones found after repeated experiments. Based on this observation, the other three methods apply an evolution strategy to ease the parameter-setting problem. However, the PSO+QC algorithm suffers greatly from its parameter setting, and its clustering result is worse than those of GA–QC and ME-ICQC. Moreover, compared to GA–QC, ME-ICQC has a higher Jaccard index and average accuracy for most test data. Additionally, Fig. 6 shows the trend of the clustering correct rates on two original datasets over 20 runs for three algorithms: ME-ICQC, K-means clustering (KM) and the genetic-algorithm-based clustering algorithm (GAC) [24]. Obviously, ME-ICQC performs quite well in stability compared with traditional genetic clustering and K-means clustering, which comes from the addition of the cluster center updating mechanism and the effective immune operators.

5.1.2. Test performance on high-dimension datasets

In order to test the efficiency of ME-ICQC on high-dimensional dataset clustering, we choose 6 bio-information datasets described in Table 4. The comparison of our algorithm with other common clustering algorithms is shown in Tables 5 and 6. The QC algorithm can be applied to high-dimension datasets, but its results are not satisfactory [5]. We compare the clustering results on the original datasets and on pre-processed data in Tables 5 and 6. It is worth pointing out that ME-ICQC handles high-dimensional UCI datasets better, but its performance is not always the best. Comparing with quantum clustering combined with the immune clonal algorithm (IC-QC) and with KM (a robust, traditional clustering method), we can see that the classical QC algorithm fails to classify most original UCI datasets. Hence, the whitened PCA and SVD data pre-processing methods [31] are

Table 4. Datasets used in the experiments.

Datasets | Number of samples | Number of dimensions | Class
Colon   | 62  | 2000  | 2
Ovarian | 54  | 1536  | 2
Lung    | 203 | 12600 | 5
nci     | 61  | 5244  | 8
Gloub   | 72  | 7129  | 4
Spell   | 798 | 72    | 5

used in this experiment. Some bio-information datasets are then supplemented, and their clustering results are given in Table 6. Further, as compared algorithms, the results of non-negative matrix factorization (NMF) [32] and KM are also given in Table 6. From Table 6 we can see that the clustering results after data pre-processing show a significant improvement over the results obtained by clustering the original data. Compared with the QC and KM algorithms, NMF has better partition performance on both the UCI datasets and the high-dimensional bio-information datasets. Besides, ME-ICQC has a higher average clustering accuracy for most test tasks. However, ME-ICQC has a higher computation cost and requires more parameters than KM and NMF.

5.1.3. Test of multi-class clustering performance on artificial data

We generate artificial datasets to validate the multi-class clustering performance of the proposed algorithm. For a fair comparison, the datasets AD_10_2 [33], AD_15_2 and AD_20_2 are used in this experiment. These datasets contain from 10 to 20 categories, and each class contains 50 two-dimensional samples. Most of the clusters have a spherical distribution, as shown in Fig. 7, where AD_15_2 and AD_20_2 are extensions of AD_10_2 obtained by copying and shifting. The clustering results of 20 runs are recorded in Table 7.
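The AD datasets themselves come from [33]; the following hypothetical generator merely reproduces their described shape (spherical 2-D clusters of 50 samples each) as a stand-in, with the center range and spread chosen arbitrarily:

```python
import numpy as np

def make_ad_like(n_classes=10, per_class=50, spread=0.5, seed=0):
    """Stand-in for AD_10_2-style data: spherical 2-D Gaussian blobs."""
    rng = np.random.default_rng(seed)
    centers = rng.uniform(-20, 20, size=(n_classes, 2))
    X = np.vstack([c + spread * rng.standard_normal((per_class, 2))
                   for c in centers])
    y = np.repeat(np.arange(n_classes), per_class)
    return X, y
```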


Table 5. The clustering results of different clustering methods without dimension reduction.

Datasets | ME-ICQC: Max rate / Avg rate / Jaccard | QC: Max rate / Avg rate / Jaccard | IC-QC: Max rate / Avg rate / Jaccard | KM: Max rate / Avg rate / Jaccard
Iris  | 0.9267 / 0.9007 / 0.7096 | 0.8754 / 0.8467 / 0.6631 | 0.9000 / 0.8858 / 0.6872 | 0.8933 / 0.8173 / 0.6526
Wine  | 0.7135 / 0.7079 / 0.4123 | 0.2673 / 0.2673 / 0.0853 | 0.6573 / 0.6446 / 0.4637 | 0.7022 / 0.6652 / 0.4118
Heart | 0.7261 / 0.7167 / 0.4447 | – / – / – | 0.7162 / 0.7162 / 0.4444 | 0.7162 / 0.6421 / 0.3964
Sonar | 0.5673 / 0.5472 / 0.3548 | – / – / – | 0.4952 / 0.4612 / 0.4674 | 0.5529 / 0.5529 / 0.3358
Wpbc  | 0.7020 / 0.5980 / 0.4122 | – / – / – | 0.7576 / 0.7522 / 0.6235 | 0.6010 / 0.6010 / 0.4048
Vote  | 0.8828 / 0.8687 / 0.6358 | – / – / – | 0.3862 / 0.3848 / 0.5211 | 0.8667 / 0.8667 / 0.6312
Liver | 0.5014 / 0.4943 / 0.4116 | – / – / – | 0.4348 / 0.4282 / 0.5024 | 0.4464 / 0.4464 / 0.4538

Table 6. The clustering results of different clustering methods with dimension reduction.

Datasets | ME-ICQC: Max rate / Avg rate / Std | QC: Avg rate / Jaccard | KM: Max rate / Avg rate / Std | NMF: Max rate / Avg rate / Std
Iris    | 0.9800 / 0.9800 / 3.45E-16 | 0.9733 / 0.9007 | 0.9733 / 0.8827 / 0.0483 | 0.9600 / 0.9403 / 0.0880
Wine    | 0.9494 / 0.9449 / 0.0078 | 0.9438 / 0.7984 | 0.9438 / 0.9438 / 0 | 0.9382 / 0.9267 / 0.0058
Heart   | 0.8119 / 0.7965 / 0.0045 | 0.7096 / 0.4425 | 0.7096 / 0.6144 / 0.0526 | 0.7525 / 0.6015 / 0.0854
Sonar   | 0.6442 / 0.5988 / 0.0136 | 0.5721 / 0.3434 | 0.6442 / 0.5757 / 0.0462 | 0.5721 / 0.5673 / 0.0210
Wpbc    | 0.5657 / 0.5391 / 0.0304 | 0.6162 / 0.4350 | 0.6010 / 0.5831 / 0.0128 | 0.7525 / 0.6732 / 0.0420
Vote    | 0.8782 / 0.8480 / 0.0158 | 0.8621 / 0.6267 | 0.8667 / 0.7706 / 0.0604 | 0.8805 / 0.8805 / 1.14E-16
Liver   | 0.5478 / 0.5362 / 0.0041 | 0.5014 / 0.4109 | 0.5159 / 0.5139 / 0.0013 | 0.5565 / 0.5362 / 0.0054
Colon   | 0.8871 / 0.8785 / 0.0083 | 0.8732 / 0.7150 | 0.8871 / 0.8785 / 0.0083 | 0.8548 / 0.8468 / 0.0361
Gloub   | 0.7911 / 0.7741 / 0.1026 | 0.8011 / 0.7170 | 0.7083 / 0.5935 / 0.1008 | 0.6944 / 0.5410 / 0.0709
Lung    | 0.8916 / 0.8811 / 0.0212 | 0.8716 / 0.7370 | 0.6010 / 0.5113 / 0.0534 | 0.8966 / 0.6591 / 0.0974
nci     | 0.7549 / 0.7482 / 0.0154 | 0.7241 / 0.5640 | 0.7213 / 0.5923 / 0.0619 | 0.6885 / 0.4861 / 0.0769
Ovarian | 0.8704 / 0.8580 / 0.0090 | 0.8324 / 0.6260 | 0.8148 / 0.7975 / 0.0191 | 0.9074 / 0.9074 / 5.7E-16
Spell   | 0.7556 / 0.7218 / 0.0252 | 0.7086 / 0.4980 | 0.5000 / 0.2444 / 0.1264 | 0.7419 / 0.6442 / 0.0499

Fig. 7. Artificial multi-class datasets: (a) AD_10_2, (b) AD_15_2 and (c) AD_20_2.

Table 7. The multi-class dataset clustering results of different clustering methods.

Datasets | ME-ICQC: Max rate / Avg rate / Std | QC: Max rate / Avg rate / Std | KM: Max rate / Avg rate / Std
AD_10_2 | 0.9653 / 0.7913 / 0.0927 | 0.8613 / 0.7913 / 0.0442 | 0.8947 / 0.7826 / 0.0660
AD_15_2 | 0.9940 / 0.8609 / 0.0975 | 0.5940 / 0.5940 / 0.0000 | 0.9960 / 0.8169 / 0.1362
AD_20_2 | 0.8970 / 0.8162 / 0.0708 | 0.8940 / 0.8162 / 0.0605 | 0.8730 / 0.7374 / 0.0732

It is obvious that standard QC and ME-ICQC cluster multi-class datasets better than the KM algorithm. This rests on the fact that they solve the Schrödinger equation and obtain the cluster centers as the minima of the potential function. In other words, QC does not need to predefine the number of clustering categories, which makes it less sensitive to the number of partitions. This good property can be utilized in web data clustering and image segmentation.
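One way to exploit this property in code: after running the descent of Eq. (6) (see the earlier sketch), merge the replicas that converged to the same minimum; the number of merged groups is the discovered K. The merge tolerance tol below is an assumption, not a value from the paper.

```python
import numpy as np

def discovered_centers(converged, tol=1e-2):
    """Merge converged replicas within `tol`; returns (K, centers)."""
    centers = []
    for y in converged:
        if not any(np.linalg.norm(y - c) < tol for c in centers):
            centers.append(y)
    return len(centers), np.array(centers)
```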


5.2. Texture image segmentation

Next, we apply ME-ICQC to the segmentation of artificially synthesized texture images and of medical images. The synthetic texture images have rich texture features, bear some similarity to each other, and their contours contain abundant high-frequency information; the segmentation results are therefore affected by the choice of image feature representation, and searching over feature extraction methods could improve them further. Here we focus on wavelet transform features only. The wavelet transform [34] can represent the local features of a signal in both the time and frequency domains, so the sub-band energy of a wavelet decomposition describes image features well. In order to test the efficiency of our algorithm, we extract discrete wavelet transform features of the texture images. The feature extraction in the following texture segmentation experiments is as follows: a 3-level wavelet decomposition is performed around each pixel with a sliding window of size 16 × 16, and the sub-band energies of the decomposition coefficients are computed, which finally produces 10 features per pixel. The images used in the experiments are of size 256 × 256 and contain 2–5 texture classes, so the dataset scale to deal with is 65536 × 10. QC can hardly handle data of this size, so its results are absent from Tables 8 and 9.
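A sketch of the per-pixel feature extraction under stated assumptions: PyWavelets with the Haar ('db1') basis and the mean squared coefficient as the "energy". The paper fixes the 3 levels, the 16 × 16 window and the 10-feature count, but not the wavelet family or the exact energy definition.

```python
import numpy as np
import pywt  # PyWavelets, assumed available

def wavelet_energy_features(window):
    """10 sub-band energies of a 3-level 2-D DWT of one 16x16 window:
    1 approximation band + 3 detail bands per level."""
    coeffs = pywt.wavedec2(window, wavelet='db1', level=3)
    feats = [np.mean(coeffs[0] ** 2)]                 # approximation energy
    for ch, cv, cd in coeffs[1:]:                     # details, coarse to fine
        feats += [np.mean(c ** 2) for c in (ch, cv, cd)]
    return np.asarray(feats)                          # shape (10,)
```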

Fig. 8 shows the segmentation results of ME-ICQC on 4 groups of texture images. For the artificially synthesized texture images, ME-ICQC's segmentation results and the reference templates are given in sequence. It can be seen that the segmentation results are better when the number of texture classes is smaller. For the 5-class case, the left and right texture blocks are similar and difficult to recognize, so the segmentation result is worse than in the cases with fewer classes. Moreover, Table 8 gives the clustering error rates and Kappa indexes [35] of the different clustering methods. In Tables 8 and 9, the clustering error rate [36] is computed as

$$e_{rate} = 1 - \frac{1}{N}\sum_{i=1}^{K}\sum_{j=1, i \ne j}^{K} Correct(i,j) \qquad (25)$$

where Correct(i,j) represents the number of pixels appearing both in the real texture template and in the clustering result, N is the total number of pixels, and K is the number of texture classes. PSO–QC is reported by its minimal error rate due to its instability. Table 8 shows the comparison of common clustering methods used for image segmentation. From the numerical values in Table 8, in most cases the stability and accuracy of the segmentation produced by ME-ICQC are superior to those of GA–QC and PSO–QC, while GA–QC is quicker than ME-ICQC. From Table 9 we find that the minimal error rates are comparable for the three algorithms, while ME-ICQC has an advantage in average accuracy

Table 8. Error rates and Kappa indexes of GA-QC, PSO-QC and ME-ICQC.

Datasets | ME-ICQC: Avg error (%) / Kappa / Time (s) | GA-QC: Avg error (%) / Kappa / Time (s) | PSO-QC: Minimal error (%) / Kappa / Time (s)
2-class | 1.33 / 0.9734 / 94.49 | 1.42 / 0.9716 / 20.52 | 7.25 / 0.8352 / 392.78
3-class | 5.22 / 0.9209 / 104.47 | 5.82 / 0.9117 / 24.09 | 16.37 / 0.7785 / 434.32
4-class | 4.03 / 0.9459 / 114.19 | 3.98 / 0.9466 / 25.92 | 21.45 / 0.7327 / 486.52
5-class | 14.99 / 0.8126 / 129.17 | 16.15 / 0.7859 / 29.16 | 20.19 / 0.7065 / 545.99

Table 9. Image segmentation results of KM, NMF and ME-ICQC.

Datasets | ME-ICQC: Minimal error (%) / Avg error (%) / Std | KM: Minimal error (%) / Avg error (%) / Std | NMF: Minimal error (%) / Avg error (%) / Std
2-class | 1.30 / 1.33 / 0.0212 | 2.00 / 2.12 / 0.1084 | 1.30 / 1.31 / 0.0102
3-class | 5.13 / 5.22 / 0.1072 | 5.41 / 5.42 / 0.0021 | 5.21 / 5.26 / 0.0177
4-class | 3.99 / 4.03 / 0.0037 | 3.92 / 9.44 / 5.9424 | 3.91 / 14.49 / 8.5471
5-class | 14.36 / 14.99 / 0.084 | 15.60 / 44.59 / 16.7645 | 14.52 / 29.51 / 0.7047

Fig. 8. Synthetic texture image segmentation results. (a) 2-class texture, (b) 3-class texture, (c) 4-class texture and (d) 5-class texture.


compared with K-means and NMF. Moreover, the best results of ME-ICQC over 20 runs vary little, and its robustness is evident for multi-class image segmentation.

5.3. Medical image segmentation

Medical images usually have very complex texture information and little difference in gray level. Besides, the human anatomical structure is sophisticated, the tissues and organs are irregular, the edges of key parts are hard to see, and the imaging quality depends on many factors. Consequently, medical image segmentation is a classical puzzle in the field of image segmentation. Accurate segmentation of human tissues and organs with the help of anatomical knowledge can be directly applied to 3D reconstruction of medical images and to research on pathogenesis. On the other hand, directly extracting the abnormal features of cancer or tumor images can locate the lesions rapidly and facilitate diagnosis and therapy. Furthermore, the subtle texture structure and faint gray contrast mean that general feature extraction methods are not entirely suitable for medical images. Accordingly, we cluster directly on the gray values of the images in the following segmentation experiments. Again, QC cannot deal with data of this size, while ME-ICQC and GA–QC can.

5.3.1. Meningioma medical image segmentation

Fig. 9(a) and (b) shows magnetic resonance images of human brain sections, which come from http://www.spl.harvard.edu:8000/. The database consists of magnetic resonance images of several anonymous brain tumor patients, as well as segmentations of the brain and tumor from these scans. Manual segmentations were obtained by neurosurgeons, and automated segmentations were obtained using the method in [37,38]. The original data format is: no header, unsigned short 16 bit (byte order: MSB LSB); resolution: 256 × 256 × 124; pixel size: 0.9375 × 0.9375 mm; slice thickness: 1.5 mm; slice gap: 0.0 mm; acquisition order: LR. The image database contains 124 meningioma slices per person and


In this experiment, cases 2-58 and 10-35 were selected to test the performance of the proposed algorithm; the meningioma of case 2-58 is located in the left parasellar region, and the low-grade glioma of case 10-35 in the left temporal region. The segmentation results of the two images show that ME-ICQC does a good job of detecting the brain tumor region, and its segmentation accuracy (1270/1286 = 98.76% and 1656/1706 = 97.07%) is higher than that of the GA–QC method.

5.3.2. Eyeball blood vessel image segmentation

Computer analysis of eyeball blood vessel images can quantitatively measure many fundus tissues and help diagnose organic abnormalities. In retinal images, accurate vessel extraction is crucial for detecting diseases such as hypertension, diabetes and arteriosclerosis. In this experiment, the medical images and segmentation analysis come from [39] and http://www.ces.clemson.edu/~ahoover/stare. The original images have a resolution of 700 × 605 pixels, with a circular field of view about 650 × 550 pixels across. STARE contains 20 digital slides, of which 10 images exhibit pathology. These images were manually segmented by two experts, and the two sets of manual segmentations are referred to as the ah set and the vk set, respectively. Here, segmentation accuracy is defined as the ratio of the number of accurately partitioned pixels to the number of pixels in the retinal ground-truth image. Images 0077 and 0082, both normal, were selected to test the performance of the proposed algorithm. The blood vessels of the human eyeball in Fig. 10(b) are compact and finely distributed, the branches are irregular, and the contrast with the background is low. The segmentation results of the left-side original image show that ME-ICQC handles local details well.
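Under that definition, the accuracy against each expert's ground truth could be computed as in the following minimal sketch, assuming binary vessel masks of equal size (all names are illustrative):

    import numpy as np

    def vessel_accuracy(seg, truth):
        # Fraction of pixels on which the segmentation agrees with the
        # ground-truth mask, i.e. our reading of the definition above.
        return float((seg.astype(bool) == truth.astype(bool)).mean())

    # Evaluated against both expert sets, e.g.
    # acc_ah = vessel_accuracy(seg, ah_mask)
    # acc_vk = vessel_accuracy(seg, vk_mask)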

Fig. 9. Meningioma medical images segmentation (256×256). For each of Case2-058 and Case10-035: original slice, gold standard, ME-ICQC result (98.76% and 97.07%, respectively) and GA–QC result (96.58% and 20.19%, respectively).

Fig. 10. Eyeball blood vessel image segmentation (256×256). For each of images 0077 and 0082: original image, ground truth (ah set and vk set), ME-ICQC result (0077: ah 96.55%, vk 93.55%; 0082: ah 96.45%, vk 93.86%) and GA–QC result (0077: ah 96.55%, vk 93.47%; 0082: ah 96.62%, vk 93.40%).


6. Conclusions

In summary, by embedding the potential evolution formula into the affinity function of multi-elitist immune clonal optimization, ME-ICQC resolves the sensitivity to the scale parameter; by designing effective immune operators and a cluster center updating mechanism, it overcomes slow iteration and entrapment in local extremes; and by computing the minimum of the potential function from the distance matrix together with single-antibody evolution, it also removes the processing bottleneck on large-scale data, although it is not necessarily better than other optimization techniques combined with QC. Specifically, if the sample size is below 10^4 the plain QC algorithm is used; otherwise ME-ICQC is applied. The experimental results on UCI standard datasets demonstrate the good performance of our algorithm, and the segmentation results on synthetic texture images and medical images indicate that it is applicable to large-size data. In the ME-ICQC algorithm we simply adopted a cluster center updating mechanism with a K-means flavor, which can influence the clustering results. More robust cluster center updating methods and affinity function designs could further improve the algorithm, e.g., updating the cluster centers with a one-step iterative FCM operator [40], replacing the Euclidean distance with a density-sensitive distance measure so as to find the cluster centers of data with manifold structure [41], or designing the affinity function with the PBM index [42]. It is also worth studying how to effectively prevent premature convergence and how to construct more appropriate communication and collaboration strategies among the multi-elitist populations.
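As a toy illustration of that size-based rule (qc_fn and me_icqc_fn are placeholders for the two algorithms, not implementations):

    def choose_and_run(X, k, qc_fn, me_icqc_fn, threshold=10**4):
        # Plain QC when the sample size is below 10^4; ME-ICQC beyond
        # that, to avoid QC's large-data bottleneck.
        return qc_fn(X, k) if len(X) < threshold else me_icqc_fn(X, k)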

Acknowledgments

The authors would like to thank the two anonymous reviewers for their suggestions, and Dr. Shuo Wang (School of Computer Science, University of Birmingham). This work was supported in part by the National Natural Science Foundation of China (Nos. 61003198 and 60970067), the Fund for Foreign Scholars in University Research and Teaching Programs (the 111 Project) (No. B07048), and the Program for Cheung Kong Scholars and Innovative Research Team in University (No. IRT1170).

References

[1] E.H. Ruspini, Numerical methods for fuzzy clustering, Inf. Sci. 15 (2) (1970) 319–350.
[2] J. Shi, J. Malik, Normalized cuts and image segmentation, IEEE Trans. Pattern Anal. Mach. Intell. 22 (8) (2000) 888–905.
[3] W.L. Cai, S.G. Chen, D.Q. Zhang, Fast and robust fuzzy C-means clustering algorithms incorporating local information for image segmentation, Pattern Recognition 40 (3) (2007) 825–838.
[4] D. Horn, A. Gottlieb, The method of quantum clustering, Proc. Adv. Neural Inf. Process. Syst. (NIPS) 14 (2001) 769–776.
[5] D. Horn, I. Axel, Novel clustering algorithm for microarray expression data in a truncated SVD space, Bioinformatics 19 (9) (2003) 1110–1115.
[6] S.J. Roberts, Non-parametric unsupervised cluster analysis, Pattern Recognition 30 (2) (1997) 261–272.
[7] A. Ben-Hur, D. Horn, H.T. Siegelmann, V. Vapnik, Support vector clustering, J. Mach. Learn. Res. 2 (2001) 125–137.
[8] N. Kumar, L. Behera, Visual–motor coordination using a quantum clustering based neural control scheme, Neural Process. Lett. 20 (2004) 11–22.
[9] M. Bhattacharyya, S. Vishveshwara, Quantum clustering and network analysis of MD simulation trajectories to probe the conformational ensembles of protein–ligand interactions, Mol. Biosyst. 7 (7) (2011) 2320–2330.
[10] Z. Zhao, X. Zhao, W. Li, X. Duan, Improving quantum clustering algorithm of categorical data, Comput. Appl. Software 27 (12) (2010) 101–104.
[11] Y. Zhang, P. Wang, G.Y. Chen, et al., Quantum clustering algorithm based on exponent measuring distance, in: IEEE International Symposium on Knowledge Acquisition and Modeling Workshop, 2008, pp. 436–439.
[12] N. Nasios, A.G. Bors, Kernel-based classification using quantum mechanics, Pattern Recognition 40 (3) (2007) 875–889.
[13] Z.H. Li, S.T. Wang, Quantum theory: the unified framework for FCM and QC algorithm, in: Proceedings of the 2007 International Conference on Wavelet Analysis and Pattern Recognition (ICWAPR), vol. 3, 2007, pp. 1045–1048.
[14] M. Weinstein, D. Horn, Dynamic quantum clustering: a method for visual exploration of structures in data, Phys. Rev. E 80 (066117) (2009) 1–10.
[15] Z.H. Li, S. Wang, Parameter-estimated quantum clustering algorithm, J. Data Acquis. Process. 23 (2) (2008) 211–214.
[16] F.M. Burnet, The Clonal Selection Theory of Acquired Immunity, Cambridge University Press, London, 1959.
[17] M.A. Potter, K.A. De Jong, A cooperative coevolutionary approach to function optimization, in: Y. Davidor, H.P. Schwefel, R. Manner (Eds.), Parallel Problem Solving from Nature, PPSN III, Springer-Verlag, Berlin, 1994, pp. 249–257.
[18] F. van den Bergh, A.P. Engelbrecht, A cooperative approach to particle swarm optimization, IEEE Trans. Evol. Comput. 8 (3) (2004) 225–239.
[19] D.Y. Li, H.J. Meng, X.M. Shi, Membership clouds and membership cloud generators, J. Comput. Res. Dev. 32 (6) (1995) 15–20.


[20] C.Y. Liu, D.Y. Li, Y. Du, et al., Some statistical analysis of the normal cloud model, Inf. Control 34 (2) (2005) 236–239.
[21] A. Narayanan, M. Moore, Quantum-inspired genetic algorithms, in: Proc. IEEE Int. Conf. Evolutionary Computation, 1996, pp. 61–66.
[22] R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification, 2nd ed., Wiley-Interscience, New York, 2001.
[23] S. Gasiorowicz, Quantum Physics, John Wiley and Sons, Inc., New York, 1996.
[24] U. Maulik, S. Bandyopadhyay, Genetic algorithm based clustering technique, Pattern Recognition 33 (9) (2000) 1455–1465.
[25] J.L. Deneubourg, S. Goss, N. Franks, et al., The dynamics of collective sorting: robot-like ants and ant-like robots, in: Proceedings of the First International Conference on Simulation of Adaptive Behavior: From Animals to Animats, MIT Press, Cambridge, MA, 1991, pp. 356–365.
[26] M. Omran, Particle Swarm Optimization Methods for Pattern Recognition and Image Processing, Ph.D. Thesis, Department of Computer Science, University of Pretoria, South Africa, 2005.
[27] H.F. Du, L.C. Jiao, S.A. Wang, Clonal operator and antibody clone algorithm, in: Proceedings of the First International Conference on Machine Learning and Cybernetics, vol. 1, 2002, pp. 506–510.
[28] W.M. Wu, B.C. Chen, D. Chen, et al., Evolutionary strategy algorithm based on bi-group, J. Comput. Appl. 29 (5) (2009) 1254–1256.
[29] L.C. Jiao, Y.Y. Li, M.G. Gong, et al., Quantum-inspired immune clonal algorithm for global optimization, IEEE Trans. Syst. Man Cybern. Part B: Cybern. 38 (5) (2008) 1234–1253.
[30] A. Asuncion, D.J. Newman, UCI Machine Learning Repository, www.ics.uci.edu/~mlearn/MLRepository.html, University of California, Irvine, School of Information and Computer Science.
[31] A. Gottlieb, Quantum Clustering: An Innovative Clustering Method Derived from Physics, M.Sc. Thesis, Tel-Aviv University, Israel, 2001.
[32] D.D. Lee, H.S. Seung, Learning the parts of objects by non-negative matrix factorization, Nature 401 (1999) 788–791.
[33] S. Bandyopadhyay, U. Maulik, Nonparametric genetic clustering: comparison of validity indices, IEEE Trans. Syst. Man Cybern. C 31 (1) (2001) 120–125.
[34] S.G. Mallat, A theory for multiresolution signal decomposition: the wavelet representation, IEEE Trans. Pattern Anal. Mach. Intell. 11 (1989) 674–693.
[35] G.M. Foody, Status of land cover classification accuracy assessment, Remote Sensing Environ. 80 (1) (2002) 185–201.
[36] M.G. Gong, L.C. Jiao, L.F. Bo, L. Wang, X.R. Zhang, Image texture classification using a manifold distance based evolutionary clustering method, Opt. Eng. 47 (7) (2008) 077201-1–077201-10.
[37] M. Kaus, S.K. Warfield, A. Nabavi, P.M. Black, F.A. Jolesz, R. Kikinis, Automated segmentation of MRI of brain tumors, Radiology 218 (2) (2001) 586–591.
[38] S.K. Warfield, M. Kaus, F.A. Jolesz, R. Kikinis, Adaptive, template moderated, spatially varying statistical classification, Med. Image Anal. 4 (1) (2000) 43–55.
[39] A. Hoover, V. Kouznetsova, M. Goldbaum, Locating blood vessels in retinal images by piecewise threshold probing of a matched filter response, IEEE Trans. Med. Imaging 19 (3) (2000) 203–210.
[40] J. Li, X.B. Gao, L.C. Jiao, A CSA-based new fuzzy clustering algorithm, J. Electron. Inf. Technol. 27 (2) (2005) 302–305.
[41] M.G. Gong, L.C. Jiao, L. Wang, L.F. Bo, Density-sensitive evolutionary clustering, in: Advances in Knowledge Discovery and Data Mining, Lecture Notes in Computer Science, vol. 4426, Springer-Verlag, 2007, pp. 507–514.
[42] M.K. Pakhira, S. Bandyopadhyay, U. Maulik, Validity index for crisp and fuzzy clusters, Pattern Recognition 37 (3) (2004) 487–501.

Shuiping Gou received the B.S. and M.S. degrees in Computer Science and Technology from Xidian University, Xi'an, China, in 2000 and 2003 respectively, and the Ph.D. degree in Pattern Recognition and Intelligent System from Xidian University, Xi'an, China, in 2008. She is currently an associate professor with the Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China at Xidian University. Her current research interests include machine learning, evolutionary computation, image understanding and interpretation, and data mining.

Xiong Zhuang received the B.S. degree in 2008 from Xidian University, Xi'an, China. He is currently a graduate student in the School of Electronic Engineering, majoring in Intelligent Information Processing. His current research interests include clustering analysis, image segmentation, evolutionary computation and parallel computing.

Yangyang Li received the B.S. and M.S. degrees in Computer Science and Technology from Xidian University, Xi’an, China, in 2001 and 2004 respectively, and the Ph.D. degree in Pattern Recognition and Intelligent System from Xidian University, Xi’an, China, in 2007. She is currently an associate professor with Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China at Xidian University. Her current research interests include quantum-inspired evolutionary computation, artificial immune systems, and data mining.

Cong Xu received the B.A. degree in Electronic Science and Technology from the Xi'an University of Posts and Telecommunications, China. She is currently working towards the Master's degree at the Department of Electronic Engineering, Xidian University. Her research interests include evolutionary computation, machine learning, data mining, and computational intelligence.

Licheng Jiao received the B.S. degree from Shanghai Jiao Tong University, Shanghai, China, in 1982, and the M.Sc. and Ph.D. degrees from Xi'an Jiao Tong University, Xi'an, China, in 1984 and 1990, respectively. His research interests include signal and image processing, natural computation, and intelligent information processing. He is an IEEE senior member, a member of the IEEE Xi'an Section Executive Committee and the chairman of its Awards and Recognition Committee, and an executive committee member of the Chinese Association of Artificial Intelligence.