Expert Systems with Applications 38 (2011) 9096–9104
Fault diagnosis of car assembly line based on fuzzy wavelet kernel support vector classifier machine and modified genetic algorithm

Qi Wu a,b,*, Rob Law b,*, Shuyan Wu c

a Key Laboratory of Measurement and Control of Complex Systems of Engineering, Ministry of Education, Southeast University, Nanjing, Jiangsu 210096, China
b School of Hotel and Tourism Management, Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
c Zhengzhou College of Animal Husbandry, Zhengzhou, Henan 450011, China

* Corresponding authors. Address: Key Laboratory of Measurement and Control of Complex Systems of Engineering, Ministry of Education, Southeast University, Nanjing, Jiangsu 210096, China. Tel.: +86 25 83792418; fax: +86 25 52090000 (Q. Wu). E-mail addresses: [email protected] (Q. Wu), [email protected] (R. Law).
Keywords: Triangular fuzzy number; Support vector classifier machine; Wavelet analysis; Genetic algorithm; Fault diagnosis
Abstract

This paper presents a new version of the fuzzy wavelet support vector classifier machine for diagnosing nonlinear fuzzy fault systems with multi-dimensional input variables. Because complex fuzzy fault systems involve finite samples and uncertain data, the input and output variables are described as fuzzy numbers. By integrating fuzzy theory, wavelet analysis theory and the ν-support vector classifier machine, the fuzzy wavelet ν-support vector classifier machine (FWν-SVCM) is proposed. To seek the optimal parameters of FWν-SVCM, a genetic algorithm (GA) is applied. A diagnosing method based on FWν-SVCM and GA is then put forward. The results of the application to car assembly line diagnosis confirm the feasibility and validity of the diagnosing method. Compared with the traditional model and other SVCM methods, the FWν-SVCM method requires fewer samples and has better diagnosing precision.

© 2011 Elsevier Ltd. All rights reserved.
1. Introduction

Recently, a novel machine learning technique, called the support vector machine (SVM), has drawn much attention in the fields of pattern classification and regression estimation. SVM was first introduced by Vapnik (1999, 2000). It is an approximate implementation of the structural risk minimization (SRM) principle in statistical learning theory, rather than the empirical risk minimization (ERM) method. The SRM principle is based on the fact that the generalization error is bounded by the sum of the empirical error and a confidence-interval term depending on the Vapnik–Chervonenkis (VC) dimension (Vapnik, 2000). By minimizing this bound, good generalization performance can be achieved. Compared with traditional neural networks, SVM obtains a unique global optimal solution and avoids the curse of dimensionality. These attractive properties make SVM a promising technique.

SVM was initially designed to solve pattern recognition problems (Cevikalp, Neamtu, & Barkana, 2007; Doumpos, Zopounidis, & Golfinopoulou, 2007; Guo & Li, 2003; Huysmans, Setiono, Baesens, & Vanthienen, 2008; Jayadeva & Chandra, 2007; Juang, Chiu, & Shiu, 2007; Peng, Zhang, & Riedel, 2008; Ratsch, Mika, Scholkopf, & Muller, 2002; Wang, Xue, & Chan, 2008; Widodo & Yang, 2008; Zhang, Zhou, & Jiao, 2004; Wu & Law, 2010). With the introduction of Vapnik's ε-insensitive loss function, SVM was extended to function approximation and regression estimation problems (Hu et al., 2007; Lin, Chen, & Chao, 2001; Rueda, Arciniegas, & Embrechts, 2004; Wu, 2009; Wu & Law, 2011; Wu, Wu, & Liu, 2010; Zhang et al., 2004; Zhu, Hoi, & Lyu, 2008). In the SVM approach, the parameter ε controls the sparseness of the solution in an indirect way; however, it is difficult to choose a reasonable value of ε without prior information about the accuracy of the output values. Schölkopf, Smola, Williamson, and Bartlett (2000) and Chalimourda, Schölkopf, and Smola (2004) modified the original ε-SVM and introduced the ν-SVM, where a new parameter ν controls the number of support vectors and the points that lie outside the ε-insensitive tube. The value of ε in the ν-SVM is thus traded off against model complexity and slack variables via the constant ν.

In many real applications, the observed input data cannot be measured precisely and are usually described in linguistic levels or ambiguous metrics. However, the traditional support vector classifier (SVC) method cannot cope with qualitative information. It is well known that fuzzy logic is a powerful tool for dealing with fuzzy and uncertain data, and some scholars have explored the fuzzy support vector machine (FSVM). For pattern classification problems, Shieh and Yang (2008) applied a fuzzy SVM to construct a classification model of product form design based on consumer preferences by allocating continuous and discrete attributes to the product form.
Each product sample was assigned a class label and a fuzzy membership that describes the semantic differential score corresponding to this label. To better handle the uncertainties existing in real classification data and in the membership functions of the traditional type-1 fuzzy logic system, Chen, Li, Harrison, and Zhang (2008) applied interval type-2 fuzzy sets to construct a type-2 SVM-fusion fuzzy logic system; this type-2 fusion architecture takes into consideration the classification results from individual SVCs and generates the combined classification decision as the output. Yang, Jin, and Chuang (2006) proposed a system that uses both fuzzy support vector machines and the variable-degree variable-step-size least-mean-square algorithm; they apply a fuzzy membership to each point, so that different points provide different contributions to the decision learning function.

However, the fuzzy support vector classifier machines mentioned in the above literature are not suitable when the input and output variables are described as triangular fuzzy numbers, i.e., when the input variables of the classification problem come from the triangular fuzzy number space. Moreover, the kernel functions of the published fuzzy SVCMs pay little attention to wavelet support vector kernels. The left and right parts of a triangular fuzzy number can represent the uncertain information of expert judgement. For the ordinary fuzzy SVM, all fuzzy information is transformed into a crisp number via a membership or a mapping, and the analysis is based on the resulting sample set of crisp numbers. In contrast, this paper suggests a novel fuzzy ν-SVCM with wavelet kernels. The major novelty of the present work is that the inputs and outputs are described by triangular fuzzy numbers, which allows a more effective description of systems involving uncertainties. Additionally, the parameters appearing in the formulation are determined via a GA-based search. Compared with the ordinary SVCM, the proposed fuzzy SVCM in triangular fuzzy number space, whose constraint conditions are three times those of the standard SVCM, establishes the optimization problem based on the left, middle and right parts of the triangular fuzzy number, respectively. In a word, uncertain information is considered in the establishment of the novel fuzzy ν-SVCM with wavelet kernels, which makes it suitable for diagnosing complex nonlinear fuzzy systems with uncertain influencing factors.

Solving for the optimal parameter b of the fuzzy ν-SVCM with wavelet kernels is difficult. To overcome this disadvantage, the influence of b is taken into account in the confidence-interval term of the model, so that b does not appear in the classifier output function of the modified model. The modified fuzzy ν-SVCM with wavelet kernels, built according to the structural risk minimization (SRM) principle, is a new version of the ν-SVCM, named FWν-SVCM.

In this paper, we put forward this new fuzzy SVCM, called FWν-SVCM, and, based on it, propose a diagnosing method for nonlinear fuzzy fault systems. The rest of this paper is organized as follows. The FWν-SVCM is described in Section 2. In Section 3, a GA is used to optimize the unknown parameters of FWν-SVCM. In Section 4, a diagnosing method based on FWν-SVCM and GA is proposed. Section 5 gives an application to car assembly line diagnosis, where FWν-SVCM is also compared with other SVCMs. Section 6 draws the conclusions.
2. Fuzzy wavelet support vector classifier machine (FWν-SVCM)

2.1. Triangular fuzzy theory

Definition 1. Suppose M ∈ T(R) is a triangular fuzzy number (TFN) in the triangular fuzzy space, whose membership function is represented as follows:

$$\mu_M(x)=\begin{cases}\dfrac{x-a_M}{r_M-a_M}, & a_M\le x< r_M,\\[2pt] 1, & x=r_M,\\[2pt] \dfrac{x-b_M}{r_M-b_M}, & r_M\le x< b_M,\end{cases}\qquad(1)$$
where $a_M\le r_M< b_M$, $a_M,r_M,b_M\in\mathbb{R}$, $a_M\le x< b_M$, $x\in\mathbb{R}$. Then we have the formulation $M=(a_M,r_M,b_M)$, in which $r_M$ is the center, $a_M$ the left boundary and $b_M$ the right boundary.

Because the standard triangular fuzzy number is difficult to use directly as an input variable of SVM, the following extended version of Definition 1 is considered:

Definition 2. $\tilde a=(r_a,\Delta r_a,\overline{\Delta r}_a)$ is an extended triangular fuzzy number (ETFN), in which $r_a\in\mathbb{R}$ is the center, $\Delta r_a=r_a-a_a$ is the left spread and $\overline{\Delta r}_a=b_a-r_a$ is the right spread, where $\Delta r_a>0$ and $\overline{\Delta r}_a>0$.

Let $A=(r_A,\Delta r_A,\overline{\Delta r}_A)$ and $B=(r_B,\Delta r_B,\overline{\Delta r}_B)$ be two ETFNs, whose λ-cuts are shown in Fig. 1. In the space T(R) of all ETFNs, we define linear operations by the extension principle: $A+B=(r_A+r_B,\max(\Delta r_A,\Delta r_B),\max(\overline{\Delta r}_A,\overline{\Delta r}_B))$; $kA=(kr_A,k\Delta r_A,k\overline{\Delta r}_A)$ if $k\ge 0$ and $kA=(kr_A,-k\overline{\Delta r}_A,-k\Delta r_A)$ if $k<0$; and $A-B=(r_A-r_B,\max(\Delta r_A,\Delta r_B),\max(\overline{\Delta r}_A,\overline{\Delta r}_B))$. The λ-cut of A can be written as $A_\lambda=[\underline{A}(\lambda),\overline{A}(\lambda)]$ for $\lambda\in[0,1]$, where $\underline{A}(\lambda)$ and $\overline{A}(\lambda)$ are the two boundaries of the λ-cut, as shown in Fig. 1. The λ-cut of a fuzzy number is always a closed and bounded interval. By the Hausdorff distance of real numbers, we can define a metric on T(R) as
$$D(A,B)=\sup_{\lambda}\max\{|\underline{A}(\lambda)-\underline{B}(\lambda)|,\;|\overline{A}(\lambda)-\overline{B}(\lambda)|\},\qquad(2)$$

where $A_\lambda=[\underline{A}(\lambda),\overline{A}(\lambda)]$ and $B_\lambda=[\underline{B}(\lambda),\overline{B}(\lambda)]$ are the λ-cuts of the two fuzzy numbers.

Theorem 1. In T(R), the Hausdorff metric can be obtained as follows:
$$D(A,B)=\max\{|(r_A-\Delta r_A)-(r_B-\Delta r_B)|,\;|r_A-r_B|,\;|(r_A+\overline{\Delta r}_A)-(r_B+\overline{\Delta r}_B)|\}.\qquad(3)$$
Appendix A shows the proof of Theorem 1.

Deduction 1. If A and B are two symmetric triangular fuzzy numbers in T(R), where $A=(r_A,\Delta r_A)$ and $B=(r_B,\Delta r_B)$, then the Hausdorff metric of A and B can be written as

$$D(A,B)=\max\{|(r_A-\Delta r_A)-(r_B-\Delta r_B)|,\;|(r_A+\Delta r_A)-(r_B+\Delta r_B)|\}.\qquad(4)$$

Appendix B shows the proof of Deduction 1.
Fig. 1. The λ-cuts of two triangular fuzzy numbers.
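To make the metric concrete, the following minimal Python sketch (an illustration, not part of the paper) implements the ETFN representation of Definition 2 and the Hausdorff metric of Theorem 1; the two sample numbers are arbitrary.

```python
import numpy as np

class ETFN:
    """Extended triangular fuzzy number (center, left spread, right spread), Definition 2."""
    def __init__(self, r, dl, dr):
        assert dl > 0 and dr > 0
        self.r, self.dl, self.dr = r, dl, dr

def hausdorff(A, B):
    """Hausdorff metric D(A, B) of Theorem 1, Eq. (3)."""
    return max(abs((A.r - A.dl) - (B.r - B.dl)),
               abs(A.r - B.r),
               abs((A.r + A.dr) - (B.r + B.dr)))

A = ETFN(0.59, 0.05, 0.05)   # symmetric TFN: equal spreads, as in Deduction 1
B = ETFN(0.43, 0.08, 0.08)
print(hausdorff(A, B))       # 0.19 = |(0.59 - 0.05) - (0.43 - 0.08)|
```

For symmetric TFNs the middle term of Eq. (3) is dominated by the other two, which is exactly the content of Deduction 1.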
2.2. Wavelet analysis theory

Suppose the wavelet function ψ(x) satisfies the admissibility conditions, namely that $\hat\psi(\omega)$, the Fourier transform of $\psi(x)\in L^2(\mathbb{R})\cap L^1(\mathbb{R})$, satisfies $\hat\psi(0)=0$. The wavelet function group can then be defined as

$$\psi_{a,m}(x)=|a|^{-\frac12}\,\psi\!\left(\frac{x-m}{a}\right),\qquad(5)$$

where a is the so-called scaling parameter, m is the horizontal floating coefficient, and ψ(x) is called the "mother wavelet". The translation parameter $m\ (m\in\mathbb{R})$ and the dilation parameter $a\ (a>0)$ may be continuous or discrete.

For a function $f(x)\in L^2(\mathbb{R})$, the wavelet transform of f(x) can be defined as

$$W(a,m)=|a|^{-\frac12}\int_{-\infty}^{+\infty}f(x)\,\psi^{*}\!\left(\frac{x-m}{a}\right)dx,\qquad(6)$$

where $\psi^{*}(x)$ stands for the complex conjugate of ψ(x). The wavelet transform W(a, m) can be considered as a function of the translation m at each scale a. Eq. (6) indicates that wavelet analysis is a time–frequency analysis, or a time–scale analysis. Unlike the short-time Fourier transform, the wavelet transform performs multi-scale analysis of a signal through dilation and translation, so it can extract the time–frequency features of a signal effectively. The wavelet transform is also invertible, which makes it possible to reconstruct the original signal. A classical inversion formula for f(x) is

$$f(x)=C_\psi^{-1}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}W(a,m)\,\psi_{a,m}(x)\,\frac{da\,dm}{a^{2}},\qquad(7)$$

where

$$C_\psi=\int_{-\infty}^{+\infty}\frac{|\hat\psi(\omega)|^{2}}{|\omega|}\,d\omega<\infty,\qquad(8)$$

$$\hat\psi(\omega)=\int_{-\infty}^{+\infty}\psi(x)\exp(-j\omega x)\,dx.\qquad(9)$$

For Eq. (7), $C_\psi$ is a constant with respect to ψ(x). The idea of wavelet decomposition is to approximate the function f(x) by a linear combination of the wavelet function group. If the one-dimensional wavelet function is ψ(x), then, using tensor product theory, the multi-dimensional wavelet function can be defined as

$$\psi_l(x)=\prod_{i=1}^{l}\psi(x_i),\quad x_i\in\mathbb{R}^{d},\qquad(10)$$

where $x_i$ is a column vector of dimension d.
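As a concrete illustration of Eq. (6) (not from the original paper), the following Python sketch numerically evaluates the wavelet transform of a toy signal with the Morlet mother wavelet that is used in Section 2.3; the signal and sampling grid are arbitrary assumptions.

```python
import numpy as np

def morlet(x):
    """Morlet mother wavelet, as in Theorem 3 of Section 2.3."""
    return np.cos(1.75 * x) * np.exp(-x**2 / 2)

def cwt(f, x, a, m):
    """Wavelet transform W(a, m) of Eq. (6) by a simple Riemann sum.
    The Morlet wavelet is real, so conjugation is a no-op here."""
    dx = x[1] - x[0]
    return np.abs(a) ** -0.5 * np.sum(f * morlet((x - m) / a)) * dx

x = np.linspace(-10, 10, 2001)      # sampling grid (illustrative)
f = np.sin(2 * np.pi * 0.5 * x)     # toy signal
print(cwt(f, x, a=1.0, m=0.0))      # response at one scale/translation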
2.3. The conditions of wavelet support vector kernel functions

A support vector kernel function can be not only a dot-product kernel, $K(x,x')=K(\langle x\cdot x'\rangle)$, but also a translation-invariant kernel, $K(x,x')=K(x-x')$. In fact, any function that satisfies Mercer's condition is an admissible support vector (SV) kernel function.

Lemma 1 (Mercer; Widodo & Yang, 2008; Wu, 2009; Zhang et al., 2004). The necessary and sufficient condition for a continuous symmetric function $K(x,x')$ on $L^2(\mathbb{R}^n)$ to be an inner product of a feature space is that, for all $g(x)\ne 0$ with $\int g^2(x)\,dx<\infty$,

$$\iint_{\mathbb{R}^n\times\mathbb{R}^n}K(x,x')\,g(x)\,g(x')\,dx\,dx'\ge 0,\quad x,x'\in\mathbb{R}^n,\qquad(11)$$

holds. This lemma provides a straightforward procedure to judge and construct kernel functions: an inner-product kernel $K(x,x')$ satisfying (11) is an admissible SV kernel function.

Lemma 2 (Widodo & Yang, 2008; Wu, 2009; Zhang et al., 2004). In $L^2(\mathbb{R}^n)$, a symmetric function $K(x,x')$, $x,x'\in\mathbb{R}^n$, of the translation-invariant form $K(x,x')=K(x-x')$ is an admissible support vector (SV) kernel if it satisfies Mercer's condition.

Lemmas 1 and 2 provide a method of judging and constructing SV kernel functions. However, it is difficult to decompose translation-invariant kernels into the product of two functions and then prove that they are SV kernels. The necessary and sufficient condition for a translation-invariant kernel can therefore be described as follows:

Lemma 3 (Widodo & Yang, 2008; Wu, 2009; Zhang et al., 2004). The translation-invariant kernel function $K(x,x')=K(x-x')$ is an admissible SV kernel if and only if the Fourier transform of K(x) satisfies the condition

$$F[K](\omega)=(2\pi)^{-n/2}\int_{\mathbb{R}^n}\exp(-j(\omega\cdot x))\,K(x)\,dx\ge 0.\qquad(12)$$

Theorem 2. In T(R), let $\psi(r_{x_i})$ be a mother wavelet function, a the scaling coefficient vector of dimension l, and b, b′ the horizontal floating coefficient vectors of dimension d, $a\in\mathbb{R}^l$, $b,b'\in\mathbb{R}^d$. Then the dot-product wavelet kernel function satisfying Lemma 1 can be described as

$$K(r_x,r_{x'})=\prod_{i=1}^{l}\psi\!\left(\frac{r_{x_i}-b}{a_i}\right)\psi\!\left(\frac{r_{x'_i}-b'}{a_i}\right),\quad r_{x_i}\in\mathbb{R}^d,\ x_i\in T(R)^d,\qquad(13)$$

and the translation-invariant wavelet kernel function satisfying Lemma 2 can also be described as

$$K(r_x,r_{x'})=\prod_{i=1}^{l}\psi\!\left(\frac{r_{x_i}-r_{x'_i}}{a_i}\right).\qquad(14)$$

There is no horizontal floating coefficient vector b in Eq. (14), only the variable vector $r_{x'_i}$, which is equivalent to b. Therefore, we can build wavelet kernel functions with horizontal floating invariance by means of wavelet functions. In the light of Theorem 2, the Morlet wavelet kernel function can be described as in Theorem 3.

Theorem 3. In T(R), the Morlet wavelet function can be described as $\psi(r_x)=\cos(1.75\,r_x)\exp(-r_x^2/2)$; then the Morlet wavelet kernel function, which is an admissible SV kernel, can be described as

$$K(r_x,r_{x'})=\prod_{i=1}^{l}\cos\!\left(1.75\,\frac{r_{x_i}-r_{x'_i}}{a_i}\right)\exp\!\left(-\frac{\|r_{x_i}-r_{x'_i}\|^{2}}{2a_i^{2}}\right),\quad r_{x_i},r_{x'_i}\in\mathbb{R}^d,\ x_i\in T(R)^d.\qquad(15)$$

Appendix C shows the proof of Theorem 3.

Now we give the Mexican hat wavelet kernel function. The Mexican hat wavelet function is defined as

$$\psi(r_x)=\left(1-r_x^{2}\right)\exp\!\left(-\frac{r_x^{2}}{2}\right).\qquad(16)$$

The Mexican hat wavelet kernel function is defined as

$$K(r_x,r_{x'})=\prod_{i=1}^{l}\left(1-\frac{\|r_{x_i}-r_{x'_i}\|^{2}}{a_i^{2}}\right)\exp\!\left(-\frac{\|r_{x_i}-r_{x'_i}\|^{2}}{2a_i^{2}}\right),\quad r_{x_i},r_{x'_i}\in\mathbb{R}^d,\ x_i\in T(R)^d,\qquad(17)$$

and this kernel function is an admissible support vector kernel function.
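The two kernels of Eqs. (15) and (17) translate directly into code. The following Python sketch is an illustration under the simplifying assumption of scalar components and a single shared scale a; it is not the authors' implementation.

```python
import numpy as np

def morlet_kernel(rx, rxp, a):
    """Morlet wavelet kernel, Eq. (15); rx, rxp: 1-D center vectors."""
    d = (rx - rxp) / a
    return np.prod(np.cos(1.75 * d) * np.exp(-d**2 / 2))

def mexican_hat_kernel(rx, rxp, a):
    """Mexican hat wavelet kernel, Eq. (17)."""
    d2 = ((rx - rxp) / a) ** 2
    return np.prod((1 - d2) * np.exp(-d2 / 2))

rx  = np.array([0.59, 0.07, 0.89])   # illustrative centers
rxp = np.array([0.43, 0.27, 0.34])
print(morlet_kernel(rx, rxp, a=0.84))       # a = 0.84, the GA value of Section 5
print(mexican_hat_kernel(rx, rxp, a=0.84))
```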
2.4. Fuzzy wavelet support vector classifier machine

Suppose a set of fuzzy training samples $\{(x_i,y_i)\}_{i=1}^{l}$, where $x_i\in T(R)^d$ and $y_i\in T(R)$, and $T(R)^d$ is the set of d-dimensional vectors of ETFNs. For computational simplicity, only symmetric triangular fuzzy numbers are taken into account, i.e. $x_i=(r_{x_i},\Delta r_{x_i})$ and $y_i=(r_{y_i},\Delta r_{y_i})$, where $\Delta r_{x_i}=\overline{\Delta r}_{x_i}$ and $\Delta r_{y_i}=\overline{\Delta r}_{y_i}$. We consider the approximation function $f(x)=\operatorname{sgn}(w\cdot x+b)$, where $w=(w_1,w_2,\ldots,w_d)$ and $w\cdot x$ denotes the inner product of w and x. In T(R), f(x) can be written as

$$f(x)=\operatorname{sgn}\left(w\cdot r_x+b,\ \rho(\Delta r_x)\right),\quad w\in\mathbb{R}^d,\ b\in\mathbb{R},\qquad(18)$$

where $\rho(\Delta r_x)=\max(\Delta r_{x_1},\Delta r_{x_2},\ldots,\Delta r_{x_d})$. Then the fuzzy ν-support vector classifier machine (Fν-SVCM), whose ε-insensitive tube and architecture are illustrated in Figs. 2 and 3, respectively, solves the following quadratic programming problem:

$$\begin{aligned}\min_{w,\xi,\rho,b}\ \ &\tau(w,\xi,\rho)=\frac12\left(\|w\|^{2}+b^{2}\right)-\nu\rho+\frac1l\sum_{k=1}^{2}\sum_{i=1}^{l}\xi_{ki}\\ \text{s.t.}\ \ & r_{y_i}\left(w\cdot r_{x_i}+b\right)\ge\rho-\xi_{1i},\\ &\Delta r_{y_i}\left(w\cdot\Delta r_{x_i}+b\right)\ge\rho-\xi_{2i},\\ &\xi_{ki}\ge 0,\quad\rho\ge 0,\end{aligned}\qquad(19)$$

where $\xi_{ki}$ (k = 1, 2; i = 1, …, l) are slack variables and ν ∈ (0, 1] is an adjustable regularization parameter.

Fig. 2. The ε-insensitive tube of FWν-SVCM.
Fig. 3. The architecture of FWν-SVCM.

Problem (19) is a quadratic programming (QP) problem. By introducing Lagrangian multipliers, a Lagrangian function can be defined as follows:

$$\begin{aligned}L(w,b,\alpha,\beta,\xi,\rho,\delta)=\ &\frac12\left(\|w\|^{2}+b^{2}\right)-\nu\rho+\frac1l\sum_{k=1}^{2}\sum_{i=1}^{l}\xi_{ki}\\ &-\sum_{i=1}^{l}\left(\alpha_{1i}\left(r_{y_i}(w\cdot r_{x_i}+b)-\rho+\xi_{1i}\right)+\beta_i\xi_{1i}\right)\\ &-\sum_{i=1}^{l}\left(\alpha_{2i}\left(\Delta r_{y_i}(w\cdot\Delta r_{x_i}+b)-\rho+\xi_{2i}\right)+\beta_i\xi_{2i}\right)-\delta\rho,\end{aligned}\qquad(20)$$

where $\alpha=(\alpha_{11},\ldots,\alpha_{1l},\alpha_{21},\ldots,\alpha_{2l})^{T}\in\mathbb{R}_+^{2l}$, $\beta=(\beta_1,\ldots,\beta_l)^{T}\in\mathbb{R}_+^{l}$ and $\delta>0$ are Lagrangian multipliers. Differentiating the Lagrangian function (20) with respect to w, b, ρ and ξ, we have

$$\begin{cases}\nabla_w L=0\ \Rightarrow\ w=\sum_{k=1}^{2}\sum_{i=1}^{l}\alpha_{ki}\,r_{y_i}\,\varphi(r_{x_i}),\\[4pt] \nabla_b L=0\ \Rightarrow\ \sum_{k=1}^{2}\sum_{i=1}^{l}\alpha_{ki}\,r_{y_i}=b,\\[4pt] \nabla_\rho L=0\ \Rightarrow\ \nu-\sum_{k=1}^{2}\sum_{i=1}^{l}\alpha_{ki}+\delta=0,\\[4pt] \nabla_\xi L=0\ \Rightarrow\ \alpha_{ki}+\beta_i=\frac1l.\end{cases}\qquad(21)$$

By substituting Eq. (21) into Eq. (20), we obtain the corresponding dual form of problem (19):

$$\begin{aligned}\max_{\alpha}\ \ W(\alpha)=\ &-\frac12\sum_{i,j=1}^{l}\left(r_{y_i}+\Delta r_{y_i}\rho(\Delta r_{x_i})\right)\left(r_{y_j}+\Delta r_{y_j}\rho(\Delta r_{x_j})\right)\alpha_{1i}\alpha_{1j}\left(K(\Delta r_{x_i}-\Delta r_{x_j})+1\right)\\ &-\frac12\sum_{i,j=1}^{l}\left(r_{y_i}+\Delta r_{y_i}\rho(\Delta r_{x_i})\right)\left(r_{y_j}+\Delta r_{y_j}\rho(\Delta r_{x_j})\right)\alpha_{2i}\alpha_{2j}\,K(\Delta r_{x_i}-\Delta r_{x_j})\\ \text{s.t.}\ \ &0\le\alpha_{ki}\le\frac1l,\qquad\sum_{k=1}^{2}\sum_{i=1}^{l}\alpha_{ki}\ge\nu.\end{aligned}\qquad(22)$$

Selecting an appropriate ν and kernel $K(x,x')$, we can construct and solve the optimization problem (19) by the QP method to obtain the optimal solution $\alpha^{*}=(\alpha_1^{*},\ldots,\alpha_l^{*})^{T}$. Select $j\in S_+=\{i\mid\alpha_i^{*}\in(0,1/l),\ y_i=1\}$ and $k\in S_-=\{i\mid\alpha_i^{*}\in(0,1/l),\ y_i=-1\}$; then we have

$$\begin{aligned}b=\ &-\frac12\sum_{i=1}^{l}\alpha_{1i}\left(r_{y_i}+\Delta r_{y_i}\rho(\Delta r_{x_i})\right)\left(K(\Delta r_{x_i}-\Delta r_{x_j})+K(\Delta r_{x_i}-\Delta r_{x_k})\right)\\ &-\frac12\sum_{i=1}^{l}\alpha_{2i}\left(r_{y_i}+\Delta r_{y_i}\rho(\Delta r_{x_i})\right)\left(K(\Delta r_{x_i}-\Delta r_{x_j})+K(\Delta r_{x_i}-\Delta r_{x_k})\right).\end{aligned}\qquad(23)$$
The relation between x and f(x) of the Fν-SVCM is described as follows:

$$f(x)=\operatorname{sgn}\!\left(\sum_{k=1}^{2}\sum_{i=1}^{l}\alpha_{ki}\left(r_{y_i}+\Delta r_{y_i}\rho(\Delta r_{x_i})\right)\left(K(r_{x_i}-r_x)+1\right),\ \rho(\Delta r_x)\right).\qquad(24)$$
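To show how Eq. (24) is evaluated once training has produced the multipliers, here is a minimal Python sketch. It is illustrative only: the multipliers, sample spreads and the Morlet kernel of Eq. (15) (with the single scale a = 0.84 from Section 5) are placeholder assumptions, not trained values.

```python
import numpy as np

def morlet_kernel(u, v, a=0.84):
    d = (u - v) / a
    return np.prod(np.cos(1.75 * d) * np.exp(-d**2 / 2))

def fwnu_svcm_decision(x_center, x_spread, alpha, r_y, dr_y, r_X, dr_X):
    """Classifier output of Eq. (24): (crisp sign part, fuzzy spread rho(Delta r_x)).
    alpha: (2, l) multipliers; r_X: (l, d) sample centers; dr_X: (l, d) spreads."""
    s = 0.0
    for k in range(2):
        for i in range(len(r_X)):
            rho_i = dr_X[i].max()                      # rho(Delta r_{x_i})
            s += alpha[k, i] * (r_y[i] + dr_y[i] * rho_i) \
                 * (morlet_kernel(r_X[i], x_center) + 1)
    return np.sign(s), x_spread.max()                  # rho(Delta r_x)

# Toy usage with random placeholders for the trained quantities.
l, d = 4, 3
rng = np.random.default_rng(1)
alpha = rng.uniform(0, 1 / l, size=(2, l))             # would come from QP (22)
r_y, dr_y = rng.choice([-1.0, 1.0], l), rng.uniform(0.01, 0.1, l)
r_X, dr_X = rng.uniform(size=(l, d)), rng.uniform(0.01, 0.1, (l, d))
label, rho = fwnu_svcm_decision(r_X[0], dr_X[0], alpha, r_y, dr_y, r_X, dr_X)
```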
3. The modified genetic algorithm

The genetic algorithm (GA) is a stochastic global search technique that solves problems by imitating the processes of natural evolution. Based on the survival and reproduction of the fittest, GA continually exploits new and better solutions without any prior assumptions, such as continuity or unimodality. GA has been successfully applied to many complex optimization problems and shows its merits over traditional optimization methods, especially when the system under study has multiple optimum solutions (Avci, 2009; Chiu & Chen, 2009; Fei, Liu, & Miao, 2009; Si et al., 2009; Wu, Tzeng, & Lin, 2009).

GA evolves a population of candidate solutions. Each solution is represented by a chromosome, usually coded as a binary string, whose fitness is evaluated by a performance function after decoding. Upon completion of the evaluation, a biased roulette wheel is used to randomly select pairs of chromosomes to undergo genetic operations that mimic phenomena observed in nature, such as crossover and mutation. This evolution process continues until the stopping criteria are reached.

The real-coded GA proposed in this work represents each chromosome as a vector of floating-point numbers. Its crossover operator borrows the concept of a convex combination of vectors, and its mutation operator changes a gene, with some probability, within the problem's domain. With these modifications of the genetic operators, real-coded GA has shown better performance than binary-coded GA on continuous problems.

GA differs from conventional nonlinear optimization techniques in that it searches by maintaining a population of solutions from which better solutions are created, rather than making incremental changes to a single solution. Two important issues in GA are the genetic coding used to define the problem and the evaluation function, called the fitness function. The initial population is generated randomly and evolves into the next generation through genetic operators such as selection, crossover and mutation. The selection operator lets strings with higher fitness appear with higher probability in the next generation. Crossover is performed between two selected individuals, called parents, by exchanging parts of their strings starting from a randomly chosen crossover point; this operator tends to move the evolutionary process toward promising regions of the search space. Mutation is used to search further regions of the problem space and to avoid local convergence.

It is difficult to find the optimal parameters of Eq. (22) by the simple GA when the training set consists of multi-dimensional, small-sample and nonlinear fault patterns. Some improvements on the simple GA are therefore presented below:
(A) Niching technique. A simple sub-population scheme and deterministic crowding are included here. In the simple sub-population scheme, the population is divided into sub-populations, and an individual can mate only with other individuals from its own sub-population; each individual is tagged to indicate which sub-population it belongs to. The evolution strategy in this case is sharing: the optimal solutions are treated as resources. Overcrowding on one particular optimal solution means that the resource is overused, so the perceived fitness of that solution decreases; conversely, if only a few individuals are concentrated on one solution, that resource is underused and its perceived fitness increases. This is a modification of the internal structure of the genetic algorithm: it changes how the algorithm perceives the objective function, not the objective function itself. The crowding method maintains sub-populations at different niches of a multimodal fitness landscape: a small set of parents is selected from the population, and each child replaces the individual of this set most similar to itself. In the plain crowding method, stochastic replacement errors prevent the algorithm from locating all niches. In deterministic crowding, the children compete against their parents for inclusion in the new generation, and, unlike the general crowding method, all parents participate in the competition; similarity between children and parents can be measured by either genotypic or phenotypic distance.

(B) Fitness assignment based on linear ranking. To distinguish the excellent chromosomes, a mapping is established among the objective function value, the chromosome position and the fitness value. The chromosomes are arranged in descending order of objective function value, and the excellent chromosomes are sent into the sub-population of the niching technique according to this ranking rule: a chromosome with a larger objective function value, placed toward the front, receives a smaller fitness value, and chromosomes with smaller fitness values are inferior. Thus the chromosome with the least fitness value occupies the first position of the ranking, and the most excellent chromosome the last. The fitness value of each chromosome is determined by its position in the ranked population:

$$\mathrm{FitnV}=2-sp+2\,(sp-1)\,\frac{\mathrm{Pos}-1}{\mathrm{Pop\_size}-1},\qquad(25)$$

where sp is the assigned pressure difference, Pos the position of the chromosome, and Pop_size the population size; FitnV ∈ [1, 2].

(C) Selection. The population of the next generation is first formed by a probabilistic reproduction process. In general, there are two types of reproduction: generational reproduction, which replaces the entire population with a new one, and steady-state reproduction, which replaces only a few individuals in a generation. Whichever type is used, individuals with higher fitness usually have a greater chance of contributing offspring. A hybrid selection based on proportional selection and ranking is employed as the selection scheme of the proposed GA. The excellent chromosomes, controlled by the parameter sub_n ∈ (0, 1], are then sent into the sub-population by the niching technique. The selection probability $P(x_i)$ of the ith chromosome $x_i$ is

$$P(x_i)=\frac{f(x_i)}{\sum_{i=1}^{\mathrm{Pop\_size}}f(x_i)},\qquad(26)$$

where Pop_size denotes the size of the population and $f(x_i)$ the fitness of chromosome $x_i$. The resulting population is sometimes called the intermediate population with a niching technique.

(D) Mutation. The intermediate population is processed by crossover and mutation to form the next generation. To preserve the excellent chromosomes, an additional measure is taken: the inferior chromosomes of the latest generation are substituted by the excellent ones of the intermediate population via the substituting probability substitute_P to form the next generation population.

4. The diagnosing algorithms and steps

GA is considered an excellent technique for solving combinatorial optimization problems. The modified GA is also a random search method, based on the concept of natural selection together with a niching operator. It starts with an initial population and then applies a mixture of reproduction, crossover and mutation to create new, and hopefully better, populations. The steps of the modified GA are listed below; a sketch of the ranking-based fitness and selection follows the algorithm.

Algorithm 1. The modified genetic algorithm.
Step 1: Data preparation: the training and testing sets are represented as Tr and Te, respectively.
Step 2: gen = 0.
Step 3: Initialize the parameters: crossover probability (Pc), mutation probability (Pm), pressure difference (sp), population size (Pop_size), maximal genetic generation (Max_gen), iterative variable (gen), first generation population Chrom(gen), child generation population with niching technique SelCh(gen + 1), niching coefficient sub_n controlling the excellent chromosomes sent into SelCh(gen + 1), optimal fitness of the objective function (ObjV), and mutation substituting probability (substitute_P).
Step 4: Compute the objective function, evaluate the fitness and obtain the global optimal fitness.
Step 5: If the global optimal fitness value meets the required accuracy, or Algorithm 1 has run for many generations without an obvious change of the optimal fitness value, go to Step 9; otherwise go to the next step.
Step 6: gen = gen + 1.
Step 7: Send the excellent chromosomes in Chrom(gen − 1) into the current child generation population SelCh(gen) according to the parameter sub_n, and perform the crossover and mutation operations. The size of the obtained new population New_SelCh(gen) is less than that of the latest generation population Chrom(gen − 1).
Step 8: Substitute the optimal chromosomes (whose number equals the product of substitute_P and the size of New_SelCh(gen)) from population New_SelCh(gen) for the inferior chromosomes in population Chrom(gen − 1) to form population Chrom(gen). The current generation population Chrom(gen) has the same size as the last generation population Chrom(gen − 1). Then go to Step 4.
Step 9: End the training procedure and output the optimal chromosome.
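As an illustration of Eqs. (25) and (26), the ranking-based fitness assignment and the proportional selection step can be coded as follows. This is a minimal Python sketch, not the authors' MATLAB implementation; the population and objective values are toy placeholders, while Pop_size = 80 and sp = 0.2 follow Section 5.

```python
import numpy as np

rng = np.random.default_rng(0)

def ranking_fitness(obj, sp=0.2):
    """Linear-ranking fitness, Eq. (25): chromosomes are ranked in descending
    order of objective value; Pos is the 1-based position in that ranking."""
    order = np.argsort(-obj)                      # descending objective values
    pos = np.empty_like(order)
    pos[order] = np.arange(1, len(obj) + 1)
    return 2 - sp + 2 * (sp - 1) * (pos - 1) / (len(obj) - 1)

def select(pop, fitn, n):
    """Proportional (roulette-wheel) selection with probabilities of Eq. (26)."""
    p = fitn / fitn.sum()
    return pop[rng.choice(len(pop), size=n, p=p)]

pop = rng.uniform(0.01, 1.0, size=(80, 2))   # chromosomes = (nu, a); Pop_size = 80
obj = rng.uniform(size=80)                   # placeholder objective values
fitn = ranking_fitness(obj, sp=0.2)
parents = select(pop, fitn, n=40)            # intermediate population
```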
The steps of the fault diagnosis method based on FWν-SVCM are described as follows:

Step 1: Initialize the original data by fuzzification and normalization, and form the training and testing sample sets.
Step 2: Select the wavelet kernel function K(x, x′), call Algorithm 1 and obtain the optimal parameters. Construct the QP problem (19) of the FWν-SVCM.
Step 3: Solve the optimization problem (22) and obtain the parameters α_ki.
Step 4: For a new diagnosing task, extract the fault factors and form a set of input variables x. Then compute the diagnosing results by Eq. (24).

5. Experiment

To analyze the performance of the FWν-SVCM model, the fault diagnosis of a car assembly line is studied. The car assembly line is a type of fuzzy fault system influenced by the manufacturing equipment and by state factors of the production environment, and its diagnosis is usually driven by many uncertain factors. In our experiments, car assembly line patterns are selected from past production records of a typical company. The detailed characteristic data and fault pattern series of the car assembly line compose the corresponding training and testing sample sets. During the car assembly line fault diagnosis, eight influencing factors are taken into account: numerical information, namely oscillating signals from machines (A, B, D, E, H), and linguistic information (C, F and G). All the numerical variables are normalized values, although they are not marked by bars. The fault samples are shown in Table 1.

The FWν-SVCM has been implemented in the Matlab 7.1 programming language. The experiments are made on a 1.80 GHz Core(TM) 2 CPU personal computer with 1.0 GB memory under Microsoft Windows XP Professional. The initial parameters are Max_gen = 100, Max_cgen = 100, Pop_size = 80, Chaos_Pop_size = 80, sub_n = 0.9, Pc = 0.9, Pm = 0.1, substitute_P = 0.9, sp = 0.2, ν ∈ [0.01, 1] and σ ∈ [0.01, 1]. The Gaussian radial basis function is chosen as the kernel function of the Fν-SVCM model, with its two parameters determined as ν ∈ (0, 1] and σ ∈ (0, 1]. The Morlet wavelet kernel function is used as the kernel function of the FWν-SVCM model. The optimal combination of parameters obtained by the modified GA is ν = 0.97 and a = 0.84. The change trend of the fitness function is shown in Fig. 4; it is obvious that the proposed GA converges.

To analyze the diagnosing capability of the FWν-SVCM model, three further models (fuzzy ν-SVCM, standard ν-SVCM and a fuzzy neural network (FNN)) are selected to handle the above car assembly line patterns. For the standard ν-SVCM model and the FNN model, the centers of the triangular fuzzy numbers of the original samples are used as the sample data. To analyze the error trend, the diagnosing results of the latest 10 pattern points are used to evaluate the diagnosis performance of the above models; the results are shown in Tables 2 and 3. The diagnosing precision of FWν-SVCM is better than that of the standard ν-SVCM, the Fν-SVCM and the FNN. In the FWν-SVCM model, the largest diagnosing error is 0.03 (No. 2) and the smallest is 0 (Nos. 1, 8 and 9). In the ν-SVCM model, the largest error is 0.05 (Nos. 7 and 10) and the smallest is 0.02 (Nos. 2, 3, 4 and 9). In the Fν-SVCM model, the largest error is 0.03 (No. 3) and the smallest is 0 (Nos. 6 and 9). In the FNN model, the largest error is 0.14 (No. 8) and the smallest is 0.01 (No. 6).
The diagnosing errors of the latest 10 pattern points for the above models are shown in Table 3. Considering the enterprise manufacturing environment, some errors inevitably exist in the process of data gathering and estimation. Thus, the above diagnosing results are satisfying.
Table 1. Sample set of fault diagnosis.

No.   A       B       C     D       E       F     G     H       Membership
1     0.5979  0.0743  0.89  0.1034  0.9223  0.34  0.22  0.0272  0.49(+)
2     0.9492  0.1932  0.94  0.1573  0.5612  0.56  0.65  0.7937  0.78(+)
3     0.2888  0.3796  0.34  0.4075  0.6523  0.12  0.05  0.9992  0.43(−)
4     0.8888  0.2764  0.43  0.4078  0.7727  0.17  0.23  0.1102  0.93(−)
5     0.1016  0.7709  0.47  0.0527  0.1062  0.28  0.67  0.6226  0.49(−)
6     0.0653  0.3139  0.14  0.9418  0.0011  0.56  0.31  0.1326  0.54(−)
7     0.2343  0.6382  0.13  0.15    0.5418  0.49  0.72  0.31    0.86(−)
8     0.9331  0.9866  0.53  0.3844  0.0069  0.95  0.95  0.1348  0.79(+)
9     0.0631  0.5029  0.72  0.3111  0.4513  0.23  0.13  0.2233  0.52(+)
...
56    0.5163  0.7157  0.17  0.90    0.2312  0.31  0.16  0.8021  0.47(−)
57    0.4582  0.2507  0.01  0.28    0.4161  0.27  0.16  0.6683  0.42(+)
58    0.7032  0.9339  0.79  0.07    0.2988  0.54  0.31  0.671   0.79(−)
59    0.5825  0.1372  0.51  0.48    0.6724  0.16  0.02  0.8206  0.51(+)
60    0.5092  0.5216  0.21  0.98    0.9383  0.21  0.36  0.5216  0.61(+)
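Step 1 of the diagnosis method fuzzifies the normalized readings into symmetric triangular fuzzy numbers. The exact fuzzification rule is not spelled out in the paper, so the following Python sketch simply attaches an assumed constant spread to each center; the spread value is a placeholder for illustration only.

```python
import numpy as np

def fuzzify(samples, spread=0.05):
    """Turn crisp normalized readings (rows of Table 1) into symmetric TFNs
    (r, dr). The constant spread is an assumption, not the authors' rule."""
    centers = np.asarray(samples, dtype=float)
    spreads = np.full_like(centers, spread)
    return centers, spreads

row1 = [0.5979, 0.0743, 0.89, 0.1034, 0.9223, 0.34, 0.22, 0.0272]  # sample No. 1
r_x, dr_x = fuzzify(row1)
```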
Fig. 4. The change trend of the fitness function.
Table 2. The diagnosis results for the latest 10 pattern points.

No.  Actual membership  Actual pattern  FNN   Pattern  ν-SVCM  Pattern  Fν-SVCM  Pattern  FWν-SVCM  Pattern
1    0.59               −1              0.52  −1       0.56    −1       0.57     −1       0.59      −1
2    0.61               +1              0.70  +1       0.63    +1       0.60     +1       0.58      +1
3    0.78               −1              0.83  −1       0.76    −1       0.75     −1       0.80      −1
4    0.43               −1              0.50  +1       0.41    −1       0.41     −1       0.42      −1
5    0.53               +1              0.56  +1       0.56    +1       0.54     +1       0.55      +1
6    0.47               −1              0.46  −1       0.43    −1       0.47     −1       0.48      −1
7    0.42               +1              0.49  +1       0.47    +1       0.44     +1       0.43      +1
8    0.79               −1              0.65  −1       0.75    −1       0.77     −1       0.79      −1
9    0.51               +1              0.48  −1       0.49    −1       0.51     +1       0.51      +1
10   0.61               +1              0.52  +1       0.56    +1       0.59     +1       0.60      +1
Table 3. Diagnosing error of the latest 10 pattern points.

Model     1     2     3     4     5     6     7     8     9     10
FNN       0.07  0.09  0.05  0.07  0.03  0.01  0.07  0.14  0.03  0.09
ν-SVCM    0.03  0.02  0.02  0.02  0.03  0.04  0.05  0.04  0.02  0.05
Fν-SVCM   0.02  0.01  0.03  0.02  0.01  0     0.02  0.02  0     0.02
FWν-SVCM  0     0.03  0.02  0.01  0.02  0.01  0.01  0     0     0.01
Table 4. Precision analysis of fault diagnosis.

Model     OCER (%)  TCER (%)  ER (%)
FNN       10        10        20
ν-SVCM    10        0         10
Fν-SVCM   0         0         0
FWν-SVCM  0         0         0

Note: OCER: one-class-error ratio; TCER: two-class-error ratio; ER: error ratio.
The results of the application to the car assembly line indicate that the diagnosing method based on FWν-SVCM is effective and feasible. To analyze the diagnosing capability of the proposed model further, a comparison among the different diagnosing approaches is shown in Table 4. The classifier indexes adopted are the one-class-error ratio (the negative class is judged to be positive) and the two-class-error ratio (the positive class is judged to be negative). It is obvious that the classifier indexes of FWν-SVCM and Fν-SVCM are better than those of ν-SVCM and FNN. Moreover, the
precision of FWν-SVCM is better than that of Fν-SVCM. These results show that the proposed diagnosing model based on FWν-SVCM is effective and feasible.

The left and right parts of a triangular fuzzy number can represent the uncertain information part of expert knowledge. In an ordinary fuzzy SVCM, all fuzzy information is transformed into a crisp number or a membership, and the diagnosing analysis is based on the resulting sample set of crisp numbers. The major novelty of the present work is that the inputs and outputs are described by triangular fuzzy numbers, which allows a more effective description of a diagnosing system involving uncertainties. Additionally, the parameters appearing in the formulation are determined via a GA-based search. Compared with the ordinary SVCM, the proposed model, whose constraint conditions are three times those of the standard SVCM, establishes the optimization problem based on the left, middle and right parts of the triangular fuzzy number, respectively. In a word, uncertain information is considered in the establishment of the novel FWν-SVCM, which makes it suitable for complex nonlinear fuzzy diagnosing problems with uncertain influencing factors.

6. Conclusion

In this paper, a new version of SVM, named FWν-SVCM, is proposed to establish a nonlinear diagnosing system for a car assembly line. The parameter b does not appear in the classifier output function of the modified fuzzy ν-SVCM with wavelet kernels. The FWν-SVCM model can handle fuzzy fault patterns and provides better classifier precision than the fuzzy neural network. Therefore, the FWν-SVCM, which can deal effectively with fault diagnosis under fuzzy input variables, extends the application scope of the support vector classifier machine. The performance of the FWν-SVCM is evaluated using the fault patterns of a car assembly line, and the simulation results demonstrate that the FWν-SVCM is effective in handling uncertain data and finite samples. Moreover, it is shown that the modified GA presented here is suitable for seeking the optimal parameters of the FWν-SVCM.
Acknowledgements

This research was partly supported by the National Natural Science Foundation of China under Grants 60904043 and 70761002, a research grant funded by the Hong Kong Polytechnic University (G-YX5J), the China Postdoctoral Science Foundation (20090451152), the Jiangsu Planned Projects for Postdoctoral Research Funds (0901023C) and the Southeast University Planned Projects for Postdoctoral Research Funds.

Appendix A

A.1. Proof of Theorem 1

Proof. The lower boundary of the λ-cut of A satisfies

$$\lambda=\frac{\underline{A}(\lambda)-(r_A-\Delta r_A)}{\Delta r_A}.\qquad(26)$$

Then we have $\underline{A}(\lambda)=r_A+(\lambda-1)\Delta r_A$. In the same way, we can obtain $\overline{A}(\lambda)=r_A+(1-\lambda)\overline{\Delta r}_A$, $\underline{B}(\lambda)=r_B+(\lambda-1)\Delta r_B$ and $\overline{B}(\lambda)=r_B+(1-\lambda)\overline{\Delta r}_B$. According to the definition in Eq. (2), we get

$$\begin{aligned}D(A,B)&=\sup_{\lambda}\max\{|(r_A-r_B)+(\lambda-1)(\Delta r_A-\Delta r_B)|,\ |(r_A-r_B)+(1-\lambda)(\overline{\Delta r}_A-\overline{\Delta r}_B)|\}\\ &=\max\{\sup_{\lambda}|(r_A-r_B)+(\lambda-1)(\Delta r_A-\Delta r_B)|,\ \sup_{\lambda}|(r_A-r_B)+(1-\lambda)(\overline{\Delta r}_A-\overline{\Delta r}_B)|\}.\end{aligned}\qquad(27)$$

For the given triangular fuzzy numbers A and B, $(r_A-r_B)+(\lambda-1)(\Delta r_A-\Delta r_B)$ and $(r_A-r_B)+(1-\lambda)(\overline{\Delta r}_A-\overline{\Delta r}_B)$ are two linear functions of λ. As λ ∈ [0, 1], the following formulas must hold:

$$\sup_{\lambda}|(r_A-r_B)+(\lambda-1)(\Delta r_A-\Delta r_B)|=\max\{|(r_A-\Delta r_A)-(r_B-\Delta r_B)|,\ |r_A-r_B|\},\qquad(28)$$

$$\sup_{\lambda}|(r_A-r_B)+(1-\lambda)(\overline{\Delta r}_A-\overline{\Delta r}_B)|=\max\{|(r_A+\overline{\Delta r}_A)-(r_B+\overline{\Delta r}_B)|,\ |r_A-r_B|\}.\qquad(29)$$

Substituting (28) and (29) into (27), we obtain Eq. (3). This completes the proof of Theorem 1. □

Appendix B

B.1. Proof of Deduction 1

Proof. For symmetric triangular fuzzy numbers A and B, we have $\overline{\Delta r}_A=\Delta r_A$ and $\overline{\Delta r}_B=\Delta r_B$. From Theorem 1, the following must hold:

$$\begin{aligned}D(A,B)&=\max\{|(r_A-\Delta r_A)-(r_B-\Delta r_B)|,\ |r_A-r_B|,\ |(r_A+\Delta r_A)-(r_B+\Delta r_B)|\}\\ &=\max\{|(r_A-r_B)-(\Delta r_A-\Delta r_B)|,\ |r_A-r_B|,\ |(r_A-r_B)+(\Delta r_A-\Delta r_B)|\}\\ &=\max\{|(r_A-r_B)-(\Delta r_A-\Delta r_B)|,\ |(r_A-r_B)+(\Delta r_A-\Delta r_B)|\}\\ &=\max\{|(r_A-\Delta r_A)-(r_B-\Delta r_B)|,\ |(r_A+\Delta r_A)-(r_B+\Delta r_B)|\}.\end{aligned}\qquad(30)$$

This completes the proof of Deduction 1. □

Appendix C

C.1. Proof of Theorem 3

Proof. According to Lemma 3, it is sufficient to prove the inequality

$$F[K](\omega)=(2\pi)^{-l/2}\int_{T(R)^{ld}}\exp(-j(\omega\cdot r_x))\,K(r_x)\,dr_x\ge 0,\qquad(31)$$

where $K(r_x)=\prod_{i=1}^{l}\psi\!\left(\frac{r_{x_i}}{a_i}\right)=\prod_{i=1}^{l}\cos\!\left(1.75\,\frac{r_{x_i}}{a_i}\right)\exp\!\left(-\frac{\|r_{x_i}\|^{2}}{2a_i^{2}}\right)$. First, we calculate the integral term

$$\begin{aligned}\int_{T(R)^{ld}}\exp(-j(\omega\cdot r_x))\,K(r_x)\,dr_x&=\int_{T(R)^{ld}}\exp(-j(\omega\cdot r_x))\prod_{i=1}^{l}\cos\!\left(1.75\,\frac{r_{x_i}}{a_i}\right)\exp\!\left(-\frac{\|r_{x_i}\|^{2}}{2a_i^{2}}\right)dr_x\\ &=\prod_{i=1}^{l}\int_{-\infty}^{+\infty}\exp(-j\omega_i r_{x_i})\,\frac{\exp(j1.75\,r_{x_i}/a_i)+\exp(-j1.75\,r_{x_i}/a_i)}{2}\exp\!\left(-\frac{\|r_{x_i}\|^{2}}{2a_i^{2}}\right)dr_{x_i}\\ &=\prod_{i=1}^{l}\frac12\int_{-\infty}^{+\infty}\left(\exp\!\left(-\frac{\|r_{x_i}\|^{2}}{2a_i^{2}}+j\left(\frac{1.75}{a_i}-\omega_i\right)r_{x_i}\right)+\exp\!\left(-\frac{\|r_{x_i}\|^{2}}{2a_i^{2}}-j\left(\frac{1.75}{a_i}+\omega_i\right)r_{x_i}\right)\right)dr_{x_i}\\ &=\prod_{i=1}^{l}\frac{|a_i|\sqrt{2\pi}}{2}\left(\exp\!\left(-\frac{(1.75-\omega_i a_i)^{2}}{2}\right)+\exp\!\left(-\frac{(1.75+\omega_i a_i)^{2}}{2}\right)\right).\end{aligned}\qquad(32)$$

Substituting Eq. (32) into (31), we then have

$$F[K](\omega)=\prod_{i=1}^{l}\frac{|a_i|}{2}\left(\exp\!\left(-\frac{(1.75-\omega_i a_i)^{2}}{2}\right)+\exp\!\left(-\frac{(1.75+\omega_i a_i)^{2}}{2}\right)\right).\qquad(33)$$

When $a_i\ne 0$, we have

$$F[K](\omega)\ge 0.\qquad(34)$$

This completes the proof of Theorem 3. □
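As a quick numeric sanity check of Eqs. (33) and (34) (an illustration, not from the paper), the factorized Fourier transform of a one-factor Morlet kernel can be evaluated over a grid of frequencies and confirmed nonnegative:

```python
import numpy as np

w = np.linspace(-5, 5, 101)   # frequency grid (illustrative)
a = 0.84                      # the GA-selected scale from Section 5

# One factor of Eq. (33); strictly positive for any a != 0.
ft = np.abs(a) / 2 * (np.exp(-(1.75 - w * a)**2 / 2)
                      + np.exp(-(1.75 + w * a)**2 / 2))
assert (ft >= 0).all()        # Eq. (34)
```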
References

Avci, E. (2009). Selecting of the optimal feature subset and kernel parameters in digital modulation classification by using hybrid genetic algorithm–support vector machines: HGASVM. Expert Systems with Applications, 36(2), 1391–1402.
Cevikalp, H., Neamtu, M., & Barkana, A. (2007). The kernel common vector method: A novel nonlinear subspace classifier for pattern recognition. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 37(4), 937–951.
Chalimourda, A., Schölkopf, B., & Smola, A. J. (2004). Experimentally optimal ν in support vector regression for different noise models and parameter settings. Neural Networks, 17(1), 127–141.
Chen, X., Li, Y., Harrison, R., & Zhang, Y. Q. (2008). Type-2 fuzzy logic-based classifier fusion for support vector machines. Applied Soft Computing, 8(3), 1222–1231.
Chiu, D. Y., & Chen, P. J. (2009). Dynamically exploring internal mechanism of stock market by fuzzy-based support vector machines with high dimension input space and genetic algorithm. Expert Systems with Applications, 36(2), 1240–1248.
Doumpos, M., Zopounidis, C., & Golfinopoulou, V. (2007). Additive support vector machines for pattern classification. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 37(3), 540–550.
Fei, S. W., Liu, C. L., & Miao, Y. B. (2009). Support vector machine with genetic algorithm for forecasting of key-gas ratios in oil-immersed transformer. Expert Systems with Applications, 36(3), 6326–6331.
Guo, G. D., & Li, S. Z. (2003). Content-based audio classification and retrieval by support vector machines. IEEE Transactions on Neural Networks, 14(1), 209–215.
Hu, P. J. H., Cheng, T. H., Wei, C. P., Yu, C. H., Chan, A. L. F., & Wang, H. Y. (2007). Managing clinical use of high-alert drugs: A supervised learning approach to pharmacokinetic data analysis. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 37(4), 481–492.
Huysmans, J., Setiono, R., Baesens, B., & Vanthienen, J. (2008). Minerva: Sequential covering for rule extraction. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 38(2), 299–309.
Jayadeva, R., & Chandra, S. (2007). Twin support vector machines for pattern classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(5), 905–910.
Juang, C. F., Chiu, S. H., & Shiu, S. J. (2007). Fuzzy system learned through fuzzy clustering and support vector machine for human skin color segmentation. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 37(6), 1077–1087.
Lin, S. F., Chen, J. Y., & Chao, H. X. (2001). Estimation of number of people in crowded scenes using perspective transformation. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 31(6), 645–654.
Peng, J., Zhang, P., & Riedel, N. (2008). Discriminant learning analysis. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 38(6), 1614–1625.
Ratsch, G., Mika, S., Scholkopf, B., & Muller, K. R. (2002). Constructing boosting algorithms from SVMs: An application to one-class classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(9), 1184–1199.
Rueda, I. E. A., Arciniegas, F. A., & Embrechts, M. J. (2004). SVM sensitivity analysis: An application to currency crises aftermaths. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 34(3), 387–398.
Schölkopf, B., Smola, A. J., Williamson, R. C., & Bartlett, P. L. (2000). New support vector algorithms. Neural Computation, 12(5), 1207–1245.
Shieh, M. D., & Yang, C. C. (2008). Classification model for product form design using fuzzy support vector machines. Computers & Industrial Engineering, 55(1), 150–164.
Si, F., Romero, C. E., Yao, Z., Schuster, E., Xu, Z., Morey, R. L., et al. (2009). Optimization of coal-fired boiler SCRs based on modified support vector machine models and genetic algorithms. Fuel, 88(5), 806–816.
Vapnik, V. N. (1999). An overview of statistical learning theory. IEEE Transactions on Neural Networks, 10(5), 988–999.
Vapnik, V. N. (2000). The nature of statistical learning theory. New York: Springer-Verlag.
Wang, L., Xue, P., & Chan, K. L. (2008). Two criteria for model selection in multiclass support vector machines. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 38(6), 1432–1448.
Widodo, A., & Yang, B. S. (2008). Wavelet support vector machine for induction machine fault diagnosis based on transient current signal. Expert Systems with Applications, 35, 307–316.
Wu, Q. (2009). The forecasting model based on wavelet ν-support vector machine. Expert Systems with Applications, 36(4), 7604–7610.
Wu, Q., & Law, R. (2010). Complex system fault diagnosis based on a fuzzy robust wavelet support vector classifier and an adaptive Gaussian particle swarm optimization. Information Sciences, 180(23), 4514–4528.
Wu, Q., & Law, R. (2011). Cauchy mutation based on objective variable of Gaussian particle swarm optimization for parameters selection of SVM. Expert Systems with Applications, 38(6), 6405–6411.
Wu, C. H., Tzeng, G. H., & Lin, R. H. (2009). A novel hybrid genetic algorithm for kernel function and parameter optimization in support vector regression. Expert Systems with Applications, 36(3), 4725–4735.
Wu, Q., Wu, S. Y., & Liu, J. (2010). Hybrid model based on SVM with Gaussian loss function and adaptive Gaussian PSO. Engineering Applications of Artificial Intelligence, 23(4), 487–494.
Yang, C. H., Jin, L. C., & Chuang, L. Y. (2006). Fuzzy support vector machines for adaptive Morse code recognition. Medical Engineering & Physics, 28(9), 925–931.
Zhang, L., Zhou, W., & Jiao, L. (2004). Wavelet support vector machine. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 34(1), 34–39.
Zhu, J., Hoi, S. C. H., & Lyu, M. R. T. (2008). Robust regularized kernel regression. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 38(6), 1639–1644.