The forecasting model based on wavelet ν-support vector machine


Expert Systems with Applications 36 (2009) 7604–7610


Qi Wu *

Key Laboratory of Measurement and Control of Complex Systems of Engineering (Ministry of Education), School of Automation, Southeast University, Nanjing, Jiangsu 210096, China

* Tel.: +86 25 51166581; fax: +86 25 511665260. E-mail address: [email protected]


Keywords: Wavelet kernel; Support vector machine (SVM); Particle swarm optimization; Sales forecasting

Abstract

Sales series exhibit small samples, seasonality, nonlinearity, randomness and fuzziness, and the existing support vector kernels cannot approximate the random curves of such time series in the L²(Rⁿ) space (the space of square-integrable functions). A new wavelet support vector machine (WN ν-SVM) is proposed based on wavelet theory and a modified support vector machine. A particle swarm optimization (PSO) algorithm is designed to select the best parameters of the WN ν-SVM model within the permitted constraint ranges. The results of an application to car sale series forecasting show that the forecasting approach based on the PSOWN ν-SVM model is effective and feasible; a comparison between the proposed method and other ones is also given, which shows that it outperforms PSOW ν-SVM and other traditional methods. © 2008 Elsevier Ltd. All rights reserved.

1. Introduction

Recently, a machine learning technique called the support vector machine (SVM) has drawn much attention in the fields of pattern classification and regression forecasting. SVM was first introduced by Vapnik in 1995 (Vapnik, 1995). It is a learning method for classifiers grounded in statistical learning theory. The algorithm derives from the linear classifier and solves the two-class classification problem; it was later extended to non-linear problems, where one finds the optimal (large-margin) hyperplane separating the sample set. It is an approximate implementation of the structural risk minimization (SRM) principle of statistical learning theory, rather than of the empirical risk minimization (ERM) method (Kwok, 1999). Compared with traditional neural networks, SVM uses structural risk minimization to avoid over-fitting, the curse of dimensionality, local minima and related problems, and it generalizes well on small sample sets. SVM has also been used successfully for machine learning with large, high-dimensional data sets. These attractive properties make SVM a promising technique, because the generalization of an SVM does not depend on the complete training data but only on a subset thereof, the so-called support vectors. SVM has now been applied in many fields, such as handwriting recognition, three-dimensional object recognition, face recognition, text image recognition, voice recognition and regression analysis (Carbonneau, Laframboise, & Vahidov, 2008; Chen & Hsieh, 2006; Huang, 2008; Seo, 2007; Trontl, Smuc, & Pevec, 2007; Wohlberg, Tartakovsky,

& Guadagnini, 2006; Wu, Yan, & Yang, 2008, in press; Wu & Yan, in press a,b,c). For pattern recognition and regression analysis, the non-linear ability of SVM is achieved through kernel mapping, and the kernel function must satisfy the condition of the Mercer theorem. The Gaussian function is the most commonly used kernel and shows good generalization ability. However, with the kernel functions used so far, the SVM cannot approximate an arbitrary curve in the L²(Rⁿ) space (the space of square-integrable functions), because these kernels do not form a complete orthonormal basis; for the same reason, the regression SVM cannot approximate every function. We therefore need a new kernel function that can generate a complete basis through horizontal floating (translation) and flexing (dilation). Such functions already exist: the wavelet functions. Based on wavelet decomposition, this paper proposes an allowable support vector kernel function, named the wavelet kernel function, and shows that such kernels exist. The Morlet and Mexican hat wavelet kernel functions are approximately orthonormal bases of the L²(Rⁿ) space. Based on wavelet analysis and the conditions on support vector kernel functions, a Morlet or Mexican hat wavelet kernel for the support vector regression machine is proposed; being approximately orthonormal, this kernel can approximate almost any curve in the space of square-integrable functions, and it thus enhances the generalization ability of the SVM. Khandoker, Lai, Begg, and Palaniswami (2007) and Widodo and Yang (2008) study the wavelet ε-support vector regression machine, and much research indicates that the performance of ν-SVM is better than that of ε-SVM. Building on the wavelet kernel function and regularization theory,


a ν-support vector regression machine on the wavelet kernel function (W ν-SVM) is proposed in this paper. To overcome the difficulty of solving for the optimal parameter b of the W ν-SVM model, the influence of b is absorbed into the confidence-interval (regularization) term of the W ν-SVM model, so that b no longer appears in the regression output function of the modified model. The modified W ν-SVM, built according to the structural risk minimization (SRM) principle, is a new version of W ν-SVM, named WN ν-SVM. Based on the WN ν-SVM, an intelligent forecasting approach for sale series with multi-dimensional, nonlinear, seasonal and uncertain characteristics is proposed. Section 2 constructs an intelligent forecasting model based on the new ν-support vector regression machine with a wavelet kernel (WN ν-SVM) and a particle swarm optimization (PSO) algorithm. Section 3 gives an application of the intelligent forecasting system based on the WN ν-SVM model optimized by the PSO algorithm. Section 4 draws the conclusions.

2. Wavelet ν-support vector machine (W ν-SVM)

2.1. The conditions for a wavelet support vector kernel function

A support vector kernel function can be not only a dot-product function, K(x, x') = K(⟨x, x'⟩), but also a horizontal floating (translation-invariant) function, K(x, x') = K(x − x'). In fact, any function that satisfies the Mercer condition is an allowable support vector kernel function.

Theorem 1. A symmetric function K(x, x') is a kernel function of the SVM if and only if, for every function g ≠ 0 satisfying \int_{R^d} g^2(\xi)\, d\xi < \infty, the following condition holds:

\iint K(x, x')\, g(x)\, g(x')\, dx\, dx' \ge 0. \quad (1)

This theorem provides a simple method to construct kernel functions. A horizontal floating function, however, can hardly be divided into the product of two identical functions, so a separate condition is given for horizontal floating kernel functions.

Theorem 2. A horizontal floating function K(x − x') is an allowable support vector kernel function if and only if the Fourier transform of K(x) satisfies

F[K](\omega) = (2\pi)^{-n/2} \int_{R^n} \exp(-j(\omega \cdot x))\, K(x)\, dx \ge 0. \quad (2)

2.2. Wavelet kernel function

If the wavelet function \psi(x) satisfies the conditions \psi(x) \in L^2(R) \cap L^1(R) and \hat{\psi}(0) = 0, where \hat{\psi} is the Fourier transform of \psi(x), the wavelet function group can be defined as

\psi_{a,m}(x) = a^{-1/2}\, \psi\!\left(\frac{x - m}{a}\right), \quad (3)

where a is the so-called scaling parameter, m is the horizontal floating (translation) coefficient, and \psi(x) is called the "mother wavelet". The translation parameter m (m \in R) and the dilation parameter a (a > 0) may be continuous or discrete. For a function f(x) \in L^2(R), the wavelet transform of f(x) can be defined as

W(a, m) = a^{-1/2} \int_{-\infty}^{+\infty} f(x)\, \psi^*\!\left(\frac{x - m}{a}\right) dx, \quad (4)

where \psi^*(x) stands for the complex conjugate of \psi(x).

The wavelet transform W(a, m) can be considered as a function of the translation m at each scale a. Eq. (4) indicates that wavelet analysis is a time-frequency, or time-scale, analysis. Different from the short-time Fourier transform (STFT), the wavelet transform performs multi-scale analysis of a signal through dilation and translation, so it can extract the time-frequency features of a signal effectively. The wavelet transform is also invertible, which makes it possible to reconstruct the original signal. A classical inversion formula for f(x) is

f(x) = C_\psi^{-1} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} W(a, m)\, \psi_{a,m}(x)\, \frac{da\, dm}{a^2}, \quad (5)

where

C_\psi = \int \frac{|\hat{\psi}(\omega)|^2}{|\omega|}\, d\omega < \infty, \qquad \hat{\psi}(\omega) = \int \psi(x) \exp(-j\omega x)\, dx. \quad (6)

In Eq. (5), C_\psi is a constant depending only on \psi(x). The idea of wavelet decomposition is to approximate the function f(x) by a linear combination of the wavelet function group. If the one-dimensional wavelet function is \psi(x), then, using tensor product theory, the multi-dimensional wavelet function can be defined as

\psi_l(x) = \prod_{i=1}^{l} \psi(x_i), \quad x \in R^{ld}, \quad (7)

where x is a column vector with d dimensions. We can then build the horizontal floating kernel function

K(x, x') = \prod_{i=1}^{l} \psi(x_i - x'_i), \quad x \in R^{ld}, \quad (8)

and, introducing a scale for each component,

K(x, x') = \prod_{i=1}^{l} \psi\!\left(\frac{x_i - x'_i}{a_i}\right), \quad (9)

where a_i is the scaling parameter of the wavelet, a_i > 0. Because a wavelet kernel function must satisfy the conditions of Theorem 2, only a few existing functions can be shown to yield admissible wavelet kernels. We now give one such kernel: the Morlet wavelet kernel function. The Morlet wavelet function is defined as

\psi(x) = \cos(\omega_0 x) \exp\!\left(-\frac{x^2}{2}\right), \quad x \in R,\ \omega_0 \in R. \quad (10)

The Morlet wavelet kernel function is defined as

K(x, x') = \prod_{i=1}^{l} \cos\!\left(\omega_0 \frac{x_i - x'_i}{a_i}\right) \exp\!\left(-\frac{\|x_i - x'_i\|^2}{2 a_i^2}\right), \quad a_i \in R, \quad (11)

and this kernel function is an allowable support vector kernel function. If we use the wavelet kernel function as the support vector kernel, the estimate of the wavelet ν-support vector machine (W ν-SVM) is

f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) \prod_{j=1}^{l} \psi\!\left(\frac{x^j - x_i^j}{a_i}\right) + b, \quad b \in R. \quad (12)
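As an illustration, here is a minimal Python sketch of the Morlet wavelet kernel of Eq. (11) and of its Gram matrix, assuming numpy and an illustrative choice ω₀ = 1.75 (the paper does not fix ω₀; its own implementation is in Matlab):

```python
import numpy as np

def morlet_kernel(x, x_prime, a, w0=1.75):
    """Morlet wavelet kernel, Eq. (11):
    K(x, x') = prod_i cos(w0 (x_i - x'_i)/a_i) exp(-(x_i - x'_i)^2 / (2 a_i^2)).
    `a` may be a scalar or a per-dimension vector of scale parameters."""
    d = (np.asarray(x, float) - np.asarray(x_prime, float)) / np.asarray(a, float)
    return float(np.prod(np.cos(w0 * d) * np.exp(-0.5 * d ** 2)))

def gram_matrix(X, a, w0=1.75):
    """Kernel matrix K[i, j] = K(X[i], X[j]) for training samples X of shape (l, d)."""
    l = X.shape[0]
    K = np.empty((l, l))
    for i in range(l):
        for j in range(i, l):
            K[i, j] = K[j, i] = morlet_kernel(X[i], X[j], a, w0)
    return K
```

Because the kernel depends only on the difference x − x', it is a horizontal floating kernel in the sense of Theorem 2.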

For wavelet analysis and theory, see Krantz (1994) and Liu and Di (1992).

2.3. The proposed wavelet ν-support vector machine (WN ν-SVM)

The optimal hyperplane of the standard support vector machine can be written as f(x) = ⟨w, x⟩ + b. Consider the Hilbert space composed of the augmented vectors x̃ = (xᵀ, g)ᵀ, and define the inner product ⟨x̃₁, x̃₂⟩ = ⟨x₁, x₂⟩ + g² between x̃₁ = (x₁ᵀ, g)ᵀ and x̃₂ = (x₂ᵀ, g)ᵀ. Let w̃ = (wᵀ, b/g)ᵀ, where g ≠ 0; then f̃(x̃) = ⟨w̃, x̃⟩ = ⟨w, x⟩ + (b/g)·g = ⟨w, x⟩ + b = f(x). Therefore, the


function f̃(x̃) = ⟨w̃, x̃⟩ is another expression of the optimal hyperplane function f(x), and the new support vector machine is built on f̃(x̃). At the same time, combining the wavelet kernel function with the ν-support vector machine gives a new SVM learning algorithm, the ν-support vector machine on the wavelet kernel function (W ν-SVM). Taking the parameter b into the confidence-interval term of the W ν-SVM and forming the new variable w̃ of the optimization problem, the new wavelet ν-support vector machine (WN ν-SVM), whose ε-insensitive tube is illustrated in Fig. 1, can be formulated as

\min_{w, \xi^{(*)}, \varepsilon, b}\ \tau(w, \xi^{(*)}, \varepsilon) = \frac{1}{2}\left(\|w\|^2 + b^2\right) + C\left(\nu\varepsilon + \frac{1}{l}\sum_{i=1}^{l}(\xi_i + \xi_i^*)\right), \quad (13)

subject to

(\langle w, x_i \rangle + b) - y_i \le \varepsilon + \xi_i, \quad (14)

y_i - (\langle w, x_i \rangle + b) \le \varepsilon + \xi_i^*, \quad (15)

\xi_i^{(*)} \ge 0, \quad \varepsilon \ge 0, \quad (16)

where w is a column vector with d dimensions, C > 0 is a penalty factor, \xi_i^{(*)} (i = 1, ..., l) are slack variables and \nu \in (0, 1] is an adjustable regularization parameter. Problem (13) is a quadratic programming (QP) problem. By means of the Wolfe dual principle, the wavelet kernel function technique and the Karush–Kuhn–Tucker (KKT) conditions, we obtain the dual problem (17) of the original optimization problem (13):

\max_{\alpha, \alpha^*}\ W(\alpha, \alpha^*) = -\frac{1}{2}\sum_{i,j=1}^{l}(\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\left(K(x_i, x_j) + 1\right) + \sum_{i=1}^{l}(\alpha_i - \alpha_i^*)\, y_i, \quad (17)

\text{s.t.}\quad 0 \le \alpha_i, \alpha_i^* \le \frac{C}{l}, \quad (18)

\sum_{i=1}^{l}(\alpha_i + \alpha_i^*) \le C\,\nu. \quad (19)

Select appropriate parameters C and ν, and take as the kernel of the WN ν-SVM model the optimal mother wavelet function that matches the original series well over some range of scales. The WN ν-SVM output function is then

f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*)\left(\prod_{j=1}^{l} \psi\!\left(\frac{x^j - x_i^j}{a_i}\right) + 1\right). \quad (20)
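As a minimal sketch of how the dual (17)-(19) could be solved numerically, the following uses the cvxpy modelling library (an assumed tool; the paper's implementation is in Matlab) together with the morlet_kernel helper sketched earlier. The K + 1 term absorbs the bias b, so the prediction of Eq. (20) needs no separate intercept; the defaults C = 1000 and ν = 0.98 are the PSO-optimal values reported in Section 3.

```python
import numpy as np
import cvxpy as cp

def train_wn_nu_svm(K, y, C=1000.0, nu=0.98):
    """Solve the dual (17)-(19) of the WN nu-SVM.
    K: (l, l) wavelet kernel matrix, y: (l,) targets.
    Returns the coefficients alpha_i - alpha_i*."""
    l = len(y)
    alpha = cp.Variable(l, nonneg=True)       # alpha_i
    alpha_star = cp.Variable(l, nonneg=True)  # alpha_i^*
    d = alpha - alpha_star
    # psd_wrap asserts K + 1 is positive semidefinite (true for an admissible kernel)
    objective = cp.Maximize(y @ d - 0.5 * cp.quad_form(d, cp.psd_wrap(K + 1.0)))
    constraints = [alpha <= C / l,                        # Eq. (18)
                   alpha_star <= C / l,
                   cp.sum(alpha + alpha_star) <= C * nu]  # Eq. (19)
    cp.Problem(objective, constraints).solve()
    return d.value

def predict(x, X_train, coef, a, w0=1.75):
    """WN nu-SVM output, Eq. (20): f(x) = sum_i coef_i * (K(x, x_i) + 1)."""
    return sum(c * (morlet_kernel(x, xi, a, w0) + 1.0)
               for c, xi in zip(coef, X_train))
```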

2.4. The proposed optimization algorithm

Determining the unknown parameters of the WN ν-SVM is a complicated process; in fact, it is a multivariable optimization problem in a continuous space. An appropriate parameter combination improves how well the model approximates the original series, so an intelligent algorithm is needed to obtain the optimal parameters of the proposed model. The parameters of the WN ν-SVM have a great effect on its generalization performance: an appropriate parameter combination corresponds to high generalization performance. The PSO algorithm is considered an excellent technique for such combinatorial optimization problems (Krusienski, 2006; Lin, Ying, Chen, & Lee, 2008; Yamaguchi, 2007; Zhao & Yang, in press). The PSO algorithm, introduced by Kennedy and Eberhart (1995), is used here to determine the parameter combination of the WN ν-SVM related to the partitioned regions, and different WN ν-SVMs in the different input–output spaces are adopted to forecast the product sale time series. For each particular region, only the most adequate WN ν-SVM is used for the final forecast. This is very different from a single SVM model, which learns the whole input space globally and thus cannot guarantee that each local input region is learned best.

Like other evolutionary computation techniques, PSO uses a set of particles representing potential solutions of the problem under consideration. The swarm consists of m particles; each particle has a position X_i = {x_{i1}, ..., x_{id}, ..., x_{im}} and a velocity V_i = {v_{i1}, ..., v_{id}, ..., v_{im}}, where i = 1, 2, ..., n and d = 1, 2, ..., m, and moves through an n-dimensional search space. According to the global variant of the PSO algorithm, each particle moves towards its best previous position and towards the best particle g in the swarm. Denote the best previously visited position of the ith particle (the one giving the best fitness value) as P_i = {p_{i1}, p_{i2}, ..., p_{id}, ..., p_{im}}, and the best previously visited position of the swarm as P_g = {p_{g1}, p_{g2}, ..., p_{gd}, ..., p_{gm}}. The change of position of each particle from one iteration to the next is computed from the distance between the current position and its previous best position, and the distance between the current position and the best position of the swarm. The velocity and position updates are then given by the following equations:

v_{id}^{k+1} = w\, v_{id}^{k} + c_1 r_1 (p_{id} - x_{id}^{k}) + c_2 r_2 (p_{gd} - x_{id}^{k}), \quad (21)

x_{id}^{k+1} = x_{id}^{k} + v_{id}^{k+1}, \quad (22)

where w is the inertia weight, employed to control the impact of the history of velocities on the current one. The parameter w thus regulates the trade-off between the global and local exploration abilities of the swarm: a large inertia weight facilitates global exploration, while a small one tends to facilitate local exploration. A suitable value of w usually provides a balance between global and local exploration and consequently reduces the number of iterations required to locate the optimum solution. k = 1, 2, ..., K_max denotes the iteration number, c_1 is the cognition learning factor, c_2 is the social learning factor, and r_1 and r_2 are random numbers uniformly distributed in [0, 1]. The particle thus flies through potential solutions towards P_i and P_g in a navigated way, while still exploring new areas through the stochastic mechanism in order to escape from local optima. Since there is no intrinsic mechanism for controlling the velocity of a particle, a maximum value V_max is imposed on it: if the velocity exceeds this threshold, it is set equal to V_max. This limits the maximum travel distance at each iteration, so that particles do not fly past good solutions.
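A compact Python sketch of the global-variant PSO of Eqs. (21) and (22) follows; the velocity clamp factor 0.2, the box bounds and the fitness function are assumed placeholders, and the linearly decreasing inertia weight anticipates Eq. (23) given in Section 3.

```python
import numpy as np

def pso(fitness, bounds, n_particles=30, k_max=100,
        w_max=0.9, w_min=0.1, c1=2.0, c2=2.0, seed=0):
    """Global-variant PSO minimizing `fitness` over a box given by `bounds`,
    a list of (low, high) pairs, one per search dimension."""
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in bounds], float)
    hi = np.array([b[1] for b in bounds], float)
    x = lo + rng.random((n_particles, len(bounds))) * (hi - lo)  # positions
    v = np.zeros_like(x)                                         # velocities
    v_max = 0.2 * (hi - lo)         # assumed clamp on travel per iteration
    p_best, p_val = x.copy(), np.array([fitness(p) for p in x])
    g_best = p_best[p_val.argmin()].copy()
    for k in range(k_max):
        w = w_max - (w_max - w_min) * k / k_max                  # Eq. (23)
        r1, r2 = rng.random(2)      # scalar here; often drawn per component
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)  # Eq. (21)
        v = np.clip(v, -v_max, v_max)
        x = np.clip(x + v, lo, hi)                               # Eq. (22)
        vals = np.array([fitness(p) for p in x])
        better = vals < p_val
        p_best[better], p_val[better] = x[better], vals[better]
        g_best = p_best[p_val.argmin()].copy()
    return g_best, p_val.min()
```

For the forecasting system, `fitness` would wrap the training of a WN ν-SVM with candidate parameters (C, ν, a) and return a validation error such as the MSE.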


Fig. 1. The ε-insensitive tube of the WN ν-SVM (training points y_i, slack variables ξ and ξ*, the ±ε band, the support vectors and the optimal hyperplane f(x)).


2.5. Mixture of experts (ME) architecture

In sale series forecasting, two of the key problems are how to deal with nonlinearity and how to deal with non-stationarity. A potential solution to both is a mixture of experts (ME) architecture, illustrated in Fig. 2. The ME architecture is generalized into a two-stage architecture to handle the non-stationarity in the data: in the first stage, a mixture of experts, including an evolutionary algorithm, compete to optimize the model used in the second stage.

Table 1. Influencing factors of product sale forecasting.

Product characteristic       Unit           Expression              Weight
Brand famous degree (BF)     Dimensionless  Linguistic information  0.9
Performance parameter (PP)   Dimensionless  Linguistic information  0.8
Form beauty (FB)             Dimensionless  Linguistic information  0.8
Sales experience (SE)        Dimensionless  Linguistic information  0.5
Oil price (OP)               Dimensionless  Linguistic information  0.8
Dweller deposit (DD)         Dimensionless  Numerical information   0.4

3. Experiment

To illustrate the proposed intelligent forecasting method, the forecasting of a car sale series is studied. The car is a type of consumption product influenced by macroeconomic factors in the manufacturing system, and its sales are usually driven by many uncertain factors. Factors with large influencing weights are gathered into a factor list, shown in Table 1. In our experiments, car sale series are selected from the past sale records of a typical company; the detailed characteristic data and sale series of these cars compose the corresponding training and testing sample sets. During the forecasting of the car sale series, six influencing factors are taken into account, viz. brand famous degree (BF), performance parameter (PP), form beauty (FB), sales experience (SE), oil price (OP) and dweller deposit (DD); the first five factors are expressed as linguistic information, and the last one as numerical information.

All linguistic information on the influencing factors is processed by fuzzy comprehensive evaluation (Feng & Xu, 1999) into numerical information. The proposed forecasting model has been implemented in the Matlab 7.1 programming language. The experiments were run on a 1.80 GHz Core(TM)2 CPU personal computer (PC) with 1.0 GB memory under Microsoft Windows XP Professional. Criteria such as the mean absolute error (MAE), the mean absolute percentage error (MAPE) and the mean square error (MSE) are adopted to evaluate the performance of the intelligent forecasting system. The initial parameters of the intelligent forecasting system are as follows: inertia weight w_max = 0.9, w_min = 0.1; positive acceleration constants c_1 = c_2 = 2; fitness accuracy of the normalized samples equal to 0.005. The candidate wavelet functions are the Morlet, Mexican hat and Gaussian wavelets. To limit the length of this paper, only representative Mexican hat, Morlet and Gaussian wavelet transforms at different scales are given in Figs. 3-5. Among all the given wavelet transforms, the Morlet wavelet transform matches the original sale series best over the scale range from 0.3 to 4.
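The three evaluation criteria can be computed directly; a small sketch, assuming the real series contains no zeros (as the MAPE requires):

```python
import numpy as np

def error_criteria(y_true, y_pred):
    """MAE, MAPE and MSE used to evaluate the forecasting system."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    mae = np.mean(np.abs(y_true - y_pred))
    mape = np.mean(np.abs((y_true - y_pred) / y_true))  # y_true must be nonzero
    mse = np.mean((y_true - y_pred) ** 2)
    return mae, mape, mse
```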

Fig. 2. The intelligence forecasting system (PSO loop: output the current parameter combination, check the forecasting accuracy, and output the optimal parameter combination).


Fig. 3. Mexican hat wavelet transform of the sales time series at different scales.

Fig. 4. Morlet wavelet transform of the sales time series at different scales.

Therefore, the Morlet wavelet can be ascertained as the kernel function of the WN ν-SVM model, and three parameters remain to be determined:

\nu \in [0, 1], \quad a \in [0.3, 4] \quad \text{and} \quad C \in \left[\frac{\max(x_{i,j}) - \min(x_{i,j})}{l} \times 10^{-3},\ \frac{\max(x_{i,j}) - \min(x_{i,j})}{l} \times 10^{3}\right].

The optimal parameter combination obtained by PSO is C = 1000, ν = 0.98 and a = 0.56. Fig. 6 illustrates the forecasting result for the original car sale series. To check the forecasting capacity of the PSOWN ν-SVM, the ARMA, PSOW ν-SVM and PSOWN ν-SVM models were each trained on the original sale series; the forecasts of the latest 12 months given by each model are shown in Table 2.


Fig. 5. Gaussian wavelet transform of the sales time series at different scales.

The linearly decreasing inertia weight of the standard PSO is adopted:

w = w_{max} - \frac{w_{max} - w_{min}}{k_{max}}\, k, \quad (23)

where w_min = 0.1 is the minimal inertia weight, w_max = 0.9 is the maximal inertia weight and k is the current iteration number. To assess the forecasting capacity of the proposed model, a comparison among the different forecasting approaches is shown in Table 3, which lists the error indices obtained with the three models. The MAE, MAPE and MSE of the PSOWN ν-SVM model equal those of the PSOW ν-SVM model, while those of ARMA are much worse. The WN ν-SVM is then applied to the regression analysis; the experimental results show that the regression precision of the WN ν-SVM is the same as that of a W ν-SVM whose kernel function is the Gaussian function, under the same conditions.

Table 2. Comparison of forecasting results from three different models.

Latest 12 months   Real value   ARMA   PSOW ν-SVM   PSOWN ν-SVM
1                  1809         1743   1840         1840
2                  1508         1736   1570         1570
3                  1246         1805   1195         1195
4                   915         1855    859          859
5                  2623         1900   2573         2573
6                   754         1517    826          826
7                  2304         1984   2250         2250
8                  2913         1565   2834         2834
9                  2970         1451   2918         2918
10                 2367         1541   2343         2343
11                 1316         1689   1397         1397
12                 1495         1850   1522         1552

Fig. 6. The car sales forecasting result based on the PSOWN ν-SVM model.


Table 3. Error statistics of the three forecasting models.

Model          MAE     MAPE    MSE
ARMA           668     0.411   627,630
PSOW ν-SVM     55.75   0.038   3373
PSOWN ν-SVM    55.75   0.038   3373

4. Conclusion

In this paper, a new version of the wavelet SVM, named WN ν-SVM, is proposed to model the non-linear system of product sale series by combining wavelet theory with the ν-SVM. The WN ν-SVM avoids solving for the parameter b in the final forecasting function (20), whereas the W ν-SVM must compute the value of b in the final forecasting function (12); the proposed WN ν-SVM therefore improves the efficiency of solving the wavelet support vector machine. A new forecasting model based on the PSO algorithm and the WN ν-SVM, named PSOWN ν-SVM, is presented to approximate arbitrary sales curves in the L²(Rⁿ) space and gives good forecasts of the product sale series. The performance of the PSOWN ν-SVM is evaluated on car sales data, and the simulation results demonstrate that the PSOWN ν-SVM is effective in dealing with uncertain data and finite samples. Moreover, the particle swarm optimization algorithm presented here is shown to be suitable for seeking the optimal parameters of the WN ν-SVM. Compared with the PSOW ν-SVM, the PSOWN ν-SVM has the same forecasting capacity; compared with ARMA, both PSOW ν-SVM and PSOWN ν-SVM achieve better MAE, MAPE and MSE. The PSOWN ν-SVM overcomes the "curse of dimensionality" and has other attractive properties, such as strong learning capability on small samples, good generalization performance, insensitivity to noise and outliers, and automatic selection of the optimal parameters. Moreover, the wavelet transform can reduce noise in the data while preserving its detail and resolution. Therefore, in the process of establishing the forecasting models, the uncertain information in the sale data is not neglected but is wholly taken into the wavelet kernel function, and the forecasting accuracy is improved by adopting the wavelet technique.

References

Carbonneau, R., Laframboise, K., & Vahidov, R. (2008). Application of machine learning techniques for supply chain demand forecasting. European Journal of Operational Research, 184(3), 1140-1154.
Chen, R. C., & Hsieh, C. H. (2006). Web page classification based on a support vector machine using a weighted vote schema. Expert Systems with Applications, 31(2), 427-435.
Feng, S., & Xu, L. (1999). An intelligent decision support system for fuzzy comprehensive evaluation of urban development. Expert Systems with Applications, 16(1), 21-32.
Huang, S. C. (2008). Online option price forecasting by using unscented Kalman filters and support vector machines. Expert Systems with Applications, 34(4), 2819-2825.
Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. In Proceedings of the IEEE international conference on neural networks (pp. 1942-1948).
Khandoker, A. H., Lai, D. T. H., Begg, R. K., & Palaniswami, M. (2007). Wavelet-based feature extraction for support vector machines for screening balance impairments in the elderly. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 15(4), 587-597.
Krantz, S. G. (Ed.). (1994). Wavelet: Mathematics and application. Boca Raton, FL: CRC.
Krusienski, D. J. (2006). A modified particle swarm optimization algorithm for adaptive filtering. In IEEE international symposium on circuits and systems (pp. 137-140). Kos, Greece.
Kwok, J. T. (1999). Moderating the outputs of support vector machine classifiers. IEEE Transactions on Neural Networks, 10(5), 1018-1031.
Lin, S. W., Ying, K. C., Chen, S. C., & Lee, Z. J. (2008). Particle swarm optimization for parameter determination and feature selection of support vector machines. Expert Systems with Applications, 35(4), 1817-1824.
Liu, G. Z., & Di, S. L. (1992). Wavelet analysis and application. Xi'an, China: Xidian University Press.
Seo, K. K. (2007). An application of one-class support vector machines in content-based image retrieval. Expert Systems with Applications, 33(2), 491-498.
Trontl, K., Smuc, T., & Pevec, D. (2007). Support vector regression model for the estimation of γ-ray buildup factors for multi-layer shields. Annals of Nuclear Energy, 34(12), 939-952.
Vapnik, V. (1995). The nature of statistical learning. New York: Springer.
Widodo, A., & Yang, B. S. (2008). Wavelet support vector machine for induction machine fault diagnosis based on transient current signal. Expert Systems with Applications, 35(1-2), 307-316.
Wohlberg, B., Tartakovsky, D. M., & Guadagnini, A. (2006). Subsurface characterization with support vector machines. IEEE Transactions on Geoscience and Remote Sensing, 44(1), 47-57.
Wu, Q., Yan, H. S., & Yang, H. B. (2008). A forecasting model based on support vector machine and particle swarm optimization. In The workshop on power electronics and intelligent transportation system (pp. 218-222).
Wu, Q., Yan, H. S., & Yang, H. B. (in press). A hybrid forecasting model based on chaotic mapping and improved support vector machine. In The 2008 international workshop on chaos-fractals theories and applications.
Wu, Q., & Yan, H. S. (in press a). The product sales forecasting model based on robust wavelet ν-support vector machine. Acta Automatica Sinica (in Chinese).
Wu, Q., & Yan, H. S. (in press b). A forecasting method based on support vector machine with Gaussian loss function. Computer Integrated Manufacturing Systems (in Chinese).
Wu, Q., & Yan, H. S. (in press c). Product sales forecasting model based on robust ν-support vector machine. Computer Integrated Manufacturing Systems (in Chinese).
Yamaguchi, T. (2007). Adaptive particle swarm optimization: Self-coordinating mechanism with updating information. In IEEE international conference on systems, man and cybernetics (Vol. 3, pp. 2303-2308). Taipei, Taiwan.
Zhao, L., & Yang, Y. (in press). PSO-based single multiplicative neuron model for time series prediction. Expert Systems with Applications.