Power load forecasts based on hybrid PSO with Gaussian and adaptive mutation and Wv-SVM


Expert Systems with Applications 37 (2010) 194–201

Contents lists available at ScienceDirect

Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa

Qi Wu *

Key Laboratory of Measurement and Control of CSE (School of Automation, Southeast University), Ministry of Education, Nanjing, Jiangsu 210096, China
School of Mechanical Engineering, Southeast University, Nanjing, Jiangsu 210096, China

Article info

Keywords: Load forecasts; Wv-SVM; Particle swarm optimization; Adaptive mutation; Gaussian mutation

Abstract

This paper presents a new load forecasting model based on hybrid particle swarm optimization with Gaussian and adaptive mutation (HAGPSO) and the wavelet v-support vector machine (Wv-SVM). First, it is proved that a mother wavelet function can build a complete basis through horizontal floating (translation) and so form a wavelet kernel function; Wv-SVM, a v-SVM with a wavelet kernel function, is then proposed. Second, to address the shortcomings of standard PSO, HAGPSO is proposed to search for the optimal parameters of Wv-SVM. Finally, a load forecasting model based on HAGPSO and Wv-SVM is built. Its application to load forecasting shows that the proposed model is effective and feasible. © 2009 Elsevier Ltd. All rights reserved.

* Tel.: +86 25 51166581; fax: +86 25 511665260. E-mail address: [email protected]. 0957-4174/$ - see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2009.05.011

1. Introduction

The theoretical study of power system load forecasting started in the middle of the last century, alongside the rise of system identification and modern control theory. Before that, because power systems were small in scale and subject to many uncertain factors, the study of load forecasting had not taken shape. It was not until the 1980s that theoretical work on mid- and long-term load forecasting appeared. A series of forecasting methods, such as the AR, MA, general exponential smoothing, ARMA and ARIMA algorithms, were developed in succession and are now widely accepted in power system load forecasting (Chenhui, 1987). With advances in grey systems, artificial neural networks, expert systems, genetic algorithms (Wu, Yan, & Yang, 2008a) and other theories and methods, mid- and long-term load forecasting has improved continuously (Benaouda, Murtagh, Starck, & Renaud, 2006; Liang, 1997; Santos, Martins, & Pires, 2007; Topalli, Erkmen, & Topalli, 2006; Ying & Pan, 2008). In general, most of the algorithms above are based on time series.

Recently, the support vector machine (SVM) developed by Vapnik (1995) has received increasing attention and has produced remarkable results in load forecasting (Hong, 2009; Pai & Hong, 2005; Wu, Tzeng, & Lin, 2009). The main difference between artificial neural networks (ANN) and SVM is the principle of risk minimization. ANN implements empirical risk minimization (ERM) to minimize the error on the training data, while SVM implements structural risk minimization (SRM) by constructing an optimal separating hyperplane in the hidden feature space and using quadratic programming to find a unique solution. SVM has yielded excellent generalization performance, significantly better than that of competing methods in load forecasting (Hong, 2009; Pai & Hong, 2005; Wu, Tzeng et al., 2009).

However, with the kernel functions used so far, SVM cannot approximate every curve in the $L^2(R^n)$ space (the space of square-integrable functions), because these kernel functions do not form a complete orthonormal basis. For the same reason, the regression SVM cannot approximate every function. A new kernel function is therefore needed, one that can build a complete basis through horizontal floating (translation) and flexing (dilation). Such functions already exist: the wavelet functions. An SVM with a wavelet kernel function is called a wavelet SVM (WSVM). In the load forecasting literature on support vector machines (Hong, 2009; Pai & Hong, 2005; Wu, Tzeng et al., 2009), little has been written on the application of Wv-SVM.

Determining the unknown parameters of Wv-SVM is a complicated process; in fact, it is a multivariable optimization problem in a continuous space. These parameters strongly affect the generalization performance of Wv-SVM, and an appropriate parameter combination improves how closely the model approximates the original series. It is therefore necessary to select an evolutionary algorithm to search for the optimal parameters of Wv-SVM. Particle swarm optimization (PSO), an evolutionary computation technique developed by Kennedy and Eberhart (1995), is considered an excellent technique for solving combinatorial optimization problems (Lin, Ying, Chen, & Lee, 2008; Shen, Shi, Kong, & Ye, 2007; Wu, Liu, Xiong, & Liu, 2009; Wu, Yan, & Wang, 2009; Wu, 2009; Wu, Yan, & Yang, 2008b; Wu, & Yan, 2009, in press; Yuan & Chu, 2007; Yang, Yuan, Yuan, & Mao, 2007; Zhao & Yang, 2009). PSO is based on the metaphor of social interaction and communication, such as bird flocking. The original PSO differs from other evolutionary methods in that it does not use filtering operations (such as crossover and mutation); the members of the entire population are maintained throughout the search, so that information is socially shared among individuals to direct the search towards the best position in the search space. One major drawback of standard PSO is premature convergence. To overcome it, many modified PSO variants have been reported (Lin et al., 2008; Shen et al., 2007; Wu et al., 2008b; Yuan & Chu, 2007; Zhao & Yang, 2009) for the parameter selection problem of SVM, but little attention has been given to Wv-SVM. A hybrid PSO with adaptive mutation and Gaussian mutation (HAGPSO) is therefore proposed in this paper to optimize the parameters of Wv-SVM.

Based on the above analysis, a new load forecasting model based on HAGPSO and Wv-SVM is proposed in this paper. Its superiority over the traditional model is verified in numerical simulation. The rest of this paper is organized as follows. Section 2 introduces Wv-SVM. HAGPSO is presented in Section 3. Section 4 describes the steps of HAGPSO and the forecasting method. Section 5 gives the experimental simulation and results. Conclusions are drawn in Section 6.

Here, for the wavelet function group $\psi_{a,m}(x)$ of Eq. (3), $a$ is the so-called scaling parameter, $m$ is the horizontal floating (translation) coefficient, and $\psi(x)$ is called the "mother wavelet". The translation parameter $m \in R$ and the dilation parameter $a > 0$ may be continuous or discrete. For a function $f(x) \in L^2(R)$, the wavelet transform of $f(x)$ can be defined as:

$$ W(a, m) = a^{-1/2} \int_{-\infty}^{+\infty} f(x)\, \psi^{*}\!\left(\frac{x - m}{a}\right) dx \tag{4} $$

where $\psi^{*}(x)$ stands for the complex conjugate of $\psi(x)$. The wavelet transform $W(a, m)$ can be regarded as a function of the translation $m$ at each scale $a$. Eq. (4) indicates that wavelet analysis is a time-frequency analysis, or a time-scale analysis. Unlike the short-time Fourier transform (STFT), the wavelet transform supports multi-scale analysis of a signal through dilation and translation, so it can extract the time-frequency features of a signal effectively.

The wavelet transform is also invertible, which makes it possible to reconstruct the original signal. A classical inversion formula for $f(x)$ is:

$$ f(x) = C_{\psi}^{-1} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} W(a, m)\, \psi_{a,m}(x)\, \frac{da\, dm}{a^{2}} \tag{5} $$

where

$$ C_{\psi} = \int_{-\infty}^{+\infty} \frac{|\hat{\psi}(w)|^{2}}{|w|}\, dw < \infty \tag{6} $$

$$ \hat{\psi}(w) = \int \psi(x) \exp(-jwx)\, dx \tag{7} $$
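As an illustration of Eq. (4), the following sketch approximates the wavelet transform numerically. It is only a minimal example under stated assumptions: the integral is replaced by a Riemann sum on a uniform grid, the mother wavelet is the real Morlet function of Eq. (10) (so the complex conjugate has no effect), and the test signal, the grid and the names `morlet` and `wavelet_transform` are hypothetical choices made for this example.

```python
import numpy as np


def morlet(x):
    """Real Morlet mother wavelet of Eq. (10): cos(1.75 x) * exp(-x^2 / 2)."""
    return np.cos(1.75 * x) * np.exp(-0.5 * x ** 2)


def wavelet_transform(f_vals, x, a, m):
    """Approximate W(a, m) of Eq. (4) by a Riemann sum on the uniform grid x.

    The mother wavelet is real here, so the complex conjugate is omitted.
    """
    dx = x[1] - x[0]
    return a ** -0.5 * np.sum(f_vals * morlet((x - m) / a)) * dx


# Hypothetical test signal: one wavelet atom with scale a0 and translation m0.
x = np.linspace(-20.0, 20.0, 4001)
a0, m0 = 1.5, 2.0
signal = a0 ** -0.5 * morlet((x - m0) / a0)

# Scan the translation m at the fixed scale a0 and locate the strongest response.
shifts = np.linspace(-5.0, 5.0, 101)
coeffs = np.array([wavelet_transform(signal, x, a0, m) for m in shifts])
best_shift = shifts[np.argmax(np.abs(coeffs))]
```

Scanning $m$ at a fixed scale gives one row of the time-scale picture described above; for this synthetic signal the largest coefficient is found at the translation used to build it.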

2. Wavelet v-support vector machine (Wv-SVM)

2.1. Wavelet kernel theory

Let us consider a set of data points $(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)$, which are independently and randomly generated from an unknown function. Specifically, $x_i$ is a column vector of attributes, $y_i$ is a scalar representing the dependent variable, and $l$ denotes the number of data points in the training set.

A support vector kernel function can take the form of a dot product, such as $K(x, x') = K(\langle x \cdot x' \rangle)$, or of a horizontal floating (translation-invariant) function, such as $K(x, x') = K(x - x')$. In fact, any function that satisfies Mercer's condition is an allowable support vector kernel function.

Lemma 1 (Mercer, 1909). The symmetric function $K(x, x')$ is a kernel function of SVM if and only if, for every function $g \neq 0$ that satisfies $\int_{R^n} g^{2}(\xi)\, d\xi < \infty$, the following condition holds:

$$ \iint K(x, x')\, g(x)\, g(x')\, dx\, dx' \geq 0, \quad x, x' \in R^{n} \tag{1} $$

This lemma provides a simple way to build kernel functions. A horizontal floating function is hard to split into the product of two identical functions, so a separate condition is given for horizontal floating kernel functions.

Lemma 2 (Smola and Scholkopf, 1998). A horizontal floating function $K(x)$ is an allowable support vector kernel function if and only if its Fourier transform satisfies the following condition:

$$ F[K](\omega) = (2\pi)^{-n/2} \int_{R^{n}} \exp(-j(\omega \cdot x))\, K(x)\, dx \geq 0, \quad x \in R^{n} \tag{2} $$

Suppose the wavelet function $\psi(x)$ satisfies the conditions $\psi(x) \in L^{2}(R) \cap L^{1}(R)$ and $\hat{\psi}(0) = 0$, where $\hat{\psi}$ is the Fourier transform of $\psi(x)$. The wavelet function group can then be defined as:

$$ \psi_{a,m}(x) = a^{-1/2}\, \psi\!\left(\frac{x - m}{a}\right), \quad x \in R \tag{3} $$

In Eq. (6), $C_{\psi}$ is a constant that depends on $\psi(x)$. The idea of wavelet decomposition is to approximate the function $f(x)$ by a linear combination of the wavelet function group. If the one-dimensional wavelet function is $\psi(x)$, then, using tensor theory, the multidimensional wavelet function can be defined as:

$$ \psi_{l}(x) = \prod_{i=1}^{l} \psi(x_i), \quad x \in R^{ld}, \quad x_i \in R^{d} \tag{8} $$

We can build the horizontal floating kernel function as follows:

$$ K(x, x') = \prod_{i=1}^{l} \psi\!\left(\frac{x_i - x'_i}{a_i}\right) \tag{9} $$

where $a_i > 0$ is the scaling parameter of the wavelet. Because a wavelet kernel function must satisfy the condition of Lemma 2, few wavelet kernel functions can be expressed by existing functions. We now give one such function, the Morlet wavelet kernel function, and prove that it satisfies the condition for an allowable support vector kernel function. The Morlet wavelet function is defined as follows:

$$ \psi(x) = \cos(1.75x) \exp\!\left(-\frac{x^{2}}{2}\right) \tag{10} $$

Theorem 1. The Morlet wavelet kernel function is defined as:

$$ K(x, x') = \prod_{i=1}^{l} \cos\!\left(1.75 \times \frac{x_i - x'_i}{a}\right) \exp\!\left(-\frac{\|x_i - x'_i\|^{2}}{2a^{2}}\right), \quad x \in R^{ld}, \quad x_i \in R^{d} \tag{11} $$

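and this kernel function is an allowable support vector kernel function.

Proof. According to Lemma 2, we only need to prove

$$ F[K](\omega) = (2\pi)^{-l/2} \int_{R^{ld}} \exp(-j(\omega \cdot x))\, K(x)\, dx \geq 0 \tag{12} $$

where $K(x) = \prod_{i=1}^{l} \psi(x_i / a) = \prod_{i=1}^{l} \cos(1.75 x_i / a) \exp(-\|x_i\|^{2} / 2a^{2})$ and $j$ denotes the imaginary unit. We have

$$
\begin{aligned}
\int_{R^{ld}} \exp(-j\omega \cdot x)\, K(x)\, dx
&= \int_{R^{ld}} \exp(-j\omega \cdot x) \prod_{i=1}^{l} \left( \cos\!\left(1.75\, \frac{x_i}{a}\right) \exp\!\left(-\frac{\|x_i\|^{2}}{2a^{2}}\right) \right) dx \\
&= \prod_{i=1}^{l} \int_{-\infty}^{+\infty} \exp(-j\omega_i x_i)\, \frac{\exp(j 1.75 x_i / a) + \exp(-j 1.75 x_i / a)}{2}\, \exp\!\left(-\frac{\|x_i\|^{2}}{2a^{2}}\right) dx_i \\
&= \prod_{i=1}^{l} \frac{1}{2} \int_{-\infty}^{+\infty} \left[ \exp\!\left(-\frac{\|x_i\|^{2}}{2a^{2}} + \left(\frac{1.75 j}{a} - j\omega_i\right) x_i\right) + \exp\!\left(-\frac{\|x_i\|^{2}}{2a^{2}} - \left(\frac{1.75 j}{a} + j\omega_i\right) x_i\right) \right] dx_i \\
&= \prod_{i=1}^{l} \frac{|a| \sqrt{2\pi}}{2} \left[ \exp\!\left(-\frac{(1.75 - \omega_i a)^{2}}{2}\right) + \exp\!\left(-\frac{(1.75 + \omega_i a)^{2}}{2}\right) \right]
\end{aligned}
\tag{13}
$$

Substituting formula (13) into Eq. (12), we obtain Eq. (14):

$$ F[K](\omega) = \prod_{i=1}^{l} \frac{|a|}{2} \left[ \exp\!\left(-\frac{(1.75 - \omega_i a)^{2}}{2}\right) + \exp\!\left(-\frac{(1.75 + \omega_i a)^{2}}{2}\right) \right] \tag{14} $$

where $a \neq 0$. We therefore have

$$ F[K](\omega) \geq 0 \tag{15} $$

This completes the proof of Theorem 1. □

2.2. Wavelet v-support vector machine

Combining the wavelet kernel function with v-SVM, we can build a new SVM learning algorithm, Wv-SVM. The structure of Wv-SVM is shown in Fig. 1. For a set of data points $(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)$, Wv-SVM can be described as:

$$ \min_{w,\, \xi^{(*)},\, \varepsilon,\, b} \; \tau(w, \xi^{(*)}, \varepsilon) = \frac{1}{2} \|w\|^{2} + C \cdot \left( v \cdot \varepsilon + \frac{1}{l} \sum_{i=1}^{l} (\xi_i + \xi_i^{*}) \right) \tag{16} $$

Subject to

$$ (w \cdot x_i + b) - y_i \leq \varepsilon + \xi_i \tag{17} $$

$$ y_i - (w \cdot x_i + b) \leq \varepsilon + \xi_i^{*} \tag{18} $$

$$ \xi_i^{(*)} \geq 0, \quad \varepsilon \geq 0, \quad b \in R \tag{19} $$

where $w$ and $x_i$ are column vectors with $d$ dimensions, $C > 0$ is a penalty factor, $\xi_i^{(*)}$ $(i = 1, \ldots, l)$ are slack variables and $v \in (0, 1]$ is an adjustable regularization parameter. Problem (16) is a quadratic programming (QP) problem. By means of the Wolfe principle, the wavelet kernel function technique and the Karush–Kuhn–Tucker (KKT) conditions, we obtain the dual problem (20) of the original optimization problem (16):

$$ \max_{\alpha,\, \alpha^{*}} \; W(\alpha, \alpha^{*}) = -\frac{1}{2} \sum_{i,j=1}^{l} (\alpha_i^{*} - \alpha_i)(\alpha_j^{*} - \alpha_j) K(x_i - x_j) + \sum_{i=1}^{l} (\alpha_i^{*} - \alpha_i)\, y_i \tag{20} $$

s.t.

$$ 0 \leq \alpha_i,\, \alpha_i^{*} \leq \frac{C}{l} \tag{21} $$

$$ \sum_{i=1}^{l} (\alpha_i^{*} - \alpha_i) = 0 \tag{22} $$

$$ \sum_{i=1}^{l} (\alpha_i + \alpha_i^{*}) \leq C \cdot v \tag{23} $$

Select the appropriate parameters $C$ and $v$, and take as the kernel function of the Wv-SVM model the mother wavelet function that best matches the original series over some range of scales. The Wv-SVM output function is then:

$$ f(x) = \sum_{i=1}^{l} (\alpha_i^{*} - \alpha_i) \prod_{j=1}^{d} \psi\!\left(\frac{x^{j} - x_i^{j}}{a}\right) + b, \quad b \in R \tag{24} $$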

where $\psi(x)$ is the wavelet function, $a > 0$ is the scaling parameter of the wavelet, $x^{j}$ is the $j$th value of the test vector $x$, and $x_i^{j}$ is the $j$th value of the sample vector $x_i$. The parameter $b$ can be computed by Eq. (25): select two scalars $\alpha_j \in (0, C/l)$ and $\alpha_k^{*} \in (0, C/l)$; then

$$ b = \frac{1}{2} \left[ y_j + y_k - \left( \sum_{i=1}^{l} (\alpha_i^{*} - \alpha_i) K(x_i, x_j) + \sum_{i=1}^{l} (\alpha_i^{*} - \alpha_i) K(x_i, x_k) \right) \right] \tag{25} $$

3. Hybrid particle swarm optimization

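Fig. 1. The architecture of Wv-SVM. (The inputs $x_1, x_2, \ldots, x_n$ feed the kernel nodes $K(x_1, x), K(x_2, x), \ldots, K(x_n, x)$, whose outputs are weighted by $w_1, w_2, \ldots, w_n$ and summed to give the output $y$.)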
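The architecture in Fig. 1 is a weighted sum of kernel evaluations, which can be sketched in a few lines. The example below is a minimal illustration, not the author's implementation: it evaluates the Morlet wavelet kernel of Eq. (11) as a product over the feature dimensions and the output function of Eq. (24). The training matrix, the coefficient vector `coef` (standing in for the dual coefficient differences), the bias and the function names are hypothetical; in practice the coefficients would come from solving the dual problem (20).

```python
import numpy as np


def morlet_kernel(X, Z, a):
    """Morlet wavelet kernel of Eq. (11) between the rows of X and the rows of Z.

    X has shape (n, d) and Z has shape (p, d); the result has shape (n, p).
    """
    diff = (X[:, None, :] - Z[None, :, :]) / a  # scaled differences, shape (n, p, d)
    return np.prod(np.cos(1.75 * diff) * np.exp(-0.5 * diff ** 2), axis=2)


def wv_svm_output(x_new, X_train, coef, b, a):
    """Output function of Eq. (24): weighted sum of kernel values plus the bias b.

    coef[i] plays the role of the dual coefficient difference of sample i.
    """
    return morlet_kernel(np.atleast_2d(x_new), X_train, a) @ coef + b


# Hypothetical data and coefficients, for illustration only.
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 1.0, size=(20, 6))  # 20 samples, 6 normalized factors
coef = rng.normal(size=20)

gram = morlet_kernel(X_train, X_train, a=0.89)
min_eig = np.linalg.eigvalsh(gram).min()  # Theorem 1 implies no negative eigenvalues
y_hat = wv_svm_output(X_train[:3], X_train, coef, b=0.1, a=0.89)
```

The eigenvalue check is a numerical counterpart of Theorem 1: an allowable kernel yields a positive semi-definite Gram matrix, so the smallest eigenvalue should be non-negative up to rounding error.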

Determining the unknown parameters of Wv-SVM is a complicated process; in fact, it is a multivariable optimization problem in a continuous space. The parameters of Wv-SVM strongly affect its generalization performance, and an appropriate parameter combination improves how closely the model approximates the original series. It is therefore necessary to select an intelligent algorithm to obtain the optimal parameters of the proposed model. The PSO algorithm is considered an excellent technique for solving combinatorial optimization problems, and the proposed HAGPSO algorithm is used to determine the parameters of Wv-SVM. The intelligent system shown in Fig. 2, based on the HAGPSO algorithm and the Wv-SVM model, evaluates the performance of HAGPSO by forecasting time series. Different Wv-SVMs in different Hilbert spaces are adopted to forecast the power load time series. For each particular region, only the most adequate Wv-SVM with the optimal parameters is used for the final forecast.

Fig. 2. The HAGPSO optimizes the parameters of Wv-SVM.

To evaluate the forecasting capacity of the intelligent system, the fitness function of the HAGPSO algorithm is designed as follows:

$$ \text{fitness} = \frac{1}{l} \sum_{i=1}^{l} \left( \frac{y_i - \hat{y}_i}{y_i} \right)^{2} \tag{26} $$

where $l$ is the size of the selected sample, $\hat{y}_i$ denotes the forecast value of the selected sample, and $y_i$ is the original data of the selected sample.

3.1. Standard particle swarm optimization

Like other evolutionary computation techniques, PSO (Yang et al., 2007) uses a set of particles that represent potential solutions to the problem under consideration. The swarm consists of $N$ particles. Each particle has a position $X_i = (x_{i1}, x_{i2}, \ldots, x_{ij}, \ldots, x_{id})$ and a velocity $V_i = (v_{i1}, v_{i2}, \ldots, v_{ij}, \ldots, v_{id})$, where $i = 1, 2, \ldots, N$; $j = 1, 2, \ldots, d$, and moves through a $d$-dimensional search space. According to the global variant of PSO, each particle moves towards its best previous position and towards the best particle $p_g$ in the swarm. Let us denote the best previously visited position of the $i$th particle, the one that gives its best fitness value, as $p_i = (p_{i1}, p_{i2}, \ldots, p_{ij}, \ldots, p_{id})$, and the best previously visited position of the swarm as $p_g = (p_{g1}, p_{g2}, \ldots, p_{gj}, \ldots, p_{gd})$. The change in position of each particle from one iteration to the next is computed from the distance between its current position and its previous best position, and from the distance between its current position and the best position of the swarm. The velocity and position of the particle are then updated by the following equations:

$$ v_{ij}^{k+1} = w v_{ij}^{k} + c_1 r_1 \left( p_{ij} - x_{ij}^{k} \right) + c_2 r_2 \left( p_{gj} - x_{ij}^{k} \right) \tag{27} $$

$$ x_{ij}^{k+1} = x_{ij}^{k} + v_{ij}^{k+1} \tag{28} $$

where $w$ is called the inertia weight and controls the impact of the previous history of velocities on the current one, $k$ denotes the iteration number, $c_1$ is the cognition learning factor, $c_2$ is the social learning factor, and $r_1$ and $r_2$ are random numbers uniformly distributed in the range [0, 1]. Thus, the particle flies through potential solutions towards $p_i^{k}$ and $p_g^{k}$ in a navigated way while still exploring new areas by the stochastic mechanism to escape from local optima. Since there is no actual mechanism for controlling the velocity of a particle, it is necessary to impose a maximum value $V_{max}$ on it. If the velocity exceeds this threshold, it is set equal to $V_{max}$, which limits the maximum travel distance at each iteration and prevents the particle from flying past good solutions. PSO terminates after a maximal number of generations, or when the best particle position of the entire swarm cannot be improved further after a sufficiently large number of generations. PSO has shown its robustness and efficacy in solving function value optimization problems in real number spaces.

3.2. Hybrid particle swarm optimization with Gaussian mutation and adaptive mutation

To address the premature convergence of standard PSO, an adaptive mutation operator is proposed that regulates the inertia weight of the velocity by means of the fitness value of the objective function and the iteration variable. A Gaussian mutation operator is used to correct the direction of the particle velocity. Both mutations are applied to the previous velocity of the particle. HAGPSO thus updates the velocity and position of each particle by the following equations:

$$ v_{ij}^{k+1} = (1 - \lambda)\, w_{ij}^{k} v_{ij}^{k} + \lambda N\!\left(0, \sigma_i^{k}\right) + c_1 r_1 \left( p_{ij} - x_{ij}^{k} \right) + c_2 r_2 \left( p_{gj} - x_{ij}^{k} \right) \tag{29} $$

$$ x_{ij}^{k+1} = x_{ij}^{k} + v_{ij}^{k+1} \tag{30} $$

$$ w_{ij}^{k} = \beta \left( 1 - f\!\left(x_i^{k}\right) / f\!\left(x_m^{k}\right) \right) + (1 - \beta)\, w_{ij}^{0} \exp\!\left(-\alpha k^{2}\right) \tag{31} $$

$$ \sigma_i^{k+1} = \sigma_i^{k} \exp\!\left( N_i(0, \Delta\sigma) \right) \tag{32} $$

where $i = 1, 2, \ldots, N$; $t = 1, 2$. $\Delta\sigma$ is the standard error of the Gaussian distribution, $\beta$ is the adaptive coefficient, $\lambda$ is an increment coefficient, $\alpha$ is the coefficient controlling particle velocity attenuation, $f(x_i^{k})$ is the fitness of the $i$th particle in the $k$th iteration, and $f(x_m^{k})$ is the optimal fitness of the swarm in the $k$th iteration.

The parameter $w$ regulates the trade-off between the global and local exploration abilities of the swarm. A large inertia weight facilitates global exploration, while a small one tends to facilitate local exploration. A suitable value of the inertia weight $w$ usually balances global and local exploration and consequently reduces the number of iterations required to locate the optimum solution.

Adaptive mutation, which makes the quality of the solution depend on the mutation operator, is a highly effective mutation operator for real-coded problems. The proposed adaptive mutation operator, based on the iteration variable $k$ and the fitness value $f(x^{k})$, is given in Eq. (31). In the first term on the right-hand side of Eq. (29), the velocity inertia weight $w_{ij}^{k}$ balances global and local exploration and consequently reduces the number of iterations required to locate the optimum solution. In Eq. (31), the term $1 - f(x_i^{k}) / f(x_m^{k})$ means that particles with larger fitness mutate within a smaller range, while those with smaller fitness mutate within a larger range. The term $w_{ij}^{0} \exp(-\alpha k^{2})$ means that the initial inertia weight $w_{ij}^{0}$ mutates within a large range and searches for a local optimum over a larger space at the start (small $k$), while it mutates within a small range, searches over a smaller space and gradually approaches the global optimum at the end (large $k$). The second term of Eq. (29) represents Gaussian mutation based on the iteration variable $k$.

The Gaussian mutation operator, which can correct the moving direction of the particle velocity, is given in Eq. (32). In the Gaussian mutation strategy, the velocity vector $v^{k+1} = (v_1^{k+1}, v_2^{k+1}, \ldots, v_d^{k+1})$ is built from the previous-generation velocity vector $v^{k} = (v_1^{k}, v_2^{k}, \ldots, v_d^{k})$ and the perturbation vector $\sigma^{k} = (\sigma_1^{k}, \sigma_2^{k}, \ldots, \sigma_d^{k})$. The perturbation vector mutates itself by Eq. (32) at each iteration and acts as a controlling vector for the velocity vector. The adaptive and Gaussian mutation operators can restore the diversity lost by the population and improve the global search capacity of the algorithm.

4. The procedures of HAGPSO and Wv-SVM

The HAGPSO algorithm is described in the following steps.

Algorithm 1

Step 1. Data preparation: the training, validation and test sets are denoted Tr, Va and Te, respectively.
Step 2. Particle initialization and PSO parameter setting: generate the initial particles. Set the PSO parameters, including the number of particles $(n)$, particle dimension $(m)$, maximal number of iterations $(k_{max})$, error limit of the fitness function, velocity limit $(V_{max})$, inertia weight for the particle velocity $(w_0)$, Gaussian distribution $(N(0, \Delta\sigma))$, perturbation momentum $(\sigma_i^{0})$, coefficient controlling particle velocity attenuation $(\alpha)$, adaptive coefficient $(\beta)$ and increment coefficient $(\lambda)$. Set the iteration variable $k = 0$ and perform the training process of Steps 3–8.
Step 3. Set the iteration variable: $k = k + 1$.
Step 4. Compute the fitness function value of each particle. Take the current particle as the individual extremum point of every particle, and take the particle with the minimal fitness value as the global extremum point.
Step 5. Stop condition checking: if the stopping criteria (the predefined maximum number of iterations or the error accuracy of the fitness function) are met, go to Step 8. Otherwise, go to the next step.
Step 6. Apply the adaptive mutation operator of Eq. (31) and the Gaussian mutation operator of Eq. (32) to manipulate the particle velocity.
Step 7. Update the particle position by Eqs. (29) and (30) to form the new particle swarm, and go to Step 3.
Step 8. End the training procedure and output the optimal particle.

On the basis of the Wv-SVM model, the estimation algorithm can be summarized as follows.

Algorithm 2

Step 1. Initialize the original data by normalization and fuzzification, then form the training patterns.
Step 2. Select the appropriate wavelet kernel function $K$, the control constant $v$ and the penalty factor $C$. Construct the QP problem (16) of the Wv-SVM.
Step 3. Solve the optimization problem and obtain the parameters $\alpha_i^{(*)}$. Compute the regression coefficient $b$ by (25).
Step 4. For a new forecasting task, extract the load characteristics and form a set of input variables $x$. Then compute the estimation result $\hat{y}$ by (24).

5. Experiment

To analyze the performance of the proposed HAGPSO algorithm, the forecasting of power load series by the intelligent system based on HAGPSO and Wv-SVM is studied. For comparison, standard PSO is also used to optimize the parameters of Wv-SVM. The better optimization algorithm yields the better parameter combination for Wv-SVM, and hence the better forecasting capability in the regression estimation of Wv-SVM. To evaluate the forecasting capacity of the intelligent system, evaluation indexes such as the mean absolute error (MAE), mean absolute percentage error (MAPE) and mean square error (MSE) are applied to the forecasting results of HAGPSOWv-SVM and PSOWv-SVM.
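The three evaluation indexes can be computed directly from a pair of real and forecast series. The sketch below applies them to the 12 weekly values listed in Table 2; the function names and the exact formulas are assumptions of this example (the usual textbook definitions), since the paper names the indexes without writing them out. In particular, the MAPE is taken here as the mean of the absolute relative errors, expressed as a fraction.

```python
import numpy as np

# Real and forecast values for the latest 12 weeks, copied from Table 2.
real = np.array([580, 2046, 908, 1625, 452, 2937, 1135, 2580, 2561, 781, 1489, 1532], dtype=float)
pso = np.array([725, 2010, 858, 1585, 547, 2880, 1046, 2493, 2508, 908, 1536, 1519], dtype=float)
hagpso = np.array([703, 2018, 880, 1606, 525, 2920, 1167, 2499, 2566, 884, 1516, 1525], dtype=float)


def mae(y, y_hat):
    """Mean absolute error."""
    return np.mean(np.abs(y - y_hat))


def mape(y, y_hat):
    """Mean absolute percentage error as a fraction (assumed definition)."""
    return np.mean(np.abs((y - y_hat) / y))


def mse(y, y_hat):
    """Mean square error."""
    return np.mean((y - y_hat) ** 2)


for name, forecast in (("PSOWv-SVM", pso), ("HAGPSOWv-SVM", hagpso)):
    print(f"{name}: MAE={mae(real, forecast):.2f}  "
          f"MSE={mse(real, forecast):.0f}  MAPE={mape(real, forecast):.3f}")
```

With these definitions the MAE and MSE agree with the values reported in Table 3 (69.92 and 6292 for PSOWv-SVM, 45.25 and 3473 for HAGPSOWv-SVM). The MAPE depends on the exact definition used, which the paper does not state, so the value printed here need not coincide with the one in Table 3.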
In our experiments, the power load series are selected from the past load records of a typical power company. The detailed characteristic data and load series make up the corresponding training and testing sample sets. During power load forecasting, the six influencing factors shown in Table 1, viz., sunlight, data, air pressure, temperature, rainfall and humidity, are taken into account. All linguistic information on these influencing factors is processed by fuzzy comprehensive evaluation (Feng & Xu, 1999) to form numerical information.

Table 1
Influencing factors of power load forecasts.

Load characteristics    Unit             Expression                Weight
Sunlight                Dimensionless    Linguistic information    0.9
Data                    Dimensionless    Linguistic information    0.7
Air pressure            Dimensionless    Linguistic information    0.68
Temperature             Dimensionless    Linguistic information    0.8
Rainfall                Dimensionless    Linguistic information    0.7
Humidity                Dimensionless    Linguistic information    0.4

Fig. 3. Mexican hat wavelet transform of load series in the scope of different scale.

Fig. 4. Morlet wavelet transform of load series in the scope of different scale.

Fig. 5. Gaussian wavelet transform of load series in the scope of different scale.

Suppose the number of variables is $n$, and $n = n_1 + n_2$, where $n_1$ and $n_2$ denote the number of fuzzy linguistic variables and crisp numerical variables, respectively. The linguistic variables are evaluated in several description levels, and a real number between 0 and 1 can be assigned to each description level. Distinct numerical variables have different dimensions and should first be normalized. The following normalization is adopted:

$$ \bar{x}_i^{e} = \frac{x_i^{e} - \min\left( x_i^{e} \big|_{i=1}^{l} \right)}{\max\left( x_i^{e} \big|_{i=1}^{l} \right) - \min\left( x_i^{e} \big|_{i=1}^{l} \right)}, \quad e = 1, 2, \ldots, n_2 \tag{33} $$

where $l$ is the number of samples, and $x_i^{e}$ and $\bar{x}_i^{e}$ denote the original value and the normalized value, respectively. In fact, all the numerical variables in (1) through (32) are normalized values, although they are not marked by bars.

The proposed HAGPSO algorithm has been implemented in the Matlab 7.1 programming language. The experiments were run on a personal computer (PC) with a 1.80 GHz Core(TM)2 CPU and 1.0 GB of memory under Microsoft Windows XP Professional. The initial parameters of HAGPSO are as follows: inertia weight $w_0 = 0.9$; positive acceleration constants $c_1 = c_2 = 2$; standard error of the Gaussian distribution $\Delta\sigma = 0.5$; adaptive coefficient $\beta = 0.8$; increment coefficient $\lambda = 0.1$; fitness accuracy of the normalized samples 0.0002; coefficient controlling particle velocity attenuation $\alpha = 2$.

Fig. 6. The change trend of the fitness function.

Fig. 7. The load forecasting results based on HAGPSOWv-SVM model.

Table 2
Comparison of forecasting result from two different models.

The latest 12 weeks    Real value    Forecasting value
                                     PSOWv-SVM    HAGPSOWv-SVM
1                      580           725          703
2                      2046          2010         2018
3                      908           858          880
4                      1625          1585         1606
5                      452           547          525
6                      2937          2880         2920
7                      1135          1046         1167
8                      2580          2493         2499
9                      2561          2508         2566
10                     781           908          884
11                     1489          1536         1516
12                     1532          1519         1525

The Morlet, Mexican hat and Gaussian wavelets are selected to analyze the load series on different scales, as shown in Figs. 3–5. Among the given wavelet transforms, the Morlet wavelet transform matches the original load series best over the range of scales from 0.3 to 4. The Morlet wavelet is therefore chosen as the kernel function of the Wv-SVM model, and the ranges of the three parameters are determined as follows:

$$ v \in [0, 1], \quad a \in [0.3, 2] \quad \text{and} \quad C \in \left[ \frac{\max(x_{i,j}) - \min(x_{i,j})}{l} \times 10^{-3},\; \frac{\max(x_{i,j}) - \min(x_{i,j})}{l} \times 10^{3} \right] $$
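With the initial parameters and search ranges above, the update rules of Eqs. (29)–(32) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's Matlab implementation. The velocity limit, the initial perturbation value, the clipping of the inertia weight to $[0, w_0]$ and of the positions to the search box, and the toy fitness function are all hypothetical choices for this example; in the paper the fitness of Eq. (26) would be obtained by training a Wv-SVM for each candidate parameter vector.

```python
import numpy as np


def hagpso(fitness, lower, upper, n_particles=20, k_max=100, w0=0.9, c1=2.0,
           c2=2.0, d_sigma=0.5, beta=0.8, lam=0.1, alpha=2.0, seed=0):
    """Minimize `fitness` over the box [lower, upper] with a HAGPSO-style search."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    dim = lower.size
    v_max = 0.2 * (upper - lower)                # assumed velocity limit V_max
    x = rng.uniform(lower, upper, size=(n_particles, dim))
    v = np.zeros((n_particles, dim))
    sigma = np.full(n_particles, 0.1)            # assumed initial perturbation
    f = np.array([fitness(p) for p in x])
    p_best, f_best = x.copy(), f.copy()
    g = int(np.argmin(f_best))
    g_best, fg_best = p_best[g].copy(), f_best[g]

    for k in range(1, k_max + 1):
        # Eq. (31): adaptive inertia weight; clipping to [0, w0] is an assumption.
        ratio = f / max(f.min(), 1e-12)
        w = beta * (1.0 - ratio) + (1.0 - beta) * w0 * np.exp(-alpha * k ** 2)
        w = np.clip(w, 0.0, w0)[:, None]
        # Eq. (29): damped inertia, Gaussian mutation, cognitive and social terms.
        r1, r2 = rng.random((n_particles, dim)), rng.random((n_particles, dim))
        noise = rng.normal(0.0, 1.0, (n_particles, dim)) * sigma[:, None]
        v = ((1.0 - lam) * w * v + lam * noise
             + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x))
        v = np.clip(v, -v_max, v_max)
        # Eq. (30): position update, kept inside the search box (assumption).
        x = np.clip(x + v, lower, upper)
        # Eq. (32): self-adaptation of the perturbation scale.
        sigma = sigma * np.exp(rng.normal(0.0, d_sigma, n_particles))
        f = np.array([fitness(p) for p in x])
        improved = f < f_best
        p_best[improved], f_best[improved] = x[improved], f[improved]
        g = int(np.argmin(f_best))
        g_best, fg_best = p_best[g].copy(), f_best[g]
    return g_best, fg_best


# Hypothetical stand-in for the fitness of Eq. (26), searched over (v, a).
def toy_fitness(p):
    return float(np.sum((p - np.array([0.5, 1.0])) ** 2))


best, best_val = hagpso(toy_fitness, lower=[0.0, 0.3], upper=[1.0, 2.0])
```

The defaults mirror the initial parameters listed above ($w_0 = 0.9$, $c_1 = c_2 = 2$, $\Delta\sigma = 0.5$, $\beta = 0.8$, $\lambda = 0.1$, $\alpha = 2$). Because the fitness is minimized here, the ratio in Eq. (31) is at least 1 for every particle, which is why this sketch clips the resulting inertia weight rather than using a negative value.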

The trend of the fitness value of HAGPSO is shown in Fig. 6. It is clear that HAGPSO converges, so it can be applied to search for the parameters of Wv-SVM. The optimal parameter combination obtained by HAGPSO is $C = 960.10$, $v = 0.88$ and $a = 0.89$. Fig. 7 illustrates the load series forecasting results given by HAGPSO and Wv-SVM.

To analyze the parameter searching capacity of the HAGPSO algorithm, the standard PSO algorithm is also used to optimize the parameters of Wv-SVM by training on the original load series. The forecasting results of each model for the latest 12 weeks are shown in Table 2. The comparison between HAGPSO and PSO optimizing the parameters of the same model (Wv-SVM) is shown in Table 3.

Table 3
Error statistic of two forecasting models.

Model           MAE      MAPE     MSE
PSOWv-SVM       69.92    0.068    6292
HAGPSOWv-SVM    45.25    0.048    3473

Table 3 shows the error indexes of the two models. The MAE, MAPE and MSE of HAGPSOWv-SVM are better than those of PSOWv-SVM. This indicates that the adaptive and Gaussian mutation operators improve the global search ability of the particle swarm optimization algorithm. The experimental results show that, under the same conditions, HAGPSO improves forecasting precision compared with PSO.

6. Conclusion

In this paper, a new load forecasting model based on HAGPSO and Wv-SVM is proposed. A new version of PSO, viz., hybrid particle swarm optimization with adaptive mutation and Gaussian mutation (HAGPSO), is also proposed to optimize the parameters of Wv-SVM. The performance of HAGPSOWv-SVM is evaluated by forecasting power load data, and the simulation results demonstrate that Wv-SVM is effective in dealing with high dimensionality, nonlinearity and finite samples. Moreover, it is shown that the HAGPSO presented here is suitable for finding optimized parameters of Wv-SVM.

In our experiments, fixed values of the adaptive coefficients $(\beta, \lambda)$, the second step control parameter $\Delta\sigma$ of normal mutation and the velocity attenuation parameter $\alpha$ are adopted. However, how to choose appropriate values for these coefficients is not addressed in this paper. How the velocity changes when different values of these parameters are adopted is a meaningful problem for future research.

References

Benaouda, D., Murtagh, F., Starck, J. L., & Renaud, O. (2006). Wavelet-based nonlinear multiscale decomposition model for electricity load forecasting. Neurocomputing, 70(1–3), 139–154.
Chenhui, L. (1987). Theory and method of load forecasting of power systems. Ha'erbin Institute of Technology Press.
Feng, S., & Xu, L. (1999). An intelligent decision support system for fuzzy comprehensive evaluation of urban development. Expert Systems with Applications, 16(1), 21–32.
Hong, W. C. (2009). Electric load forecasting by support vector model. Applied Mathematical Modelling, 33(5), 2444–2454.
Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. IEEE International Conference on Neural Networks, Australia, 1942–1948.
Liang, R. H. (1997). Application of grey linear programming to short-term hydro scheduling. Electric Power Systems Research, 41(3), 159–165.
Lin, S. W., Ying, K. C., Chen, S. C., & Lee, Z. J. (2008). Particle swarm optimization for parameter determination and feature selection of support vector machines. Expert Systems with Applications, 35(4), 1817–1824.
Mercer, J. (1909). Functions of positive and negative type and their connection with the theory of integral equations. Philosophical Transactions of the Royal Society of London, A-209, 415–446.
Pai, P. F., & Hong, W. C. (2005). Support vector machines with simulated annealing algorithms in electricity load forecasting. Energy Conversion and Management, 46(17), 2669–2688.
Santos, P. J., Martins, A. G., & Pires, A. J. (2007). Designing the input vector to ANN-based models for short-term load forecast in electricity distribution systems. International Journal of Electrical Power and Energy Systems, 29(4), 338–347.
Shen, Q., Shi, W. M., Kong, W., & Ye, B. X. (2007). A combination of modified particle swarm optimization algorithm and support vector machine for gene selection and tumor classification. Talanta, 71(4), 1679–1683.
Smola, A., & Scholkopf, B. (1998). The connection between regularization operators and support vector kernels. Neural Networks, 11, 637–649.
Topalli, A. K., Erkmen, I., & Topalli, I. (2006). Intelligent short-term load forecasting in Turkey. International Journal of Electrical Power and Energy Systems, 28(7), 437–447.
Vapnik, V. (1995). The nature of statistical learning theory. New York: Springer.
Wu, Q. (2009). The forecasting model based on wavelet v-support vector machine. Expert Systems with Applications, 36(4), 7604–7610.
Wu, Q., Liu, J., Xiong, F. L., & Liu, X. J. (2009). The fuzzy wavelet classifier machine with penalizing hybrid noises from complex diagnosis system. Acta Automatica Sinica, 35(6), 773–779 (in Chinese).
Wu, C. H., Tzeng, G. H., & Lin, R. H. (2009). A novel hybrid genetic algorithm for kernel function and parameter optimization in support vector regression. Expert Systems with Applications, 36(3), 4725–4735.
Wu, Q., & Yan, H. S. (2009). Forecasting method based on support vector machine with Gaussian loss function. Computer Integrated Manufacturing Systems, 15(2), 306–312 (in Chinese).
Wu, Q., & Yan, H. S. (in press). Product sales forecasting model based on robust v-support vector machine. Computer Integrated Manufacturing Systems (in Chinese).
Wu, Q., Yan, H. S., & Wang, B. (2009). The product sales forecasting model based on robust wavelet v-support vector machine. Acta Automatica Sinica, 37(7), 1227–1232 (in Chinese).
Wu, Q., Yan, H. S., & Yang, H. B. (2008a). A hybrid forecasting model based on chaotic mapping and improved support vector machine. In: Proceedings of the ninth international conference for young computer scientists (pp. 2701–2706).
Wu, Q., Yan, H. S., & Yang, H. B. (2008b). A forecasting model based support vector machine and particle swarm optimization. In: Proceedings of the 2008 workshop on power electronics and intelligent transportation system (pp. 218–222).
Yang, X. M., Yuan, J. S., Yuan, J. Y., & Mao, H. (2007). A modified particle swarm optimizer with dynamic adaptation. Applied Mathematics and Computation, 189, 1205–1213.
Ying, L. C., & Pan, M. C. (2008). Using adaptive network based fuzzy inference system to forecast regional electricity loads. Energy Conversion and Management, 49(2), 205–211.
Yuan, S. F., & Chu, F. L. (2007). Fault diagnostics based on particle swarm optimization and support vector machines. Mechanical Systems and Signal Processing, 21(4), 1787–1798.
Zhao, L., & Yang, Y. (2009). PSO-based single multiplicative neuron model for time series prediction. Expert Systems with Applications, 36(2), 2805–2812.