A hybrid application algorithm based on the support vector machine and artificial intelligence: An example of electric load forecasting

Yanhua Chen (corresponding author), Yi Yang, Chaoqun Liu, Caihong Li, Lian Li
School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, PR China
Article info
Article history: Received 7 November 2013; Received in revised form 8 August 2014; Accepted 17 October 2014; Available online xxxx
Keywords: Electric load forecasting; Empirical mode decomposition; Seasonal adjustment; PSO; LSSVM
Abstract
Accurate electric load forecasting can be a very useful tool for all participants in electricity markets, because it not only helps power producers and consumers make their plans but also helps them maximize their profits. In this paper, a new combined forecasting method (ESPLSSVM) based on empirical mode decomposition, seasonal adjustment, particle swarm optimization (PSO) and the least squares support vector machine (LSSVM) is proposed. In the electricity market, noise signals caused by various erratic factors usually degrade forecasting accuracy. First, ESPLSSVM uses an empirical mode decomposition-based signal filtering method to reduce the influence of noise signals. Second, ESPLSSVM eliminates the seasonal components from the de-noised series and then models the resultant series using an LSSVM whose parameters are optimized by PSO (PLSSVM). Finally, by multiplying the seasonal indexes by the PLSSVM forecasts, ESPLSSVM obtains the final forecasting result. The effectiveness of the presented method is examined by comparison with the basic LSSVM (LSSVM), the empirical mode decomposition-based signal filtering method combined with LSSVM (ELSSVM) and the seasonal adjustment combined with LSSVM (SLSSVM). Case studies show that ESPLSSVM performed better than the other three load forecasting approaches.
1. Introduction

With the improvement of people's living standards, people are consuming more and more oil, coal, natural gas and other natural energy sources. As a result, China is facing a shortage of natural resources, so it is very important to develop a reasonable energy-saving plan. Electricity is one of the most important energy sources in daily life. It plays an increasingly significant role in economic development, and its application has attracted particular attention in recent years. However, because of the huge consumption of electricity, the large volume of data generated by the electric load and the real-time nature of electricity, the electric load is difficult to store in practice. In addition, the electric load is always influenced by various factors, including weather conditions, social and economic environments, dynamic electricity prices and more. Therefore, in the power system, the electric load is difficult to forecast and remains an enormous problem. The goal of electric load forecasting is to take advantage of every model used and find a balance between production and consumption. In the electricity market, precise electricity demand forecasting is often needed and is fundamental in many applications, for example supplying
energy transfer scheduling, system security, and management [1]. With an accurate, quick, simple and robust electric load forecasting method, essential operating functions such as load dispatch, unit commitment, reliability analysis and unit maintenance can be carried out more effectively [2]. On the contrary, inaccurate load forecasting increases the cost of a utility corporation, and power consumers cannot reasonably adjust their use of electricity. Thus, developing a method to improve the accuracy of electric load forecasting is essential.
During the past several decades, a wide variety of models have been implemented for electric load forecasting. For example, on the basis of the fuzzy logic system, Yang et al. [3] applied the Wang–Mendel (WM) method for short-term electric load forecasting. Lou and Dong [4] proposed the T2SDSA-FNN (Type-2 Self-Developing and Self-Adaptive Fuzzy Neural Networks) model. Kucukali and Baris [5] used the fuzzy logic approach to forecast short-term electricity demand in Turkey. On the basis of the time-series method, Wang et al. [6] applied the chaotic time series method based on PSO and trend adjustment for electricity demand forecasting in New South Wales. Vilar et al. [7] used the nonparametric functional method to forecast electricity demand and price in Spain. On the basis of the decomposition method, Wang et al. [8] decomposed the electricity demand data of Queensland, Victoria, and the South East Queensland region into a number of components according to seasonality and day of week. Abu-Shikhah and Elkarmi [9] used the singular value decomposition method to decompose the electric load data of the Jordanian power system. While these methods (fuzzy logic systems, time series and decomposition methods) improved the forecasting accuracy in their respective settings, they could not yield the desired accuracy in all electric load forecasting situations. As Moghram and Rahman [10] concluded, there is no single best method that yields the best forecasting accuracy under all circumstances. As a result, the concepts of combination and hybrid models were developed. Combination and hybrid models pool the merits of two, three or more models, and were first used by Bates and Granger [11]. Later, Dickinson [12] showed that the mean absolute error of a combination or hybrid model is lower than that of an individual model, which means that combination and hybrid models perform better than individual models. Combination and hybrid models were then widely used in many applications. For example, a combination model based on the generalized regression neural network with the fruit fly optimization algorithm was proposed by Li et al. [13] and used to forecast the electricity consumption of Beijing and other Chinese cities. Zhang et al. [14] used a new hybrid method combining a chaotic genetic algorithm-simulated annealing and support vector regression to forecast the electric load of Northeast China. Other combination models combined artificial neural networks and fuzzy neural networks with other models [15–18] to predict electricity demand in different regions. All of them were verified on real data to prove the effectiveness of the combination approach. However, the traditional neural network and fuzzy neural network methods have several drawbacks.
They have limited generalization ability, easily fall into local minima, produce unstable training results and usually require a large sample. These drawbacks stem from the optimization algorithms used to select the parameters as well as the statistical measures used to choose the model in the neural network and the fuzzy neural network. Compared with the traditional neural network, the support vector machine (SVM) is a newer solution in the machine learning field. Instead of using the principle of empirical risk minimization (ERM), SVM uses the principle of structural risk minimization (SRM). ERM minimizes the error on the training data, whereas SRM simultaneously minimizes the empirical error and the model complexity, which increases the generalization capacity of the SVM for classification or regression problems. Compared with widely used regression methods, such as one-dimensional linear regression, multiple linear regression and neural network approaches, SVM possesses a concise mathematical form and shows many advantages. Moreover, SVM can effectively handle the practical problems of small sample size, nonlinearity, high dimension and local minimum points. In [19], the authors applied SVM, an optimal training subset and adaptive particle swarm optimization to forecast the electric load in California. However, when solving large-sample problems, SVM faces difficulties, such as the need to solve a quadratic programming problem. The least squares support vector machine (LSSVM) is an improvement of SVM: it replaces the inequality constraints of the standard SVM with equality constraints, so that a set of linear equations is solved instead of a quadratic program. However, a single LSSVM model does not always give satisfactory results, because the selection of the parameters in an LSSVM model is rather arbitrary and there is still no systematic method for parameter selection. Hong [20] used the chaotic particle swarm optimization algorithm to optimize the parameters of support vector regression (SVR). Wang et al. [21] used adaptive particle swarm optimization to determine the parameters of a chaotic system. Shayeghi and Ghasemi applied a chaotic gravitational search algorithm to optimize the parameters of LSSVM [22]. Xu and Chen applied an improved particle swarm optimization algorithm to optimize the parameters of LSSVM [23]. Among the LSSVM parameter optimization algorithms, PSO saves time and is very efficient in searching for suitable parameters of an LSSVM model, so this paper uses PSO to optimize the parameters of the LSSVM, yielding the PLSSVM [24]. Although the above-mentioned methods can generate sufficiently accurate predictions for different cases, in general they have concentrated on improving the accuracy of the models without paying attention to the internal characteristics of the data. In fact, electric load data is usually influenced by unstable factors that cause noise signals, which increases the difficulty of forecasting. Due to the influence of the season, the week and the month, electric load data often presents periodicity and seasonal components. Considering the noise and seasonal aspects of electric load data, forecasting the electric load directly without preprocessing the data is bound to affect the forecasting accuracy.
In this paper, the empirical mode decomposition-based signal filtering method was used to handle the noise signals, and the seasonal adjustment approach was used to eliminate the seasonal factors. First of all, the empirical mode decomposition decomposes the noisy signal into intrinsic oscillatory components called intrinsic mode functions (IMFs) by means of an algorithm referred to as a
sifting process [25]. Some papers regarded the first IMF as noise, see [26,27]; in their simulations, removing the first IMF improved the forecasting accuracy. In the empirical mode decomposition-based signal filtering method, a distortion measure called the consecutive mean square error was proposed to determine which IMFs should be removed; the high-frequency, noise-dominated components identified in this way are then removed from the original data. On the basis of the data series processed by the empirical mode decomposition, a common seasonal adjustment approach was applied to eliminate the seasonal element that affects the modeling procedure. Next, this paper models the processed data series by an LSSVM which is optimized by PSO. Finally, by multiplying the seasonal indexes back into the PLSSVM forecasting results, the final electric load forecasting series is obtained. At the end of this paper, the grey relational analysis (GRA) method is used to choose the best method among LSSVM, ELSSVM, SLSSVM and ESPLSSVM. This choice is a multiple attribute decision making (MADM) problem. In fact, MADM problems exist in daily life: for example, a national electricity network company may have to decide whether to raise or to reduce electricity prices, either according to the number of electricity users or according to the state of the national economy (the state budget, people's income and expenses, the quality of people's lives, etc.). The policy maker therefore has to choose a preferable alternative when facing several criteria. In recent years, a wide variety of methods dealing with MADM problems have been proposed, including the simple additive weighting method, see [28,29], the analytical hierarchy process [30], operational competitiveness rating methods [31], and the grey relational analysis method [32,33] that is used in this paper. The reason for choosing GRA to make the multiple attribute decision is that its result is based on the original data and its calculation is simple and easy to understand.
The rest of this paper is organized as follows. The empirical mode decomposition-based signal filtering method is specified in Section 2. In Section 3, the seasonal adjustment is presented. The particle swarm optimization, the least squares support vector machine model and the least squares support vector machine optimized by particle swarm optimization are given in Section 4. In Section 5, the proposed method is presented. Section 6 displays the experimental results. Finally, Section 7 concludes this paper and discusses its contribution.

2. Empirical mode decomposition-based signal filtering

The theory of empirical mode decomposition-based signal filtering is introduced in this section. Section 2.1 states the theory of empirical mode decomposition, while the empirical mode decomposition-based signal filtering approach is presented in Section 2.2.

2.1. Empirical mode decomposition

The empirical mode decomposition method, which decomposes a signal into a sum of intrinsic mode functions (IMFs), was proposed by Huang et al. [34]. According to [34], any signal y(t) can be decomposed in the following five steps:
(1) Identify all the local extrema, and then connect all the local maxima with a cubic spline line as the upper envelope.
(2) Repeat the procedure for the local minima to produce the lower envelope. The upper and lower envelopes should cover all the data between them.
(3) The mean of the upper and lower envelopes is designated as n1(t), and the difference between the signal y(t) and n1(t) is the first component p1(t):

p1(t) = y(t) − n1(t).    (1)

Ideally, if p1(t) satisfies the definition of an IMF, then it is the first IMF.
(4) In contrast, if p1(t) does not satisfy the definition of an IMF, then p1(t) is treated as the original signal in the next iteration. By repeating process (3), p11(t) = p1(t) − n11(t) is acquired. The sifting process is repeated k times, that is, p1k(t) = p1(k−1)(t) − n1k(t). The stoppage criterion which stops the sifting process is defined as

SD_k = Σ_{t=0}^{T} [ |p1(k−1)(t) − p1k(t)|² / p1(k−1)(t)² ].

According to [35], if SD_k is less than 0.2 or 0.3, then set c1(t) = p1k(t), and c1(t) is the first IMF.
(5) Separate c1(t) from the original signal to get r1(t) = y(t) − c1(t). Treating r1(t) as a new signal and repeating processes (1)–(4), the second IMF and residue are obtained. The second residue is again treated as a new signal and processes (1)–(4) are repeated. Repeating the above process several times gives ri(t) = ri−1(t) − ci(t), i = 2, 3, . . . , n; then ci(t), i = 1, 2, . . . , n, are the n IMFs obtained.

2.2. Empirical mode decomposition-based signal filtering approach

The empirical mode decomposition-based signal filtering method was introduced by Boudraa and Cexus [25]. Given a signal y(t) that is a deterministic signal x(t) corrupted by a noise z(t), the objective is to extract x(t) from y(t), that is, to find an approximation x̂(t) to the original signal x(t) that minimizes the mean square error (MSE) given by
MSE(x, x̂) = (1/T) Σ_{t=1}^{T} [x(t) − x̂(t)]²,    (2)
where T is the length of x(t). Using the above empirical mode decomposition method, y(t) is decomposed into n IMFs and a residue rn(t); then x̂(t) can be reconstructed from the kth to the nth IMFs in the specific form

x̂_k(t) = Σ_{i=k}^{n} IMF_i(t) + r_n(t),  k = 1, 2, . . . , n.

Since x(t) is unknown, the consecutive mean square error (CMSE) was proposed to substitute the measure in Eq. (2). It is defined as follows:
CMSE(x̂_k, x̂_{k+1}) = (1/T) Σ_{t=1}^{T} [x̂_k(t) − x̂_{k+1}(t)]² = (1/T) Σ_{t=1}^{T} IMF_k(t)²,    (3)
where k = 1, . . . , n − 1. If j = arg min_{1≤k≤n−1} CMSE(x̂_k, x̂_{k+1}), then the filtered signal is reconstructed as x̂(t) = Σ_{i=j}^{n} IMF_i(t) + r_n(t).
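To make the filtering step concrete, the sketch below is a minimal Python implementation of the CMSE-based selection of Eq. (3) and the reconstruction above. It is an illustration written for this text, not the authors' code; it assumes the IMFs and the residue have already been obtained by a sifting routine (for example, a third-party EMD package).

```python
import numpy as np

def cmse_filter(imfs, residue):
    """EMD-based signal filtering in the spirit of Boudraa and Cexus [25].

    imfs    : array of shape (n, T); imfs[k-1] holds the kth IMF
    residue : array of shape (T,);   the final residue r_n(t)
    Returns x_hat(t) = sum_{i=j}^{n} IMF_i(t) + r_n(t), where j minimizes
    the consecutive mean square error of Eq. (3).
    """
    imfs = np.asarray(imfs, dtype=float)
    # Eq. (3): CMSE(x_k, x_{k+1}) = (1/T) * sum_t IMF_k(t)^2, for k = 1, ..., n-1
    cmse = np.mean(imfs[:-1] ** 2, axis=1)
    j = int(np.argmin(cmse))  # 0-based row of the 1-based minimizer k* = j + 1
    # keep IMF_{k*}, ..., IMF_n (rows j, ..., n-1) plus the residue
    return imfs[j:].sum(axis=0) + np.asarray(residue, dtype=float)
```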
3. Seasonal adjustment

Given a data series y1, y2, . . . , yT (T = ms), arrange it as the m × s array

y11  y12  . . .  y1s
y21  y22  . . .  y2s
 .    .          .
ym1  ym2  . . .  yms

Then the average value of each row is ȳ_k = (y_k1 + y_k2 + · · · + y_ks)/s (k = 1, 2, . . . , m), and each datum is divided by the mean value of its row, i.e. I_kl = y_kl/ȳ_k (k = 1, 2, . . . , m; l = 1, 2, . . . , s). Then the ith seasonal index is found:

I_i = (I_1i + I_2i + · · · + I_mi)/m,  i = 1, 2, . . . , s.    (4)
Using the above seasonal indexes, the seasonal effect can be eliminated according to Eq. (5):

y'_ki = y_ki/I_i,  k = 1, 2, . . . , m;  i = 1, 2, . . . , s.    (5)
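As an illustration of Eqs. (4) and (5), the following Python sketch computes the seasonal indexes of a series with period s, removes the seasonal effect and restores it after forecasting. The function and variable names are chosen here (not taken from the paper), and the series length is assumed to be a multiple of s; in the case study of Section 6 the period is s = 48 half-hourly observations per day.

```python
import numpy as np

def seasonal_indexes(y, s):
    """Seasonal indexes I_1, ..., I_s of Eq. (4); len(y) must be a multiple of s."""
    blocks = np.asarray(y, dtype=float).reshape(-1, s)    # m rows of length s
    ratios = blocks / blocks.mean(axis=1, keepdims=True)  # I_kl = y_kl / ybar_k
    return ratios.mean(axis=0)                            # average over the m rows

def remove_seasonality(y, idx):
    """Eq. (5): divide every observation by the index of its position in the cycle."""
    y = np.asarray(y, dtype=float)
    return y / np.tile(idx, len(y) // len(idx))

def restore_seasonality(forecast, idx):
    """Multiply the forecasts by the seasonal indexes to obtain the final values."""
    forecast = np.asarray(forecast, dtype=float)
    return forecast * np.tile(idx, len(forecast) // len(idx))
```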
4. Least squares support vector machine optimized by the particle swarm optimization model

This section introduces the theory of the least squares support vector machine optimized by the particle swarm optimization model. The two parameters, the regularization parameter γ and the kernel parameter σ, are optimized by PSO. Section 4.1 states the theory of the least squares support vector machine model. The particle swarm optimization algorithm is introduced in Section 4.2. The combined model PLSSVM is presented in Section 4.3.

4.1. Least squares support vector machine model

The least squares support vector machine was developed by Suykens et al. [36]. Given a training data set {x_j, y_j}, j = 1, . . . , N, with x_j ∈ R^n and y_j ∈ {−1, 1}, the LSSVM has to find the optimal (maximum-margin) separating hyperplane in order to classify the training set with good generalization ability. All separating hyperplanes have the following representation in the feature space: y(x) = w^T φ(x) + b, where w is the weight vector, φ(x) is a nonlinear mapping from the input space to a feature space, and b is the bias term. For the function estimation problem, structural risk minimization is used to formulate the following optimization problem:
Minimize:   (1/2) W^T W + (γ/2) Σ_{j=1}^{N} ξ_j²,    (6)

Subject to:  y_j = W^T φ(x_j) + b + ξ_j,  j = 1, 2, . . . , N,    (7)
where ξ_j represents the error variable at time j and γ represents the regularization constant. To derive the solution, the Lagrange function takes the form

L(W, b, ξ, α) = (1/2) W^T W + (γ/2) Σ_{j=1}^{N} ξ_j² − Σ_{j=1}^{N} α_j [W^T φ(x_j) + b + ξ_j − y_j],

where α_j is the Lagrange multiplier. The final non-linear function estimate of the LSSVM can be written as Y_j = f(X_j) = Σ_{j=1}^{N} α_j K(X, X_j) + b, where K(X, X_j) is the kernel function. Since the Gaussian radial basis function (RBF) is the most effective one in nonlinear regression problems, this paper used the RBF kernel, which can be expressed as

K(X, X_j) = exp( −‖X − X_j‖² / (2σ²) ).
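To show how this model can be trained in practice, the following Python sketch solves the LSSVM in its dual form, i.e. the linear system in (b, α) that follows from the optimality conditions of the Lagrangian above (see Suykens et al. [36]). It is an illustrative implementation written for this text, with names chosen here, not the authors' code.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel K(a, b) = exp(-||a - b||^2 / (2 sigma^2)) for row vectors."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve the LSSVM dual system
           [ 0      1^T        ] [ b     ]   [ 0 ]
           [ 1   K + I / gamma ] [ alpha ] = [ y ]
       where K_ij = K(x_i, x_j).  X has shape (N, d), y has shape (N,)."""
    N = len(y)
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(N) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[1:], sol[0]  # alpha, b

def lssvm_predict(X_train, alpha, b, sigma, X_new):
    """f(x) = sum_j alpha_j K(x, x_j) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b
```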
4.2. Particle swarm optimization algorithm

Particle swarm optimization is an evolutionary computational technique proposed by Eberhart and Kennedy [37]. The PSO is based on an analogy with the biological and sociological behavior of animals, for example birds and insects. It is initialized with a set of potential solutions, each of which is called a particle. Each particle then changes its position by adjusting it with respect to its previous best position and the global best position, until the optimal position has been reached or the computational limit is exceeded. Consider a swarm consisting of q particles flying in an m-dimensional search space. The current position and velocity of particle i are defined as x_i = (x_i1, x_i2, . . . , x_im) and v_i = (v_i1, v_i2, . . . , v_im), respectively. In addition, v_i ∈ (−v_max, v_max), where v_max controls the maximum global exploration ability of the PSO. The previous best position of each particle, called the local best, is denoted pbest_i = (pbest_i1, pbest_i2, . . . , pbest_im), and the best position among all particles, called the global best, is denoted gbest_i = (gbest_i1, gbest_i2, . . . , gbest_im). Then the velocity and position of particle i are updated by the following equations:
v_i^(k+1) = w · v_i^k + c1 · r1 · (pbest_i^k − x_i^k) + c2 · r2 · (gbest_i^k − x_i^k),    (8)

x_i^(k+1) = x_i^k + a · v_i^k,    (9)
where w is the inertia weight factor; r1 and r2 are two independent random variables uniformly distributed in the range [0, 1]; and c1 and c2 are two positive constants called acceleration coefficients.

4.3. Least squares support vector machine optimized by the particle swarm optimization

The parameter optimization problem of the LSSVM is usually converted into a parametric estimation problem of a multiple linear regression function. The regularization parameter γ and the kernel parameter σ have a great influence on the prediction accuracy of the LSSVM, so in order to obtain a better prediction result it is necessary to tune these two parameters. In this paper, the PSO algorithm was used to optimize the parameters. The specific process of the PLSSVM is described below and the flowchart is shown in Fig. 1.
Fig. 1. Flowchart of the PLSSVM algorithm: initialization; calculation of the fitness function of Eq. (10); update of pbest_i and gbest_i according to the fitness; update of the velocity v_i by Eq. (8) and the position x_i by Eq. (9); loop until the stopping condition is satisfied; output of the optimal parameters.
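Before the step-by-step description, the sketch below gives a compact Python version of the PSO loop of Fig. 1, implementing the velocity and position updates of Eqs. (8) and (9) for a generic fitness function. It is a simplified illustration written for this text (constant inertia weight, a = 1 in Eq. (9), box constraints on the parameters), not the authors' exact configuration; in the PLSSVM the particle position would hold the two LSSVM parameters (γ, σ) and the fitness would be the training error of Eq. (10).

```python
import numpy as np

def pso_minimize(fitness, dim, n_particles=30, iters=100,
                 w=0.9, c1=1.5, c2=0.7, vmax=1.0, bounds=(1e-2, 1e3)):
    """Basic particle swarm optimization using the updates of Eqs. (8)-(9)."""
    lo, hi = bounds
    x = np.random.uniform(lo, hi, (n_particles, dim))       # positions
    v = np.random.uniform(-vmax, vmax, (n_particles, dim))  # velocities
    pbest = x.copy()
    pbest_val = np.array([fitness(p) for p in x])
    gbest = pbest[np.argmin(pbest_val)].copy()               # global best position
    for _ in range(iters):
        r1 = np.random.rand(n_particles, dim)
        r2 = np.random.rand(n_particles, dim)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)  # Eq. (8)
        v = np.clip(v, -vmax, vmax)
        x = np.clip(x + v, lo, hi)                                   # Eq. (9), a = 1
        val = np.array([fitness(p) for p in x])
        improved = val < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = val[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, float(pbest_val.min())
```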
Step1. Initialization. The quantity of the population q is initialized; the initial position and velocity of each particle are randomly assigned. Step2. Fitness evaluation. For each particle, evaluate its fitness. In this paper, the fitness function is defined as the following:
Fitness = (1/N) Σ_{i=1}^{N} |x_i − x̂_i| / x_i,    (10)
where x_i and x̂_i represent the actual and forecast values, respectively.
Step3. Update pbest_i and gbest_i according to the fitness evaluation results.
Step4. Update the velocity of each particle v_i according to Eq. (8), and the position of each particle x_i according to Eq. (9).
Step5. Termination. The velocity and position of each particle are updated until the stop conditions are satisfied.

5. The proposed method

In electricity markets, electric load data is affected by a number of uncertain factors. Most of these factors cannot be controlled, and they may affect the data acquisition process, the data pre-processing and the prediction accuracy; moreover, any of these factors can introduce noise. As a result, it is difficult to forecast the electric load accurately, and the data therefore require special treatment in electric load forecasting. To address the problem of noisy data caused by uncertain factors, this paper used the empirical mode decomposition-based signal filtering approach, whose advantages were discussed in Section 2. The trend component of an electric load data series is often influenced by seasonal factors, so it usually contains seasonal components. In order to remove the seasonal components and further improve the forecasting accuracy, the seasonal adjustment approach described above is used in this paper. After processing by the empirical mode decomposition and the seasonal adjustment, this paper applied the LSSVM optimized by PSO to the processed data series to obtain the final forecasting results. In a word, the proposed method, referred to as ESPLSSVM, combines the least squares support vector machine optimized by particle swarm optimization with the empirical mode decomposition-based signal filtering and the seasonal adjustment. The specific process of the ESPLSSVM is described below and the flowchart is shown in Fig. 2.
Step1. Noise reduction. Use empirical mode decomposition-based signal filtering to reduce the noise interference of the original signal.
Step2. Seasonal adjustment. Remove the seasonal component from the signal resulting from Step1 and obtain the trend component.
Step3. Least squares support vector machine forecasting. Feed the data processed in Step1 and Step2 into the PLSSVM model to obtain the future values.
Step4. Gain the final forecasts. Restore the seasonal factor to the predicted values above to obtain the final results.

6. Simulation

The main purpose of this section is the simulation of the four methods. Section 6.1 states the data collection process. The statistical measures of forecasting performance are presented in Section 6.2. Section 6.3 introduces the specific forecasting process of the four methods. The comparison between the four methods is shown in Section 6.4, and finally an accuracy test is made in Section 6.5.

6.1. Data collection process

First, the electric load data of South Australia (SA) from 1999 to 2012 were downloaded from http://www.aemo.com.au/. These data were then divided into seven groups according to the day of the week (Monday, Tuesday, Wednesday and so on). After a large number of experiments and simulations on these data, the following data period was selected: the feasibility data used in this paper cover the period from 2nd May, 2011 to 3rd July, 2011 (9 weeks in all). The grouping and splitting of these data are sketched below.
Among these data, the electric load data from 2nd May, 2011 to 26th June, 2011 were used for model fitting and training, and the constructed models were then applied to forecast the electric load from 27th June, 2011 to 3rd July, 2011. The electric load curves for the South Australia area from 2nd May, 2011 to 26th June, 2011 are outlined in Fig. 3. As Fig. 3 shows, the data for model fitting and training were divided into seven groups. Each group consisted of 8 days of electric load data and each day had 48 observations, so each group contained 384 electric load data points. The corresponding forecasting methods used the Monday group data to forecast the following Monday, the Tuesday group data to forecast the following Tuesday, and so on. For example, the data period from 2nd May, 2011 to 20th June, 2011 was used to forecast the electric load of 27th June, 2011, and the data period from 3rd May, 2011 to 21st June, 2011 was used to forecast the electric load of 28th June, 2011. In other words, for each day of the week, all 8 weeks of electric load data were used to forecast the same weekday of the ninth week; the forecasting horizon is one week ahead.

Fig. 2. Flowchart of the proposed ESPLSSVM algorithm.

Fig. 3. Electric load data from 2nd May, 2011 to 26th June, 2011.

6.2. Statistical measures of forecasting performance

In this paper, the following three criteria were employed to evaluate the four methods: the root mean square error (RMSE), the mean absolute error (MAE) and the mean absolute percentage error (MAPE), which are calculated as
RMSE = sqrt( (1/N) Σ_{i=1}^{N} (x_i − x̂_i)² ),    (11)
MAE = (1/N) Σ_{i=1}^{N} |x_i − x̂_i|,    (12)

MAPE = (1/N) Σ_{i=1}^{N} |x_i − x̂_i| / x_i × 100%,    (13)

where x_i represents the actual value and x̂_i represents the forecasted value.
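A direct Python transcription of Eqs. (11)–(13) is given below (an illustrative helper, with names chosen here):

```python
import numpy as np

def rmse(actual, forecast):
    """Root mean square error, Eq. (11)."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.sqrt(np.mean((a - f) ** 2)))

def mae(actual, forecast):
    """Mean absolute error, Eq. (12)."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean(np.abs(a - f)))

def mape(actual, forecast):
    """Mean absolute percentage error, Eq. (13), in percent."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean(np.abs(a - f) / a) * 100.0)
```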
6.3. Specific forecasting process of the four methods

To highlight the advantage of the ESPLSSVM method, in this section it is compared with three other methods, i.e. LSSVM (the basic least squares support vector machine), ELSSVM (least squares support vector machine with empirical mode decomposition-based signal filtering) and SLSSVM (least squares support vector machine with seasonal adjustment). The detailed forecasting process of the four methods is described as follows.
Firstly, the ESPLSSVM uses empirical mode decomposition-based signal filtering to eliminate the noise interference from the original data; the resulting data are called the de-noised data. Fig. 4 shows this noise-elimination process for Thursday. Comparing the original data with the de-noised data, it can be seen that the de-noised data became a little smoother. So, instead of the original series, the proposed ESPLSSVM method used the de-noised data for model fitting and training.
Observation reveals that seasonal factors exist in the electric load data, so ESPLSSVM used the seasonal adjustment to eliminate the seasonal factors from the de-noised data. The actual electric load data of 5th May, 2011 (Thursday group) and the electric load data after eliminating the seasonal factors from the original series are shown in Table 1 and Fig. 5. Table 1 also lists the seasonal indexes, where season index1 and season index2 represent the seasonal indexes of the ELSSVM method and the ESPLSSVM method, respectively. As anticipated, no prominent seasonal pattern can be seen in Fig. 5. The data were collected every half hour, so there were 48 observations per day and the period is s = 48.
Next, the ESPLSSVM used the LSSVM optimized by the PSO for the forecasting itself. Each group of data was normalized, and each normalized series was divided into training sets and test sets, where the training sets were the first 8 days of electric load data of each of the seven groups, and the test sets were the latter 8 days of electric load data, among which was the electric load to be predicted. The parameters of the PSO in the ESPLSSVM were assigned as follows: the population size was 100, the acceleration coefficients c1 and c2 were 1.5 and 0.7, respectively, and the maximum velocity threshold was vmax = 500. After the PLSSVM had predicted the 48 values of one day, and had done the same for each of the other groups, the ESPLSSVM obtained the final electric load forecasts by multiplying the corresponding seasonal indexes back into those forecasting results.
For the other three methods, to highlight their differences from ESPLSSVM, all three used the plain LSSVM for prediction, whereas the proposed method used the LSSVM optimized by the PSO. In addition, the LSSVM simulated the original data directly, the ELSSVM simulated the de-noised data with the intrinsic seasonal factors left unhandled, and the SLSSVM simulated the seasonal-factor-removed data with the intrinsic noise left unhandled. Fig. 6 shows the final predicted values for each day by these four methods.
Fig. 4. The noise-elimination process for Thursday.
Table 1
Electric load data after eliminating seasonal factors.

Time    Original data   Season index1   Season index2   Load demand after eliminating the seasonal factor
0:00    8663            0.9668          0.9672          8960.488
0:30    8331.68         0.9327          0.9347          8932.862
1:00    8146.03         0.9105          0.9112          8946.766
1:30    7955.81         0.8912          0.8832          8927.076
2:00    7735.64         0.8668          0.8535          8924.365
2:30    7368.79         0.8269          0.8240          8911.344
3:00    7066.1          0.7930          0.7960          8910.593
3:30    6885.75         0.7692          0.7722          8951.833
4:00    6794.4          0.7553          0.7575          8995.631
4:30    6790.57         0.7531          0.7558          9016.824
5:00    6885.67         0.7619          0.7709          9037.498
5:30    7146.35         0.7962          0.8059          8975.571
6:00    7525.03         0.8438          0.8615          8918.026
6:30    8234.82         0.9343          0.9320          8813.893
7:00    8889.66         1.0185          1.0032          8728.189
7:30    9171.06         1.0653          1.0597          8608.899
8:00    9475.62         1.1055          1.0914          8571.343
8:30    9550.36         1.1050          1.1007          8642.860
9:00    9444.62         1.0861          1.0972          8695.903
9:30    9446.89         1.0810          1.0895          8739.029
10:00   9453.11         1.0736          1.0804          8805.058
10:30   9482.14         1.0658          1.0694          8896.735
11:00   9458.94         1.0571          1.0561          8948.009
11:30   9392.81         1.0402          1.0418          9029.812
12:00   9263.38         1.0273          1.0276          9017.210
12:30   9213.65         1.0196          1.0150          9036.534
13:00   9184.33         1.0089          1.0049          9103.311
13:30   9178.39         1.0051          0.9962          9131.818
14:00   9168.99         1.0012          0.9887          9158.000
14:30   9132.81         0.9981          0.9831          9150.195
15:00   9098.59         0.9929          0.9829          9163.652
15:30   9094.74         0.9929          0.9908          9159.774
16:00   9169.09         1.0016          1.0087          9154.443
16:30   9246.13         1.0184          1.0379          9079.075
17:00   9403.17         1.0531          1.0773          8929.038
17:30   9784.86         1.1235          1.1214          8709.266
18:00   10398.1         1.1870          1.1597          8759.983
18:30   10434.5         1.1917          1.1830          8755.979
19:00   10303.7         1.1832          1.1899          8708.333
19:30   10116.32        1.1621          1.1819          8705.206
20:00   9902.58         1.1434          1.1610          8660.644
20:30   9730.77         1.1302          1.1305          8609.777
21:00   9633.3          1.1117          1.0978          8665.377
21:30   9353            1.0768          1.0701          8685.921
22:00   9062.06         1.0389          1.0480          8722.745
22:30   9099.51         1.0388          1.0296          8759.636
23:00   8889.21         1.0069          1.0104          8828.295
23:30   8784.72         0.9871          0.9883          8899.524
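As a check on Table 1, the last column is simply Eq. (5) applied to the first two columns, i.e. the original observation divided by its seasonal index (season index1 here): for the first row, 8663 / 0.9668 ≈ 8960.49, which matches the listed load demand after eliminating the seasonal factor.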
In order to clearly illustrate the predicted values, the actual and predicted electric load values of the four methods on Thursday are listed in Table 2 and plotted in Fig. 7. From Fig. 7 it can be seen that, compared with the LSSVM and the ELSSVM, the curve of the proposed ESPLSSVM method is more consistent with the original data curve. However, when the proposed ESPLSSVM is compared with the SLSSVM, the SLSSVM curve is more consistent with the original data curve, which means that the SLSSVM performed better than ESPLSSVM on Thursday. This does not mean that the SLSSVM performs better than ESPLSSVM for the whole week. In order to show that ESPLSSVM is indeed better than SLSSVM, the RMSE, MAE and MAPE of the four methods were calculated, and an accuracy test is made in Section 6.5.
6.4. Comparison between the four methods

According to the calculation methods of the three statistical measures in Section 6.2, the RMSE, MAE and MAPE of the four methods were calculated in this section and are recorded in Table 3, which shows the RMSE, MAE and MAPE of the four methods (LSSVM, ELSSVM, SLSSVM and ESPLSSVM) on each day together with their averages over the whole week.
Fig. 5. Electric load data of 5th May and electric load data after eliminating seasonal factor.
Fig. 6. Final predicted values for each day by the four methods.
As previously stated, the three statistical measures reveal much detailed information. Comparing the ELSSVM with the LSSVM, the SLSSVM with the LSSVM, and the SLSSVM with the ELSSVM gives the following results.
When comparing the ELSSVM with the LSSVM: from Table 3, considering the three measures RMSE, MAE and MAPE of each day, the ELSSVM gave lower values than the LSSVM except on Tuesday, Wednesday, Thursday and Friday (it is worth noting that the MAPE of ELSSVM is still lower than that of LSSVM on Thursday). Although the RMSE, MAE and MAPE of the LSSVM were lower than those of the ELSSVM on almost 4 days, in terms of the average values over the whole week the ELSSVM had lower values than the LSSVM: the ELSSVM reduced the RMSE by 9.66%, the MAE by 16.18% and the MAPE by 18.60%.
When comparing the SLSSVM with the LSSVM: from Table 3, considering the three measures of each day, the SLSSVM gave lower values than the LSSVM for every day of the week, except that the MAE and MAPE of SLSSVM are higher than those of LSSVM on Tuesday and the MAE of SLSSVM is higher than that of
Table 2
Actual electric load values and values predicted by the four methods on Thursday (all values in MW).

Time (h)   Actual      LSSVM       ELSSVM      SLSSVM      ESPLSSVM
0:00       9330.92     8834.580    8852.787    9247.965    9130.561
0:30       8936.76     8586.656    8959.137    8987.253    8829.917
1:00       8753.75     8412.572    8715.271    8760.657    8779.536
1:30       8583.2      8231.414    8490.960    8576.733    8556.327
2:00       8342.05     7958.183    8173.865    8350.938    8271.477
2:30       7949.09     7507.634    7773.583    7964.878    7972.443
3:00       7583.75     7103.557    7383.739    7643.442    7664.199
3:30       7327.52     6840.244    7107.441    7400.384    7371.502
4:00       7195.26     6706.436    6990.620    7259.106    7169.805
4:30       7140.74     6685.576    7041.446    7231.504    7108.786
5:00       7245.84     6762.928    7282.730    7312.868    7218.327
5:30       7538.59     7089.395    7760.782    7657.868    7540.290
6:00       7978.97     7651.697    8396.873    8125.164    8139.138
6:30       8814.22     8499.253    8855.217    9028.319    8969.142
7:00       9607.01     8959.628    9051.896    9859.635    9759.097
7:30       9987.29     9223.555    9247.740    10329.797   10337.073
8:00       10401.3     9571.377    9457.423    10724.850   10635.898
8:30       10515.62    9563.854    9607.047    10717.751   10682.125
9:00       10429.45    9463.104    9683.790    10518.698   10571.482
9:30       10413.28    9485.278    9681.467    10467.211   10434.098
10:00      10299.43    9456.266    9591.276    10383.601   10343.346
10:30      10240.43    9454.809    9447.959    10298.097   10270.655
11:00      10147.72    9421.668    9319.983    10192.317   10159.216
11:30      9948.32     9341.928    9228.650    10005.900   10002.606
12:00      9775.36     9316.512    9141.574    9842.641    9843.915
12:30      9742.8      9248.294    9047.105    9780.193    9735.805
13:00      9745.13     9180.751    8943.378    9683.524    9677.612
13:30      9666.1      9193.758    8874.430    9616.552    9611.699
14:00      9573.78     9163.359    8845.193    9605.595    9538.899
14:30      9568.54     9129.022    8833.121    9557.376    9474.973
15:00      9524.79     9104.491    8860.661    9499.966    9454.494
15:30      9568.38     9096.328    8939.651    9505.453    9509.993
16:00      9711.49     9140.304    9070.765    9602.044    9667.348
16:30      9939.43     9289.613    9251.153    9769.838    9947.779
17:00      10382.47    9467.119    9500.630    10165.905   10348.109
17:30      11066.64    10027.624   9822.825    10869.472   10805.000
18:00      11587.02    10704.321   10137.310   11490.789   11200.116
18:30      11602.13    10688.375   10310.073   11542.775   11435.434
19:00      11527.36    10510.577   10313.006   11464.896   11502.792
19:30      11292.44    10224.493   10169.631   11257.390   11422.924
20:00      11065.19    10002.806   9905.066    11078.309   11218.981
20:30      10908.68    9805.113    9576.043    10953.988   10922.144
21:00      10697.34    9637.177    9289.382    10772.273   10599.175
21:30      10335.75    9325.824    9115.640    10429.296   10314.155
22:00      9962.38     9060.496    9022.429    10057.580   10070.156
22:30      9963.98     9165.391    8967.789    10017.996   9849.603
23:00      9676.46     8999.134    8914.420    9680.144    9621.414
23:30      9503.5      8923.793    8863.811    9466.647    9357.722
Fig. 7. Predicted electric load values by the four methods on Thursday.
Table 3
Three statistical measures of the four forecasting methods.

RMSE
Date         LSSVM       ELSSVM      SLSSVM      ESPLSSVM
Monday       1188.6597   541.3885    378.9353    370.8248
Tuesday      486.6158    714.0457    476.9515    171.3476
Wednesday    229.4437    241.1863    191.9124    200.5389
Thursday     708.1429    789.2288    119.0024    117.4811
Friday       979.9081    1027.3444   152.1427    128.1433
Saturday     283.0341    228.4812    223.7391    180.8896
Sunday       443.8858    360.8113    272.1641    251.9379
Whole week   617.0986    557.4980    259.2639    203.0233

MAE
Date         LSSVM       ELSSVM      SLSSVM      ESPLSSVM
Monday       1109.7460   416.9081    326.3475    303.2652
Tuesday      405.9450    528.3299    454.3615    140.7981
Wednesday    166.1144    184.8493    174.4376    169.6065
Thursday     664.1955    682.5333    89.4909     87.9658
Friday       864.6368    898.9610    110.8835    109.8307
Saturday     230.4355    182.3881    185.6552    141.3494
Sunday       375.8896    305.0325    236.9473    224.9034
Whole week   545.2804    457.0003    225.4462    168.7963

MAPE (%)
Date         LSSVM   ELSSVM   SLSSVM   ESPLSSVM
Monday       11.42   4.07     3.33     3.05
Tuesday      4.20    5.12     4.50     1.45
Wednesday    1.94    2.12     1.81     1.71
Thursday     6.77    6.73     0.94     0.92
Friday       8.68    8.94     1.17     1.11
Saturday     2.54    2.00     2.13     1.61
Sunday       4.36    3.53     2.78     2.69
Whole week   5.70    4.64     2.38     1.79
Fig. 8. Bar figures comparing the three prediction errors.
LSSVM on Wednesday too. In terms of the average values over the whole week, the SLSSVM also had lower values than the LSSVM: the SLSSVM reduced the RMSE by 57.99%, the MAE by 58.65% and the MAPE by 58.25%.
When comparing the SLSSVM with the ELSSVM: from Table 3, considering the three measures of each day, the SLSSVM gave lower values than the ELSSVM for every day of the week, except that the MAE and MAPE of SLSSVM are higher than those of ELSSVM on Saturday. In terms of the average values over the whole week, the SLSSVM also has lower values than the ELSSVM: the SLSSVM reduced the RMSE by 53.50%, the MAE by 50.67% and the MAPE by 48.71%.
In general, the ELSSVM was better than the LSSVM on most days of the week, and the SLSSVM was better than the ELSSVM and the LSSVM on almost every day of the week. Among the three methods, the SLSSVM had the best performance relative to the LSSVM and the ELSSVM. If the merits of the ELSSVM and the SLSSVM are combined and the two parameters (γ, σ) of the LSSVM are optimized by PSO, the result should be very good, so the proposed ESPLSSVM method can be expected to acquire better results. From Table 3, considering the three measures RMSE, MAE and MAPE of each day, the ESPLSSVM gave lower values than the other three methods for every day of the week, except that the RMSE of ESPLSSVM is higher than that of SLSSVM on Wednesday. In terms of the average values over the whole week, the ESPLSSVM also has lower values than the other three methods: compared with the ELSSVM and the SLSSVM, the ESPLSSVM reduced the RMSE by 63.58% and 21.69%, the MAE by 63.06% and 25.13%, and the MAPE by 61.42% and 24.79%, respectively. In order to further illustrate these results, bar figures comparing the three prediction errors of the four methods are displayed in Fig. 8. From Fig. 8, the following conclusion can be drawn: the RMSE, the MAE and the MAPE of the LSSVM, ELSSVM, SLSSVM and ESPLSSVM decrease in that order. As mentioned above, the smaller the three statistical measures are, the better the method is; that is to say, the ESPLSSVM is the most accurate forecasting method compared with the other three methods.
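For example, the whole-week RMSE reduction of the ESPLSSVM relative to the SLSSVM quoted above follows directly from Table 3:

(259.2639 − 203.0233) / 259.2639 × 100% ≈ 21.69%,

and the other reduction percentages in this section are computed in the same way from the whole-week averages.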
6.5. Accuracy test

In addition to the three statistical measures defined in Section 6.2, there are other forecasting accuracy analysis methods, such as the absolute error, the mean square error and grey relational analysis (GRA). Compared with traditional statistical measures, which require large amounts of data and a standard distribution of the data, grey relational analysis can analyze many factors using fewer data. Therefore, in the following, the GRA was used to evaluate the four forecasting results. The GRA is a fairly new technique put forward on the basis of grey system theory. Fig. 9 presents a synoptic chart illustrating the mechanism of grey relational analysis. The reference data sequence indicates an ideal solution to a problem, meaning that it has the best performance when all the criteria are considered; the candidate solution data sequences are the data sequences to be evaluated. The grey relational analysis provides an impact evaluation model that measures the degree of similarity between each
candidate solution and the reference based on the grade of relation. A candidate solution that possesses a higher grade of relation is more similar to the reference and is taken to be closer to the ideal solution. More specifically, the grey relational analysis includes the following six steps [38]:
Step1. Initialization. Define the reference sequence as x0 = {x0(1), x0(2), . . . , x0(n)}, and the candidate solutions to be compared with the reference sequence (the forecasting sequences) as the matrix
x1(1)  x1(2)  . . .  x1(n)
x2(1)  x2(2)  . . .  x2(n)
 .      .            .
xm(1)  xm(2)  . . .  xm(n)

Step2. Normalization. Based on the larger-the-better criterion, the reference sequence and the forecasting sequences defined in Step1 are normalized as follows:

x'_i(j) = ( x_i(j) − min_{i} x_i(j) ) / ( max_{i} x_i(j) − min_{i} x_i(j) ),

where x_i(j) is an entity in the reference sequence or a forecasting sequence.
Step3. From Step2, the normalized reference sequence x'_0 = {x'_0(1), x'_0(2), . . . , x'_0(n)} and the normalized forecasting sequences

x'_1(1)  x'_1(2)  . . .  x'_1(n)
x'_2(1)  x'_2(2)  . . .  x'_2(n)
  .        .             .
x'_m(1)  x'_m(2)  . . .  x'_m(n)

are obtained.
Step4. Calculate the difference between each normalized entity and its reference value, and construct the difference matrix as follows:

Δ_0i(j) = |x'_0(j) − x'_i(j)|,  i = 1, 2, . . . , m;  j = 1, 2, . . . , n,

Δ = [ Δ_01(1)  Δ_01(2)  . . .  Δ_01(n)
      Δ_02(1)  Δ_02(2)  . . .  Δ_02(n)
        .        .             .
      Δ_0m(1)  Δ_0m(2)  . . .  Δ_0m(n) ].

Step5. Calculate the grey relational coefficient for each entity:

ξ_0i(j) = ( min_{i} min_{j} Δ_0i(j) + ρ · max_{i} max_{j} Δ_0i(j) ) / ( Δ_0i(j) + ρ · max_{i} max_{j} Δ_0i(j) ),

where ρ (0 ≤ ρ ≤ 1) is known as the distinguishing coefficient or the index for distinguishability. The smaller ρ is, the higher its distinguishability. In most situations, ρ takes the value 0.5 because this value usually offers moderate distinguishing effects and good stability [39].
Fig. 9. Comparison of data sequence in grey relation analysis.
Table 4
Grey relational degrees of the four methods on each day.

Date         LSSVM   ELSSVM   SLSSVM   ESPLSSVM
Monday       0.611   0.703    0.729    0.796
Tuesday      0.614   0.761    0.881    0.869
Wednesday    0.623   0.577    0.576    0.695
Thursday     0.563   0.741    0.853    0.906
Friday       0.599   0.554    0.913    0.916
Saturday     0.609   0.655    0.725    0.763
Sunday       0.629   0.655    0.746    0.686
Whole week   0.607   0.664    0.775    0.805
Table 5
Comparison between MFES and ESPLSSVM.

             RMSE                       MAE                        MAPE (%)
Date         MFES       ESPLSSVM        MFES       ESPLSSVM        MFES     ESPLSSVM
Monday       600.5528   370.8248        481.3816   303.2652        4.7902   3.05
Tuesday      236.1970   171.3476        191.4427   140.7981        1.9099   1.45
Wednesday    132.8307   200.5389        105.8698   169.6065        1.1094   1.71
Thursday     125.6975   117.4811        99.0080    87.9658         1.0716   0.92
Friday       151.1043   128.1433        127.5911   109.8307        1.3551   1.11
Saturday     340.1020   180.8896        304.7845   141.3494        3.5294   1.61
Sunday       303.7013   251.9379        259.1748   224.9034        3.2031   2.69
Whole week   311.8225   203.0233        224.1789   168.7963        2.4241   1.79
Step 6. Calculate the grey relational degree. The relational degree between two sequences is expressed as the average value of the relational coefficients, in order to show the relationship over the entire system [40]:

r_0i = (1/n) Σ_{k=1}^{n} ξ_0i(k),

where ξ_0i(k) is the grey relational coefficient at point k.
In this paper, the reference sequence was the original electric load values of each day, and the candidate solutions were the electric load values forecasted by the four methods mentioned above. The parameters in Step1 were n = 48 and m = 4, and by calculation the grey relational degrees listed in Table 4 were obtained.
As listed in Table 4, looking at the average grey relational degree over the whole week, ESPLSSVM has the highest value, which means that ESPLSSVM had the best performance. The values of the three remaining methods decrease in turn, which means that the ESPLSSVM performed better than the SLSSVM, the SLSSVM performed better than the ELSSVM and the ELSSVM performed better than the LSSVM. Looking at the values of each day, on Monday, Thursday and Saturday the grey relational degrees of the four methods decrease in the order ESPLSSVM, SLSSVM, ELSSVM and LSSVM, which is the same ordering as above. On Tuesday, Wednesday, Friday and Sunday, however, there are slight changes in the result, which are discussed in the following three cases.
On Tuesday and Sunday, comparing the three methods LSSVM, ELSSVM and ESPLSSVM shows that the ESPLSSVM performed better than the ELSSVM and the ELSSVM performed better than the LSSVM; comparing the SLSSVM with the ESPLSSVM shows that the SLSSVM performed better than the ESPLSSVM. So the order on these 2 days was: SLSSVM, ESPLSSVM, ELSSVM and LSSVM.
On Wednesday, comparing the three methods LSSVM, ELSSVM and SLSSVM shows that the LSSVM performed better than the ELSSVM and the ELSSVM performed better than the SLSSVM; comparing the ESPLSSVM with the LSSVM shows that the ESPLSSVM performed better than the LSSVM. So the order of Wednesday's performance was ESPLSSVM, LSSVM, ELSSVM and SLSSVM.
On Friday, comparing the three methods LSSVM, ELSSVM and SLSSVM shows that the SLSSVM performed better than the LSSVM and the LSSVM performed better than the ELSSVM; comparing the ESPLSSVM with the SLSSVM shows that the ESPLSSVM performed better than the SLSSVM. So the order of Friday's performance was ESPLSSVM, SLSSVM, LSSVM and ELSSVM.
In a word, considering the grey relational degrees in Table 4, ESPLSSVM performed better than the other three methods on 5 days (Monday, Wednesday, Thursday, Friday and Saturday) as well as in the whole-week average. Although SLSSVM performed better than ESPLSSVM on Tuesday and Sunday, for the 7 days of the week there are only 2 days on which the ESPLSSVM method performed worse than SLSSVM. So, generally speaking, ESPLSSVM is superior to the other three methods.
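The six GRA steps are easy to reproduce; the following Python sketch (an illustrative implementation written for this text, not the authors' code) computes the grey relational degree of each forecasting sequence against the actual load of one day with ρ = 0.5. Here the reference sequence is included in the column-wise minima and maxima used for normalization, a modelling choice made for this sketch.

```python
import numpy as np

def grey_relational_degrees(reference, candidates, rho=0.5):
    """Grey relational degree of each candidate sequence with respect to the reference.

    reference  : array of shape (n,)    -- actual load of one day (n = 48 here)
    candidates : array of shape (m, n)  -- forecasts of the m = 4 methods
    """
    seqs = np.vstack([reference, candidates]).astype(float)
    # Step2: larger-the-better normalization, column by column
    lo, hi = seqs.min(axis=0), seqs.max(axis=0)
    norm = (seqs - lo) / (hi - lo)
    # Step4: differences between each normalized candidate and the reference
    delta = np.abs(norm[0] - norm[1:])
    # Step5: grey relational coefficients with distinguishing coefficient rho
    dmin, dmax = delta.min(), delta.max()
    xi = (dmin + rho * dmax) / (delta + rho * dmax)
    # Step6: grey relational degree = average coefficient of each candidate
    return xi.mean(axis=1)
```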
7. Conclusions

In the past few decades, many kinds of methods have been adopted by experts and scholars to forecast the electric load in various situations. However, the prediction accuracy has not been satisfactory. Building on [41], this paper proposes a new electric load forecasting method, ESPLSSVM. The ESPLSSVM and the MFES of [41] were compared using the three statistical measures (RMSE, MAE and MAPE), and the comparison results are listed in Table 5. From Table 5 it can be seen that the proposed ESPLSSVM reduced the average RMSE by 34.89%, the average MAE by 24.70% and the average MAPE by 26.16%. In addition, considering the three measures RMSE, MAE and MAPE, the proposed ESPLSSVM has lower values than MFES for every day of the week except Wednesday. On Wednesday, the RMSE of MFES was 132.8307 against 200.5389 for ESPLSSVM, the MAE of MFES was 105.8698 against 169.6065 for ESPLSSVM, and the MAPE of MFES was 1.1094 against 1.71 for ESPLSSVM. Although the three measures of ESPLSSVM are slightly higher than those of MFES on Wednesday, in general the ESPLSSVM performed better than MFES.
In order to show that ESPLSSVM really performs better than other methods, three methods, LSSVM, ELSSVM and SLSSVM, were also used for comparison. The grey relational analysis method was used to test which of the predicted electric load values was closest to the actual values, and the result showed that the electric load values predicted by ESPLSSVM were the closest to the actual values.
There are three advantages of ESPLSSVM. First of all, the electric load time series of South Australia displays periodicity and seasonality, while its time-varying volatility often influences the capability of storing electricity. Thus, a method that removes the seasonality before forecasting was developed; in addition, this method is simple and easily comprehensible in application. Secondly, the proposed ESPLSSVM greatly improved the forecasting accuracy and is well suited to current research. With regard to forecasting models, a single model has deficiencies in many respects; for example, some models can only forecast the linear component of a signal, while others can only forecast the nonlinear component. If such models are combined, full use can be made of the advantages of each model, and on this basis a combined model is expected to perform better than a traditional single model for electric load forecasting. As a result, the proposed method combines the empirical mode decomposition, the seasonal adjustment, particle swarm optimization and the least squares support vector machine, and can effectively exploit the advantage of each component; the simulation results showed that the forecasting accuracy was indeed improved. Thirdly, the proposed method is automatic and does not require making complicated decisions about the explicit form of the model for each particular case. The ESPLSSVM gave the minimum RMSE, MAE and MAPE; the ELSSVM and SLSSVM gave moderate RMSE, MAE and MAPE; whereas the traditional LSSVM model gave the maximum RMSE, MAE and MAPE. For the above reasons, it is clear that the proposed method is more effective than the existing LSSVM model for electric load forecasting.
All in all, this distinctly indicates that the ESPLSSVM can obviously improve the electric load forecasting accuracy and can provide a very powerful tool for market players and regulators to control and arrange their electricity supply. Extending the ESPLSSVM to other electricity systems will be the subject of future work.

Acknowledgments

The authors would like to thank the National Natural Science Foundation of China (61073193, 61300230), the Key Science and Technology Foundation of Gansu Province (1102FKDA010), the Natural Science Foundation of Gansu Province (1107RJZA188), and the Science and Technology Support Program of Gansu Province (1104GKCA037) for supporting this research.

References

[1] H.M. Al-Hamadi, S.A. Soliman, Short-term electric load forecasting based on Kalman filtering algorithm with moving window weather and load model, Electr. Power Syst. Res. 68 (1) (2004) 47–59.
[2] T. Senjyu, P. Mandal, K. Uezato, et al, Next day load curve forecasting using hybrid correction method, IEEE Trans. Power Syst. 20 (1) (2005) 102–109.
[3] X. Yang, J. Yuan, J. Yuan, et al, An improved WM method based on PSO for electric load forecasting, Expert Syst. Appl. 37 (12) (2010) 8036–8041.
[4] C.W. Lou, M.C. Dong, Modeling data uncertainty on electric load forecasting based on type-2 fuzzy logic set theory, Eng. Appl. Artif. Intel. 25 (8) (2012) 1567–1576.
[5] S. Kucukali, K. Baris, Turkey's short-term gross annual electricity demand forecast by fuzzy logic approach, Energy Policy 38 (5) (2010) 2438–2445.
[6] J. Wang, D. Chi, J. Wu, et al, Chaotic time series method combined with particle swarm optimization and trend adjustment for electricity demand forecasting, Expert Syst. Appl. 38 (7) (2011) 8419–8429.
[7] J.M. Vilar, R. Cao, G. Aneiros, Forecasting next-day electricity demand and price using nonparametric functional methods, Int. J. Electr. Power Energy Syst. 39 (1) (2012) 48–55.
[8] C. Wang, G. Grozev, S. Seo, Decomposition and statistical analysis for regional electricity demand forecasting, Energy 41 (1) (2012) 313–325.
[9] N. Abu-Shikhah, F. Elkarmi, Medium-term electric load forecasting using singular value decomposition, Energy 36 (7) (2011) 4259–4271.
[10] I.S. Moghram, S. Rahman, Analysis and evaluation of five short-term load forecasting techniques, IEEE Trans. Power Syst. 4 (4) (1989) 1484–1491.
[11] J.M. Bates, C.W.J. Granger, The combination of forecasts, OR (1969) 451–468.
[12] J.P. Dickinson, Some comments on the combination of forecasts, Oper. Res. Q. (1975) 205–210.
[13] H. Li, S. Guo, C. Li, et al, A hybrid annual power load forecasting model based on generalized regression neural network with fruit fly optimization algorithm, Knowl. Based Syst. 37 (2013) 378–387.
[14] W.Y. Zhang, W.C. Hong, Y. Dong, et al, Application of SVR with chaotic GASA algorithm in cyclic electric load forecasting, Energy 45 (1) (2012) 850–858.
[15] Z. Xiao, S.J. Ye, B. Zhong, et al, BP neural network with rough set for short term load forecasting, Expert Syst. Appl. 36 (1) (2009) 273–279.
[16] A. Kheirkhah, A. Azadeh, M. Saberi, et al, Improved estimation of electricity demand function by using of artificial neural network, principal component analysis and data envelopment analysis, Comput. Ind. Eng. 64 (1) (2013) 425–441.
[17] P.C. Chang, C.Y. Fan, J.J. Lin, Monthly electricity demand forecasting based on a weighted evolving fuzzy neural network approach, Int. J. Electr. Power Energy Syst. 33 (1) (2011) 17–27.
[18] J. Wang, J. Wang, Y. Li, et al, Techniques of applying wavelet de-noising into a combined model for short-term load forecasting, Int. J. Electr. Power Energy Syst. 62 (2014) 816–824.
[19] J.X. Che, A novel hybrid model for bi-objective short-term electric load forecasting, Int. J. Electr. Power Energy Syst. 61 (2014) 259–266.
[20] W.C. Hong, Chaotic particle swarm optimization algorithm in a support vector regression electric load forecasting model, Energy Convers. Manage. 50 (1) (2009) 105–117.
[21] J. Wang, H. Lu, Y. Dong, et al, The model of chaotic sequences based on adaptive particle swarm optimization arithmetic combined with seasonal term, Appl. Math. Model. 36 (3) (2012) 1184–1196.
[22] H. Shayeghi, A. Ghasemi, Day-ahead electricity prices forecasting by a modified CGSA technique and hybrid WT in LSSVM based scheme, Energy Convers. Manage. 74 (2013) 482–491.
[23] H. Xu, G. Chen, An intelligent fault identification method of rolling bearings based on LSSVM optimized by improved PSO, Mech. Syst. Signal Process. 35 (1) (2013) 167–175.
[24] J. Zhang, Z. Tan, S. Yang, Day-ahead electricity price forecasting by a new hybrid method, Comput. Ind. Eng. 63 (3) (2012) 695–701.
[25] A.O. Boudraa, J.C. Cexus, EMD-based signal filtering, IEEE Trans. Instrum. Measure. 56 (6) (2007) 2196–2202.
[26] Z. Guo, W. Zhao, H. Lu, et al, Multi-step forecasting for wind speed using a modified EMD-based artificial neural network model, Renew. Energy 37 (1) (2012) 241–249.
[27] Y. Dong, J. Wang, H. Jiang, et al, Short-term electricity price forecast based on the improved hybrid model, Energy Convers. Manage. 52 (8) (2011) 2987–2995.
[28] J.L. Cochrane, M. Zeleny, Multiple Criteria Decision Making, University of South Carolina Press, 1973.
[29] M. Zeleny, Multiple Criteria Decision Making, McGraw-Hill, New York, 1982.
[30] T.L. Saaty, The Analytical Hierarchy Process, RWS Publications, Pittsburgh, 1990.
[31] K.P. Yoon, C.L. Hwang, Multiple Attribute Decision Making: An Introduction, Sage Publications, 1995.
[32] J.L. Deng, Introduction to grey system theory, J. Grey Syst. 1 (1) (1989) 1–24.
[33] J.H. Wu, K.-L. Wen, M.-L. You, A multi decision making based on modified grey relational grade, J. Grey Syst. 11 (4) (1999) 381–387.
[34] N.E. Huang, Z. Shen, S.R. Long, et al, The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis, Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 454 (1971) (1998) 903–995.
[35] K. Qi, Z. He, Y. Zi, Cosine window-based boundary processing method for EMD and its application in rubbing fault diagnosis, Mech. Syst. Signal Process. 21 (7) (2007) 2750–2760.
[36] J.A.K. Suykens, T. Van Gestel, J. De Brabanter, B. De Moor, J. Vandewalle, Least Square Support Vector Machines, World Scientific, Singapore, 2002.
[37] R.C. Eberhart, J. Kennedy, A new optimizer using particle swarm theory, in: Proceedings of the Sixth International Symposium on Micro Machine and Human Science, vol. 1, 1995, pp. 39–43.
[38] H.H. Wu, A comparative study of using grey relational analysis in multiple attribute decision making problems, Qual. Eng. 15 (2) (2002) 209–217.
[39] T.C. Chang, K.L. Wen, H.T. Chang, et al, A new method for evaluation of design alternatives based on the fuzzy gray relational analysis, in: 1999 IEEE International Fuzzy Systems Conference Proceedings, FUZZ-IEEE'99, vol. 2, IEEE, 1999, pp. 708–713.
[40] D. Ju-Long, Control problems of grey systems, Syst. Control Lett. 1 (5) (1982) 288–294.
[41] N. An, W. Zhao, J. Wang, et al, Using multi-output feedforward neural network with empirical mode decomposition based signal filtering for electricity demand forecasting, Energy 49 (2013) 279–288.