
Expert Systems with Applications 38 (2011) 8419–8429

Contents lists available at ScienceDirect

Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa

Chaotic time series method combined with particle swarm optimization and trend adjustment for electricity demand forecasting

Jianzhou Wang a, Dezhong Chi a,*, Jie Wu a, Hai-yan Lu b

a School of Mathematics and Statistics, Lanzhou University, Lanzhou 730000, China
b Faculty of Engineering and Information Technology, University of Technology, Sydney, Australia

Article info

Keywords: Chaotic time series; Particle swarm optimization algorithm; Trend adjustment

Abstract

Electricity demand forecasting plays an important role in electric power systems planning. In this paper, a nonlinear time series modeling technique is applied to analyze electricity demand. Firstly, the phase space, which describes the evolution of the behavior of a nonlinear system, is reconstructed using the delay embedding theorem. Secondly, the largest Lyapunov exponent forecasting method (LLEF) is employed to make a prediction of the chaotic time series. In order to overcome the limitations of LLEF, a weighted largest Lyapunov exponent forecasting method (WLLEF) is proposed to improve the prediction accuracy. The particle swarm optimization (PSO) algorithm is used to determine the optimal weight parameters of WLLEF. A trend adjustment technique is used to take into account the seasonal effects in the data set, further improving the forecasting precision of WLLEF. A simulation is performed using a data set collected from the grid of New South Wales, Australia, during May 14–18, 2007. The results show that chaotic characteristics clearly exist in the electricity demand series and that the proposed prediction model can effectively predict the electricity demand. The mean absolute relative error of the new prediction model is 2.48%, which is lower than the forecasting errors of existing methods. © 2011 Elsevier Ltd. All rights reserved.

1. Introduction

Electricity-supply planning requires efficient management of existing power systems and optimal decisions about additional capacity. Electricity demand prediction plays an important role in electricity planning. The electricity demand can be predicted in the form of annual demand (GW), peak demand (MW), or daily, weekly or annual load curves, depending on the type of planning and the accuracy required. Short-term load forecasting, which can vary from minutes to several hours ahead, is required as an input to scheduling algorithms for the generation and transmission of electricity. Load forecasting helps in determining which generators to operate in a given period, so as to minimize costs and secure demand even when local failures may occur in the system (Taylor, Menezes, & McSharry, 2006). The forecasting models for electricity demand can be broadly classified into two categories, i.e. causal and time-series models. Causal models exploit the relationship between a dependent variable and independent variables, and assume that the variations in the dependent variable can be explained by the independent variables. Multiple linear regression analysis and econometric models are the most popular causal models. A limitation of causal models is that

* Corresponding author. Tel.: +86 931 8914050; fax: +86 931 8912481. E-mail address: [email protected] (D. Chi).
0957-4174/$ - see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2011.01.037

these models depend on the availability and reliability of independent variables over the forecasting period, which requires further effort in data collection and estimation. During the last two decades, a wide variety of time series forecasting methods have been proposed with the aim of improving forecasting precision, such as regression analysis, exponential smoothing, the ARMA model, the multiplicative autoregressive (AR) model, Bayesian estimation models and Kalman filtering (Metaxiotis, Kagiannas, Askounis, & Psarras, 2003; Moghram & Rahman, 1989). In these models, the electric load is decomposed into weather-insensitive and weather-sensitive components, both assumed to be linear. However, electric load series are actually non-linear in nature, which makes these models inappropriate for electric load forecasting (Hong, 2009). Power systems are complicated nonlinear systems that can exhibit chaotic behavior, which indicates that they are not random in nature. Chaos theory is an important way to study the characteristics (the inner uncertainties in particular) of complicated systems. According to Xue and Shi (2008), if an irregular movement characterized by a time series can be regarded as a kind of chaos phenomenon, short-term prediction with higher precision is possible with chaos theory. Therefore, chaos theory can be used for short-term electric load forecasting (Camastra & Colla, 1999; Michanos, Tsakoumis, Fessas, Vladov, & Mladenov, 2003). Chaotic time series


prediction has drawn significant attention from researchers over the past few years, and several chaotic prediction methods have been developed for short-term electric load, such as the local-region method, Lyapunov exponent methods and artificial neural network methods, which are based on the dynamics reconstruction technique and have achieved promising results (Yang, Ye, Wang, Khan, & Hu, 2006). In chaotic time series prediction, a chaotic attractor is obtained by measuring a chaotic time series. The properties of the chaotic attractor can be retained through a reconstruction procedure known as delay coordinate embedding. This procedure results in a reconstructed state space containing a reconstructed chaotic attractor that preserves both the geometrical and the dynamical properties of the original chaotic attractor (Lau & Wu, 2008). Short-term demand forecasting involves many factors, and improving forecasting accuracy becomes increasingly important because of its significance in various serious decision-making processes. To improve the accuracy, artificial intelligence techniques can be employed. However, general intelligence techniques lack knowledge memory or storage functions. For instance, genetic algorithms (GA) and the simulated annealing algorithm (SA) discard the previously acquired knowledge of the problem once the population (GA) or the temperature (SA) changes. This leads to time-consuming and inefficient searches for suitable model parameters (Wang, Zhu, Zhang, & Sun, 2009). Recently, inspired by the social behavior of organisms such as schools of fish and flocks of birds, Kennedy and Eberhart introduced particle swarm optimization (PSO), a population-based stochastic optimization algorithm.
PSO has become a widely used evolutionary optimization technique for solving a broad range of real-world non-linear optimization problems, owing to its simple concept, easy implementation and quick convergence (Zhao & Yang, 2009). Compared with GA and SA, PSO has memory: the knowledge of good solutions is retained by all particles, and particles in the swarm share information with each other. In short-term electric demand forecasting, the electric load is mainly influenced by meteorological conditions, seasonal effects (daily and weekly cycles, calendar holidays) and special events. Weather-related variation is certainly critical in predicting electricity demand for lead times beyond a day ahead (Chow & Leung, 1996; Taylor & Buizza, 2003). According to the traditional model, an electricity demand time series is made up of three components: the trend, the seasonal component and the stochastic errors (Tsekouras & Dialynas, 2007). Chaotic time series techniques can be used to find the underlying trend. A trend-adjusted series can then be produced by adding the seasonal component to the trend series. The method of trend adjustment most commonly used for official statistics is the US Bureau of the Census X-11 method (Xiao, Ye, Zhong, & Sun, 2009). It has been observed that incorporating the seasonal component is indeed helpful in the interpretation of the data. Electric power demand has both non-linear and seasonal trends. This paper proposes a weighted largest Lyapunov exponent forecasting method (WLLEF) to account for the non-linear characteristics. The particle swarm optimization (PSO) algorithm is used to determine the optimal weight parameters of WLLEF, and the multiplicative trend adjustment model is used to take into account the seasonal effects, further improving the forecasting precision of WLLEF. A simulation is performed using a data set collected from the grid of New South Wales, Australia, during May 14–18, 2007.
The simulation results show that the proposed prediction model can effectively predict the electricity demand. The rest of the paper is organized as follows. The fundamental principle of chaotic time series analysis and its formulation are

presented in Section 2. In Section 3, the particle swarm optimization (PSO) algorithm, which is used to determine the optimal weight parameters of WLLEF, is overviewed. Section 4 introduces the trend adjustment method. Numerical simulations demonstrating the forecasting performance of the proposed models, together with comparisons against the original algorithms, are provided in Section 5. Section 6 provides the conclusions.

2. Chaotic time series analysis

2.1. Phase space reconstruction

For a scalar time series, the phase space can be reconstructed using the method of delays. The basic idea of this method is that the evolution of any single variable of a system is determined by the other variables with which it interacts, so information about the relevant variables is implicitly contained in the historical values of any single variable. Consequently, an "equivalent" phase space can be reconstructed as follows. Consider a chaotic system

V_{t+1} = f(V_t), \quad V_t \in R^m    (1)

where f: R^m \to R^m is a continuous smooth function. If the system can be observed through an observer g: R^m \to R^L (L \le m), then

x_t = g(V_t), \quad x_t \in R^L    (2)

To simplify the presentation, we suppose L = 1; then

Y_t = (x_t, x_{t-1}, \ldots, x_{t-d+1})    (3)

where Y_t is a state in the d-order state space. According to the famous Takens theorem, when d \ge 2m + 1 (m is the order of the attractor), there exists a deterministic mapping of dimension d, F^{(d)}: R^d \to R^d. It describes the track of the evolution of Y_t in the state space and has the same geometric structure and topology as the original system. Therefore,

Y_{t+1} = F^{(d)}(Y_t)    (4)

Eq. (4) can also be presented as

x_{t+1} = \tilde{F}^{(d)}(x_t, x_{t-1}, \ldots, x_{t-d+1})    (5)

The state space described by \tilde{F}^{(d)} is called the restructured state space (Jiang & Li, 2005). An element of the time series x_t and its successive delays are assigned as coordinates of a new vector time series Y_t = (x_t, x_{t-\tau}, x_{t-2\tau}, \ldots, x_{t-(d-1)\tau}), where \tau is referred to as the delay time, a multiple of the sampling interval used for a digitized time series, and d is termed the embedding dimension. The dimension d of the reconstructed phase space is considered a sufficient dimension for recovering the object without distorting any of its topological properties; thus it may be different from the true dimension of the space where this object lies (Shang, Na, & Kamae, 2008). The method of delays has become popular for dynamic reconstruction; however, the choice of the reconstruction parameters \tau and d has not been fully settled, and many studies have investigated better approaches for estimating these parameters for different kinds of time series.

2.1.1. Estimation of the embedding dimension using the G–P algorithm

The G–P algorithm proposed by Grassberger and Procaccia (1983) is used to estimate the dimension of a dynamic system from a given time series. The dimension can be determined from the correlation integral, which is defined as follows:

C(r) = \lim_{\hat{l} \to \infty} \frac{1}{\hat{l}^2} \sum_{i,j=1}^{\hat{l}} h(r - \|Y_i - Y_j\|)    (6)

where \hat{l} = l - (d - 1)\tau and h(x) is the Heaviside step function:

h(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 & \text{if } x \ge 0 \end{cases}    (7)

and the correlation dimension d_2 is given by

d_2 = \lim_{r \to 0} \frac{\log C(r)}{\log r}    (8)

where d_2 is the corresponding fractal dimension. As the embedding dimension d increases, d_2(d) stops increasing beyond a critical value d_c, so that d_2 = d_2(d_c) = d_2(d_c+1) = d_2(d_c+2) = \ldots. Then d_2(d_c) is taken as the fractal dimension of the attractor, and d_c is the saturated embedding dimension.

2.1.2. Estimation of the delay time constant

In order to use delay coordinate embedding, it is necessary to estimate the delay constant \tau. If \tau is too small, each coordinate is almost the same and the trajectories of the reconstructed space are squeezed along an identity line; this phenomenon is known as redundancy. If \tau is too large, in the presence of chaos and noise, the dynamics at any one time become effectively and causally disconnected from the dynamics at a later time, so that even simple geometric objects look extremely complicated; this phenomenon is known as irrelevance. A reasonable choice of \tau is the first minimum of the autocorrelation function, which makes the coordinates linearly independent. However, this is not sufficient for handling the non-linearity of a chaotic time series. Mutual information measures both the linear and the non-linear dependences of the coordinates, so the first local minimum of the mutual information provides a better basis for estimating \tau. The mutual information of a time series was proposed by Fraser and Swinney (1986). It is defined as

I_d(\tau) = dH_0 - H_d(\tau)    (9)

where H_d(\tau) = \frac{1}{\hat{l}} \sum_{i=1}^{\hat{l}} \ln P_r(Y_i(\tau)). However, Fraser's algorithm is difficult to implement and has only been applied to a two-dimensional case. Another algorithm has been proposed by Liebert and Schuster (1989), in which the first local minimum of the mutual information I_d(\tau) coincides with the first local minimum of \log C_1^d, where C_1^d is defined as \lim_{q \to 1} C_q^d and C_q^d is defined as

C_q^d = \left( \frac{1}{\hat{l}} \sum_{i=1}^{\hat{l}} P_r(Y_i(\tau))^{q-1} \right)^{1/(q-1)}    (10)

where P_r(Y_i(\tau)) is defined as

P_r(Y_i(\tau)) = \frac{1}{\hat{l}} \sum_{j=1}^{\hat{l}} h(r - \|Y_i(\tau) - Y_j(\tau)\|)    (11)

where r is a constant (Lau & Wu, 2008).

2.2. The largest Lyapunov exponent

A technique for determining the presence of chaotic behavior is the largest Lyapunov exponent, which measures the divergence of nearby trajectories. As the system evolves, the distances between nearby attractor points (in each dimension) converge or diverge. Lyapunov exponents measure the rate of convergence/divergence in each dimension, and a chaotic system exhibits trajectory divergence in at least one dimension. Thus, a positive largest Lyapunov exponent is a strong indicator of chaos.
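As a concrete illustration of this idea (not part of the paper's method), consider the logistic map at r = 4, whose largest Lyapunov exponent is known analytically to be ln 2. It can be estimated by averaging the local stretching rates |f'(x)| along an orbit — a minimal Python sketch:

```python
import math

def logistic(x, r=4.0):
    # One step of the logistic map x -> r*x*(1 - x); chaotic at r = 4.
    return r * x * (1.0 - x)

def largest_lyapunov_logistic(x0=0.2, n=10_000, r=4.0):
    # Average the log of the local stretching rate |f'(x)| = |r*(1 - 2x)|
    # along an orbit; for r = 4 the exact exponent is ln 2 ~ 0.693.
    x, acc = x0, 0.0
    for _ in range(n):
        acc += math.log(abs(r * (1.0 - 2.0 * x)))
        x = logistic(x, r)
    return acc / n

lam = largest_lyapunov_logistic()
print(lam)        # positive (close to ln 2), so the map is chaotic
print(1.0 / lam)  # rough horizon of accurate prediction, as in Eq. (13) below
```

A positive estimate signals chaos; its reciprocal gives the kind of prediction-horizon bound discussed next.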


The method used to calculate the largest Lyapunov exponent is based on averaging the local divergence rates, or the local Lyapunov exponents. Rosenstein, Collins, and De Luca (1993) proposed a method to calculate the largest Lyapunov exponent from an observed time series. After reconstructing the phase space using suitable values of \tau and d, we compute the logarithm of the average distance of a point Y_{n_0} in phase space to all points Y_n in its r-neighborhood. This is repeated for N points along the orbit. Finally, an average quantity S, known as the stretching factor, can be calculated as



S = \frac{1}{N} \sum_{n_0=1}^{N} \ln \left( \frac{1}{|u_{Y_{n_0}}|} \sum_{Y_n \in u_{Y_{n_0}}} |Y_{n_0} - Y_n| \right)    (12)

where |u_{Y_{n_0}}| is the number of neighbors found around the point Y_{n_0}. In the case of chaotic dynamics, a plot of the stretching factor S against the number of points N yields a curve that increases linearly at the beginning, followed by an almost flat region. The slope of the linear portion of this curve gives an estimate of the largest Lyapunov exponent. This method is easy to implement and fast, because it uses a simple measure of exponential divergence that circumvents the need to approximate the tangent map. It is also attractive from a practical viewpoint because it does not require large data sets, and it remains accurate for small data sets because it takes advantage of all the available data. Although chaos is fundamentally deterministic, it is unpredictable beyond short intervals. The approximate limit on the period of accurate prediction of a chaotic system is a function of the largest Lyapunov exponent:

\Delta t_{\max} = \frac{1}{\lambda_{\max}}    (13)

For a system to be chaotic, the largest Lyapunov exponent must exceed zero. If it exceeds one, the maximum length of an accurate prediction is less than the sampling interval of the data series. Thus, only for systems with largest Lyapunov exponents between zero and one are chaotic predictions of any practical use; if the exponent is much less than one, long, accurate predictions are possible (Shang et al., 2008).

2.3. Weighted largest Lyapunov exponent forecasting method

The predictability of a time series using phase space techniques can be considered a test of the deterministic nature of the system. These prediction techniques are based on the fact that nearby trajectories converge fast enough for small sampling steps in the phase space. Consider a one-dimensional series of N observations from a chaotic system (x_1, \ldots, x_N), whose future values are the targets of forecasting. Recall that a chaotic system is characterized by the existence of an attractor in a d-dimensional phase space, where d > 1 is the embedding dimension; a possible embedding method involves building a d-dimensional orbit. By definition, the largest Lyapunov exponent of a dynamic system characterizes the rate of separation of infinitesimally close points of an orbit. Quantitatively, two neighboring points in phase space with initial separation \delta X_0 are separated by the distance

\|\delta X\| = \|\delta X_0\| e^{\lambda_0}    (14)

where the operator \|\cdot\| represents the modulus of the considered vectors and \lambda_0 is the largest Lyapunov exponent of the system in the vicinity of the initial points. Typically, this local rate of divergence (or convergence, if \lambda_0 < 0) depends on the orientation of the initial vector \delta X_0. The optimum reconstruction parameters can be estimated using the estimation techniques given in Section 2.1. Using the observed sequence x_1, x_2, x_3, \ldots, x_N, the value of the k-step-ahead sample point x_{N+k} can be predicted. In order to make a prediction, we first find the nearest neighbor Y_i of the center point Y_T. Then the value of the future point Y_{T+1} can be calculated by the following formula:

\|\hat{Y}_{T+1} - Y_{i+1}\| = \|Y_T - Y_i\| e^{\lambda_0}    (15)

where \|Y_T - Y_i\| is the Euclidean distance between Y_T and Y_i, and Y_{i+1} is the point following Y_i in the phase space. Since there is only one unknown component, the last component \hat{x}_{T+1} of \hat{Y}_{T+1}, the value of \hat{x}_{T+1} can easily be calculated from (15). This prediction procedure is then repeated M times, where M is referred to as the prediction period, and the corresponding prediction time is given by T = M \cdot \Delta t. However, the largest Lyapunov exponent only describes the average divergence over the whole trajectory; in fact, the rates of divergence in different periods may differ considerably, so the accuracy of this forecasting method can be poor. In order to overcome this limitation, a new parameter \alpha is added to Eq. (15) to adjust the rate of divergence in different periods, as follows:

\|\hat{Y}_{T+1} - Y_{i+1}\| = \alpha \cdot \|Y_T - Y_i\| e^{\lambda_0}    (16)

This modified largest Lyapunov exponent method is referred to as the weighted largest Lyapunov exponent forecasting method (WLLEF). In this paper, the parameter \alpha is estimated by PSO.

3. Particle swarm optimization algorithm

Particle swarm optimization (PSO) was first introduced by Kennedy and Eberhart. It can be applied to virtually any problem that can be expressed in terms of an objective function whose optima are to be found. The PSO algorithm is iterative and involves initializing a number of vectors randomly within the search space of the objective function. These particles are collectively known as the swarm. Each particle represents a potential solution to the problem expressed by the objective function. During each time step the objective function is evaluated to establish the fitness of each particle, using its position as input. Fitness values are used to determine which positions in the search space are better than others. Particles are then made to "fly" through the search space, attracted both to their personal best position and to the best position found by the swarm so far. The particles fly through the search space of dimension n by updating the position of the ith particle, i = 1, 2, \ldots, N (N is the number of particles in the swarm), at time step t according to the following equation:

x_i(t + 1) = x_i(t) + v_i(t)    (17)

where x_i(t) and v_i(t) are vectors representing the current position and velocity, respectively. The update of the jth velocity component, j \in \{1, 2, \ldots, n\}, is governed by the following equation:

v_{i,j}(t + 1) = w v_{i,j}(t) + c_1 r_{1,j} (y_{i,j} - x_{i,j}(t)) + c_2 r_{2,j} (\hat{y}_j - x_{i,j}(t))    (18)

where 0 \le w \le 1 is an inertia weight determining how much of the particle's previous velocity is preserved, c_1 and c_2 are two positive acceleration constants, r_{1,j} and r_{2,j} are two values sampled from the uniform distribution U(0, 1), y_{i,j} is the personal best position found by the ith particle, and \hat{y}_j is the best position found by the entire swarm so far. The termination condition is either reaching the maximum number of iterations or the best position satisfying a pre-defined minimum error. The flowchart of PSO is shown in Fig. 1.

When PSO is applied to optimize the value of the new parameter \alpha for WLLEF, it is expected that the predicted electricity load demand series will be as close as possible to the series at the same stage of the last cycle. Assuming that X_{m-1} = \{x_{(m-1)l+1}, x_{(m-1)l+2}, \ldots, x_{(m-1)l+k}\}, where l is the cycle of the data set, is the actual load demand series at the same stage of the last cycle, the objective function can be defined as

f(\alpha) = \|\hat{X}_m(\alpha) - X_{m-1}\|    (19)

where \hat{X}_m(\alpha) is the series of values predicted using WLLEF. The procedure for using PSO to optimize the parameter \alpha in (19) is as follows:

Step 1: Initialize a population of particles with random positions and velocities, where each particle contains n components.
Step 2: Compute the objective function values (forecasting errors) of all particles. Set the best position and function value of each particle, y_i and fbest_i, to its initial position and the corresponding function value, and set the best position and function value of the entire swarm, \hat{y} and fglobalbest, to the best particle's position and the corresponding function value.
Step 3: Update the velocity and position of each particle using (18) and (17), respectively, and evaluate the objective function values of all particles.
Step 4: For each particle, compare its current objective function value with fbest_i. If the current value is better (i.e. has a smaller forecasting accuracy index value), update the particle's best position and objective function value with the current position and corresponding value.
Step 5: Determine the best particle of the whole population based on the best objective function values. If its objective function value is smaller than fglobalbest, update the best position and objective function value of the entire swarm with the current best particle's position and value.
Step 6: Check the termination condition. If it is satisfied, output the results; otherwise go back to Step 3.

4. Trend adjustment

4.1. Decomposition model

Besides the non-linear trend, electric power load data also exhibit a seasonal pattern. The temperature variation throughout a year causes the power load to change in cycles of years, months, weeks or days. This is referred to as the seasonal effect in this study. This effect is due to the seasonal component in the data series.
It has been empirically observed that daily electricity demand exhibits high-frequency seasonality, and any attempt at predicting such high-frequency series must estimate the seasonal component. Consider a time series that exhibits increasing or decreasing seasonal variation. When the parameters describing the series are not changing over time, the series can sometimes be modeled adequately by the multiplicative decomposition model, which states that the observed value of the time series in time period t can be decomposed into trend and seasonal components as described below:

y_t = TR_t \times SN_t    (20)

where y_t is the observed value of the time series in time period t, TR_t is the trend component (or factor) in time period t, and SN_t is the seasonal component (or factor) in time period t. It is worth noting that the seasonal factor is multiplicative in this decomposition. This model is sometimes referred to as the product model.


Fig. 1. The flowchart of particle swarm optimization algorithm.
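The update rules (17) and (18) and the loop of Fig. 1 can be sketched as follows. This is a minimal illustration, not the paper's implementation: a simple sphere function stands in for the forecasting-error objective (19), and the bounds are arbitrary.

```python
import random

def pso(objective, dim, n_particles=20, iters=200,
        w=0.45, c1=0.4, c2=0.4, bounds=(-5.0, 5.0)):
    """Minimal PSO sketch implementing the updates (17)-(18).

    objective maps a position (list of floats) to a scalar to be
    minimized; for WLLEF this would be the forecasting error (19).
    """
    lo, hi = bounds
    pos = [[random.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # swarm best

    for _ in range(iters):
        for i in range(n_particles):
            for j in range(dim):
                r1, r2 = random.random(), random.random()
                # Eq. (18): inertia + cognitive + social components
                vel[i][j] = (w * vel[i][j]
                             + c1 * r1 * (pbest[i][j] - pos[i][j])
                             + c2 * r2 * (gbest[j] - pos[i][j]))
                # Eq. (17): position update
                pos[i][j] += vel[i][j]
            val = objective(pos[i])
            if val < pbest_val[i]:                 # update personal best
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:                # update swarm best
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Usage: minimize a sphere function (a stand-in for objective (19)).
best, err = pso(lambda x: sum(v * v for v in x), dim=2)
print(best, err)
```

Replacing the sphere objective with the WLLEF forecasting error would recover the parameter-selection procedure of Section 5.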

This model has been found useful for modeling a time series that displays increasing or decreasing seasonal variation, and it can be used to estimate the seasonal component in an electric load data series.

4.2. Decomposition procedure

Assume that a time series x_1, x_2, \ldots, x_T exhibits both trend and seasonal effects at the same time (for the period l, T = lm), and satisfies the multiplicative decomposition model y_t = TR_t \times SN_t. The following procedure describes how to perform the decomposition.

Step 1: Use the weighted largest Lyapunov exponent forecasting method (WLLEF) to calculate the trend \bar{x}_1, \bar{x}_2, \ldots, \bar{x}_T; removing this trend yields the seasonal component.
Step 2: Apply the multiplicative model, x_t = \bar{x}_t \cdot I_j.
Step 3: Calculate the seasonal index from x_t = \bar{x}_t \cdot I_j as I_j = x_t / \bar{x}_t, i.e. I_j is the ratio between the jth data point in each period and its trend value.

When the data series does not strictly accord with the product model, one can still use this procedure to approximate the seasonal index. Define another variable p_t as

p_t = x_t / \bar{x}_t \quad (t = 1, 2, \ldots, T)    (21)

Set t = j, l + j, 2l + j, \ldots, (m - 1)l + j, i.e., take the same time points in each period, and then calculate the corresponding p_t values in the different periods. Since these p_t values are usually not identical, one can calculate the approximate seasonal index by averaging them as below:

I_j = \frac{1}{m} [p_j + p_{l+j} + p_{2l+j} + \cdots + p_{(m-1)l+j}] \quad (j = 1, 2, \ldots, l)    (22)

Finally, perform a standardization so that the indices average to one (I_1 + I_2 + \cdots + I_l = l), using the following formula:

I_j \leftarrow \frac{l \, I_j}{I_1 + I_2 + \cdots + I_l}    (23)

For forecasting we employ

\hat{x}_{T+k} = \bar{x}_{T+k} \times I    (24)

where I denotes the seasonal index I_k corresponding to step k. The principal advantage of this model is that it allows the identification and estimation of the deterministic and stochastic trends and the seasonal components simultaneously, together with the interactions between these components and the continuous changes in them.
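Eqs. (21)–(23) and the adjustment (24) can be sketched as follows. The linear trend and period-4 seasonal pattern here are toy stand-ins (in the paper the trend comes from WLLEF and the cycle from the daily load pattern):

```python
def seasonal_indices(x, trend, l):
    """Estimate seasonal indices via Eqs. (21)-(23).

    x     : observed series of length T = l*m
    trend : trend estimates (the WLLEF output in the paper; any trend
            estimate works for this sketch)
    l     : cycle length
    """
    T = len(x)
    m = T // l
    p = [x[t] / trend[t] for t in range(T)]          # Eq. (21)
    # Eq. (22): average p over the same position j in each of the m cycles
    I = [sum(p[k * l + j] for k in range(m)) / m for j in range(l)]
    s = sum(I)
    # Eq. (23): standardize so the l indices sum to l (average index of 1)
    return [l * Ij / s for Ij in I]

# Toy stand-in: a linear trend with a period-4 multiplicative season.
l, m = 4, 3
season = [1.1, 0.9, 1.05, 0.95]
trend = [100.0 + t for t in range(l * m)]
x = [trend[t] * season[t % l] for t in range(l * m)]

I = seasonal_indices(x, trend, l)
print(I)  # recovers approximately [1.1, 0.9, 1.05, 0.95]

# Eq. (24): a trend forecast for step T+k is multiplied by the matching index
trend_forecast = 112.0            # hypothetical trend value for the next step
adjusted = trend_forecast * I[(l * m) % l]
print(adjusted)
```

On data that follow the product model exactly, the procedure recovers the seasonal factors; on real load data it yields the approximate indices of Table 2.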


5. Case studies

In order to test the performance of the proposed method, a case study is carried out. The load demand data are collected from New South Wales, Australia, in May 2007. The data set contains the load demand from 00:00 on May 14 to 23:30 on May 18 at an interval of 30 min, i.e., 48 data points per day and 240 data points over five days. The data set is divided into two parts: one part is the historical data, which contains the 96 data points of the first two days, and the other part is the remaining 144 data points of the last three days, which are used to validate the predicted values. The original load demand series is depicted in Fig. 2. In order to reconstruct the original phase space, we first estimate the reconstruction parameters: the delay time \tau = 3 and the embedding dimension d = 3. This paper applies the reconstruction theory proposed by Takens to analyze the power grid load demand series of New South Wales, Australia, in May 2007. The three-dimensional reconstructed state space is shown in Fig. 3.

5.1. The application of LLEF and WLLEF

The largest Lyapunov exponent forecasting method (LLEF) is applied to predict the load demand on May 16. The series can be divided into five stages according to its variation trend, as shown in Fig. 4. The largest Lyapunov exponent determined from the historical load data in the test is \lambda = 0.3256 > 0, which confirms the chaotic character of the load demand series. The maximum length of an accurate prediction is \Delta t_{\max} = 3.071. In the first training roll, 96 load demand data points (May 14, 00:00–May 15, 23:30) are fed into the largest Lyapunov exponent forecasting method (LLEF) to compute the load demand at 00:00 on May 16. Then, in the second training roll, the next 96

Fig. 2. The original load demand series (from 0:00 on May 14 to 23:30 on May 18).

Fig. 3. Original load series’ three dimensional reconstructed phase space.
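The reconstruction step behind Fig. 3 — the method of delays of Section 2.1 with \tau = 3 and d = 3 — can be sketched as follows. The synthetic two-tone signal is a stand-in for the 240-point half-hourly NSW load series, which is not reproduced here:

```python
import math

def delay_embed(x, d, tau):
    """Method of delays: Y_t = (x_t, x_{t-tau}, ..., x_{t-(d-1)tau})."""
    start = (d - 1) * tau
    return [tuple(x[t - k * tau] for k in range(d))
            for t in range(start, len(x))]

# Synthetic stand-in for the 240-point half-hourly load series.
x = [math.sin(0.3 * t) + 0.5 * math.sin(1.1 * t) for t in range(240)]

# tau = 3 and d = 3, as estimated for the NSW series in this section.
Y = delay_embed(x, d=3, tau=3)
print(len(Y))   # 234 reconstructed state vectors
print(Y[0])     # (x_6, x_3, x_0)
```

Plotting the three components of each vector against each other produces a three-dimensional portrait of the kind shown in Fig. 3.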


Fig. 4. The divided stages of load demand series of May 16.

(May 14, 00:30–May 16, 00:00) load data points are similarly fed into LLEF to compute the forecast load at 00:30 on May 16. The previous procedures are repeated to compute the remaining forecast loads of May 16. Then the weighted largest Lyapunov exponent forecasting method (WLLEF) is employed to predict the load demand values of May 16. As discussed in Section 3, the PSO algorithm is used to select the optimal parameter \alpha, with the objective function defined in (19). The parameters of PSO in the proposed model are set as follows: the population size is 20; both acceleration constants c_1 and c_2 are 0.4; the inertia weight w is 0.45; the maximum velocity threshold is v_max = 0.5; and the designated minimum error threshold is 5. In the five stages, the parameter \alpha equals 0.7659, 0.7419, 0.5283, 0.6909 and 0.6961, respectively, and the corresponding objective function values are 295.0, 1109.5, 1392.3, 501.9 and 711.8. Substituting \alpha into (16), the predicted load demand values can be calculated one by one. The predicted results for May 16 using LLEF and WLLEF are shown in Fig. 5. The first point in the figure represents the load at 00:30, and the last point represents the load at 24:00. It can be observed from Fig. 5 that the predicted data of

Fig. 5. The predicted data of May 16 by using LLEF and WLLEF.
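A minimal sketch of the one-step LLEF/WLLEF update of Eqs. (15)–(16): find the nearest neighbor of the current state, scale the distance to that neighbor's forward image by \alpha e^{\lambda}, and solve for the single unknown component. The positive root is an assumption made here for simplicity (the paper does not spell out the sign choice), and the synthetic series is a stand-in for the real load data:

```python
import math

def delay_vectors(x, d, tau):
    # Y_t = (x_t, x_{t-tau}, ..., x_{t-(d-1)tau})
    return [tuple(x[t - k * tau] for k in range(d))
            for t in range((d - 1) * tau, len(x))]

def wllef_step(x, d, tau, lam, alpha=1.0):
    """Predict x_{N+1} with LLEF (alpha = 1) or WLLEF (alpha tuned by PSO)."""
    Y = delay_vectors(x, d, tau)
    YT = Y[-1]                                   # current state Y_T
    dist = lambda a, b: math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
    # Nearest neighbor Y_i of Y_T among states that have a successor.
    i = min(range(len(Y) - 1), key=lambda k: dist(Y[k], YT))
    # Eq. (16): target distance between the predicted state and Y_{i+1}.
    D = alpha * dist(YT, Y[i]) * math.exp(lam)
    # All components of the predicted state except the first are known
    # past values x_{N+1-k*tau}, k = 1..d-1; solve for the unknown one.
    N = len(x)
    known = sum((x[N - k * tau] - Y[i + 1][k]) ** 2 for k in range(1, d))
    gap2 = max(D * D - known, 0.0)
    return Y[i + 1][0] + math.sqrt(gap2)         # positive root assumed

# Stand-in series; the case study uses tau = 3, d = 3, lambda = 0.3256.
x = [math.sin(0.3 * t) + 0.5 * math.sin(1.1 * t) for t in range(240)]
print(wllef_step(x, d=3, tau=3, lam=0.3256, alpha=0.7659))
```

Repeating this step over a rolling 96-point window, as described above, reproduces the rolling forecast scheme of the case study.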


Fig. 6. Comparison of ARE of LLEF and WLLEF.

WLLEF is more satisfactory. Fig. 6 shows the forecast error curves of the two methods. In this investigation, the absolute relative error (ARE), defined in (25), serves as the forecasting accuracy index:

ARE = |yi − ŷi| / yi × 100%,   (25)

where yi is the actual value at period i and ŷi is the forecast value at period i. From Fig. 6, it can be seen that the error of WLLEF is smaller than that of LLEF, which indicates that the precision of WLLEF is higher than that of LLEF. It can also be observed that LLEF gives a greater error at the end of each stage, whereas WLLEF reduces this error effectively.

Table 1
Comparison of ARE of LLEF and WLLEF.

Predicted method    Maximum ARE (%)    Minimal ARE (%)    Mean ARE (%)
LLEF                143.8667           1.5446             17.7503
WLLEF               8.4237             0.0116             2.2413
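Eq. (25) and the Table 1 summary statistics can be computed directly; the sketch below uses hypothetical function names and assumes the actual and forecast series are given as plain lists of equal length.

```python
def absolute_relative_error(actual, forecast):
    """ARE_i = |y_i - yhat_i| / y_i * 100, as in Eq. (25)."""
    return [abs(y - yh) / y * 100.0 for y, yh in zip(actual, forecast)]

def are_summary(actual, forecast):
    """Maximum, minimal and mean ARE, as reported in Table 1."""
    errs = absolute_relative_error(actual, forecast)
    return max(errs), min(errs), sum(errs) / len(errs)
```

For example, `are_summary([100.0, 200.0], [110.0, 190.0])` returns the maximum, minimum, and mean of the two percentage errors 10% and 5%.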

Table 2
The seasonal factor, original data and the forecasting data of WLLEF and T-WLLEF on May 18.

Time    Actual value (MW)   Predicted WLLEF (MW)   Predicted T-WLLEF (MW)   Seasonal factor   ARE WLLEF (%)   ARE T-WLLEF (%)
0:00    8394.13             8452.53                8441.836                 0.9994            0.6957          0.5683
0:30    8234.51             8127.90                8180.090                 1.0071            1.2946          0.6608
1:00    8016.87             8025.72                8082.591                 1.0079            0.1104          0.8197
1:30    7864.51             7884.87                7959.165                 1.0096            0.2589          1.2035
2:00    7637.93             7639.31                7682.893                 1.0064            0.0181          0.5886
2:30    7260.71             7330.79                7287.840                 0.9940            0.9652          0.3736
3:00    6951.07             7044.53                6913.049                 0.9805            1.3446          0.5469
3:30    6762.18             6758.07                6571.962                 0.9762            0.0607          2.8129
4:00    6621.81             6621.79                6394.529                 0.9691            0.0002          3.4323
4:30    6601.36             6557.13                6538.009                 1.0000            0.6699          0.9596
5:00    6710.02             6837.91                6640.085                 0.9764            1.9060          1.0422
5:30    6929.58             7051.69                6851.039                 0.9924            1.7622          1.1334
6:00    7278.28             7367.40                7262.050                 0.9980            1.2245          0.2229
6:30    7912.22             7909.28                7842.962                 0.9991            0.0371          0.8753
7:00    8553.59             8503.67                8479.075                 0.9979            0.5835          0.8711
7:30    8992.17             8787.47                8768.787                 0.9932            2.2763          2.4841
8:00    9423.47             9204.16                9268.358                 0.9986            2.3272          1.6460
8:30    9682.03             9338.61                9367.701                 0.9930            3.5469          3.2465
9:00    9675.31             9404.69                9203.457                 0.9837            2.7969          4.8768
9:30    9841.22             9696.53                9334.256                 0.9821            1.4702          5.1514
10:00   9926.56             9946.10                9457.208                 0.9566            0.1968          4.7282
10:30   9938.56             10111.36               9570.061                 0.9368            1.7387          3.7077
11:00   9869.29             9738.09                9490.849                 0.9780            1.3292          3.8345
11:30   9770.20             9516.29                9397.029                 0.9888            2.5988          3.8194
12:00   9788.53             9406.717               9539.532                 0.9917            3.9006          2.5437
12:30   9800.41             9383.656               9549.624                 1.0085            4.2524          2.5589
13:00   9722.10             9080.667               9535.659                 1.0207            6.5976          1.9177
13:30   9686.40             8719.466               9206.976                 1.0353            9.9823          4.9494
14:00   9576.81             8595.255               9126.672                 1.0531            10.2492         4.7002
14:30   9459.65             8345.507               9051.563                 1.0957            11.7778         4.3139
15:00   9318.22             8043.635               8830.177                 1.1239            13.6784         5.2375
15:30   9349.78             8450.801               8971.340                 1.0071            9.6149          4.0475
16:00   9369.97             8767.669               9096.390                 0.9932            6.4279          2.9197
16:30   9422.56             9172.592               9253.712                 0.9837            2.6528          1.7919
17:00   9547.97             9445.139               9415.396                 0.9835            1.0769          1.3885
17:30   9950.06             9525.933               9566.595                 1.0173            4.2625          3.8539
18:00   10339.64            9903.947               10158.93                 1.0387            4.2138          1.7477
18:30   10173.65            10121.460              9926.227                 1.0094            0.5130          2.4320
19:00   9929.56             9931.194               9700.601                 0.9830            0.0164          2.3058
19:30   9657.61             9694.718               9398.224                 0.9752            0.3842          2.6858
20:00   9406.65             9494.896               9232.373                 0.9767            0.9381          1.8526
20:30   9299.11             9180.084               8983.979                 0.9873            1.2799          3.3888
21:00   9070.31             8941.375               8782.221                 0.9936            1.4215          3.1761
21:30   8858.46             8800.747               8534.595                 0.9870            0.6515          3.6559
22:00   8603.65             8607.283               8275.197                 0.9778            0.0422          3.8176
22:30   8730.62             8595.665               8628.748                 1.0084            1.5457          1.1668
23:00   8651.32             8536.728               8586.350                 1.0071            1.3245          0.7509
23:30   8617.34             8284.977               8438.549                 1.0170            3.8569          2.0747


Fig. 7. The original data and the forecasting data of WLLEF and T-WLLEF of May 18.

The maximum, minimal and mean absolute relative errors (ARE) of the two methods are tabulated in Table 1. Using the WLLEF method, the load demand series of May 17 and May 18 were then predicted. The predicted values of May 18 are denoted x̄t.

5.2. Trend adjustment to WLLEF (T-WLLEF)

The values at the same stage of different cycles are discrepant, so the trend adjustment is applied to WLLEF to correct this disparity; combining WLLEF with the trend adjustment makes the ARE of the predicted values smaller. The load demand series from May 14 to May 18 are shown in Fig. 2. Obviously, the load demand series exhibit a seasonal trend, and the cycle of the series is 48 points (one day of half-hourly observations).

Table 3
Comparison of WLLEF and T-WLLEF.

Predicted method    Maximum ARE (%)    Minimal ARE (%)    Mean ARE (%)
WLLEF               13.6784            2.5490 × 10⁻⁴      2.7057
T-WLLEF             5.2375             0.2230             2.4768

Fig. 8. ARE of May 18 by using WLLEF and T-WLLEF.


Fig. 9. The distributing of ARE by using WLLEF and T-WLLEF (May 18).

Table 4
The distribution of ARE by using WLLEF and T-WLLEF (May 18).

              WLLEF                       T-WLLEF
Stage         Amount   Proportion (%)     Amount   Proportion (%)
<3%           35       72.92              30       62.50
3–6%          6        12.50              18       37.50
>6%           7        14.58              0        0

The seasonal factor It can be calculated, and the results are shown in Table 2. The forecasting formula is x̂t = x̄t × It. The actual values and the predicted values of the two methods (WLLEF and T-WLLEF) for May 18 are shown in Figs. 2 and 7. Fig. 8 shows the ARE values of T-WLLEF and WLLEF. The distributions of ARE for WLLEF and T-WLLEF are shown in Table 4 and Fig. 9, and the maximum, minimal and mean ARE of WLLEF and T-WLLEF are tabulated in Table 3. According to these figures and tables, T-WLLEF can be regarded as a more efficient algorithm than WLLEF for its contribution to reducing the maximum and mean ARE.

6. Conclusion and future work

Aiming at developing more accurate and reliable forecasting models, this paper proposes a novel model that combines a weighted largest Lyapunov exponent forecasting method (WLLEF) and the trend adjustment method for forecasting electricity load demand. The weight parameter in WLLEF is optimized by the particle swarm optimization algorithm. Using this model, the phase space of the load demand is first reconstructed, and then WLLEF is used to predict the power load. Finally, the trend adjustment technique is combined with WLLEF (T-WLLEF) to improve the forecasting accuracy. A case study using the power grid demand data of New South Wales, Australia, is presented to demonstrate the effectiveness and reliability of the proposed model. The results of the case study show that T-WLLEF can improve the forecasting accuracy significantly and can satisfy real-time requirements. In our future work, the proposed T-WLLEF model will be applied to other practical problems such as weather and stock forecasting.
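The trend-adjustment step x̂t = x̄t × It can be sketched as follows. The paper's exact formula for the seasonal factor It is not reproduced in this section, so this illustration estimates a multiplicative seasonal index as the average ratio of each within-cycle position to the overall mean, with a cycle length of 48 half-hourly points; the function names are hypothetical.

```python
def seasonal_factors(series, cycle=48):
    """Estimate a multiplicative seasonal index I_t for each position in the cycle,
    as the mean value at that position divided by the overall series mean.
    (One common estimate; not necessarily the paper's exact formula for I_t.)"""
    n_cycles = len(series) // cycle
    overall = sum(series[:n_cycles * cycle]) / (n_cycles * cycle)
    factors = []
    for t in range(cycle):
        vals = [series[c * cycle + t] for c in range(n_cycles)]
        factors.append((sum(vals) / len(vals)) / overall)
    return factors

def trend_adjust(raw_forecast, factors):
    """T-WLLEF adjustment: x_hat_t = x_bar_t * I_t."""
    return [x * factors[t % len(factors)] for t, x in enumerate(raw_forecast)]
```

Applied to the May 18 forecasts, `trend_adjust` would scale each half-hourly WLLEF prediction by the seasonal factor for its position in the 48-point daily cycle, as in the Table 2 columns.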

Acknowledgements

The research was supported by the Ministry of Education overseas cooperation "Chunhui Projects" under Grant Z2007-1-62012 and the National Science Foundation of Gansu Province in China under Grant ZS031-A25-010-G.