Electrical Power and Energy Systems 55 (2014) 187–194
A hybrid ARFIMA and neural network model for electricity price prediction

Najeh Chaâbane, Faculty of Economic Sciences and Management, Computational Mathematics Laboratory, 4023 Sousse, Tunisia

Article history: Received 7 June 2012; Received in revised form 5 September 2013; Accepted 5 September 2013

Keywords: Electricity price prediction; ARFIMA; ANN; Hybrid model; NordPool electricity market
Abstract: In the framework of a competitive electricity market, price forecasting has become a real challenge for all market participants. Forecasting is a rather complex task, since electricity prices exhibit many features comparable with those of financial markets, yet electricity markets are more unpredictable than other commodity markets and are known for their extreme volatility. The choice of the forecasting model has therefore become even more important. In this paper, a new hybrid model is proposed. This model exploits the strengths of both the Auto-Regressive Fractionally Integrated Moving Average (ARFIMA) model and the feedforward neural network model, so that the combined prediction takes advantage of each model's unique capability. The proposed model is examined using data from the NordPool electricity market. Empirical results show that the proposed method achieves the best prediction accuracy compared to the other methods considered. © 2013 Elsevier Ltd. All rights reserved.
1. Introduction

The reforms of the electricity sector have enabled a transition from a vertically integrated monopoly market structure to a competitive wholesale and retail mechanism, with marketplaces such as power exchanges [1,2]. In this vein, a power exchange provides an organized marketplace offering standardized products; the most recently developed European markets (e.g., the Netherlands, Germany, Poland, France, Austria) are based on this model. The nations that have restructured their power systems accordingly share similar goals: all of them seek to boost competition in the electricity market so as to achieve economic efficiency, higher-quality services, and lower consumer prices for electricity. In power markets, price analysis has become an essential topic for all participants. Background information about the electricity price is important for risk management, and it may give a market player an advantage over the competition. Forecasting electricity prices at different time frames is valuable for all industry stakeholders, for cash flow analysis, capital budgeting, financial procurement, regulatory rule-making, and integrated resource planning. The behavior of electricity prices differs from that of other commodity markets. The most obvious difference is that electricity is a non-storable commodity, so relatively small changes in load or generation within a matter of hours or minutes can cause huge changes in price. In this respect, there is no
other market like it [1]. Electricity prices show particular characteristics that stem mainly from the physical features of electricity, and these features can affect prices dramatically. Electricity spot prices often exhibit specific and salient characteristics such as seasonality (on annual, weekly, and daily levels), long memory, mean reversion, extreme volatility, and price spikes. In recent years, a number of time-series models have been suggested to capture these characteristics of electricity prices. Long-memory models have gained popularity for the modeling of electricity spot prices. In a pioneering study, [3] suggested an approach using periodic dynamic long-memory regression models with GARCH errors. They used European data, modeled seasonality by means of sinusoids and weekday dummies, and found evidence of mean reversion in the stochastic part of the model and of long memory in the older NordPool electricity market in Norway. Many other models were proposed by [1,4-7]. Other recent studies have dealt with long-range dependence in electricity data. [8,9] observed that the autocorrelation of electricity spot prices displays a hyperbolic decay, or equivalently, that prices exhibit a multi-scale autocorrelation structure; that is, electricity spot price time series have a long memory. They modeled this behavior by fitting the autocorrelation function with the sum of two exponential functions. In a more recent study, [10] modeled the presence of long memory in the price series by a first-order fractional autoregressive process. Despite the large number of successful applications of linear models in electricity price modeling, such models suffer from some weaknesses, notably their inability to capture nonlinear patterns. To grapple with these limitations and account for the nonlinear patterns that exist in real data, several nonlinear models have been suggested. In this context, artificial neural networks have been extensively studied and used for modeling and forecasting electricity spot prices. [11,12] applied neural networks to model and forecast the dynamics of intra-day prices. The main advantage of neural networks is their flexible nonlinear modeling capability. Artificial intelligence methods have contributed solid time-series forecasting techniques that replaced the old-fashioned ones. Although these methods give accurate forecasts for linear time series, their weakness appears with the noisy or nonlinear components that characterize intra-day power spot prices. The combination of different models has therefore become an effective way to improve forecasting accuracy. Different hybrid forecasting models have been proposed in the literature, especially ones combining linear autoregressive models with intelligent techniques. [13] affirmed that the motivation for combining such models derives from the assumption that either one cannot identify the true data-generating process, or that a single model may not be sufficient to capture all the characteristics of the time series. [14] suggested a hybrid methodology that combines auto-regressive integrated moving average (ARIMA) and artificial neural network (ANN) models. [15] put forward the same procedure, coupling ARIMA models with support vector machines (SVMs) for stock price forecasting. Using data on Taiwan's machinery industry production values, [16] examined the forecasting accuracy of a proposed hybrid method combining seasonal auto-regressive integrated moving average (SARIMA) and support vector machine (SVM) models.
[17] proposed a hybridization of intelligent techniques such as ANNs, fuzzy systems, and evolutionary algorithms, so that the final hybrid ARIMA-ANN model could outperform the prediction accuracy of those models used separately. Examined on the Spanish and PJM electricity markets, a novel price forecasting method based on the wavelet transform combined with ARIMA and GARCH models was later suggested by [18]. To take advantage of the respective strengths of SVR and ARIMA models in nonlinear and linear modeling, [19] worked on a hybrid model named SVRARIMA, which combines support vector regression (SVR) and auto-regressive integrated moving average (ARIMA) models. [20] conceived a hybrid method based on the wavelet transform, ARIMA models, and radial basis function neural networks (RBFN). [21] proposed a hybrid approach combining the exponential smoothing model (ESM), the ARIMA model, and the back-propagation neural network (BPNN) for forecasting stock indices. In a recent paper, [22] adopted a hybrid method based on the wavelet transform, ARIMA, and least squares support vector machines (LSSVM) to predict New South Wales's electricity prices in the Australian national electricity market. In this paper, a new hybrid model is proposed in order to exploit the respective strengths of the Auto-Regressive Fractionally Integrated Moving Average (ARFIMA) model and the ANN model. The proposed combination takes advantage of each model's capability to predict hourly electricity prices on the NordPool electricity market. The remainder of the paper is organized as follows. In Sections 2 and 3, the ARFIMA and feedforward neural network models are briefly reviewed. Section 4 presents the formulation of the proposed model. The last sections report the numerical results, their discussion, and the conclusion.
2. ARFIMA (p, d, q) model

A parsimonious way to model the long-term behavior of a time series is by means of an autoregressive fractionally integrated moving average (ARFIMA) model. [23,24] introduced the ARFIMA model as a parametric way of capturing long-memory dynamics. Analytically, an ARFIMA (p, d, q) process {y_t} can be defined as

φ(L)(y_t − μ) = θ(L) Δ^{−d} ε_t,   (1)

where ε_t is a white noise process. The formulae φ(L) = 1 + Σ_{i=1}^{p} φ_i L^i and θ(L) = 1 + Σ_{i=1}^{q} θ_i L^i are polynomials in the lag operator L, of orders p and q, respectively, and are supposed to have no common roots. Δ^{−d} = (1 − L)^{−d} is a fractional differencing operator defined by the binomial expansion

Δ^{−d} = Σ_{j=0}^{∞} g_j L^j = g(L),

where

g_j = Γ(j + d) / (Γ(j + 1) Γ(d)),   (2)

with Γ(.) denoting the gamma function. Using Stirling's approximation Γ(z) ≈ √(2π) e^{−z} z^{z−0.5}, the g_j behave as g_j ≈ j^{d−1}/Γ(d). Following [25], the autocovariance of fractionally integrated noise is expressed as follows:

γ_h = (−1)^h σ² Γ(1 − 2d) / (Γ(h − d + 1) Γ(1 − d − h)).   (3)
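The weights g_j of Eq. (2) obey the simple recursion g_j = g_{j−1}(j − 1 + d)/j with g_0 = 1, which avoids evaluating large gamma functions directly. A minimal sketch (function name is illustrative):

```python
import math

def frac_diff_weights(d, n):
    """Binomial-expansion weights g_j of (1 - L)^(-d), via the recursion
    g_j = g_{j-1} * (j - 1 + d) / j with g_0 = 1 (Eq. (2))."""
    g = [1.0]
    for j in range(1, n):
        g.append(g[-1] * (j - 1 + d) / j)
    return g

d = 0.4
g = frac_diff_weights(d, 200)

# The direct gamma-function form of Eq. (2) agrees with the recursion.
g5 = math.gamma(5 + d) / (math.gamma(6) * math.gamma(d))
assert abs(g[5] - g5) < 1e-12

# Hyperbolic decay: g_j ~ j^(d-1) / Gamma(d) for large j (Stirling).
approx = 199 ** (d - 1) / math.gamma(d)
assert abs(g[199] - approx) / approx < 0.01
```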
The most useful property of fractionally integrated time series is their long-range dependence. The dependence between observations produced by the ARMA structure of the model decays at a geometric rate, while the dependence produced by the fractional differencing parameter decays at a much slower hyperbolic rate. Hence, the long-range dependence between observations is eventually determined only by the fractional differencing parameter d, which is allowed to take any real value. According to [24], when |d| > 0.5 the process is non-stationary, while for −0.5 < d < 0.5 the process is stationary. For 0 < d < 0.5, the autocorrelations of the process decay at a hyperbolic rate and the process is said to exhibit long memory. For −0.5 < d < 0, the sum of the absolute values of the autocorrelations of the process converges to a constant; such a process is stationary, is deemed to have short memory, and is sometimes referred to as anti-persistent. When d = 0, the process reduces to the standard ARMA process and has no long-term dependence structure; white noise is one example. For d = 1, the series follows a unit root process (see [26] for more details). The intensity of long memory can be evaluated by estimating the Hurst parameter H, which varies between 0 and 1 [27]. In fact, the long-memory parameter d is simply related to the Hurst exponent H by H = d + 1/2 [28]. H = 1/2 implies that there is no long-range dependence. When 1/2 < H < 1, the process shows long memory. For H < 1/2, the process is said to be anti-persistent (negative dependence). Currently, there are many estimation methods for the Hurst exponent H and, analogously, for the long-memory parameter d. In a recent study, [29] reviewed different estimation procedures for the Hurst parameter using discrete variations of time series. The provided estimators have the advantage of being quickly computable and independent of other parameters.
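The hyperbolic decay implied by Eq. (3) can be checked numerically: for fractionally integrated noise, the lag-1 autocorrelation is ρ_1 = d/(1 − d), and ρ_h ≈ (Γ(1 − d)/Γ(d)) h^{2d−1} for large h. A small sketch (with σ² = 1):

```python
import math

def gamma_fi(h, d, sigma2=1.0):
    """Autocovariance of fractionally integrated noise, Eq. (3)."""
    return ((-1) ** h * sigma2 * math.gamma(1 - 2 * d)
            / (math.gamma(h - d + 1) * math.gamma(1 - d - h)))

d = 0.2
rho = [gamma_fi(h, d) / gamma_fi(0, d) for h in range(1, 60)]

# Known lag-1 autocorrelation of an ARFIMA(0, d, 0) process.
assert abs(rho[0] - d / (1 - d)) < 1e-9

# Hyperbolic tail: rho_h ~ (Gamma(1-d)/Gamma(d)) * h^(2d-1), here at h = 50.
c = math.gamma(1 - d) / math.gamma(d)
assert abs(rho[49] / (c * 50 ** (2 * d - 1)) - 1) < 0.05
```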
For real data in the presence of outliers, the authors recommend the use of estimates based on trimmed means (TM). Such an estimation method has proved robust to outliers, and it appears to have slightly better properties than the alternatives. Thus, we focus on this estimation technique and define the estimator of H as follows:
Ĥ_TM = (Aᵀ / (2‖A‖²)) ( log TM^{(β)}(X^{a,m}) )_{m=M₁,…,M₂},   (4)
Fig. 1. A schematic diagram of a three-layered feedforward network model with a single output and two hidden units.
Fig. 2. Overview of the proposed forecasting framework.
where A_m = log(m) − (M₂ − M₁ + 1)^{−1} Σ_{m=M₁}^{M₂} log(m), with 1 ≤ M₁ ≤ M₂, and X denotes the vector of initial data. According to [29], the following convergence holds almost surely as n → +∞:¹

Ĥ_TM → H.   (5)
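For intuition, the relation H = d + 1/2 can be illustrated on a simulated ARFIMA(0, d, 0) series. The sketch below uses the simple aggregated-variance estimator of H for brevity, not the trimmed-mean discrete-variations estimator of [29] that the paper actually uses; all names, sizes, and tolerances are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def arfima0d0(d, n, trunc=1000):
    """Simulate ARFIMA(0, d, 0) as a truncated MA(inf):
    y_t = sum_j g_j eps_{t-j}, with the weights g_j of Eq. (2)."""
    g = np.ones(trunc)
    for j in range(1, trunc):
        g[j] = g[j - 1] * (j - 1 + d) / j
    eps = rng.standard_normal(n + trunc)
    return np.convolve(eps, g, mode="valid")[:n]

def hurst_aggvar(y, blocks=(10, 20, 40, 80, 160)):
    """Aggregated-variance estimator: Var(block means of size m) ~ m^(2H-2),
    so H is recovered from the slope of log Var against log m."""
    logm, logv = [], []
    for m in blocks:
        k = len(y) // m
        means = y[: k * m].reshape(k, m).mean(axis=1)
        logm.append(np.log(m))
        logv.append(np.log(means.var()))
    slope = np.polyfit(logm, logv, 1)[0]
    return 1 + slope / 2

y = arfima0d0(d=0.3, n=48000)
H = hurst_aggvar(y)            # theory: H = d + 1/2 = 0.8
assert 0.6 < H < 0.95          # loose band; the estimator is noisy
```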
3. Feedforward neural networks for time series prediction

In feedforward networks, the neurons are organized in layers (Fig. 1). Depending on the complexity of the network or the nature of the problem under study, there can be several hidden layers in a neural network model. The single-hidden-layer feedforward network is the most widely used model for time series modeling and forecasting [30]. The simplest feedforward models are those considering only the relationship between the scalar output y_t and its lagged values (y_{t−1}, …, y_{t−p}), where p is the number of input nodes. In such cases, a feedforward network regression model with q hidden nodes is written as follows:
y_t = α₀ + Σ_{j=1}^{q} α_j g( β_{0j} + Σ_{i=1}^{p} β_{ij} y_{t−i} ) + ε_t,   (6)
where g is the activation function,² ε_t is an error term with zero mean and standard deviation σ_t, and α_j and β_{ij}, for i = 0, 1, 2, …, p and j = 0, 1, 2, …, q, are the model parameters to be estimated; they are often called the connection weights. The neural network is then equivalent to a univariate nonlinear autoregressive (NAR) model, that is:

y_t = f(y_{t−1}, …, y_{t−K}; ν) + ε_t,   (7)

¹ See Proposition 1 of [29].
² An activation function is often deterministic and symmetrically nonlinear. There are a number of different activation functions, such as the threshold function, the piecewise linear function, and the sigmoidal function; see [31]. The logistic function g(t) = 1/(1 + e^{−t}) is the one most often used as the hidden-layer transfer function.
where f(.) is a function determined by the network structure and connection weights, and ν is the vector of all parameters. Although simple, this model is powerful in the sense that it can approximate an arbitrary function arbitrarily well, provided the number of hidden nodes q is sufficiently large [32]. Fig. 1 shows a simplified diagram of a three-layered feedforward network model with a single output, four inputs, and two hidden nodes. One critical decision is to determine the appropriate architecture [33]. However, there is no theory that can be used to determine the best architecture; hence, experiments are often conducted to select appropriate values of p and q. Given the network structure, the network is then ready for training, i.e., estimating the unknown parameter vector ν from a sample of data. The most widely used training method is the backpropagation algorithm, an estimation scheme that computes ν recursively [34,35].

4. The hybrid proposed method: ARFIMA-ANN prediction procedure

In our proposed methodology, the time series {y_t} is considered as a function of a linear and a nonlinear component. Thus,
y_t = f(L_t, N_t),   (8)
where L_t denotes the linear component and N_t the nonlinear component. Following Zhang's hybrid methodology [14], one of the most efficient approaches for improving forecast accuracy, we assume an additive relationship between the linear and nonlinear components. Consequently, we can write
y_t = L_t + N_t.   (9)
Starting from the assumption that the linear and the nonlinear patterns of the considered time series can be modeled separately by different models, the proposed method essentially involves three
where L̂_t denotes the predicted value of the ARFIMA model at time t. To capture the nonlinear pattern, the residuals are predicted independently through an ANN model, where the input-output relationship is modeled by a nonlinear autoregressive (NAR)-type model as follows:
e_t = g(e_{t−1}, e_{t−2}, …, e_{t−N}) + ε_t,   (11)
where g(.) is a nonlinear regression function determined by the ANN model. Finally, the prediction of the linear component, L̂_t, and that of the nonlinear component, N̂_t, obtained from Eq. (11), are recombined to generate an aggregate prediction:
ŷ_t = L̂_t + N̂_t.   (12)
In brief, the proposed hybrid methodology consists in (a) using an ARFIMA model to analyze the linear component of the time series based on the long-memory behavior of the data, and (b) modeling the residuals gathered from the previous step through an ANN model. The predictions obtained separately from the two models are then summed. Hence, the hybrid method exploits the strengths of both the ARFIMA and ANN models. The proposed prediction system is shown in Fig. 2.
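The three-step procedure can be sketched end-to-end on synthetic data. Since numpy has no ARFIMA routine, a plain AR(1) least-squares fit stands in for the linear stage, and a tiny one-hidden-layer logistic network (Eq. (6)) trained by full-batch gradient descent stands in for the ANN of Eq. (11); everything here is an illustration of the pipeline, not the paper's exact implementation:

```python
import numpy as np

rng = np.random.default_rng(2)

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy series: a linear AR(1) part plus a nonlinear term (illustrative only).
n = 1200
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.3 * np.sin(2.0 * y[t - 1]) + 0.1 * rng.standard_normal()

# Step (a): linear stage, y_t ~ c + phi * y_{t-1}; predictions L_hat.
A = np.column_stack([np.ones(n - 1), y[:-1]])
coef, *_ = np.linalg.lstsq(A, y[1:], rcond=None)
L_hat = A @ coef

# Step (b): residuals e_t = y_t - L_hat (Eq. (10)), modeled by a small
# NAR network in the lagged residual (Eqs. (6) and (11)).
e = y[1:] - L_hat
X, target = e[:-1][:, None], e[1:]
q = 4                                   # hidden nodes
a0, a = 0.0, 0.1 * rng.standard_normal(q)
b0, b = np.zeros(q), 0.1 * rng.standard_normal((1, q))

def net(X):
    return a0 + logistic(b0 + X @ b) @ a

loss_start = np.mean((net(X) - target) ** 2)
lr = 0.05
for _ in range(500):                    # plain full-batch gradient descent
    h = logistic(b0 + X @ b)
    err = (a0 + h @ a) - target
    dh = 2 * err[:, None] * a * h * (1 - h)
    a0 -= lr * 2 * err.mean()
    a -= lr * 2 * (h * err[:, None]).mean(axis=0)
    b0 -= lr * dh.mean(axis=0)
    b -= lr * X.T @ dh / len(target)
loss_end = np.mean((net(X) - target) ** 2)
assert loss_end < loss_start            # training reduces the residual fit error

# Step (c): recombine, y_hat = L_hat + N_hat (Eq. (12)); the nonlinear stage
# is padded by one step since it uses a lagged residual as input.
N_hat = np.concatenate([[0.0], net(X)])
y_hat = L_hat + N_hat
assert y_hat.shape == y[1:].shape
```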
Fig. 3. Hourly spot prices for the NordPool electricity market.
5. Empirical results
5.1. Time series
Fig. 4. NordPool periodogram.
steps. Firstly, an ARFIMA model is used to fit the linear component and to account for the long-memory behavior. This model generates a series of predictions denoted L̂_t. Next, instead of predicting the linear component, the residuals containing the nonlinear patterns of the time series, denoted e_t, are generated by simply subtracting the predicted values L̂_t of the linear component from the actual values y_t of the time series. Hence

e_t = y_t − L̂_t.   (10)
We consider hourly spot prices from the NordPool electricity market between October 1, 2012 and November 28, 2012, a total of N = 1416 hourly observations, illustrated in Fig. 3. It is well known that electricity spot prices are subject to weekly cycles, daily patterns, persistent level changes, and spikes. In fact, electricity demand goes through seasonal fluctuations which mostly emanate from changing climate conditions (e.g., temperature and the number of daylight hours). In addition, the supply side shows seasonal variations in output. These seasonal fluctuations can explain the seasonal behavior of electricity prices, and of spot prices in particular. The seasonality can easily be observed in the frequency domain by plotting the periodogram, given by

I_N(x_k) = (1/N) | Σ_{t=2}^{N} y_t exp{−2πi(t − 1)x_k} |²,   (13)

where {y_t, t = 2, …, N} is the vector of observations and x_k = k/N, k = 1, …, [N/2], with [y] denoting the largest integer less than or equal to y.
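The periodogram of Eq. (13) can be evaluated efficiently with the FFT. On a synthetic hourly series with a daily cycle and N = 1416 (matching the sample size used here), the peak falls at the Fourier frequency 59/1416 ≈ 0.0417, i.e., at the daily cycle, consistent with the daily peak reported for the NordPool data; the series below is illustrative, not the actual price data:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic "hourly price" series with a daily (24 h) cycle plus noise.
N = 1416
t = np.arange(N)
y = 34 + 3 * np.sin(2 * np.pi * t / 24) + rng.standard_normal(N)

# Periodogram I_N(x_k) = |sum_t y_t exp(-2*pi*i*t*x_k)|^2 / N at the
# Fourier frequencies x_k = k/N, k = 1, ..., [N/2].
y_c = y - y.mean()                      # drop the zero-frequency component
I = np.abs(np.fft.rfft(y_c)) ** 2 / N
freqs = np.fft.rfftfreq(N, d=1.0)       # cycles per hour

k_peak = np.argmax(I[1:]) + 1
assert abs(freqs[k_peak] - 1 / 24) < 1e-3   # peak at the daily frequency
```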
Fig. 5. Autocorrelation functions for the NordPool time series: (a) the autocorrelogram before deseasonalization; (b) the autocorrelogram for the deseasonalized data.
Table 3. ARFIMA estimation results for the NordPool prices (deseas.). The error term is assumed to be Gaussian.
ARFIMA      d̂                 ln(L)       AIC
(0, d, 0)   0.499a (0.0005)   2198.895    4403.790
(0, d, 1)   0.498a (0.0015)   1917.659    3843.318
(0, d, 2)   0.496a (0.005)    1854.067    3718.134
(1, d, 0)   0.484b (0.0202)   1859.347    3726.694
(1, d, 1)   0.461b (0.0379)   1845.234    3700.469
(1, d, 2)   0.469b (0.0328)   1843.386    3698.772
(2, d, 0)   0.470b (0.0323)   1843.421    3696.842
(2, d, 1)   0.481b (0.0240)   1842.740    3697.480
(2, d, 2)   0.482b (0.0235)   1842.725    3699.450

Fig. 6. NordPool density (deseas.).
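The AIC column of Table 3 is consistent with AIC = 2k − 2 ln L, where the printed ln(L) values are the magnitudes of the maximized log-likelihoods and k counts d, the p + q ARMA coefficients, the mean, and the innovation variance (this parameter count is inferred from the table's arithmetic, not stated in the text). Selecting the minimum recovers the chosen model:

```python
# (p, q) -> (maximized log-likelihood, parameter count k), from Table 3.
models = {
    (0, 0): (-2198.895, 3), (0, 1): (-1917.659, 4), (0, 2): (-1854.067, 5),
    (1, 0): (-1859.347, 4), (1, 1): (-1845.234, 5), (1, 2): (-1843.386, 6),
    (2, 0): (-1843.421, 5), (2, 1): (-1842.740, 6), (2, 2): (-1842.725, 7),
}
aic = {pq: 2 * k - 2 * lnL for pq, (lnL, k) in models.items()}
best = min(aic, key=aic.get)
assert best == (2, 0)                       # ARFIMA(2, d, 0) has minimal AIC
assert abs(aic[best] - 3696.842) < 1e-6     # matches the printed AIC
```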
Table 1. Descriptive statistics of the deseasonalized time series ỹ_t.

                          NordPool
Number of observations    1416
Mean                      34.32036
Standard deviation        2.899442
Skewness                  0.367668
Kurtosis                  4.539322
Minimum                   21.09602
Maximum                   46.12408
Jarque-Bera test          171.7036
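As a quick consistency check, the Jarque-Bera statistic in Table 1 can be recomputed from the reported sample size, skewness, and kurtosis via JB = n/6 (S² + (K − 3)²/4):

```python
# Jarque-Bera statistic from Table 1's sample moments:
# JB = n/6 * (S^2 + (K - 3)^2 / 4).
n, S, K = 1416, 0.367668, 4.539322
JB = n / 6 * (S ** 2 + (K - 3) ** 2 / 4)
assert abs(JB - 171.7036) < 0.01   # matches the reported value
```

Since JB greatly exceeds the 5% critical value of a chi-squared distribution with 2 degrees of freedom (about 5.99), normality is firmly rejected, as stated in the text.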
Table 2. Unit root tests for the deseasonalized time series ỹ_t.

Test        1% Critical value   5% Critical value   10% Critical value   Value of statistic
ADF test    −3.43               −2.86               −2.56                −5.4765
PP test     −3.43               −2.86               −2.56                −6.690254
KPSS test   0.739               0.463               0.347                0.4277
As shown in Fig. 4, the spectral density, estimated by the periodogram, is unbounded at equidistant frequencies, which indicates the presence of seasonalities. It shows pronounced peaks at the frequencies x_k = 0.041671 and 0.083338, corresponding to cycles with daily and semi-daily periods, respectively. Once the seasonalities are identified, they have to be removed. Hence, we used a seasonal adjustment technique, the median period, which has proved efficient in many studies³ [36,37]. The idea proceeds as follows: rearrange the time series containing the seasonality of period p into a matrix with rows of length p (168 elements for hourly data over a week); then take the median of the data in each column; the resulting row vector of length p is the estimate of the seasonal component. To obtain the seasonal adjustment, the seasonal component is subtracted from the time series. The effectiveness of this seasonal adjustment technique is clearly observed in Fig. 5. The autocorrelation functions also exhibit a slow decay pattern, which points to long-range dependence.
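The median-period adjustment can be sketched as follows; for brevity the illustration uses a daily period of 24 (the component actually removed in Eq. (14)) rather than the weekly period of 168, and synthetic data with a known pattern:

```python
import numpy as np

rng = np.random.default_rng(4)

def median_seasonal(y, period):
    """Median-period seasonal estimate: rearrange the series into rows of
    length `period`, take the median of each column, and tile the result."""
    k = len(y) // period
    block = y[: k * period].reshape(k, period)
    season = np.median(block, axis=0)
    return np.tile(season, k + 1)[: len(y)]

# Synthetic hourly series with a known daily pattern plus noise and a spike.
period, n = 24, 1416
pattern = 3 * np.sin(2 * np.pi * np.arange(period) / period)
y = 34 + np.tile(pattern, n // period) + 0.3 * rng.standard_normal(n)
y[700] += 40                      # a price spike the median is robust to

s = median_seasonal(y, period)
deseas = y - s                    # the seasonally adjusted series
# The estimated seasonal shape tracks the true pattern closely.
assert np.max(np.abs((s[:period] - s[:period].mean()) - pattern)) < 0.5
```

The column median, unlike the column mean, is barely affected by the injected spike, which is why this technique is favored for spiky price data.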
³ It is noticeable that this technique is mostly used for hourly data.

Table 4. Parameter estimation results of the selected ARFIMA(2, d̂, 0) for the NordPool prices (deseas.).

Parameter   Coefficient   Std. err.   t-Ratio   p-Value
d̂           0.47          0.0323      14.6      0.000
φ̂₁          0.7355        0.0415      17.7      0.000
φ̂₂          0.1498        0.0264      5.67      0.000
Constant    33.8527       4.129       8.20      0.000
Notes (Table 3): Standard errors are reported in parentheses below the corresponding parameter estimates. ln(L) is the value of the maximized Gaussian log-likelihood. AIC is the Akaike information criterion. a Significance at the 1% level. b Significance at the 5% level.
Table 5. BDS test results on residuals from the ARFIMA model.

        ε = 0.5σ   ε = σ      ε = 1.5σ   ε = 2σ
m = 2   16.911a    14.292a    13.933a    12.088a
m = 3   18.222a    17.843a    15.424a    13.438a
m = 4   22.023a    21.193a    17.644a    15.765a
m = 5   26.084a    23.256a    18.645a    16.703a

Note: m: embedding dimension; ε: distance between points, measured in terms of the number of standard deviations of the raw data; σ: standard deviation. a Significance at the 5% level.

Table 6. Network architecture.

Number of neurons in the first hidden layer    20
Number of neurons in the second hidden layer   1
Preprocessing function                         Feed Forward Network
Layer conversion function                      Levenberg-Marquardt
In this paper, the deseasonalized time series {ỹ_t, t = 1, …, N} is obtained by subtracting only the daily (1/0.0416 ≈ 24) seasonal component,⁴ so we have

ỹ_t = y_t − s(t),   t = 2, …, N,   (14)

where s(t) is a periodic function corresponding to the daily seasonality, such that s(t) = s(t + 24n), ∀n ∈ ℕ, where ℕ denotes the set of integers.
⁴ We assume an additively composed model.
Fig. 7. NordPool prices (deseas.), with the training, validation, and test sets indicated.
The density of the deseasonalized time series is illustrated in Fig. 6, and descriptive statistics are summarized in Table 1. The deseasonalized time series does not conform to the assumption of normality: both the skewness and kurtosis statistics indicate that the series tends to have a higher peak and a fatter-tailed distribution than the normal distribution. Further evidence about the nature of the departure from normality is given by the Jarque-Bera test statistic, which confirms the deviation from normality. By performing unit root tests, namely the ADF (Augmented Dickey-Fuller), PP (Phillips-Perron), and KPSS (Kwiatkowski-Phillips-Schmidt-Shin) tests, we tested for stationarity. As illustrated by Table 2, for the ADF and PP tests, at the 1%, 5%, and 10% levels all the statistics fall into the rejection region, so H0 (the time series is non-stationary) is rejected. For the KPSS test, at the 1% and 5% levels the statistic falls into the acceptance region, so H0 (the time series is stationary) is accepted. Thus, the considered time series is stationary and suitable for the subsequent analysis. Thereafter, a robust estimate of the long-memory parameter d is obtained by estimating the Hurst parameter H following [29], as defined in Eq. (4). The obtained Hurst estimate, Ĥ_TM = 0.8715 ≥ 0.5, indicates evidence of long memory in the considered time series.

5.2. Results and discussion

First, we estimate several specifications of the ARFIMA (p, d, q) model with different orders (p, q) under the assumption of a normal distribution.⁵ Thereafter, we compare the performance of these ARFIMA models to determine the adequate orders for detecting the long-memory property in the level of the series. Following [39], we consider all possible combinations for the ARMA (p, q) part with maximum orders p = 0, 1, 2 and q = 0, 1, 2.
To choose an appropriate model that describes the data, we use the Akaike information criterion (AIC), based on the Kullback information. The estimation results and diagnostic
⁵ All ARFIMA numerical results and modeling are carried out using the ARFIMA 1.0 package for OxMetrics [38].
statistics are reported in Table 3. On this basis, an ARFIMA(2, d̂, 0) seems appropriate for the considered time series (see Table 4). Next, the BDS test [40] is applied to assess the existence of nonlinear patterns in the residuals extracted from the selected ARFIMA model. The results of the BDS test, given in Table 5, indicate that the null hypothesis of iid residuals is rejected, suggesting that nonlinear structures may exist in the data. The obtained residuals are then modeled by a neural network. In order to find the best neural network architecture, 2-20 neurons were tested with two- or three-layered networks, each trained 50 times, and the error on the test data was used as the criterion to compare the models. The optimal number of neurons was found to be 20, with two hidden layers. Table 6 summarizes the network architecture. As shown in Fig. 7, the deseasonalized time series is divided into three successive parts: (a) a training set comprising the first 658 hourly observations, from October 1, 2012 at 01:00 to October 28, 2012 at 10:00; (b) a validation set gathering the next 658 hours, from October 28, 2012 at 11:00 to November 24, 2012 at 20:00; and (c) a test set covering the hourly data from November 24, 2012 at 21:00 to November 28, 2012 at 00:00, for a total of 100 testing data points. A one-step-ahead prediction experiment is performed over the test set using an iterative forecasting scheme. The root mean squared error (RMSE), the mean absolute error (MAE), and the mean absolute percentage error (MAPE) are used as performance indices to evaluate the prediction capability of each model. Typically, we choose the model that minimizes these criteria, which are formally given by:

RMSE = √( (1/M) Σ_{i=1}^{M} (y_i − ŷ_i)² ),   (15)

MAE = (1/M) Σ_{i=1}^{M} |y_i − ŷ_i|,   (16)

MAPE = (1/M) Σ_{i=1}^{M} |(y_i − ŷ_i)/y_i|,   (17)

where M is the size of the test set and ŷ_i is the predicted value of y_i.
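The three criteria of Eqs. (15)-(17) are straightforward to implement; a small sketch with hand-checkable values:

```python
import numpy as np

def rmse(y, yhat):
    """Root mean squared error, Eq. (15)."""
    return np.sqrt(np.mean((y - yhat) ** 2))

def mae(y, yhat):
    """Mean absolute error, Eq. (16)."""
    return np.mean(np.abs(y - yhat))

def mape(y, yhat):
    """Mean absolute percentage error, Eq. (17); multiply by 100 for percent."""
    return np.mean(np.abs((y - yhat) / y))

y = np.array([30.0, 32.0, 40.0])
yhat = np.array([31.0, 30.0, 44.0])       # errors: -1, 2, -4
assert abs(rmse(y, yhat) - 7 ** 0.5) < 1e-12       # sqrt((1+4+16)/3)
assert abs(mae(y, yhat) - 7 / 3) < 1e-12           # (1+2+4)/3
assert abs(mape(y, yhat) - (1/30 + 2/32 + 4/40) / 3) < 1e-12
```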
Table 7. One-step-ahead prediction errors for the ARFIMA, ANN, ARIMA-ANN, and ARFIMA-ANN models.
Model                  RMSE     MAE      MAPE (%)
ARFIMA-ANN model       0.0358   0.0132   6.47
ANN model              0.0641   0.0218   10.93
ARFIMA model           0.7599   0.5478   13.89
Zhang's hybrid model   0.0499   0.0193   9.23

Fig. 8. Actual values versus ARFIMA prediction.

Fig. 10. Actual values versus ARFIMA-ANN predictions.

Fig. 11. Actual values versus Zhang's hybrid model predictions.
Fig. 9. Actual values versus ANN prediction.
To evaluate the prediction performance of the proposed hybrid methodology, we carried out prediction experiments using a pure ARFIMA model, a pure ANN model, and Zhang's hybrid (ARIMA-ANN) model.⁶ The forecast evaluation results, in terms of RMSE, MAE, and MAPE, are reported in Table 7. The ARFIMA-ANN model outperforms all other competing techniques in terms of prediction accuracy: its prediction errors are the smallest for all evaluation criteria. As shown in Figs. 8-11, the ARFIMA-ANN model is the best in terms of prediction accuracy. These results are noteworthy, since it has always been difficult to achieve such precision when modeling power spot prices, and they highlight the relevance of the proposed methodology.

⁶ The best linear ARIMA model is found to be the random walk model, and a neural network model composed of seven input, five hidden, and one output neurons is used to model the nonlinear patterns.
6. Conclusion

Forecasting electricity prices at different time frames is valuable for all market participants. Prior knowledge of the electricity price is important for risk management, and it may represent an advantage for a market player facing competition. Despite the numerous time-series models available, research on improving the forecasting of electricity spot prices has never stopped. To overcome the deficiencies of commonly used models and yield more accurate results, the present study has presented a framework that uses ARFIMA and neural network models jointly, to capture both the long memory and the nonlinear patterns that may be present in a time series. The proposed method was applied to the NordPool market, one of the most promising and rising power markets in the world. The numerical results show that the proposed prediction system significantly outperforms the existing approaches in modeling and prediction. The proposed model therefore leads to improved performance and can be an effective tool for forecasting, especially when higher forecasting accuracy is needed. This supports the validity of the suggested forecasting method.

Acknowledgment

The author would like to thank the anonymous reviewers for their valuable and constructive reports on a first draft of this article.
References

[1] Weron R, Bierbrauer M, Trück S. Modeling electricity prices: jump diffusion and regime switching. Phys A: Stat Mech Appl 2004;336(1):39–48.
[2] Harris C. Electricity markets: pricing, structures and economics. Wiley; 2006. p. 328.
[3] Koopman S, Ooms M, Carnero M. Periodic seasonal REG-ARFIMA–GARCH models for daily electricity spot prices. J Am Stat Assoc 2007;102(477):16–27.
[4] De Jong C, Huisman R. Option formulas for mean-reverting power prices with spikes. Energy Power Risk Manage 2002;7:12–6.
[5] Bierbrauer M, Trück S, Weron R. Modeling electricity prices with regime switching models. In: Computational Science – ICCS 2004; 2004. p. 859–67.
[6] Haldrup N, Nielsen M. A regime switching long memory model for electricity prices. J Econ 2006;135(1):349–76.
[7] Diongue A, Guegan D, Vignal B. Forecasting electricity spot market prices with a k-factor GIGARCH process. Appl Energy 2009;86(4):505–10.
[8] Meyer-Brandis T, Tankov P, et al. Multi-factor jump-diffusion models of electricity prices. Int J Theor Appl Finance 2008;11:503–28.
[9] Klüppelberg C, Meyer-Brandis T, Schmidt A. Electricity spot price modelling with a view towards extreme spike risk. Quant Finance 2010;10(9):963–74.
[10] Fanone E, Gamba A, Prokopczuk M. The case of negative day-ahead electricity prices. Energy Econ 2011.
[11] Wang A, Ramsay B. A neural network based estimator for electricity spot pricing with particular reference to weekend and public holidays. Neurocomputing 1998;23(1–3):47–57.
[12] Szkuta B, Sanabria L, Dillon T. Electricity price short-term forecasting using artificial neural networks. IEEE Trans Power Syst 1999;14(3):851–7.
[13] Khashei M, Bijari M. An artificial neural network model for timeseries forecasting. Expert Syst Appl 2010;37(1):479–89.
[14] Zhang G. Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing 2003;50:159–75.
[15] Pai P, Lin C. A hybrid ARIMA and support vector machines model in stock price forecasting. Omega 2005;33(6):497–505.
[16] Chen K, Wang C. A hybrid SARIMA and support vector machines in forecasting the production values of the machinery industry in Taiwan. Expert Syst Appl 2007;32(1):254–64.
[17] Valenzuela O, Rojas I, Rojas F, Pomares H, Herrera L, Guillen A, et al. Hybridization of intelligent techniques and ARIMA models for time series prediction. Fuzzy Sets Syst 2008;159(7):821–45.
[18] Tan Z, Zhang J, Wang J, Xu J. Day-ahead electricity price forecasting using wavelet transform combined with ARIMA and GARCH models. Appl Energy 2010;87(11):3606–10.
[19] Che J, Wang J. Short-term electricity prices forecasting based on support vector regression and auto-regressive integrated moving average modeling. Energy Convers Manage 2010;51(10):1911–7.
[20] Shafie-Khah M, Moghaddam M, Sheikh-El-Eslami M. Price forecasting of day-ahead electricity markets using a hybrid forecast method. Energy Convers Manage 2011;52(5):2165–9.
[21] Wang J, Wang J, Zhang Z, Guo S. Stock index forecasting based on a hybrid model. Omega 2012;40(6):758–66.
[22] Zhang J, Tan Z, Yang S. Day-ahead electricity price forecasting by a new hybrid method. Comput Ind Eng 2012.
[23] Granger C, Joyeux R. An introduction to long-memory time series models and fractional differencing. J Time Ser Anal 1980;1(1):15–29.
[24] Hosking J. Fractional differencing. Biometrika 1981;68(1):165–76.
[25] Gradshteyn I, Ryzhik I. Table of integrals, series and products. Jeffrey A, editor. 5th ed. New York: Academic Press; 1994.
[26] Baillie R, Bollerslev T, Mikkelsen H. Fractionally integrated generalized autoregressive conditional heteroskedasticity. J Econ 1996;74(1):3–30.
[27] Montanari A, Taqqu M, Teverovsky V. Estimating long-range dependence in the presence of periodicity: an empirical study. Math Comput Model 1999;29(10):217–28.
[28] Doukhan P, Oppenheim G, Taqqu M. Theory and applications of long-range dependence. Birkhauser; 2003.
[29] Achard S, Coeurjolly J. Discrete variations of the fractional Brownian motion in the presence of outliers and an additive noise. Stat Surv 2010;4:117–47.
[30] Zhang G, Patuwo BE, Hu MY. Forecasting with artificial neural networks: the state of the art. Int J Forecasting 1998;14(1):35–62.
[31] Gençay R, Selçuk F, Whitcher BJ. An introduction to wavelets and other filtering methods in finance and economics. Elsevier; 2001.
[32] Hornik K, Stinchcombe M, White H. Multilayer feedforward networks are universal approximators. Neural Netw 1989;2(5):359–66.
[33] Zurada J. Introduction to artificial neural systems, vol. 8. New York: West Publishing Company; 1992.
[34] Nikolaev N, Iba H. Adaptive learning of polynomial networks: genetic programming, backpropagation and Bayesian methods. Springer-Verlag New York Inc.; 2006.
[35] White H. Artificial neural networks: approximation and learning theory. Blackwell Publishers Inc.; 1992.
[36] Brockwell P, Davis R. Introduction to time series and forecasting. Springer-Verlag; 2002.
[37] Weron R. Modeling and forecasting electricity loads and prices: a statistical approach. HSC Books; 2006.
[38] Doornik J, Ooms M. Introduction to Ox: an object-oriented matrix language. Timberlake Consultants Ltd.; 2007.
[39] Cheung Y. Tests for fractional integration: a Monte Carlo investigation. J Time Ser Anal 1993;14(4):331–45.
[40] Broock W, Scheinkman JA, Dechert WD, LeBaron B. A test for independence based on the correlation dimension. Econ Rev 1996;15(3):197–235.