Energy Conversion and Management 52 (2011) 1157–1165
Contents lists available at ScienceDirect
Energy Conversion and Management journal homepage: www.elsevier.com/locate/enconman
Stochastic models for wind speed forecasting S. Bivona, G. Bonanno, R. Burlon, D. Gurrera ⇑, C. Leone Dipartimento di Fisica e Tecnologie Relative, Università degli Studi di Palermo, Italy
a r t i c l e
i n f o
Article history: Received 20 January 2010 Accepted 4 September 2010
Keywords: Stochastic models Time series Model selection Spectral analysis Artificial neural networks Wind forecasting
a b s t r a c t This paper is concerned with the problem of developing a general class of stochastic models for hourly average wind speed time series. The proposed approach has been applied to the time series recorded during 4 years in two sites of Sicily, a region of Italy, and it has attained valuable results in terms both of modelling and forecasting. Moreover, the 24 h predictions obtained employing only 1-month time series are quite similar to those provided by a feed-forward artificial neural network trained on 2 years data. Ó 2010 Elsevier Ltd. All rights reserved.
1. Introduction The idea of using a mathematical model to describe the behaviour of a physical phenomenon is well established. In particular, it is sometimes possible to derive a deterministic model based on physical laws which enables us to calculate the value of some time-dependent quantity nearly exactly at any instant of time. In many problems however we have to consider a time-dependent phenomenon in which there are many unknown factors or large experimental resources needed to produce a deterministic model are not available. In such cases, it may be possible to derive a stochastic model. As for wind speed forecasting, which is the subject of this paper, fluid-dynamical models have reached a great degree of accuracy because of the advent of the satellite technology providing an inestimable amount of data and because of the computing improvement necessary for equation solving. However, up to now they cannot solve the problem completely given that, because of the great complexity of the addressed atmospheric phenomenon, they may lack great accuracy when focussing on the local scale and on the hour to hour variations. In this sense, the stochastic models approach can provide an effective support and together they can yield a better solution to the wind speed variability problem for the exploitation of wind power [1,2].
⇑ Corresponding author. Address: Dipartimento di Fisica e Tecnologie Relative, Università degli Studi di Palermo Viale delle Scienze, Ed. 18, 90128 Palermo, Italy. Tel.: +39 09123899015, mobile: +39 3332185747; fax: +39 09123860815. E-mail address:
[email protected] (D. Gurrera). 0196-8904/$ - see front matter Ó 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.enconman.2010.09.010
The first studies on hourly average wind speed (HAWS) forecasting were based upon a Monte Carlo simulation, given an assumed wind speed distribution, but the predictions obtained using this method presented the weakness of not considering the correlation between consecutive observations. Successive researches have attempted to incorporate autocorrelation into wind speed models. In 1984, Brown et al. [3] proposed a method to take into account all the main features of HAWS data, but they applied it to a single case study. Seasonal nonstationarity was removed by considering a reasonable short series, while diurnal nonstationarity was taken off by data standardization. In order to apply a pure autoregressive model, the non-Gaussian wind speed distribution was taken into account by applying a power transformation. Particularly, the authors indicated the square root as the most appropriate choice to obtain an approximate normal distribution. After the seminal paper of Brown et al. the standardization technique to remove diurnal nonstationarity became popular, other power transformations have been proposed and the obtained data were modelled by some kind of autoregressive moving average model. The estimation of the power k in the Box–Cox transformation.
(
v
0 t ðkÞ
¼
v kt 1 k
if k – 0
lnðv t Þ if k ¼ 0
ð1Þ
was usually carried out on the basis of the following considerations: – k = g/3.6, where g is the estimated shape parameter of the Weibull probability density function fitted to the actual distribution, so as to obtain a Weibull distribution with shape parameter equal to 3.6, approximately Gaussian (see [3] for details);
1158
S. Bivona et al. / Energy Conversion and Management 52 (2011) 1157–1165
– k = 0.5, equivalent to the assumption that the square normal distribution is adequate for wind speed [3]; – k is selected so as to minimize the sample skewness coefficient of the transformed distribution [4–7].
The automatic approach for wind speed modelling and forecasting proposed in this work is applied to the HAWS time series recorded from the 1st January 2003 to the 31st December 2006 in two sites of Sicily (Italy), namely Cammarata and Mazzarrone, at 10 m above ground by the Servizio Informativo Agrometeorologico Siciliano. The two locations are distant from each other and present different features. Cammarata (UTM ED 50: 41°650 75.000 N– 38°850 0000 W) is at 350 m above sea level, near the centre of Sicily, and it is surrounded by mountains; Mazzarrone (UTM ED 50: 41°050 61.300 N–46°100 5200 W) is at 300 m above sea level, in the south-east, and it is situated in a more exposed area. The HAWS values are obtained using the standard fixed by the World Meteorological Organization.
3. The proposed approach The contribution of this work to the problem of wind speed forecasting is based on the development of a general class of stochastic models. However, given a stochastic process {Zt}, and only one realization of its {zt}, it is not straightforward to determine a possible model. Indicating with {Wt} an i.i.d. Gaussian noise, the general linear model
Zt ¼
1 X
pj Z tj þ W t ¼
j¼1
1 X
wj W tj þ W t
ð2Þ
j¼1
2
4
6
8
10
with suitable values for the coefficients pj e wj is entirely adequate to describe many different types of time series, but a parsimonious model that can be explained by the ‘‘physical” mechanism that generates the data would be obviously preferable. In the following, a parsimonious class of models for wind speed modelling and forecasting is developed, using the data described in the previous section.
0
5
10
15 Days
20
25
30
0.2
0.6
Spectrum 0.1 0.5 5.0 50.0
1.0
0
−0.2
Autocorrelation Coefficient
Hourly Average Wind Speed (m/s)
As for prediction, referred works are characterized by shortterm forecasts while recently Kavassari and Seetharaman [8] employing long-memory models have obtained reliable forecasts 24–48 h ahead. Their analysis however was restricted to only four time series. Comparing the previous works on the same subject, this research introduces some novel features. The first concerns the treatment of diurnal nonstationarity. The approach to HAWS forecasting proposed in this paper attains stationarity by the use of the differencing technique and employs seasonal models to allow for randomness in the daily cycle and to take into account the day to day correlation. This would be impossible by using the simple ARMA models suggested in the literature, given that they can only take into account very short-term correlation. As a result, most of the obtained forecasts show no degradation during the considered forecast window whose length is 24 h (medium-range prediction). Finally, the Box–Cox transformation elsewhere considered to obtain a normal distribution, which is an essential requisite for the considered type of stochastic models, has been applied to the time series employed in this study using each of the aforementioned power estimation methods, but in no case the transformed series have passed the Shapiro–Wilk (S–W) test for normality. Previous works have omitted to perform similar tests. To solve this problem, a different transformation has been employed in our analysis, already used by Amato et al. [9] working on the daily solar irradiance. The conclusions drawn in this study are based upon the detailed analysis of a sizeable data set: 96 time series, recorded during 4 years in two different sites of Italy. Time series analysis and models development have been performed using the R environment while as for the artificial neural network (ANN) approach the Java Neural Network Simulator, developed at the Wilhelm-SchickardInstitute for Computer Science in Tübingen (Germany), has been employed.
2. Data description
0
1
2 3 Lag (days)
4
5
0.0
0.1 0.2 0.3 0.4 Frequency (1/h)
0.5
Fig. 1. Top – HAWS (m/s) at 10 m above ground in Mazzarrone, March 2006. Bottom – autocorrelation function and spectrum (the broken lines stand at frequencies 1/24 h, 2/ 24 h, 3/24 h) of the series shown above.
S. Bivona et al. / Energy Conversion and Management 52 (2011) 1157–1165
1159
3.1. Data pretreatment The investigated process is highly nonstationary and, before attempting any model, any sources of nonstationarity must be removed. Moreover, the stochastic models employed in this study, also known as Box–Jenkins (B–J) models [10], were rigorously developed for random variables possessing a Gaussian distribution, while many previous researches [11] have already revealed a remarkable deviation from (univariate) normality. Therefore, in this section a recipe is given which allows to obtain a time series that may be considered the realization of a stationary, normal process. As in previous studies, seasonal nonstationarity is removed by a monthly stratification thus obtaining 48 time series for each of the two considered sites. After such stratification, the only expected residual periodicity should be the daily cycle typical of wind speed on a land area. To verify this, a spectral analysis has been performed for each time series. Fig. 1 shows one of the analysed monthly time series along with its autocorrelation function and spectrum, the latter obtained using an opportune autoregressive model fitted to the data. The presence of the daily cycle is outstanding. The same holds for the other months and for the other site, with some few exceptions especially in winter when strong atmospheric disturbances are predominant. As it is expected in Fig. 1, the autocorrelation function and the spectrum provide the same information, but in some cases the latter fails to reveal the aforesaid period shown instead by the former. Having removed any possible seasonal variation, we still need to transform the obtained monthly time series in order to eliminate any trend, the daily cycle and to make the data normally distributed. There are two main approaches to face this last problem. One ignores the non-Gaussian distribution of wind speed data at all (e.g. [12]), the other applies a Box–Cox transformation to make their distribution approximately Gaussian (Section 1). In view of the insufficient results obtained using this power transformation, a new approach is presented.
Fig. 3. Fit of the Weibull cumulative distribution function to the empirical cumulative probabilities of the series shown in Fig. 1.
The new time series is obtained solving the following equation for zt:
1 pffiffiffiffiffiffiffi r 2p
Z
zt
e
ðx v Þ2 2r2
dx ¼ Pðv t Þ
ð3Þ
1
{vt} being the empirical non-Gaussian series, {P(vt)} the series of estimated empirical cumulative probabilities (i.e. the probabilities the mean and r the standard for wind speed not to exceed vt), v deviation of {vt}. The left hand side of this equation is the Gaussian cumulative distribution function D(zt), usually expressed in terms of the error function:
Fig. 2. Histogram and spectrum (the broken lines stand at frequencies 1/24 h, 2/24 h, 3/24 h) of the time series shown in Fig. 1 (top) and of the same series transformed using the error function (bottom).
1160
S. Bivona et al. / Energy Conversion and Management 52 (2011) 1157–1165
1 zt v pffiffiffi Dðzt Þ ¼ 1 þ erf 2 r 2
ð4Þ
It allows to build a series {zt} whose distribution is normal. In fact, most of the transformed time series have passed the S–W test as it will be shown later in Section 5.1. The only Achilles’ heel in this method is represented by equal values of wind speed which correspond to the same value for zt and eventually may produce a skew distribution. For example, departure from normality may be encountered when a lot of calms occur during a month. The above transformation also leaves the 24 h period of the original time series unaltered. This has been empirically verified for each series by comparing the spectra of the original and of the transformed data. An example is shown in Fig. 2, along with the distributions of {vt} and {zt}. A B–J model can now be applied to the zt’s to predict future values which eventually must be back-transformed to be of any utility. This can be accomplished by fitting the Weibull cumulative distribution function v
F W ðv Þ ¼ 1 eðbÞ
to the cumulative frequency distribution P(vt). Once a and b have been estimated, the equation
Dðzt Þ ¼ F W ðv t Þ
ð6Þ
may be solved for the forecasted vt’s. The Weibull distribution function has proved to fit perfectly almost for each time series and an example is shown in Fig. 3. 3.2. The proposed class of models In view of the previous considerations, a SARIMA class of models with period s = 24 h is proposed. In order to obtain stationary series, a first order seasonal difference (D = 1) and a first order nonseasonal difference (d = 1) are taken to remove diurnal nonstationarity and trend respectively. A model selection criterion, here the bias-corrected Akaike’s Information Criterion (AICC), is employed to select the most appropriate, invertible SARIMA model within the proposed class:
SARIMA ðp; 1; qÞ ðP; 1; Q Þ24 ;
a
with p; q; P; Q 6 3:
ð7Þ
ð5Þ
Table 1 For each analysed time series, the p-value of the S–W test, the selected SARIMA model (models marked by two asterisks have not passed the L–B–P test) and its out of sample forecasting accuracy, as assessed by the MASE, are reported. In 2006, the MASE values on the right indicate the forecasting accuracy of the described ANN trained on 2 years data. Month and year
01
02
03
04
05
06
07
08
09
10
11
12
Cammarata
2003 2004 2005 2006 2003 2004 2005 2006 2003 2004 2005 2006 2003 2004 2005 2006 2003 2004 2005 2006 2003 2004 2005 2006 2003 2004 2005 2006 2003 2004 2005 2006 2003 2004 2005 2006 2003 2004 2005 2006 2003 2004 2005 2006 2003 2004 2005 2006
Mazzarrone
S-W test
Model
MASE
S–W test
Model
MASE
0.57 <2.2e-16 0.069 0.14 0.25 1.7e-14 0.53 0.44 0.27 2.9e-13 0.27 0.54 0.45 0.13 0.55 0.29 0.55 0.029 0.24 0.37 0.43 0.0039 0.00013 0.47 0.00071 1.1e-9 0.25 0.24 0.34 2.5e-16 0.25 0.21 0.19 2.9e-15 0.082 0.099 0.16 7.3e-16 0.14 0.18 0.011 0.0069 0.15 0.31 3.2e-8 0.28 0.25 0.13
(2,1,2)(0,1,3) (1,1,3)(0,1,2) (1,1,1)(0,1,1) (2,1,2)(0,1,2) (3,1,1)(1,1,2) (3,1,2)(1,1,3) (2,1,2)(0,1,2) (3,1,3)(0,1,2) (3,1,3)(0,1,2) (3,1,3)(0,1,1) (1,1,1)(0,1,1) (2,1,2)(0,1,2) (2,1,1)(0,1,2) (1,1,3)(0,1,1) (1,1,3)(0,1,1) (3,1,2)(1,1,2) (1,1,1)(0,1,1) (2,1,1)(0,1,3) (1,1,1)(0,1,2) (1,1,1)(0,1,2) (2,1,2)(0,1,2) (1,1,1)(0,1,2) (1,1,1)(0,1,2) (1,1,1)(0,1,1) (1,1,3)(0,1,1) (2,1,1)(0,1,3) (2,1,1)(0,1,3) (2,1,2)(0,1,2) (3,1,3)(0,1,2) (3,1,3)(0,1,3) (1,1,1)(0,1,1) (3,1,3)(1,1,2) (1,1,1)(0,1,2) (1,1,3)(0,1,3) (3,1,3)(0,1,1) (1,1,1)(0,1,2) (2,1,3)(0,1,2) (3,1,3)(1,1,1) (2,1,2)(0,1,2) (1,1,1)(0,1,1) (3,1,3)(1,1,3) (3,1,3)(1,1,2) (3,1,3)(0,1,2) (3,1,2)(0,1,3) (2,1,3)(0,1,2) (3,1,2)(0,1,2) (3,1,1)(0,1,2) (2,1,3)(1,1,2)
2.18 2.24 1.18 2.02–1.65 0.93 1.95 0.69 0.91–1.22 1.23 1.42 0.86 1.13–1.25 1.02 0.96 0.80 1.42–1.53 0.98 0.78 1.30 1.99–1.82 0.94 0.68 1.40 1.07–1.02 1.01 0.84 0.79 1.07–1.34 0.99 0.84 0.78 1.20–1.12 0.48 1.06 1.93 1.01–1.08 4.21 4.38 0.94 1.38–1.51 2.35 3.45 2.53 1.29–2.20 1.01 1.05 2.50 1.50–1.41
0.65 0.59 0.52 0.84 0.88 0.60 0.66 0.0031 0.69 0.86 0.36 0.83 0.0044 0.00026 0.88 0.41 0.58 2.7e-14 0.54 0.22 0.69 2.2e-5 0.54 0.66 0.27 5.6e-6 1.7e-6 0.41 0.62 2.3e-7 0.74 0.27 0.77 1.8e-14 0.60 0.11 0.67 4.4e-15 0.054 0.57 0.42 6.3e-11 0.11 0.59 0.79 0.73 0.42 0.45
(3,1,3)(1,1,2) (1,1,3)(0,1,2) (3,1,1)(0,1,1) (1,1,2)(0,1,1) (2,1,2)(0,1,3) (3,1,3)(0,1,2) (1,1,2)(0,1,1) (3,1,3)(0,1,3) (1,1,2)(0,1,1) (1,1,1)(0,1,2) (1,1,2)(0,1,1) (2,1,3)(0,1,1) (2,1,1)(0,1,1) (3,1,3)(0,1,1) (3,1,3)(0,1,2) (3,1,1)(1,1,3) (2,1,1)(0,1,1) (2,1,2)(1,1,2) (1,1,2)(0,1,2) (1,1,3)(0,1,2) (2,1,1)(1,1,1) (1,1,2)(0,1,1) (2,1,2)(0,1,1) (1,1,2)(0,1,1) (1,1,1)(0,1,1) (2,1,1)(1,1,2) (3,1,1)(0,1,2) (1,1,2)(0,1,2) (2,1,1)(0,1,2) (1,1,1)(0,1,2) (3,1,3)(1,1,1) (3,1,3)(1,1,2) (3,1,3)(1,1,1) (3,1,3)(0,1,3) (3,1,3)(0,1,2) (1,1,2)(0,1,2) (1,1,1)(0,1,1) (2,1,3)(0,1,3) (3,1,3)(2,1,3) (3,1,2)(0,1,3) (2,1,1)(0,1,1) (1,1,1)(0,1,3) (2,1,2)(1,1,3) (3,1,3)(0,1,2) (1,1,1)(0,1,3) (1,1,3)(0,1,2) (3,1,1)(0,1,2) (1,1,2)(0,1,3)
1.54 1.38 1.71 4.45–2.44 1.91 1.01 6.68 1.43–0.95 2.99 2.56 1.05 0.58–0.61 0.85 1.13 1.00 0.96–1.07 1.43 1.30 1.16 1.18–1.79 1.53 1.02 1.21 1.16–1.27 2.27 1.18 0.80 0.93–1.30 1.08 0.98 1.57 1.22–1.15 2.07 1.32 1.65 0.71–1.00 2.79 3.45 1.69 1.17–1.26 0.86 1.03 1.93 1.37–1.74 1.25 1.47 2.15 1.86–1.86
1161
S. Bivona et al. / Energy Conversion and Management 52 (2011) 1157–1165
To assess the accuracy of the models selected by the AICC, their residuals are analysed and tested for randomness using the Ljung– Box–Pierce (L–B–P) statistic. In this study, a model has been considered adequate only if the p-value to reject the null hypothesis of randomness has exceeded test level, assumed to be p = 0.05, for each lag h = 1,2, . . . , 48 (hours). The chosen number of lags is arbitrary, but completely adequate given that it includes two entire cycles. 4. An artificial neural network approach The 24 h forecasts obtained by the models proposed in Section 3.2 are compared with those provided by a suitable artificial neural
Table 2 For each analysed time series, the out of sample forecasting accuracy of the selected SARIMA model, as assessed by the MAPE, is reported. In 2006, the MAPE values on the right indicate the forecasting accuracy of the described ANN trained on 2 years data. Values are missing when the MAPE is undefined. Month and year
Cammarata
Mazzarrone
MAPE
MAPE
01
2003 2004 2005 2006
265.83 – 31.30 143.07–86.68
64.05 100.74 41.51 72.25–25.85
02
2003 2004 2005 2006
34.99 104.51 52.42 52.49–48.22
79.83 29.38 55.94 110.12–71.79
03
2003 2004 2005 2006
110.45 72.92 41.70 101.12–120.30
45.24 80.37 33.35 27.03–27.70
04
2003 2004 2005 2006
96.78 35.93 31.55 34.47–53.58
75.56 – 38.40 60.33–45.56
05
2003 2004 2005 2006
44.01 – 57.79 89.64–77.25
52.35 69.28 42.38 52.80–84.31
06
2003 2004 2005 2006
24.44 – 32.80 43.04–28.39
34.70 – 29.88 37.01–59.99
07
2003 2004 2005 2006
26.16 – 30.56 89.72–112.23
40.19 29.69 25.18 59.09–85.65
08
2003 2004 2005 2006
30.58 – 54.04 44.29–36.45
46.37 48.76 52.20 36.63–33.48
09
2003 2004 2005 2006
33.45 – 48.53 46.64–53.07
52.61 – 63.02 51.95–72.21
10
2003 2004 2005 2006
43.14 71.50 47.55 81.86–95.34
40.90 63.04 – 34.53–47.79
11
2003 2004 2005 2006
– 73.94 256.18 132.34–245.34
25.41 – 200.00 32.28–45.79
12
2003 2004 2005 2006
– 73.43 43.60 33.42–45.57
61.39 45.01 37.20 –
Table 3 The out of sample forecasting accuracy in 2006 of the selected SARIMA models (left) and of the described ANN trained on 2 years data (right), as assessed by the RMSE and by the MAE. Month and year
01 02 03 04 05 06 07 08 09 10 11 12
2006 2006 2006 2006 2006 2006 2006 2006 2006 2006 2006 2006
Cammarata
Mazzarrone
RMSE (m/s)
MAE (m/s)
RMSE (m/s)
MAE (m/s)
1.33–1.10 0.65–0.89 0.98–1.09 1.26–1.14 1.33–1.18 0.74–0.75 0.66–0.84 0.77–0.77 0.72–0.84 0.89–1.02 0.76–1.33 1.07–0.90
1.14–0.93 0.54–0.72 0.73–0.81 0.78–0.85 1.02–0.93 0.61–0.58 0.56–0.71 0.63–0.59 0.57–0.61 0.71–0.77 0.61–1.05 0.70–0.66
2.51–1.56 0.93–0.66 0.42–0.43 0.67–0.72 0.81–1.09 0.69–0.83 0.59–0.84 0.80–0.91 0.53–0.65 0.62–0.62 0.68–1.00 0.86–0.82
2.18–1.20 0.80–0.53 0.34–0.36 0.55–0.61 0.56–0.85 0.58–0.64 0.48–0.67 0.68–0.64 0.38–0.53 0.51–0.55 0.60–0.76 0.73–0.73
network. Given their inherent ability in treating highly complex, even nonlinear problems, ANNs are widely used for weather forecasting. They operate like a ‘‘black box” model, usually requiring a considerable amount of data to be trained. On the other hand, they do not need any particular data pretreatment like the proposed transformation or data difference. Details about them and their use in this field may be found elsewhere [12–16]. In order to handle a problem using an ANN, it is necessary, first of all, to choose an appropriate architecture and select a learning method. In this work, a feed-forward connection will be used, with one hidden layer. In fact, in feed-forward networks with any of a wide variety of continuous nonlinear hidden-layer activation functions, one hidden layer with an arbitrarily large number of units suffices for the universal approximation property. But there is no theory yet to determine how many hidden units are needed to approximate any given function, the problem being still an important topic in neural networks research [17–19]. In this work, in order to select the number of hidden neurons (HNN), we have made use of the following empirical formula [15] based on the number of input and output neurons (INN and ONN, respectively) and the number of training patterns (TPN):
HNN ¼ ðINN þ ONNÞ=2 þ
pffiffiffiffiffiffiffiffiffi TPN
ð8Þ
The INN has been chosen by considering the results for the number of parameters in the SARIMA models selection (Section 5.1). By expanding the equation of the general SARIMA process for the maximum number of parameters, it results that the forecasts from the SARIMA models depend nearly on the last 2 days of observations and on the last 4 days of the noise process. Therefore, we have developed an ANN that uses 3 days of history to forecast 24 h ahead. For each of the two sites the data set has been divided into the following: a training set consisting of 2003 and 2004 data; a validation set consisting of 2005 data; a test set consisting of 2006 data. The selected feed-forward ANN is made up of: one input layer, with 72 neurons (corresponding to 3 days of HAWS values); one hidden layer; one output layer, with 24 neurons (corresponding to 1 day prediction). With 182 training patterns, (8) provides a number of hidden units equal to 60. Tests performed on the ANN, increasing the number of hidden neurons from 10 up to 60 with an increment of 5, have shown that the best generalization performance is achieved using 40 hidden units, which is therefore the number of neurons used in the hidden layer. In order to avoid overfitting, the backpropagation training was stopped when the generalization error, measured upon the validation set, ceased decreasing and remained fairly constant for a number of epochs.
1162
S. Bivona et al. / Energy Conversion and Management 52 (2011) 1157–1165
Table 4 Comparison between the proposed transformation and the Box–Cox (using each of the discussed power estimation methods) in attaining a normal distribution, as assessed by the S–W test applied to five representative examples.
Cammarata January 2004 Mazzarrone August 2004 Cammarata May 2004 Cammarata March 2003 Mazzarrone April 2005
No. of calms
Erf Tr.
k = g/3.6
k = 0.5
Min. skewness
Max. likelihood
194 45 6 0 1
p < 2.2e-16 p = 2.3e-7 p = 0.029 p = 0.27 p = 0.88
p < 2.2e-16 (k = 0.170) p < 2.2e-16 (k = 0.373) p = 1.5e-9 (k = 0.274) p = 3.5e-7 (k = 0.482) p = 2.0e-5 (k = 0.432)
p < 2.2e-16 p = 6.8e-12 p = 9.5e-15 p = 1.2e-7 p = 1.1e-6
p < 2.2e-16 (k = 0.448) p = 6.4e-8 (k = 0.687) p = 1.4e-9 (k = 0.273) p = 0.0016 (k = 0.262) p = 6.4e-5 (k = 0.352)
p < 2.2e-16 (k = 0.241) p = 1.1e-8 (k = 0.759) p < 2.2e-16 (k = 0.039) p = 0.0016 (k = 0.261) p = 1.8e-12 (k = 0.744)
5. Results and discussion In order to assess the out of sample forecasting performance of the models developed in this work, four different indicators of accuracy have been used, namely the mean absolute error (MAE), the root mean square error (RMSE), the mean absolute percentage error (MAPE) and the mean absolute scaled error (MASE) [20]. The main results are now presented and summarized in Tables 1–3. They include the comparison with the forecasts for the year 2006 provided by the ANN set out in Section 4.
that 67 out of the 96 transformed time series (70%) have passed this test at level .1 [21]. The failure of the transformation in the remaining cases is caused by the presence of a considerable number of equal measures which eventually produces a rather skew distribution of the transformed data. However, also in these cases,
(a)
5.1. Modelling accuracy In order to apply B–J models, the recorded data have been transformed using (3) and the obtained data have been tested for normality by the S–W statistic. By inspecting Table 1 which reports the p-value to reject the null hypothesis of normality, it emerges
(b) Table 5 Adjusted models, developed by a non-automatic analysis, for those in Table 1 marked by two asterisks.
Cammarata January 2004 Cammarata February 2006 Cammarata May 2006 Mazzarrone August 2003 Mazzarrone October 2004 Mazzarrone June 2006 Mazzarrone July 2006
Model
MASE
(1,1,5)(0,1,2) (1,1,1)(0,1,1) (1,1,2)(0,1,3) (2,1,2)(0,1,3) (3,1,6)(0,1,1) (3,1,6)(0,1,1) (2,1,5)(0,1,2)
0.74 0.82 2.00 1.12 3.49 1.18 0.97
(a)
(b)
Fig. 4. The forecasting accuracy, as assessed by the MASE, of the developed SARIMA models for the two considered sites, Cammarata (a) and Mazzarrone (b). The four symbols stand for different years (blank square = 2003, blank circle = 2004, filled square = 2005, filled circle = 2006).
Fig. 5. The forecasting accuracy, as assessed by the MAPE, of the developed SARIMA models for the two considered sites, Cammarata (a) and Mazzarrone (b). The four symbols stand for different years (blank square = 2003, blank circle = 2004, filled square = 2005, filled circle = 2006). Points are missing when the MAPE is undefined.
(a)
(b)
Fig. 6. The forecasting accuracy in 2006, as assessed by the MASE, of the developed SARIMA models (filled circle) and of the described ANN trained on 2 years data (blank circle) for the two considered sites, Cammarata (a) and Mazzarrone (b).
1163
S. Bivona et al. / Energy Conversion and Management 52 (2011) 1157–1165
(a)
(a)
(b)
(b)
Fig. 7. The forecasting accuracy in 2006, as assessed by the MAPE, of the developed SARIMA models (filled circle) and of the described ANN trained on 2 years data (blank circle) for the two considered sites, Cammarata (a) and Mazzarrone (b). Points are missing when the MAPE is undefined.
Fig. 9. The forecasting accuracy in 2006, as assessed by the MAE, of the developed SARIMA models (filled circle) and of the described ANN trained on 2 years data (blank circle) for the two considered sites, Cammarata (a) and Mazzarrone (b).
(a)
(b)
Fig. 8. The forecasting accuracy in 2006, as assessed by the RMSE, of the developed SARIMA models (filled circle) and of the described ANN trained on 2 years data (blank circle) for the two considered sites, Cammarata (a) and Mazzarrone (b).
the suggested approach attains better results, as valued by the same statistic, in comparison with the power transformation elsewhere proposed, as it is shown in Table 4 for five representative examples using each of the power selection methods discussed in the Introduction and the maximum likelihood method derived by Box–Cox [22]. In order to evaluate the accuracy of the models selected by the AICc from the class proposed in Section 3.2, their residuals have been tested by the L–B–P statistic and it results that 93% of the selected models may be considered adequate. As for the remaining cases, the automatic approach suggested in this work does not work perfectly but it can provide good forecasts as shown in Table 1 where these cases are marked by two asterisks. The correct, invertible models, developed after a minute analysis, are reported in Table 5 along with their predictions. As a first result, three of them belong to the proposed class of models but they have not
Fig. 10. Twenty four hours forecast of the time series shown in Fig. 1 by the developed SARIMA models (MASE = 0.58) and by the described ANN trained on 2 years data (MASE = 0.61). Actual data displayed in the forecast window to assess forecasting accuracy have not been used to estimate the parameters of the model (out of sample forecasting).
been selected by the AICc. In fact, sometimes it is possible to find several close competing models and the best is not necessarily that with the minimum AICc value [23]. Secondly, the forecasts provided by the correct models are comparable with those by the corresponding models in Table 1, except in one case (Cammarata – January 2004) where the correction yields an optimum prediction. Finally, models in Table 5 not belonging to the proposed class differ only for the number of nonseasonal moving average parameters (q) which at most is equal to 6. To take into account this last result, an enhanced class of models could be suggested with q 6 6. By inspecting Tables 1 and 5, some general results may be drawn. As a first basic result, it is always possible, with only one exception, to model the stochastic process under study (however after a nonlinear transformation) by a suitable linear model with at most 10 parameters (residuals variance not included). In some cases, just three parameters are sufficient. Moreover, the number
1164
S. Bivona et al. / Energy Conversion and Management 52 (2011) 1157–1165
Fig. 11. Twenty four hours forecast of the time series recorded in Cammarata, July 2006, by the developed SARIMA models (MASE = 1.07) and by the described ANN trained on 2 years data (MASE = 1.34). Actual data displayed in the forecast window to assess forecasting accuracy have not been used to estimate the parameters of the model (out of sample forecasting).
Fig. 12. Twenty four hours forecast of the time series recorded in Mazzarrone, February 2006, by the developed SARIMA models (MASE = 1.43) and by the described ANN trained on 2 years data (MASE = 0.95). Actual data displayed in the forecast window to assess forecasting accuracy have not been used to estimate the parameters of the model (out of sample forecasting).
of seasonal autoregressive parameters (P) in the selected models is never greater than 1, except in one case where it is 2. This could suggest a further considerable refinement of the proposed class, with 10 parameters at most instead of 12, but there is still no reliable evidence that such result would also characterize models of this type developed for other sites. Finally, in some cases, the same or nearly the same kind of model characterizes the same month regardless of the year, as it could be expected (see, for example, June in Cammarata).
5.2. Forecasting accuracy By inspecting Tables 1–3, it is possible to gather that the developed approach has provided satisfactory forecasts both for Cam-
Fig. 13. Twenty four hours forecast of the time series recorded in Mazzarrone, January 2006, by the developed SARIMA models (MASE = 4.45) and by the described ANN trained on 2 years data (MASE = 2.44). Actual data displayed in the forecast window to assess forecasting accuracy have not been used to estimate the parameters of the model (out of sample forecasting).
marata and for Mazzarrone. Unsatisfactory predictions resulted only when a departure from past conditions occurred in the forecast window. Relying only on recent records of wind speed, the employed models have no chance to predict such variations. In fact, the described approach is more convenient for the warmest months, when atmospheric disturbances are less probable, and for interior sites like Cammarata as shown in Figs. 4 and 5 displaying the seasonal dependence of the forecasting accuracy for each of the two considered stations. In some cases, the two figures exhibit conflicting results because of the inadequacy of the MAPE under particular circumstances. It is very interesting to compare the SARIMA predictions with those by the developed ANN. As reported in Tables 1–3 and displayed in Figs. 6–9, the two approaches, although completely distant, end with almost identical results and in order to evaluate their 24 h forecasts, four examples are given. Figs. 10–12 display three satisfactory predictions by the SARIMA models (MASE equal to 0.58, 1.07, 1.43, respectively) and by the ANN (MASE equal to 0.61, 1.34, 0.95, respectively). In the first two cases, the forecasting accuracy is almost the same, but it differs in Fig. 12 because of the rather irregular event occurred in the 28th day. Finally, Fig. 13 shows the worst case in 2006 where the expected superior capability of the ANN to face unforeseen occurrences is clearly shown.
6. Conclusions To sum up, the approach suggested in this paper has attained valuable results. As reported, the transformation introduced to deal with the non-Gaussian nature of the wind speed distribution has proved adequate in 70% of cases, while the proposed class of models in 96%. This result suggests the possibility to describe the stochastic process under investigation by a linear model (with 3 10 parameters). The developed models may be used for simulation studies and/or to predict wind speed 24 h ahead. In most cases, the obtained forecasts show no degradation during the considered lead time. Particularly, reported results show that upon stationary conditions, as required by the employed models, forecast accuracy is always satisfactory. In fact, poor predictions (more common during winter) are a consequence of sudden disturbances.
S. Bivona et al. / Energy Conversion and Management 52 (2011) 1157–1165
In these cases, results suggest that an artificial neural network trained on some years data would perform better, but still it cannot provide satisfactory forecasts like only a physical model would be able to. In the other cases, instead, the predictions by the two approaches considered in this paper – SARIMA and ANN – are quite similar, although the former based only on 1-month data and the latter on 2 years. Acknowledgements The data analysed in this work were provided by the Servizio Informativo Agrometeorologico Siciliano (Assessorato Agricoltura e Foreste – Regione Siciliana) whose contribution is gratefully acknowledged. This work was supported in part by the Italian Ministry of University and Scientific Research. References [1] Lindley D. The energy storage problem. Nature 2010;463:18–20. [2] Roulston MS, Kaplan DT, Hardenberg J, Smith LA. Using medium-range weather forecasts to improve the value of wind energy production. Renew Energy 2003;28:585–602. [3] Brown BG, Katz RW, Murphy AH. Time series models to simulate and forecast wind speed and wind power. J Clim Appl Meteorol 1984;23:1184–95. [4] Daniel AR, Chen AA. Stochastic simulation and forecasting of hourly average wind speed sequences in Jamaica. Sol Energy 1991;46:1–11. [5] Nfaoui H, Buret J, Saygh AAM. Stochastic simulation of hourly average wind speed sequences in Tangiers. Sol Energy 1996;56:301–14. [6] Poggi P, Muselli M, Notton G, Cristofari C, Louche A. Forecasting and simulating wind speed in Corsica by using an autoregressive model. Energy Convers Manage 2003;44:3177–96.
1165
[7] Torres JL, García A, De Blas M, De Francisco A. Forecast of hourly average wind speed with ARMA models in Navarre (Spain). Sol Energy 2005;79:65–77. [8] Kavasseri RG, Seetharaman K. Day-ahead wind speed forecasting using fARIMA models. Renew Energy 2009;34:1388–93. [9] Amato U, Andretta A, Bartoli B, Coluzzi B, Cuomo V, Fontana F, et al. Markov processes and Fourier analysis as a tool to describe and simulate daily solar irradiance. Sol Energy 1986;37:179–94. [10] Box GEP, Jenkins GM, Reinsel GC. Time series analysis. 2nd ed. Upper Saddle River: Prentice-Hall; 1994. [11] Carta JA, Ramírez P, Velázquez S. A review of wind speed probability distributions used in wind energy analysis. Case studies in the Canary Islands. Renew Sust Energy Rev 2009;13:933–55. [12] Sfetsos A. A comparison of various forecasting techniques applied to mean hourly wind speed time series. Renew Energy 2000;21:23–35. [13] Bishop CM. Neural networks for pattern recognition. Oxford: Oxford University Press; 1995. [14] Lei M, Shiyan L, Chuanwen J, Hongling L, Yan Z. A review on the forecasting of wind speed and generated power. Renew Sust Energy Rev 2009;13:915–20. [15] Kalogirou SA. Artificial neural networks in renewable energy systems applications: a review. Renew Sust Energy Rev 2001;5:373–401. [16] Cadenas E, Rivera W. Short term wind speed forecasting in La Venta, Oaxaca, México, using artificial neural networks. Renew Energy 2009;34:274–8. [17] Swingler K. Applying neural networks: a practical guide. London: Academic; 1996. [18] Camargo LS, Yoneyama T. Specification of training sets and the number of hidden neurons for multilayer perceptrons. Neur Comput 2001;13:2673–80. [19] Liu Y, Starzyk JA, Zhu Z. Optimized approximation algorithm in neural networks without overfitting. IEEE Trans Neur Networks 2008;19:983–95. [20] Hyndman RJ, Koehler AB. Another look at measures of forecast accuracy. Int J For 2006;22:679–88. [21] Royston P. Remark AS R94: a remark on algorithm AS 181: the W test for normality. Appl Stat 1995;44:547–51. [22] McLeod AI, Zhang Y. Improved subset autoregression: with R package. J Stat Soft 2008;28(2):1–5. [23] Brockwell PJ, Davis RA. Introduction to time series and forecasting. 2nd ed. New York: Springer; 2002.