Accepted Manuscript Forecasting mid-long term electric energy consumption through bagging ARIMA and exponential smoothing methods Erick Meira de Oliveira, Fernando Luiz Cyrino Oliveira PII:
S0360-5442(17)32082-0
DOI:
10.1016/j.energy.2017.12.049
Reference:
EGY 11998
To appear in:
Energy
Received Date: 9 May 2017 Revised Date:
15 November 2017
Accepted Date: 11 December 2017
Please cite this article as: de Oliveira EM, Oliveira FLC, Forecasting mid-long term electric energy consumption through bagging ARIMA and exponential smoothing methods, Energy (2018), doi: 10.1016/ j.energy.2017.12.049. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT FORECASTING MID-LONG TERM ELECTRIC ENERGY CONSUMPTION THROUGH BAGGING ARIMA AND EXPONENTIAL SMOOTHING METHODS
Erick Meira de Oliveiraa,b,1 Industrial Engineering Department, Pontifical Catholic University of Rio de Janeiro (PUC-Rio) Rua Marquês de São Vicente, 225, Gávea, 22451-900, Rio de Janeiro, Rio de Janeiro, Brazil
RI PT
a
b
Aerospatial, Energy and National Defense Division, Brazilian Innovation Agency (Finep) Avenida República do Chile, 330, Torre Oeste - 15o andar, 20031-170, Rio de Janeiro, Rio de Janeiro, Brazil
[email protected]
Industrial Engineering Department, Pontifical Catholic University of Rio de Janeiro (PUC-Rio) Rua Marquês de São Vicente, 225, Gávea, 22451-900, Rio de Janeiro, Rio de Janeiro, Brazil
[email protected]
M AN U
a
SC
Fernando Luiz Cyrino Oliveiraa
November 2017
Abstract
AC C
EP
TE D
In the last decades, the world’s energy consumption has increased rapidly due to fundamental changes in the industry and economy. In such terms, accurate demand forecasts are imperative for decision makers to develop an optimal strategy that includes not only risk reduction, but also the betterment of the economy and society as a whole. This paper expands the fields of application of combined Bootstrap aggregating (Bagging) and forecasting methods to the electric energy sector, a novelty in literature, in order to obtain more accurate demand forecasts. A comparative out-of-sample analysis is conducted using monthly electric energy consumption time series from different countries. The results show that the proposed methodologies substantially improve the forecast accuracy of the demand for energy end-use services in both developed and developing countries. Findings and policy implications are further discussed. Keywords. Electricity consumption; Forecasting; Bagging
1
Corresponding author. (+55) 21 2555-0275 E-mail address:
[email protected] ||
[email protected]
1
ACCEPTED MANUSCRIPT 1. Introduction
RI PT
Ensuring an adequate supply of energy is a pressing national priority in almost every nation in the world. The issue is even more crucial when it comes to electric energy since, unlike other energy sources, it cannot be stored for large-scale consumption. From an economic viewpoint, the supply must meet the demand for energy end-use services at any given time. In this connection, accurate electric load forecasting is of the utmost importance for decision-making processes in the electric sector, as the consequences of overestimation or underestimation can be costly. For instance, when delivered power is higher than the actual demand, the provider not only wastes resources but may also bear expensive costs due to strong spot market regulation in several countries. On the other hand, underestimation naturally results in failures and shortages, which in turn translates into a loss of productive time and quality and subjects the provider to sanctions and penalties.
TE D
M AN U
SC
Over the past decades, a large number of approaches have been proposed to adequately estimate and forecast electric energy consumption. In brief terms, the approaches can be divided into two major categories on the basis of forecasting indicators: short-term and mid-/long-term. Whilst the first is concerned with time frames of minutes to hours, the horizon for mid/long-term forecasting ranges from a few weeks to several years [1]. Forecasting electric energy demand accurately over the latter period is often regarded as a challenging task, due to the nonlinear, multidimensional nature of this variable [2]. In addition, many unpredictable factors exist in electricity demand modeling, such as structural breaks [3] and transitory effects from external variables [4]. Nevertheless, mid/long-term load forecasting assumes a particular importance for electric power utility planning. Even though precise short-term forecasting forms the basis of the electrical energy trade and spot price calculation [5], several decisions are made on the basis of mid/long-term energy demand forecasting, such as the construction of new generation facilities, the purchase of existing generating units, the development of transmission and distribution systems, among others [1]. Kaboli et al. [6] add that mid/long-term forecasting is especially interesting for companies operating in a deregulated environment, as it provides them precious insights on the market need of energy, the maintenance schedule of power plants, the fuel supplies and occasional needs of electrical energy imports/exports.
AC C
EP
In light of the aforementioned, despite its drawbacks in terms of complexity and uncertainty, mid-/long-term energy demand planning remains a key issue in many parts of the world. In this connection, the pursuit of models that could enhance the prediction accuracy in forecasting long-term electric energy demand has remained a prominent issue in the fields of energy policy and economic development. In this work we aim to generate two-year ahead (medium to long-term) forecasts for monthly electric energy demand in different parts of the world, including both developing and developed countries. To do so, we use an approach that has never been referenced in the electric energy sector. In brief terms, we combine different decomposition and bootstrap aggregation (bagging) techniques to simulate new series. Then, we forecast each generated series using a wide variety of methods and pool the forecasts into one single output. Finally, a comparative out-of-sample analysis is conducted on the basis of several performance metrics. To the best of our knowledge, no work has explored the potential gains of using bagging techniques to enhance monthly forecasts of total electric energy consumption. Furthermore, we contribute to the literature by proposing a different variation of this technique, which demonstrated satisfactory forecasting results in terms of accuracy for several countries. 2
ACCEPTED MANUSCRIPT The paper unfolds as follows. Section 2 provides a brief overview of the most recent techniques used to estimate and forecast energy demand behavior. Section 3 reviews the methodologies and the forecasting approaches in details. Section 4 introduces the selected data and Section 5 summarizes the results of the quantitative analysis. Finally, Section 6 concludes the findings of the study and suggests directions for future works.
RI PT
2. Literature review There are roughly two major approaches to investigate the dynamics of energy demand, whether in short, mid or long time spans: (1) Traditional forecasting models such as time series and regression-based, econometrics models and (2) Soft computing techniques such as fuzzy logic, genetic algorithms, neural networks, support vector regression models, among others [7].
M AN U
SC
In brief terms, the most popular methodologies for electricity demand forecasting can be summarized in Table 1. We hasten to add that this classification is not exhaustive and does not include combinatorial approaches (hybrid models). Indeed, there has been so much work and discussion concerning applications of different methodologies for electricity demand forecasting that a comprehensive review of all techniques is virtually impossible in an original research paper. The interested reader is referred to the compiling works of Suganthi and Samuel [7] (thorough overview) and Shao et al. [2] - this last one dedicated solely to decomposition based approaches to electric load forecasting. A recent, systematic review over 113 different case studies is presented by Kuster et al. [8], who identify three main approaches (in terms of popularity): Time Series models, Regressionbased formulations and Artificial Neural Networks (ANNs).
EP
TE D
In order to cope with the difficulties presented by each technique in Table 1, different decomposition and combination techniques have been proposed in recent years. The rationale behind such methods is that the isolation of key features of the data into distinct sub-series can enhance the forecasting performance of the models used for their estimation. Each sub-series can then be modelled according to its stylized facts. That way, the estimation error obtained from the further aggregation of the extrapolated sub-series is lower than the estimation error for the series as a whole [28]. Recent examples concerning different industrial sectors can be found in the works of Benaouda et al. [29], Theodosiou [28] and Qiu et al. [30].
AC C
Albeit still timid, another stream of literature that has emerged from the ideas of series decomposition and forecast combinations is the use of bootstrap aggregation (bagging) techniques to improve the accuracy of forecasts for a given time series. The underlying idea is to decompose the historical data on the variable of interest into key components, bootstrap one (or more) components to generate a random pool of similar time series, estimate and subsequently forecast each time series and average all forecasts into one single output. Recent examples concerning different industrial sectors can be found in the works of Cordeiro and Neves [31], Bergmeir et al. [32] and Dantas et al. [33]. The literature review resulted in no evidence of the application of a combined approach of Bootstrap aggregating (Bagging) strategies and time series forecasting methods to predict electric energy consumption. It is our belief that this gap is worth pursuing, as the results of such procedures were promising in other fields.
3
ACCEPTED MANUSCRIPT
Table 1. Summary of previous studies on electricity demand forecasting (most frequent techniques and not including combinatorial approaches)
Disadvantages
Pre-assumed form of the model; difficulty in trend extrapolation due to the presence of both a long-term increasing trend and periodic waves with different frequencies and amplitudes in many countries
Short, mid and long term
Intuitive functional form; adaptable; can represent several types of time series; low computational complexity
Regressionbased
Mid and long (very long) term
Intuitive functional form; straightforward interpretation of results; low computational complexity; Efficient for long and very long prediction
Pre-assumed form of the model; data independence assumption; sensitive to outliers
AC C
EP
TE D
Time Series
RI PT
Advantages
SC
Energy Sampling Type (most frequent)
M AN U
Models
Forecasting for Monthly electric energy consumption in eastern Saudi Arabia [9] Monthly electricity consumption of the Asturian (Northern Spain) domestic sector [10] Daily electricity demand load in Greece [11] Energy consumption (and GHG emission) for a pig iron manufacturing organization in India [12] Annual Residential Electricity Consumption in Brazil [13]
Monthly electricity demand in England and Wales (using weather variables, gross domestic product, and population growth) [14] Monthly electricity demand in Italy (using calendar and weather-related variables) [15] Monthly European and Italian electricity demand (using seasonal climate forecasts of temperature) [16] Monthly Australian electricity demand (industrial, residential and commercial sectors) using climate variables [17]
4
ACCEPTED MANUSCRIPT
Short (very short) and Provide good estimation in cases mid term where data is incomplete; Can address complex nonlinear problems while demonstrating robustness and fault tolerance; Efficient for short and very short term (sub hourly) prediction
Results cannot be easily explained as they are not mathematically based; Computation is time consuming; Extended data is required; Model may never converge in some cases - often get stuck in local minimum, so choosing a proper dataset is paramount for obtaining quality forecasting results
Fuzzy Logic
Short, mid and long term
Mathematical-free model to map the input and output variables; Not necessary to have accurate input variables; Useful in practical situations where the intensity of a phenomenon is described by linguistic variables (very low, low, medium, high, very high); High accuracy achieved on total uncertainty scenarios (eg. economic crisis periods)
Lack of self-learning capability (usually combined with ANNs to overcome this issue); Not capable of generating a definite prediction equation based on the input historical data; Proper criteria must be defined for accurate forecasting (Require more fine tuning and simulation before going operational)
Support Vector Short, mid and long Regressions term (SVR)
Can address complex nonlinear problems with relative simplicity (by creating an optimal separating hyperplane in a higher dimensional feature space so that subsequent observations can be classified into separate subsets); Few parameters need to be determined
AC C
EP
TE D
M AN U
SC
RI PT
Artificial Neural Networks (ANNs)
Kernel function is crucial and difficult to be determined; Real data are not perfectly separable; Computation on high dimension feature space can be very costly; Depend a lot on the proper selection of the hyperparameters
Monthly electricity consumption in Iran [18] Monthly energy demand in the United States (Industrial Sector) [19] Annual gross electricity demand in Turkey [20] Hourly load of the Polish power system and half-hourly loads of the French, British and Victorian (Australia) power systems [21]
Monthly electricity consumption in Iran [22] Hourly electricity load in Jordan [23] Annual electricity demand in Turkey [24] Annual electricity demand in Brazil [25]
Annual electricity demand in Taiwan [26] Sub-hourly (15 minutes) electricity load from four large office buildings in eastern China [27]
5
ACCEPTED MANUSCRIPT 3. Methods 3.1 Decomposition
RI PT
As a first step, we decompose the time series under analysis into key components, in an attempt to categorize their patterns and behaviors. In this work, we use the Seasonal-Trend decomposition using Loess (STL decomposition) [34], to obtain the trend, seasonal and remainder components of the demand time series.
SC
Broadly speaking, STL consists of a sequence of smoothing operations that employ locallyweighted regression (Loess). In Loess, a neighborhood is first defined for each data point and the points in that neighborhood are subsequently weighted according to their distances from the respective data point. A polynomial of degree d is then fitted to these points - usually d=1 or d=2. Higher degrees do not improve much the fit. Indeed, Cleveland et al. [34] argue that taking d=1 is reasonable if the underlying pattern in the data has gentle curvature. The trend component is equal to the value of the polynomial at each data point.
M AN U
STL uses Loess to divide the underlying time series into three additive components (trend, seasonal and remainder). In brief terms, the steps performed during STL decomposition are: (i) detrending; (ii) cycle-subseries smoothing, in which series are built for each seasonal component, and smoothed separately; (iii) low-pass filtering of smoothed cycle-subseries, when the subseries are put together again, and then smoothed; (iv) detrending of the seasonal series; (v) deseasonalizing the original series using the seasonal component calculated in the previous steps; and (vi) smoothing the deseasonalized series to get the trend component.
TE D
In this work, for each STL decomposition of the involved time series, we follow Bergmeir et al. [32] and use the R STL function with its default parameters, i.e. d = 1 in steps (iii) and (iv), and d = 0 in step (ii).
AC C
EP
The STL decomposition has some major advantages compared to other methods, such as: the possibility to handle any type of seasonality (regardless of the frequency) and to change the seasonal component over time; the possibility to control the smoothness of the trend-cycle; and its robustness to outliers when estimating the trend-cycle and seasonal components [35]. Our bootstrap aggregating (bagging) analysis is focused on the third component of the STL decomposition, namely the remainder. In brief terms, we either: directly bootstrap the component by sampling data blocks of equal size until the desired series length is achieved; or estimate specific models to eliminate possible autocorrelations, if any, in the remainder and then simulate different error time series by bootstrapping the residuals. These procedures are described in detail in the next subsections. 3.2 Bootstrapping the remainder The bootstrap was first devised by Efron [36], growing out of an earlier work on the jackknife procedure. In its original form, the technique consisted of re-sampling the underlying data, in order to get an approximation of the sampling distribution of some statistic of interest. Adapted versions of the original technique have been developed for time series, since their data are typically autocorrelated. The Bootstrap aggregation (Bagging), in turn, is a supervised machine learning technique, proposed by Breiman [37]. The aim is to generate multiple versions of a predictor via Bootstrap and then use the simulated results to get an aggregated predictor. Despite the good results in the field of Machine
6
ACCEPTED MANUSCRIPT Learning, there are just a few papers using Bagging to improve accuracy in time series forecasting [33]. In most of them, the authors have demonstrated, majorly with econometric applications, that the combination of techniques produces large improvements in forecast accuracy ([38], [39]).
RI PT
One prerequisite for bootstrapping is the stationarity of the series, which is often achieved for the remainder in a STL decomposition. In addition, care must be taken to ensure that every value from the original series can be placed anywhere in the bootstrapped series. In this work, we deal with this issue for the remainder in two different ways, which are briefly described in the following paragraphs. 3.2.1 Moving Blocks Bootstrap
M AN U
SC
The Moving Blocks Bootstrap (MBB) approach was first devised by Künsch [40]. In its original formulation, data blocks of equal size are drawn from the series until the desired series length is achieved. For a series of length n, with a block size of l, n − l + 1 (overlapping) possible blocks exist. However, care must be taken to ensure that every value from the original series could possibly be placed anywhere in the bootstrapped series. To meet that end, Bergmeir et al. [32] propose drawing ⁄ + 2 blocks from the remainder series of a STL Decomposition, and discarding a random number of values, between zero and l − 1, from the beginning of the bootstrapped series. Then, to obtain a series with the same length as the (original) remainder series, they further discard as many values as necessary to obtain the required length. This process ensures that the bootstrapped series do not necessarily begin or end on a block boundary. Finally, the trend and seasonality are combined with the bootstrapped remainder to get the final bootstrapped sample.
EP
TE D
In this work, we follow the same above-mentioned procedure. According to Bergmeir et al. [32], a MBB-based approach has the advantage that it makes no modelling assumptions other than stationarity, whereas other bootstrap methods assume that the fitted model on the underlying time series captures all serial correlation in the data. Even though no assumptions have been formally made with regard to the predictive accuracies of the MBB, Bergmeir et al. [32] showed that the approach, combined with exponential smoothing methods, delivered good results when dealing with monthly time series from the M3-competition, the main dataset to evaluate and compare time series forecasting methods - see [41] for further details. It is our belief that equally satisfactory results can be achieved in the electric energy sector.
AC C
3.2.2 Remainder Sieve Bootstrap
The sieve bootstrap approach has its roots in the original Bühlmann's [42] methodology, which was based on the idea of fitting parametric models first and afterwards resampling from the residuals. In brief terms, given a sample , … , , from a stationary process, Bühlmann’s [42] method involved: (i) selecting the order p = p(n) of an autoregressive (AR) approximation using the Akaike [43] Information Criterion (AIC); (ii) using the AR(p) model to filter the residuals series, obtaining centered residuals and their empirical cumulative distribution function; (iii) resampling the (supposed) i.i.d. centered residuals; (iv) using the AR(p) for obtaining a new series ∗ by recursion. Finally, given ∗ , … , ∗ , the method computes the estimation of the AR coefficients and then obtain future bootstrap observations by recursion from the new series. In Cordeiro and Neves [31], a different approach is proposed: the authors first suggest to fit an EXPOS model to the data and then to proceed like the Bühlmann’s [42] method over the residuals. The EXPOS model is named after the best fit model from a set of exponential smoothing forecasting
7
ACCEPTED MANUSCRIPT methods. The BOOT.EXPOS procedure, as the method became known, demonstrated very good results in forecasting series with seasonal and trendy components.
RI PT
In this paper, we propose a distinct version of the above-mentioned procedure. In lieu of fitting an exponential smoothing model to the data, we first decompose the original time series into its trend, seasonal and remainder components, following the STL approach (Section 3.1). Then, provided that the last component is already stationary, we estimate the best fit Autoregressive-Moving Average - ARMA(p,q) - model for the remainder, using AIC with corrections (AICc) [44] or the most parsimonious formulation which ensures that there are no autocorrelation issues in the residuals. Finally, we resample the centered residuals and use the ARMA(p,q) to obtain new series for the remainder.
SC
3.3. Forecasting
M AN U
After having obtained all the bootstrapped time series using the procedures described in Sections 3.1 and 3.2, we estimate and subsequently forecast each generated series using different univariate methods. In the following subsections, we briefly explore such methods. 3.3.1 Exponential Smoothing Methods and State Space Formulations
Exponential smoothing consists of traditional procedures for continually revising a forecast in light of more recent information about the estimated data. In brief terms, the methods assign exponentially decreasing weights as the observation gets older, i.e., recent observations are given relatively more weight in forecasting than older ones.
EP
TE D
There are several different approaches to exponential smoothing. For an extensive list of the most commonly used exponential smoothing methods in the literature, the interested reader is referred to the compiling works of Gardner Jr. ([45], [46]). In the underlying paper, the total electric energy demand in different countries (and their bootstrapped versions) are estimated and subsequently forecasted using three different exponential smoothing methods, briefly described in the following lines.
AC C
(i) The Holt-Winters additive model [47], appropriate for series with a linear time trend and additive seasonal variation. Its component form can be written as follows: = + +
(1)
where and are the permanent component (base signal) and the linear trend parameters, respectively, and are the additive seasonal factors. These parameters, in turn, are defined by the following recursive expressions: = − − + 1 − − 1 + − 1 = ! − − 1 + 1 − ! − 1 = " − + 1 − " −
(2)
where 0 < , !, " < 1 are the hyperparameters and is the seasonal frequency (supposedly monthly). In these terms, forecasts are computed by:
8
ACCEPTED MANUSCRIPT = % + % + &'
(3)
where the seasonal factors are used from the last estimates.
(ii) The Holt-Winters multiplicative model [47], adequate to series with a linear time trend and multiplicative seasonal variation. The smoothed series is given by:
The three coefficients are now defined by the following recursions: = (
(4)
RI PT
= +
) + 1 − − 1 + − 1 −
= ! − − 1 + 1 − ! − 1
SC
) + 1 − " −
M AN U
* = " (
Forecasts are then computed by:
= % + % &'
(5)
(6)
TE D
(iii) State space based (exponential smoothing) formulations. This approach consists of a set of different models obtained by considering variations in the combination of the error, trend and seasonal components of a time series. According to the taxonomy proposed by Pegels [48] and extended by Gardner Jr. [45], the possibilities for the trend and seasonal components are depicted in Table 2. In addition, the error term can also vary between additive or multiplicative. That way, a total of 30 different formulations can be achieved.
Trend Component
EP
Table 2. Possible variations for the trend and seasonal components under a state space based approach
AC C
None (N) Additive (A) Additive Damped (Ad) Multiplicative (M) Multiplicative Damped (Md)
Seasonal Component
None (N)
Additive (A)
Multiplicative (M)
N, N A, N Ad, N M, N M d, N
N, A A, A Ad, A M, A M d, A
N, M A, M Ad, M M, M M d, M
Each model in a state space based formulation consists of two sets of equations: (i) a measurement equation that describes the observed data; (ii) and some transition equations that describe how the unobserved components or states (level, trend, seasonal) change over time. To express such formulation, let’s consider one possible combination as an example: an additive error, multiplicative trend, multiplicative season model, or AMM, according to the above-mentioned notation. First, we consider a p-dimensional state vector + = , , , & , … , &' ,, with and being the contemporaneous estimates of the level and linear trend parameters and representing the 9
ACCEPTED MANUSCRIPT included seasonal terms. We also let = & & &' be the one-period ahead forecast of . Then, the prediction error decomposition is = + - = & & &' + -
(7)
= ℎ +& , / + +& , / -
RI PT
Following Ord et al. [49], we may write a nonlinear dynamic model representation of the exponential smoothing equations using a state space model with a common error term: + = 0 +& , / + 1 +& , / -
(8)
SC
where ℎ and are known continuous scalar functions, 0 and 1 are known continuous functions with continuous derivatives from ℜ3 → ℜ3 and - ~ 667 0, 8 9 are the independent past realizations of and +.
M AN U
Conceptually, the equation represents how the various state variable components & , & , &' are combined to express the series in terms of a smoothed forecast = ℎ+& , / and the prediction error (- ). The + equations, in turn, outline the process by which the component estimates are updated using the previous period’s estimates and the current prediction error - - . The multiple functions are a notational device for writing the additive and multiplicative errors in compact form. With additive errors, we have ≡ 1, so that = ℎ +& , / + - . In short, we may think of the updating smoothing equations as being weighted averages of a term which depends on the current prediction error (and prior states), and one which depends on the prior states. The resulting state space equations for an additive error, multiplicative trend and multiplicative season model are:
TE D
= & & &'
= & & +
EP
= & + !
= &' + "
- ;&'
- ; &' &
(9)
- ; & &
AC C
For more information regarding the alternatives on state space estimation methods, the interested reader is referred to [50]. 3.3.2 Seasonal Autoregressive Integrated Moving Average Models (SARIMA) The SARIMA models, first proposed by Box and Jenkins [51], consist of an alternative approach to the traditional exponential smoothing methods. In brief terms, SARIMA models are similar to exponential smoothing methods inasmuch as they are adaptive, can model trends and seasonal patterns, and can be automated. They differ, however, in that they are based on autocorrelations (patterns in time) rather than a structural view of level, trend and seasonality. It is argued that SARIMA formulations tend to succeed better than exponential smoothing methods for longer, more stable data sets and not as well for noisier, more volatile data [52]. Non-seasonal ARIMA models are generally denoted by ARIMA(p,d,q) where parameters p, d, and q are non-negative integers, p being the order of the autoregressive model, d the degree of 10
ACCEPTED MANUSCRIPT differencing, and q the order of the moving-average model. Seasonal ARIMA models, in turn, are usually denoted by SARIMA(p,d,q)x(P,D,Q)S and can written as follows: ∇=> ∇? @A BA = C = /A DA =
where:
RI PT
• • • • •
S refers to the number of periods in each season; the uppercase P,D,Q refer to the autoregressive, differencing, and moving average terms for the seasonal part of the ARIMA model; is the error term; A is the backward shift operator (eg. A = & ); @A and BA = are the non-seasonal and seasonal autoregressive polynomials, respectively; /A and DA = are the non-seasonal and seasonal moving-average polynomials, respectively; ∇? and ∇=> are the non-seasonal and seasonal differencing operators, respectively.
SC
• •
(10)
M AN U
3.4. Aggregation
In this last step, we aggregate the forecasts obtained for each bootstrapped time series to generate the final output. To that end, we use two different methods: the simple mean (or equal weights combination) and the simple median. Besides its simplicity, the simple mean has proved to be a tough benchmark for forecasting combinations, see for example Stock and Watson [53]. The median, in turn, is less sensitive to outliers, reducing the effects of occasional poor forecasts.
TE D
4. Data and overall procedure
AC C
EP
Our empirical analysis is based upon monthly data of total electric energy consumption (GWh) in different developed and developing countries, namely Canada, France, Italy and Japan (former case) and Brazil, Mexico and Turkey (for the latter). With the exception of Brazil, the data were collected from the International Energy Agency Monthly Electricity Statistics report, which provides electricity production and trade data for all OECD Member Countries [54]. For the Brazilian electric energy consumption, we referred to the data provided by the major Brazilian electric utilities company, Eletrobras, available at the Brazilian Central Bank time series database [55]. The time period of the analysis spans from July 2006 (the first date available for OECD countries) to December 2016. We considered data from July 2006 to December 2014 as training set (in-sample) and the observations from January 2015 to December 2016 as test set (out-of-sample). The original data can be found in the supplementary material (Appendix B). Before proceeding to the methods described in Section 3, a transformed version of each original time series is generated by means of a Box and Cox [56] transformation. The main rationale behind this procedure is to stabilize the variance of a time series and make highly skewed distributions less skewed. In addition, an useful feature of the Box–Cox (BC) approach (along with other log transformations) is that it constrains the forecasts to stay positive on the original scale. Details on the BC procedure are not explained here, however, it is worth noting that, for the transformation parameter (E), we follow Bergmeir et al. [32] and restrict it to lie in the interval [0,1], then use the method of Guerrero [57] to choose its value. Following the BC transformation, each series is decomposed into three components (trend, seasonal and remainder), using the STL Decomposition. The remainder, is then either estimated by means of an ARMA(p,q) process or directly bootstrapped
11
ACCEPTED MANUSCRIPT using the MBB approach (see subsections 3.2.1 and 3.2.2 for details). For each bootstrap approach, a total of 100 new series are generated. Finally, the components are added together again, and the BC transformation is inverted. The overall procedures are illustrated in the flowchart of Figure 1.
RI PT
In the second step of the analysis, several models are proposed to estimate and subsequently forecast the original and bootstrapped versions of the total electric energy consumption time series. In this paper, we restrict our analysis to four different methods:
AC C
EP
TE D
M AN U
SC
(i) an auto ARIMA approach. To that end, we use the auto.arima() function in R, an algorithm which combines unit root tests, minimization of the AICc and Maximum Likelihood Estimation (MLE) to obtain an ARIMA model; (ii) a three parameter Holt-Winters additive model; (iii) a three parameter Holt-Winters multiplicative model; and (iv) an auto state space exponential smoothing approach. For this case, we use the ets() function in R and let it decide which Error, Trend and Seasonal (ETS) combination best suits the data. The selection is primarily based on the minimization of the Akaike Information Criteria with corrections (AICc) [44].
12
AC C
EP
TE D
M AN U
SC
RI PT
ACCEPTED MANUSCRIPT
Fig. 1. Overall procedure - Flowchart. Source: Authors’ own.
13
ACCEPTED MANUSCRIPT Finally, the forecasts are combined using the aggregation procedures described in Section 3.4 and the predictive power is assessed by means of an out-of-sample experiment from January 2015 to December 2016. To that end, we make use of certain measures to evaluate the accuracy of the forecasts, such as: Mean Absolute Percentage Error (MAPE): FGHI = J K L O
M − L ; ℎP × 100%
RI PT
N
Symmetric Mean Absolute Percentage Error (sMAPE): FGHI = S K T O
Root Mean Squared Error (RMSE):
|M − | V ; ℎW × 100% |M | + | |⁄2
M AN U
N
XFYI = Z K
O
Theil Inequality Coefficient (TIC):
M − 9; ℎ
M − 9; N ]∑O
ℎ
TE D
%[\ =
SC
N
M N ]∑O
9
_ + ℎ
N ]∑O
(11)
(12)
(13)
9
; ℎ
(14)
EP
In the above formulae, M is the predicted (forecasted) value whereas is the real (observed) value. ℎ, in turn, is the number of forecasting steps ahead.
AC C
5. Empirical Findings and Discussion 5.1 Performance gains from bagging
The empirical results (best highlighted in bold) for the developed and developing countries are summarized in Tables 3 and 4, respectively. A visualization of the forecasts generated by the best bagging approaches plotted against the actual values (for each country) can be seen in Figure 2. Figures illustrating the time series generated using the procedures described in Sections 3.1 and 3.2 and their corresponding forecasts (obtained using the methods addressed in Sections 3.3 and 3.4), along with other results, can be found in the supplementary material (Appendix C).
14
ACCEPTED MANUSCRIPT Table 3. Forecast evaluation – developed countries
MBB.Add.H-W RSB.Add.H-W Add H-W
MBB.Mult.H-W RSB.Mult.H-W Mult H-W
MBB.ETS RSB.ETS Auto ETS
RMSE
(%)
(GWh)
TIC
MAPE
SMAPE
RMSE
(%)
(%)
(GWh)
4.271
Canada 4.187 2275.225
0.024
4.056
4.067
France 1981.013
0.024
Median
3.881
3.958
1960.548
0.021
3.684
3.617
1442.878
0.019
Mean
4.986
4.899
2629.646
0.027
5.116
5.132
2497.840
0.031
Median
5.004
5.004
2503.508
0.027
4.892
4.903
1784.197
0.023
Single
5.140
5.050
2718.240
0.028
5.946
6.014
2865.827
0.036
Mean
4.023
3.966
2146.523
0.022
4.157
4.239
2012.938
0.025
Median
3.433
3.424
1748.092
0.019
3.520
3.583
1618.959
0.021
Mean
3.911
3.817
2251.833
0.023
5.248
5.388
2413.163
0.030
Median
3.174
3.176
1589.605
0.017
4.320
4.416
2019.551
0.027
Single
4.206
4.078
2474.736
0.026
Mean
4.011
3.948
2149.314
0.022
Median
3.294
3.347
1743.229
0.019
Mean
3.884
3.793
2239.347
0.023
4.834
Median
3.296
3.253
1575.038
0.017
Single
4.141
4.023
2427.717
Mean
3.855
3.787
2141.358
5.634
5.792
2463.391
0.031
3.807
3.880
1944.163
0.024
3.722
3.793
1582.883
0.021
4.959
2351.984
0.029
4.558
4.664
1722.974
0.023
0.025
5.033
5.158
2319.074
0.029
0.022
2.781
2.784
1663.936
0.020 0.011
Median
3.385
3.389
1681.856
0.018
2.098
2.100
815.629
Mean
4.234
4.152
2306.102
0.024
2.994
3.004
1785.948
0.022
Median
3.879
3.956
1794.504
0.019
1.955
1.954
867.615
0.011
Single
4.040
3.944
2268.954
0.023
2.489
2.479
1534.811
0.019
0.020
3.494
3.570
3507.538
0.022
Italy
RSB.Arima Auto Arima
MBB.Add.H-W
Median Mean Median Single Mean
Add H-W
MBB.Mult.H-W RSB.Mult.H-W Mult H-W
MBB.ETS RSB.ETS Auto ETS
2.533
2.595
1024.809
Japan
2.527
2.559
625.712
0.012
3.426
3.486
2957.626
0.018
2.562
2.619
1243.208
0.024
3.575
3.645
3516.507
0.022
1.455
1.456
377.912
0.007
3.585
3.609
2922.618
0.018
3.314
3.407
1221.386
0.024
3.229
3.288
3267.591
0.020
2.556
917.959
0.018
3.494
3.570
3507.538
0.022
2.502
Median
2.409
2.439
607.370
0.012
3.426
3.486
2957.626
0.018
Mean
2.126
2.163
848.341
0.016
3.575
3.645
3516.507
0.022
Median
1.583
1.583
402.300
0.008
3.585
3.609
2922.618
0.018
Single
1.904
1.943
803.252
0.015
3.229
3.288
3267.591
0.020
AC C
RSB.Add.H-W
Mean
EP
MBB.Arima
TIC
Mean
RI PT
Auto Arima
SMAPE
(%)
SC
RSB.Arima
MAPE
M AN U
MBB.Arima
Statistic
TE D
Forecast Approach
Mean
2.458
2.512
928.555
0.018
3.909
3.994
3638.494
0.023
Median
2.317
2.344
635.191
0.012
3.997
4.058
3053.883
0.019
Mean
2.012
2.049
856.750
0.016
5.624
5.805
5008.852
0.031
Median
1.419
1.409
371.320
0.007
6.128
6.322
5071.051
0.032
Single
1.829
1.868
818.579
0.016
3.735
3.770
3345.802
0.021
Mean
1.745
1.773
755.870
0.015
3.526
3.588
3255.967
0.020
Median
1.609
1.596
402.184
0.008
3.594
3.660
2818.551
0.017
Mean
1.855
1.860
748.582
0.014
3.711
3.781
3490.502
0.022
Median
1.305
1.296
327.862
0.006
3.044
3.057
2635.629
0.016
Single
1.806
1.838
768.508
0.015
2.274
2.233
2687.012
0.016
15
ACCEPTED MANUSCRIPT Table 4. Forecast evaluation – developing countries
Auto Arima
MBB.Add.H-W RSB.Add.H-W Add H-W
MBB.Mult.H-W RSB.Mult.H-W Mult H-W
MBB.ETS RSB.ETS Auto ETS
RMSE
(%)
(GWh)
Mean
4.724
4.603
Brazil 1933.379
0.025
3.531
Mexico 3.585 1091.476
0.023
Median
4.627
4.522
1784.111
0.023
3.041
3.046
680.274
0.014
TIC
MAPE
SMAPE
RMSE
(%)
(%)
(GWh)
TIC
Mean
4.368
4.264
1805.707
0.023
3.503
3.559
1100.538
0.023
Median
4.359
4.266
1675.724
0.021
3.280
3.276
750.486
0.016
Single
4.677
4.550
1943.011
0.025
3.092
3.122
968.572
0.020
RI PT
RSB.Arima
SMAPE
(%)
Mean
6.789
6.536
2803.963
0.035
4.554
4.674
1375.443
0.029
Median
6.700
6.483
2574.868
0.032
4.241
4.333
1083.106
0.023
Mean
6.447
6.222
2658.016
0.034
4.947
5.058
1446.868
0.030
Median
6.250
6.061
2383.104
0.030
4.257
4.350
1106.579
0.023
Single
7.170
6.884
2961.887
0.037
5.128
5.298
1558.574
0.033
Mean
6.647
6.405
2745.442
0.035
4.608
4.728
1364.842
0.029
Median
6.588
6.378
2495.533
0.031
4.156
4.162
995.764
0.021
SC
MBB.Arima
MAPE Statistic
M AN U
Forecast Approach
Mean
6.381
6.162
2629.106
0.033
4.566
4.657
1345.790
0.028
Median
6.122
5.940
2335.423
0.030
3.832
3.908
952.716
0.020
Single
7.180
6.891
2973.818
0.037
4.779
4.911
1428.301
0.030
Mean
6.471
6.242
2661.628
0.034
6.192
6.441
1780.398
0.038
Median
6.570
6.361
2502.552
0.032
6.086
6.278
1442.600
0.031
Mean
6.411
6.188
2649.286
0.033
6.341
6.610
1853.388
0.040
Median
6.195
6.009
2366.104
0.030
6.046
6.234
1463.423
0.031
Single
7.214
6.927
2965.903
0.037
6.921
7.228
1953.420
0.042
RSB.Arima Auto Arima
MBB.Add.H-W
Mean Median Single
Add H-W
MBB.Mult.H-W RSB.Mult.H-W Mult H-W
MBB.ETS RSB.ETS Auto ETS
2.644
2.632
712.075
0.016
2.151
2.138
490.369
0.012
2.744
2.729
724.887
0.017
2.507
2.511
556.709
0.013
2.277
2.279
681.329
0.016
Mean
3.079
3.038
755.223
0.017
Median
2.756
2.718
595.679
0.014
3.326
3.275
807.492
0.018
Median
Mean
2.993
2.949
642.197
0.015
Single
2.623
2.594
707.484
0.016
Mean
2.740
2.701
707.057
0.016
Median
2.035
2.015
452.767
0.011
Mean
2.995
2.938
795.979
0.018
Median
2.392
2.364
495.507
0.012
Single
2.421
2.383
698.838
0.016
Mean
2.512
2.527
724.285
0.017
Median
2.489
2.459
547.975
0.013
Mean
2.686
2.728
830.497
0.019
Median
2.224
2.224
460.625
0.011
Single
2.913
2.984
932.234
0.022
AC C
RSB.Add.H-W
Mean Median
EP
MBB.Arima
TE D
Turkey
16
AC C
EP
TE D
M AN U
SC
RI PT
ACCEPTED MANUSCRIPT
Fig. 2. Electricity demand by country (best forecasts in red, actual values in blue). Source: Authors’ own.
17
ACCEPTED MANUSCRIPT
RI PT
Overall, the results indicate that the bootstrap aggregation (bagging) approaches can substantially improve the forecast accuracy of the demand for energy end-use services in different countries. With the exception of the Japanese case, where the best forecast in terms of MAPE and sMAPE was achieved using a single ETS formulation, the STL and bagging procedures, combined with traditional forecasting methods, led to considerably superior results, in terms of accuracy. In several cases the gains were noteworthy when compared with single forecasts on the real data. For the Italian electricity consumption, for instance, the sMAPE and the RMSE obtained using a Remainder Sieve Bootstrap (RSB) ETS approach were almost 30% and 58% lower than the ones obtained in a single ETS forecast on the real data.
M AN U
SC
It is worth noting that, for developed countries, the bagged forecasts that used the RSB approach performed better, in terms of MAPE and sMAPE, than the ones that used the MBB methodology. In terms of RMSE and TIC, the only case where the MBB methodology outperformed the RSB was for the French monthly electricity consumption. Even so, the difference between the error metrics was not too significant. As for the developing countries, the MBB approach provided slightly better results on two of the three involved countries (Mexico and Turkey). This is a substantial improvement over previous bagging methods, as the MBB approach proposed by Bergmeir et al. [32] has been regarded as a benchmark for forecasting monthly data.
TE D
Another interesting feature is the fact that the mean and the median of the forecasted values differed considerably in nearly every occasion (the Brazilian case is the sole exception). The results obtained using the simple median aggregation approach were considerably superior to the ones obtained by pooling the forecasts using equal weights (simple mean). Considering that the median is less sensitive to outliers, this suggests that the outliers (whether in the original or the generated time series) exert a considerable effect on the overall results.
EP
There was no consensus concerning the superiority of the forecasting methods proposed in Section 3.3. The (auto) ARIMA approach seems to perform better for the Brazilian and Mexican cases, whereas the exponential smoothing methods adapted well for the monthly consumption in developed countries. 5.2 Comparison with other methods
AC C
In what follows, we compare our results with forecasts from other univariate methods established in the literature. Care was taken to choose models that deal with different stylized facts in electricity demand time series, such as nonlinearities, stochastic components (trend, seasonality, residuals), heteroscedasticity, among others. In addition, it should be noted that the choice for other univariate methodologies in lieu of regression-based methods is due to two main reasons: - First, regression-based methods fail to perform well when forecasting several steps ahead (in our case, 24 months ahead). This is mainly due to the fact that independent variables also need to be forecasted (first) so their predicted outcomes can be then used to forecast the dependent variable. It should be highlighted that several steps ahead forecasts are not to be confounded with long run forecasts (such as annual, quinquennial or even larger horizons), which usually deals with a small number of steps ahead (up to 12 steps, in the vast majority of cases); - Second, the selection of the most appropriate independent variables is linked to a huge variety of criteria, being unfeasible to address each case (country) in full particularity.
18
ACCEPTED MANUSCRIPT Particularly, we employ the following univariate methods for forecasting comparison: - a feedforward Artificial Neural Network (ANN) model [58,59], to address complex nonlinear behavior;
RI PT
- a feedforward Artificial Neural Network model with prior Box-Cox transformation (BC-ANN), in an attempt to ensure that residuals will be roughly homoscedastic [56];
SC
- a univariate Support Vector Regression (SVR), an advanced machine learning algorithm, able to learn from training data and form complex non-linear decision boundaries [60]. To select the best subset of variables for prediction (in our case, the lagged values of the electricity demand with the most predictive power), the Correlation-based feature selection (CFS) algorithm [61] was used in each country's training set;
M AN U
- the Theta method [62], a technique equivalent to a simple exponential smoothing with drift (with a particular restriction for this last component). The technique has performed particularly well in the M3-competition [41] for monthly series and for microeconomic data; - two variations of the univariate Singular Spectrum Analysis (SSA) technique - a decompositionreconstruction method that seeks to filter the noise and forecast the signal of an underlying time series using multiple steps (Embedding, Singular Value Decomposition, Grouping and Diagonal Averaging). In this work, we employ both the Recurrent SSA (RSSA) and the Vector SSA (VSSA) variations [63].
TE D
The results obtained using the above-mentioned methods are depicted in Tables 5 and 6, for the developed and developing countries, respectively. To conserve space, references and explanations on the alternative methods are provided in the supplementary material (Appendix A). Table 5. Comparison with other methods – developed countries MAPE
SMAPE
Forecast Approach
RMSE
TIC
(%)
(GWh)
EP
(%)
MAPE
SMAPE
RMSE
(%)
(%)
(GWh)
Forecast Approach
TIC
Canada
3.174
RSB.Mult.H-W - Median
3.296
ANN (1, 1, 2) [12]
4.137
BC-ANN (1, 1, 2) [12]
SVR (6, 20, 27, 84, 96) Thetha RSSA (35, 33)
3.176
1589.605
0.017
RSB.ETS - Median
1.955
1.954
3.253
1575.038
0.017
MBB.ETS - Median
2.098
2.100
815.629
0.011
4.078
2334.346
0.024
ANN (2, 1, 2) [12]
3.406
3.440
1957.902
0.024
4.138
4.085
2307.694
0.024
BC-ANN (1, 1, 2) [12]
3.303
3.370
2002.234
0.025
8.509
8.550
4977.226
0.052
SVR (8, 96)
8.199
8.074
3881.359
0.048
AC C
RSB.Add.H-W - Median
VSSA (27, 26)
France 867.615
0.011
4.137
4.022
2385.858
0.025
Thetha
2.846
2.861
1666.439
0.021
4.746
4.697
2661.387
0.028
RSSA (35, 7)
4.413
4.422
1980.674
0.024
5.857
5.672
3245.611
0.033
VSSA (34, 11)
4.289
4.322
1897.870
0.023
Italy
Japan
RSB.ETS - Median
1.305
1.296
327.862
0.006
RSB.ETS - Median
3.044
3.057
2635.629
0.016
RSB.Mult.H-W - Median
1.419
1.409
371.320
0.007
Single ETS
2.274
2.233
2687.012
0.016
ANN (12, 1, 6) [12]
3.863
3.960
1405.311
0.027
ANN (7, 1, 4) [12]
4.318
4.310
4637.648
0.028
BC-ANN (12, 1, 6) [12]
3.393
3.520
1498.755
0.029
BC-ANN (6, 1, 4) [12]
5.804
5.856
5782.212
0.036
SVR (34, 99)
3.516
3.590
1613.364
0.031
SVR (2, 100)
7.368
7.243
6883.180
0.042
19
ACCEPTED MANUSCRIPT Thetha
2.065
2.107
839.800
0.016
Thetha
2.392
2.370
2615.827
0.016
RSSA (29, 21)
3.078
3.053
1107.605
0.021
RSSA (21, 7)
5.276
5.443
5254.590
0.033
VSSA (30, 16)
2.791
2.866
1127.547
0.022
VSSA (23, 11)
4.424
4.591
4869.373
0.030
Table 6. Comparison with other methods – developing countries SMAPE
RMSE
MAPE
TIC (%)
(%)
(GWh)
(%)
Brazil 4.368
RSB.ARIMA - Median
4.359
4.266
ANN (1, 1, 2) [12]
5.531
5.360
BC-ANN (1, 1, 2) [12]
5.415
5.251
SVR (2, 3, 4, 100)
5.069
4.910
1805.707
4.264
RMSE
(%)
(GWh)
TIC
Mexico
0.023
MBB.ARIMA - Median
1675.724
0.021
Single ARIMA
3.092
3.122
968.572
0.020
2259.505
0.029
ANN (1, 1, 2) [12]
6.762
7.019
2042.153
0.043
2212.039
0.028
BC-ANN (2, 1, 2) [12]
5.810
5.998
1750.736
0.037
2160.605
0.027
SVR (2, 100)
15.040
16.749
4538.936
0.102
M AN U
RSB.ARIMA - Mean
SMAPE
Forecast Approach
SC
MAPE Forecast Approach
RI PT
Notes: For each country, the first and second rows refer to the two best (most accurate) forecasting methods from Table 3 (developed countries) or 4 (developing countries). For ANN formulations, the selected model is given in the form ANN (p,P,k) [m], where p is the number of lagged inputs (autoregressive terms), P is the number of autoregressive terms for the seasonal part of the time series, k is the number of nodes in the hidden layer and m is the seasonal frequency. For SVRs, numbers in parenthesis are the lag variables selected by the CFS algorithm for the training set. For the SSA models, the parameters refer to the window length (L) and the number of eigenvalues / eigentriples (r), in that order. The selection was made on the basis of the lowest Root Mean Squared Error (RMSE) for the calibration period (24 months before the out-of-sample period), i.e. the L and r parameters are the same from the model which demonstrated the lowest RMSE when forecasting for the period January 2013-December 2014.
3.041
3.046
680.274
0.014
5.153
5.012
2086.524
0.026
Thetha
6.545
6.820
1860.834
0.040
RSSA (26, 13)
6.318
6.072
2751.797
0.035
RSSA (35, 7)
3.557
3.567
1080.091
0.022
VSSA (20, 11)
7.033
6.722
3059.962
0.038
VSSA (24, 10)
3.303
3.341
1042.185
0.022
TE D
Thetha
Turkey MBB.Mult.H-W Median
2.035
2.015
MBB.ARIMA - Median
2.151
2.138
ANN (1, 1, 2) [12]
3.604
3.490
BC-ANN (2, 1, 2) [12]
3.387
3.463
SVR (2, 100)
5.723
Thetha
2.914
RSSA (25, 7)
3.559
490.369
0.012
1320.487
0.030
1028.813
0.024
6.022
1887.737
0.045
2.975
898.173
0.021
884.791
0.020
1457.826
0.033
EP
0.011
3.503
AC C
VSSA (39, 24)
452.767
6.174
5.941
Notes: See Table 5.
The results outlined in the above Tables endorse the superiority of the proposed bagging methodologies in terms of more accurate forecasts. The Japanese case remains the only exception, but now results are not conclusive in terms of the best forecasting technique for the 2015-2016 period. The Single ETS approach performed better in terms of MAPE and sMAPE for the Japanese electricity demand, whilst the Theta forecasts were slightly more accurate in terms of RMSE and TIC.
20
ACCEPTED MANUSCRIPT 5.3 Discussion
RI PT
The performance gains demonstrated by the STL + bagging approaches are remarkable as accurate forecasts are decisive for assertive profit/cost management and investment decisions, as well as for the definition of sectoral policies in a local or national scale. For the energy sector, particularly, precise mid/long-term demand forecasting is of the utmost importance for several decision-making processes, such as the construction of new generation facilities, the purchase of existing generating units, the development of transmission and distribution systems, among others. In a more general sense, accurate forecast results are also paramount to reach agreements between different stakeholders (generators, transmitters, distributors, traders, consumers, investors, government and national regulation institutes).
TE D
M AN U
SC
It should be noted that a considerable amount of the variation in the monthly electric energy consumption is due to external factors, which cannot be captured by univariate forecasting methods. Some remarkable examples are the influences of the electric energy generation and, particularly, Industrial Output in several countries. For the Brazilian case, for instance, the Industrial Sector accounted for almost 43% of the total electric energy consumption between the years of 2006 and 2014 (training period). Another point worth noting is the important role that the Gross Domestic Product (GDP) plays in electric energy consumption behavior ([24], [64]). By quickly glancing the GDP data in Brazil, one may notice substantial falls in its common trend, reflecting the recent political and economic turmoil in the country. Along with the energy rationing and the lower industrial output, this might have been an important factor for the substantial decline in the demand for energy end-use services in Brazil in the last years. Notwithstanding the above, as previously mentioned, formulations that consider external influences on the variable of interest usually yield satisfactory results when simulating historical data but fail to perform well in forecasting several steps ahead (more than 12 steps, as in our case). On these grounds, the combination of bagging methodologies and univariate forecast methods emerges as a promising alternative to predict mid-/long-term behavior for a broad variety of time series in different economic sectors.
EP
6. Conclusions and future directions
AC C
In light of the strategic importance of having credible forecasts of the demand for electric energy in almost every country, the continuous search for more accurate methods is key to competitive advantage among different stakeholders. This is especially true when it comes to methods which are robust to uncommon events, such as economic distress and rationing. This paper applies for the first time, as long as the authors are aware, a combination of decomposition and Bootstrap aggregating (Bagging) techniques in order to enhance univariate forecasts for monthly electric energy demand across different countries. Furthermore, a new variation of an existing bagging approach is proposed, here name after Remainder Sieve Bootstrap (RSB). The obtained results attest that the proposed methodologies can improve forecast accuracy. Moreover, the approaches that used the RSB procedure for simulation, forecast and aggregation delivered the best results, especially in terms of MAPE and sMAPE, in most cases. This is a substantial contribution to the literature, as the MBB approach proposed by Bergmeir et al. [32] has been regarded as a benchmark for forecasting monthly data.
21
ACCEPTED MANUSCRIPT
RI PT
A promising aspect in estimating future monthly electric energy consumption is that much of its variation is dependent on the common patterns which are natural to demand time series - such as trends and seasonality. On these grounds, even though models can sometimes benefit from the inclusion of exogenous economic variables, the combination of decomposition and bagging methodologies with univariate methods seems to perform better when predicting energy consumption behavior several steps ahead, as the forecasting of external variables is often regarded as a complex, sensitive task. The empirical results confirmed this belief, as the values obtained for the MAPE and sMAPE were fairly low (approximately 1.3% for the Italian case, for instance) considering the large number of forecast steps proposed (24 months ahead).
SC
In general, it can be stated that the results from the proposed approaches were appropriate and provided consistently accurate forecasts for the mid-/long-term electric energy consumption. This give us confidence to conclude that the bagging methodologies, particularly the MBB and the proposed RSB procedure, are an important forecast tool for the energy industry, despite its gap in related literature.
Acknowledgements
TE D
M AN U
Finally, it is worth noting that other methodologies may be proposed to investigate total electric energy consumption behavior, and that the selection of the most appropriate models is linked to a huge variety of criteria. Further studies might benefit, for instance, from a top-down disaggregation approach before proceeding to the methods here developed. For the electric sector, particularly, this means applying the STL + bagging methods for each subsystem of the total consumption (Industrial, Commercial, Residential and Other Sectors classes), forecasting the simulated series and then pooling together the results into one single output. Such sector-specific studies not only would provide a more in-depth understanding of the demand for electric energy across different countries, but could also possibly further enhance the accuracy of forecasts for total electric energy consumption. The study of other decomposition, bagging and forecasting methods is also a logical extension of this work.
AC C
References
EP
Funding: This work was supported by CAPES; CNPq [304843/2016-4, 443595/2014-3] and FAPERJ [E-26/202.806/2015].
[1] Al-Hamadi HM, Soliman SA. Long-term/mid-term electric load forecasting based on short-term correlation and annual growth. Elect Power Syst Res 2005;74(3):353–61. doi: 10.1016/j.epsr.2004.10.015 [2] Shao Z, Chao F, Yang S-L, Zhou K-L. A review of the decomposition methodology for extracting and identifying the fluctuation characteristics in electricity demand forecasting. Renew & Sust Energy Rev 2017;75:123-36. doi: 10.1016/j.rser.2016.10.056 [3] Dogan E. Are shocks to electricity consumption transitory or permanent? Sub-national evidence from Turkey. Utilities Policy 2016;41:77–84. doi: 10.1016/j.jup.2016.06.007 [4] Apadula F, Bassini A, Elli A, Scapin S. Relationships between meteorological variables and monthly electricity demand. Appl Energy 2012;98:346–56. doi: 10.1016/j.apenergy.2012.03.053
22
ACCEPTED MANUSCRIPT [5] Castelli M, Vanneschi L, De Felice M. Forecasting short-term electricity consumption using a semantics-based genetic programming framework: The South Italy case. Energy Econ 2015;47:37–41. doi: 10.1016/j.eneco.2014.10.009
RI PT
[6] Kaboli SHA, Fallahpour A, Selvaraj J, Rahim NA. Long-term electrical energy consumption formulating and forecasting via optimized gene expression programming. Energy 2017;126:144–64. doi: 10.1016/j.energy.2017.03.009 [7] Suganthi L, Samuel AA. Energy models for demand forecasting — A review. Renew & Sust Energy Rev 2012;16(2):1223–40. doi: 10.1016/j.rser.2011.08.014
SC
[8] Kuster C, Rezgui Y, Mourshed M. Electrical load forecasting models: A critical systematic review. Sust Cities & Society 2017; 35:257–70. doi: 10.1016/j.scs.2017.08.009
M AN U
[9] Abdel-Aal RE, Al-Garni AZ. Forecasting monthly electric energy consumption in eastern Saudi Arabia using univariate time-series analysis. Energy 1997;22(11):1059–69. doi: 10.1016/S03605442(97)00032-7 [10] Chaveza SG, Bernata JX, Coallab HL. Forecasting of energy production and consumption in Asturias (northern Spain). Energy 1999; 24(3):183–98. doi: 10.1016/S0360-5442(98)00099-1 [11] Pappas SS, Ekonomou L, Karamousantas DC, Chatzarakis GE, Katsikas SK, Liatsis P. Electricity demand loads modeling using AutoRegressive Moving Average (ARMA) models. Energy 2008;33(9):1353–60. doi: 10.1016/j.energy.2008.05.008
TE D
[12] Sen P, Roy M, Pal P. Application of ARIMA for forecasting energy consumption and GHG emission: A case study of an Indian pig iron manufacturing organization. Energy 2016;116:1031–8. doi: 10.1016/j.energy.2016.10.068
EP
[13] Maçaira PM, Souza RC, Oliveira FLC. Modelling and Forecasting the Residential Electricity Consumption in Brazil with Pegels Exponential Smoothing Techniques. Procedia Comp Sci 2015;55:328–35. doi: 10.1016/j.procs.2015.07.057
AC C
[14] Hor C-L, Watson SJ, Majithia S. Analyzing the Impact of Weather Variables on Monthly Electricity Demand. IEEE Trans on Power Systs 2005;20(4):2078–85. doi: 10.1109/TPWRS.2005.857397 [15] Apadula F, Bassini A, Elli A, Scapin S. Relationships between meteorological variables and monthly electricity demand. Appl Energy 2012;98:346–56. doi: 10.1016/j.apenergy.2012.03.053 [16] De Felice M, Alessandri A, Catalano F. Seasonal climate forecasts for medium-term electricity demand forecasting. Appl Energy 2015;137:435–44. doi: 10.1016/j.apenergy.2014.10.030 [17] Vu DH, Muttaqi KM, Agalgaonkar AP. A variance inflation factor and backward elimination based robust regression model for forecasting monthly electricity demand using climatic variables. Appl Energy 2015;140:385–94. doi: 10.1016/j.apenergy.2014.12.011
23
ACCEPTED MANUSCRIPT [18] Azadeh A, Ghaderi SF, Sohrabkhani S. A simulated-based neural network algorithm for forecasting electrical energy consumption in Iran. Energy Policy 2008;36(7):2637–44. doi: 10.1016/j.enpol.2008.02.035
RI PT
[19] Kialashaki A, Reisel JR. Development and validation of artificial neural network models of the energy demand in the industrial sector of the United States. Energy 2014;76:749–60. doi: 10.1016/j.energy.2014.08.072 [20] Günay ME. Forecasting annual gross electricity demand by artificial neural networks using predicted values of socio-economic indicators and climatic conditions: Case of Turkey. Energy Policy 2016;90:92–101. doi: 10.1016/j.enpol.2015.12.019
SC
[21] Dudek G. Neural networks for pattern-based short-term load forecasting: A comparative study. Neurocomputing 2016;205:64–74. doi: 10.1016/j.neucom.2016.04.021
M AN U
[22] Azadeh A, Saberi M, Seraj O. An integrated fuzzy regression algorithm for energy consumption estimation with non-stationary data: A case study of Iran. Energy 2010;35(6):2351–66. doi: 10.1016/j.energy.2009.12.023 [23] Mamlook R, Badran O, Abdulhadi E. A fuzzy inference model for short-term load forecasting. Energy Policy 2009;37(4):1239–48. doi: 10.1016/j.enpol.2008.10.051 [24] Kucukali S, Baris K. Turkey’s short-term gross annual electricity demand forecast by fuzzy logic approach. Energy Policy 2010;38(5):2438-45. doi: 10.1016/j.enpol.2009.12.037
TE D
[25] Torrini FC, Souza RC, Oliveira FLC, Pessanha JFM. Long term electricity consumption forecast in Brazil: A fuzzy logic approach. Socio-Econ Plan Sciences 2016;54:18–27. doi: 10.1016/j.seps.2015.12.002
EP
[26] Hong WC. Electric load forecasting by support vector model. Appl Math Modelling 2009;33(5):2444–54. doi: 10.1016/j.apm.2008.07.010
AC C
[27] Chen Y, Xu P, Chu Y, Li W, Wu Y, Ni L, Bao Y, Wang K. Short-term electrical load forecasting using the Support Vector Regression (SVR) model to calculate the demand response baseline for office buildings. Appl Energy 2017;195:659–70. doi: 10.1016/j.apenergy.2017.03.034 [28] Theodosiou M. Forecasting monthly and quarterly time series using STL decomposition. Int J Forecast 2011;27(4):1178–95. doi: 10.1016/j.ijforecast.2010.11.002 [29] Benaouda D, Murtagh F, Starck J-L, Renaud O. Wavelet-based nonlinear multiscale decomposition model for electricity load forecasting. Neurocomputing 2006;70(1–3):139–54. doi: 10.1016/j.neucom.2006.04.005 [30] Qiu X, Ren Y, Suganthan PN, Amaratunga GAJ. Empirical Mode Decomposition based ensemble deep learning for load demand time series forecasting. Appl Soft Comp 2017;54:246–55. doi: 10.1016/j.asoc.2017.01.015 [31] Cordeiro C, Neves M. Forecasting time series with BOOT.EXPOS procedure, REVSTAT—Stat J 2009;7:135–49. Available at: https://www.ine.pt/revstat/tables.html (accessed 13 February 2017).
24
ACCEPTED MANUSCRIPT [32] Bergmeir C, Hyndman, RJ, Benítez JM. Bagging exponential smoothing methods using STL decomposition and Box–Cox transformation. Int J Forecast 2016;32:303–12. doi: 10.1016/j.ijforecast.2015.07.002 [33] Dantas TM, Oliveira FLC, Repolho HMV. Air transportation demand forecast through Bagging Holt Winters methods. J Air Transp Manag 2017; 59:116–23. doi: 10.1016/j.jairtraman.2016.12.006
RI PT
[34] Cleveland RB, Cleveland WS, McRae J, Terpenning I. STL: A seasonal-trend decomposition procedure based on loess. J Official Stat 1990;6:3–73. Available at: http://www.jos.nu/Articles/abstract.asp?article=613 (accessed 11 February 2017).
SC
[35] Hyndman RJ, Athanasopoulos G. Forecasting: principles and practice. 1st ed. Perth: University of Western Australia; 2013. [36] Efron B. Bootstrap methods: another look at the jackknife. Annals of Stat 1979;7(1):1–26. doi: 10.1214/aos/1176344552
M AN U
[37] Breiman L. Bagging predictors. Machine Learn 1996;24(2):123–40. doi: 10.1007/BF00058655 [38] Hillebrand E, Medeiros MC. The Benefits of Bagging for Forecast Models of Realized Volatility. Econometric Rev 2010;29(5-6):571–93. doi: 10.1080/07474938.2010.481554 [39] Zontul M, Aydin F, Dogan G, Sener S, Kaynar O. Wind speed forecasting using reptree and bagging methods in Kirklareli-Turkey. J Theoretical & Appl Info on Tech 2013;56(1):17–29. Available at: www.jatit.org/volumes/Vol56No1/3Vol56No1.pdf (accessed 20 March 2017).
TE D
[40] Künsch HR. The jackknife and the bootstrap for general stationary observations. Annals of Stat 1989;17(3):1217–41. doi: 10.1214/aos/1176347265
EP
[41] Makridakis S, Hibon M. The M3-Competition: results, conclusions and implications. Int J Forecast 2000;16(4):451–76. doi: 10.1016/S0169-2070(00)00057-1 [42] Bühlmann P. Sieve bootstrap for time series. Bernoulli 1997;3(2):123–48. doi: 10.2307/3318584
AC C
[43] Akaike H. A new look at the statistical model identification. IEEE Trans Autom Control 1974;19(6):716–23. doi: 10.1109/TAC.1974.1100705 [44] Sugiura N. Further analysis of the data by Akaike's information criterion and the finite corrections. Comm in Stats - Theory & Methods 1978;7(1):13–26. doi: 10.1080/03610927808827599 [45] Gardner Jr. ES. Exponential smoothing: The state of the art. J Forecast 198;4(1):1–28. doi: 10.1002/for.3980040103 [46] Gardner Jr. ES. Exponential smoothing: The state of the art—Part II. Int J Forecast 2006;22(4):637-66. doi: 10.1016/j.ijforecast.2006.03.005 [47] Winters PR. Forecasting sales by exponentially weighted moving averages. Manag Sci 1960; 6:324–42. doi: 10.1287/mnsc.6.3.324
25
ACCEPTED MANUSCRIPT [48] Pegels CC. Exponential forecasting: some new variations. Manag Sci 1969; 12:311–5. Available at: http://www.jstor.org/stable/2628137 (accessed 10 February 2017). [49] Ord, JK, Koehler, AB, Snyder, RD. Estimation and prediction for a class of dynamic nonlinear statistical models. J American Stat Assoc 1997;92(440):1621–9. doi: 10.1080/01621459.1997.10473684
RI PT
[50] Hyndman RJ, Koehler AB, Ord JK, Grose S. Forecasting with exponential smoothing: The state space approach. 1st ed. Berlin: Springer-Verlag; 2008. [51] Box GEP, Jenkins GM. Time series analysis: forecasting and control. 1st ed. San Francisco: Holden Day; 1970.
SC
[52] Makridakis S, Andersen A, Carbone R, Fildes R, Hibon M, Lewandowski R, et al. The forecasting accuracy of major time series methods. 1st ed. New York: John Wiley & Sons; 1984.
M AN U
[53] Stock JH, Watson MW. Combination forecasts of output growth in a seven-country data set. J Forecast 2004;23(6):405–30. doi: 10.1002/for.928 [54] IEA (International Energy Agency) [dataset]. February 1, 2017 Monthly electricity statistics. 2017 Feb 1 [cited 2017 Feb 4]. Available from: http://www.iea.org/statistics/monthlystatistics/monthlyelectricitystatistics/
TE D
[55] Eletrobras (Centrais Elétricas Brasileiras S.A.) [dataset]. February 1, 2017 Consumo de energia elétrica por classe de consumidor e por região. 2017 Feb 1 [cited 2017 Feb 4]. Available from: https://www3.bcb.gov.br/sgspub/ [56] Box GEP, Cox DR. An analysis of transformations (with discussion). J Royal Stat Soc: Series B 1964;26:211–52. http://www.jstor.org/stable/2984418
EP
[57] Guerrero V. Time-series analysis supported by power transformations. J Forecast 1993;12(1):37– 48. doi: 10.1002/for.3980120104
AC C
[58] Rumelhart DE, Hinton GE, Williams RJ. Learning internal representations by error propagation (No. ICS-8506). San Diego: University of California; 1985. [59] Auer P, Burgsteiner H, Maass W. A learning rule for very simple universal approximators consisting of a single layer of perceptrons. Neural Nets 2008;21(5):786–95. doi: 10.1016/j.neunet.2007.12.036 [60] Smola AJ, Schölkopf B. A tutorial on support vector regression. Statist Comput 2004;14(3):199– 222. doi: 10.1023/B:STCO.0000035301.49549.88 [61] Hall M. Correlation-based feature selection for machine learning. Hamilton (NZ): Waikato University; 1999. [62] Assimakopoulos V, Nikolopoulos K. The theta model: a decomposition approach to forecasting. Int J Forecast 2000;16(4):521–30. doi: 10.1016/S0169-2070(00)00066-2a
26
ACCEPTED MANUSCRIPT [63] Golyandina N, Nekrutkin V, Zhigljavsky A. Analysis of time series structure: SSA and related techniques. Boca Raton: Chapman & Hall/CRC; 2001.
AC C
EP
TE D
M AN U
SC
RI PT
[64] Burke PJ, Csereklyei Z. Understanding the energy-GDP elasticity: A sectoral approach. Energy Econ 2016;58:199–210. doi: 10.1016/j.eneco.2016.07.004
27
ACCEPTED MANUSCRIPT Electricity demand across different countries is forecasted 24 months in advance
•
The potential gains of using bagging techniques to enhance forecasts are explored
•
A new variation of a bagging procedure is proposed
•
The proposed techniques provided consistently accurate forecasts in most cases
AC C
EP
TE D
M AN U
SC
RI PT
•