Short-term Electricity Load Forecasting of Buildings in Microgrids

Hamed Chitsaz a,*, Hamid Shaker a, Hamidreza Zareipour a, David Wood a, Nima Amjady b

a Schulich School of Engineering, University of Calgary, Alberta, Canada
b Electrical Engineering Department, Semnan University, Semnan, Iran

* Corresponding author: Hamed Chitsaz; Email: [email protected]; Phone: +1 (587) 896-8388; Fax: +1 (403) 282-6855

Accepted manuscript, Energy & Buildings (2015). DOI: 10.1016/j.enbuild.2015.04.011. Received 19-12-2014; revised 17-2-2015; accepted 10-4-2015.

Abstract

Electricity load forecasting plays a key role in the operation of power systems. Since the penetration of distributed and renewable generation is growing in many countries, Short-Term Load Forecast (STLF) of micro-grids is also becoming an important task. A precise STLF of a micro-grid can enhance the management of its renewable and conventional resources and improve the economics of energy trade with electricity markets. As a consequence of the highly non-smooth and volatile behavior of the load time series in a micro-grid, its STLF is an even more complex process than that of a power system. For this purpose, a new prediction method is proposed in this paper, in which a Self-Recurrent Wavelet Neural Network (SRWNN) is applied as the forecast engine. Moreover, the Levenberg-Marquardt (LM) learning algorithm is implemented and adapted to train the SRWNN. In order to demonstrate the efficiency of the proposed method, it is examined on real-world hourly data of an educational building within a micro-grid. Comparisons with other load prediction methods are provided.

Keywords: Micro-grids, buildings, electricity load forecasting, self-recurrent wavelet neural network.

1. Introduction

Micro-grids are integrated energy systems composed of distributed energy resources and multiple electrical loads operating as an autonomous grid, which can be either in parallel to or islanded from the existing power grid. A micro-grid can be considered as a small-scale version of the traditional power grid whose small scale results in far fewer line losses and lower demand on transmission infrastructure. All of these advantages are consequently motivating an increased demand for micro-grids in a variety of application areas such as campus environments, military operations, community/utility systems, and commercial and industrial markets [1].

Considering the fast and worldwide development of micro-grids, their optimal operation requires advanced tools and techniques. In particular, Short-Term Load Forecast (STLF) is an indispensable task for the operation of a micro-grid. In conventional power systems, STLF is an important tool for reliable and economic operation, as many operating decisions, such as dispatch scheduling of generating capacity, demand side management, security assessment and maintenance scheduling of generators, are based on load forecasts [2, 3, 4, 5, 6, 7]. Load forecasts also play significant roles in energy transactions, market shares and profits in competitive electricity markets [7, 8]. Different prediction strategies have already been presented for the STLF of traditional power systems over the years. These methodologies are generally divided into two main groups: classical statistical techniques and computational intelligence techniques. Reviews of some of these strategies can be found in [2, 4, 5, 6, 7, 8].

In a similar way, STLF is a key factor in the operation of micro-grids, for instance in energy management for optimal utilization of available resources in order to minimize the operation cost or any environmental impact of a micro-grid [9]. Moreover, STLF for a micro-grid can be used for profitable trade of electric energy within the grid. In other words, it is important for the operator of a micro-grid to determine the amount of power exchanged with a wholesale energy market so as to maximize the total benefit [10]. It has also been discussed that the forecasted loads as well as the forecasted generation of renewable resources are the main inputs for optimal energy management [11, 12] and generation scheduling [13] in micro-grids.

However, modeling and forecasting of micro-grids' loads can be more complex tasks than those usually applied for conventional power systems, as the load time series of micro-grids is more volatile in comparison with the load of power systems, as demonstrated later in the present paper. Since the size of a micro-grid is considerably small compared to a traditional power system, the load of a micro-grid includes more fluctuations. In other words, the inertia in small-scale systems is low and therefore, the smoothness of the load time series in such systems degrades. Using a criterion to measure the volatility of a time series, it will be shown in this paper that the volatility of the load time series for a micro-grid is considerably higher than that for a conventional power system. As a result, there is a need to adapt a suitable STLF model to the volatile behavior of micro-grid load time series.

Despite the importance of STLF for micro-grids, there are only a few works presented in this area. The authors in [14] present an on-line learning model based on Multiple Classifier Systems (MCSs) for short-term load forecasting of micro-grids, and the model was tested on real data of a micro-grid. A bi-level prediction strategy is proposed in [15] for STLF in micro-grids. This strategy is composed of a forecaster including a neural network and an evolutionary algorithm in the lower level and an enhanced differential evolution algorithm in the upper level for optimizing the performance of the forecaster. The proposed model in [15] is designed with the aggregated micro-grid load in mind. However, the present paper focuses on forecasting individual loads within a micro-grid, with potentially significantly higher volatility compared to the aggregated micro-grid load. Forecasting individual micro-grid load components is important for operation scheduling and determining load-serving priorities at the feeder level [16].

Some research works have also been presented regarding electricity load prediction for residential areas and buildings [17, 18, 19]. The proper consumption of electricity in buildings leads to lower operational costs. If the facility manager could predict the electricity demand of the building, actions could consequently be taken to reduce the amount of energy and therefore, reduce the operational cost of the building [19]. A few works have been published very recently in the area of energy prediction of buildings. For instance, long-term energy consumption of a residential area in South West China has been studied in [20]. In this reference, an Artificial Neural Network (ANN) model is compared with some other prediction models, including a Grey model, regression model, polynomial model and polynomial regression model, to forecast the total energy consumption of the residential area, and it is shown that the ANN model outperforms the other models. Having access to detailed data of a six-story multi-family residential building located on the Columbia University campus in New York City, the authors in [21] were able to conduct a comparative spatial analysis to forecast the energy consumptions of units, floors and the whole building for different temporal intervals (e.g., 10-min, hourly and daily). The results indicate that the most effective models are built with hourly consumption at the floor level, provided that high-resolution and granular data is available via advanced smart metering devices. In [22], a Case-Based Reasoning (CBR) model, categorized as a machine-learning artificial intelligence technique, is proposed to forecast energy demand in an office building located in Varennes, Quebec, Canada. Three forecasting horizons of 3-hour, 6-hour and 24-hour ahead have been simulated with hourly prediction resolution, and the results demonstrate that the prediction capability of the model improves when the horizon is reduced to 3-hour ahead. The authors in [23] have proposed a new methodology for electrical consumption forecasting based on end-use decomposition and similar days. The total consumption forecast is also obtained from end-use consumptions and the data of selected days. In [24], a building-level neural network-based ensemble model is presented for day-ahead electricity load forecasting, and it is shown that the presented model outperforms SARIMA (Seasonal Autoregressive Integrated Moving Average) by up to 50%. However, the comparisons are made only with the SARIMA model, which is a linear statistical model and may not be capable of capturing the high nonlinearity of building-level electricity load.

To summarize the main points, micro-grids can bring considerable benefits to power systems, such as supplying loads in remote areas, reducing total system expansion planning cost, reducing carbon emissions through coordinated utilization of Renewable Energy Sources (RESs), providing cheaper electricity through proper energy management of available resources and energy trade with the main grid, and improving system reliability and resiliency by providing dispatchable power for use during peak power conditions or emergency situations. Moreover, it was discussed that a short-term load forecasting tool is of high importance for optimal energy management and secure operation of micro-grids. In this way, some research works have been conducted to develop load forecasting models with higher accuracy. However, as discussed above, only a few works have focused on day-ahead load consumption prediction of buildings in micro-grids and consequently, improvement of forecast accuracy is still needed in this area. In the present paper, a forecast method is proposed for the STLF of micro-grids with a focus on electricity load prediction for individual buildings. The main contribution of this paper is applying a Self-Recurrent Wavelet Neural Network (SRWNN) forecasting engine for electricity load prediction of micro-grids. Moreover, the Levenberg-Marquardt (LM) learning algorithm is implemented to train the SRWNN. The proposed method improves the forecast accuracy for the highly volatile and non-smooth time series of micro-grid electricity load. The higher the forecast accuracy of electricity load, the more efficient the energy management that can be achieved in a micro-grid.

The remaining parts of the paper are organized as follows. Section 2 provides a data analysis on different electricity load time series to draw a distinction between the load of a micro-grid and a power system. The proposed forecasting method consists of the SRWNN as the forecasting engine and LM as the training algorithm, and is presented in Section 3. The proposed load forecasting method is tested on real-world test cases and the results are compared with the results of some other prediction approaches in Section 4. Finally, Section 5 concludes the paper.

2. Data analysis

A data analysis is presented in this section so as to compare the characteristics of a micro-grid load time series and electricity load in power systems. The British Columbia Institute of Technology (BCIT) in Vancouver, in the Province of British Columbia (BC), Canada, is considered as the micro-grid test case studied in this paper. BCIT's Burnaby campus is Canada's first Smart Power Micro-grid, comprised of power plants (including renewable resources of wind and photovoltaic modules), campus loads, command and control (including substation automation, micro-grid control center and distributed energy management), and a communication network [25]. The load data used in this work is from one building within the BCIT micro-grid, with a peak value of 694 kW, from March 2012 to March 2013. Hereafter, we refer to this load as BCIT. To draw a comparison between the characteristics of a micro-grid load and a power-system-level load, the load time series of two power systems, i.e., British Columbia, where the BCIT micro-grid is located, and California, are analyzed.

Table 1: Comparison of electricity load time series in terms of volatility.

Volatility index         BCIT    British Columbia's    California's
                                 System Load           System Load
Daily volatility (%)     8.34    2.66                  3.18
Weekly volatility (%)    7.09    2.28                  3.15

Electricity load follows daily and weekly periodicities. In this way, we consider two measures for volatility analysis, i.e., daily volatility and weekly volatility. These measures are based on the standard deviation of logarithmic returns over a time window. In general, daily volatility quantifies the overall change in hourly electricity load from one day to another, and weekly volatility measures the load changes in subsequent weeks. For more details regarding the aforementioned volatility indices, see [26].
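As an illustration of how such indices can be computed, the following is a minimal sketch, assuming the daily index uses logarithmic returns at a 24-hour lag and the weekly index a 168-hour lag over the one-year hourly series; the exact windowing in [26] may differ, and the function and variable names are illustrative only.

```python
import numpy as np

def volatility(hourly_load, lag):
    """Standard deviation of logarithmic returns at a given hourly lag,
    expressed in percent."""
    load = np.asarray(hourly_load, dtype=float)
    log_returns = np.log(load[lag:] / load[:-lag])
    return 100.0 * np.std(log_returns)

# Hypothetical usage with one year of hourly data (about 8760 samples):
# daily_vol  = volatility(hourly_load, lag=24)    # day-to-day changes
# weekly_vol = volatility(hourly_load, lag=168)   # week-to-week changes
```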

One year of hourly load data has been considered for British Columbia's and California's power systems for the same period, i.e., from March 2012 to March 2013. Observe from Table 1 that both daily and weekly volatility indices for the micro-grid are considerably higher than those for the power systems, which demonstrates the low smoothness of the micro-grid load time series. For instance, the daily volatility of the micro-grid is 8.34%, while it is respectively 2.66% and 3.18% for British Columbia's and California's power systems. This means that the electricity load of the micro-grid fluctuates more severely from one day to another compared with that of the power systems. Likewise, weekly fluctuations are more severe in the micro-grid than those in a power system. As a result, the daily and weekly periodicities of electricity load in a micro-grid are noticeably weaker, and consequently, the predictability of such load time series decreases.

Figure 1: One-year hourly load data of BC's power system and the building in BCIT. (a) BC's power system electricity load; (b) Building electricity load in BCIT.

Fig. 1 illustrates the one-year hourly load data of British Columbia's power system and that of the building in BCIT. It is noted that the data is normalized to the maximum value. As seen, the aggregated electricity load in a large area (e.g., the province of British Columbia) is noticeably different from the aggregated load in a building. For instance, British Columbia's load follows a common seasonal pattern, as the load decreases in the spring in April and starts to increase in the fall in October. The building's load follows a fairly similar seasonal pattern: from the beginning of the academic year in September, the electricity load starts increasing, and it starts decreasing in February. Moreover, it is evident that fluctuations of load are more severe for a building compared with those for a power system. These variations in the load time series of the micro-grid graphically demonstrate its volatility, previously shown in Table 1 by the volatility indices.

Figure 2: Distribution of 1-hour and 2-hour ramps for the BC power system and the building in BCIT, plotted as the number of occurrences versus ramp size in % of the peak load. (a) 1-hour ramp distribution; (b) 2-hour ramp distribution.

To have a better understanding of such severe load fluctuations for a building, hourly changes of load, i.e., the differences between observations at subsequent hours, so-called 1-hour ramps, can be taken into consideration. Fig. 2 (a) shows the distribution (with equally spaced bins of 1% of the peak load) of 1-hour ramps for both the building in BCIT and BC's power system loads. Note that negative values show downward ramps. As seen, hourly upward and downward ramps with amplitudes of more than 5% of the peak load have occurred more frequently in the building in BCIT. The most severe ramp in BC's power system load is a ramp up with an amplitude of almost 9% of the peak load, while for the building it is a ramp up of more than 20% of the peak load. Similarly, Fig. 2 (b) illustrates the distribution of 2-hour ramps, i.e., load variations over two-hour durations. As a longer time is considered for ramps, larger ramps are detected. Clearly, sharp upward and downward ramps have happened more frequently in the BCIT building load than in the BC system load.

Table 2: Ramp events in electricity load time series

                                   1-hour                  2-hour
Interval (% of the peak load)   Building  BC power     Building  BC power
                                in BCIT   system       in BCIT   system
Ramp Up (RU)
  5% ≤ RU < 10%                   556       504           818       971
  10% ≤ RU < 15%                   84         0           362       352
  15% ≤ RU < 20%                   12         0           121        55
  RU > 20%                          4         0            21         0
Ramp Down (RD)
  5% ≤ RD < 10%                   520       259          1003      1272
  10% ≤ RD < 15%                   48         0           283       141
  15% ≤ RD < 20%                    9         0            57         0
  RD > 20%                          0         0            11         0

To provide more detailed statistics of ramps, Table 2 shows the number of upward and downward ramps for 1-hour and 2-hour durations. For instance, there have been 100 1-hour upward ramps of more than 10% of the peak load in the BCIT building load, while no 1-hour ramp up with an amplitude of more than 10% of the peak load has occurred in the BC system load. With regard to 2-hour ramp ups, 55 of more than 15% of the peak load have occurred in BC, compared with 142 for the BCIT load. This table also demonstrates that the number of downward ramps is smaller than the number of upward ramps when large ramps are concerned.
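As an illustration, a brief sketch of how such ramp statistics can be tabulated from the hourly series follows; the bin edges mirror Table 2, while the function name and interface are assumptions for illustration, not the authors' code.

```python
import numpy as np

def ramp_counts(load, horizon, peak, edges=(5.0, 10.0, 15.0, 20.0)):
    """Count upward and downward ramps over `horizon` hours, binned by
    amplitude as a percentage of the peak load (cf. Table 2)."""
    load = np.asarray(load, dtype=float)
    ramps = 100.0 * (load[horizon:] - load[:-horizon]) / peak   # % of peak
    bins = list(edges) + [np.inf]
    count = lambda r: [int(np.sum((r >= lo) & (r < hi)))
                       for lo, hi in zip(bins[:-1], bins[1:])]
    return {"ramp_up": count(ramps[ramps > 0]),
            "ramp_down": count(-ramps[ramps < 0])}

# Hypothetical usage with the one-year hourly series of the building:
# ramp_counts(hourly_load, horizon=1, peak=694.0)   # 1-hour ramps
# ramp_counts(hourly_load, horizon=2, peak=694.0)   # 2-hour ramps
```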

Based on the above descriptions, prediction of the electricity load time series of a building appears to be more difficult than that of a power system, since high volatility lowers predictability. Consequently, it is required to adapt a forecasting model so as to cope with the challenging characteristics of such time series. In the next section, a forecasting model is proposed to capture the dynamic and volatile behaviour of micro-grid time series.

3. The forecasting model

an

The discussion in section 2 showed that dealing with micro-grid load time series is a more challenging task compared with a power system, and therefore,

200

M

traditional STLF will not result in satisfactory accuracy in micro-grid load prediction. In this way, the SRWNN forecasting engine is firstly presented in this

te

of the SRWNN.

d

section and the training algorithm is then implemented to set the free parameters

Ac ce p

3.1. Self-Recurrent Wavelet Neural Network The wavelet theory has been applied through two different approaches for

205

forecast processes. The first one is using the wavelet transform as a preprocessor to compose the load time series into its low and high frequency components. Each component is separately processed by a forecast engine [27]. The other approach is constructing the wavelet neural network (WNN) in which a wavelet function is used as the activation function of the hidden neurons of a Feed-Forward Neu-

210

ral Network (FFNN). The WNN was first introduced in [28] for approximating nonlinear functions. Due to the local properties of wavelets and the concept of

12

Page 12 of 42

adapting the wavelet shape according to training data set instead of adapting the

ip t

parameters of the fixed shape basis function, WNNs have better generalization property compared to the classical FFNNs, and therefore, these are more appropriate for the modelling of time series [29].

cr

215

The SRWNN is a modified model of WNN including the properties of the dy-

us

namics of Recurrent Neural Networks (RNNs) [30] and the fast convergence of WNNs, which has successfully been applied to estimating and controlling nonlin-

220

an

ear systems [31]. Since the SRWNN has a self-recurrent mother wavelet layer, it can store the past information of wavelets and well attract the complex nonlinear

M

systems [32]. Having self-feedback loops and input direct terms, SRWNN has improved capabilities compared to WNN, such as its dynamic response and infor-

d

mation storing ability. Therefore, SRWNN has been applied as a forecast engine

225

te

in this paper to overcome the volatile and non-smooth behavior of the load time series in a micro-grid. Moreover, SRWNN does not include limitations, such as

Ac ce p

dependency on appropriate tuning of parameters and complex optimization process, which are likely to be found in models such as Support Vector Machines (SVMs) [33].

The architecture of the SRWNN is shown in Fig. 3, which is a feed forward

230

network with four layers. As seen, X = [x1 , ..., xM ] is the input vector of the forecast engine and y is the target variable. The inputs x1 , ..., xM of the forecast

engine can be from the past values of the target variable and past and forecast values of the related exogenous variables. For instance, past values of electricity load along with the past and forecast values of temperature can be considered for 13

Page 13 of 42

235

electricity load prediction, provided that their data is available.

Figure 3: Architecture of the SRWNN.

A feature selection technique can be used to refine these candidate features and select the most effective inputs for the forecast process. In this research work, we use the feature selection method of [34]. This method is based on the information-theoretic criterion of mutual information and selects the most informative inputs for the forecast process by filtering out the irrelevant and redundant candidate features through two stages. In the first stage, which is called the irrelevancy filter, the mutual information between each candidate input, i.e., x_i(t), and the target variable is calculated. A higher value of mutual information for x_i(t) means more common information content between this feature and the target variable. The candidate inputs with a computed mutual information value greater than a relevancy threshold, denoted by TH1, are considered as the relevant features of the forecast process and are retained for the next stage. The other candidate inputs, with mutual information values lower than TH1, are considered as irrelevant features and are filtered out. In the second stage, which is called the redundancy filter, redundant features among the candidate inputs selected by the relevancy filter are found and filtered out. Two selected candidates, e.g., x_k(t) and x_l(t), with a high value of mutual information between them have more common information, i.e., a high level of redundancy. Thus, the redundancy of each selected feature x_k(t) with the other candidate inputs is calculated. Then, if the measured redundancy becomes greater than a redundancy threshold, denoted by TH2, x_k(t) is considered as a redundant candidate input. Hence, between this candidate and its rival, which has the maximum redundancy with x_k(t), the one with lower relevancy is filtered out [34]. The candidate features retained after these two stages are considered as the inputs of the load forecasting engine; a sketch of this two-stage filter is given below. Moreover, fine-tuning of the threshold values TH1 and TH2 is performed by a cross-validation technique. Since this method is not the focus of this paper, it is not further discussed here; the interested reader can refer to [34] for details of this feature selection method.

Therefore, the target variable is the electricity load of the next time interval, for which the forecasting engine presents a prediction using the past values of electricity load and calendar effects. Moreover, a multi-period forecast, e.g., load prediction for the next 24 hours, is obtained via recursion, i.e., by feeding input variables with the forecaster's outputs. For instance, the forecasted load for the first hour is used as y(t-1) for the load prediction of the second hour, provided that y(t-1) is among the selected candidate inputs of the feature selection technique.
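The following is a minimal sketch of the two-stage mutual-information filter described above, assuming a simple histogram-based mutual information estimate and illustrative names for the thresholds and helpers; the exact estimator and redundancy measure in [34] differ in detail, and this is not the authors' implementation.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram-based mutual information estimate between two 1-D arrays."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px[:, None] * py[None, :])[nz])))

def two_stage_filter(candidates, target, th1, th2):
    """Relevancy filter (stage 1) followed by redundancy filter (stage 2).
    `candidates` maps feature names to 1-D arrays aligned with `target`."""
    # Stage 1: keep candidates whose relevancy to the target exceeds TH1.
    relevancy = {k: mutual_information(v, target) for k, v in candidates.items()}
    kept = {k: v for k, v in candidates.items() if relevancy[k] > th1}
    # Stage 2: drop the less relevant member of any highly redundant pair.
    selected = dict(kept)
    for k in list(kept):
        if k not in selected:
            continue
        rivals = {l: mutual_information(kept[k], kept[l]) for l in selected if l != k}
        if rivals and max(rivals.values()) > th2:
            rival = max(rivals, key=rivals.get)
            drop = k if relevancy[k] <= relevancy[rival] else rival
            selected.pop(drop)
    return list(selected)
```

In this sketch the relevancy and redundancy checks use the same mutual information estimate; as noted above, [34] tunes TH1 and TH2 by cross-validation.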

The input layer of the forecast engine transmits the M input variables, which are selected by the feature selection technique, to the next layer without any changes. The second layer, which is called the wavelet layer, consists of N × M neurons that each have a self-feedback loop. In this paper, the Morlet wavelet function has been considered as the activation function of neurons in the mother-wavelet layer, which is defined as follows:

\psi(x) = e^{-0.5 x^{2}} \cos(5x)    (1)

In SRWNN, a wavelet of each node is derived from its mother wavelet as below:

\psi_{i,j}(r_{i,j}) = \psi\left(\frac{u_{i,j} - b_{i}}{a_{i}}\right), \qquad r_{i,j} = \frac{u_{i,j} - b_{i}}{a_{i}}    (2)

where \psi_{i,j} is the scaled and shifted version of the Morlet mother wavelet, with a_{i} and b_{i} as the scale and shift parameters, respectively. In addition, the inputs of the wavelets in (2) are as follows:

u_{i,j} = x_{j} + \psi_{i,j} z^{-1} \cdot \theta_{i,j}    (3)

where z^{-1} is the time delay; thus, the input of this layer contains the memory term \psi_{i,j} z^{-1}, which can store the past information of the network, and \theta_{i,j} denotes the weight of the self-feedback loop, which represents the rate of information storage. This feature is the main difference between a SRWNN and a WNN. In fact, the SRWNN is the same as a WNN when all \theta_{i,j} are equal to zero. However, it is noted that the initial values for \theta_{i,j} are usually considered zero, which means there are no feedback units initially.

M-dimensional wavelet functions are constructed by the tensor product of one-

Ψi =

M Y

cr

dimensional Morlet wavelets in the third layer as follows:

ψi,j ,

i = 1, 2, ..., N

us

j=1

(4)

y=

N X

w i · Ψi +

290

M X

vj · xj + g

(5)

j=1

M

i=1

an

The output of the SRWNN, denoted by y, is finally computed as below:

where, wi is the weight between ith neuron of the product layer and the output

d

node, vj is the direct input weight between j th input and the output node, and

te

g is the bias of the output node. Therefore, the output of SRWNN is obtained by a combination of multi-dimensional wavelet functions, i.e. Ψi , as well as a

295

Ac ce p

combination of inputs, i.e. xj . In other words, the proposed model not only can benefit from the capabilities of wavelet functions, such as their ability to capture cyclical behaviors, but also can capture trends of the signal. In addition, SRWNN can benefit from its dynamic response by storing the past information of wavelets in self-feedback loops (equation 3) to capture complex nonlinearities. Based on the aforementioned formulation, the vector of the free parameters of the SRWNN

300

is denoted by P as follows:

P = [vj , wi , ai , bi , θi,j , g], i = 1, ..., N, j = 1, ..., M

(6)

17

Page 17 of 42

Therefore, the SRWNN has NP = M + 3N + M × N + 1 free parameters

ip t

which are determined by the training method. It should also be noted that the SRWNN model presented in this paper differs from the SRWNN proposed in [32].

305

cr

There are two differences between these two models. First, there is an additional external bias (e.g., g) to the output layer of the presented SRWNN in this work.

us

A bias can increase or lower the net input of the activation function, depending on whether it is positive or negative, respectively [35]. Consequently, biases can

an

enhance the input/output mapping function by adding another feature to neural networks. Second, Morlet wavelet functions have been used as the activation functions in Wavelet layer of SRWNN in this paper, while the second derivative

M

310

of Gaussian functions, i.e., Mexican hat wavelet function, in reference [32] of

d

the previous version. Although Mexican hat wavelet function has successfully

te

been used in WNN model for forecasting applications due to its superiorities over Daubechies wavelets [29], it has been shown that Morlet wavelets outperform Mexican hat wavelets for prediction applications [36, 37]. Therefore, we applied

Ac ce p

315

Morlet mother wavelets as the activation functions in SRWNN in our paper. 3.2. The training algorithm

In this subsection, a training algorithm is implemented to set the free parame-

ters of the SRWNN denoted by P in (6). Since the mother wavelet function used

320

in the SRWNN, i.e. Morlet wavelet function, is differentiable with respect to all free parameters, the Levenberg-Marquardt (LM) learning algorithm can be used in this regards. This learning algorithm was applied to train the neural networks

18

Page 18 of 42

by Hagan and Menhaj in [38]. Due to the advantages of the LM algorithm, such as

325

ip t

accurate training and fast convergence, it has been recommended in many research works, and therefore, it is implemented for training the SRWNN in this paper. The

cr

LM algorithm is briefly described in the Appendix and its implementation on the SRWNN is then presented.

us

Moreover, the termination criterion used for the training of the SRWNN is based on early-stopping technique. Accordingly, the whole available data is divided into training and validation samples. The SRWNN is trained using the train-

an

330

ing samples and the error for validation samples is monitored in each iteration. As

M

the validation error begins to rise during some number of iterations, usually five, the training phase is stopped and the values of the free parameters relating to the

ing algorithm.

Ac ce p

4. Numerical results

te

335

d

iteration with the least validation error are stored as the final solution of the train-

In this paper, we mainly focus on 24-hour ahead load prediction with hourly

forecast steps. Day-ahead load forecasting can bring significant operational advantages for energy management of micro-grids. For instance, BCIT micro-grid

340

consists of different types of generating units (e.g., thermal, wind and PV units),

and day-ahead load predictions are used for energy management purposes. In other words, optimal utilization of available resources is achieved using load forecasting in order to minimize the operation cost for BCIT campus micro-grid. Moreover, as this micro-grid can operate in both stand alone and grid-connected 19

Page 19 of 42

345

modes, accurate load forecast can be used for profitable trade of electric energy

ip t

within the British Columbia power system. The same load time series data of the building in BCIT and two power systems

cr

are used for numerical experiments of this section. Based on the data analyses

presented in section 2, electricity load not only depends on the load profile of the previous day, i.e., daily periodicity, but also the load pattern of the previous week,

us

350

i.e., weekly periodicity. To capture such patterns, 192 candidate inputs has been

an

considered as lagged hourly load data, i.e., {Lt−192 , ..., Lt−1 } where Lt indicates the electricity load at time t. The feature selection technique selects the most in-

355

M

formative lagged load values from these candidate inputs. Calendar information is also highly important for a load forecasting model so as to capture weekly and

d

seasonal patterns. For instance, either considering the day of the week or differ-

te

entiating weekends and weekdays is a common way presented in the literature [5, 9, 39]. Thus, weekends and holidays are considered in this work using a bi-

360

Ac ce p

nary variable for detecting weekends and holidays from weekdays. The month of the year is also used in some cases [39]; however, it is not considered in this

paper since the seasonality factor is already captured, as the model is re-trained

every day. Furthermore, temperature data as an exogenous variable has been used to improve load forecasting prediction since temperature time series usually has high relevancy to electricity consumption time series [5, 7, 8, 40]. Accordingly,

365

based on publicly available data, seven daily values of temperature for the previous week (e.g., Td−7 , ..., Td−1 ), and the daily forecast value of the temperature for the prediction day (e.g., Td ) were first considered for the model, where Td repre20

Page 20 of 42

sents the average daily temperature for day d. However, numerical experiments

370

ip t

for BCIT test case revealed that low resolution temperature data, i.e. daily data, cannot improve the accuracy for hourly load forecast. Therefore, we tested histor-

cr

ical hourly temperature data (located in Vancouver) and also used the same time series for temperature forecasts, i.e., perfect forecasts, in order to observe if hourly

us

temperature data can enhance the forecast results for BCIT test case. For this purpose, lagged hourly temperature data, i.e., {Tt−192 , ..., Tt−1 }, are considered as 192 candidate inputs that feed the feature selection stage along with 192 candi-

an

375

date inputs for load data. The feature selection technique then selects the most

M

informative candidates among all candidates of load and temperature and transfer them to the model. Considering the selected inputs, few temperature inputs are

series and load time series of BCIT. The low correlation results from the fact that

te

380

d

among all selected inputs that shows the low correlation of the temperature time

the electric load of this building is mainly lighting load. Considering the mild

Ac ce p

temperatures in Vancouver, the heating load is not as significant. The numerical results also supported this low correlation, as hourly temperature data with even perfect forecasts could not improve the forecast accuracy of the model. Therefore,

385

temperature inputs are not considered for the numerical results in this paper. To show the effectiveness of different forecasting engines, SRWNN is com-

pared with two other efficient neural network-based forecasting models, i.e., WNN and Multi-Layer Perceptron (MLP). It is noted that statistical models (e.g., Autoregressive Integrated Moving Average (ARIMA) model) are not considered in

390

this paper since such techniques are basically linear methods and have limited ca21

Page 21 of 42

pability to capture nonlinearities in the load series [41, 42]. Therefore, we chose

ip t

two efficient Computational Intelligence (CI) based models, e.g., MLP as an efficient Feed Forward Neural Network (FFNN) and WNN as an effective model

395

cr

combining nonlinear mapping merits of FNNNs and wavelet functions, as benchmarks in our comparative results.

us

Hence, 10 test months of hourly load data from the building in BCIT from May 2012 to February 2013 are considered for 24-hour ahead load prediction.

an

It is noted that the first two months of the historical data is used for training of the forecast engine and so the results of the first two months cannot be presented here. Two error criteria are used in this paper to evaluate forecast errors: (i)

M

400

normalized Root Mean Square Error (nRMSE) and (ii) normalized Mean Absolute

d

Error (nMAE), defined as follows:

Ac ce p

te

v u N u 1 X LACT(t) − LFOR(t) nRMSE = t ( )2 × 100 N t=1 LPeak

(7)

N

1 X LACT(t) − LFOR(t) | | × 100 nMAE = N t=1 LPeak

(8)

where LACT(t) and LFOR(t) indicate the actual and forecast values of electricity

405

load for hour t. Moreover, N indicates number of hours for each month, and LPeak is the peak value of the electricity load over the year, which is 694 kW for this test case. Observe from Table 3 that SRWNN outperforms the other forecasting models in all test months and in terms of both nRMSE and nMAE. For instance, the 22

Page 22 of 42

Table 3: Forecasting errors, in %, of SRWNN, WNN and MLP for 10 test months.


cr

us

an

M

Month      MLP             WNN             SRWNN
           nRMSE  nMAE     nRMSE  nMAE     nRMSE  nMAE
May         8.44   6.22     5.96   4.05     5.23   3.80
Jun.        9.92   7.55     5.44   4.27     4.86   3.80
Jul.       10.41   7.92     7.04   5.26     5.43   4.01
Aug.       10.40   8.14     6.57   4.95     6.46   4.80
Sep.       11.88   8.40     7.83   6.01     6.28   4.82
Oct.       10.45   7.68     4.81   3.83     4.24   3.28
Nov.        6.34   4.89     4.62   3.56     4.30   3.21
Dec.        6.21   4.58     4.54   3.35     4.22   3.05
Jan.        6.93   4.74     4.86   3.40     4.25   3.11
Feb.        6.85   5.29     5.06   3.94     4.58   3.50
Average     8.78   6.54     5.67   4.26     4.98   3.74

average nRMSE and average nMAE of SRWNN are (5.67-4.98)/5.67=12.1% and 410

(4.26-3.74)/4.26=12.2% lower than those of WNN, and (8.78-4.98)/8.78=43.2%

d

and (6.54-3.74)/6.54=42.8% lower than those for MLP, respectively. This table

te

demonstrates that for a highly volatile time series, i.e. micro-grid electricity load,

Ac ce p

a SRWNN forecasting model can more efficiently cope with the variations and non-smooth behavior of the time series.
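For reference, Eqs. (7)-(8) translate directly into code; the following minimal sketch assumes aligned arrays of actual and forecast hourly loads and the 694 kW peak used for this test case, with illustrative function names.

```python
import numpy as np

def nrmse(actual, forecast, peak):
    """Normalized RMSE in percent, Eq. (7)."""
    e = (np.asarray(actual) - np.asarray(forecast)) / peak
    return 100.0 * np.sqrt(np.mean(e ** 2))

def nmae(actual, forecast, peak):
    """Normalized MAE in percent, Eq. (8)."""
    e = (np.asarray(actual) - np.asarray(forecast)) / peak
    return 100.0 * np.mean(np.abs(e))

# Example (hypothetical month of hourly values): nrmse(y_act, y_for, peak=694.0)
```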

415

Moreover, Fig. 4 illustrates the carpet charts of monthly mean absolute errors

for different hours of the day for SRWNN and WNN on BCIT test case. This figure clearly shows that large errors for both models usually occur between 12:00 PM and 16:00 PM when the load peaks. However, this colormap shows lower errors during the peak hours for SRWNN in comparison with the WNN. More

420

importantly, the superiority of SRWNN over WNN is revealed during the upward ramps in the morning. As analyzed in section 2, sharp upward ramps occur more

23

Page 23 of 42

Figure 4: Mean absolute error (kW) of different hours of the day in different months. (a) SRWNN; (b) WNN.

Hour

Ac ce p

5 1

te

Mean Absolute Error (kW)

40

Figure 5: 10-month mean absolute error for different hours of the day

than downward ramps for BCIT test case, and consequently, any improvements in forecasting ramp up events can considerably enhance the forecast accuracy of this load time series. Fig. 5 demonstrates the average of mean absolute errors for all

425

10 months. According to these two curves, SRWNN shows lower yearly errors during the morning ramp, which usually occurs from 7:00 AM to 12:00 PM. In addition, there is an improvement in ramp down forecasting from 16:00 PM to 18:00 PM. 24

Page 24 of 42

Curves of generated forecasts and real data for a good forecasting day, i.e. November 15, and a bad forecasting day, i.e. September 7, are demonstrated in

ip t

430

Fig. 6. Fig. 6(a) shows that there are sharp changes and variations on Septem-

cr

ber 7. Sharp spikes could result from the high temperatures during specific days,

which increase the electricity consumption of the buildings for air conditioning.

435

us

As a consequence of such severe ramps, the forecasting model faces difficulties to capture this high sudden variations in electricity load. The major error is the

an

magnitude error occurred during the peak load. On the contrary, there has been smoother variations on November 15 shown in Fig. 6(b), so the forecasting model

M

could perfectly capture the upward ramp. As a result, the challenge of high volatility and sharp ramps in micro-grid time series is evidently distinct from power system loads, and makes such time series more unpredictable.

d

440

te

In the next experiment, forecasting errors of different days of the week for the same 10-month period are separately considered to observe the users’ behavior.

Ac ce p

It is noted that the electricity consumption of the building is mainly from lightings as mentioned earlier in this section. Here, users’ behavior is represented by

445

considering the calendar effect as the inputs of the model. A binary variable for differentiating weekends and holidays from weekdays is used, i.e., zero represents

weekends and holidays, while one represents weekdays. Fig. 7 demonstrates the forecasting errors with and without the calendar effects. First, observe that the average of nRMSE considering the calendar effect, i.e., 4.78%, is lower than that

450

when the calendar effect is not included, i.e., 5.32%. Moreover, according to the figure, the highest error occurs on Mondays, which is the first working day at the 25

Page 25 of 42

ip t cr us an M

(a) Bad forecasting day

(b) Good forecasting day

d

Figure 6: Samples for bad (a) and good (b) forecasting days

te

university. Calendar inputs can efficiently capture such behaviors of the users. For instance, the forecasting error in terms of nRMSE corresponds to Monday has

455

Ac ce p

considerably decreased from 7.32% to 5.83% when the calendar effect is taken into account. In addition, the standard deviation of the error associated with different days of week has decreased from 0.97% to 0.57% using the calendar effect. In other words, the model performs in a more robust way for predicting different days of the week. According to Fig. 7, the difference between the maximum and the minimum errors with calendar, i.e., 1.85%, and without calendar, i.e., 2.83%,

460

can also show the better performance of the model including the calendar effect. As a result, users’ behavior can efficiently be captured by considering calendar effects in order to improve the forecast accuracy.

26

Page 26 of 42

ip t cr us an

M

Figure 7: Forecasting errors for different days of the week

In the last experiment, the proposed forecasting model is applied to predict two

how forecast accuracy of SRWNN improves, compared with WNN, as the volatil-

te

465

d

power system time series. The main goal of this numerical experiment is to show

Ac ce p

ity of the time series increases. Hence, from a power system with low volatility to one with higher volatility, forecast accuracy improvements increase for SRWNN. In this way, the same test cases for British Columbia’s and California’s power systems are considered. Table 4 shows the obtained forecast error results (based on

470

the average of 10-month error) for both SRWNN and WNN models. Firstly, this table demonstrates noticeable lower forecast errors of both models for prediction of power systems’ load data compared with those for a micro-grid illustrated in Table 3. For instance, 4.98% compared with 2.29% in terms of nRMSE for the micro-grid and British Columbia’s power system, respectively. Besides, observed

475

from Table 1, the volatility for British Columbia’s power system time series is 27

Page 27 of 42

Table 4: Forecasting errors of SRWNN and WNN for two power systems.

cr


Power System         WNN              SRWNN            Improvements (%)
                     nRMSE  nMAE      nRMSE  nMAE      nRMSE  nMAE
British Columbia      2.46   1.81      2.29   1.69       6.9    6.6
California            3.67   2.57      3.38   2.37       7.9    7.8

the lowest in terms of both daily and weekly volatility indices. Consequently,

us

it is expected to have higher predictability for British Columbia’s power system compared to the micro-grid and California’s power system. Table 4 statistically

480

an

supports that the forecasting errors for British Columbia is lower than those for California, e.g., 2.46% compared to 3.67% in terms of nRMSE for WNN.

M

Secondly, Table 4 shows how effective the SRWNN becomes as the volatility of a time series increases. As seen from the last column of Table 4, the forecast ac-

d

curacy improvements obtained from SRWNN in terms of nRMSE and nMAE are

485

te

respectively (2.46-2.29)/2.46=6.9% and (1.81-1.69)/1.81=6.6% for BC’s power system. Similarly, there are 7.9% and 7.8% forecast accuracy improvements in

Ac ce p

terms of nRMSE and nMAE for California’s power system, respectively. Therefore, since the volatility of California’s power system is higher than that for British Columbia’s, SRWNN obtained higher improvement of forecast accuracy compared with WNN for California’s power system. In other words, California’s load

490

contains higher daily and weekly volatilities, and consequently, the SRWNN can capture these variations and present more accurate forecast results compared with WNN. To have a better sense of these percentage errors, forecast accuracy improvement in terms of mean absolute error is around 93 MW, which is almost twice as big as the capacity of Kumeyaay wind farm, i.e. 50 MW, located in San

28

Page 28 of 42

495

Diego, California [43]. As a result, it shows as the volatility of the time series

ip t

increases, the performance of SRWNN improves in comparison with WNN. As mentioned earlier in this section, load forecast accuracy can be improved using

cr

weather forecast data as exogenous inputs to the forecasting model. For instance, load forecasting models utilized in California ISO (CAISO) include weather fore-

casts, such as temperature, dew point, wind speed and cloud cover, for next 9 days

us

500

for 24 weather stations [44]. It is noted that including such exogenous inputs to

an

the model depends on the availability of the public data.

The computation time of the SRWNN model for the training phase is less

505

M

than 35 seconds for one day prediction for the test cases of this paper, which is measured on a hardware set of Mac Intel Core i5 2.7 GHz with 12 GB RAM.

d

Although this computation time is larger than that for WNN, i.e., less than 11

te

seconds, it is completely acceptable within a 24-hour decision making framework,

Ac ce p

and shows fast forecasting performance of the proposed method.

5. Conclusions

510

STLF is an important tool for reliable and economic operation of power sys-

tems as many operating decisions are based on load forecast, e.g., dispatch schedul-

ing of generating units, security assessment and demand side management. Likewise, precise STLF for a micro-grid can enhance the management of its renewable and conventional resources and improve the economics of energy trade with elec-

515

tricity markets. Considering volatile and non-smooth characteristics of load time series of micro-grids compared with power systems’ electricity load, a new fore29

Page 29 of 42

casting method is proposed to deal with such challenges in this paper. The pro-

ip t

posed method has the structure of a SRWNN as the forecasting engine, in which feedback loops have been added to a WNN so as to better capture nonlinear complexities of volatile time series. LM learning algorithm is implemented to train

cr

520

the SRWNN, i.e., adjusting the free parameters of the SRWNN. High volatility of

us

a micro-grid load was shown by defining a volatility criterion and comparing with the volatility of two power systems’ load data. The effectiveness of the proposed

525

an

forecasting method was demonstrated by real-world load data of a micro-grid and power systems. The results show that the proposed SRWNN model leads to more

M

accurate forecasts when a volatile time series prediction is of interest.

d

Appendix: Formulation of the training algorithm

te

The task of the forecasting engines is to learn the mapping function between

530

Ac ce p

a specified set of input/output pairs {(X1 , t1 ), (X2 , t2 ), ..., (XQ , tQ )}, known as training samples. Q indicates the number of training samples. Xq and tq are the q th input vector and the corresponding target output of the forecasting model, respectively. Mean squared error (MSE) is usually considered to be the performance index for the network. The MSE is calculated by Q

1 X 2 MSE = e, Q q=1 q

(eq = tq − yq )

(A.1)

where, yq is the output of the forecasting engine when Xq is fed as the input of the 535

forecasting engine. eq is the forecast error of the q th sample. 30

Page 30 of 42

The LM algorithm is an approximation of Newton’s method, in which the

(A.2)

cr

Pk+1 = Pk − (J ⊺ J + µI)−1 J ⊺ e

ip t

solution is updated as follows:

us

where P is the vector of the free parameters according to (6). k represents the iteration number, and I is the identity matrix. J is the Jacobian matrix composed of the first derivatives of the network errors with respect to all its free parameters

an

540

and J ⊺ J is the Hessian matrix. Considering (A.1) as the performance function

M

that should be minimized, the gradient of (A.1) can be shown as J ⊺ e. The main modification of the LM algorithm with respect to Newton’s method

zero in (A.2). When µ is large, the LM algorithm tends to gradient descent with a

te

545

d

is the parameter µ, such that the algorithm becomes the Newton’s method if µ is

small step size, i.e., (1/µ), while for small µ the LM algorithm tends to Newton’s

Ac ce p

method. Since the Newton’s method is faster and more accurate than the gradient descent, the aim is to shift toward Newton’s method as quickly as possible. Thus, µ is divided by a factor β (β > 1) after each successful step, i.e. reduction in

550

the MSE given in (A.1). On the contrary, µ is multiplied by the factor β when a tentative step increases the MSE. Therefore, the MSE is always reduced at each iteration of the algorithm [38]. The initial value for µ is usually considered 0.01

and β is usually set as 10. For further details regarding the LM training algorithm, the interested reader can refer to [38]. The implementation of the LM learning 555

algorithm on the SRWNN is proposed in the following.

31

Page 31 of 42

Since computation of the Jacobian matrix is the most important part of the LM

ip t

algorithm, it is required to determine the first derivative of the network errors with respect to each free parameter of (6) in the SRWNN, i.e., vj , wi , ai , bi , θi,j , and g.

(A.3)

i = 1, 2, ..., N

(A.4)

us

j = 1, 2, ..., M

an

∂(t − y) ∂e = = −xj , ∂vj ∂vj ∂e ∂(t − y) = = −Ψi , ∂wi ∂wi ∂(t − y) ∂Ψi ∂e = = −wi · , ∂ai ∂ai ∂ai

cr

The elements in the Jacobian matrix are calculated by the following equations.

M

560

i = 1, 2, ..., N

" # M M ∂Ψi X dψi,j Y ψ(ri,l ) , = · ∂ai dai j=1

(A.5)

(A.6)

d

l=1,l6=j

−ri,j dψi,j = · ψ ′ (ri,j ), dai ai

te

(A.7)

Ac ce p

where ψ ′ (.) is the derivative of the Morlet mother wavelet function. ∂e ∂(t − y) ∂Ψi = = −wi · , ∂bi ∂bi ∂bi

" # M M ∂Ψi X dψi,j Y = · ψ(ri,l ) , ∂bi dbi l=1,l6=j j=1

−1 dψi,j = · ψ ′ (ri,j ), dbi ai

i = 1, 2, ..., N

(A.8)

(A.9) (A.10)

32

Page 32 of 42

i = 1, ..., N , j = 1, ..., M

M Y ψi,j z −1 ∂Ψi ψ(ri,l ) = · ψ ′ (ri,j ) · ∂θi,j ai

us

∂(t − y) ∂e = = −1 ∂g ∂g

(A.13)

Therefore, the Jacobian matrix with the size of Q × NP can be computed using

an

565

(A.12)

cr

l=1,l6=j

(A.11)

ip t

∂e ∂Ψi = −wi · , ∂θi,j ∂θi,j

(A.3) to (A.13) and all free parameters of the SRWNN are updated using (A.2).

M

The procedure of the LM learning algorithm for training the SRWNN is summarized as follows:

parameters vj , wi , ai , bi , θi,j and g of the forecasting engine within their

te

570

d

1. Set the iteration number to 1, i.e., (k = 1). Randomly initialize the free

Ac ce p

allowable ranges for the first iteration P1 . 2. Present all xq s and compute the corresponding SRWNN outputs yq using (5). Moreover, compute the corresponding errors eq and the performance

index MSE using (A.1).

575

3. Compute the Jacobian matrix 4. Update the free parameters of the forecasting engine using (A.2) to obtain Pk+1.

5. Compute the performance index MSE using Pk+1. If the new MSE is smaller than the one computed in step 2, reduce the parameter µ by the 33

Page 33 of 42

580

factor β, and save Pk+1 . Otherwise, increase the parameter µ by multiply-

ip t

ing it to β and go back to step 3. 6. Increment k, i.e., (k = k+1). The training algorithm is terminated when the

cr

termination criterion is satisfied. Otherwise, go back to step 3. It is noted

585

us

that the termination criterion can be the maximum number of iterations. However, the early stopping technique, discussed in section 3.2, is used as

an

the termination criterion of the training algorithm in this paper as it can monitor the prediction ability of SRWNN forecast engine for the unseen

M

samples and terminate the training process in the best point with the least

Acknowledgements

te

590

d

validation error.

Partial support for this work came from the Canadian National Science and

Ac ce p

Engineering Research Council (NSERC) and the ENMAX Corporation under the Industrial Research Chairs program. Moreover, the authors would like to thank Dr. Hassan Farhangi and Dr. Ali Palizan of British Columbia Institute of Technology

595

(BCIT) for providing data and invaluable insight.

References

[1] Navigant research, 2013. URL: www.navigantresearch.com/research/microgrids. [2] J. Taylor, P. McSharry, Short-term load forecasting methods: An evaluation

34

Page 34 of 42

based on european data, IEEE Transactions on Power Systems 22 (2007) 2213–2219.

ip t

600

[3] T. Hong, M. Gui, M. Baran, H. Willis, Modeling and forecasting hourly elec-

us

Society General Meeting, 2010 IEEE (2010) 1–8.

cr

tric load by multiple linear regression with interactions, Power and Energy

[4] E. Paparoditis, T. Sapatinas, Short-term load forecasting: The similar shape functional time-series predictor, IEEE Transactions on Power Systems 28

an

605

(2013) 3818–3825.

M

[5] H. Hippert, C. Pedreira, R. Souza, Neural networks for short-term load

610

te

16 (2001) 44–55.

d

forecasting: a review and evaluation, IEEE Transactions on Power Systems

[6] Y. Wang, Q. Xia, C. Kang, Secondary forecasting based on deviation analy-

Ac ce p

sis for short-term load forecasting, IEEE Transactions on Power Systems 26 (2011) 500–507.

[7] E. Ceperic, V. Ceperic, A. Baric, A strategy for short-term load forecasting by support vector regression machines, IEEE Transactions on Power

615

Systems 28 (2013) 4356–4364.

[8] Y. Goude, R. Nedellec, N. Kong, Local short and middle term electricity load forecasting with semi-parametric additive models, IEEE Transactions on Smart Grid 5 (2014) 440–446.

35

Page 35 of 42

[9] A. Chaouachi, R. M. Kamel, R. Andoulsi, K. Nagasaka, Multiobjective intelligent energy management for a microgrid, IEEE Transactions on In-

ip t

620

dustrial Electronics 60 (2013) 1688–1699.

cr

[10] E. Mashhour, S. Moghaddas-Tafreshi, Integration of distributed energy re-

us

sources into low voltage grid: A market-based multiperiod optimization model, Electric Power Systems Research 80 (2010) 473–480.

[11] E. R. Sanseverino, M. L. D. Silvestre, M. G. Ippolito, A. D. Paola, G. L.

an

625

Re, An execution, monitoring and replanning approach for optimal energy

M

management in microgrids, Energy 36 (2011) 3429–3436. [12] A. Mohamed, V. Salehi, O. Mohammed, Real-time energy management

tions on Smart Grid 3 (2012) 1911–1922.

te

630

d

algorithm for mitigation of pulse loads in hybrid microgrids, IEEE Transac-

Ac ce p

[13] M. Eghbal, T. K. Saha, N. Mahmoudi-Kohan, Utilizing demand response programs in day ahead generation scheduling for micro-grids with renewable sources, 2011 IEEE PES Innovative Smart Grid Technologies Asia (ISGT) (2011) 1–6.

635

[14] P. Chan, W.-C. Chen, W. Ng, D. Yeung, Multiple classifier system for short term load forecast of microgrid, Proceedings of the 2011 International Conference on Machine Learning and Cybernetics (10-13 July, 2011) 1268– 1273.

36

Page 36 of 42

[15] N. Amjady, F. Keynia, H. Zareipour, Short-term load forecast of microgrids by a new bilevel prediction strategy, IEEE Transactions on Smart Grid 1

ip t

640

(2010) 286–294.

cr

[16] M. Shahidehpour, M. Khodayar, Cutting campus energy costs with hierar-

us

chical control, IEEE Electrification Magazine 1 (2013) 40– 56.

[17] A. Ahmad, M. Hassan, M. Abdullah, H. Rahman, F. Hussin, H. Abdullah, R. Saidur, A review on applications of ANN and SVM for building elec-

an

645

trical energy consumption forecasting, Renewable and Sustainable Energy

M

Reviews 33 (2014) 102 – 109.

[18] G. Escriva-Escriva, C. Alvarez-Bel, C. Roldan-Blay, M. Alcazar-Ortega,

forecasting based on building end-uses, Energy and Buildings 43 (2011)

Ac ce p

3112 – 3119.

te

650

d

New artificial neural network prediction method for electrical consumption

[19] A. H. Neto, F. A. S. Fiorelli, Comparison between detailed model simulation and artificial neural network for forecasting building energy consumption, Energy and Buildings 40 (2008) 2169 – 2176.

655

[20] S. Farzana, M. Liu, A. Baldwin, M. U. Hossain, Multi-model prediction and simulation of residential building energy in urban areas of chongqing, south west china, Energy and Buildings 81 (2014) 161 – 169. [21] R. K. Jain, K. M. Smith, P. J. Culligan, J. E. Taylor, Forecasting energy consumption of multi-family residential buildings using support vector regres37

Page 37 of 42

660

sion: Investigating the impact of temporal and spatial monitoring granularity

ip t

on performance accuracy, Applied Energy 123 (2014) 168 – 178. [22] D. Monfet, M. Corsi, D. Choiniere, E. Arkhipova, Development of an energy

cr

prediction tool for commercial buildings using case-based reasoning, Energy

665

us

and Buildings 81 (2014) 152 – 160.

[23] G. Escriva-Escriva, C. Roldan-Blay, C. Alvarez-Bel, Electrical consumption

an

forecast using actual data of building end-use decomposition, Energy and Buildings 82 (2014) 73 – 81.

M

[24] J. G. Jetcheva, M. Majidpour, W. P. Chen, Neural network model ensembles for building-level electricity load forecasts, Energy and Buildings 84 (2014)

[25] British

d

214 – 223.

te

670

Columbia

Institute

of

Technology,

2014.

URL:

Ac ce p

http://www.bcit.ca/microgrid/.

[26] H. Zareipour, K. Bhattacharya, C. A. Canizares, Electricity market price volatility: The case of Ontario, Energy Policy 35 (2007) 4739–4748.

675

[27] N. Amjady, F. Keynia, Short-term load forecasting of power systems by combination of wavelet transform and neuro-evolutionary algorithm, Energy 34 (2009) 46 – 57. [28] Q. Zhang, A. Benveniste, Wavelet networks, IEEE Transactions on Neural Networks 3 (1992) 889–898. 38

Page 38 of 42

680

[29] N. M. Pindoriya, S. N. Singh, S. K. Singh, An adaptive wavelet neural

ip t

network-based energy price forecasting in electricity markets, IEEE Transaction on Power System 23 (2008) 1423–1432.

cr

[30] J. Vermaak, E. Botha, Recurrent neural networks for short-term load fore-

685

us

casting, IEEE Transactions on Power Systems 13 (1998) 126–132.

[31] S. J. Yoo, J. B. Park, Y. H. Choi, Adaptive dynamic surface control of

an

flexible-joint robots using self-recurrent wavelet neural networks, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 36

M

(2006) 1342–1355.

[32] S. J. Yoo, J. B. Park, Y. H. Choi, Indirect adaptive control of nonlinear dynamic systems using self recurrent wavelet neural networks via adaptive

d

690

te

learning rates, Information Sciences 177 (2007) 3074–3098.

Ac ce p

[33] A. Tascikaraoglu, M. Uzunoglu, A review of combined approaches for prediction of short-term wind speed and power, Renewable and Sustainable Energy Reviews 34 (2014) 243 – 254.

695

[34] N. Amjady, F. Keynia, H. Zareipour, Wind power prediction by a new forecast engine composed of modified hybrid neural network and enhanced particle swarm optimization, IEEE Transactions on Sustainable Energy 2 (2011) 265–276. [35] S. S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice

700

Hall, 1999. 39

Page 39 of 42

[36] H. Chitsaz, N. Amjady, H. Zareipour, Wind power forecast using wavelet

ip t

neural network trained by improved clonal selection algorithm, Energy Conversion and Management 89 (2015) 588–598.

IEEE Transactions on Power Systems 25 (2010) 1519–1530.

us

705

cr

[37] L. Wu, M. Shahidehpour, A hybrid model for day-ahead price forecasting,

[38] M. T. Hagan, M. B. Menhaj, Training feedforward networks with the Mar-

an

quardt algorithm, IEEE Transactions on Neural Networks 5 (1994) 989–993. [39] L. Hernandez, C. Baladr´on, J. Aguiar, B. Carro, A. Sanchez-Esguevillas,

ral networks, Energies 6 (2013) 1385–1408.

d

710

M

J. Lloret, Short-term load forecasting for microgrids based on artificial neu-

te

[40] A. Pandey, D. Singh, S. Sinha, Intelligent hybrid wavelet models for shortterm load forecasting, IEEE Transactions on Power Systems 25 (2010)

Ac ce p

1266–1273.

[41] B.-L. Zhang, Z.-Y. Dong, An adaptive neural-wavelet model for short term

715

load forecasting, Electric Power Systems Research 59 (2001) 121–129.

[42] N. Amjady, A. Daraeepour, Mixed price and load forecasting of electricity markets by a new iterative prediction method, Electric Power Systems Research 79 (2009) 1329–1336. [43] Kumeyaay

720

wind

farm,

2014.

URL:

http://www.thewindpower.net/windfarm_en_2792_kumeyaay.php. 40

Page 40 of 42

[44] California

Independent

System

Operator,

2014.

URL:

Ac ce p

te

d

M

an

us

cr

ip t

http://www.caiso.com/1c57/1c578a8751b30.pdf.

41

Page 41 of 42

*Highlights (for review)

 

Ac ce p

te

d

M

an

us

cr



- Load forecasting for micro-grids is more challenging than conventional power system load forecasting.
- The electricity load of a building in a micro-grid is more volatile than that of a power system.
- The SRWNN model can efficiently capture the non-smooth behavior of building load.
- The superiority of the SRWNN forecasting model over the WNN increases as the volatility increases.

ip t


