Accepted Manuscript

Forecasting ground-level irradiance over short horizons: Time series, meteorological, and time-varying models

Gordon Reikard, Sue Ellen Haupt, Tara Jensen

PII: S0960-1481(17)30404-4
DOI: 10.1016/j.renene.2017.05.019
Reference: RENE 8782
To appear in: Renewable Energy
Received Date: 30 December 2016
Revised Date: 28 March 2017
Accepted Date: 5 May 2017
Forecasting ground-level irradiance over short horizons:
Time series, meteorological, and time-varying models

Gordon Reikard, U.S. Cellular
Sue Ellen Haupt, National Center for Atmospheric Research
Tara Jensen, National Center for Atmospheric Research

Revised, March 27, 2017
ABSTRACT

One of the key enabling technologies for integrating solar energy into the grid is short-range forecasting. Two issues have emerged in the literature. The first has to do with the relative merits of physics-based versus time series models. The second is how to parameterize short-term variability. One promising approach is time-varying parameter models. Time series models can be updated using moving windows. Meteorological models can be adjusted to match the data more closely. This study evaluates several types of models over forecast horizons ranging from 15 minutes to 4 hours, using data from two locations in the United States. The Weather Research and Forecasting (WRF) model is a state-of-the-art numerical weather prediction system. The Dynamic Integrated Forecast (DICast) system combines meteorological models with statistical adjustments. The primary time series model is the ARIMA. Several other techniques are also tested: cloud advection, smart persistence forecasts, and regression trees. Each type of model is found to have particular strengths and weaknesses. Among time series models, ARIMAs with time-varying coefficients are superior to fixed-coefficient methods. In a direct comparison of meteorological and time series models, the ARIMA is more accurate at short horizons, while the numerical weather prediction models are more accurate as the horizon extends. The convergence point, at which the two methods achieve similar degrees of accuracy, is in the range of 1-3 hours. Adjusting meteorological model output using statistical corrections at regular intervals, as in the DICast, consistently outperforms the alternatives at horizons of 2-4 hours, and is highly competitive at 1 hour.

Keywords: solar irradiance, meteorological models, time series models, forecasting
1. Introduction
One of the key enabling technologies for integrating solar energy into the grid is short-term forecasting. In most utilities in North America, balancing reserves, used to buffer imbalances between supply and demand, are calculated at the 1 hour horizon. The Federal Energy Regulatory Commission has also mandated 15 minute transmission scheduling to assist in integrating variable sources [1]. Some independent system operators are scheduling at horizons of as little as 5 minutes. Forecasts at somewhat longer horizons are used in operational planning, peak load matching, and switching between sources.

An extensive literature on forecasting has emerged over the last two decades. Several classes of forecasting models have demonstrated value over particular horizons [2]. At the shortest time scales, less than 15 minutes, models based on sky image data can often outperform other methods [3-8]. Time series models have also been found to perform well over short horizons, ranging from a few minutes to several hours. At longer horizons, on the order of 4 hours and beyond, meteorological or Numerical Weather Prediction (NWP) models have been found to yield the most accurate predictions [9-13].

Two issues have emerged in the literature. The first has to do with the relative merits of physics-based versus time series methods. In principle, meteorological models are attractive because they can capture the factors influencing ground-level irradiance: cloud cover, precipitation, humidity and aerosols. The corresponding disadvantage is that these variables may be difficult to predict accurately. In time series models, the causal factors are captured only implicitly, through lag coefficients. Despite this, they often predict more accurately at short horizons.

The second issue is how to model short-term variability. The atmosphere is known to have multifractal properties: it is characterized by high degrees of intermittency and irregular outliers [14-15]. One approach is time-varying models. Evidence from other fields, primarily econometrics, has demonstrated that stochastic parameter models can often predict more accurately than their fixed-coefficient counterparts [16-17]. Meteorological models can be made time-varying by adjusting the forecasts to more closely match the recent data.

This study evaluates meteorological and time series models over horizons ranging from 15 minutes to 4 hours. The models are reviewed in Section 2. The databases and forecasting methodology are reviewed in Section 3. The empirical findings are reported in Section 4. Further analysis is conducted in Sections 5-6. Section 7 concludes.
2. The forecasting models
Table 1 provides a glossary of the forecasting models. The meteorological forecasts were generated using systems developed at the National Center for Atmospheric Research (NCAR). These include two versions of the Weather Research and Forecasting model and the Dynamic Integrated Forecast system. Two cloud advection models are also tested. The primary time series model tested here is the well-known autoregressive, integrated, moving average (ARIMA) class. The other statistical models include persistence and regression forecasts based on the clearness index, i.e., the ratio of the irradiance reaching the surface of the earth to the irradiance impinging on the top of the atmosphere. The DICast and ARIMA incorporate time-varying parameters. The other techniques are fixed-coefficient models. [TABLE 1 ABOUT HERE]
2.1 Meteorological and combined models
The Weather Research and Forecasting (WRF) model uses the Navier-Stokes equations of fluid flow [18-19]. It simulates advection in the atmosphere using the initial conditions established by observations and boundary conditions from a global model, with a series of parameterizations for the unresolved and external processes [20].

A recent version known as WRF-Solar incorporates several new features that enhance its ability to predict irradiance [21]. The solar tracking algorithm accounts for changes in the eccentricity of the Earth's orbit and the obliquity of the Earth's axis. The model output includes diffuse and direct normal as well as global horizontal irradiance. Interpolation algorithms are used to account for irradiance between model runs, and a fast radiative transfer algorithm calculates the surface irradiance. A new parameterization is used to capture absorption and scattering of radiation by aerosols. Three-dimensional aerosols are allowed to interact with the cloud microphysics. The model also accounts for the feedbacks in shortwave irradiance from smaller clouds.

The forecasts are from two different versions of WRF-Solar. The first is the now-casting version of WRF-Solar (hereafter WRF-Solar-Now), which was run hourly at 9 km resolution over the contiguous United States. The initial and boundary conditions are calculated using the National Centers for Environmental Prediction (NCEP) Rapid Refresh model. The second version, WRF-Solar Day Ahead (hereafter WRF-Solar-DA), uses a higher resolution of 3 km. It is run once per day, targeting day-ahead decision points. This version is initialized with NCEP's High Resolution Rapid Refresh system for the first 15 hours.

Prior studies have found that models combining atmospheric physics and statistical adjustment can predict more accurately than meteorological models alone, over horizons of 1-5 hours [22-26]. The forecasts are from the Dynamic Integrated Forecast (DICast®) system, which adjusts NWP model output based on recent data [27]. Forecasts have been created for several measures of irradiance at selected locations where observational data was provided by private utilities [28, 13]. DICast uses a two-step process. The NWP model forecasts are post-processed using a Model Output Statistics (MOS) approach [29]: the model output is adjusted upward or downward to match the most recent actual values. Then, several of these adjusted forecasts are combined using optimized weights. In effect, DICast is an artificial intelligence system that continually updates the forecasts so as to more closely approximate the observed data.
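The two-step logic can be illustrated with a small sketch. The code below is not the DICast implementation; it is a minimal Python illustration, assuming a rolling mean-error bias correction against recent observations followed by a least-squares weighting of the corrected forecasts over a trailing window. The array shapes, the window length, and the non-negativity constraint on the weights are illustrative assumptions.

```python
import numpy as np

def bias_correct(forecasts, observations, window=168):
    """Shift each model's forecast by its mean error over a trailing window
    (a simple stand-in for a Model Output Statistics correction).
    forecasts: array of shape (T, n_models); observations: array of shape (T,)."""
    corrected = np.empty_like(forecasts)
    for t in range(forecasts.shape[0]):
        lo = max(0, t - window)
        if t == 0:
            corrected[t] = forecasts[t]
        else:
            bias = np.mean(forecasts[lo:t] - observations[lo:t, None], axis=0)
            corrected[t] = forecasts[t] - bias
    return corrected

def blend(corrected, observations, window=168):
    """Combine the corrected forecasts with non-negative weights that minimize
    squared error over the trailing window (illustrative, not the DICast scheme)."""
    n_models = corrected.shape[1]
    blended = np.empty(corrected.shape[0])
    for t in range(corrected.shape[0]):
        lo = max(0, t - window)
        if t == 0:
            weights = np.full(n_models, 1.0 / n_models)
        else:
            w = np.linalg.lstsq(corrected[lo:t], observations[lo:t], rcond=None)[0]
            w = np.clip(w, 0.0, None)
            weights = w / w.sum() if w.sum() > 0 else np.full(n_models, 1.0 / n_models)
        blended[t] = corrected[t] @ weights
    return blended
```

In this sketch the trailing window plays the role of the "recent data" used by DICast; the operational system optimizes its corrections and weights with its own procedures.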
2.2 Cloud advection models
Two cloud advection models are also tested. Analyses of satellite-derived advection schemes have found that these fill a gap between short-term sky-imaging methods and methods based on meteorological models alone, as well as capturing ramps in irradiance [30-31]. Empirical tests of satellite-derived advection forecasts have concluded that these are effective primarily within very short time frames [32-42].

The CIRACast algorithm identifies cloud fields from satellite observations [43-45]. The Colorado State University Cooperative Institute for Research in the Atmosphere (CIRA) team has developed an operational satellite-derived forecast for predicting irradiance over intervals of 15 minutes to 3 hours. The CIRA method utilizes real-time cloud imagery obtained from the Pathfinder Atmospheres Extended (PATMOS-x) retrieval suite, which is based on geostationary satellite observations [46]. The cloud motion is then predicted using wind forecasts derived from the Global Forecast System (GFS). Ground-level irradiance is computed using a radiative transfer model.

The second forecast uses the Multisensor Advection Diffusion (MADCast) algorithm of Auligné [47-48]. The method has three stages. The first is to retrieve observations and express them as cloud fractions. Irradiance is calculated using a radiative transfer model, under the assumption of clear skies. Departures between the irradiance measured by the satellites and the irradiance estimated by the model are computed. The cloud fraction is based on the difference between the two. The cloud fraction profiles are then interpolated to the model grid points. The interpolations are enhanced using information from multiple satellite platforms, to obtain better estimates of their horizontal and vertical resolutions. The second stage is to forecast the cloud fraction, using the dynamic core of the WRF model. The third stage is to convert these forecasts into ground-level irradiance.
2.3 Time series models
Time series models can be implemented using commercially available software, and the programming is straightforward. The ARIMA class of models is well-established [49]. Using standard notation, let $\varphi(L)$ be the autoregressive operator, represented as a polynomial in the backshift operator: $\varphi(L) = 1 - \varphi_1 L - \dots - \varphi_p L^{p}$. Let $\Phi(L)$ be the cyclical autoregressive operator, defined the same way. Let $\theta(L)$ be the moving average operator: $\theta(L) = 1 + \theta_1 L + \dots + \theta_q L^{q}$, and $\Theta(L)$ be the cyclical moving average operator. Let the superscript $\xi$ denote the order of differencing, and the superscript $\zeta$ denote the order of cyclical differencing. Let the superscript $f$ denote the cyclical frequency, for the hourly data, 24 hours. Let $Y_t$ denote ground-level irradiance and the $t$-subscript denote time variation. The model is then of the form:

$$(1-L)^{\xi}\,(1-L^{f})^{\zeta}\,Y_t = \frac{\theta_t(L)\,\Theta_t(L)}{\varphi_t(L)\,\Phi_t(L)}\,\varepsilon_t \qquad (1)$$
where $\varepsilon_t$ is the residual and the coefficients are stochastic.

The success of the ARIMA and related models in forecasting irradiance traces back to their ability to reproduce the diurnal cycle. An early review of time series techniques found that ARIMA models were able to outperform most of the alternatives, including neural networks and unobserved components models, which use trigonometric terms to capture cyclical behavior [50]. This has been confirmed in more recent studies [51].

The ARIMA can also include causal inputs, in which case it is referred to as a transfer function. Inputs that have been considered include a range of meteorological variables such as temperature, precipitation, cloud cover, and processed satellite images [52-56]. Let $M_t$ denote the input, and let $\lambda(L)$ and $\delta(L)$ denote the moving average and autoregressive polynomials for the input. The transfer function is of the form:

$$(1-L)^{\xi}\,(1-L^{f})^{\zeta}\,Y_t = \frac{\theta_t(L)\,\Theta_t(L)}{\varphi_t(L)\,\Phi_t(L)}\,\varepsilon_t + \frac{\lambda_t(L)}{\delta_t(L)}\,M_t \qquad (2)$$
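As an illustration of equations (1)-(2), the following sketch fits a seasonal ARIMA of the form used later in the study (one proximate lag, one cyclical lag, and cyclical differencing at the diurnal frequency) to an hourly irradiance series, with the clearness index as an optional exogenous input. It uses the statsmodels SARIMAX implementation on synthetic data; the exogenous term enters as a single regression coefficient rather than with its own lag polynomials, so it is only an approximation to the transfer function in equation (2).

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Illustrative hourly GHI series (synthetic diurnal cycle plus noise).
hours = pd.date_range("2015-01-17", periods=500, freq="H")
diurnal = np.clip(np.sin(2 * np.pi * (hours.hour - 6) / 24), 0, None)
ghi = pd.Series(800 * diurnal + 30 * np.random.randn(500), index=hours).clip(lower=0)

# ARIMA (1,0,0)(1,1,0) with a 24-hour cycle: one proximate AR lag, one cyclical
# AR lag, and first differencing at the diurnal frequency, as in equation (1).
arima = SARIMAX(ghi, order=(1, 0, 0), seasonal_order=(1, 1, 0, 24)).fit(disp=False)
print(arima.forecast(steps=4))   # 1- to 4-hour-ahead forecasts

# Simplified transfer function, equation (2): the same model with the clearness
# index (here a placeholder series) entering as an exogenous regressor.
clearness = pd.Series(np.clip(0.7 + 0.1 * np.random.randn(500), 0, 1), index=hours)
tf = SARIMAX(ghi, exog=clearness, order=(1, 0, 0),
             seasonal_order=(1, 1, 0, 24)).fit(disp=False)
```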
Other popular techniques include neural networks [57-65]. Training neural networks directly on irradiance has been found to forecast effectively only at very short horizons. Instead, the approach used by many of these studies has been to train the net on the clearness index. The advantage of this method is that it isolates the effect of aerosols and cloud cover, and implicitly takes into account the solar angle, which is computed explicitly. The corresponding disadvantage is that cloud cover can be highly intermittent, making it difficult to predict.

Two forecasts using the clearness index are also tested. These were run at NCAR as part of the Sun4Cast solar forecasting system [13], and are reproduced here. The first is simply a "smart" persistence forecast, which is often used as the baseline for the other forecasts to beat: the clearness index is assumed to be equal to its previous value, while the solar angle changes as computed for the specific time of day and day of year. The second is a regression tree forecast based on the open source Cubist software [53, 66]. These forecasts were run hourly, producing predictions at 15-minute intervals out to 3 hours. A sketch of the smart persistence logic is given below.
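The smart persistence baseline can be sketched as follows. This is not the Sun4Cast code; it is a minimal illustration that holds the clearness index at its last observed value and rescales it by the top-of-atmosphere irradiance at the forecast time, computed from a standard declination formula. The constants and function names are illustrative assumptions.

```python
import numpy as np

SOLAR_CONSTANT = 1361.0  # W/m2, approximate

def toa_horizontal_irradiance(day_of_year, hour, latitude_deg):
    """Approximate top-of-atmosphere irradiance on a horizontal surface (W/m2)."""
    decl = np.radians(23.45) * np.sin(2 * np.pi * (284 + day_of_year) / 365)
    hour_angle = np.radians(15.0 * (hour - 12.0))   # local solar time; ignores day rollover
    lat = np.radians(latitude_deg)
    sin_elev = (np.sin(lat) * np.sin(decl)
                + np.cos(lat) * np.cos(decl) * np.cos(hour_angle))
    return SOLAR_CONSTANT * max(sin_elev, 0.0)

def smart_persistence(ghi_obs, day_of_year, hour, horizon_hours, latitude_deg=38.33):
    """Persist the current clearness index and rescale by the future TOA irradiance."""
    toa_now = toa_horizontal_irradiance(day_of_year, hour, latitude_deg)
    if toa_now <= 0.0:
        return 0.0                                  # nighttime observation
    kt = ghi_obs / toa_now                          # observed clearness index
    toa_future = toa_horizontal_irradiance(day_of_year, hour + horizon_hours, latitude_deg)
    return kt * toa_future

# Example: 600 W/m2 observed at 10:00 local solar time on day 120, 1-hour-ahead forecast.
print(smart_persistence(600.0, 120, 10.0, 1.0))
```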
3. The databases and the forecasting methodology
3.1 The data
Seven data sets at two locations in the United States were used in the forecasting tests. Table 2 shows the datasets, the interval spanned, and the number of usable observations, i.e., observations for which there are meteorological model forecasts, less missing and nighttime values. The data are available at two resolutions, 15-minute and hourly. All the databases include global horizontal irradiance (GHI) and the clearness index. Irradiance is denominated in watts per meter squared (W/m2). The clearness index ranges between values of 0 and 1, where 1 indicates completely clear skies.

Figures 1-2 show the locations on maps. Four sites are from the Sacramento Municipal Utility District (SMUD) in California (latitude 38.33 N, longitude 121.28 W). The district spans about 1400 square kilometers. Site 67 lies east of Sacramento. Sites 68-69 are northeast of Sacramento, near the towns of Folsom and Roseville respectively. Site 70 is further south of the city, and west of the Interstate 5 highway. At the 15-minute resolution, the data run from January 2, 2015 through April 28, 2016, and consist of 17,701 to 19,636 usable values. At the hourly resolution, the data run from January 17, 2015 through June 6, 2016. The number of usable observations ranges from 3,056 to 5,232.

Three sites are at Brookhaven National Laboratory, in Upton, New York (latitude 40.52 N, longitude 72.53 W). The Long Island Solar Farm (LISF) is a 32-megawatt solar photovoltaic plant built as a joint venture between the Department of Energy and the Long Island Power Authority. Since the power plant spans only about 200 acres, these sites are closer together. The data run from February 6, 2015 through April 12, 2016. The number of usable values ranges from 3,659 to 3,815 for the hourly data, and 15,170 to 15,808 for the 15-minute data. [FIGURES 1-2 ABOUT HERE] [TABLE 2 ABOUT HERE]

Figures 3-4 show ground-level irradiance at the 24-hour resolution, at Sacramento site 69 and Brookhaven site 13. Irradiance is dominated by the diurnal cycle, but also shows high degrees of nonlinear variability. The data at Brookhaven are much more volatile, due to higher precipitation and cloud cover. [FIGURES 3-4 ABOUT HERE]

In all the data sets, there were large numbers of missing observations. Missing values are not a problem for the meteorological models, since all of the forecasts correspond with actual observations. However, they are an issue for the ARIMAs, which need to be estimated over continuous data streams. Several interpolation methods were tried. The optimal method for Sacramento was to estimate an ARIMA over the actual observations, and use the fitted values to fill in the missing data points. At Brookhaven, this was impossible, since with more missing observations, the likelihood function could not be maximized. Instead, the procedure was to use data from Brookhaven's solar bay station, where there were no missing values. The bay station is located less than 1 kilometer from the power plant. The observations for irradiance were similar to those at the three sites, making these values suitable to use as interpolations. A sketch of the gap-filling step for Sacramento is given at the end of this subsection.

In the tests of forecast accuracy, however, all the interpolated values were omitted. To ensure comparability, the time series forecasts are evaluated only for the same intervals as for the meteorological models.
226
The forecasting experiments for Sacramento were set up as follows. For the 15-minute data, the
TE D
225
following models were used: the smart persistence and regression tree, the WRF-Solar-Now, the ARIMA,
228
the transfer function, the CIRACast and MADCast, and a weighted average of the WRF-Solar-Now and
229
cloud advection models. For the hourly data, the models also include the WRF-Solar-DA, and the
230
DICast. Two measures of forecast accuracy are used, the mean absolute error (MAE), in W/m2, and the
231
root mean squared error (RMSE). The RMSE assigns a stronger penalty to large errors.
AC C
232
EP
227
The ARIMA models were specified as ARIMA (1,0,0)(1,1,0), i.e., the model is differenced at
233
the cyclical horizon; it includes one proximate lag and one lag corresponding to the diurnal cycle. For the
234
hourly data, the interval of differencing is of course 24 hours. For the 15-minute data, the interval of
235
differencing is 96 periods. Various specifications were essayed, including longer lags. However, the
236
simpler specification produced the lowest forecast errors.
ACCEPTED MANUSCRIPT 10 In the transfer function, the clearness index is used as an input. The issues involved in estimating
238
the transfer function were complex. First, missing values in the clearness index had to be interpolated. A
239
battery of interpolation algorithms was run; the best results were obtained using a regression with
240
stochastic coefficients. Spurious interpolations such as negative values or values in excess of unity were
241
constrained to lie between 0 and 1. Second, the clearness index itself needs to be forecasted. Several
242
methods were tried, but none were able to predict effectively beyond very short horizons. The model
243
used here was a regression on lags. A neural network produced very similar results.
In the time series models for the 15-minute data, the first 2,000 observations were used as a
SC
244
RI PT
237
training sample. The irradiance and clearness index series were then forecasted iteratively, over horizons
246
of 15, 30 and 45 minutes. In the tests for the hourly data, the first 500 observations were used as the
247
training sample, and the forecasts were run over horizons of 1-4 hours. In each instance, the models were
248
estimated over prior values, forecasted, then re-estimated over the most recent value and forecasted again,
249
until the end of the time series. All the predictions are true out-of-sample forecasts, in that they only use
250
data prior to the start of the horizon. The forecasts for horizons beyond one observation are for this
251
interval only, and skip the intervening values.
TE D
252
M AN U
245
Time-varying parameter regressions can be estimated either using a Kalman filter [67] or a moving window. With an unrestricted Kalman filter, the coefficients behave as a random walk, reducing
254
predictive accuracy, so the moving window was used instead. Narrower widths allow high degrees of
255
coefficient variation, while wider widths make the coefficients more inertial [68]. Preliminary tests were
256
run over a range of moving windows. For the 15-minute data, the lowest errors were found over window
257
width of 1400-1700 observations. The width used in the tests was 1580 observations (365 hours or
258
roughly 15 days). For the hourly data, the smallest errors were found for widths in the range of 400-600
259
hours. In the tests, a window width of 480 hours (20 days) was used.
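The rolling-window design can be sketched as follows, assuming the hourly configuration (a 480-observation window, the ARIMA (1,0,0)(1,1,0) specification, and a 24-hour cycle). This is an illustration of the evaluation loop, not the software used in the study; it also shows how the MAE and RMSE are computed from the resulting forecast errors.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

def rolling_arima_forecasts(ghi: pd.Series, window: int = 480, horizon: int = 1):
    """Re-estimate an ARIMA (1,0,0)(1,1,0)x24 over a moving window and record
    the h-step-ahead forecast at each origin (out-of-sample by construction)."""
    forecasts, targets = [], []
    for origin in range(window, len(ghi) - horizon):
        train = ghi.iloc[origin - window:origin]
        res = SARIMAX(train, order=(1, 0, 0),
                      seasonal_order=(1, 1, 0, 24)).fit(disp=False)
        forecasts.append(res.forecast(steps=horizon).iloc[-1])  # keep only step h
        targets.append(ghi.iloc[origin + horizon - 1])
    forecasts, targets = np.asarray(forecasts), np.asarray(targets)
    mae = np.mean(np.abs(forecasts - targets))
    rmse = np.sqrt(np.mean((forecasts - targets) ** 2))
    return forecasts, mae, rmse
```

The loop re-estimates the model at every origin, so it is slow but mirrors the design of the tests, in which each forecast uses only data prior to the start of the horizon.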
4. The forecasting tests
Because of the number of models tested, it is useful to report the salient findings up front. At horizons of 15-45 minutes, the ARIMA is generally superior. At horizons of 2-4 hours, the DICast achieves the most accurate forecasts. At 1 hour, the contest between the ARIMA and the DICast is close at the California sites, but the DICast is superior at Brookhaven. The ARIMA is generally more accurate than the WRF-Solar models at short horizons, but as the horizon extends, the meteorological models achieve greater accuracy.
4.1 Sacramento
Table 3 shows the results for the 15-minute data at Sacramento. Parts 1 and 2 report the MAE and RMSE respectively. At this resolution, the ARIMA easily achieves the most accurate forecasts in terms of the MAE. The average error for the ARIMA is 41.1 W/m2 at the 15-minute horizon, increasing to 53.6 W/m2 at 30 minutes and 60.2 W/m2 at 45 minutes. The error for the transfer function is slightly higher. At first sight, this might appear counterintuitive: including the clearness index should make the model more sensitive to cloud cover. However, including more terms on the right-hand side can cause the model to become "over-parameterized". The clearness index adds a second term for behavior that is already captured by the lags. As a result, the model terms are not independent and may interfere with each other, reducing predictive accuracy.

At the 15-minute horizon, the results from several of the other models are similar. The MAEs for the regression tree, WRF-Solar and CIRACast all lie in a range of 74.4 to 77.3 W/m2. However, at 30 and 45 minutes, the WRF-Solar-Now model is clearly superior. The MADCast model generates lower MAEs at 15 minutes, although at 45 minutes the two cloud advection models achieve similar degrees of accuracy. At all horizons, these models are consistently able to beat the smart persistence forecast. The regression tree is somewhat better, but the errors remain prohibitively high.
The findings for the RMSE confirm that the ARIMA and transfer function are more accurate at short horizons. When this measure is used, the transfer function is more competitive. Taking the average of all four sites, the transfer function error is only negligibly higher than the ARIMA at 15-30 minutes, although at 45 minutes the ARIMA is clearly better. At Site 67, the transfer function is actually slightly better at 30-45 minutes, but this is an anomalous result. The more typical finding is that the ARIMA and transfer function results are generally close, with the ARIMA slightly more accurate.

At 15 minutes, the weighted average of three models runs in third place, while the MADCast and persistence forecasts run fourth and fifth. The MADCast model continues to do reasonably well at 30-45 minutes, when it is substantially better than the CIRACast. The accuracy of the persistence forecast falls away very quickly. [TABLE 3 ABOUT HERE]
Table 4 shows the results for the hourly data. Using the MAE, at the 1 hour horizon, the contest between the DICast and the ARIMA is fairly close, with the DICast winning in two cases, while the ARIMA is more accurate in the other two. Averaging the four sites, the mean absolute error for the DICast is 42.8 W/m2, while the mean absolute error for the ARIMA is 46.5 W/m2. The other models all show much higher errors. The WRF-Solar-Now shows an MAE of 70.5 W/m2. The WRF-Solar-DA is more accurate: the MAE is 67.5 W/m2. The cloud advection models achieve MAEs in the 83-86 W/m2 range; the MADCast is slightly better than the CIRACast. The transfer function MAE averages 51.6 W/m2, higher than the ARIMA. Again, the smart persistence and regression tree do poorly, consistently showing the highest errors.

At 2 hours, the DICast comes in first, with an error of 55 W/m2. The ARIMA runs second, with an MAE of 60.3 W/m2. The WRF-Solar-DA runs third, and the WRF-Solar-Now runs fourth.

At 3-4 hours, the DICast wins unambiguously, with MAEs in the range of 55-57 W/m2, while the ARIMA MAE increases to 65.8 W/m2 and 68.7 W/m2. At the 3 hour horizon, the WRF-Solar-DA and the ARIMA are tied almost exactly. At 4 hours, however, the WRF-Solar-DA is slightly better.

There is of course some variation at the individual sites. For instance, at Site 68, the results for the WRF-Solar-DA, ARIMA and DICast are very similar at 2-3 hours. By comparison, at Site 69, the WRF-Solar-DA is better at all horizons beyond one hour, while at Site 70, the ARIMA is better for the first two hours. [TABLE 4 ABOUT HERE]
The findings for the RMSE, reported in Part 2 of Table 4, differ in certain respects, but on the whole produce similar conclusions. The contest is again primarily between the DICast and the ARIMA. Using an average of the four Sacramento sites, the ARIMA achieves the lowest error at the 1 hour horizon. The convergence point, at which the two methods achieve comparable degrees of accuracy, is 2 hours. The DICast is better at 3-4 hours. The WRF-Solar-DA and WRF-Solar-Now achieve comparable degrees of accuracy at 1 hour, but at all other horizons, the WRF-Solar-DA is better. The MADCast is slightly better than the CIRACast at 1 hour, but at these horizons the cloud advection models are not particularly competitive.

Figures 5-6 show the errors from the DICast and ARIMA for Sacramento site 69 in a scatterplot, with ground-level irradiance on the horizontal axis. For the DICast, the majority of the errors fall in an intermediate range, generally under 30 W/m2. However, a substantial number lie in a higher range, and there are occasional extreme outliers. For the ARIMA, the distribution is more diffuse. There is no obvious relationship between forecast accuracy and the level of irradiance. [FIGURES 5-6 ABOUT HERE]
4.2 Brookhaven
Tables 5-6 show the results for Brookhaven. Not all the models were available for this site. Instead, the methods used here are the smart persistence, the WRF-Solar-Now, CIRACast, MADCast, the weighted average, and the ARIMA. The transfer function also could not be tested, due to extended gaps in the clearness index.

At the 15 minute resolution, the ARIMA achieves the highest degree of accuracy, but the errors here are considerably higher than at the Sacramento sites. The weighted average achieves the second smallest error, followed by the WRF-Solar-Now. As the horizon increases, the accuracy of all the models falls away very quickly. The ARIMA does poorly, running fourth at 30 minutes. At 45 minutes, the WRF-Solar-Now does better than the alternatives. The accuracy of the CIRACast falls away very quickly between 30 and 45 minutes. The MADCast achieves similar degrees of accuracy at all three horizons. [TABLE 5 ABOUT HERE]

At the hourly resolution, the models are the smart persistence, MADCast, WRF-Solar-Now, WRF-Solar-DA, DICast and ARIMA. The DICast achieves the highest degree of accuracy at all horizons. The forecast error from the DICast increases sharply between 1 and 2 hours, but then levels off. The ARIMA places second at 1 hour, but again the accuracy falls away very rapidly. The WRF-Solar-DA runs in third place at 1 hour and second at 2-4 hours. The WRF-Solar-Now runs in fourth place at 1 hour and third thereafter. [TABLE 6 ABOUT HERE]

Figures 7-8 show the DICast and ARIMA errors for Brookhaven site 13. The errors for the DICast are visibly lower than for the ARIMA. [FIGURES 7-8 ABOUT HERE]
5. Findings from the forecasting tests
Among the statistical models, the ARIMA is superior to the two models based on the clearness index, and perhaps surprisingly to the transfer function as well. This outcome is at variance with other studies, so it requires some explanation. Several other studies comparing neural networks with ARIMAs have used long training periods, extended holdout periods for the forecasts, and fixed coefficients [70]. This procedure, however, massively understates the power of the ARIMA. Allowing regression coefficients to vary over time can capture a great deal of nonlinear variability. The main caveat is that the ARIMA works better at Sacramento. At Brookhaven, the ARIMA is much less effective, except at the shortest horizons. The higher errors from the regression tree are probably attributable to the use of fixed coefficients.

The tests also provide further evidence on how well statistical methods compare with meteorological models. At Sacramento, the convergence point between the WRF-Solar models and the ARIMA is in the range of 3 hours, using the MAE. At horizons beyond 3 hours, the WRF-Solar models are more accurate. At horizons of 15 minutes to 1 hour, ARIMA models are clearly superior. At 2 hours, the contest is much closer, with the ARIMA slightly better for the California data sets. Conversely, at Brookhaven, the convergence point appears to be about an hour. At all horizons beyond 1 hour, the WRF-Solar models are superior.

The contest between the ARIMA and the DICast is fairly close at short horizons. At the Sacramento sites, the ARIMA is competitive at 1 hour using the MAE, and over somewhat longer horizons using the RMSE. The DICast, however, is more accurate at all horizons at Brookhaven.
6. Regressions for the physics models
As a further gauge of the relative strengths of the meteorological and cloud advection models, the actual values were regressed on the model forecasts at 1 hour, in natural logs, using a grid-search correction for serial correlation. Tables 7-8 present these results for Sacramento and Brookhaven respectively. The coefficients are elasticities: they express the percent change in the data in response to the percent change in the model forecast. Ideally, the elasticity should be equal to 1. Rho (ρ) is the coefficient of fractional differencing, estimated by the serial correlation correction. Lower values of rho are consistent with less structure in the residuals. [TABLES 7-8 ABOUT HERE]
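The elasticity regressions can be illustrated with a simplified sketch: ordinary least squares of log observed irradiance on the log model forecast, whose slope is the elasticity. The grid-search correction for serial correlation and the fractional-differencing coefficient reported in Tables 7-8 are not reproduced here, so the sketch is only a first approximation under that assumption.

```python
import numpy as np
import statsmodels.api as sm

def elasticity_regression(observed, forecast):
    """OLS of log(observed) on log(forecast): the slope is the elasticity.

    Nighttime and zero values are dropped before taking logs. The serial-
    correlation correction used in the paper is omitted, so the standard
    errors here are optimistic."""
    observed = np.asarray(observed, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    keep = (observed > 0) & (forecast > 0)
    X = sm.add_constant(np.log(forecast[keep]))
    res = sm.OLS(np.log(observed[keep]), X).fit()
    return res.params[1], res.rsquared_adj
```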
The cloud advection models do not fit the data particularly well. The CIRACast elasticities range from 0.70 to 0.77. The adjusted R-square ranges from 0.65 to 0.71, indicating that a substantial share of the variance remains unexplained. The MADCast is only somewhat better: the elasticities range from 0.71 to 0.83 at Sacramento, and 0.73 to 0.76 at Brookhaven. The weighted average also shows poor results. The elasticities are in the range of 0.68 to 0.75.

The WRF models yield rather good results at Sacramento. The elasticities are close to unity at three of the sites, although site 67 shows a lower value. At Brookhaven, however, the WRF elasticities are considerably lower. Similarly, the R-squares are high at three of the Sacramento sites, but lower at Brookhaven.

The DICast shows the best results. At Sacramento, the elasticity for the DICast ranges from 0.95 to 1.06, averaging out to roughly unity. At Brookhaven, the elasticities are just short of 1. The R-squares are in the range of 0.88 to 0.91 at Sacramento, and 0.96 at Brookhaven. The constant terms and coefficients of fractional differencing are also lower, pointing to less serial correlation in the residuals.
7. Conclusions
Each model has particular strengths and weaknesses. Time series models are able to predict more accurately at the shortest horizons primarily because the data are dominated by dependence between proximate time points. As this dependence falls away, over longer horizons, the time series models become less effective. Expressed another way, the atmosphere has moved beyond the Lagrangian integral time scale at this point, i.e., the time span over which the atmosphere "remembers" its prior state. Among time series models, ARIMAs with stochastic parameters are superior to fixed-coefficient models. While the persistence and regression tree models showed much larger errors, this does not necessarily invalidate the idea of converting irradiance to an index. In this respect, there is an emerging literature on clear sky models, which quantify the impact of factors ranging from the air mass, to cloud cover and atmospheric turbidity [71]. A detailed comparison of clear sky models with ARIMAs, both using stochastic parameters, will be the topic of a future study.

Cloud advection models are not found to fit the data closely, and their predictive accuracy is generally lower than that of large-scale meteorological models. The reasons most likely have to do with the larger number of causal factors taken into account in numerical weather prediction systems.

Large-scale meteorological models do better for time scales beyond about 2 hours. The reason is their ability to predict the synoptic and mesoscale changes in cloud cover and weather patterns. Their main limitation is that they do not capture all the short-term variability in the data. Statistical adjustment enables the models to track the data more effectively. The DICast combines the physics embodied in meteorological models with the ability to adjust to changing atmospheric conditions.

In a direct comparison of time series and physics-based models, the ARIMA is more accurate at short horizons, while the meteorological models are more accurate as the horizon extends. The convergence points lie in a range of only 1-3 hours. Comparisons for wind and wave energy have found longer convergence points, in the range of 5-6 hours [72-74]. By implication, the expected integral time scale for wave energy is longer than for solar. The shorter convergence points may also reflect the fact that the WRF-Solar-Now model was run in nowcasting mode, updating every hour, which is seldom available for other NWP models. When NWP models are statistically adjusted, they are able to outperform all the alternatives at horizons of 2-4 hours, while they are competitive at the 1-hour horizon. The more general implication is that over horizons where the data exhibit high degrees of variability, enabling models to adapt to changing conditions raises forecast accuracy.
Acknowledgements
The authors thank Sacramento Municipal Utility District and Brookhaven National Laboratory for use of their solar irradiance measurements. They also thank the modelers who produced the forecasts used as a comparison for this work and Jared Lee for the maps used in Figures 1-2. Finally, the authors thank the reviewers, whose comments have made this a better paper.
REFERENCES

[1] Federal Energy Regulatory Commission (FERC), 2012. Order No. 764. Washington, DC, Federal Energy Regulatory Commission.
[2] Tuohy, A., Zack, J., Haupt, S.E., Sharp, J., Ahlstrom, M., Dise, S., Grimit, E., Möhlren, C., Lange, M., Casado, M.G., Black, J., Marquis, M., Collier, C., 2015. Solar forecasting: methods, challenges and performance. IEEE Power and Energy Magazine, 13, 50-59.
[3] Chow, C.W., Urquhart, N., Lave, M., Dominquez, A., Kleissl, J., Shields, J., Washom, B., 2011. Intra-hour forecasting with a total sky imager at the UC San Diego solar energy testbed. Solar Energy, 85, 2881-2893.
[4] Marquez, R., Coimbra, C.F.M., 2013. Intra-hour DNI forecasting based on cloud tracking image analysis. Solar Energy, 91, 327-336.
[5] Marquez, R., Pedro, H.T.C., Coimbra, C.F.M., 2013. Hybrid solar forecasting method uses satellite imaging and ground telemetry as inputs to ANNs. Solar Energy, 92, 176-188.
[6] Quesada-Ruiz, S., Chu, Y., Tovar-Pescador, J., Pedro, H.T.C., Coimbra, C.F.M., 2014. Cloud-tracking methodology for intra-hour DNI forecasting. Solar Energy, 102, 267-275.
[7] Chu, Y., Pedro, H.T.C., Coimbra, C.F.M., 2013. Hybrid intra-hour DNI forecasts with sky image processing enhanced by stochastic learning. Solar Energy, 98, 592-603.
[8] Chu, Y., Pedro, H.T.C., Li, M., Coimbra, C.F.M., 2015. Real-time forecasting of solar irradiance ramps with smart image processing. Solar Energy, 114, 91-104.
[9] Lorenz, E., Kuhnert, J., Heinemann, D., 2012. Overview on irradiance and photovoltaic power prediction. Weather Matters for Energy, 429-454.
[10] Kleissl, J., 2013. Solar Energy Forecasting and Resource Assessment. Academic Press.
[11] Mathiesen, P., Kleissl, J., 2011. Evaluation of numerical weather prediction for intra-day solar forecasting in the continental United States. Solar Energy, 85, 967-997.
[12] Zhang, J., Hodge, B.M., Lu, S., Hamann, H.F., Lehman, B., Simmons, J., Campos, E., Banunarayanan, V., Black, J., Tedesco, J., 2015. Baseline and target values for regional and point PV power forecasts: Toward improved solar forecasting. Solar Energy, 122, 804-819.
[13] Haupt, S.E., Kosovic, B., Jensen, T., Lee, J., Jimenez, P., Lazo, J., Cowie, J., McCandless, T., Pearson, J., Weiner, G., Alessandrini, S., Delle Monache, L., Yu, D., Peng, Z., Huang, D., Heiser, J., Yoo, S., Kalb, P., Miller, S., Rogers, M., Hinkleman, L., 2016. The SunCast Solar Power Forecasting System: The results of the public-private-academic partnership to advance solar power forecasting. NCAR Technical Report TN-526+STR, 307 pp. doi:10.5065/D6N58JR2.
[14] Schertzer, D., Lovejoy, S., Schmitt, F., Chigirinskaya, Y., Marsan, D., 1997. Multifractal cascade dynamics and turbulent intermittency. Fractals, 5, 427-471.
[15] Lovejoy, S., Schertzer, D., 2013. The Weather and Climate: Emergent Laws and Multifractal Cascades. Cambridge, Cambridge University Press.
[16] Granger, C.W.J., 2008. Non-linear models: Where do we go next – time varying parameter models? Studies in Nonlinear Dynamics and Econometrics, 12, Article 1. http://www.bepress.com/snde/vol12/iss3/art1
[17] Bunn, D.W., 2004. Modelling Prices in Competitive Electricity Markets. New York, Wiley.
[18] Dutton, J.A., 1976. The Ceaseless Wind: An Introduction to the Theory of Atmospheric Motion. New York, McGraw-Hill.
[19] Vallis, G.K., 2006. Atmospheric and Oceanic Fluid Dynamics: Fundamentals and Large-Scale Circulation. Cambridge, Cambridge University Press.
[20] Skamarock, W.C., Klemp, J.B., Dudhia, J., Gill, D.O., Barker, D.M., Duda, M.G., Huang, X.-Y., Wang, W., Powers, J.G., 2008. A description of the Advanced Research WRF Version 3. NCAR Technical Note NCAR/TN-475+STR, 113 pp. doi:10.5065/D68S4MVH.
[21] Jimenez, P., Hacker, J., Dudhia, J., Haupt, S.E., Ruiz-Arias, J., Gueymard, C., Thompson, G., Eidhammer, T., Deng, A., 2015. WRF-Solar: An augmented NWP model for solar power prediction. Model description and clear sky assessment. Bulletin of the American Meteorological Society, 97, 1249-1264. doi:10.1175/BAMS-D-14-00279.1.
[22] Bouzerdoum, M., Mellit, A., Massi Pavan, A., 2013. A hybrid model (SARIMA-SVM) for short-term power forecasting of a small-scale grid-connected photovoltaic plant. Solar Energy, 98, 226-235.
[23] Voyant, C., Muselli, M., Paoli, C., Nivet, M.L., 2014. Numerical Weather Prediction (NWP) and hybrid ARMA/ANN to predict global radiation. Energy, 39, 341-355.
[24] Voyant, C., Muselli, M., Paoli, C., Nivet, M.L., 2013. Hybrid methodology for hourly global radiation forecasting in Mediterranean area. Renewable Energy, 53, 1-11.
[25] Myers, W., Wiener, G., Linden, S., Haupt, S.E., 2011. A consensus forecasting approach for improved turbine hub height wind speed predictions. Proceedings of WindPower 2011, Anaheim, CA, May 24, 2011.
[26] Delle Monache, L., Eckel, F.A., Rife, D.L., Nagarajan, B., Searight, K., 2013. Probabilistic weather predictions with an analog ensemble. Monthly Weather Review, 141, 3489-3516. doi:10.1175/MWR-D-12-00281.1.
[27] Haupt, S.E., Mahoney, W.P., 2015. Wind power forecasting. IEEE Spectrum, November 2015, 46-52.
[28] Haupt, S.E., Kosovic, B., 2016. Variable generation power forecasting as a big data problem. IEEE Transactions on Sustainable Energy, forthcoming.
[29] Glahn, H.R., Lowry, D.A., 1972. The use of model output statistics (MOS) in objective weather forecasting. Journal of Applied Meteorology, 11, 1203-1211.
[30] Peng, Z., Yoo, S., Yu, D., Huang, D., 2013. Solar irradiance forecast system based on geostationary satellites. 2013 IEEE International Conference on Smart Grid Communications, 708-713.
[31] Peng, Z., Yoo, S., Yu, D., Huang, D., Kalb, P., Heiser, J., 2014. 3D cloud detection and tracking for solar forecast using multiple sky imagers. Proceedings of the 29th Annual ACM Symposium on Applied Computing, 512-517. Association for Computing Machinery.
[32] Perez, R., Kivalov, S., Schlemmer, J., Hemker, K., Renne, D., Hoff, T.E., 2010. Validation of short and medium-term operational solar radiation forecasts for the U.S. Solar Energy, 84, 2161-2172.
[33] Perez, R., Lorenz, E., Pelland, S., Beautharnois, M., Van Knowe, G., Hemker, K., Heinemann, D., Remund, J., Muller, S.C., Traumiller, W., et al., 2013. Comparison of numerical weather prediction solar irradiance forecasts in the U.S., Canada and Europe. Solar Energy, 94, 305-326.
[34] Beyer, H.G., Costanzo, C., Heinemann, D., 1996. Modifications of the Heliosat procedure for irradiance estimates from satellite data. Solar Energy, 56, 121-207.
[35] Bilionis, I., Constantinescu, E.M., Anitescu, M., 2014. Data-driven model for solar irradiation based on satellite observations. Solar Energy, 110, 22-38.
[36] Cros, S., Liandrat, O., Sébastien, N., Schmutz, N., 2014. Extracting cloud motion vectors from satellite images for solar power forecasting. Geoscience and Remote Sensing Symposium (IGARSS), 2014 IEEE International, 4123-4126.
[37] Stauffer, D.R., Seaman, N.L., 1990. Use of four-dimensional data assimilation in a limited area mesoscale model. Part I: Experiments with synoptic-scale data. Monthly Weather Review, 181, 1250-1277.
[38] Stauffer, D.R., Seaman, N.L., 1994. Multiscale four-dimensional data assimilation. Journal of Applied Meteorology, 33, 416-434.
[39] Tapakis, R., Charalambides, A.G., 2013. Equipment and methodologies for cloud detection and classification: A review. Solar Energy, 95, 392-430.
[40] Zagouras, A., Kazantzidis, A., Nikitidou, E., Argiriou, A.A., 2013. Determination of measuring sites for solar irradiance, based on cluster analysis of satellite-derived cloud estimations. Solar Energy, 97, 1-11.
[41] Hammer, A., Heinemann, D., Lorenz, E., Lückehe, B., 1999. Short-term forecasting of solar radiation: a statistical approach using satellite data. Solar Energy, 67, 139-150.
[42] Lorenz, E., Hammer, A., Heinemann, D., 2004. Short term forecasting of solar radiation based on satellite data. EUROSUN2004, ISES Europe Solar Congress, 841-848.
[43] Miller, S.D., Rogers, M.A., Heidinger, A.K., Laszlo, I., Sengupta, M., 2012. Cloud advection schemes for short-term satellite-based insolation forecasts. Proceedings of the World Renewable Energy Forum, Denver, CO, 17 May 2012, 1963-1967.
[44] Miller, S.D., Forsythe, J.M., Partain, P.T., Haynes, J.M., Bankert, R.L., Sengupta, M., Mitrescu, C., Hawkins, J.D., Vonder Haar, T.H., 2014. Estimating three-dimensional cloud structure via statistically blended satellite observations. Journal of Applied Meteorology and Climatology, 53, 437-455.
[45] Rogers, M.A., Miller, S.D., Haynes, J.M., Heidinger, A.K., Haupt, S.E., Sengupta, M., 2015. Improvements in satellite-derived short-term insolation forecasting: Statistical comparisons, challenges for advection-based forecasts, and new techniques. Sixth Conference on Weather, Climate, and the New Energy Economy, 95th American Meteorological Society Annual Meeting, Phoenix, AZ, 6 January 2015.
[46] Heidinger, A.K., Foster, M.J., Walther, A., Zhao, X., 2013. The Pathfinder Atmospheres Extended AVHRR climate data set. Bulletin of the American Meteorological Society. doi:10.1175/BAMS-D-12-00246.1.
[47] Auligné, T., 2014a. Multivariate minimum residual method for cloud retrieval. Part I: Theoretical aspects and simulated observations experiments. Monthly Weather Review, 142, 4383-4398. doi:10.1175/MWR-D-13-00172.1.
[48] Auligné, T., 2014b. Multivariate minimum residual method for cloud retrieval. Part II: Real observations experiments. Monthly Weather Review, 142, 4399-4415. doi:10.1175/MWR-D-13-00173.1.
[49] Box, G.E.P., Jenkins, G.M., 1976. Time Series Analysis: Forecasting and Control. San Francisco, Holden-Day.
[50] Reikard, G., 2009. Predicting solar radiation at high resolutions: A comparison of time series forecasts. Solar Energy, 83, 342-349.
[51] Nobre, A.M., Severiano, C.A., Karthik, S., Kubis, M., Zhao, L., Martins, F.R., Pereira, E.B., Rüther, R., Reindl, T., 2016. PV power conversion and short-term forecasting in a tropical, densely-built environment in Singapore. Renewable Energy, 94, 496-509.
[52] Lopez, G., Batlles, F.J., Tovar-Pescador, J., 2005. Selection of input parameters to model direct solar irradiance by using artificial neural networks. Energy, 30, 1675-1684.
[53] McCandless, T.C., Haupt, S.E., Young, G.S., 2014. Short term solar radiation forecasts using weather regime-dependent artificial intelligence techniques. 12th Conference on Artificial Intelligence: Applications of Artificial Intelligence Methods for Energy, Atlanta, GA, American Meteorological Society, J3.5.
[54] McCandless, T.C., Haupt, S.E., Young, G.S., 2016. A regime-dependent artificial neural network technique for short-range solar irradiance forecasting. Renewable Energy, 89, 351-359.
[55] McCandless, T.C., Young, G.S., Haupt, S.E., Hinkelman, L.M., 2016. Regime-dependent short-range solar irradiance forecasting. Journal of Applied Meteorology and Climatology, 55, 1599-1613.
[56] Fu, C.L., Cheng, H.Y., 2013. Predicting solar irradiance with all-sky image features via regression. Solar Energy, 97, 537-550.
[57] Mellit, A., 2008. Artificial intelligence technique for modeling and forecasting of solar radiation data: A review. International Journal of Artificial Intelligence and Soft Computing, 1, 52-76.
[58] Mellit, A., Massi Pavan, A., Lughi, V., 2014. Short-term forecasting of power production in a large-scale photovoltaic plant. Solar Energy, 105, 401-413.
[59] Notton, G., Paoli, C., Vasileva, S., Nivet, M.L., Canaletti, J.-L., Cristofari, C., 2012. Estimation of hourly global solar irradiation on tilted planes from horizontal one using artificial neural networks. Energy, 39, 166-179.
[60] Bhardwaj, S., Sharma, V., Srivastava, S., Sastry, O.S., Bandyopadhyay, B., Chandel, S.S., Gupta, J.R.P., 2013. Estimation of solar radiation using a combination of Hidden Markov Model and generalized fuzzy model. Solar Energy, 93, 43-54.
[61] Diagne, M., David, M., Lauret, P., Boland, J., Schmutz, N., 2013. Review of solar irradiance forecasting methods and a proposition for small-scale insular grids. Renewable and Sustainable Energy Reviews, 27, 65-76.
[62] Inman, R.H., Pedro, H.T.C., Coimbra, C.F.M., 2013. Solar forecasting methods for renewable energy integration. Progress in Energy and Combustion Science, 39, 535-576.
[63] Fernandez, E., Almonacid, F., Sarmah, N., Rodrigo, P., Mallick, T.K., Perez-Higueras, P., 2014. A model based on artificial neuronal network for the prediction of the maximum power of a low concentration photovoltaic module for building integration. Solar Energy, 100, 148-158.
[64] Almonacid, F., Pérez-Higueras, P.J., Fernández, E.F., Hontoria, L., 2014. A methodology based on dynamic artificial neural network for short-term forecasting of the power output of a PV generator. Energy Conversion and Management, 85, 389-398.
[65] Morf, H., 2014. Sunshine and cloud cover prediction based on Markov processes. Solar Energy, 110, 615-626.
[66] Quinlan, J.R., 1996. Improved use of continuous attributes in C4.5. Journal of Artificial Intelligence Research, 4, 77-90.
[67] Kalman, R.E., 1960. A new approach to linear filtering and prediction problems. Transactions of the American Society of Mechanical Engineers, Journal of Basic Engineering, 83D, 35-45.
[68] Rossi, B., Inoue, A., 2012. Out-of-sample forecast tests robust to the choice of window size. Journal of Business and Economic Statistics, 30, 432-453.
[70] Pedro, H.T.C., Coimbra, C.F.M., 2012. Assessment of forecasting techniques for solar power prediction with no exogenous inputs. Solar Energy, 86, 2017-2028.
[71] Reno, M.J., Hansen, C.W., Stein, J.S., 2012. Global Horizontal Clear Sky Models: Implementation and Analysis. Sandia National Laboratories report SAND2012-2389.
[72] Reikard, G., Pinson, P., Bidlot, J.R., 2011. Forecasting ocean wave energy: The ECMWF wave model and time series methods. Ocean Engineering, 38, 1089-1099.
[73] Foley, A.M., Leahy, P.G., Marvuglia, A., McKeough, E.J., 2012. Current methods and advances in forecasting of wind power generation. Renewable Energy, 37, 1-8.
[74] Reikard, G., Robertson, B., Bidlot, J.R., 2015. Combining wave energy with wind and solar: Short-term forecasting. Renewable Energy, 81, 442-456.
Table 1: Glossary of the Forecasting Models

Model | Description | Temporal resolution

Numerical Weather Prediction models
WRF-Solar-NOW | Weather Research and Forecasting model. Now-casting version, 9 km resolution. Initial and boundary conditions calculated using the National Centers for Environmental Prediction (NCEP) Rapid Refresh model. | 15 minutes, 1 hour
WRF-Solar-DA | Weather Research and Forecasting model. Day ahead, 3 km resolution. Initial and boundary conditions calculated using NCEP's High Resolution Rapid Refresh system for the first 15 hours, transitioning to the Rapid Refresh at longer horizons. | 1 hour
DICast | Dynamic Integrated Forecast system, using statistical adjustment and blending of multiple NWP model outputs. The model forecasts are post-processed to correct the bias using the recent forecast error. Then, several of the adjusted NWP forecasts are combined using optimized weights. | 1 hour

Cloud Advection Models
CIRACast | Combines cloud fields identified by satellite observations with wind forecasts from the NOAA Global Forecast System. | 15 minutes, 1 hour
MADCast | Multisensor Advection Diffusion algorithm. | 15 minutes, 1 hour

Combined Models
Weighted Average | Weighted average of the WRF-Solar-Now, CIRACast and MADCast. | 15 minutes, 1 hour

Time Series Models
ARIMA | Autoregressive, integrated, moving average model, with time-varying coefficients. Specification: ARIMA (1,0,0)(1,1,0). Coefficient variation estimated using a moving window. | 15 minutes, 1 hour
Transfer Function | ARIMA with the clearness index as an input. The clearness index is forecasted using a regression on proximate lags. | 15 minutes, 1 hour
Smart Persistence | The clearness index is set equal to its previous value, while the solar angle changes in relation to the specific time of day and day of year. | 15 minutes, 1 hour
Regression tree | The clearness index, forecasted using a regression tree. | 15 minutes, 1 hour
Table 2: The Data

Sacramento

15-minute resolution
Site | Start date | End date | Usable observations
Site 67 | January 2, 2015, 02:00 hours | April 28, 2016, 23:00 hours | 19441
Site 68 | January 2, 2015, 02:00 hours | April 28, 2016, 23:00 hours | 19523
Site 69 | January 2, 2015, 02:00 hours | April 28, 2016, 23:00 hours | 19536
Site 70 | January 2, 2015, 02:00 hours | April 28, 2016, 23:00 hours | 17701

Hourly resolution
Site 67 | January 17, 2015, 20:00 hours | June 6, 2016, 01:00 hours | 5232
Site 68 | January 17, 2015, 20:00 hours | June 6, 2016, 01:00 hours | 5305
Site 69 | January 17, 2015, 20:00 hours | June 6, 2016, 01:00 hours | 5215
Site 70 | January 17, 2015, 20:00 hours | April 14, 2016, 10:00 hours | 3056

Brookhaven

15-minute resolution
Site 13 | February 7, 2015, 01:00 hours | April 12, 2016, 14:00 hours | 15714
Site 18 | February 7, 2015, 01:00 hours | April 12, 2016, 14:00 hours | 15170
Site 24 | February 7, 2015, 01:00 hours | April 12, 2016, 14:00 hours | 15808

Hourly resolution
Site 13 | February 6, 2015, 20:00 hours | April 12, 2016, 14:00 hours | 3797
Site 18 | February 6, 2015, 20:00 hours | April 12, 2016, 14:00 hours | 3659
Site 24 | February 6, 2015, 20:00 hours | April 12, 2016, 14:00 hours | 3815
Table 3: Model Accuracy, 15-minute resolution, Sacramento

Part 1: The mean absolute error (watts per meter squared)
Forecast horizon: 15 minutes, 30 minutes, 45 minutes

Sacramento site 67
Smart Persistence: 86.5, 99.8, 106.2
Regression tree: 78.1, 85.7, 94.2
NWP-Solar-NOW: 67.6, 67.8, 66.5
CIRACast: 70.7, 71.4, 70.9
MADCast: 75.6, 81.9, 84.5
Weighted average: 77.7, 86.6, 88.6
ARIMA: 42.4, 54.8, 61.5
Transfer Function: 42.9, 57.2, 65.2

Sacramento site 68
Smart Persistence: 85.9, 98.5, 111.8
Regression tree: 73.6, 80.7, 88.3
NWP-Solar-NOW: 83.4, 80.6, 80.4
CIRACast: 84.8, 85.6, 84.6
MADCast: 70.8, 74.5, 80.8
Weighted average: 71.1, 84.7, 91.2
ARIMA: 41.1, 54.3, 60.7
Transfer Function: 42.7, 57.6, 65.6

Sacramento site 69
Smart Persistence: 81.6, 94.8, 105.7
Regression tree: 68.5, 75.2, 83.7
NWP-Solar-NOW: 68.1, 68.6, 68.5
CIRACast: 71.9, 72.8, 73.2
MADCast: 80.9, 83.1, 75.5
Weighted average: 72.5, 80.1, 85.7
ARIMA: 41.4, 54.2, 60.9
Transfer Function: 42.7, 56.9, 64.8

Sacramento site 70
Smart Persistence: 79.1, 92.2, 104.9
Regression tree: 67.6, 72.3, 81.6
NWP-Solar-NOW: 65.1, 64.3, 64.5
CIRACast: 74.3, 75.5, 77.3
MADCast: 60.6, 65.6, 69.5
Weighted average: 70.8, 79.4, 86.9
ARIMA: 42.2, 53.1, 60.7
Transfer Function: 43.4, 56.9, 65.1

Average, four sites
Smart Persistence: 83.3, 96.3, 107.2
Regression tree: 72.0, 78.5, 87.0
NWP-Solar-NOW: 71.1, 70.3, 70.0
CIRACast: 75.4, 76.3, 76.5
MADCast: 72.0, 76.3, 77.6
Weighted average: 73.0, 82.7, 88.1
ARIMA: 41.8, 54.1, 61.0
Transfer Function: 42.9, 57.2, 65.2

Part 2: The root mean squared error (watts per meter squared)
Forecast horizon: 15 minutes, 30 minutes, 45 minutes

Sacramento site 67
Smart Persistence: 126.5, 152.9, 157.5
Regression tree: 146.7, 158.4, 170.2
NWP-Solar-NOW: 159.9, 155.5, 148.7
CIRACast: 130.1, 155.2, 145.8
MADCast: 124.6, 123.8, 124.8
Weighted average: 117.4, 136.8, 130.1
ARIMA: 72.4, 92.7, 103.2
Transfer Function: 72.9, 91.5, 100.9

Sacramento site 68
Smart Persistence: 128.1, 142.8, 156.8
Regression tree: 133.4, 139.8, 150.7
NWP-Solar-NOW: 169.4, 172.4, 169.1
CIRACast: 135.2, 140.1, 166.1
MADCast: 115.7, 116.2, 115.4
Weighted average: 119.3, 125.1, 126.7
ARIMA: 71.8, 90.5, 99.6
Transfer Function: 72.4, 92.1, 102.7

Sacramento site 69
Smart Persistence: 121.1, 143.7, 158.3
Regression tree: 137.1, 150.4, 265.4
NWP-Solar-NOW: 140.1, 149.8, 161.1
CIRACast: 146.6, 156.7, 168.9
MADCast: 119.2, 123.3, 124.0
Weighted average: 111.2, 124.8, 129.7
ARIMA: 70.2, 88.3, 98.1
Transfer Function: 70.8, 89.9, 100.7

Sacramento site 70
Smart Persistence: 123.8, 145.5, 167.4
Regression tree: 127.7, 134.6, 145.2
NWP-Solar-NOW: 170.1, 167.2, 165.7
CIRACast: 129.2, 140.4, 163.1
MADCast: 129.5, 123.2, 130.4
Weighted average: 114.9, 126.9, 136.2
ARIMA: 72.0, 89.3, 99.1
Transfer Function: 72.5, 90.8, 110.7

Average, four sites
Smart Persistence: 124.9, 146.2, 160.0
Regression tree: 136.2, 145.8, 182.9
NWP-Solar-NOW: 159.9, 161.2, 161.2
CIRACast: 135.3, 148.1, 161.0
MADCast: 122.3, 121.6, 123.7
Weighted average: 115.7, 128.4, 130.7
ARIMA: 71.6, 90.2, 100.0
Transfer Function: 72.1, 91.1, 103.8
Table 4: Model Accuracy, hourly resolution, Sacramento

Part 1: Mean absolute error (watts per meter squared)
Forecast horizon: 1 hour, 2 hours, 3 hours, 4 hours (… = not available)

Sacramento 67
Smart Persistence: 120.4, 153.9, 182.7, 195.4
Regression tree: 103.9, 130.5, …, …
CIRACast: 88.5, 98.5, …, …
MADCast: 72.1, 73.8, 75.6, 78.1
Weighted average: 89.9, 91.7, 99.1, 90.3
NWP-Solar-NOW: 67.3, 65.6, 64.8, 64.3
NWP-Solar-DA: …, …, …, …
DICast: 38.9, 43.1, 44.5, 44.6
ARIMA: 46.6, 61.1, 65.5, 68.3
Transfer Function: 52.4, 70.6, 75.5, 80.2

Sacramento 68
Smart Persistence: 125.1, 158.8, 186.4, 198.6
Regression tree: 100.1, 128.2, …, …
CIRACast: 85.0, 91.2, …, …
MADCast: 87.9, 89.2, 90.1, 93.3
Weighted average: 95.1, 104.8, 112.7, 106.7
NWP-Solar-NOW: 66.1, 81.7, 80.8, 81.7
NWP-Solar-DA: 66.0, 65.6, 66.6, 67.1
DICast: 49.1, 66.2, 66.1, 66.8
ARIMA: 47.4, 62.1, 68.1, 71.8
Transfer Function: 53.4, 70.6, 79.4, 84.4

Sacramento 69
Smart Persistence: 130.5, 157.4, 187.2, 201.9
Regression tree: 93.3, 122.3, …, …
CIRACast: 90.1, 95.8, …, …
MADCast: 71.7, 73.2, 75.4, 77.9
Weighted average: 87.4, 92.4, 100.4, 93.2
NWP-Solar-NOW: 67.9, 67.1, 68.1, 66.1
NWP-Solar-DA: 49.8, 49.1, 49.9, 50.1
DICast: 36.2, 43.2, 44.1, 49.3
ARIMA: 48.5, 62.8, 69.1, 72.1
Transfer Function: 54.1, 70.1, 78.3, 83.4

Sacramento 70
Smart Persistence: 117.8, 156.3, 186.8, 202.1
Regression tree: 89.4, 113.3, …, …
CIRACast: 74.9, 77.6, …, …
MADCast: 75.2, 76.1, 77.1, 77.6
Weighted average: 90.7, 95.4, 103.6, 92.7
NWP-Solar-NOW: 67.3, 63.3, 63.1, 63.2
NWP-Solar-DA: 63.5, 61.5, 67.5, 67.1
DICast: 44.5, 58.9, 58.6, 58.9
ARIMA: 43.3, 55.9, 60.6, 62.7
Transfer Function: 49.9, 66.9, 67.4, 68.8

Average of four sites
Smart Persistence: 123.5, 156.6, 185.8, 199.5
Regression tree: 96.7, 123.6, …, …
CIRACast: 84.6, 90.8, …, …
MADCast: 76.7, 78.1, 79.6, 81.7
Weighted average: 90.8, 96.1, 104.0, 95.7
NWP-Solar-NOW: 67.2, 69.4, 69.2, 68.8
NWP-Solar-DA: 59.8, 58.7, 61.3, 61.4
DICast: 42.2, 52.9, 53.3, 54.9
ARIMA: 46.4, 60.5, 65.8, 68.7
Transfer Function: 52.5, 69.6, 75.2, 79.2

Part 2: Root mean squared error
Forecast horizon: 1 hour, 2 hours, 3 hours, 4 hours (… = not available)

Sacramento 67
Smart Persistence: 163.2, 199.1, 230.6, 247.4
Regression tree: 145.2, 174.4, …, …
CIRACast: 128.1, 144.9, …, …
MADCast: 114.1, 116.6, 119.2, 119.9
Weighted average: 136.2, 139.9, 151.8, 152.9
NWP-Solar-NOW: 114.3, 111.6, 116.4, 119.7
NWP-Solar-DA: …, …, …, …
DICast: 73.8, 74.9, 75.4, 76.1
ARIMA: 76.8, 96.7, 106.1, 112.5
Transfer Function: 90.7, 113.1, 123.8, 132.5

Sacramento 68
Smart Persistence: 177.8, 211.4, 241.2, 256.3
Regression tree: 147.7, 177.1, …, …
CIRACast: 129.8, 140.6, …, …
MADCast: 145.5, 146.4, 148.2, 149.6
Weighted average: 147.7, 162.3, 174.2, 181.2
NWP-Solar-NOW: 131.2, 140.5, 148.8, 148.7
NWP-Solar-DA: 131.1, 132.3, 135.6, 138.3
DICast: 90.2, 129.1, 129.7, 137.8
ARIMA: 77.1, 101.4, 113.8, 120.2
Transfer Function: 87.5, 114.8, 129.4, 137.6

Sacramento 69
Smart Persistence: 130.2, 203.6, 236.1, 255.2
Regression tree: 136.2, 169.1, …, …
CIRACast: 129.5, 142.7, …, …
MADCast: 112.8, 114.6, 116.9, 117.8
Weighted average: 121.9, 139.3, 153.2, 155.1
NWP-Solar-NOW: 91.2, 110.3, 108.1, 105.3
NWP-Solar-DA: 80.7, 83.9, 88.5, 91.5
DICast: 62.9, 63.8, 73.9, 79.3
ARIMA: 77.3, 100.7, 113.8, 118.0
Transfer Function: 90.8, 117.4, 129.4, 139.3

Sacramento 70
Smart Persistence: 180.9, 221.3, 251.4, 267.3
Regression tree: 159.3, 187.2, …, …
CIRACast: 131.9, 140.1, …, …
MADCast: 161.2, 160.7, 160.4, 161.5
Weighted average: 152.1, 174.9, 185.9, 185.9
NWP-Solar-NOW: 143.3, 153.7, 150.8, 146.3
NWP-Solar-DA: 156.1, 143.6, 143.7, 143.7
DICast: 120.8, 121.6, 123.6, 124.5
ARIMA: 74.6, 96.5, 104.5, 113.2
Transfer Function: 82.2, 105.2, 118.2, 126.3

Average of four sites
Smart Persistence: 163.0, 208.9, 239.8, 256.6
Regression tree: 147.1, 177.0, …, …
CIRACast: 129.8, 142.1, …, …
MADCast: 133.4, 134.6, 136.2, 137.2
Weighted average: 139.5, 154.1, 166.3, 168.8
NWP-Solar-NOW: 120.0, 129.0, 131.0, 130.0
NWP-Solar-DA: 122.6, 119.9, 122.6, 124.5
DICast: 86.9, 97.4, 100.6, 104.4
ARIMA: 76.5, 98.8, 109.6, 116.0
Transfer Function: 87.8, 112.6, 125.2, 133.9
Table 5: Model Accuracy, 15-minute resolution, Brookhaven
Forecast Horizon
Brookhaven 24 Smart Persistence NWP-Solar-NOW CIRACast MADCast Weighted average ARIMA
81.6 81.5 …
94.2 79.1 65.1
87.2 81.3
…
90.8 78.7 64.9
45 minutes
95.1 81.8 66.4
100.1 85.2 82.6 97.7 89.6 89.1
…
Brookhaven 18 Smart Persistence NWP-Solar-NOW CIRACast MADCast Weighted average ARIMA
30 minutes
87.9 84.2
Brookhaven 13 Smart Persistence NWP-Solar-NOW CIRACast MADCast Weighted average ARIMA
15 minutes
Site
Part 1: Mean absolute error (watts per meter squared)
109.1 86.8 93.3 96.8 93.2 101.8
93.9 83.4 83.5 96.1 86.4 88.4
101.1 84.3 93.8 95.2 89.4 100.9
93.5 82.1 84.4 91.7 85.1 88.2
101.1 82.9 95.1 90.9 87.7 100.5
Table 5, continued
Forecast Horizon
Site
15 minutes
Average of all sites Smart Persistence NWP-Solar-NOW CIRACast MADCast Weighted average ARIMA
45 minutes
95.8 83.6 83.5 95.2 87.0 88.6
103.8 84.7 94.1 94.3 90.1 101.1
30 minutes
45 minutes
147.7 135.1 124.8 155.8 134.1 136.1
156.5 134.7 149.3 155.7 135.7 139.6
142.6 142.2 125.3 154.9 132.2 134.1
149.5 143.1 149.8 155.1 132.9 138.5
85.6 82.3 93.4 79.9 65.5
Site
15 minutes
Brookhaven 13 Smart Persistence NWP-Solar-NOW CIRACast MADCast Weighted average ARIMA
Forecast Horizon
…
Part 2: Root mean squared error
132.9 126.5
…
155.4 126.5 103.7
125.2 138.1
…
Brookhaven 18 Smart Persistence NWP-Solar-NOW CIRACast MADCast Weighted average ARIMA
30 minutes
154.6 125.5 102.6
Table 5, continued
Forecast Horizon
Site
15 minutes
Brookhaven 24 Smart Persistence NWP-Solar-NOW CIRACast MADCast Weighted average ARIMA
126.2 137.6 150.7 124.6 102.3
Average of all sites Smart Persistence NWP-Solar-NOW CIRACast MADCast Weighted average ARIMA
140.9 139.6 127.3 150.2 129.6 133.1
148.4 139.9 152.9 150.8 130.1 138.3
143.7 139.0 125.8 153.6 132.0 134.4
151.5 139.2 150.7 153.9 132.9 138.8
128.1 134.1
45 minutes
…
30 minutes
153.6 125.5 102.9
Table 6: Model Accuracy, hourly resolution, Brookhaven
Forecast Horizon 2 hours
115.1 98.6 88.4 76.7 51.1 71.4
Brookhaven 18 Smart Persistence MADCast NWP-Solar-NOW NWP-Solar-DA DICast ARIMA
109.6 96.5 86.4 76.1 49.6 70.1
Brookhaven 24 Smart Persistence MADCast NWP-Solar-NOW NWP-Solar-DA DICast ARIMA
Brookhaven 13 Smart Persistence MADCast NWP-Solar-NOW NWP-Solar-DA DICast ARIMA
110.6 92.1 84.6 73.5 48.2 70.2
3 hours
4 hours
171.6 102.1 89.1 77.7 63.8 145.1
187.4 104.7 88.5 79.1 64.2 151.4
142.6 98.1 85.7 76.3 60.3 112.2
165.1 100.5 85.5 76.9 60.7 124.4
180.1 102.8 86.4 78.2 61.4 149.1
142.7 94.4 83.9 73.7 58.8 113.6
165.6 96.9 85.8 74.3 59.3 126.2
171.1 99.1 85.9 75.3 59.8 150.3
153.9 99.9 87.1 77.1 63.2 114.8
1 hour
Site
Part 1: Mean absolute error (watts per meter squared)
Table 6, continued Forecast Horizon Average of all sites Smart Persistence MADCast NWP-Solar-NOW NWP-Solar-DA DICast ARIMA
2 hours
Part 2: Root mean squared error Forecast Horizon
Brookhaven 18 Smart Persistence MADCast NWP-Solar-NOW NWP-Solar-DA DICast ARIMA
163.4 155.1 143.4 124.7 82.7 122.7
159.6 155.7 146.6 124.9 81.3 119.1
4 hours
167.4 99.8 86.8 76.3 61.3 131.9
179.5 102.2 86.9 77.5 61.8 150.3
3 hours
4 hours
200.6 155.5 140.8 125.1 101.8 179.3
226.8 158.1 144.1 125.6 102.9 198.8
248.7 159.4 144.3 127.6 103.8 219.8
196.2 156.6 145.3 125.2 99.2 172.4
222.1 159.1 146.7 125.7 99.4 186.5
243.1 160.6 146.6 127.4 100.6 215.7
2 hours
Brookhaven 13 Smart Persistence MADCast NWP-Solar-NOW NWP-Solar-DA DICast ARIMA
1 hour
Site
146.4 97.5 85.6 75.7 60.8 113.5
111.8 95.7 86.5 75.4 49.6 70.6
3 hours
1 hour
Site
Table 6, continued
Forecast Horizon 1 hour
2 hours
160.4 151.8 142.2 122.5 79.6 119.6
195.1 152.9 139.8 122.8 96.8 169.2
Average of all sites Smart Persistence MADCast NWP-Solar-NOW NWP-Solar-DA DICast ARIMA
161.1 154.2 144.1 124.0 81.2 120.5
197.3 155.0 142.0 124.4 99.3 173.6
Brookhaven 24 Smart Persistence MADCast NWP-Solar-NOW NWP-Solar-DA DICast ARIMA
3 hours
4 hours
220.2 156.8 144.3 123.1 97.3 185.7
240.5 157.3 144.9 124.8 98.1 217.6
223.0 158.0 145.0 124.8 99.9 190.3
244.1 159.1 145.3 126.6 100.8 217.7
Site
Table 7: Regression coefficients, Sacramento sites
CIRAcast
MADcast
Weighted Average NWP-Solar-NOW
NWP-Solar-DA
DICast
Site
Statistics are regression coefficients, unless otherwise indicated. Rho is the coefficient of fractional differencing.
Constant
0.31
0.84
0.82
Elasticity
0.89
0.82
0.74
R-bar-square
0.65
0.86
0.57
Rho
0.81
0.75
Sacramento 67
-0.39
0.75 …
1.05
0.81 …
0.90
0.71 …
0.11
0.86
1.21 …
Constant
1.45
Elasticity
0.71
R-bar-square Rho
Sacramento 68 1.04
0.89
-0.41
-0.41
0.16
0.78
0.75
1.03
1.02
0.95
0.69
0.84
0.59
0.87
0.87
0.89
0.91
0.91
0.89
0.89
0.81
0.77
Table 7, continued
CIRAcast
MADcast
Weighted Average NWP-Solar-NOW
Sacramento 69 Constant
1.05
0.71
0.86
Elasticity
0.77
0.83
0.72
R-bar-square
0.70
0.85
0.58
Rho
0.91
0.84
0.88
Constant
1.66
1.58
Elasticity
0.70
0.71
R-bar-square
0.66
Rho
0.92
NWP-Solar-DA
Site
-0.43
1.05
1.06
0.89
0.90
0.91
0.78
0.31
0.31
0.71
1.96
0.11
0.68
0.95
0.63
0.95
0.81
0.58
0.87
0.74
0.88
0.95
0.91
0.91
0.93
0.61
-0.43
0.96
1.43
Sacramento 70
-0.14
DICast
Table 8: Regression coefficients, Brookhaven sites
Statistics are regression coefficients, unless otherwise indicated. Rho is the coefficient of fractional differencing.
MADcast
Weighted Average NWP-Solar-NOW
Brookhaven 13 Constant
1.04
1.27
Elasticity
0.74
0.71
R-bar-square
0.84
0.79
Rho
0.91
0.84
DICast
Site
-0.05
0.71
0.96
0.79
0.95
0.84
0.44
1.27
1.30
-0.12
0.71
0.71
0.96
0.84
0.79
0.80
0.96
0.91
0.83
0.83
0.31
1.12
1.33
1.33
-0.06
0.73
0.71
0.71
0.97
R-bar-square
0.84
0.79
0.81
0.96
Rho
0.91
0.82
0.83
0.29
0.92
Elasticity
0.76
Rho
Constant Elasticity
Brookhaven 24
R-bar-square
Constant
Brookhaven 18
1.27
Figure captions and titles.
Title: Figure 1: Sites 67-70, Sacramento Municipal Utility District (SMUD).

Title: Figure 2: The Long Island Solar Farm, at Brookhaven National Laboratory, New York.

Title: Figure 3: Global Horizontal Irradiance at Sacramento site 69.
Caption: Left scale: watts per meter squared. Resolution: Hourly. Time span: May 1 to May 31, 2015. Source: Sacramento Municipal Utility District and National Center for Atmospheric Research.

Title: Figure 4: Global Horizontal Irradiance at Brookhaven site 13.
Caption: Left scale: watts per meter squared. Resolution: Hourly. Time span: May 1 to May 31, 2015. Source: Brookhaven National Laboratory.

Title: Figure 5: The DICast forecast error versus irradiance, Sacramento site 69.
Caption: Forecast horizon: 1 hour. Left scale: absolute error, in watts per meter squared. Horizontal scale: ground level irradiance, in watts per meter squared.

Title: Figure 6: The ARIMA forecast error versus irradiance, Sacramento site 69.
Caption: Forecast horizon: 1 hour. Left scale: absolute error, in watts per meter squared. Horizontal scale: ground level irradiance, in watts per meter squared.

Title: Figure 7: The DICast forecast error versus irradiance, Brookhaven site 13.
Caption: Forecast horizon: 1 hour. Left scale: absolute error, in watts per meter squared. Horizontal scale: ground level irradiance, in watts per meter squared.

Title: Figure 8: The ARIMA forecast error versus irradiance, Brookhaven site 13.
Caption: Forecast horizon: 1 hour. Left scale: absolute error, in watts per meter squared. Horizontal scale: ground level irradiance, in watts per meter squared.
Highlights

Meteorological and time series models are used to forecast. Models with fixed and time-varying coefficients are tested.

Time series models are more accurate at short horizons, while meteorological models are more effective at longer horizons.

The optimal model is the Dynamic Integrated Forecast (DICast) system, which combines meteorological models with statistical adjustments.
RENE-D-16-3668: Responses to the Reviewers
In responding to the reviewers, we face the following dilemma. Three of the reviews are highly favorable, one is moderately favorable, and one is critical. For this reason, we are responding to the more favorable reviews first. We have prepared detailed comments in response to the critical review; these come at the end.
Several reviewers commented on the titles and labels for the figures. We chose to put these into a separate document, since the titles and figure captions are typeset separately.
Reviewer #1:
Reviewer 1: The abstract should include a brief introduction of research motivation and purpose, some description of the research process and major methods used and some highlight on the conclusions. The research motivation and process were not summarized clearly. Response: We agree, and have revised the abstract.
Reviewer 1: The introduction should be reorganized to emphasize the significance and innovation of the study comparing to the other researchers. Response: We agree, and have revised the introduction.
Reviewer 1: In the models review sections (i.e. sections 2-6), it is suggested that we add a little more explanations on comparing the weakness of each model and the expected improvements on the combine usage of the models.
Response: We agree, and have inserted some additional text. The entire paper has been reorganized, as requested by one of the reviewers. The new Section 2, on the models, incorporates some material that was previously in the introduction. The evaluation of the strengths and weaknesses of the models comes in the conclusions, rather than in the main text. Further, in the new Section 6, we run sensitivity tests to evaluate the relationship between the models and the data. Reviewer 1: Please check if there's any typo on the End date in Table 1, and add label to figure 1 and figure 2. Response: The end dates did not always match in the 15-minute and 1-hour data sets, which accounts for the discrepancies in the dates and times. We could not add the labels directly to
the figures with the existing software, so we put these into a separate document. This issue is noted by several of the other reviewers as well. In the production process, Elsevier often typesets the figure and table titles and captions separately, and then inserts them. We chose to prepare a separate document with the titles and captions to facilitate this. This also makes it much easier to revise titles or captions.
Reviewer #4:
Reviewer 4: The assessment of the results was realized by using only two accuracy indicators: MAE and RMSE. The authors must also use some graphical methods to characterize the models performance such as scatterplots, boxplots, Taylor diagrams, etc.
Response: We agree, and have generated several additional graphs that illustrate the properties of the models. The titles and captions are in the separate document. Reviewer 4: Line 118: which data have been used to correct the forecast errors? Response: These calculations were done as part of the DICast system run at NCAR. We have revised the text accordingly.
Reviewer 4: Table 1: provide geographical coordinates of the stations mentioned in the table. Response: We put the latitude and longitude into the text, rather than the table. Since the sites are fairly close together, reporting this for each site would have been redundant. Further, one of the other reviewers requested a map, which we provided.
Reviewer 4: Add axis titles to figure 1 and figure 2.
Response: The titles and captions are in the separate document.
Reviewer #5:
Reviewer 5: Provide a map with the geography of the two locations. Response: Maps are included.
Reviewer 5: Clarify why cloud advection models are treated separately and not under the section 2, Meteorological models.
Response: Reviewer 3 makes the same point, and argues for a reorganization of the text with different subtitles. We have moved the section on cloud advection models forward, so that it follows directly after the other meteorological models.
Reviewer 5: Clarify if time series models are considered statistical methods.
Response: This issue appears to be one of semantics, and we have encountered it before. Time series models lie in a middle ground between what are usually referred to as statistical models and other fields such as econometrics. There is a great deal of overlap between the two. We have used the term statistical to differentiate time series methods from meteorological models. However, in the revised version, we have replaced the term statistical with time series at several junctures.
Reviewer 5: Table 3 should be provided as supplementary material. Graphs could be much more relevant.
Reviewer #3:
Response: We prefer to leave the tables as part of the paper. In other studies by some of the same authors, also published in Renewable Energy, extensive tabular material was provided to support the narrative. This did not in any way interfere with narrative coherence. However, since the other reviewers also mentioned this, we included a summary of the salient findings at the start of the new section on empirical findings. We considered using a graph or a table here, but decided instead to use a short text description.
Reviewer 3: The results reported in sections 8 and 9 are detailed, but there are no explanations regarding how and why models perform differently. The authors should provide a sensitivity analysis, or a similar procedure, to deepen the differences among the proposed models, and to find which parameters are the most important. Response: We agree, and this issue came up as we were drafting the paper. We have included some new text on this issue. In Section 6, we include two new tables, along with regressions of the data on the model forecasts. There is also an extended discussion of the results. These clearly indicate that the DICast output shows a closer relationship to the data than the meteorological and cloud advection model results. Reviewer 3: I suggest to provide a diagram rather than Table 2 (or both of them), to highlight the differences between meteorological, statistical (time series), and combined models.
Response: Here, we ran into some trouble. There was no obvious way to provide a visual guide to the models. I admit to a strong preference for English prose rather than visual images. Instead, after reading through these sections several times, we concluded that the text provides an adequate explanation for the material in Table 2. We did, however, expand considerably on the relative strengths and weaknesses of the models in the new Section 5.
Reviewer 3: I suggest to convert sections 2, 3, 4, and 5 into subsections, and to gather them under a section 2 named "Forecasting models" or a similar title. Then, I suggest to convert sections 6 and 7 into subsections, and to gather them under a section 3 called "Simulation initialization" or similar. Finally, I suggest to convert sections 8 and 9 into subsections, and to gather them under a section 4 called "Results and discussion" or a similar title.
Response: Reviewer 5 also argues for reorganizing some of the text. We did recognize in the initial drafting process that some of the titles read more like subtitles. We have reorganized the paper into long sections, with subtitles for the sub-sections.

Reviewer 3: In my opinion, the conclusions are too long. There should be a discussion section, or some findings should be directly reported in sections 8 and 9.

Response: This has been done as part of the overall reorganization. The discussion is in Section 5, and the conclusions are shorter.
Reviewer 3: I suggest reporting all of figures and tables in the main text, in proximity of their citation. This greatly helps both readers and reviewers. Figures 1 and 2 have no caption, and no labels with units of measurement are reported.
Response: We prefer to keep all the tables and figures in separate documents. These are typeset separately as part of the production process. Journals differ here with respect to this policy. Some journals encourage combining all the text, tables and figures into a single PDF, while others prefer to keep text, tables and figures in separate documents. Because of the number of additional figures and tables, we prefer to use the latter approach. Also, I do not have access to Latex software. Everything is written in Microsoft Word. Similarly, we prefer to place the figure titles and captions in a separate document. We hope that this does not unduly inconvenience the reviewers.
Reviewer 3: There is no Nomenclature section. Response: We did not consider this to be necessary. Most of the terminology used in the paper is fairly standard. Anyone in the solar energy community is familiar with acronyms like GHI, and in fact we always wrote out the names in prose before using the acronym. Similarly,
we wrote out the names of models like DICast in the text before switching to the acronym. Finally, in Table 2, we give the names of the models in acronym form, followed by a description of what the models do.
Reviewer #2:
Reviewer 2: The study has very weak originality and significance, and thus provided minimum useful information to the interested readers. Therefore, the manuscript is not recommended for publication.
Response: We note upfront that the other reviewers do not share this assessment. Reviewers 4 and 5 characterized the study as scientifically rigorous and original. We did note some similarity between this comment and the request by Reviewer 1, who requested a stronger justification for the paper upfront. For this reason, we have substantially revised the introduction, to place greater emphasis on the new material.
We acknowledge that some of the findings here build on previous studies, and provide additional support for known findings. However, this does not disqualify the paper. In this respect, a great deal of scientific progress takes place in the form of incremental studies, rather than quantum leaps. Although this study is more incremental in nature, we view it as original and useful for the following reasons:
1] The findings for meteorological, combined, cloud advection and time series models have rarely been compared so systematically, at the same sites and over the same data sets. By making this comparison more rigorous, we add new empirical findings to the literature.
2] There is an ongoing discussion in the literature on renewable energy as to the relative merits of physics-based and statistical or time series methods. This paper bears directly on these issues. It provides both systematic empirical evidence and insights as to the relative strengths and weaknesses of the two approaches.

3] There has been only limited discussion in the renewable energy literature as to the power of time-varying coefficient models. A great deal of this literature has used models with fixed parameters, or in the case of neural networks, fixed input and bias weights. However, it has been established in other fields, notably econometrics, that when the data is highly stochastic, time-varying parameter models are more effective. This study looks at this issue in greater depth than most of the existing works. The finding that a system which combines physics-based model output with statistical adjustments and artificial intelligence can produce more
accurate forecasts than meteorological models with fixed coefficients and in some instances ARIMAs with time-varying coefficients is a new one. This finding is worth making explicit.
Reviewer 2: The detailed reasons are itemized as follows. 1. Four conclusions were made from the provided data: A. The combined model is the best; B. ARIMA is better than other included time series methods; C. In very short horizons, statistical models are better than meteorological models; in long horizons, the opposite is true; D. Using time-dependent parameters instead fixed ones can improve the model accuracies. The conclusions A and C were not new, which simply reconfirm the findings presented in existing publications.
Response: This paper actually does go further than earlier publications with respect to findings A and C. Specifically, we quantify the transition points, i.e., the forecast horizons at which meteorological and combined models are able to achieve greater accuracy than time series models. It is interesting that in the case of solar energy, the transition point is relatively short, in the range of 2-3 hours. By comparison, in other areas like wave energy, the transition point is in the range of 5-6 hours. We make this point explicitly, and view it as a new contribution.
Reviewer 2: So, the only contributions of the study were the second and last conclusions. However, the last conclusion was undermined by the approach that the study took. The time-varying coefficients were used only in two models: DICast and ARIMA. Since DICast is the only model of its kind (the combined model), it is very difficult to tell whether its good accuracy is only due to the use of time-varying coefficients since the authors also claimed that combining of Meteorology models and time series models itself can improve the accuracy.
Response: We explicitly conclude in the revised paper that the DICast is superior primarily because of the statistical adjustment. We note that the weighted average reported in the tables does not do particularly well. In a more general sense, DICast has two key features. First, the model parameters are adjusted, as described in the text. This makes it superior to meteorological models with fixed parameters, cloud advection models and other techniques. Second, it incorporates information on weather and climate, which the ARIMA does not. The finding is that despite the fact that DICast and ARIMA both incorporate time-varying coefficients, DICast is better at all but the shortest horizons because it includes the meteorological information. Finally, a direct comparison of the ARIMA with the meteorological models with fixed parameters demonstrates that allowing the coefficients to vary over time makes the ARIMA very competitive. This
provides strong evidence that time-varying coefficients enhance predictive accuracy in this kind of environment.
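To make the principle described in this response concrete, the sketch below (Python) illustrates the two-step idea of statistically adjusting NWP output and then blending the adjusted forecasts with optimized weights. It is a schematic illustration only, not the NCAR DICast implementation: the one-week (168-hour) window, the simple mean-bias correction, the unconstrained least-squares weights, and the synthetic "model" series are all assumptions chosen for brevity.

    import numpy as np

    def adjust_and_combine(forecasts, observed, t, window=168):
        # Step 1: remove each model's mean signed error over the recent window.
        # Step 2: combine the adjusted forecasts with least-squares weights
        # fitted over the same window.
        lo = max(0, t - window)
        adjusted_hist, adjusted_now = [], []
        for f in forecasts:
            bias = np.mean(f[lo:t] - observed[lo:t])
            adjusted_hist.append(f[lo:t] - bias)
            adjusted_now.append(f[t] - bias)
        X = np.column_stack(adjusted_hist)
        weights, *_ = np.linalg.lstsq(X, observed[lo:t], rcond=None)
        return float(np.dot(adjusted_now, weights))

    # Toy usage with two synthetic "model" forecast series.
    rng = np.random.default_rng(1)
    obs = 600.0 + 100.0 * rng.standard_normal(500)
    model_a = obs + 40.0 + 30.0 * rng.standard_normal(500)   # biased high
    model_b = obs - 25.0 + 50.0 * rng.standard_normal(500)   # biased low, noisier
    print(adjust_and_combine([model_a, model_b], obs, t=400))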
Reviewer 2: The manuscript was loaded with data, which is encouraged. However, the authors just simply narrated observations from the data (especially when the conclusions did not apply to some data entries), instead of providing enough insights or explanations from the data. Therefore, there is no transformative information for the readers to either build a better model or use an existing model in a smarter way.
Response: Some of the other reviewers made similar points, although their phraseology was more supportive. We have expanded the discussion of the results, to take all these considerations into account, and believe that the recommendations do in fact contribute to building more effective models. The principle underlying the DICast – adjusting meteorological forecasts to more closely match the recent data – yields the highest degree of accuracy. Reviewer 2: The authors noticed that, in the RMSE data sheet of 15 min resolution at Site 67 in Sacramento, the transfer function was better than ARIMA, which conflicted with the second conclusion. A valid explanation will be very useful, probably even more useful than the data itself. However, the authors failed to provide one.
Response: This was an anomalous result, which occurred in one instance, at one site. It is clearly not typical. In other words, this is the exception that justifies the rule. However, we have provided some text as to why this should not be viewed as representative. Reviewer 2: The combined model is a key study object. However, the data from DICast was missing in all of the 15 min resolution data set.
Response: Unfortunately, there was nothing that could really be done about this. DICast simulations were not available at higher frequencies. We would have dearly loved to evaluate the model at the higher resolutions. Due to the limitations of the database, this will have to be left to a future study. However, since the ARIMA is often better than DICast at the 1-hour horizon, it is reasonable to assume that it will be superior at horizons of 15-45 minutes.
Reviewer 2: Relative errors in percentage may serve better than the mean absolute errors and root mean squared errors. Response: There have been several studies of the relative merits of different measures of forecast accuracy. For wind and wave energy, we would prefer the mean absolute percent error. However, for solar, one crucial problem is that the percent error can become very high during periods of low irradiance, i.e., early morning and late evening. The reason of course is
that while the error in W/m2 may be small, the low denominator can raise the percent error. Most of the literature has therefore preferred to use the mean absolute error and the RMSE. We followed this convention. Since there is already a great deal of tabular material, we did not want to expand the tables by one-third simply to provide a third measure of accuracy.
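A small numerical illustration of the denominator problem, using invented values rather than data from the study: a constant 20 W/m2 absolute error leaves the MAE and RMSE at 20 W/m2, but the single low-irradiance observation dominates the mean absolute percent error.

    import numpy as np

    obs = np.array([20.0, 200.0, 600.0, 850.0])    # observed GHI, W/m2
    fcst = np.array([40.0, 220.0, 620.0, 870.0])   # each forecast off by 20 W/m2
    err = fcst - obs
    mae = np.mean(np.abs(err))                     # 20.0 W/m2
    rmse = np.sqrt(np.mean(err ** 2))              # 20.0 W/m2
    mape = 100.0 * np.mean(np.abs(err) / obs)      # about 29 percent, driven by the 20 W/m2 point
    print(mae, rmse, mape)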