Marine Structures 16 (2003) 35–49
Forecasting wind with neural networks
Anurag More, M.C. Deo*
Civil Engineering, Indian Institute of Technology, Mumbai 400076, India
Received 12 February 2002; received in revised form 25 September 2002; accepted 3 October 2002
Abstract
Wind forecasts over a varying period of time are needed for a variety of applications in the coastal and ocean region, such as planning of construction and operation-related works as well as prediction of power output from wind turbines located in coastal areas. Such forecasting is currently done by adopting complex atmospheric models or by using statistical time-series analysis. Because the occurrence of wind in nature is extremely uncertain, no single technique can be entirely satisfactory. This leaves scope for alternative approaches. The present work employs the technique of neural networks in order to forecast daily, weekly as well as monthly wind speeds at two coastal locations in India. Both feed forward as well as recurrent networks are used. They are trained on past data in an auto-regressive manner using backpropagation and cascade correlation algorithms. A generally satisfactory forecasting performance, as reflected in higher correlation with and lower deviations from actual observations, is noted. The neural network forecasting is also found to be more accurate than traditional statistical time-series analysis.
© 2002 Elsevier Science Ltd. All rights reserved.
Keywords: Real-time forecasting; Wind analysis; Neural networks; Time-series; Wind prediction
1. Introduction
Forecasting of wind speed forms an essential input to many marine works ranging from planning, construction and operation-related activities in the oceanic area to prediction of power output from wind turbines and aircraft operations in the coastal area.
*Corresponding author. Tel.: +91-22-576-7330; fax: +91-22-576-7302. E-mail address: [email protected] (M.C. Deo).
0951-8339/03/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved. PII: S0951-8339(02)00053-9
A. More, M.C. Deo / Marine Structures 16 (2003) 35–49
Occurrence of wind is highly uncertain in time and space. Currently, wind forecasting over the ocean is made on the basis of mathematical models, which simulate atmospheric physical processes and are used in conjunction with data reported by merchant ships or rider buoys. While the simultaneous spatial information yielded by these models is advantageous, they require additional information apart from historical wind observations and are complex and tedious to apply, especially when point forecasts at specific stations are needed. A simpler alternative to them in the form of neural networks (also called artificial neural networks) is presented in this paper. A neural network is a technique used to map any random input vector to the corresponding random output vector without assuming any fixed relationship between them beforehand. Neural networks can learn from examples (past data), recognize a hidden pattern in historical observations and use it to forecast future values. It is this property of the network that forms the basis of the present application. Data error tolerance, ease of adaptation to on-line measurements and no need for additional information (other than the time-series history of wind speeds) are further advantages of the neural network approach over conventional forecasting schemes.
2. The network and its training
A neural network basically consists of interconnected neurons. Each neuron or node is an independent computational unit (Fig. 1) which works as per the following equation:
$$y = f\left[\sum (x_1 w_1 + x_2 w_2 + x_3 w_3 + \cdots) + \beta\right], \qquad (1)$$
Fig. 1. Working of a neuron. [Diagram labels: inputs x1, x2, x3; connection weights; summing junction Σ; threshold β; activation function f[.]; output.]
where $y$ is the output from the neuron; $x_1, x_2, x_3, \ldots$ are the input values; $w_1, w_2, w_3, \ldots$ are the connection weights; $\beta$ is the bias value; and $f$ is the transfer function, typically the sigmoidal function given by
$$f[\cdot] = \frac{1}{1 + e^{-[\cdot]}}. \qquad (2)$$
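As an illustration of Eqs. (1) and (2), the computation at a single neuron can be sketched in a few lines of Python. This is only a minimal sketch: the function names and the numerical input, weight and bias values are hypothetical and are not taken from the trained networks of this study.

```python
import math


def sigmoid(z):
    """Sigmoidal transfer function of Eq. (2)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias):
    """Output of one neuron as per Eq. (1): f[sum(x_i * w_i) + bias]."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return sigmoid(weighted_sum + bias)


# Hypothetical values, chosen only to exercise the function.
y = neuron_output(inputs=[0.5, 0.4, 0.6], weights=[0.2, -0.1, 0.3], bias=0.05)
print(y)
```

Because the sigmoid maps any real argument into the open interval (0, 1), the output of such a neuron always lies in that interval, which is one reason the input and output values have to be normalized before training.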
A typical neural network used in the present study is shown in Fig. 2. This is called a feed forward network, in which computations proceed along the forward direction only. There are three layers of neurons, namely the input, hidden and output layers. The output obtained from the output neurons constitutes the network output. When the time series has a strong memory, recurrent networks with backward connections become useful; an example of such a recurrent network is shown in Fig. 3. This type of network is called the Jordan type. The current study also involved another recurrent network, of Elman type, where instead of the single context neuron shown in Fig. 3 there are as many context neurons as there are hidden neurons. However, trials in the present study indicated that, for the application to wind forecasting, the Elman network did not give better results than the Jordan type shown in Fig. 3. The connection weights and bias values are initially chosen as random numbers and then fixed by the results of a training process. Many alternative training processes are available, out of which the present study adopted two popular schemes, namely back-propagation (BP) and cascade correlation (CC). The goal of any training algorithm is to minimize the global (mean sum squared) error $E$, defined as
$$E = \frac{1}{2} \sum (O_n - t_n)^2, \qquad (3)$$
Fig. 2. Typical feed forward network. [Diagram labels: input nodes Vt-2, Vt-1; hidden nodes; output node Vt; bias; connection weights.]
Fig. 3. Typical Jordan recurrent neural network. [Diagram labels: input nodes Vt-2, Vt-1; hidden nodes; context node; output node Vt; connection weights.]
where $O_n$ and $t_n$ are the network and target outputs for the $n$th output node. The summation is carried out over all output nodes for every training pattern. A pair of input and output values constitutes a training pattern. The BP algorithm calculates the error $E$ as per Eq. (3) and distributes it backward from the output to the hidden and input nodes. This is done using the steepest gradient descent principle, where the change in weight is directed towards the negative of the error gradient, i.e.
$$\Delta w_n = \alpha\, \Delta w_{n-1} - \eta\, \frac{\partial E}{\partial w}, \qquad (4)$$
where $w$ is the weight between any two nodes; $\Delta w_n$ and $\Delta w_{n-1}$ are the changes in this weight at the $n$th and $(n-1)$th iterations; $\alpha$ is the momentum factor; and $\eta$ is the learning rate. The learning rate governs the size of the weight change as per the effect of the weight on the total error. The momentum factor prevents weight oscillations during training iterations and also accelerates the training on flat error surfaces. In the current study these values were selected by varying them from 0.1 to 0.9 until convergence was reached, i.e. when further training cycles did not reduce the total error.
The CC training algorithm is aimed at evolving an optimum network architecture, and it starts without any hidden nodes. It follows these steps to complete the training:
1. Start with inputs and outputs only.
2. Train the network over the training data set by the gradient rule.
3. Add a new hidden node. Connect it to all input nodes as well as to the other already installed hidden nodes. Training of this node is based on maximization of the overall correlation $S$ given by
$$S = \sum_p \sum_o (V_p - \bar{V})(E_{p,o} - \bar{E}_o), \qquad (5)$$
where $V_p$ is the output of the new hidden node for pattern $p$; $\bar{V}$ is the average output over all patterns; $E_{p,o}$ is the network output error for output node $o$ on pattern $p$; and $\bar{E}_o$ is the average network error over all patterns. Pass the training patterns one by one and adjust the input weights of the new node after each pattern until $S$ does not change appreciably.
4. Once training of the new node is done, that node is installed as a hidden node of the network. The input side weights are frozen, and the output side weights are trained again.
5. Go to step 3, and repeat the procedure until the network attains a prespecified minimum error within a fixed number of training iterations.
Unlike the feed forward network, the recurrent network has connections in the backward direction from the output to the preceding nodes, which makes it suitable for accommodating past information in the case of dynamic systems. As mentioned earlier, the present study uses the Jordan type of feedback shown in Fig. 3. Details of the concept of neural networks and descriptions of various training algorithms can be found in books such as Kosko [1], Wu [2] and Yeh et al. [3].
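The three training quantities defined in Eqs. (3)-(5) can be sketched in Python as below. This is an illustrative sketch rather than the simulator code used in the study. The function names, the single-weight toy problem and the values of the learning rate and momentum factor (picked from within the 0.1 to 0.9 range mentioned above) are assumptions. The correlation function follows Eq. (5) as printed here, whereas some formulations of cascade correlation take the magnitude of the per-output sum before adding over output nodes.

```python
import numpy as np


def global_error(outputs, targets):
    """Eq. (3): half the sum of squared differences over the output nodes."""
    outputs = np.asarray(outputs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return float(0.5 * np.sum((outputs - targets) ** 2))


def momentum_update(grad, prev_delta, learning_rate, momentum):
    """Eq. (4): new weight change from the previous change and dE/dw."""
    return momentum * prev_delta - learning_rate * grad


def candidate_correlation(v, e):
    """Eq. (5) as printed: sum over patterns p and output nodes o.

    v has shape (P,): output of the candidate hidden node per pattern.
    e has shape (P, O): network output error per pattern and output node.
    """
    v = np.asarray(v, dtype=float)
    e = np.asarray(e, dtype=float)
    v_centered = v - v.mean()
    e_centered = e - e.mean(axis=0)  # average error of each output node
    return float(np.sum(v_centered[:, None] * e_centered))


# Toy problem (assumed): one linear weight w, one input x, one target t.
w, delta = 0.0, 0.0
x, t = 1.0, 1.0
initial_error = global_error([w * x], [t])
for _ in range(50):
    grad = (w * x - t) * x  # dE/dw for E = 0.5 * (w * x - t) ** 2
    delta = momentum_update(grad, delta, learning_rate=0.1, momentum=0.5)
    w += delta
final_error = global_error([w * x], [t])
print(initial_error, final_error)
```

In this toy run the error falls from its initial value as the iterations proceed, which is the behaviour the convergence check described above relies on.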
3. Development of the networks
Daily average values of wind speed, measured with a three-cup anemometer over a period of 12 years from 1989 to 2000, were available at the coastal location of Colaba within the Greater Mumbai (Bombay) region along the west coast of India. This is the only database used in the present study. The data were collected by the India Meteorological Department (IMD), Mumbai. The observations have very few gaps or missing values. In the present study, such gaps were filled by interpolating between neighboring values. The daily mean series of wind speeds was used to obtain weekly as well as monthly mean values, and the daily and averaged observations were then employed to make the corresponding forecasts of daily, weekly and monthly wind speeds. The objective was to develop neural networks which could forecast the wind speed over the next time step of 1 day, 1 week and 1 month based on the input of a chosen sequence of immediately preceding observations, as explained in the following sections.
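The data preparation described above can be sketched as follows. The sketch assumes linear interpolation for the gaps (the text says only that gaps were interpolated between neighboring values) and forms weekly means from non-overlapping 7-day blocks; monthly means would instead require grouping by calendar month. The function names and the synthetic daily series are hypothetical stand-ins for the IMD record.

```python
import numpy as np


def fill_gaps(daily):
    """Fill NaN gaps by linear interpolation between neighboring valid values."""
    daily = np.asarray(daily, dtype=float)
    idx = np.arange(daily.size)
    valid = ~np.isnan(daily)
    return np.interp(idx, idx[valid], daily[valid])


def block_means(series, block):
    """Average consecutive non-overlapping blocks, dropping a trailing partial block."""
    series = np.asarray(series, dtype=float)
    n_blocks = series.size // block
    return series[: n_blocks * block].reshape(n_blocks, block).mean(axis=1)


# Synthetic daily wind speeds in km/h (assumed values, 70 days, two gaps).
rng = np.random.default_rng(0)
daily = 30.0 + 10.0 * rng.random(70)
daily[[5, 20]] = np.nan

filled = fill_gaps(daily)
weekly = block_means(filled, 7)
print(weekly.shape)
```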
4. Monthly forecasts
4.1. Neural networks
A feed forward network was developed which takes a sequence of a few past values of monthly average wind speed as input and gives the forecasted speed for the subsequent month as output. The data pertaining to the first 10 years
was used for training while the observations of the last 2 years were reserved for testing of the network. The length of the input sequence was decided by trials, which showed that considering only the 2 preceding values is sufficient to arrive at the maximum possible prediction accuracy. Similar trials were used to select the number of hidden layers along with the number of neurons in each of them. This was necessary when training was done by the BP approach. In the case of the alternative CC scheme, hidden layers and neurons are fixed automatically during training, as explained in the previous section. The software used was the Stuttgart Neural Network Simulator [4], version 4.1. A marginal improvement in the results was obtained when the input–output values were normalized to the range (0.25, 0.75) instead of (0, 1). Fig. 2 shows the topology of the FF network trained using the BP algorithm. In this figure as well as in Fig. 3, $V_t$, $V_{t-1}$ and $V_{t-2}$ denote wind speeds at times $t$, $t-1$ and $t-2$, respectively. A few examples of the training pairs used are given in Table 1. The resulting matrix of trained weight and bias values is given in Fig. 4. The data of an entire year were assumed to be samples from the same population because a clear division of the data into monsoon (fierce) wind and non-monsoon (calmer) wind was not noticed. Separate training was thus found to be untenable. Performance of the trained network was judged by (i) comparing the time histories of predicted and observed sequences, (ii) drawing a scatter diagram of network predictions against the measured (or target) values, and (iii) computing average % errors in prediction when compared with the actual measurements. These comparisons are based on the testing set of data pertaining to the last 2 years, which were excluded from training.
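The normalization to the range (0.25, 0.75) mentioned above can be sketched as a linear (min-max) rescaling. The text does not state the exact scaling formula, so the linear form, the function names and the choice of the sample minimum and maximum as scaling limits are assumptions; the sample speeds are the monthly values that appear in Table 1.

```python
import numpy as np


def scale_to_range(values, lo=0.25, hi=0.75, vmin=None, vmax=None):
    """Linearly map values from [vmin, vmax] to [lo, hi]."""
    values = np.asarray(values, dtype=float)
    vmin = values.min() if vmin is None else vmin
    vmax = values.max() if vmax is None else vmax
    if vmax == vmin:
        raise ValueError("vmin and vmax must differ")
    return lo + (hi - lo) * (values - vmin) / (vmax - vmin)


def unscale(scaled, vmin, vmax, lo=0.25, hi=0.75):
    """Inverse of scale_to_range: map network outputs back to km/h."""
    scaled = np.asarray(scaled, dtype=float)
    return vmin + (scaled - lo) * (vmax - vmin) / (hi - lo)


speeds = np.array([60.0, 52.0, 69.0, 58.0, 47.0, 47.0, 35.0])  # km/h
scaled = scale_to_range(speeds)
restored = unscale(scaled, speeds.min(), speeds.max())
print(scaled.min(), scaled.max())
```

Keeping the scaled values away from 0 and 1 avoids the flat tails of the sigmoid of Eq. (2), which may be why the narrower range gave a marginal improvement; in practice the scaling limits would be fixed from the training data and reused on the testing data.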
While the scatter diagram is a visual aid to show the spread of predictions from the expected values, quantitative discrepancy is conventionally understood in terms of percentage errors in engineering applications. For the developed network mentioned earlier, Figs. 5 and 6 show the time history and scatter diagram based comparisons. (The underlying network is FF trained using BP.) The predictions can be seen to be fairly close to the corresponding actual measurements. The ups and downs in the observed time series appear to be modeled well in the predicted series. The average % error between the predicted and the observed series was as low as 4.7. It is to be noted that great prediction accuracy is difficult to achieve for an inherently random phenomenon such as wind. The
Table 1
Examples of training pairs (monthly forecasting)

Pair No.    Input          Output
1           60    52       69
2           52    69       58
3           69    58       47
4           58    47       47
5           47    47       35

Note: The numbers under the input and output columns indicate wind speeds in km/h.
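The training pairs of Table 1 follow from sliding a window of two consecutive monthly values along the series and taking the next value as the target. A minimal sketch of that construction is given below; the function name is hypothetical, and the seven-value series is simply the sequence of speeds that appears in Table 1.

```python
def make_training_pairs(series, n_inputs):
    """Build (inputs, target) pairs from a series by a sliding window."""
    pairs = []
    for start in range(len(series) - n_inputs):
        inputs = tuple(series[start:start + n_inputs])
        target = series[start + n_inputs]
        pairs.append((inputs, target))
    return pairs


monthly = [60, 52, 69, 58, 47, 47, 35]  # km/h, values from Table 1
pairs = make_training_pairs(monthly, n_inputs=2)
for number, (inputs, target) in enumerate(pairs, start=1):
    print(number, inputs, target)
```

Changing `n_inputs` gives the longer input sequences that were tried when the length of the input sequence was decided by trials.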
Fig. 4. Network Topology of BP Network showing weights and bias at each node.
Fig. 5. Time histories of forecasted and observed monthly wind speed (neural network, FF-BP). [Plot: monthly wind speed (km/h) against month, Apr-98 to Jan-01; series: predicted and actual.]
Fig. 6. Scatter of forecasted and observed monthly average wind speed (neural network, FF-BP). [Plot: observed speed (km/h) against predicted speed (km/h); R = 0.82.]
Table 2
Forecasting schemes and their performances

Forecast   Scheme used      Network type    Training algorithm   Topology used   Mean % error
Month      Neural network   Feed forward    BP                   2–3–12–3–1      4.7
           Neural network   Feed forward    CC                   3–1             4.5
           Neural network   Recurrent       JE                   4–6–7–6–1       4.3
           Time series      ARIMA           ARIMA (2,12,2)       -               5.9
Week       Neural network   Feed forward    BP                   5–20–1          6.0
           Neural network   Feed forward    CC                   1–1             5.4
           Neural network   Recurrent       JE                   5–4–4–1         6.1
           Time series      ARIMA           ARIMA (1,52,1)       -               8.6
Daily      Neural network   Feed forward    BP                   4–5–1           7.0
           Neural network   Feed forward    CC                   3–1             6.3
           Neural network   Recurrent       JE                   4–15–1          7.5
           Time series      ARIMA           ARIMA (2,365,2)      -               11.5

Note: BP = back-propagation; CC = cascade correlation; JE = Jordan Elman; ARIMA = auto-regressive integrated moving average; r = correlation coefficient.
accuracy level reached in the present on-line wind forecasting model is sufficient for taking operational decisions related to marine works, such as planning of harbor works. The monthly forecasting based on FF-BP described above was repeated using the CC scheme of training. This was done to see whether the training improves on adopting a different algorithm. Results of the testing of this network are shown in Table 2, and they indicate almost the same performance for the CC training method (average error = 4.5%).
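The two numerical measures used to judge the forecasts, the average % error and the correlation coefficient R shown on the scatter diagrams, can be sketched as below. The paper does not give the formula for the average % error, so the mean absolute percentage form used here is an assumption, as are the function names and the sample values.

```python
import numpy as np


def mean_percent_error(predicted, observed):
    """Mean absolute error as a percentage of the observed values (assumed form)."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.mean(np.abs(predicted - observed) / np.abs(observed)) * 100.0)


def correlation(predicted, observed):
    """Pearson correlation coefficient between predictions and observations."""
    return float(np.corrcoef(predicted, observed)[0, 1])


# Hypothetical predicted and observed monthly speeds in km/h.
observed = [40.0, 50.0, 35.0, 45.0]
predicted = [42.0, 48.0, 36.0, 44.0]
print(mean_percent_error(predicted, observed), correlation(predicted, observed))
```

Both measures would be computed on the testing set only, i.e. on the last 2 years of data that were excluded from training.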
As explained earlier, the Jordan network can act as a ‘time delay’ model and bring in the effect of dependency between neighboring values in a sequence of observations. A recurrent network with 2 inputs, 2 hidden layers of 5 hidden neurons each and one output neuron was developed. The testing results showed its performance to be similar to that of the earlier FF-BP network in that the underlying mean error was 4.9%. Kalogirou et al. [5] had earlier obtained higher accuracy in monthly flow forecasting using neural networks at 2 sites in Cyprus. However, their testing database was half the size of the present one.
4.2. ARMA model
Neural networks discussed above can be looked upon as a highly generalized form of non-linear time-series model used for forecasting. Their generality arises from the self-learning achieved in a ‘model-free’ environment. Traditional stochastic time-series schemes like AR, MA, ARMA and ARIMA are also commonly used for forecasting a random variation, although they are model based and hence could be less flexible in fitting to data. It was therefore examined how the networks perform vis-à-vis a representative time-series model. The ARIMA model is widely used for many engineering prediction problems of the short, medium and long range type. Although linear, it is easy to develop, apply and interpret, and it does not need assumptions such as stationarity over time. It is to be noted that application of both NN and stochastic time-series models is justified for the problem of wind prediction in that both schemes involve a widely accepted assumption that all causative factors are implicitly accounted for in the sequence of occurrence itself, and hence physical modeling is equivalent to time sequence modeling.
A brief outline of the ARIMA model is as follows. ARMA($p$, $q$) models, where $p$ and $q$ are the auto-regressive and moving-average orders, respectively, describe each observation of the time series as a weighted sum of $p$ previous data and (the current as well as) $q$ previous values of a white noise process:
$$x_t = \phi_1 (x_{t-1} - \mu_x) + \phi_2 (x_{t-2} - \mu_x) + \cdots + \phi_p (x_{t-p} - \mu_x) + \eta_t + \theta_1 \eta_{t-1} + \theta_2 \eta_{t-2} + \cdots + \theta_q \eta_{t-q} + \mu_x, \qquad (6)$$
where $x_t, x_{t-1}, x_{t-2}, \ldots$ are the values of the wind speed at times $t, t-1, t-2, \ldots$; $\eta_t, \eta_{t-1}, \eta_{t-2}, \ldots$ are the white noise series, i.e. zero-mean random variable values at $t, t-1, t-2, \ldots$ that are not correlated with the past values of $x_t$; $\phi_1, \ldots, \phi_p$ and $\theta_1, \ldots, \theta_q$ are the auto-regressive and moving-average parameters, respectively; and $\mu_x$ is the mean of the time series. In ARIMA type models we assume that the series could be non-stationary but that its differenced version, $\Delta^d x_t$, is stationary, where $\Delta^d x_t$ denotes $d$th order differencing of $x_t$. As an example:
$$\Delta^2 x_t = (x_t - x_{t-1}) - (x_{t-1} - x_{t-2}). \qquad (7)$$
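The differencing of Eq. (7) and a one-step-ahead use of Eq. (6) can be sketched as follows. This is an illustrative sketch and not the SAS procedure used in the study: the function names, the parameter values and the short sample series are hypothetical, and the one-step prediction sets the current noise term to its zero mean.

```python
import numpy as np


def difference(series, d=1):
    """d-th order differencing; d = 2 reproduces Eq. (7)."""
    x = np.asarray(series, dtype=float)
    for _ in range(d):
        x = x[1:] - x[:-1]
    return x


def arma_one_step(past_x, past_noise, phi, theta, mean):
    """Expected x_t from Eq. (6), with the current noise term set to zero.

    past_x[-1] is x_(t-1) and past_noise[-1] is the noise value at t-1.
    """
    ar_part = sum(phi[i] * (past_x[-1 - i] - mean) for i in range(len(phi)))
    ma_part = sum(theta[j] * past_noise[-1 - j] for j in range(len(theta)))
    return mean + ar_part + ma_part


second_diff = difference([1.0, 4.0, 9.0, 16.0], d=2)
forecast = arma_one_step(
    past_x=[10.0, 12.0],      # x_(t-2), x_(t-1)
    past_noise=[0.5, -1.0],   # noise at t-2, t-1
    phi=[0.5, 0.2],           # hypothetical auto-regressive parameters
    theta=[0.3],              # hypothetical moving-average parameter
    mean=10.0,
)
print(second_diff, forecast)
```

For the series 1, 4, 9, 16 the first differences are 3, 5, 7 and the second differences are 2, 2, which agrees with evaluating Eq. (7) term by term.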
The ARIMA($p$, $d$, $q$) model is obtained by fitting the ARMA($p$, $q$) model given by Eq. (6) to the differenced series indicated above in Eq. (7). See Box and Jenkins [6] for further details. The ARIMA model as programmed in the software SAS [5] was used to forecast monthly wind speeds. As in the case of the neural networks, the model was calibrated using the first 10 years of data and tested on the last 2 years of observations. Considerable experimentation was carried out in order to arrive at the most satisfactory
Fig. 7. Time histories of forecasted and observed monthly wind speed (ARIMA). [Plot: monthly wind speed (km/h) against month, Feb-98 to Aug-00; series: predicted and actual.]
Fig. 8. Scatter of forecasted and observed monthly wind speed (ARIMA). [Plot: observed wind speed (km/h) against predicted wind speed (km/h); R = 0.80.]
calibration. Figs. 7 and 8 show the time history and scatter diagram based comparisons between the predictions and observations. The predictions are based on an ARIMA model of order ($p$, $d$, $q$) = (2, 12, 0), i.e. ARIMA(2,12,0), which was found to be the most appropriate. Table 2 shows the average absolute error that resulted while testing the calibrated model with the last two years of data. A comparison of this stochastic model fit with the neural network results (Figs. 5 and 7, Table 2) shows that the networks perform better than the ARIMA model, as the latter produces higher errors (5.9%) than the network (4.5%). The development of the wind forecasting scheme in a model-free manner seems to have given the data mining approach of the neural network more flexibility than the fixed type of stochastic time-series schemes.
5. Weekly forecasting
In the next phase of the study, neural networks were developed to forecast weekly wind speeds. The input to the network was a sequence of past wind speeds while the output was the wind speed for the next week. As in the case of monthly forecasting, both FF and recurrent networks trained using the BP and CC algorithms were developed using the first 10 years of data and tested using the last 2 years of data. Figs. 9 and 10 show the forecasting performance of the network in terms of time histories and scatter plots. The underlying network was of FF type trained using the CC algorithm. The testing output had an average % error of 5.4. As with the earlier monthly predictions, these weekly predictions also appear to be close to their observed values.
Fig. 9. Time history comparison of the forecasted and observed weekly mean wind speeds. [Plot: wind speed (km/h) against week, 0 to 125 (Oct. 1998 to Dec. 2000); series: actual, NN and ARIMA.]
Fig. 10. Scatter of forecasted and observed weekly wind speed (neural network). [Plot: observed speed (km/h) against predicted speed (km/h); R = 0.80.]
Fig. 11. Scatter of forecasted and observed weekly wind speed (ARIMA). [Plot: observed wind speed (km/h) against predicted wind speed (km/h); R = 0.78.]
Figs. 9–11 compare the neural network predictions with those yielded by the stochastic time-series model. ARIMA (1,52,1) was fitted to the training part of the data and the resulting model was used to predict weekly wind speeds for the testing part of the observations. Figs. 9 and 11 clearly indicate that the ARIMA model is less accurate than the corresponding network predictions. The average %
error obtained for the network shown in the figure was 5.4 while that for the ARIMA model was 8.6. Table 2 gives information on the network topologies involved in the weekly wind predictions. The superiority of the neural network models is clear.
Fig. 12. Forecasted versus observed daily wind speed (neural network). [Plot: predicted wind speed (km/h) against observed wind speed (km/h); R = 0.80.]
Fig. 13. Forecasted versus observed daily wind speed. [Plot: predicted wind speed (km/h) against observed wind speed (km/h); R = 0.72.]
6. Daily forecasts
Finally, forecasting of daily mean wind speed was attempted using neural networks (as well as ARIMA schemes for comparison purposes). Fig. 12 shows the performance of the neural networks on the testing set of data belonging to the last 2 years of measurement, while Fig. 13 shows the same for the ARIMA model. Here also the network predictions agree more closely with the actual measurements. Some results of the different approaches adopted in daily speed forecasting are shown in Table 2. As with monthly and weekly forecasting, the neural network approach was more satisfactory than the traditional stochastic ARIMA model.
7. Conclusions
The foregoing sections dealt with the problem of forecasting wind with the help of the relatively recent approach of neural networks. Wind speed was forecasted over the next time step of 1 month, 1 week and 1 day. The rising and falling trends of the observed time series were properly picked up by the trained network during its validation. The network forecasts were fairly close to the corresponding measurements, with average errors restricted to 4.3%, 5.4% and 6.3%, respectively, for monthly, weekly and daily forecasts when adequately trained networks were involved. Forecasting accuracy decreased as the interval of forecasting reduced from one month to one day. This is attributed to overfitting caused by the larger number of training patterns. No clear superiority of one network type (feed forward or recurrent) over the other in training was established. However, the CC training algorithm yielded slightly more accurate forecasts than BP. The neural networks produced much more accurate forecasts than the traditional stochastic time-series model of ARIMA. This indicates that their generalizing capacities are needed in wind speed forecasting over different time intervals. Considering the success of the preliminary data mining process employed, it is expected that rigorous statistical preprocessing of data coupled with the use of more complex network types and training algorithms should result in more accurate forecasts of wind speed.
References
[1] Kosko B. Neural networks and fuzzy systems. Englewood Cliffs, NJ: Prentice-Hall, 1992.
[2] Wu JK. Neural networks and simulation methods. New York: Marcel Dekker, 1994.
[3] Yeh YC, Kuo YH, Hsu DS. Building KBSE for diagnosis PC piles with artificial neural networks. ASCE J Comput Civil Eng 1993;7(1):71–93.
[4] Stuttgart Neural Network Software manual, version 4.1. Stuttgart, Germany: University of Stuttgart, 1995.
[5] SAS/ETS Software. Time series modeling and forecasting. Financial Reporting and Loan Analysis, Version 6. Cary, NC, USA: SAS Institute, Inc., 1992.
[6] Box GEP, Jenkins GM. Time series analysis: forecasting and control, 3rd ed. Englewood Cliffs, NJ, USA: Prentice-Hall, 1969.