Electricity Market Forecasting using Artificial Neural Network Models Optimized by Grid Computing

Aishah Mohd Isa*, Takahide Niimura*, Noriaki Sakamoto**, Kazuhiro Ozawa** and Ryuichi Yokoyama*

*Graduate School of Environment and Energy Engineering, Waseda University, Shinjuku-ku, Tokyo, 169-0051, JAPAN (Tel: +81-3-5935-6684; e-mail: takn@ieee.org). **Faculty of Economics, Hosei University, Machida, Tokyo, 194-0298 JAPAN (e-mail: [email protected])
Abstract: This paper reports a grid computing approach to parallel-processing a neural network time-series model for forecasting electricity market prices. Grid computing of the neural network model not only runs several times faster than a single sequential process but also offers opportunities to improve forecasting accuracy. The grid computing environment, implemented in a university computing laboratory, improves the utilization rate of otherwise underused computing resources. Results of numerical tests using real market data on more than twenty grid-connected PCs are presented.

Keywords: Grid computing, electricity market, price, forecasting, neural networks, optimization.
1. INTRODUCTION

The electric power utility industry has been deregulated, and competitive markets have been introduced for electrical energy supply and related services in many parts of the world (Shahidehpour, et al., 2002). Under the deregulated structure, a typical market is, for example, a day-ahead energy exchange that replaces the traditional central dispatch scheduling by a regionally monopolistic utility. In such a market-based structure, all market-participating energy suppliers compete for a share of the load to serve, at the lowest possible price, at each hour of the day. Forecasting future market prices is therefore critically important for determining competitive strategies. Although historical price information is usually available from the market operators, there are irregular price fluctuations, as in other commodity exchange markets, and they can make the forecasting task quite challenging.

Neural networks are known to produce generally more flexible and accurate nonlinear models than linear models, such as ARMA models, for time-series forecasting (Bunn, 2000). However, in addition to the large data sets required for training, the optimization of the nonlinear model can be computationally demanding. The optimization of neural networks requires many solutions from different (usually random) starting points to avoid local solutions and to achieve accurate output, and solving many optimization problems takes a long time.

In the recent development of networked computing environments, on the other hand, grid computing has been found to provide extra computing power over a high-speed communication network with relatively inexpensive hardware (Gagliardi and Grey, 2006). In the grid computing framework, spare computing resources that are otherwise unused or underused are made available for parallel computation of complex mathematical problems. The repeated optimization of a neural network model can be implemented in the grid computing environment to produce many solutions in a much shorter time than would be required by a single processor, thus potentially achieving more accurate results within a reasonable time frame. It also improves the utilization of computing resources that would otherwise stay idle.

In this paper, the authors report extended results of grid computing of a neural network-based forecasting system implemented in a computer training environment at a university. The experimental grid computing system includes twenty-one low-cost PCs originally intended for individual use. By combining a commercially available market forecasting application program with grid computing middleware, actual market data from the PJM (Pennsylvania-New Jersey-Maryland) market are analyzed and the results are examined.

The remainder of this paper is composed of the following sections: Section 2 introduces the time-series forecasting model using artificial neural networks. In Section 3, the grid computing framework is described. Section 4 presents the computational results and discusses the effectiveness of the grid computing approach. In Section 5, the findings of this work are summarized.
2. FORECASTING ELECTRICITY MARKET PRICES BY NEURAL NETWORKS

In the traditional electric utility industry, forecasting techniques for future load demand have been extensively studied and standard models have been developed (Gross and Galiana, 1987; Hippert, et al., 2001). Since the early 1990s, artificial neural networks (ANNs) have been applied to the forecasting of electrical loads, and standard packages have been used by many utility companies (Khotanzad, et al., 1997). ANNs have been reported to improve the accuracy of the forecasting noticeably over the earlier linear time series-based models. ANN-based models also have flexibility of design, and less intricate modeling is necessary for non-specialist users to obtain reasonably good results. However, the ANN computation tends to take extensive time because of the repeated optimization. Load forecasting is often done off-line, so the time constraints are relatively loose for practical implementation.
As the electric utility industry is deregulated and competition is introduced, many techniques for forecasting electricity market prices have been developed (Niimura, 2006). Most of these approaches are based on either linear time-series models or various neural networks, but some special considerations are needed for market price forecasting. In particular, because market trading is held for every hour of the target day, while many generating units operate continuously for several hours, forecasting the 24-hour profile of prices is important (whereas in load forecasting the peak load is often the most important quantity). Fig. 1 shows the general procedure of ANN-based market price forecasting. The structure of the ANN used for the forecasting is a popular three-layer feed-forward network, as shown in Fig. 2, where only the hidden-layer neurons represent the nonlinearity through a sigmoid function s(z). The details of the ANN structure are described in the authors' previous report (Sakamoto, et al., 2006). The ANN is optimized by batch training based on the Levenberg-Marquardt least-squares minimization algorithm.

The main data for the price forecasting are the time series of past market prices published by the market operator. Typically, price records of several weeks preceding the target date are necessary for accurate forecasting. Simple statistical analysis of the input data set (e.g., mean and volatility) gives some hints for initial model selection (such as the number of input data) and for later model validation. Depending on the scope of the forecast, the data are pre-processed by filtering and transformation before the model is optimized; a well-known example of transformation is log-scaling of the data. Removal of outliers improves the overall performance of forecasting, and it is also general practice to separate weekend and holiday data. Part of the thus-processed data is set aside from the optimization for later validation. With the remainder of the data, the neural network is trained and the model parameters are optimized. Because the neural network is a nonlinear model, the optimization is repeated several times from different starting points, particularly to avoid local minimum solutions.

Fig. 1. Procedure of market price forecasting (data input, statistical analysis, model selection, pre-filtering, transformation, training, validation with error check, actual forecast with reverse transformation, and output of results with statistical information).
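The pre-processing steps described above (log-scaling, outlier removal, and setting aside random validation data) can be sketched as follows. This is an illustrative sketch only: the 3-sigma outlier rule and the 20% validation split are assumptions, not the exact filters used by the authors.

```python
import numpy as np

def preprocess(prices, validation_fraction=0.2, seed=0):
    """Log-scale hourly prices, drop 3-sigma outliers, and set aside a random
    validation subset. The 3-sigma rule and the 20% split are illustrative
    assumptions, not the paper's exact filters."""
    x = np.log(np.asarray(prices, dtype=float))          # log-scaling transformation
    mu, sigma = x.mean(), x.std()
    x = x[np.abs(x - mu) <= 3.0 * sigma]                 # remove outliers
    idx = np.random.default_rng(seed).permutation(len(x))
    n_val = int(len(x) * validation_fraction)
    val, train = x[idx[:n_val]], x[idx[n_val:]]          # validation data kept out of training
    return train, val

# e.g., twenty days of synthetic hourly prices in $/MWh
prices = np.random.default_rng(1).uniform(20.0, 120.0, 24 * 20)
train, val = preprocess(prices)
```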
(Diagram: inputs x_0, ..., x_m are weighted by w_ik and summed into hidden units z_1, ..., z_h; the sigmoid outputs s(z_1), ..., s(z_h) are weighted by W_kj and summed into the output-layer values y_1, ..., y_n.)
Fig. 2. Three-layer neural network used for the forecasting
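The forward computation of the three-layer network in Fig. 2 can be sketched as below. The layer sizes are arbitrary examples, and bias terms and the Levenberg-Marquardt batch training are omitted for brevity.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w, W):
    """Three-layer feed-forward pass as in Fig. 2: linear input layer,
    sigmoid hidden layer s(z), linear output layer."""
    z = w @ x           # hidden activations z_k = sum_i w_ki * x_i
    y = W @ sigmoid(z)  # outputs y_j = sum_k W_jk * s(z_k)
    return y

rng = np.random.default_rng(0)
m, h, n = 4 * 24, 10, 24          # e.g., four days of hourly inputs -> a 24-hour price profile
w = rng.normal(size=(h, m))
W = rng.normal(size=(n, h))
y = forward(rng.normal(size=m), w, W)
```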
After the model optimization, validation is performed to measure the general performance of the model. Because the validation data set is selected randomly from the original data and is not used for training, the validation data are expected to measure the errors of the model in a general context.

If the validation results are not acceptable, the model optimization may be repeated with a different set of starting parameters. If validation fails more than several times, the model structure may need to be reviewed, for example by adding more input data; some kind of sensitivity analysis may be useful in assisting the review process. If the model validation is successful, the optimized model is applied to the actual forecast. The output need not be only the estimated prices; it is also convenient to obtain other statistical information such as confidence intervals.

Most of the computing time is consumed in the optimization (or "training") process of the neural networks. Because each optimization process starting from a different starting point is not directly linked with the other processes, distributing the optimization tasks to several machines makes the overall computation time significantly shorter than sequential processing of the many optimization tasks by a single machine. Furthermore, because the nonlinear ANN model is trained with different starting parameters in each training process, there is a greater chance of finding a good solution by obtaining as many (locally optimized) results as possible within an overall processing time similar to that of single-machine computation.
3. GRID COMPUTING IMPLEMENTED IN A UNIVERSITY COMPUTER TRAINING ENVIRONMENT

Modern universities are equipped with relatively new personal computers installed for use by students and researchers. Many non-technical departments, such as economics, where some of the authors work, also offer introductory courses on information technology (IT), but such computing equipment is often unused except during computer training sessions. These relatively inexpensive PCs are now connected to the local area network (LAN). Therefore, by installing grid computing middleware, computationally demanding applications such as market price forecasting can benefit from parallel computing while improving the utilization rate of otherwise underused equipment.

Fig. 3 shows an outline of the grid computing structure implemented for this study. Designating one of the grid-connected machines as the master PC, we install the master controller program of the middleware on it and a slave screen saver on the rest of the grid-connected machines. The master program is often supplied with a programming environment for various languages (C, Visual Basic, MATLAB, etc.) that controls the distribution of tasks and data, and the retrieval of the results of the distributed computation from each of the slave machines. When a slave machine is idle, the screen saver runs and prepares to receive tasks from the master machine. The master machine selects a target data set that has not yet been processed and sends the data and the task (a small code to start the application) to a slave machine. The target application program executed on the slave machines need not be specifically developed or recompiled for the grid computing environment.
Fig. 3. Outline of grid computing structure
In the university environment, the grid machines tend to be uniform and often form a cluster in a single location (e.g., a computer lab). However, depending on the capability of the middleware, grid computing can extend beyond the computer labs or even outside the university campus. It could also be implemented on a mixture of different hardware platforms in a business IT environment.
4. NUMERICAL EXAMPLES

The authors have implemented a commercial middleware, AD-Powers, developed by Dai Nippon Printing (http://www.ad-powers.jp/), which has a user interface based on MS-Excel spreadsheets and the Visual Basic for Applications (VBA) development environment, in the undergraduate economics information technology laboratory at Hosei University. Twenty-one PCs, each running a standard Windows operating system on a Celeron D340 2.93-GHz CPU with 512 MB of PC-2700 memory, are connected by a Fast Ethernet network. All the PC hardware is identical, including the machine selected to serve as the master.

Electricity market forecasting is processed by the Electricity Market Predictor (EMarP) system (http://www.tncsystems.com/), another commercial application originally intended for single-processor computing and developed in C/C++. Although EMarP allows users to choose among three models, with two-, four-, or six-day inputs, we have used the four-day input model throughout this study. Each slave machine runs the EMarP application under the middleware as it receives a task and a data set from the master, and upon completion of the optimization it returns the results (contained in a small text file) to the master middleware controlling the overall grid computation. The master middleware stores the results in the Excel spreadsheet.

The target data are obtained from the Pennsylvania-New Jersey-Maryland (PJM) one-day-forward electrical energy market (http://www.pjm.com/). Hourly price data up to twenty days prior to the target date are used for model training and validation.
We have picked several days in the 2003-2005 period when the electricity demand was heavy. Typical price profiles (forecast and actual) are shown in Figs. 4-6. Due to space limitations, only the five best results selected from one hundred optimization-validation tasks are shown.
Fig. 4. Forecasting results (Case-1: January 28, 2003).
Fig. 5. Forecasting results (Case-2: August 28, 2004).
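The accuracy and speed measures used below, the mean absolute percentage error (MAPE) and the acceleration factor (total single-machine optimization time over elapsed grid time), can be computed directly. The numbers in the example are illustrative, not values from the paper.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error against the actual prices, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))

def acceleration_factor(total_time_s, grid_time_s):
    """Speed-up: sum of per-task optimization times over the elapsed grid time."""
    return total_time_s / grid_time_s

# e.g., a short price profile and its forecast (illustrative numbers)
err = mape([50.0, 60.0, 40.0], [55.0, 54.0, 44.0])   # -> 10.0 (%)
speedup = acceleration_factor(702.0, 60.0)           # -> 11.7
```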
Fig. 6. Forecasting results (Case-3: June 15, 2005).

Table 1 summarizes the computational time and forecasting accuracy obtained in the grid computing environment for different numbers of validations. While the number of PCs participating in the grid computing is fixed at twenty, the number of validations is varied from twenty to one hundred for all the cases (different target dates). In each case, the CPU time of grid computing is measured (in seconds) from the instant the grid computation is started to the moment the last result is returned to the master PC and stored in the master spreadsheet. Total computing time (not shown) was measured as the sum of the optimization times performed by (and returned from) the individual PCs. These numbers are compared to calculate the factors of acceleration in the table.

Table 1. Computational Time and Accuracy (Grid Computing)

Case    Number of      CPU time  Factor of     Errors
        optimizations  (s)       acceleration  (%)
Case-1  20             60        11.70         27.44
        30             173       6.27          23.08
        50             280       6.73          24.39
        75             320       9.12          22.58
        100            567       6.90          21.23
Case-2  20             88        12.91         20.70
        30             203       8.33          17.17
        50             492       6.52          6.58
        75             685       6.86          8.98
        100            1122      5.85          6.29
Case-3  20             65        12.51         20.17
        30             167       6.96          16.83
        50             269       7.21          19.19
        75             485       6.65          16.71
        100            679       6.53          16.68

Errors are measured by the mean absolute percentage error (MAPE) against the actual price values. Accuracy improvements are achieved when the best cases of the overall forecasting results are selected by a certain criterion (such as the minimum validation error). Even when the parallel-computed results look similar, as in Fig. 5, it is relatively easy to remove unacceptable results by a simple criterion such as a maximum limit (applied on the master PC side). As the number of validations increases, we more consistently obtain (relatively) good results among the different candidates. The ranges of errors are also similar to those obtained by single-machine computing (not shown).

Fig. 7 shows the errors versus the number of optimization-validation tasks. The accuracy improvement levels off, or even reverses, as the number of optimizations is increased. On the other hand, the speed improvement measured by the acceleration factor, that is, the ratio of the total computation time to the elapsed grid computation time, is shown in Fig. 8. The speed improvement also levels off as the number of optimizations is increased, although the actual computation time is definitely reduced by the grid computation, as indicated by the acceleration ratio. This suggests that, at least for this data set, a reasonable trade-off between computational speed gains and accuracy lies around fifty optimizations.

As observed in Table 1, the factor of acceleration by the grid computing decreases as the number of optimization-validation cycles increases. Extra computation time appears to lower the acceleration factor, probably because of the overhead of the grid computing environment, such as loading the application executables on the slave machines. In the cases where the number of optimizations exceeds the number of PCs, the slave machines perform more than one optimization task in series. (Loading and unloading of the application is repeated as each new task arrives at a slave machine.) The master middleware distributes the remaining (unprocessed) tasks to the slave PCs in the order of loading; for example, if one slave PC becomes idle after finishing a previous task, it is more likely to receive the next task than the other PCs that are busy computing other cases. Such a distribution of tasks is not currently optimized, partly because the length of time to solve each optimization problem varies due to the nonlinear model.

Fig. 7. Comparison of errors (Cases 1-3, versus the number of optimizations).

Fig. 8. Comparison of speed improvement (Cases 1-3, acceleration factor versus the number of optimizations).

5. CONCLUSIONS

In this paper, the authors have presented a grid computing solution for forecasting electricity market prices based on a neural network time-series model. The grid computing has been implemented in a university computing laboratory with relatively inexpensive hardware and has been shown to improve both the computational speed and the accuracy of the forecasting, using existing application programs designed for single-processor computing. The observations of the results provide insights into future applications of grid computing in forecasting tasks and related modeling.

REFERENCES

Shahidehpour, M., Yamin, H., and Li, Z. (2002). Market operations in electric power systems: forecasting, scheduling, and risk management. Wiley, New York.

Bunn, D.W. (2000). Forecasting loads and prices in competitive power markets. Proceedings of the IEEE, 88 (2), 163-169.

Gagliardi, F. and Grey, F. (2006). Old world, new grid. IEEE Spectrum, 43 (7), 28-33.

Hippert, H.S., Pedreira, C.E., and Souza, R.C. (2001). Neural networks for short-term load forecasting: a review and evaluation. IEEE Trans. Power Systems, 16 (1), 44-55.

Gross, G. and Galiana, F.D. (1987). Short-term load forecasting. Proceedings of the IEEE, 75 (12), 1558-1573.

Khotanzad, A., Afkhami-Rohani, R., Lu, T.L., Abaye, A., Davis, M., and Maratukulam, D.J. (1997). ANNSTLF—A neural-network-based electric load forecasting system. IEEE Trans. Neural Networks, 8 (4), 835-846.

Niimura, T. (2006). Forecasting techniques for deregulated electricity market prices - extended survey. Proc. IEEE Power Systems Conference and Exposition, 51-56.

Sakamoto, N., Ozawa, K., and Niimura, T. (2006). Grid computing solutions for artificial neural network-based electricity market forecasts. Proc. IEEE World Congress on Computational Intelligence 2007, 4382-4386.