Application of Advanced Computing Methods to Transmission System Operational Studies at the Bonneville Power Administration

Brian Tuck*, Ramu Ramanathan**, Terry Doern***

*Transmission Technical Operations, Bonneville Power Administration, Vancouver, WA (e-mail: [email protected])
**Maxisys, Vancouver, WA (e-mail: [email protected])
***DesignTech, Vancouver, WA (e-mail: [email protected])

Abstract: This paper describes a project that will improve response to emergency transmission system conditions by providing Bonneville Power Administration (BPA) operations planning study engineers with the tools to quickly and effectively set System Operating Limits based on current system conditions. Growth of new intermittent renewable resources, smart grid controls, and increased environmental, political, and market demands are pushing the transmission system to operate with less margin than today. In the future, the data, assumptions, and analysis tools on which today's operational decisions rely will no longer be fast or accurate enough to reliably manage a system operated closer to its reliability limits. This paper describes improvements accomplished by BPA through the use of real-time data and distributed processing. Initial studies are shown applying distributed processing and real-time base cases to an actual contingency event on the BPA transmission system.

Keywords: Power flow, Distributed Simulation, Real Time Model, State Estimation, System Operating Limits, Contingency Analysis, Study Automation.
1. INTRODUCTION

The Bonneville Power Administration (BPA) is a US Department of Energy agency located in the Pacific Northwest. BPA is responsible for the delivery of power from 31 federal hydro projects in the Columbia River basin, with 20,430 MW of generating capability, and operates 15,215 circuit miles of transmission. The Transmission Technical Operations group at BPA is responsible for providing operational planning studies and other engineering support to real-time transmission system operations. Technical Operations performs near-term system studies for seasonal operating procedures, near-term planned outages, and in response to unplanned outage events. Traditional operational planning studies are performed using standard bus/branch seasonal system planning study cases and off-the-shelf power flow and transient analysis software running on desktop computers. Contingency analysis, thermal system operating limit calculations, voltage stability, and transient-stability tools are used to compute path limits for seasonal transmission operations planning and in response to unplanned outage events. Grid operation is subject to North American Electric Reliability Corporation (NERC) criteria, which state that each operating condition must meet defined post-contingency performance requirements when a single (N-1) or double (N-2)
contingency occurs. Each study therefore subjects a defined operating condition (the base case) to a series of loss-of-element contingencies (N-1 and N-2) and evaluates the performance of each one against the NERC post-contingency requirements. A typical study evaluates a number of operating conditions in this way, chosen by the engineer as representative of the boundary of the expected operating region, creating a nomogram or other useful representation of the relationship of critical parameters to the System Operating Limit (SOL). These studies are currently run using seasonal planning cases developed by the region. Seasonal cases are typically 3-6 months old and represent a best guess of "normal" or "worst-case" conditions. Operations engineers will typically modify the seasonal case to represent expected operating conditions for the period under consideration, or current operating conditions when running new studies in response to an unplanned outage or other significant event. This can be a time-consuming process if the generation and load patterns of the operating condition under study differ significantly from those in the seasonal case. When an unplanned outage occurs, it is impossible to adjust the seasonal case to simulate the exact real-time conditions within the short time period necessary to provide system operators with new SOL studies.
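To make the nomogram workflow above concrete, here is a minimal, self-contained sketch. The screening rule, path rating, and derate factors are invented toy values; a real study would instead solve full post-contingency power flows for each operating point and check them against the NERC performance requirements.

```python
from itertools import product

# Toy numbers only, for illustration of the workflow, not BPA's limits.
PATH_RATING_MW = 3000
CONTINGENCY_DERATES = {"N-1 line": 0.85, "N-2 line pair": 0.70, "N-1 generator": 0.90}

def survives(path_flow_mw, local_gen_mw, derate):
    # Hypothetical rule of thumb: local generation provides voltage support
    # that buys back a little post-contingency transfer capability.
    return path_flow_mw <= PATH_RATING_MW * derate + 0.2 * local_gen_mw

def feasible_points(flows_mw, gens_mw):
    # Keep an operating point only if it meets limits for every contingency;
    # the boundary of the returned set traces the nomogram.
    return [(f, g) for f, g in product(flows_mw, gens_mw)
            if all(survives(f, g, d) for d in CONTINGENCY_DERATES.values())]

nomogram_points = feasible_points(range(1500, 3001, 100), range(0, 2001, 250))
```

Plotting the boundary of nomogram_points over the two parameters would yield the kind of nomogram the study engineers construct.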
In addition, many of the SOL studies take too long to run to be usable during emergency conditions. For example, thermal system operating limit studies for one of the interties BPA operates can take up to 24 hours to complete a full suite of scenarios (7,500 power flows). When an unplanned outage occurs, or during critical operating times (such as high load or high wind generation), new SOLs need to be evaluated for use during the current or next hour. To meet the time constraints during critical times, many scenarios are simply not studied because the calculation takes too long: the conditions are superseded before the results are available. Such incomplete analysis poses a potential reliability risk.

While these traditional methods have served the purpose in the past, there is a growing need to improve the tools used, both in terms of the fidelity of the base cases with respect to current operating conditions and in reducing the time it takes for the analysis to be performed. Drivers of this growing need include a number of factors that are increasingly utilizing the transmission system in ways that differ from historical patterns: increased integration of intermittent resources such as wind generation; greater operating restrictions on conventional generation sources such as coal/gas (air quality regulations) and hydro (endangered species); changing load patterns and composition (a shift from resistive to motor and electronic loads); and the potential for increasing control of loads through Smart Grid based devices. Many of these changes are occurring quickly, and as system operation shifts away from historical patterns, analysis in real time becomes critical.

One example of the change experienced by BPA is in the area of wind integration. In less than 3 years, 3,000 MW of wind generation has been integrated into the BPA balancing area, with the total energized wind fleet expected to increase to 6,000 MW by the end of 2014. As a non-dispatchable resource, wind adds to the variability experienced across the transmission system through large mid-hour ramps that are largely unpredictable at this time. Additionally, these resources displace an equal amount of conventional generation that has historically been relied on to provide voltage control, stability, frequency response, and remedial actions, resulting in a transmission system that continues to experience heavy power flows but with limited sources of operating margin.

This paper reports efforts at BPA to implement a real-time based model and improved analysis tools. Section 2 reviews the problem and approach; sections 3-5 detail the real-time cases, software, and hardware used; and section 6 reports the results of applying these improvements to an actual case study.

2. PROJECT APPROACH

As noted previously, the first objective of this project was to provide Technical Operations study engineers, in a timely manner, with a power flow base case that represents real-time operating conditions. Ideally, the base case would be provided in a format consistent with the existing power flow and analysis tools currently in use by the study staff.
BPA has maintained an Energy Management System (EMS) State Estimator since the 1980s. The State Estimator (SE) provides an accurate representation of current system conditions by creating a power flow base case that can be used for system studies. This base case is updated every 10 minutes and represents the same essential transmission system as the seasonal planning case currently used for operations planning studies, with the notable improvement of including real-time generation and load patterns. In addition, the SE-generated base case is a "Full Topology" model, meaning that it explicitly models the detailed switch configurations (i.e., power circuit breakers and disconnects). Equipment outages within a substation can significantly change the risk and consequences of a line fault. The typical seasonal base case currently in use is simplified to a "bus/branch" model; since this model does not directly simulate the effect of switch status, many of the consequences of these outages are hidden. Retaining the detailed modelling within the substation makes these risks readily apparent to the operations planning engineer.

Despite the availability of these cases, BPA engineers have not used them for operational planning studies because the advanced analysis tools required to evaluate system reliability and set system operating limits are not available on the EMS. While it would have been possible to improve the tools available on the EMS, we decided early on to retain the current analytical tools and provide a means of exporting the SE case, as it is produced, to a format available for real-time studies. This approach retains the specialized analytical tools that have been developed, and are currently in use for system studies, while achieving the required fidelity to current system conditions. To take full advantage of the SE real-time case, and to run the full number of scenarios necessary for a complete analysis, the speed of these tools must increase substantially so that analysis and results can be completed fast enough for real-time operation.

The second objective of the project was therefore to reduce the total computation time required for these studies so that the real-time system can be analyzed, and new SOLs evaluated, within the limited time constraints of critical operating periods. Study tools used by BPA engineers include contingency analysis and thermal, voltage stability, and transient studies. Common to all of these is the need to run many different scenarios (usually hundreds) against the same base case representing the system conditions under study. Such a situation, where many independent scenarios can be run at the same time without interdependency in the solution, is a prime candidate for distributed computing techniques when the total computation time of the entire problem set needs to be reduced. Distributed computing is successfully employed in many industries, and because of the types of analysis performed here, we looked to distributed processing methods to achieve the necessary decrease in computation times. The main problems for this effort are as follows:
• Can the contingency analysis and thermal SOL calculation applications be split into many smaller calculations that run on distributed CPUs, decreasing the overall elapsed calculation time? Problems must be appropriate for distributed computing and sized to benefit from the technology.

• Are the benefits retained as either the problem size or the number of CPUs is scaled? Is there a limitation from the number of CPUs, or from the number of contingencies sent to each CPU?

• Distribution must not impose additional work on the user. The user interface should be the same whether one computer or multiple computers perform the analysis.

3. USING POWER FLOW CASES BASED ON REAL-TIME DATA

Given the drivers of change previously mentioned, the need for more accurate representations of current system conditions is more acute than before, and it is now recognized that some of the problems seen in real time cannot be simulated using the seasonal base case. The State Estimator collects real-time measurements and switch statuses and then estimates power system states by reducing measurement errors in the data. This process creates a power flow solution that can be exported from the control center EMS to an auxiliary file format compatible with existing power flow tools. The export is accomplished using scripts written by BPA in the Perl programming language. When exporting the files, model transformation and validation are performed (figure 1). The real-time model auxiliary file can then be loaded into the study tools and solved.
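BPA's actual export is implemented in Perl against the EMS; purely as a hedged illustration of the transform-and-validate step, here is a Python sketch in which the file formats and every field name are hypothetical stand-ins, not BPA's actual schema.

```python
# Illustrative only: hypothetical CSV input and aux-style text output.
import csv
from pathlib import Path

def export_se_case(se_csv: Path, aux_out: Path) -> int:
    """Transform a State Estimator solution snapshot into an aux-style study file."""
    written = 0
    with open(se_csv, newline="") as src, open(aux_out, "w") as dst:
        for row in csv.DictReader(src):
            # Validation step: skip states the estimator flagged as bad data.
            if row.get("quality_flag") != "GOOD":
                continue
            # Transformation step: keep full-topology detail (breakers and
            # disconnects) instead of collapsing to a bus/branch model.
            dst.write(
                f'{row["node_id"]} {float(row["voltage_kv"]):.2f} '
                f'{float(row["mw"]):.1f} {float(row["mvar"]):.1f} '
                f'{row["breaker_status"]}\n'
            )
            written += 1
    return written
```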
[Figure 1: Export Process. Real-time data flows from SCADA/CCM through the State Estimator (SE) and Contingency Analysis (CA) to the Aux File Program, which delivers aux files across the DMZ to the OPI server.]

These converted real-time model auxiliary files are stored on a server every 10 minutes. This immediately introduced a new problem: where engineers previously had only one case to begin a study with, they now have thousands to choose from. Depending on the study to be performed, the engineer may choose the most recent case (such as during an emergency following a contingency event) or a past case with particular conditions of interest (high load or restricted generation availability).
BPA developed a user interface that provides a search capability based on pre-defined criteria and data. These include load levels, power flow across specific transmission paths, generation levels, bus voltage, bus voltage sensitivity, and reactive generation. This interface allows the study engineer to quickly identify base cases that are relevant to the conditions under study and to see visually how critical study metrics change over time.
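A minimal sketch of such a case-search filter follows; the CaseSummary fields, thresholds, and path names are hypothetical, not BPA's actual interface.

```python
# Hypothetical index of stored SE cases, one entry per 10-minute export.
from dataclasses import dataclass

@dataclass
class CaseSummary:
    timestamp: str
    total_load_mw: float
    path_flow_mw: dict   # e.g. {"Intertie A": 4200.0}
    wind_gen_mw: float

def find_cases(index, min_load=0.0, path=None, min_flow=0.0):
    """Return stored SE cases matching the study conditions of interest."""
    hits = []
    for case in index:
        if case.total_load_mw < min_load:
            continue
        if path and case.path_flow_mw.get(path, 0.0) < min_flow:
            continue
        hits.append(case)
    # Most recent first, so emergency studies start from the latest match.
    return sorted(hits, key=lambda c: c.timestamp, reverse=True)
```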
4. DISTRIBUTED COMPUTING SOFTWARE

The first step in achieving speed gains from distributed computing technologies was to determine which of the analysis tools currently in use would benefit from a distributed processing architecture [1-9]. A typical SOL study consists of multiple independent simulations (e.g., the results of any base case subjected to any contingency do not depend on the results of any other combination of base case and contingency). As a result, these independent simulations can be grouped into subsets and distributed to multiple processors without regard to interactions between the subsets during the simulation process.

Modification of existing analysis software was necessary to take advantage of multiple processors. To be successful, the software had to automatically break the entire problem set into smaller, independent subsets, run each subset on its assigned processor, then reassemble the results for the user. A fundamental requirement for the project was that the entire process be transparent to the user of the tool. As built, the distributed processing tool automatically divides the total number of contingencies into Contingency Groups and runs them on several CPUs (figure 2). The contingencies run in parallel, and results are automatically collected and compiled for the user. Finally, after all Contingency Groups have been processed, the distributed processes are released and destroyed. Figure 2 represents the general flow of data for parallelizing contingency analysis. There is overhead required to initiate each power flow process, pass the Contingency Groups to the power flow processes, and gather and display the results to the user.

[Figure 2: Distributed Processing. A Parallel Compute Contingency Manager on the Master CPU dispatches Contingency Groups of the power system model to parallel threads on CPU #1 through CPU #n (up to the maximum number of CPUs) and collects the Contingency Analysis Results.]
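A minimal sketch of the Contingency Group pattern in figure 2, assuming a hypothetical solve_power_flow() engine and picklable case/contingency objects; BPA's tool is built into its own power flow software, so this is illustrative only.

```python
from multiprocessing import Pool

GROUP_SIZE = 20  # contingencies per group, matching the Section 6 test

def run_group(args):
    base_case, group = args
    violations, unsolved = [], []
    for contingency in group:
        # Every contingency starts from the same stored pre-contingency
        # state, so groups are fully independent of one another.
        case = base_case.copy()
        case.apply(contingency)
        result = solve_power_flow(case, method="full_newton")  # hypothetical engine
        if result.converged:
            violations.extend(result.limit_violations())  # voltage and thermal
        else:
            unsolved.append(contingency.name)  # flag for further analysis
    return violations, unsolved

def distributed_contingency_analysis(base_case, contingencies, n_cpus):
    groups = [contingencies[i:i + GROUP_SIZE]
              for i in range(0, len(contingencies), GROUP_SIZE)]
    # Worker processes are created, fed their groups, and destroyed on exit,
    # mirroring the release of distributed processes described above.
    with Pool(processes=n_cpus) as pool:
        results = pool.map(run_group, [(base_case, g) for g in groups])
    # Reassemble per-group results into one report so the user sees the same
    # interface whether one CPU or many did the work.
    all_violations = [v for vs, _ in results for v in vs]
    all_unsolved = [u for _, us in results for u in us]
    return all_violations, all_unsolved
```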
5. SIMULATION HARDWARE

BPA carried out performance testing on different CPUs. Tests were performed to compare CPU speed, number/availability of CPUs, and memory speed. The three computer models used for performance testing are as follows:

• DELL M6500 laptop with dual quad-core Intel i7 Q720 CPUs @ 1.6 GHz (2 processors, 8 CPUs); Memory: DDR3, 4 GB, 1333 MHz; Operating system: XP-64.

• HP EliteBook 8730w with Intel Core 2 Extreme Q9300 @ 2.53 GHz (2 processors, 4 CPUs); Memory: DDR3, 4 GB, 1333 MHz; Operating system: XP-64.

• Servers with dual quad-core AMD Opteron 2087 CPUs @ 2.80 GHz (2 processors, 8 CPUs); Memory: PC2-6400 DDR2, 16 GB, 800 MHz; Operating system: Windows Server 2008 Release 1 (Vista).

A cluster of six servers was purchased and installed on a 100 Mb network for testing.

6. CASE STUDY PERFORMANCE RESULTS

To test the performance of the distributed processing software using multiple CPUs, the contingency analysis tool was run with a total of 3,233 contingencies against a single base case. The Contingency Group size was set to 20 contingencies, and the simulation used the State Estimator full-topology model case of Jan 8, 2010 23:16. The model studied contains 43,031 nodes, 49,880 branches, 2,010 generators, and 5,214 loads. The model is a nodal model; the branch data includes breakers, disconnects, fuses, ground disconnects, series capacitors, transformers, zero-impedance branches, and lines. The system contains 1,259 switched shunts and 3,319 transformer controls. There are 4,328 substations in this model.

Contingency analysis always stores a pre-contingency state. Prior to solving a contingency, the pre-contingency state is used as the starting point, and at the end of processing each contingency the system is set back to the pre-contingency state. In this test, the real-time case of Jan 8, 2010 23:16 was used as the base case, and a full Newton-Raphson solution method was used. For each contingency, bus voltages and line limits were monitored and violations were reported. If a contingency did not solve, the unsolved contingency was marked for further analysis.

The elapsed times for different distributed computing configurations are presented in figure 3. Five tests were run using 1, 8, 16, 24 and 32 CPUs. While elapsed time improves with additional CPUs, the rate of improvement goes down as more CPUs are added. This flattening is a result of the process overhead associated with distributing the work and reassembling the results for the user. This work increases with additional distribution, showing that there is a limit to the improvements that can be gained through distribution alone.

[Figure 3: Elapsed Processing Time vs. Number of CPUs. Elapsed times of approximately 2721 s on 1 CPU, 700 s on 8, 483 s on 16, 359 s on 24, and 315 s on 32 CPUs.]
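One hedged way to read this flattening is a fixed-plus-parallel timing model (our illustrative arithmetic, not an analysis performed in the study):

\[
T(N) \approx T_s + \frac{T_p}{N}, \qquad S(N) = \frac{T(1)}{T(N)} \longrightarrow 1 + \frac{T_p}{T_s} \quad \text{as } N \to \infty.
\]

Fitting the 1- and 32-CPU points (2721 s and 315 s) gives roughly $T_s \approx 237$ s of serial and overhead work and $T_p \approx 2484$ s of parallelizable work, capping the achievable speedup near 11.5x. The measured 8-CPU time of 700 s exceeds the model's prediction of about 548 s, consistent with the observation above that the distribution overhead itself grows as more groups and processes are added.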
Since continued improvement from simply adding CPUs is eventually limited, different processor speeds and memory availability were tested. Based on benchmark comparisons of the server against quad-CPU laptops with slower CPU speed but faster memory, it appears that memory speed is a key factor affecting performance when distributing power flow analysis studies. The large, sparse data matrix used in power flow calculations probably forces the CPU to fetch data from slower main memory rather than the small, fast cache memory on the CPU chip. The laptop memory is almost twice as fast as the server memory and, correspondingly, the laptop is almost twice as fast as the server when performing power flow studies (table 1). These factors indicate that additional optimization for speed could be accomplished through judicious choice of processor and memory. In addition, the selection of contingency group size and organization by type of contingency (loss of line elements vs. loss of generation) may yield improvements; a sketch of this idea follows table 1.

Table 1: Processing Time vs. Server Memory

Number of CPUs | Quad-CPU laptop (1.6 GHz i7, 1333 MHz DDR3) | Server (2.80 GHz, 800 MHz DDR2)
8              | 56                                          | 111
6              | 64                                          | 123
4              | 78                                          | 157
2              | 108                                         | 263
1              | 179                                         | 360
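As promised above, a sketch of the group-size and type-organization idea; the .kind attribute and the two contingency categories are hypothetical labels, not BPA's data model.

```python
from itertools import groupby

def make_groups(contingencies, group_size=20):
    # Sort so each group holds a single kind of contingency; line-loss and
    # generation-loss cases tend to converge differently, so homogeneous
    # groups may balance worker runtimes better than mixed ones.
    ordered = sorted(contingencies, key=lambda c: c.kind)  # "line" or "generation"
    groups = []
    for _, block in groupby(ordered, key=lambda c: c.kind):
        items = list(block)
        groups += [items[i:i + group_size]
                   for i in range(0, len(items), group_size)]
    return groups
```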
An additional test was run using a contingency analysis tool that defines an operating nomogram based on thermal constraints. This tool allows the engineer to run the contingency simulations against multiple base case scenarios. The general work flow for performing this analysis is the same as shown in figure 2, but instead of breaking the problem down by groups of contingencies, groups of scenarios are sent to the CPUs. Each scenario has different generation patterns, load levels, or ambient temperatures. Performance results for multiple CPUs are presented in figure 4.

[Figure 4: Scenario Processing Time vs. Number of CPUs. Elapsed times of approximately 3720 s on 1 CPU, 583 s on 8, 265 s on 16, 179 s on 24, and 160 s on 32 CPUs.]
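Comparing the endpoints of figures 3 and 4 (our arithmetic from the approximate plotted values, not a computation from the study):

\[
S_{\text{single}}(32) \approx \frac{2721}{315} \approx 8.6, \qquad
S_{\text{multi}}(32) \approx \frac{3720}{160} \approx 23.
\]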
Again we see a similar flattening of performance with an increased number of processors, though in this test with multiple base cases the overall speed improvement is greater than in the single base case test.

Distributed processing has been successfully implemented for current system operation studies. Using quad-CPU laptops plus improvements to the thermal operating limit scenario algorithm, the elapsed calculation time has been decreased from 24 hours to less than an hour. This in turn has helped to shrink curtailment times after unplanned line outages. For example, on Sept. 14, 2010, an unplanned outage forced a curtailment of 1,500 MW. The cost of replacement energy was $40.36 per MWh. Using distributed computing, the outage calculation time was reduced from 24 hours to 2 hours, after which flows were increased by 750 MW. The estimated savings for the region is 22 hrs × 750 MW × $40.36/MWh ≈ $665,000. Figure 5 shows the actual flow (blue), the original curtailment (orange), the revised SOL (green), and the savings as the red shaded block.

[Figure 5: September 14, 2010 Event. Intertie flow in MW from 10:00 to 18:00, showing the actual flow, the original curtailment, the revised SOL, and the savings region.]
7. SUMMARY

Before increasing an SOL, the operations engineer must model current operating conditions, perform real-time analysis, and identify new reliability risks. The real-time model accurately represents current conditions and provides a new case every 10 minutes, compared with the few hours to few months required to build a reasonable approximation from the seasonal case. With the real-time case, SOLs are calculated based on thermal, voltage, and stability limits for current and future hours using actual measurements applied to a detailed model containing all switches (Full-Topology model), increasing accuracy and yielding realistic limits. Reliability and accuracy are improved by basing studies on a real-time case with the more realistic Full-Topology model.

Distributed processing was successfully implemented and is currently used for SOL studies. The solution time has decreased, reducing curtailment time, which in turn creates cost savings for BPA customers and the region. Additional research and implementation in error handling and clustering will make distributed processing more transparent for engineers. Using more CPUs (within limits) and faster memory can help improve speed and further reduce elapsed time. This project has regional benefits, enabling customers to use BPA transmission system capacity more effectively while the reliability impacts of continuously changing system conditions are analyzed more responsively.
REFERENCES

Alves, A.C.B. and Monticelli, A. (1995). Parallel and distributed solutions for contingency analysis in energy management systems. Proceedings of the 38th Midwest Symposium on Circuits and Systems, vol. 1, pp. 449-452.

Di Santo, M., Vaccaro, A., Villacci, D., and Zimeo, E. (2004). A distributed architecture for online power systems security analysis. IEEE Transactions on Industrial Electronics, vol. 51, no. 6, pp. 1238-1248.

Gorton, I., Zhenyu Huang, Yousu Chen, Kalahar, B., Shuangshuang Jin, Chavarria-Miranda, D., Baxter, D., and Feo, J. (2009). A high-performance hybrid computing approach to massive contingency analysis in the power grid. Fifth IEEE International Conference on e-Science, pp. 277-283.

Ramanathan, R., Rahman, S., and Grigsby, L.L. (1982). Applications of multiprocessing to power system problems. 14th Southeastern Symposium on System Theory.

Ramanathan, R. and Grigsby, L.L. (1984). Optimal decomposition technique for multi-microprocessor based power system analyses. Mini and Microcomputers in Control, Filtering and Signal Processing Conference.

Riquelme Santos, J., Gomez Exposito, A., and Martinez Ramos, J.L. (1999). Distributed contingency analysis: practical issues. IEEE Transactions on Power Systems, vol. 14, no. 4, pp. 1349-1354.

Yousu Chen, Zhenyu Huang, and Chavarria-Miranda, D. (2010). Performance evaluation of counter-based dynamic load balancing schemes for massive contingency analysis with different computing environments. 2010 IEEE Power and Energy Society General Meeting, pp. 1-6.

Zhenyu Huang, Yousu Chen, and Nieplocha, J. (2009). Massive contingency analysis with high performance computing. 2009 IEEE Power and Energy Society General Meeting, pp. 1-8.