FORECASTING ELECTRIC LOADS WITH MULTIPLE PREDICTORS

DEREK W. BUNN

London Graduate School of Business Studies, Sussex Place, Regent's Park, London NW1 4SA, U.K.

(Received 7 June 1984)

Abstract - The simultaneous use of several forecasting models is becoming increasingly common in electric load prediction. We discuss some of the concepts and methods involved in this approach, present a form of taxonomy and evaluate counterarguments. Ultimately, a pragmatic justification for the approach is adopted, recognising its surrogate nature for either more comprehensive model-building or identification. Overall, we seek to develop a critical perspective on applications of this approach in the electric power industry.

1. INTRODUCTION

The choice of models available for predicting electricity demand is now an impressively large one. There are many styles of time-series analysis including, for example, spectral, Kalman, Box-Jenkins, exponential smoothing, and multiple regression, as well as econometric, trend-curve, end-use, expert use, and survey methods (see Ref. 1 for a recent review). When confronted with this methodological choice, the traditional approach for load forecasting has been selective rather than eclectic. The problem has been viewed in terms of selecting for use the singularly best method rather than forming a composite view. Techniques of hypothesis testing, system identification, and model discrimination have been well developed to serve this problem. However, both the rational and pragmatic bases for this selective approach have now changed quite distinctly and it is becoming increasingly common for electric load forecasting to be based upon the simultaneous consideration of several predictive models.

The rational basis of the selective approach was an appeal to the scientific method, with its emphasis upon isolating the most truthful inductive hypothesis on the basis of its descriptive power for past and current observations. More recently, decision theory, with its emphasis upon using all the available information and hypotheses to support decision-making, has come to be seen by many forecasters as a more appropriate underlying theory. The group of available predictive models should be used, so it is argued, to provide the most effective forecast for supporting the decision-making process, rather than for isolating the most realistic description of the load data. Thus, in discussing the role of forecasting in planning, Hogarth and Makridakis² state that "one should use forecasts ... but not believe them." The extreme pragmatism of this approach is not uncontroversial and some of the counterarguments will be considered in later sections.
In conceptual terms, therefore, the eclectic approach follows from a rational appeal to decision theory. In theoretical terms, it reflects the fact that a more accurate prediction can be obtained from a combination of forecasts than by trying to isolate the singularly most accurate individual predictor.³ In practical terms, computational constraints have sometimes, in the past, precluded even the singular implementation of the seemingly best model,⁴ quite apart from trying to run several in parallel. This situation has now changed with the advent of new generations of cheaper, more powerful computing facilities. For example, in the late 1960s, Farmer and Potton reported that, although a spectral predictive model performed best in off-line tests, a double exponential smoothing model had to be implemented on-line at the C.E.G.B. for computational reasons. With the upgrading of computing power, the C.E.G.B. is now developing a system that will run the spectral model in parallel both with the exponential model and a regression-based predictor.⁵ In broad terms, this approach can be considered to be motivated by desires for enhanced accuracy and credibility in the forecasts. The methodological aspects of these two objectives will be considered in the following sections.

2. ACCURACY

That a more accurate forecast can be obtained from a simple linear combination of predictors follows quite readily from the sampling properties of linear estimators.⁵ Thus, if we have n forecasts of the same variable defining the n × 1 vector f, then the composite forecast f_c, based upon the n × 1 vector of linear weights, w, i.e. f_c = w′f, will be optimum in the sense of having a minimum forecast-error variance if w is determined according to

w = S⁻¹e / (e′S⁻¹e),

where e is an n × 1 unit vector, and S is the n × n covariance matrix of forecast errors between the n forecasts. The forecast-error variance of f_c will be lower than that of any of the individual predictors in f, providing S has been well estimated. However, therein lies the problem if there is only a small history of representative, stationary forecast errors. Thus, for reasons of robustness to small-sample estimation errors, it is sometimes suggested that S be treated as a diagonal matrix of forecast-error variances (i.e. the errors of the elements of f are assumed to be independent). In combining two predictors of the television-induced component of electric load, Bunn and Seigal⁶ found that the suboptimality due to adopting the independence assumption was very small. Other applications in load forecasting have apparently involved the choice to concentrate upon methods with the independence assumption, e.g. Srinivasan and Pronovost⁷ formulated an hourly predictor based upon a fourfold combination of hourly, daily, weekly, and yearly predictors, and both Gupta and Yamada⁸ and Farmer et al.⁹ combine on-line and off-line peak predictors with the independence assumption. The use of equal weights has been advocated on the grounds of impartiality, robustness, and in situations where little is known a priori on the error covariances.
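As an illustrative sketch (the numbers and variable names here are our own, not from the paper), the minimum-variance weights w = S⁻¹e/(e′S⁻¹e) can be computed directly, with the diagonal-S independence assumption available as an option:

```python
import numpy as np

def combination_weights(S, independent=False):
    """Minimum-variance weights w = S^{-1}e / (e' S^{-1} e) for combining
    n unbiased forecasts whose error covariance matrix is S.
    If independent=True, S is reduced to its diagonal of error variances
    (the independence assumption discussed in the text)."""
    S = np.asarray(S, dtype=float)
    if independent:
        S = np.diag(np.diag(S))      # keep only the forecast-error variances
    e = np.ones(S.shape[0])          # the n x 1 unit vector e
    Sinv_e = np.linalg.solve(S, e)   # S^{-1} e, without forming an explicit inverse
    return Sinv_e / (e @ Sinv_e)     # weights sum to one

# Two correlated predictors: error variances 4 and 1, error covariance 0.5
S = np.array([[4.0, 0.5],
              [0.5, 1.0]])
w = combination_weights(S)                         # -> [0.125, 0.875]
f = np.array([102.0, 98.0])                        # the two individual forecasts
f_c = w @ f                                        # composite forecast f_c = w'f -> 98.5
var_c = w @ S @ w                                  # 0.9375, below the best individual variance 1.0
w_ind = combination_weights(S, independent=True)   # -> [0.2, 0.8] under independence
```

Equal weights, the robust fallback mentioned above, correspond simply to w = e/n.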
Pickles¹⁰ achieved good performance with equal weights in combining on-line and off-line daily predictors, as also did Bunn and Seigal⁶ on the television effects, although in the latter case better results were still obtained from estimating S. It is perhaps surprising that optimal combinations based upon full estimation of S, without the independence assumption, have not been more apparent in the load-forecasting literature. In the short term, at least, shortage of data is not usually a problem and thus small-sample robustness should not be critical. Robustness to outliers and non-stationarity is an endemic problem in this area, although state-validation functions commonly eliminate outliers before they reach the forecasting facilities. Furthermore, non-stationarities in mean squared errors throughout the daily cycle are capable of being modelled. Perhaps the invocation of the independence assumption just represents the early stage of adoption of combined forecasts in the short term. In this respect, Farmer et al.⁹ justify their use of a simple, heuristic combination on the grounds of its being a first-stage implementation.

Another complicating issue is that the available forecasts are not always directly comparable, in terms of being separate estimates of the same variable. They may be disparate in both time and form. Thus, with respect to timing, a common problem⁸,⁹ is that of using an off-line peak forecast to supplement an on-line predictor, not only for the instant of the peak but also for the load profile around it. In a sense, the off-line forecasts provide extra targets for the on-line predictor to aim at. This problem is usually further complicated by the fact that the timing of daily peaks is itself uncertain. The C.E.G.B.⁹ have currently developed a heuristic weighting procedure to deal with this problem. With respect to form, the combination of spot (MW) and integrated (MWh) data is typical. In the C.E.G.B. composite predictor (Farmer et al.⁹), the off-line predictor is for 1/2-h integrated load data and is therefore combined with the on-line spot data by assuming it to be the 1/2-h average spot demand. Similarly, Tennant¹¹ faced the problem of combining methods for sales data based upon billing cycles of different periods.

The motivation of statistical accuracy clearly presupposes that such a quantitative form of evaluation is appropriate. This is more relevant to short- rather than longer-term forecasting. In the short term, there is ample feedback of data from which to develop an ex post evaluation on the basis of statistical accuracy. In the longer term, a priori considerations of credibility are usually the basis of evaluation. Thus, to a significant extent, discriminating eclectic methods according to the motivations of accuracy and credibility also reflects a distinction between short- and longer-term applications. To a
further extent, this discrimination reflects the speed and style of acceptability of the eclectic approach. Longer-term forecasting has been characterised by the use of expert opinions, highly subjective forecasts, and policy evaluations, rather than the highly empirical, statistical basis of the short-term situation. The managerial, decision-orientated nature of the longer term, with its emphasis on compromising varying perspectives, contrasts with the engineering background of the operational setting of short-term forecasting, where the scientific method is traditional. Thus, perhaps, the earlier and easier acceptability of eclectic methods in the longer term.

3. CREDIBILITY

It is useful to distinguish between the desires for (a) external and (b) internal credibility. External credibility relates to the defensibility of the forecast to outside, possibly adversarial, criticism. Internal credibility, alternatively, refers to the confidence and coherence that the forecaster, or forecasting group, has developed in its own projections. Sometimes, the joint objectives of internal and external credibility can, by themselves, foster multiple modelling. For example, highly subjective methods, which help to develop internal credibility, personal insights, and coherence resolution, may be particularly vulnerable to adversarial criticism, to the extent that an electric utility may prefer to present its forecasts to regulatory and consumer bodies in terms of more resilient models.¹²⁻¹⁶

4. EXTERNAL CREDIBILITY

In speaking on behalf of the New York State Electric & Gas Corporation, Fuller¹² stated that "Our last, and overriding, concern in forecast design is defensibility ... under strong cross examination ...". He then goes on to describe how an end-use model is less easily abused by unrealistic counter-assumptions than econometric models, and is therefore preferred for external presentation, whilst the latter is used internally for checking accuracy. On the same theme, Abromaitis¹⁴ states that "It has become essential to ... convince various outside interest groups that the forecast is indeed reasonable, from the standpoint of both approach and results," and such is the concern with external credibility that Rasmussen,¹⁶ speaking in 1981 on behalf of Kansas City Power & Light, stated that without it "... we may find the forecasting responsibility taken away from the utilities and placed under the direct control of regulatory agencies."

Several features of this concern for establishing external credibility encourage the eclectic, multiple-model approach through the apparent need to demonstrate: (a) comprehensiveness, i.e. the forecast should appear thorough in terms of having accounted for all the available information and hypotheses; (b) state of the art, i.e. in terms of meeting adversarial challenges on the competence of the forecasting facility, it is important that the methods are seen to be incorporated; (c) accommodation of adversarial models, i.e. prior, explicit consideration of the models and assumptions that the utility's critics may adopt can strengthen the defensive credibility of the forecast; and (d) a consistent view of the whole future, i.e. a priori evaluation of a longer-term forecast is often based upon the overall credibility of the forecasting facility, including short-term forecasting performance.¹⁷,¹⁸ Kim¹⁹ describes an overall integrated short-medium term forecasting facility, and both Krogh et al.²⁰ and Ross et al.²¹ discuss the nesting of very-short-term forecasts within a short-term model.

5. INTERNAL CREDIBILITY

Considerations of external and internal credibility are not entirely disjoint. A high degree of external defensibility is clearly internally reassuring, as are considerations of accuracy. However, it is possible to identify some primarily internal needs for credibility that can foster multiple modelling: (a) coherence resolution, i.e. the need for a coherent synthesis of the available evidence and models is a strong motivation to the eclectic approach; (b) flexibility, i.e. the running of several models in parallel allows a forecaster to shift the weight of evidence as new information or insights become available; (c) reassurance, i.e. multiple models can provide consistency checks on the forecast. Whilst
this objective is close to (a), it does imply the existence of a primary model that is being supported, as in the quotation of Fuller¹² in Section 4, where econometric modelling was used as a check on the forecasts from an end-use model; and (d) dialectical inquiry, i.e. the active pursuit of several models, and the consequent need to resolve the reasons why there may be disagreements, can be a spur to more insightful forecasting. Stephens²² and Torrance and Maxwell¹⁵ make this point.

6. ECLECTIC METHODS

For some purposes, it seems sufficient to present a forecast in terms of two or three different models showing consistent projections. The eclectic approach has thereby been one of resolving any initial divergencies in the views of the models (or experts). This seems to be the approach of Refs. 11, 15, 22, 24, and 25. There is little published evidence of any formal Bayesian methods of coherence resolution being used, although a need is recognised, e.g. Thomas¹³ states that "... our big challenge is to find a rational way ... to synthesize divergent views into predictions of the most probable future." That last phrase, hinting at the objective of a point estimate, shows where the need for a more explicit, quantitative form of synthesis arises. Where a point estimate is required, considerations of accuracy begin to appear and the composite methods described previously are relevant. This is in distinction to some of the more "discursive" forecasts, where point estimates are avoided and a reasonably coherent set of models can suffice. Somewhere between these methodologies is that of overlapping confidence intervals,²⁶,²⁷ whereby a set of forecasts is expressed in terms of confidence intervals and the composite forecast is the intersection of those intervals. This method clearly leaves open the question of the level of confidence that should be ascribed to the intervals.
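A minimal sketch of the overlapping-confidence-interval idea (the values below are illustrative, not taken from Refs. 26 and 27): the composite interval is simply the intersection of the individual forecast intervals.

```python
def intersect_intervals(intervals):
    """Composite forecast interval as the intersection of individual
    forecast confidence intervals, given as (lo, hi) pairs.
    Returns None if the intervals do not all overlap."""
    lo = max(l for l, _ in intervals)   # tightest lower bound
    hi = min(h for _, h in intervals)   # tightest upper bound
    return (lo, hi) if lo <= hi else None

# Three predictors' intervals for tomorrow's peak load (MW)
intervals = [(980.0, 1040.0), (1000.0, 1060.0), (990.0, 1035.0)]
composite = intersect_intervals(intervals)   # -> (1000.0, 1035.0)
```

As the text notes, this leaves open what confidence level should be ascribed to the resulting interval, and an empty intersection signals disagreement among the models.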

7. CRITICISM

Despite the wide advocacy of the eclectic approach and several demonstrations of its efficacy, the approach does come in for some criticism. The counterarguments include the following. (a) The difficulty of communicating a combined forecast. Ashburn²⁸ emphasizes the importance of simplicity in the presentation of forecasts to outside agencies. Combined forecasts are sometimes multi-modal and equivocal in their future view, which can be confusing. (b) Lack of model simplicity. A forecasting model should be an intuitively simple device to develop insights, not a clumsy aggregation of conflicting ideas. This is the model-builders' perspective and clearly relates to personal style and internal credibility. (c) A combined forecast reflects a failure to build a single comprehensive model. This is an extreme form of the two criticisms above and is quite widely held. However, whilst denigrating the eclectic approach slightly, it does nothing to invalidate it. It is recognised that combining forecasts is a pragmatic response to the multivariate problems involved in constructing a large, comprehensive model. Thus, in medium-term forecasting, a comprehensive model reflecting both time-series and econometric features has been recognised as desirable, but elusive, by several forecasters, e.g. Crow,²⁹ Aigner,³⁰ and Higley and Brannon,³¹ the latter describing its development as "formidable" (sic). However, Goelt³² has developed a desirable hybrid by estimating some of the variables in an end-use method by econometric means (although simulation was required), and Uri³³ has combined a Box-Jenkins model for peak loads with econometric estimation of some of the parameters. It is important to realise that the distinction between a combined forecast of multiple models and a single multivariate model is that the former aggregates the outputs of several models (including expert opinions), whereas a single model develops a coherent relationship between several input variables. Thus, if we had two different multiple regression models, for example, the combined forecast would be an aggregation of the two dependent-variable estimates, whereas a single comprehensive model would seek to develop a larger multiple regression model based upon all the pooled independent variables. Likewise, in combining expert opinions, a composite judgement is based upon aggregating final estimates, whereas a single model would involve rationalising a coherent
structure from all the separately explicated cognitive-process models that each expert professes. Thus, it is the robustness and facility of working with the outputs of predictive models (the "forecasts") that recommends the approach over the theoretical ideal of comprehensive model-building. Clearly, however, it cannot be used to justify superficial modelling. (d) A combined forecast reflects a failure to identify the singularly most appropriate model. Models are sometimes run in parallel as a process of identification, until the weight of evidence suggests the singularly most appropriate. The common way for this to be done is via a vector, p, of probability weights, which are revised in Bayesian fashion. Thus, the combined forecast, f_p, is the expectation f_p = p′f, which takes on the degenerate form f_p = f_i as p_i → 1, reflecting that forecasting model i is relatively the most accurate description of the load data, posterior to the available performance data. This was used by Bunn³⁴ in a model that categorised daily load curves as being based upon two types of standard profiles for cloudy and bright days. Kalscheur³⁵ also used multiple models to switch between two Box-Jenkins formulations, although on a more subjective basis than that of formal posterior probabilities. Again, however, the model-switching role of multiple modelling reflects a pragmatic response to the problem of identification. It is evident from (c) and (d) that multiple modelling is to be defended as a surrogate device, either for the construction of a single comprehensive model, or for a clear identification of an appropriate process category. It is a pragmatic response to the statistical problems involved in undertaking either ideal.
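A hedged sketch of such a Bayesian revision of the probability weights p (the Gaussian error model and all numbers below are our own assumptions, not taken from Ref. 34):

```python
import math

def update_weights(p, forecasts, actual, sigmas):
    """One Bayesian revision step for the model-probability vector p.
    Model i issues forecast forecasts[i]; for this sketch its errors are
    assumed Gaussian with known standard deviation sigmas[i]. Posterior
    weight is proportional to prior times likelihood of the observed load."""
    likelihoods = [
        math.exp(-0.5 * ((actual - f) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for f, s in zip(forecasts, sigmas)
    ]
    posterior = [pi * li for pi, li in zip(p, likelihoods)]
    total = sum(posterior)
    return [q / total for q in posterior]  # renormalised so the weights sum to 1

# Two rival models (e.g. "bright" vs "cloudy" daily profiles), equal priors
p = [0.5, 0.5]
for actual, fs in [(100.0, (101.0, 95.0)), (103.0, (102.5, 97.0))]:
    p = update_weights(p, fs, actual, sigmas=(2.0, 2.0))
# p now heavily favours model 0, whose forecasts tracked the observed loads

# Combined forecast f_p = p'f degenerates toward the better model's forecast
f_p = p[0] * 102.0 + p[1] * 96.0
```

As p_i approaches 1 the combination degenerates to model i alone, which is the model-switching behaviour described above.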
These two roles of model aggregation and model switching sometimes require careful clarification.³⁶ The model-switching concept clearly imposes a partition upon the set of predictive models, in the sense that one, and only one, is deemed appropriate at a particular instant in time (the problem is knowing which one). This partitioning of the models is sensible if they relate to mutually exclusive assumptions or constructions (bright or cloudy days, for example), if the models are nested (e.g. polynomials of differing orders), or if there is a pattern-recognition element in the modelling. Model switching is not appropriate when the ultimate desire is not to select one of the forecasts, but to aggregate them together in order to combine the best features of each. However, even in circumstances when the concept of model aggregation is more sensible than that of model switching, the latter methodology may be used for pragmatic reasons when f_c is not expected to work well (e.g. with the inclusion of poor models and/or highly positive correlations). Finally, it is worth observing that, with the more general acceptability of eclectic methods, the way that multiple forecasts are used may have to be defended as carefully as the construction of the individual forecasts themselves. There is now a large literature on the methodology of combining forecasts, outside that reviewed in this study from the electric utility industry, and it shows a wide diversity of techniques. There is a range of methods based upon accuracy³⁶⁻⁴⁰ or coherence⁴⁰⁻⁴³ objectives, but as yet no clear and simple insights into their appropriate realms of application. Nevertheless, it is hoped that this article has helped to identify the types of objectives and practical considerations involved in using multiple predictors, from which a broader perspective on the subject can be developed.

Acknowledgement - The author wishes to thank the C.E.G.B. for research support during the period that this study was undertaken.

REFERENCES

1. M. Abu-el Magd and N. Sinha, IEEE Trans. SMC 12, 370 (1982).
2. R. Hogarth and S. Makridakis, Mgmt Sci. 27, 115 (1981).
3. D. W. Bunn and E. Kappos, Eur. J. Opl Res. 9, 173 (1982).
4. E. D. Farmer and M. J. Potton, Proc. IEE 115, 1549 (1968).
5. M. Halperin, Am. Stat. Ass. J. 56, 36 (1961).
6. D. W. Bunn and J. P. Seigal, J. Opl Res. Soc. 34, 17 (1983).
7. K. Srinivasan and R. Pronovost, IEEE Trans. PAS 19, 1854 (1975).
8. P. Gupta and K. Yamada, Proc. IEEE Winter Meeting, p. 2085, New York (1972).
9. E. D. Farmer, W. D. Laing, A. M. Adatia, A. B. Baker and D. W. Bunn, Proc. 7th PSCC, Lausanne, p. 588 (1981).


10. J. H. Pickles, RD/L/N 115/74, Central Electricity Research Laboratories, Kelvin Avenue, Leatherhead, U.K. (1974).
11. R. L. Tennant, EPRI EA-1035-SR (1979), Palo Alto, CA 94304, U.S.A.
12. K. E. Fuller, EPRI EA-1035-SR (1979), Palo Alto, CA 94304, U.S.A.
13. B. Thomas, EPRI EA-1035-SR (1979), Palo Alto, CA 94304, U.S.A.
14. S. C. Abromaitis, EPRI EA-1035-SR (1979), Palo Alto, CA 94304, U.S.A.
15. J. M. Torrance and L. C. Maxwell, EPRI EA-1035-SR (1979), Palo Alto, CA 94304, U.S.A.
16. C. C. Rasmussen, EPRI EA-2471 (1982), Palo Alto, CA 94304, U.S.A.
17. R. S. Bower, EPRI EA-1729-SR (1981), Palo Alto, CA 94304, U.S.A.
18. G. Kim, EPRI EA-2471 (1981), Palo Alto, CA 94304, U.S.A.
19. G. Kim, EPRI EA-2471 (1982), Palo Alto, CA 94304, U.S.A.
20. B. Krogh, E. S. de Llinas and D. Lesser, IEEE Trans. PAS 101, 3283 (1982).
21. D. W. Ross, G. B. Ackerman, R. Bishke, R. Podmore and K. D. Wall, PICA 198 (1979).
22. G. O. Stephens, EPRI EA-1035-SR (1979), Palo Alto, CA 94304, U.S.A.
23. H. Hirsch, EPRI EA-1729-SR (1981), Palo Alto, CA 94304, U.S.A.
24. J. L. Sweeney, EPRI EA-1729-SR (1981), Palo Alto, CA 94304, U.S.A.
25. C. Broder, EPRI EA-2471 (1982), Palo Alto, CA 94304, U.S.A.
26. W. G. Michaelson and R. B. Comerford, PICA 247 (1977).
27. R. B. Comerford and C. W. Gellings, IEEE PES Summer Meeting, San Francisco, p. 4656 (1982).
28. R. Ashburn, EPRI EA-2471 (1982), Palo Alto, CA 94304, U.S.A.
29. R. T. Crow, EPRI EA-1035-SR (1979), Palo Alto, CA 94304, U.S.A.
30. D. J. Aigner, EPRI EA-1729-SR (1981), Palo Alto, CA 94304, U.S.A.
31. B. Higley and F. Brannon, EPRI EA-2471 (1982), Palo Alto, CA 94304, U.S.A.
32. A. Goelt, EPRI EA-2471 (1982), Palo Alto, CA 94304, U.S.A.
33. N. D. Uri, Cycles 27, 59 (1976).
34. D. W. Bunn, Appl. Math. Modelling 4, 113 (1980).
35. R. J. Kalscheur, EPRI EA-1035-SR (1979), Palo Alto, CA 94304, U.S.A.
36. D. W. Bunn, J. Opl Res. Soc. 32, 213 (1981).
37. J. M. Bates and C. W. J. Granger, Opl Res. Q. 20, 451 (1969).
38. C. W. J. Granger and P. Newbold, Forecasting Economic Time Series, Academic Press, New York (1977).
39. C. W. J. Granger and R. Ramanathan, J. Forecasting 3, 197 (1984).
40. R. F. Bordley, J. Opl Res. Soc. 33, 171 (1982).
41. P. A. Morris, Mgmt Sci. 20, 1233 (1974).
42. P. A. Morris, Mgmt Sci. 23, 679 (1977).
43. D. Lindley, Ops Res. 31, 866 (1983).