Socio-Econ. Plann. Sci. Vol. 21, No. 4, pp. 239-243, 1987
Printed in Great Britain. All rights reserved
0038-0121/87 $3.00 + 0.00
Copyright © 1987 Pergamon Journals Ltd
SIMPLE RULES FOR COMBINING FORECASTS: SOME EMPIRICAL RESULTS

JEFFREY L. RINGUEST
Administrative Sciences Department, Boston College, Chestnut Hill, MA 02167, U.S.A.

and

KWEI TANG
Department of Quantitative Business Analysis, Louisiana State University, Baton Rouge, LA 70803-6316, U.S.A.

(Received 1 October 1986)
Abstract-This study is an empirical comparison of three rules for aggregating forecasts. The three combined forecasts evaluated are: a simple average forecast, a median forecast and a focus forecast. These combined forecasts are compared over four economic variables (housing starts, the index of industrial production, the unemployment rate and gross national product) using a set of previously published forecasts. The results indicate that an average forecast will not perform as well as previous studies indicate if all or most of the individual forecasts tend to over- or under-predict simultaneously. The median forecast also seems to be suspect in this case. There is little evidence to suggest that the median forecast is a viable alternative to the mean forecast. Focus forecasting, however, is found to perform well for all four variables. The evidence indicates that focus forecasting is a reasonable alternative to simple averaging.
INTRODUCTION

Business and economic decisions are often influenced by the ability of the decision maker to obtain accurate forecasts of a variety of variables. Thus methods of improving forecasts may be of great importance. Many approaches have been used to enhance the accuracy of forecasts. For example, the accuracy of forecasts can often be improved by gathering more and better data for building the forecasting models or by using more sophisticated models. These avenues, however, are not available to the decision maker who simply acquires forecasts from some supplier (e.g. a state econometrician who obtains forecasts of national variables from one or many national econometric models, or a production planner who is given forecasts of product demand by the marketing department). One avenue that is available to these decision makers is the aggregation of forecasts.

Aggregating or combining forecasts is attractive for many reasons. For simple combining rules (i.e. those, such as a simple average, that are not based on past performance) it is not necessary to maintain any additional historical data. Moreover, the decision maker may have solicited forecasts from many sources; rather than choosing which forecast is best, the decision maker can combine them. Further, in most situations there will not be one single forecast which is always best.

Bates and Granger [1], Makridakis et al. [2], Makridakis and Winkler [3], Newbold and Granger [4], Winkler and Makridakis [5] and others have examined the aggregation of econometric forecasts. These
studies indicate that combined forecasts generally produce better forecasts than any of the individual forecasts in the combination. These studies further show that much of this improvement can be obtained by using a simple average to combine the forecasts and that only a few forecasts need be included in the average.

Further support for using simple combining rules can be extrapolated from the studies conducted by Mentzer and Cox [6] and Sparkes and McHugh [7]. In both studies users of forecasting techniques were surveyed to determine those techniques most commonly used. Both studies conclude that, other than accuracy, users mentioned ease of use as the primary advantage of the techniques they chose, and that the more sophisticated a technique is, the less likely it is to be used. Thus simple combining rules are more likely to be adopted in practice than complex ones.

The present study uses the data collected by Gulledge et al. [8] to evaluate three simple rules for combining forecasts. In addition to a simple average or mean, this study examines an alternate measure of central tendency, the median forecast. The median is often used as a "distribution-free" measure of central tendency. In combining forecasts, the median would seem to be a good alternative to the mean if the set of forecasts to be combined contains one or two particularly bad forecasts (i.e. outliers). The third procedure included in this study is the focus forecasting procedure developed by Smith and Wright [9]. This procedure, which has received relatively little attention in the literature, was originally developed for inventory control. The idea behind focus forecasting is simple: the forecasting technique which
provided the most accurate forecast for the current period is used to forecast the next period.

The accuracy of each combining rule is investigated, as is the effect of the number of forecasts available on the accuracy of each combination. The variables in this data set are national economic variables; namely, the index of industrial production, gross national product, housing starts and the unemployment rate. Five of the forecasts for each of these variables are from professional forecasting services that are nationally available. Thus, it would be impossible for the user of these forecasts to directly influence the accuracy of the forecasting model.

DATA
The data were taken from an earlier study by Gulledge et al. [8]. In their study econometric forecasts were selected from the Business Forecasts publications that are compiled annually by the Federal Reserve Bank of Richmond [10]. The forecasts presented in each issue are for five quarters: the fourth quarter of the previous year and the four quarters of the current year. The forecasts are not published on the same day, but most are dated late December or early January; it is therefore not clear whether the fourth quarter forecast for the previous year is actually a forecast or merely a summary of preliminary data. For this reason the previous year forecasts were not included in the data set. Each annually published forecast summary was reduced to a series of four ex-ante quarterly forecasts for a particular year. Included in the study were the years 1978-1983.

The index of industrial production, gross national product, housing starts, and the unemployment rate were the variables included in the study. Twenty-four quarterly forecasts for each of these were obtained from the Business Forecasts publication for the forecasting models listed in Table 1. In addition, actual values of each variable were collected from the Citibank data base [11]. These data were used by Gulledge et al. to build univariate Box-Jenkins models for each of the four variables. These ARIMA forecasts represent the sixth entry in Table 1. In this study, the actual values were also used to evaluate the accuracy of the combined forecasts.
DESIGN OF THE STUDY
A number of studies in the literature [1-5] demonstrate the potential advantages of aggregating forecasts. The general form of this aggregate forecast is:

$$f_c = w_1 f_1 + w_2 f_2 + \cdots + w_n f_n$$

where $f_c$ is the combined forecast, $f_j$ is the forecast from the jth method, $w_j$ is the weight applied to the jth forecast and n is the number of forecasts available for combining.
These studies show that a combined forecast will often predict more accurately than any of the individual forecasts that are incorporated in the combined forecast. Further, these studies show that while there is an optimal combined forecast [12, pp. 268-278], a simple average of the forecasts will frequently provide a near-optimal combination.

The intent of this study was to compare three rules for combining forecasts. These rules can be expressed mathematically as follows:

1. Simple average: $w_j = 1/n$; $j = 1, 2, \ldots, n$ (n = the number of forecasts available).
2. Median: $w_j = 1$ if the jth forecast has the median forecast value; $w_j = 0$ otherwise.
3. Focus forecasting: $w_j = 1$ if the jth forecast has the minimum absolute forecast error in the preceding period; $w_j = 0$ otherwise.
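The three rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names and the small data set are hypothetical, the median of an even number of forecasts is taken as the midpoint of the two middle values (the behaviour of `statistics.median`; the paper does not state its convention), ties in the focus rule go to the first source, and no focus forecast is produced for the first period because it has no preceding period.

```python
from statistics import mean, median


def simple_average(period_forecasts):
    """Rule 1: equal weights w_j = 1/n for one period's forecasts."""
    return mean(period_forecasts)


def median_forecast(period_forecasts):
    """Rule 2: the median of one period's forecasts.

    Assumption: for an even n the two middle values are averaged.
    """
    return median(period_forecasts)


def focus_forecast(forecasts_by_period, actuals):
    """Rule 3: in period t, use the source with the smallest absolute
    error in period t-1. Returns forecasts for periods 1..T-1 only.
    """
    combined = []
    for t in range(1, len(actuals)):
        prev_errors = [abs(f - actuals[t - 1]) for f in forecasts_by_period[t - 1]]
        best = prev_errors.index(min(prev_errors))  # assumption: first source wins ties
        combined.append(forecasts_by_period[t][best])
    return combined


# Hypothetical data: three sources, three periods (not taken from the study).
actuals = [100.0, 102.0, 105.0]
forecasts_by_period = [
    [98.0, 101.0, 104.0],
    [103.0, 100.0, 106.0],
    [104.0, 107.0, 109.0],
]

print([simple_average(p) for p in forecasts_by_period])
print([median_forecast(p) for p in forecasts_by_period])
print(focus_forecast(forecasts_by_period, actuals))
```

Each rule is a special case of the weighted sum $f_c = \sum_j w_j f_j$: the simple average fixes the weights in advance, while the median and focus rules place all the weight on a single forecast chosen period by period.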
Table 1. Forecasts that were available for inclusion in the aggregate forecast

1. American Statistical Association-National Bureau of Economic Research (ASA)
2. Chemical Bank (CHEM)
3. Fidelity Bank (FID)
4. Kent Economic and Development Institute, Inc. (KENT)
5. Morgan Guaranty Trust of New York (MGT)
6. Univariate Time Series Model (ARIMA)

Source: Business Forecasts, Federal Reserve Bank of Richmond.

The criterion typically used for comparing forecasts is forecast accuracy. Most of the earlier studies use mean absolute percent error (MAPE) as the measure of accuracy. MAPE is defined as:

$$\mathrm{MAPE} = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{Y_t - \hat{Y}_t}{Y_t} \times 100 \right|$$

where $Y_t$ = the actual value at time t, $\hat{Y}_t$ = the predicted value at time t, and N = the number of periods. Thus, the comparisons in this study were made by constructing aggregate forecasts for each of the three combining rules and each of the four economic variables. A MAPE was then computed for each forecast and each variable.

An additional objective of this study was to determine the sensitivity of each combined forecast to the number and quality of forecasts available to be
combined. In other words, the aim was to examine how much the accuracy of the aggregate forecast would deteriorate if there were fewer forecasts available for combination. To observe this phenomenon, the three combined forecasts were constructed for subsets of five, four, three and two forecasts. These subsets of the data included all possible combinations of five out of the original six forecasts, four out of the original six forecasts, three out of the original six forecasts and two out of the original six forecasts. (The six forecasts available in the complete data set are listed in Table 1.) Thus, 57 forecasts (6 + 15 + 20 + 15 subsets, plus the full set of six) were constructed for each of the three combining rules and each of the four economic variables. A MAPE was computed for each of these 684 forecasts. Finally, the mean, variance and high and low MAPE were computed for each forecast variable and each value of n.

RESULTS

The results of this study are summarized in Tables 2-6.

Table 2. MAPE for each individual forecast and each economic variable

Economic variable               ASA    CHEM   FID    KENT   MGT    ARIMA
Housing starts                  19.33  20.73  14.22  20.69  12.62  17.60
Index of industrial production  3.93   3.60   3.02   3.85   3.73   3.95
Unemployment rate               8.03   9.80   7.92   8.46   9.34   9.52
Gross national product          3.18   2.85   2.56   2.50   2.53   2.06

Table 3. Statistics on MAPE for housing starts

Number of  Mean forecast                   Median forecast                 Focus forecast
forecasts  Average  Std dev  High   Low    Average  Std dev  High   Low    Average  Std dev  High   Low
2          16.75    2.02     20.55  12.94  16.75    2.02     20.55  12.94  14.54    2.24     17.84  11.27
3          16.35    1.41     19.36  13.59  16.46    1.84     19.50  13.00  13.08    1.84     17.29  11.47
4          16.21    1.05     17.55  14.41  16.35    1.41     18.25  14.07  11.63    0.37     12.20  11.04
5          16.04    0.69     16.91  15.21  16.28    1.00     17.72  14.84  11.58    0.54     12.56  11.05
6          16.02    0        16.02  16.02  16.18    0        16.18  16.18  11.23    0        11.23  11.23

Table 4. Statistics on MAPE for the index of industrial production

Number of  Mean forecast                   Median forecast                 Focus forecast
forecasts  Average  Std dev  High   Low    Average  Std dev  High   Low    Average  Std dev  High   Low
2          3.56     0.20     3.87   3.22   3.56     0.20     3.87   3.22   3.25     0.29     3.89   2.82
3          3.52     0.15     3.81   3.28   3.59     0.24     3.94   3.17   3.07     0.24     3.70   2.77
4          3.52     0.11     3.73   3.36   3.58     0.16     3.86   3.32   3.06     0.19     3.36   2.70
5          3.48     0.08     3.63   3.40   3.56     0.17     3.86   3.28   2.88     0.16     3.21   2.75
6          3.48     0        3.48   3.48   3.54     0        3.54   3.54   2.83     0        2.83   2.83

Table 5. Statistics on MAPE for unemployment rate

Number of  Mean forecast                   Median forecast                 Focus forecast
forecasts  Average  Std dev  High   Low    Average  Std dev  High   Low    Average  Std dev  High   Low
2          8.38     0.57     9.53   7.36   8.83     0.57     9.53   7.36   7.53     0.44     8.29   6.89
3          8.11     0.50     8.99   7.12   7.89     0.50     8.85   7.07   7.30     0.45     8.32   6.71
4          8.05     0.33     8.56   7.42   7.88     0.33     8.20   7.20   7.32     0.37     7.92   6.63
5          7.89     0.26     8.24   7.52   7.61     0.31     8.12   7.20   6.88     0.17     7.17   6.67
6          7.82     0        7.82   7.61   7.61     0        7.61   7.61   6.59     0        6.59   6.59

Table 2 presents the mean absolute percent error (MAPE) for each of the four economic variables and each of the six individual forecasts that were available for inclusion in the aggregate forecast. These MAPEs represent a baseline for comparison with the combined forecasts. Table 3 presents the average, standard deviation, and high and low MAPE for each of the three combined forecasts for the variable housing starts. These values were computed for all combinations of two, three, four, five and six forecasts taken at a time. Tables 4, 5 and 6 present analogous results for the variables index of industrial production, unemployment rate and gross national product respectively.

Table 3 shows that for this set of forecasts and the variable housing starts, using either a mean or a median forecast will not provide substantial improvement in forecast accuracy. In fact, in this case both the mean and median forecasts are on average inferior to the best individual forecast in terms of MAPE. On the other hand, the focus forecast does provide improved forecast accuracy. The focus forecast performs better than the best individual forecast for all
combinations that include at least four individual forecasts.

For the index of industrial production the results are very similar. Table 4 shows that the mean and median forecasts are again inferior on average to the best individual forecast in terms of forecast accuracy as measured by MAPE. Focus forecasting, however, performs better on average than the best individual forecast if at least five forecasts are available for combination.

Much greater improvement in forecast accuracy can be achieved for the variable unemployment rate than for either of the two previous variables. Table 5 indicates that the mean forecast on average has a smaller MAPE than the best individual forecast if at least five forecasts are available for aggregation. In this case, the median forecast outperforms the mean forecast. Here the median forecast will on average have a lower MAPE than the best individual forecast as long as at least three individual forecasts are present for combination. The focus forecasts show even greater improvement in forecast accuracy. In all cases the focus forecasts have a reduced average MAPE as compared to the best individual forecast. Further, if at least four individual forecasts are available for combination, any of the possible focus forecast combinations will improve the forecast accuracy.

The results for gross national product are given in Table 6. Here it can be seen that in all cases the average MAPE for the mean forecasts is very close to the MAPE for the best individual forecast. The median forecast does not fare as well, however. In all cases, the median forecast has a higher average MAPE than the MAPE of the best individual forecast. The focus forecasts again show the greatest improvement in forecast accuracy. Based on average MAPE, focus forecasting performed better than the best individual forecast regardless of the number of forecasts available for aggregating. In fact, when at least four individual forecasts were combined, each of the possible focus forecasts resulted in a reduction in MAPE.

Table 6. Statistics on MAPE for gross national product

Number of  Mean forecast                   Median forecast                 Focus forecast
forecasts  Average  Std dev  High   Low    Average  Std dev  High   Low    Average  Std dev  High   Low
2          2.52     0.27     2.98   2.09   2.52     0.27     2.98   2.52   2.26     0.35     2.84   1.74
3          2.50     0.19     2.83   2.21   2.64     0.14     2.90   2.39   2.03     0.33     2.56   1.61
4          2.54     0.13     2.69   2.29   2.63     0.10     2.13   2.44   2.00     0.32     2.37   1.55
5          2.49     0.08     2.63   2.35   2.65     0.07     2.73   2.55   1.71     0.22     2.19   1.57
6          2.49     0        2.49   2.49   2.65     0        2.65   2.65   1.57     0        1.57   1.57

CONCLUSIONS

The results of this study are somewhat contrary to the earlier research. In the comparisons presented here the mean forecast did not perform as well as has been previously reported. This anomaly can very likely be attributed to the particular data set being used. In both cases where the mean forecast performed particularly poorly (housing starts and the index of industrial production) there were periods where all of the individual forecasts either over-predicted or under-predicted. Thus, the mean forecast would have a forecast error larger than one or more of the individual forecasts. Clearly, a simple average will work best if in each period some of the forecasted values are larger than the actual value while some are smaller. This same argument holds for the median forecast. Therefore a measure of the central tendency of the forecasts may not yield a good combined forecast if it is expected that all or most of the forecasts are likely to err on the same side.

One result of this study is consistent with the earlier research. From Tables 3-6 it can be seen that most of the advantage of simple averaging can be achieved by combining only a few individual forecasts. This is clearly an advantage of this combining rule. Focus forecasting, on the other hand, requires that a larger number of individual forecasts be available in order to achieve improved forecast performance. Clearly this can be a significant disadvantage of focus forecasting.

In general, the focus forecasting procedure seemed to be a viable alternative to simple averaging. Focus forecasting has a clear advantage over measures of central tendency in cases where all of the individual forecasts are likely to simultaneously over- or under-predict. Inherent in the focus forecasting procedure is a greater data maintenance burden: the method requires that a larger data base of individual forecasts be maintained.

There is little evidence to indicate that the median forecast has any significant advantage over the simple average. In cases where a measure of the central tendency of the candidate forecasts is likely to perform well, the simple average is still the rule of choice. The lesser data maintenance requirements of this procedure will often be a significant advantage.
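The same-side argument made in the conclusions can be checked with a small numerical sketch. The figures below are hypothetical and chosen only so that every forecast over-predicts in every period; they are not data from the study, and the helper name `mape` is an assumption. When all errors share one sign, the error of the simple average in each period is the average of the individual errors, so its MAPE equals the average of the individual MAPEs and cannot beat the best individual forecast.

```python
from statistics import mean, median


def mape(actuals, predictions):
    """Mean absolute percent error, in percent."""
    return 100.0 * mean(abs((y - p) / y) for y, p in zip(actuals, predictions))


# Hypothetical data: three sources, four periods, every forecast too high.
actuals = [100.0, 110.0, 120.0, 130.0]
forecasts = {
    "A": [104.0, 113.0, 126.0, 133.0],
    "B": [108.0, 121.0, 128.0, 142.0],
    "C": [112.0, 125.0, 135.0, 150.0],
}

individual_mape = {name: mape(actuals, f) for name, f in forecasts.items()}
best_individual = min(individual_mape.values())

# Regroup the forecasts by period, then combine within each period.
by_period = list(zip(*forecasts.values()))
mean_mape = mape(actuals, [mean(p) for p in by_period])
median_mape = mape(actuals, [median(p) for p in by_period])

print(individual_mape)
print(mean_mape, median_mape, best_individual)
```

In this constructed example the median always coincides with source B, so it too is worse than the best individual source (A). With forecasts falling on both sides of the actual value the errors would partly cancel, which is the situation in which the paper expects simple averaging to work best.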
REFERENCES

1. J. Bates and C. W. J. Granger. The combination of forecasts. Opl Res. Q. 20, 451-468 (1969).
2. S. Makridakis, A. Andersen, R. Carbone, R. Fildes, M. Hibon, R. Lewandowski, J. Newton, E. Parzen and R. Winkler. The accuracy of extrapolation (time series) methods: results of a forecasting competition. J. Forecasting 1, 111-153 (1982).
3. S. Makridakis and R. Winkler. Averages of forecasts: some empirical results. Mgmt Sci. 29, 987-997 (1983).
4. P. Newbold and C. W. J. Granger. Experience with forecasting univariate time series and the combination of forecasts. J. R. Statist. Soc. Series A 137, 131-146 (1974).
5. R. Winkler and S. Makridakis. The combining of forecasts. J. R. Statist. Soc. Series A 146 (Part 2), 150-157 (1983).
6. J. T. Mentzer and J. E. Cox, Jr. Familiarity, application, and performance of sales forecasting techniques. J. Forecasting 3, 27-36 (1984).
7. J. R. Sparkes and A. K. McHugh. Awareness and use of forecasting techniques in British industry. J. Forecasting 3, 37-42 (1984).
8. T. R. Gulledge Jr, J. L. Ringuest and J. A. Richardson. Subjective evaluation of composite econometric policy inputs. Socio-Econ. Plann. Sci. 20, 51-55 (1986).
9. B. T. Smith and O. W. Wright. Focus Forecasting: Computer Techniques for Inventory Control. CBI, Boston (1978).
10. S. D. Baker (Editor). Business Forecasts. Richmond, Virginia: Federal Reserve Bank of Richmond (1978-1983).
11. N. A. Citibank. CITIBASE: Citibank Economic Database. New York, N.Y. Socio-Econ. Plann. Sci. 20, 55 (1986).
12. C. W. J. Granger and P. Newbold. Forecasting Economic Time Series. Academic Press, New York (1977).