A commentary on the M2-Competition
The M2-Competition: Some Personal Reflections,
Terence C. Mills, University of Hull, Hull, UK.
My forecasts for the M2-Competition were developed using an ARIMA-plus-intervention methodology, the interventions being decided upon after an initial, exploratory analysis of the individual series. No additional information was used in the construction of the forecasts. The company series to be forecast were very different in nature: many, but not all, were non-stationary and their seasonal patterns were often complex, on a number of occasions requiring multiplicative seasonal polynomials, with either 3- or 4-month patterns needing to be superimposed on the annual seasonality. Logarithms were used for all but one of the Squibb series; otherwise no transformations other than differencing were applied. As a consequence, most of the models finally arrived at were considerably more complex than, say, the conventional ‘airline model’ traditionally thought to provide an adequate fit to non-stationary, seasonal data. Moreover, for some of the series, choosing a model took a number of attempts, so that the initial model-fitting exercise was quite costly in terms of time and effort. Interventions were required for all but the Animal Motor Co. series but, to allay fears about data mining, no more than two interventions were ever needed for the series from any of the other companies. All models passed standard diagnostic checks for residual autocorrelation, but their within-sample forecastability varied widely, from just 10% to over 70%, using an R²-type criterion. Nevertheless, when the models were updated after a year, few alterations were needed in their specifications and parameter estimates remained fairly stable. The relative forecasting ability of these models was, unfortunately, not particularly good, being approximately the same as automatic Box-Jenkins and somewhat worse than the smoothing methods!
The US macroeconomic series, being seasonally adjusted, were rather easier to fit, with interventions only being required for the ‘business inventories’ series, whose levels then followed an AR(1) process. All others were modelled by either AR(1) or AR(2) processes fitted to the first differences of the logarithms, drift terms also being needed for each of them. In-sample forecastability varied widely, from 14% for real GNP to almost 50% (aided by the two interventions) for business inventories. Again, little change was found on updating after a year. It was in forecasting these series that the methodology proved its worth, for the forecasts had the lowest MAPEs of all methods at all forecasting horizons.
I found fitting the various series a valuable experience: selecting a model certainly became as much an art as a science, in the sense that sample and partial autocorrelation functions rarely fell into conventional patterns, so that the usual identification techniques were of only limited use. Whether the effort that went into selecting the set of models is reflected in their out-of-sample forecasting performance remains, however, a matter of some conjecture: while the methodology produced excellent forecasts for the macroeconomic series, it performed less satisfactorily for the company series.
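For readers who would like the macroeconomic specification in concrete form, the following is a minimal sketch of an AR(1) process with drift fitted to the first differences of the logarithms, together with a MAPE calculation. It is not the software or the estimator used for the competition forecasts: ordinary least squares is swapped in here for simplicity, the function names are hypothetical, and the R-squared-style measure returned is only one possible reading of the in-sample forecastability criterion mentioned above.

```python
import numpy as np


def fit_ar1_with_drift(levels):
    """Fit y_t = c + phi * y_{t-1} + e_t by OLS, where y_t = diff(log(levels)).

    Returns (c, phi, r2); r2 is an R-squared-style in-sample measure
    (an illustrative assumption, not the criterion used in the commentary).
    """
    y = np.diff(np.log(np.asarray(levels, dtype=float)))
    # Regress y_t on a constant (the drift term) and its own first lag.
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    (c, phi), *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    resid = y[1:] - X @ np.array([c, phi])
    r2 = 1.0 - resid.var() / y[1:].var()
    return c, phi, r2


def forecast_levels(levels, c, phi, horizon):
    """Iterate the fitted AR(1) forward and map growth rates back to levels."""
    levels = np.asarray(levels, dtype=float)
    growth = np.log(levels[-1]) - np.log(levels[-2])
    log_level = np.log(levels[-1])
    path = []
    for _ in range(horizon):
        growth = c + phi * growth      # next forecast log-difference
        log_level += growth            # accumulate back to the log level
        path.append(np.exp(log_level))
    return np.array(path)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))


if __name__ == "__main__":
    # Simulated series, purely for illustration; not competition data.
    rng = np.random.default_rng(0)
    growth = [0.02]
    for _ in range(119):
        growth.append(0.01 + 0.5 * growth[-1] + rng.normal(scale=0.01))
    series = 100.0 * np.exp(np.cumsum(growth))
    train, test = series[:-8], series[-8:]
    c, phi, r2 = fit_ar1_with_drift(train)
    print(c, phi, r2, mape(test, forecast_levels(train, c, phi, 8)))
```

Holding back the last observations, as in the usage lines, mirrors the distinction drawn above between in-sample fit and out-of-sample accuracy: the first is computed on the fitting data, the second only on the withheld values.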
Personal views of the M2-Competition,
J. Keith Ord, Pennsylvania State University, University Park, PA, USA; Pamela A. Geriner, Mitre Corp., USA; David Reilly, Automatic Forecasting Systems, USA; Robert Winkel, Bureau voor Statistische and OA, Netherlands.
During the course of the M2-Competition, the four of us collaborated in preparing forecasts using AUTOBOX. Although AUTOBOX can operate in a purely automatic modeling mode, each participant was free to go his or her own way. The teamwork was organized so that each series was examined by two investigators and the final model selection was done by Keith Ord as coordinator. The M2-Competition provides some interest-