Neural Networks, Vol. 7, No. 1, pp. 203-204, 1994. Copyright © 1994 Elsevier Science Ltd. Printed in the USA. All rights reserved. 0893-6080/94 $6.00 + .00
Pergamon
LETTERS TO THE EDITOR
An Inappropriate Use of Neural Networks for Forecasting
Editor: The forecasting of multivariate time series using neural networks as described by Chakraborty et al. (1992) is inappropriate in at least the following three aspects.
1. The authors compared the results from the neural network models they used with the ones from the "well-known autoregressive moving average model" given by Tiao and Tsay (1989) and found that their neural network approach leads to better predictions. However, the Tiao-Tsay (1989) model is presented to "reveal possibly hidden simplifying structures of the [multivariate time series] process" (Tiao & Tsay, 1989, p. 157), not to forecast. Thus, it is not surprising to see its poor forecasting performance in Figures 21, 24, 27, and 30 of Chakraborty et al. (1992). This comparison fails to indicate how much better a neural network model would be if compared to a genuine statistical forecasting model.
2. The 6-6-1 neural network model used by the authors has no better prediction power than a simple model: $x_{t+1} = x_t$, $y_{t+1} = y_t$, and $z_{t+1} = z_t$. This model's mean square errors ($\times 10^3$) for the last 10-month period of the three cities are 1.094, 1.291, and 1.030, better than the 6-6-1 model's 3.101, 3.169, and 2.067, respectively (Chakraborty et al., 1992, Table 2, p. 970). This is also clearly indicated in Figure 4 (p. 964), where the values generated by the 6-6-1 neural network model lag the actual values by approximately one month.
3. The nearly perfect forecasts of the 8-8-1 neural network model are not the results of the "prediction power of feedforward neural networks," but rather the results of using $x_t$ (or $y_t$) as an input variable in the model for predicting $y_t$ (or $z_t$). The variables $x_t$, $y_t$, and $z_t$ are monthly price indices that become available at the end (or some cut-off date) of each month. If a neural network model can accurately forecast a month's $y_t$ only at the end of the month when $x_t$ becomes available, this is very weak forecasting. Furthermore, in the assumed circumstances of implicit ordering (p. 964), we question the method for generating the $x_t$'s in Figure 5 (p. 964). The seemingly accurate forecasting results of the 8-8-1 neural networks merely reflect the fact that there is "a 'global' price with local variations" among the three cities, as suggested by Grubb (Tiao & Tsay, 1989, p. 203). It seems unnecessary to use neural network models to unveil that "there is a strong pairwise correlation between the data for the three cities."
Qing Hu
David B. Hertz
Department of Computer Information Systems
University of Miami
Coral Gables, FL 33124
REFERENCES
Chakraborty, K., Mehrotra, K., Mohan, C. K., & Ranka, S. (1992). Forecasting the behavior of multivariate time series using neural networks. Neural Networks, 5(6), 961-970.
Tiao, G., & Tsay, R. (1989). Model specification in multivariate time series. Journal of the Royal Statistical Society, B 51(2), 157-213.
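The comparison in point 2 of the letter rests on a naive "persistence" baseline, in which each month's forecast is simply the previous month's value. A minimal sketch of how such a baseline's mean square error over the last few months of a series could be computed is given here. The function names and the synthetic series are illustrative assumptions; the series is a generated random walk, not the flour price data, so the printed value has no relation to the figures quoted in the letter.

```python
import random


def mean_squared_error(actual, predicted):
    """Average squared difference between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def persistence_mse(series, last_n):
    """MSE of the naive forecast x[t+1] = x[t] over the last `last_n` points.

    Each of the final `last_n` observations is "forecast" by the value
    immediately preceding it.
    """
    if last_n < 1 or last_n >= len(series):
        raise ValueError("last_n must be between 1 and len(series) - 1")
    actual = series[-last_n:]
    predicted = series[-last_n - 1:-1]  # the same window shifted back one step
    return mean_squared_error(actual, predicted)


def synthetic_index(n, seed=0):
    """Hypothetical monthly index: a seeded random walk (NOT the real data)."""
    rng = random.Random(seed)
    value, out = 1.0, []
    for _ in range(n):
        value += rng.gauss(0.0, 0.03)
        out.append(value)
    return out


if __name__ == "__main__":
    series = synthetic_index(100)
    # Error of the naive baseline on the last 10 "months" of the synthetic series.
    print(persistence_mse(series, last_n=10))
```

Any candidate forecasting model can be scored with `mean_squared_error` on the same final window; a model whose output is only a one-step lagged copy of the input would score the same as this baseline.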
Response to Letter by Q. Hu and D. B. Hertz
In response to comments from Professors Hu and Hertz on our paper Forecasting the Behavior of Multivariate Time Series Using Neural Networks:
1. Models such as the autoregressive moving average model (Tiao & Tsay, 1989) can be and are used for forecasting problems. Modeling helps understand a phenomenon; this helps predict its future behavior.
2. The model that the 6-6-1 network has learned to approximate is admittedly very simple (lagging 1 month); however, this does not deny the fact that the network did learn the relation, and the model was not supplied by the network-builder. Examining the learnability of more complex time-series problems would be a good idea.
3. Our paper describes both kinds of networks: one where the most recent available information is used, and another where it is not used. Blaming the former for using the most recent available information is not very useful. For predicting $x_{t+1}$, neither $y_{t+1}$ nor $z_{t+1}$ was used: we fail to understand the critics' confusion regarding the method for generating the $x_t$'s (Figure 5, p. 964). That there is a global price with local variations is a special case of the fact that the value of one variable can be predicted using recent values of other variables;
formulating the problem in the latter manner allows us to attempt to formulate exactly what relationship exists between the different variables.
4. Most importantly, our paper describes a framework in which some multivariate time series can be modelled via neural networks for forecasting; Tiao and Tsay's data are used for illustration. The main thrust of the criticism appears to be that the problem we analyzed is too simple, and that simple models are available that perform well enough for these tasks. This criticism would be valid if the neural networks used were excessively large, or if information about the non-neural models were available to the network-builder. This is not the case with the networks in our paper; by contrast, networks used for many real-life problems have fairly large sizes, with tens or hundreds of hidden nodes. Neural networks are touted as universal approximators, with the caveat "given
enough nodes": for specific classes of real-life problems, we need to examine whether sufficiently small neural networks can perform adequately well. Our work shows that simple neural network models can be applied quite successfully to multivariate problems such as the multicity commodity price prediction problem. Why did we pick the flour price problem? "Because it was there," or a matter of historical chance. Should we examine more complex problems? By all means, yes!
K. Chakraborty
K. Mehrotra
C. Mohan
S. Ranka
Syracuse University
School of Computer and Information Science
Center for Science and Technology
Syracuse, NY 13244-4100