Threshold-autoregressive, median-unbiased, and cointegration tests of purchasing power parity


International Journal of Forecasting 14 (1998) 171–186

Walter Enders*, Barry Falk

Department of Economics, Iowa State University, Ames, Iowa 50011, USA

Abstract

We use Dickey-Fuller tests, threshold autoregressive unit-root tests, median unbiased estimators, and cointegration tests for I(1) and I(2) variables to examine the validity of Purchasing Power Parity (PPP). The within-sample tests generally lead to the rejection of long-run PPP. Long-term out-of-sample forecasts assuming various forms of long-run PPP are not especially better than those assuming that real rates contain a unit-root. We show that no one method emerges as the "best" in the sense that it provides the smallest out-of-sample forecast errors. © 1998 Elsevier Science B.V.

Keywords: Comparative Methods; Exchange Rates; Unit Roots; Threshold Model

1. Introduction

The central premise of Purchasing Power Parity (PPP) is that the rate of foreign currency appreciation should equal the inflation differential between the domestic and foreign country. Certainly, nations with high inflation rates do have depreciating currencies. Moreover, the terms "overvalued" and "undervalued" typically make reference to a currency's PPP level. Despite the theory's intuitive appeal, there is still widespread disagreement on the matter of mean reversion in real exchange rates. For example, Adler and Lehman (1983), Enders (1988), and Mark (1990) show that the real exchange rates of industrialized nations exhibit large fluctuations with inordinately slow rates of decay. If there is any mean reversion, the point estimates for the decay factors average close to 2% per month; the implication is that any deviation from PPP has a half-life approximating three years.

A possible explanation for the empirical shortcomings of PPP concerns the low power of standard Dickey and Fuller (1979) unit-root tests to discriminate between non-stationary and near-stationary processes. Given the estimated rates of convergence in the real exchange rate series, it is not surprising that there is controversy concerning the issue of whether real rates are slow-decaying or unit-root processes. At first sight, the difference between slow-decaying and unit-root processes might not seem important. However, if a real exchange rate has a unit-root, deviations from PPP are permanent such that the expected rate of foreign currency appreciation is not equal to the inflation differential between the domestic and foreign country.

One aim of this paper is to invoke a number of newly developed econometric methods that can potentially discriminate between unit-root and near unit-root behavior in the real exchange rate series of the industrialized nations. Specifically, we use Dickey and Fuller (1979) tests, Enders and Granger's (1997) threshold autoregressive (TAR) unit-root tests, Fuller's (1995) median unbiased estimators, and Johansen's (1995) cointegration tests for I(1) and I(2) variables. The appropriate use of each methodology is described and applied to the various real exchange rate series.

The second aim of the paper is to compare the various methods concerning their ability to generate accurate forecasts. We show that no one method emerges as the "best" in the sense that it provides the smallest out-of-sample forecast errors. Overall, the results are not particularly favorable to the PPP hypothesis. The within-sample tests generally lead to the rejection of long-run PPP. Long-term out-of-sample forecasts assuming various forms of long-run PPP are not especially better than those assuming that real rates contain a unit-root.

*Corresponding author.

0169-2070/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII: S0169-2070(98)00025-9

2. Tests for unit roots

To illustrate the issues involved, consider the following econometric model of Purchasing Power Parity:

$$p_t^* + e_t - p_t = r_t \tag{1}$$

where: $p_t^*$ = logarithm of the index of the foreign price level in period t; $e_t$ = logarithm of the domestic currency price of foreign exchange in period t relative to a base year; $p_t$ = logarithm of the index of the domestic price level in period t; and $r_t$ is a stochastic disturbance representing a deviation from PPP. Note that $r_t$ is typically called the "real" exchange rate or "real price" of foreign exchange.

The rationale for writing $p_t^*$, $e_t$ and $p_t$ as left-hand-side variables is to emphasize the fact that all three are jointly determined. Viewed in this way, PPP is not a "Theory of The Exchange Rate" any more than it is a "Theory of Differential Inflation Rates." The long-run version of PPP implies that $r_t$ is stationary. After all, if $r_t$ is not stationary, deviations from PPP contain a permanent component such that any discrepancy from PPP is never fully eliminated.

Using monthly data from the CD-ROM version of the International Monetary Fund's International Financial Statistics we obtained Consumer Price Index (CPI) and nominal exchange rate data for Belgium, Canada, France, Germany, Greece, Italy, Japan, Luxembourg, the Netherlands, Spain, Switzerland, the United Kingdom, and the United States. The data cover the period from January 1973 through December 1996. Using, in turn, the United States, Germany, and the United Kingdom as the "foreign" country, we constructed real exchange rates in accord with Eq. (1). The four panels of Fig. 1 show the real U.S. exchange rate with each of the other twelve countries through December 1993. (As explained later, three years of data are withheld in order to perform out-of-sample forecasts.) In general, the real dollar fell throughout the 1970s, rose until the mid-1980s and then began a decline towards its 1970s level. From the figure, it is clear that deviations from PPP are large and persistent. We focus on the issue of whether or not there is actually mean reversion present in the data. Also note that all of the U.S. real rates with the European nations seem to move in tandem. We also address the issue of whether or not PPP performs better for the European nations than for the U.S.

In order to test whether the real exchange rates are stationary, we performed unit-root tests on each of the series using the following procedure:

Step 1: Using the U.S., Germany, and the U.K. as the base country, we constructed real exchange rates as indicated by Eq. (1) and applied OLS to estimate regression equations of the form:

$$\Delta r_t = \alpha_0 + \rho\, r_{t-1} + \alpha_1 \Delta r_{t-1} + \alpha_2 \Delta r_{t-2} + \cdots + \alpha_k \Delta r_{t-k} + \varepsilon_t \tag{2}$$

Note that a deterministic trend is not included in Eq. (2) since the presence of a trend is inconsistent with long-run PPP. We retained three years of data in order to perform out-of-sample forecasts and estimated the regression over the sample period 1973:1–1993:12. The key feature to note in Eq. (2) is the value of $\rho$. If $-2 < \rho < 0$, the real exchange rate sequence will revert to a long-run mean. However, if it is not possible to reject the null hypothesis $\rho = 0$, the $\{r_t\}$ sequence will not be stationary.

Step 2: One problem in implementing Eq. (2) is the determination of the lag length k. We selected lag lengths using the Bayesian Information Criterion (BIC). Note that virtually identical results are obtained using the Akaike Information Criterion (AIC).

Step 3: Given the lag length determined in Step 2, we tested the null hypothesis $\rho = 0$. Under the null hypothesis of a unit-root (i.e., $\rho = 0$), the distribution of the OLS estimator of $\rho$ is non-standard. As such, we use the critical values tabulated by Dickey and Fuller (1979). The estimated values of $\rho$ with the associated t statistics are reported in Table 1. Notice that all point estimates of $\rho$ are extremely close to zero. Using Belgium and the U.S. as an example, the estimated value of $\rho$ is −0.014 and the t statistic is −1.61. Given that the sample contains 250 usable observations, the critical values at the 1%, 5%, and 10% significance levels are −3.36, −2.87, and −2.57, respectively. Since the estimated t statistic falls substantially short of these critical values, it is not possible to reject the null hypothesis of a unit-root at conventional significance levels.

Fig. 1. Real US exchange rates.

Table 1. OLS estimates of $\rho$ for the Dickey-Fuller test (columns give the base country)

| Country | United States | Germany | United Kingdom |
|---|---|---|---|
| Belgium | −0.014 (−1.61) | −0.020 (−1.84) | −0.015 (−1.74) |
| Canada | −0.011 (−1.29) | −0.026 (−2.18) | −0.024 (−2.20) |
| France | −0.020 (−1.84) | −0.051** (−3.14) | −0.028 (−2.32) |
| Germany | −0.019 (−1.86) | NA | −0.020 (−1.99) |
| Greece | −0.022 (−1.85) | −0.080** (−3.22) | −0.027 (−1.84) |
| Italy | −0.019 (−1.91) | −0.015 (−1.66) | −0.035* (−2.60) |
| Japan | −0.011 (−1.31) | −0.010 (−1.33) | −0.015 (−1.48) |
| Luxembourg | −0.014 (−1.59) | −0.021 (−1.81) | −0.014 (−1.71) |
| Netherlands | −0.020 (−1.90) | −0.030 (−2.13) | −0.020 (−1.97) |
| Spain | −0.014 (−1.65) | −0.020 (−1.71) | −0.038* (−2.59) |
| Switzerland | −0.026 (−2.25) | −0.017 (−2.11) | −0.037* (−2.60) |
| United Kingdom | −0.028 (−2.33) | −0.020 (−1.99) | NA |

Note: t statistics in parentheses. *, Rejection of the null hypothesis of a unit-root at the 10% significance level. **, Rejection of the null hypothesis of a unit-root at the 5% significance level.

Notice the following features of the table:

1. All point estimates of $\rho$ imply an extremely slow rate of convergence. For the U.S., the point estimates of $\rho$ range from −0.028 (for the U.K.) to −0.011 (for Japan). Note that a point estimate of $\rho = -0.02$ implies a half-life of any discrepancy from PPP of slightly over 34 months.

2. Purchasing Power Parity does not hold between the United States and any of the other countries. Even the most negative t statistic (i.e., the value of t between the U.S. and the U.K. is −2.33) exceeds the 10% critical value of −2.57.

3. Although Fig. 1 gives the impression that PPP might hold within the set of European countries, this conclusion is unwarranted. For Germany, the unit-root hypothesis is rejected only for France and Greece at the 10% and 5% levels of significance. It is tempting to argue that PPP should hold between Germany and France because of their involvement in the Exchange Rate Mechanism (ERM) and their close economic ties. However, this appears to be an ex post rationalization since PPP fails between Germany and all of the other European countries but Greece.

4. For the U.K., at the 10% level of significance the unit-root hypothesis is rejected for Italy, Spain, and Switzerland. Using a 5% significance level, PPP fails between the U.K. and all of the other countries examined.

One possible explanation for the general rejection of PPP concerns the nature of the testing process. It is well-known that unit-root tests have very low power when the series in question is close to behaving as a random walk. Given the point estimates in Table 1, it is possible that PPP holds but that the Dickey-Fuller test does not have sufficient power to reject a false null hypothesis of a unit-root. A number of recent papers have used panel data in order to enhance the power of the Dickey-Fuller test.
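
As an illustration of Steps 1–3 and of the half-life figure quoted in item 1, the following Python sketch estimates Eq. (2) by OLS for a given lag length k and converts a point estimate of $\rho$ into a half-life. It is a minimal sketch under stated assumptions, not the authors' code: the function names `adf_regression` and `half_life` are hypothetical, the lag length is supplied by the user rather than selected by the BIC, and the half-life formula treats the decay as first-order, so that a deviation shrinks by the factor $(1 + \rho)$ each month.

```python
import numpy as np

def adf_regression(r, k):
    """OLS estimate of rho in Eq. (2) and its t statistic, with k lagged differences.

    The t statistic has a non-standard distribution under the unit-root null,
    so it should be compared with Dickey-Fuller critical values.
    """
    r = np.asarray(r, dtype=float)
    dr = np.diff(r)                          # dr[j] = r[j+1] - r[j]
    y = dr[k:]                               # dependent variable: Delta r_t
    cols = [np.ones_like(y), r[k:-1]]        # intercept and r_{t-1}
    for i in range(1, k + 1):
        cols.append(dr[k - i:len(dr) - i])   # lagged difference Delta r_{t-i}
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return coef[1], coef[1] / se[1]

def half_life(rho):
    """Number of periods h solving (1 + rho)**h = 0.5 (first-order decay assumed)."""
    return np.log(0.5) / np.log(1.0 + rho)

if __name__ == "__main__":
    # rho = -0.02 corresponds to the half-life of slightly over 34 months cited in the text.
    print(round(float(half_life(-0.02)), 1))
```

Because the regression includes an intercept but no trend, the sketch mirrors the specification in Eq. (2); choosing k by an information criterion would require looping over candidate lag lengths on a common sample.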
To understand the issues involved, suppose that there are n real exchange rates in a sample and consider the joint estimation of the system:

$$\Delta r_{jt} = \alpha_{0j} + \rho\, r_{jt-1} + \sum_{i=1}^{k} \alpha_i \Delta r_{jt-i} + \varepsilon_{jt}, \qquad j = 1, \ldots, n \tag{3}$$

where the subscript j is the index for country j. Levin and Lin (1992) provide the critical values for the null hypothesis $\rho = 0$. Even though the value of $\rho$ is constrained to be equal across equations, they show that the panel estimate of $\rho$ can have substantially increased power relative to the Dickey-Fuller test. The first set of papers using panel unit-root tests, such as Jorion and Sweeney (1996), Oh (1996), Wei and Parsley (1995), and Wu (1996), generally reject the null hypothesis $\rho = 0$ for various panels. However, the critical values of Levin and Lin (1992) assume that the error terms in Eq. (3) are both serially uncorrelated and contemporaneously uncorrelated. Given the tendency of the various real rates to move in the same direction, this assumption is not warranted. Papell (1997) finds that allowing for serial correlation substantially weakens the evidence in favor of long-run PPP. O'Connell (1998) uses Generalized Least Squares to eliminate any contemporaneous correlation in the error structure and finds no evidence supporting PPP. At this point, it is fair to say that the panel studies do not provide any strong evidence in favor of PPP. In the remainder of the paper we discuss three additional methods that might be more appropriate for testing long-run PPP.

3. Threshold autoregressive (TAR) models

The standard Dickey-Fuller test is designed for linear models displaying symmetric adjustment. However, the test is misspecified if adjustment is asymmetric. Consider an alternative specification, called the threshold autoregressive (TAR) model, such that:

$$\Delta r_t = I_t \rho_1 [r_{t-1} - \bar{r}] + (1 - I_t) \rho_2 [r_{t-1} - \bar{r}] + \varepsilon_t \tag{4}$$

where:

$$I_t = \begin{cases} 1 & \text{if } r_{t-1} \ge \bar{r} \\ 0 & \text{if } r_{t-1} < \bar{r} \end{cases}$$

and $\bar{r}$ is the long-run mean of the $\{r_t\}$ sequence. A sufficient condition for the stationarity of $\{r_t\}$ is: $-2 < (\rho_1, \rho_2) < 0$. As such, $\bar{r}$ is the long-run equilibrium value of the sequence. Whenever $r_{t-1}$ is above its long-run equilibrium value, the adjustment is $\rho_1 [r_{t-1} - \bar{r}]$, and whenever $r_{t-1}$ is below long-run equilibrium, the adjustment is $\rho_2 [r_{t-1} - \bar{r}]$. Since adjustment is symmetric if $\rho_1 = \rho_2$, symmetric adjustment can be considered a special case of Eq. (4). Moreover, if the sequence is stationary, the least squares estimates of $\rho_1$ and $\rho_2$ have an asymptotic multivariate normal distribution.¹

Enders and Granger (1997) develop a test statistic that can be used to test the null hypothesis of a unit-root against an alternative of stationarity with asymmetric adjustment. The power of the test is compared with the power of the Dickey-Fuller test. If adjustment is actually symmetric, the Dickey-Fuller test has greater power than the new test. However, within a range of adjustment parameters relevant to many economic time-series, the power of the new test is substantially greater than that of the corresponding Dickey-Fuller test. In working with specifications such as Eq. (4), diagnostic checks of the residuals (such as the correlogram of the residuals and Ljung-Box tests) and various model selection criteria (such as the AIC or BIC) can be used to determine the appropriate lag length (see Tong, 1983).

It seems natural to apply the threshold model to the real exchange rate series. After all, there is a substantial literature suggesting that price levels rise more readily than they fall. It may be that a nation's price level rises to partially eliminate a positive discrepancy from PPP but does not fall in response to a negative discrepancy. To allow for the possibility of asymmetric adjustment in price levels and real exchange rates, for each real rate series we perform the following four steps:

Step 1: Regress the real exchange rate series on a constant and save the residuals in the sequence $\{\hat{r}_t\}$. For each value $\hat{r}_t$, we set the indicator function $I_t$ according to whether $\hat{r}_t$ is positive or negative.

Step 2: We estimate a regression equation in the form:²

$$\Delta \hat{r}_t = I_t \rho_1 \hat{r}_{t-1} + (1 - I_t) \rho_2 \hat{r}_{t-1} + \varepsilon_t \tag{5}$$

¹ Tong (1983) contains the proof that the least squares estimates of $\rho_1$ and $\rho_2$ have an asymptotic multivariate normal distribution. This result easily generalizes to higher-order autoregressive processes. Tong (1990) also develops many of the properties of the TAR model.

² Since the $\{\hat{r}_t\}$ sequence consists of regression residuals, no deterministic components are included in Eq. (5).

Next, we obtain the sample values of the F statistic for the null hypothesis $\rho_1 = \rho_2 = 0$ and compare these sample statistics with the appropriate critical values developed in Enders and Granger (1997). With 250 observations the critical value of the F statistic (called $\Phi_\mu$) is 3.75 at the 10% significance level and 4.56 at the 5% significance level.

Step 3: If the alternative hypothesis (of stationarity) is accepted, it is possible to test for symmetric versus asymmetric adjustment since the estimates of $\rho_1$ and $\rho_2$ converge to multivariate normal distributions. As such, the restriction that adjustment is symmetric (i.e., the null hypothesis: $\rho_1 = \rho_2$) can be tested using the usual F statistic.

Step 4: Diagnostic checking of the residuals should be undertaken to ascertain whether the $\{\varepsilon_t\}$ sequence can reasonably be characterized by a white-noise process. If the residuals are correlated, return to Step 2 and re-estimate the model in the form:

$$\Delta \hat{r}_t = I_t \rho_1 \hat{r}_{t-1} + (1 - I_t) \rho_2 \hat{r}_{t-1} + \beta_1 \Delta \hat{r}_{t-1} + \cdots + \beta_{k-1} \Delta \hat{r}_{t-k+1} + \varepsilon_t \tag{6}$$
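
The four steps above can be sketched in Python for the simplest case of Eq. (5), with no lagged differences. This is a minimal illustration rather than the authors' implementation: the function name `tar_unit_root_test` is hypothetical, the indicator is set from the lagged residual as in Eq. (4), and the returned statistic is an ordinary F ratio for $\rho_1 = \rho_2 = 0$ that must be compared with the Enders and Granger (1997) critical values (3.75 and 4.56 for 250 observations) rather than with standard F tables.

```python
import numpy as np

def tar_unit_root_test(r):
    """Estimate Eq. (5) and return (rho1, rho2, F statistic for rho1 = rho2 = 0)."""
    r = np.asarray(r, dtype=float)
    resid = r - r.mean()                   # Step 1: residuals from a regression on a constant
    lag = resid[:-1]                       # r_hat_{t-1}
    d = np.diff(resid)                     # Delta r_hat_t
    ind = (lag >= 0).astype(float)         # indicator: 1 above the mean, 0 below
    X = np.column_stack([ind * lag, (1.0 - ind) * lag])
    coef, *_ = np.linalg.lstsq(X, d, rcond=None)   # Step 2: OLS on the two regimes
    e = d - X @ coef
    ssr_unrestricted = e @ e
    ssr_restricted = d @ d                 # under the null, Delta r_hat_t = error term
    n, q = len(d), 2
    f_stat = ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / (n - q))
    return coef[0], coef[1], f_stat

if __name__ == "__main__":
    # Hypothetical data: a stationary series with faster adjustment from below.
    rng = np.random.default_rng(0)
    x = np.zeros(250)
    for t in range(1, 250):
        rho = -0.1 if x[t - 1] >= 0 else -0.4
        x[t] = x[t - 1] + rho * x[t - 1] + rng.standard_normal()
    print(tar_unit_root_test(x))
```

Extending the sketch to Eq. (6) would only require appending lagged values of the differenced residuals as extra regressors while keeping the same two-restriction F test.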

Lag lengths can be determined by an analysis of the regression residuals and/or using model selection criteria such as the AIC or BIC.

The results, shown in Table 2, reinforce the results in Table 1. The point estimates for $\rho_1$ and $\rho_2$ all indicate very slow speeds of adjustment. For all real U.S. rates, the $\Phi_\mu$ statistic does not allow rejection of the null hypothesis $\rho_1 = \rho_2 = 0$. For example, the calculated value of the $\Phi_\mu$ statistic for the U.S./U.K. rate (the largest value of $\Phi_\mu$ involving the U.S.) is 2.75; at the 10% significance level, the critical value is 3.75.

For Germany, the null hypothesis of a unit-root can be rejected only for the French and Greek real rates. Respectively, the calculated values of 5.31 and 5.19 are significant at the 5% level. Given that PPP holds in these two instances, it is meaningful to test the hypothesis of symmetric adjustment. The F test for the null hypothesis $\rho_1 = \rho_2$ has a p value of 0.388 for France and 0.857 for Greece. As such, the null hypothesis of symmetric adjustment cannot be rejected at conventional significance levels.

For the U.K., only the real rate with Italy supports the PPP hypothesis. The calculated value of $\Phi_\mu = 4.20$ exceeds the 10% critical value of 3.75. However, the null hypothesis of symmetric adjustment cannot be rejected; the F test for the null hypothesis $\rho_1 = \rho_2$ has a p value of 0.205. In contrast to the Dickey-Fuller tests, notice that the Enders-Granger tests using the U.K. real rates with Spain and Switzerland are not supportive of PPP. This might be a result of the fact that the Enders-Granger test has less power than the Dickey-Fuller test when adjustment is actually symmetric.

Another variant of the threshold model that has received considerable attention is:

$$\Delta r_t = \begin{cases} -\theta\rho + \rho\, r_{t-1} + \varepsilon_t & \text{if } r_{t-1} > \theta \\ \varepsilon_t & \text{if } |r_{t-1}| \le \theta \\ \theta\rho + \rho\, r_{t-1} + \varepsilon_t & \text{if } r_{t-1} < -\theta \end{cases} \tag{7}$$

In Eq. (7), the real exchange rate acts as a random walk within a band of width $2\theta$. Davutyan and Pippenger (1990) argue that the real exchange rate might follow such a threshold process as a result of a fixed transactions cost. Simply put, mean-reversion does not occur unless there are sufficient gross returns from commodity arbitrage. Pippenger and Goering (1993) show that the Dickey-Fuller test has low power to reject the null hypothesis of a unit-root if the size of the band is large (so that a large proportion of the observations fall in the random-walk regime) and if the value of $\rho$ is close to zero (so that there is little autoregressive decay).

Unfortunately, there is no simple way to test for mean reversion within a band-TAR framework unless the bounds of the band are known in advance. If the band must be estimated from the data, any test of mean-reversion is actually a joint test concerning the value of $\rho$ and the value of $\theta$. As a result, Balke and Fomby (1996) and van Dijk and Franses (1995) recommend a two-step procedure for modeling band-TAR processes. First, determine whether the process is mean-reverting using a Dickey-Fuller type of test in which symmetric adjustment is the alternative hypothesis. Second, if the null of a unit-root is rejected, estimate the thresholds and autoregressive coefficients using the type of methodology developed in Tsay (1989). Note that it is possible to generalize Eq. (7) such that (i) the bounds of the band are not symmetric about the long-run mean and (ii) the speed of adjustment on one side of the band differs from that on the other. In such circumstances, Enders and Granger (1997) show that the $\Phi_\mu$ statistic can have enhanced power over the Dickey-Fuller test in testing for band-TAR adjustment. Given the negative results concerning PPP using both the Dickey-Fuller test and the $\Phi_\mu$ statistic, it is reasonable to conclude that band-TAR adjustment is not supported by the empirical evidence.

Table 2. Estimates of the threshold model (columns give the base country; each cell reports $\rho_1$ / $\rho_2$ / $\Phi_\mu$ / Signif($\rho_1 = \rho_2$))

| Country | United States | Germany | United Kingdom |
|---|---|---|---|
| Belgium | −0.011 / −0.019 / 1.41 / 0.65 | −0.030 / −0.012 / 2.01 / 0.43 | −0.019 / −0.012 / 1.58 / 0.70 |
| Canada | −0.008 / −0.015 / 0.92 / 0.67 | −0.027 / −0.025 / 2.38 / 0.94 | −0.026 / −0.022 / 2.44 / 0.98 |
| France | −0.013 / −0.031 / 2.02 / 0.42 | −0.063 / −0.035 / 5.31** / 0.39 | −0.030 / −0.026 / 2.71 / 0.86 |
| Germany | −0.012 / −0.034 / 2.25 / 0.31 | NA | −0.020 / −0.021 / 1.97 / 0.96 |
| Greece | −0.018 / −0.030 / 1.84 / 0.63 | −0.084 / −0.075 / 5.19** / 0.86 | −0.037 / −0.020 / 1.86 / 0.56 |
| Italy | −0.015 / −0.024 / 1.93 / 0.64 | −0.021 / −0.009 / 1.58 / 0.53 | −0.022 / −0.058 / 4.20* / 0.21 |
| Japan | −0.019 / −0.004 / 1.30 / 0.36 | −0.014 / −0.005 / 1.05 / 0.58 | −0.022 / −0.010 / 1.27 / 0.56 |
| Luxembourg | −0.011 / −0.019 / 1.34 / 0.69 | −0.028 / −0.016 / 1.79 / 0.59 | −0.018 / −0.011 / 1.54 / 0.69 |
| Netherlands | −0.015 / −0.028 / 1.99 / 0.53 | −0.033 / −0.025 / 2.30 / 0.78 | −0.023 / −0.018 / 1.96 / 0.82 |
| Spain | −0.013 / −0.015 / 1.37 / 0.92 | −0.031 / −0.009 / 1.89 / 0.35 | −0.028 / −0.051 / 3.67 / 0.42 |
| Switzerland | −0.026 / −0.027 / 2.54 / 0.96 | −0.016 / −0.019 / 2.22 / 0.87 | −0.041 / −0.033 / 3.41 / 0.77 |
| United Kingdom | −0.024 / −0.031 / 2.75 / 0.78 | −0.020 / −0.021 / 1.97 / 0.96 | NA |

*, Denotes rejection of the null hypothesis of a unit-root at the 10% significance level. **, Denotes rejection of the null hypothesis of a unit-root at the 5% significance level.

Note: $\Phi_\mu$ is the value of the F statistic for the null hypothesis $\rho_1 = \rho_2 = 0$. Enders and Granger (1997) calculate the critical values (for 250 observations) to be 3.75 at the 10% significance level and 4.56 at the 5% significance level. Signif($\rho_1 = \rho_2$) is the significance level of the F statistic for the null hypothesis $\rho_1 = \rho_2$. For a stationary series, this statistic has the conventional F distribution.
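
To make the band-TAR dynamics of Eq. (7) concrete, the following Python sketch simulates such a process. It is offered only as an illustration under stated assumptions: the function names `band_tar_drift` and `simulate_band_tar`, the normally distributed errors, and the parameter values in the usage example are all hypothetical and are not taken from the paper.

```python
import numpy as np

def band_tar_drift(prev, rho, theta):
    """Deterministic part of Delta r_t in Eq. (7), given r_{t-1} = prev."""
    if prev > theta:
        return -theta * rho + rho * prev   # reversion toward the upper edge of the band
    if prev < -theta:
        return theta * rho + rho * prev    # reversion toward the lower edge of the band
    return 0.0                             # random walk inside the band

def simulate_band_tar(T, rho, theta, sigma=1.0, seed=0):
    """Simulate T observations of the band-TAR process starting from r_0 = 0."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, T)
    r = np.zeros(T)
    for t in range(1, T):
        r[t] = r[t - 1] + band_tar_drift(r[t - 1], rho, theta) + eps[t]
    return r

if __name__ == "__main__":
    path = simulate_band_tar(T=250, rho=-0.05, theta=3.0)
    inside = np.mean(np.abs(path[:-1]) <= 3.0)
    print(f"share of observations in the random-walk regime: {inside:.2f}")
```

Widening the band or moving $\rho$ toward zero raises the share of observations that behave like a random walk, which is the situation in which Pippenger and Goering (1993) report low power for the Dickey-Fuller test.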

4. Median unbiased estimates

The Dickey-Fuller and Enders-Granger tests have low power for near unit-root processes. Recently, Fuller (1995, pp. 578–582) developed an estimator of $\rho$ that is (median) unbiased when $\rho$ is equal to zero and nearly unbiased for the other points in the parameter space. For expositional purposes, we first consider a first-order model and reparameterize Eq. (2) as:

$$r_t = \alpha + \beta r_{t-1} + \varepsilon_t \quad \text{so that } \beta = 1 + \rho \tag{8}$$

The basic insight is that OLS provides downward biased estimates of $\beta$ in the [0,1] interval and the magnitude of the bias increases with $\beta$ (according to $E[\hat{\beta} - \beta] \approx -(1 + 3\beta)/T$). In principle, it is possible to reduce this bias by inflating the OLS estimate of $\beta$ by an amount commensurate with the sample size and with the proximity of the estimate to unity. To translate this idea into practice, Fuller (1995) develops a procedure that uses the weighted symmetric (WS) least squares estimator of $\beta$ rather than the OLS estimator and then adjusts the estimate based on the WS t statistic for the restriction that $\beta = 1$ (rather than making an adjustment based on the estimate of $\beta$ itself).³

³ The weighted symmetric estimator of $\beta$ is described in Fuller (1995). The idea is that corresponding to the standard backward autoregressive model $r_t = \alpha + \beta r_{t-1} + \varepsilon_t$ is a forward autoregressive model $r_t = \alpha + \beta r_{t+1} + v_t$ where $v_t$ is an i.i.d. sequence with the same mean and variance (but not necessarily the same distribution) as the $\varepsilon_t$ sequence. The weighted symmetric estimator uses the data $r_1, \ldots, r_T$ to minimize a weighted average of $\varepsilon_2, \ldots, \varepsilon_T$ and $v_1, \ldots, v_{T-1}$. The motivation for this procedure is to account for information about the process contained in the initial observation, $r_1$, in a more satisfactory manner than OLS provides. This is likely to be particularly important when $\beta$ is close to one. Further discussion of this estimator and the details of its construction are provided in Fuller (1995, pp. 413–416). An analogous median unbiased estimator can be constructed based on the OLS estimator of $\beta$, although it will not perform as well as the modified WS estimator in other respects. The modified OLS estimator adjusts the OLS estimator of $\beta$ according to (11) but replacing the adjustment function (12) with: [...] where $\hat{\tau}_1$ is the OLS t statistic associated with the restriction $\beta = 1$.
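
The downward bias of OLS described above can be checked numerically. The following Python sketch is a small Monte Carlo experiment, not part of the original analysis: the function name `ols_ar1_bias`, the standard normal errors, the burn-in of 50 observations, and the choice of $\beta = 0.9$ with T = 100 in the usage example are all illustrative assumptions.

```python
import numpy as np

def ols_ar1_bias(beta, T, reps=2000, seed=0):
    """Average of (OLS estimate of beta) - beta over simulated AR(1) samples of length T.

    Each regression includes an intercept, as in Eq. (8).
    """
    burn = 50                                        # discard start-up observations
    rng = np.random.default_rng(seed)
    e = rng.standard_normal((reps, T + burn))
    r = np.zeros((reps, T + burn))
    for t in range(1, T + burn):
        r[:, t] = beta * r[:, t - 1] + e[:, t]       # all replications advance together
    r = r[:, burn:]
    y, x = r[:, 1:], r[:, :-1]
    yc = y - y.mean(axis=1, keepdims=True)           # demeaning is equivalent to an intercept
    xc = x - x.mean(axis=1, keepdims=True)
    beta_hat = (xc * yc).sum(axis=1) / (xc ** 2).sum(axis=1)
    return beta_hat.mean() - beta

if __name__ == "__main__":
    beta, T = 0.9, 100
    print("simulated bias:     ", ols_ar1_bias(beta, T))
    print("-(1 + 3*beta) / T:  ", -(1 + 3 * beta) / T)
```

Repeating the experiment with values of $\beta$ closer to one should show the simulated bias growing in magnitude, which is the motivation for inflating the estimate.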

The actual procedure is to fit the model

$$\hat{r}_t = \beta \hat{r}_{t-1} + \varepsilon_t \tag{9}$$

by weighted symmetric least squares, where $\hat{r}_t$ is the demeaned value of the $\{r_t\}$ sequence. Then, compute the WS t statistic $\hat{\tau}_1$ corresponding to the restriction that $\beta = 1$, i.e.,

$$\hat{\tau}_1 = (\hat{\beta} - 1) / [V(\hat{\beta})]^{0.5} \tag{10}$$

where $\hat{\beta}$ is the WS estimator of $\beta$ and $V(\hat{\beta})$ is its estimated variance. The modified WS estimator of $\beta$ is

$$\tilde{\beta} = \hat{\beta} + c(\hat{\tau}_1) [V(\hat{\beta})]^{0.5} \tag{11}$$

where

$$c(\hat{\tau}_1) = \begin{cases} -\hat{\tau}_1 & \text{if } \hat{\tau}_1 \ge -1.2 \\ 0.035672\,(\hat{\tau}_1 + 7.0)^2 & \text{if } -7.00 \le \hat{\tau}_1 \le -1.2 \\ 0 & \text{if } \hat{\tau}_1 \le -7.00 \end{cases} \tag{12}$$

An estimate of $\alpha$ is recovered by using the fact that $\alpha/(1 - \beta) = E(r)$, replacing $\beta$ with $\tilde{\beta}$ and $E(r)$ with the sample mean.

Fuller (1995) shows that this estimator is nearly median unbiased across all permissible values of $\beta$ when the sample size is as small as 50. It is less biased than the unadjusted weighted symmetric (and, therefore, the OLS) estimator of $\beta$ when $\beta$ is greater than 0.75, even for samples as large as 200. Furthermore, when $\beta$ is greater than 0.9 and sample size is on the order of 200, the mean squared error of the modified WS estimator is substantially smaller than the mean squared error of the unmodified WS estimator. For example, when $\beta = 0.95$ and T = 200, the median of the unmodified WS estimator is 0.935 and the median of the modified WS estimator is 0.953. Moreover, the mean squared error of the modified WS estimator is almost 25% less than that for the unmodified WS estimator.

The procedure is easily generalized to allow for an AR(k) process that is permitted to have up to one unit-root. That is, suppose:

W. Enders, B. Falk / International Journal of Forecasting 14 (1998) 171 – 186

O g Dr

k21

r t 5 a 1 b r t 21 1

i

t 21

1 et

(13)

i51

Table 3 Comparison of point estimates of AR(2) models of real exchange rates r t 5 a 1 b r t 21 1gDr t 21 1 et

a

In this case, fit the WS least squares regression of rˆ t on rˆ t 21 , Drˆ t 21 , . . . , Drˆ t 2k 11 to obtain initial estimates of b, g1 , . . . ,gk 21 . Then, using the WS t statistic for b corresponding to the restriction that b 51, modify the WS estimator of b using the same formula that is applied in the AR(1) case. Finally, re-estimate g1 , . . . ,gk21 by a regression of r t 2 b˜ r t 21 on Dr t 21 ,...,Dr t 2k 11 . For inference purposes, Fuller recommends treating the modified WS estimators of b, g1 , . . . gk 2 1 as normally distributed unbiased random variables, using the estimated standard errors from the firststage WS regression in place of their actual standard errors except that the estimated standard error for b should be increased by 15%. Then, for example, an approximate 95% confidence interval for b can be constructed from the intersection of [ b˜ 2 ˆ bˆ ))0.5 , b˜ 1 2.30(V( ˆ bˆ ))0.5 ] and [0,1]. 2.30(V( To consider the implications of the modified WS estimator for exchange rate models, the logs of the real exchange rate series were fit as AR(2) models using three estimators: OLS in levels, OLS in first differences, and modified WS least squares. To conserve space, we focus on five of the real exchange rate series: the U.S. rate with Canada (R]CA), France (R]FR), Germany (R]GE), and Japan (R]JA) and the real French rate with Germany (R]FRGE). The results are summarized in Table 3. Notice that the modified WS estimator produces unit-root point estimates for the Japan–U.S. and Canada–U.S. exchange rates but not for the other three series.4 Recall that the Dickey-Fuller unit-root test fails to reject the unit-root null for all of these series except for the Germany–France exchange rate. The OLS point estimates imply stationarity in all five cases. The modified WS estimator is more conservative than the 4

The approximate 95% confidence intervals for b based on the modified WS estimator include b 51 in all five cases. The confidence intervals were constructed as described in the text. They are: Japan–U.S.5[0.979,1], Canada–U.S.5[0.980,1], Germany–U.S.5[0.967, 1], France–U.S.5[0.967,1], and Germany–France5[0.935,1].

179

b

g

l

Japan–U.S. Canada–U.S. Germany–U.S. France–U.S. Germany–France

OLS, unrestricted 0.045 0.99 0.054 0.99 0.088 0.98 0.088 0.98 0.001 0.95

0.30 0.16 0.30 0.28 0.37

0.98 0.99 0.97 0.97 0.92

Japan–U.S. Canada–U.S. Germany–U.S. France–U.S. Germany–France

OLS, b 51 – – – – –

1.00 1.00 1.00 1.00 1.00

0.30 0.16 0.29 0.28 0.34

1.00 1.00 1.00 1.00 1.00

Japan–U.S. Canada–U.S. Germany–U.S. France–U.S. Germany–France

Modified WSLS – 1.00 – 1.00 0.033 0.99 0.035 0.99 0.000 0.97

0.30 0.16 0.30 0.28 0.35

1.00 1.00 0.99 0.99 0.96

Notes: Sample period51973:1–1993:12. l is the largest autoregressive root implied by the estimates of b and g.

unit-root test approach in assigning unit-roots but is less conservative than the OLS estimator. In all cases the modified WS estimate of the largest root is greater than the OLS estimate of the largest root. For the Japan–U.S., Canada–U.S., Germany–U.S. and France–U.S. exchange rates, the OLS estimates of b and the upper bound of the parameter space are quite close. As such, the differences between the OLS and modified WS estimates of b are quite small. In contrast to the OLS estimates, note that the modified WS estimates yield b 51 for the Japan– U.S. and the Canada–U.S. real rates. Moreover, for the Germany–U.S. and France–U.S. real rates, the OLS and modified WS estimates of b are all less than unity. In the Germany–U.S. case the WS estimate of the dominant root is 1.84% greater than the OLS estimate (0.9895 vs. 0.9716) and in the France–U.S. case the WS estimate of the dominant autoregressive root is 1.72% greater than the OLS estimate (0.9891 vs. 0.9724). The most interesting of the five cases may be the Germany–France exchange rate. The Dickey-Fuller test rejects the unit-root restriction; the OLS estimate of b is 0.9489 and the implied estimate of the


dominant AR root is 0.9150. The modified WS estimate of $\beta$ is 0.9742 and the implied estimate of the dominant autoregressive root is 0.9592, which is 4.83% greater than the OLS estimate of this root. We conclude that the modified WS point estimates provide a compromise between the OLS point estimates and the unit-root test approach. The WS estimates generate a unit-root more often than unrestricted OLS but less often than suggested by unit-root test results. In cases where the modified WS point estimates imply stationarity, there will be differences between the implied dynamics of the OLS and modified WS estimates of the model corresponding to differences in the estimates of the dominant autoregressive roots.
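The dominant roots quoted for the Germany–France rate can be reproduced approximately from the reported coefficients. The sketch below rests on an assumption that is not stated in this passage: that each real rate is modelled as $r_t = a_0 + \beta r_{t-1} + \gamma \Delta r_{t-1} + \varepsilon_t$, so that the autoregressive roots solve $z^2 - (\beta + \gamma) z + \gamma = 0$. The function name `largest_ar_root` is hypothetical, and because $\gamma$ is only available to two decimals the results agree with the reported roots to roughly three decimals.

```python
import cmath


def largest_ar_root(beta: float, gamma: float) -> float:
    """Modulus of the largest root of z^2 - (beta + gamma) z + gamma = 0.

    Assumed model (an illustration, not taken from the paper's text):
    r_t = a0 + beta * r_{t-1} + gamma * (r_{t-1} - r_{t-2}) + e_t
    """
    b = beta + gamma
    disc = cmath.sqrt(b * b - 4.0 * gamma)  # complex sqrt also covers complex roots
    roots = ((b + disc) / 2.0, (b - disc) / 2.0)
    return max(abs(z) for z in roots)


# Germany-France: beta from the text, gamma (rounded) from the table.
ols_root = largest_ar_root(0.9489, 0.37)
ws_root = largest_ar_root(0.9742, 0.35)
print(f"OLS root ~ {ols_root:.4f}, modified WS root ~ {ws_root:.4f}")
print(f"WS root exceeds OLS root by ~ {100 * (ws_root / ols_root - 1):.1f}%")
```

With $\beta = 1$ the polynomial factors as $(z - 1)(z - \gamma)$, which is why the restricted estimates in the table all show a largest root of 1.00.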

5. Cointegration tests

A number of authors have shown that measurement errors can lead to the apparent failure of PPP. Patel (1990) explores the fact that various national price indices are constructed using different weighting systems. Clearly, an increase in the relative price of a particular good will induce the largest increase in the price index of the nation placing the largest weight on that good. Similarly, Taylor (1990) and Fischer and Park (1991) consider the fact that national price levels are weighted averages of tradable and nontradable goods. Suppose that the relative price of non-tradables in the home country rises relative to that in the foreign country. As such, the home price level will appear to be "too high" relative to its Purchasing Power Parity level. In either instance, there may be a long-run equilibrium relationship between prices and exchange rates. However, instead of the simple relationship postulated in eq. (1), the long-run relationship may be of the form

$$e_t = \alpha_0 + \alpha_1 p_t - \alpha_2 p_t^* \tag{14}$$

where: the coefficients $\alpha_1$ and $\alpha_2$ need not equal unity and $\alpha_0$ may differ from zero. Eq. (14) can be called Weak-PPP; the basic tenet of Weak-PPP is that the non-stationary price levels and exchange rates are cointegrated. Since the values of the $\alpha_i$ are not known a priori, they need to be estimated from the data. Unlike more traditional measures of PPP, Weak-PPP places no particular restriction on the cointegrating relationship. The Johansen (1995) technique is ideally suited to testing Weak-PPP. The methodology can be used to estimate the cointegrating vector (if any) and to test restrictions on the values of the $\alpha_i$. Note that most of the price indices show strong evidence of I(2) behavior. As such, we used the following steps in order to determine whether prices and the exchange rate are cointegrated as suggested by PPP:

Step 1: We determined lag length by estimating an unrestricted VAR in the form:

$$x_t = A_0 + \sum_{i=1}^{k} A_i x_{t-i} + \varepsilon_t \tag{15}$$

where: $x_t' = (p_t, p_t^*, e_t)$, the $A_i$ are $3 \times 3$ matrices, $A_0$ is a $3 \times 1$ vector of intercepts, and $k$ is the lag length to be determined. For each country we examined the multivariate AIC and BIC. In all cases, the AIC selected a lag length of 12 months while the BIC generally selected much shorter lag lengths. For purposes of simplicity, we present results using the lag lengths selected by the BIC. Note, however, that the results of the cointegration tests are somewhat sensitive to lag length.

Step 2: When the variables in $x_t$ are I(1), the usual procedure is to estimate the model in the form:

$$\Delta x_t = c\, x^*_{t-1} + \sum_{i=1}^{k-1} \pi_i \Delta x_{t-i} + \varepsilon_t \tag{16}$$

where: $x^*_{t-1}$ is the vector $x_{t-1}$ augmented with a constant so as to allow for an intercept term in the cointegrating vector(s). In essence, the elements of $A_0$ are constrained so as to allow for an intercept in the cointegrating vector but no drift in the VAR process. In this case, the number of independent cointegrating vectors is equal to rank($c$). Johansen shows how to test for the rank of $c$ using the characteristic roots obtained from a reduced rank estimation of eq. (16). The method consists of ordering the three characteristic roots of $c$ (denoted by $\lambda_1$, $\lambda_2$, and $\lambda_3$) such that $\lambda_1 > \lambda_2 > \lambda_3$. The test for the number of characteristic roots that are significantly different from zero can be conducted using the following test statistic:


$$\lambda_{trace}(r) = -T \sum_{i=r+1}^{3} \ln(1 - \hat{\lambda}_i) \tag{17}$$

where $\hat{\lambda}_i$ are the estimated values of the characteristic roots obtained from the estimated $c$ matrix and $T$ is the number of usable observations. If the variables in $x_t$ are not cointegrated, the rank of $c$ is zero and all the characteristic roots will equal zero. Since $\ln(1) = 0$, the value of $\lambda_{trace}(0)$ should equal zero when there is no cointegration. We estimated eq. (16) and compared the values of $\lambda_{trace}(r)$ to the critical values tabulated by Johansen (1995). Column 2 of Table 4 reports the values of $\lambda_{trace}(0)$ and $\lambda_{trace}(1)$ for each country pair; in no case were there more than two cointegrating vectors so that $\lambda_{trace}(2)$ is not reported. At the 10% significance level, there is at least 1 cointegrating vector for each of the country pairs considered. The cointegrating vector, normalized with respect to the


exchange rate, is reported in the third column. Also reported is the $\chi^2$-statistic for the null hypothesis that the cointegrating vector can be written such that $\alpha_1 = \alpha_2$, implying a proportional adjustment of prices and the exchange rate.

Step 3: Diagnostic testing indicated that a number of the price indices have the attributes of I(2) variables while the nominal exchange rates are clearly I(1) variables. As such, it is necessary to consider the possibility of multi-cointegration. With such a mixture of I(1) and I(2) variables there can be a long-run equilibrium relationship of the form:

$$e_t = \alpha_0 + \alpha_1 p_t - \alpha_2 p_t^* + \alpha_3 \Delta p_t + \alpha_4 \Delta p_t^* \tag{18}$$

As such, we performed Johansen's (1995) cointegration test for I(2) variables. Consider the system:

$$\Delta^2 x_t = \pi x^*_{t-1} + \Gamma \Delta x_{t-1} + \sum_{i=1}^{k-2} \pi_i \Delta^2 x_{t-i} + \varepsilon_t \tag{19}$$

Table 4. Cointegration tests of weak PPP

| Country pair | $\lambda_{trace}(0)$ | $\lambda_{trace}(1)$ | Cointegrating vector $e + \alpha_1 p^* + \alpha_2 p + c = 0$ | $\chi^2$ (p-value) | $S_{0,0}$ | $S_{0,1}$ | $S_{1,0}$ |
|---|---|---|---|---|---|---|---|
| U.S. / Canada | 32.52† | 18.11 | $e + 16.14p^* - 12.75p - 18.87 = 0$ (79.04) (65.47) | 2.96 (0.23) | 68.63 | 36.44 | 43.72 |
| U.S. / France | 36.87* | 15.72 | $e + 2.06p^* - 2.10p + 4.55 = 0$ (0.75) (0.93) | 1.54 (0.46) | 109.21* | 29.59 | 53.57 |
| U.S. / Germany | 90.74** | 6.48 | $e - 13.83p^* + 5.88p + 43.63 = 0$ (5.57) (2.72) | 32.87 (0.00) | 315.03** | 133.57** | 192.74** |
| U.S. / Japan | 99.35** | 38.34** | $e + 30.80p^* - 13.53p - 89.43 = 0$ (41.30) (16.76) | 43.57 (0.00) | 191.02** | 86.98** | 141.00** |
| Germany / France | 45.30** | 13.24 | $e + 0.73p^* - 0.56p + 0.64 = 0$ (0.12) (0.28) | 23.13 (0.00) | 81.59* | 24.34 | 57.67 |

†, Rejection of the null hypothesis that $\lambda_i = 0$ at the 10% significance level. *, Rejection of the null hypothesis that $\lambda_i = 0$ at the 5% significance level. **, Rejection of the null hypothesis that $\lambda_i = 0$ at the 1% significance level.

Notes: In the four cases involving the U.S., $e$ = U.S. dollar price of foreign exchange, $p$ = U.S. price level, and $p^*$ = foreign price level. In the case of Germany / France, $e$ = mark price of the franc, $p^*$ = French price level, and $p$ = German price level. At the 5% and 1% significance levels, the critical values for the null of no cointegration are 34.91 and 41.07, respectively. At the 5% and 1% significance levels, the critical values for the null of no more than 1 cointegrating vector are 19.96 and 24.60, respectively. For each country pair, the three entries under $S_{r,s}$ are the sample values of $S_{0,0}$, $S_{0,1}$, and $S_{1,0}$. At the 5% significance level, the critical values for the $S_{i,j}$ are: $S_{0,0} = 86.66$, $S_{0,1} = 68.23$, and $S_{1,0} = 47.60$. Standard errors are shown in parentheses.
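As an illustration of how eq. (17) and the critical values in the notes to Table 4 fit together, the following sketch computes a trace statistic from a set of characteristic roots and classifies a reported statistic against the 5% and 1% critical values. The function names and the example roots are hypothetical; only the critical values and the Table 4 statistics in the usage lines come from the text.

```python
import math

# 5% and 1% critical values quoted in the notes to Table 4,
# keyed by the number of cointegrating vectors r under the null.
CRITICAL_VALUES = {0: (34.91, 41.07), 1: (19.96, 24.60)}


def trace_statistic(roots, n_obs, r):
    """Eq. (17): -T * sum over i = r+1..n of ln(1 - root_i), roots in descending order."""
    ordered = sorted(roots, reverse=True)
    return -n_obs * sum(math.log(1.0 - lam) for lam in ordered[r:])


def rejection_level(stat, r):
    """Strongest level (of 5% and 1%) at which the null of at most r vectors is rejected."""
    cv5, cv1 = CRITICAL_VALUES[r]
    if stat > cv1:
        return "1%"
    if stat > cv5:
        return "5%"
    return "not rejected at 5%"


# Made-up roots and sample size, purely to show the arithmetic of eq. (17).
example_roots = [0.12, 0.05, 0.01]
print(trace_statistic(example_roots, n_obs=250, r=0))
print(trace_statistic(example_roots, n_obs=250, r=1))

# Statistics reported in Table 4.
print("U.S./France, r = 0:", rejection_level(36.87, 0))
print("U.S./Germany, r = 0:", rejection_level(90.74, 0))
print("U.S./Japan, r = 1:", rejection_level(38.34, 1))
```

When every root is zero, each logarithm is $\ln(1) = 0$ and the statistic is zero, which is the no-cointegration benchmark.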


In Eq. (19), the issue of multi-cointegration concerns the ranks of both $\pi$ and $\Gamma$. Following Johansen (1995) let rank($\pi$) = $r$ and define the $3 \times r$ matrices $\alpha$ and $\beta$ such that $\alpha\beta' = \pi$. Also, let $\alpha_\perp$ and $\beta_\perp$ denote the orthogonal complements of $\alpha$ and $\beta$. In principle, it is possible to consider all possible orders of integration for price levels and the exchange rate. However, we focus on the possibility that price levels are I(2) and that nominal exchange rates are I(1). As such, we consider the following restrictions for the ranks of $\pi$ and $\Gamma$:

i) rank($\alpha_\perp' \Gamma \beta_\perp$) = $s$ must be such that $s \le 3 - r$. For the analysis of I(2) variables to be appropriate, the values of $r$ and $s$ must be such that $s + r < 3$. If $s = 3 - r$, then $x_t$ contains no I(2) variables so that the analysis of Step 2 above is appropriate.

ii) The number of I(2) trends in a 3-variable system is given by $3 - r - s$. As such, if $r + s < 2$, no linear combination of $p_t$ and $p_t^*$ can annihilate both of the stochastic trends. Thus, if $r + s < 2$, Weak-PPP fails.

iii) If $r = 0$, Weak-PPP fails since price levels and the exchange rate do not converge to an equilibrium relationship.

The conditions imply that both $r$ and $s$ must equal unity if long-run PPP is to be substantiated by an analysis for I(2) variables. Johansen (1995) shows how to determine the value of $s$ conditional on the value of $r$ selected in Step 2.⁵ Let $Q^*_{r,s}$ denote the quantity obtained when Eq. (17) is calculated using the values of $\alpha$, $\beta$, and the specific value of $r$ found in Step 2. Given this value of $r$, if the sample value of $Q^*_{r,s}$ exceeds the critical value calculated by Johansen, reject the null hypothesis $s = s_0$ in favor of the alternative $s > s_0$. For $r = 1$ the critical values at the 10%, 5%, and 1% significance levels are:

For example, let $r = 1$ and suppose that the sample value of $Q^*_{1,s}$ is found to be 35.00. As such, the null hypothesis $s = 0$ can be rejected at the 5% significance level. For the five country pairs, the calculated sample values of $Q^*_{1,s}$ are reported in the last column of Table 4. The key results of the cointegration tests for each country pair are:

U.S. / Canada: The model for I(1) variables yields a value of $\lambda_{trace}(0) = 32.52$. The critical values at the 10% and 5% significance levels are 31.88 and 34.91, respectively. Hence, at the 10% significance level, there is evidence of Weak-PPP. The interpretation of the cointegrating vector (shown in the third column of Table 4) is such that the signs of the coefficients are correct. However, after normalization, the long-run relationship implies that the Canadian dollar depreciates by approximately 1614% of any increase in the U.S. price level or 1275% of any decrease in the Canadian price level. Normalizing the cointegrating vector with respect to the exchange rate, the $\chi^2$ statistic for the null hypothesis that the cointegrating vector can be written such that $\alpha_1 = \alpha_2$ equals 2.96 with a p-value of 23%. Using the model for I(2) variables, if $r = 1$, the null hypothesis $s = 0$ cannot be rejected. The sample value of $Q^*_{1,0} = 18.33$ falls short of the 10% critical value of 31.88. Hence, the price levels appear to be I(2) variables that bear no long-run relationship to the exchange rate.
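The rank restrictions i) to iii) that drive this conclusion can be written as a small decision rule. The sketch below is only a restatement of those conditions for the three-variable system; the function name `i2_rank_diagnosis` and the wording of the returned messages are hypothetical.

```python
N_VARIABLES = 3  # p_t, p*_t and e_t


def i2_rank_diagnosis(r: int, s: int) -> str:
    """Interpret (r, s) using conditions i)-iii) for the 3-variable system."""
    if not (0 <= r <= N_VARIABLES) or not (0 <= s <= N_VARIABLES - r):
        raise ValueError("need 0 <= r <= 3 and 0 <= s <= 3 - r")
    if r == 0:
        # Condition iii): no equilibrium relationship at all.
        return "Weak-PPP fails: no cointegrating vector"
    if r + s == N_VARIABLES:
        # Condition i): 3 - r - s = 0 I(2) trends, so the I(1) analysis applies.
        return "no I(2) trends: use the I(1) analysis of Step 2"
    if r + s < 2:
        # Condition ii): two or more I(2) trends cannot both be annihilated.
        return "Weak-PPP fails: too many I(2) trends"
    if r == 1 and s == 1:
        return "consistent with Weak-PPP in the I(2) analysis"
    return "not consistent with Weak-PPP, which requires r = s = 1"


# U.S./Canada: r = 1 and the null s = 0 is not rejected.
print(i2_rank_diagnosis(1, 0))
# A case with r = 1 where s = 0 is rejected and s = 1 is not.
print(i2_rank_diagnosis(1, 1))
```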

⁵ Note that the value of $s$ is selected once the value of $r$ has been chosen. Johansen shows that this two-step procedure has the following properties: (i) if the rank of $\pi$ is $r$ and there are no I(2) components, the procedure picks out the true value of $r$ with a high probability, (ii) a value of $r$ that is too low is selected with a limiting probability of zero, and (iii) if there are I(2) components, the procedure will accept no I(2) components with a small probability. Jorgensen et al. (1996) show how to simultaneously select the values of $r$ and $s$.

U.S. / France: At the 5% significance level, the model for I(1) variables indicates a single cointegrating vector. For a given value of the exchange rate, the U.S. and French price levels are roughly proportional (i.e., the respective coefficients are 2.06 and −2.10). The Franc depreciates by slightly more than 200% in response to an increase in the French price level or a decrease in the U.S. price level. Note that the $\chi^2$ statistic for the null hypothesis that the normalized cointegrating vector can be written such that $\alpha_1 = \alpha_2$ equals 1.54 with a p-value of 46%. This result is strongly supportive of PPP. However, the obvious concern is that the exchange rate $e$ may not belong in the cointegrating relationship. Using the model for I(2) variables, if $r = 1$, the null hypothesis $s = 0$ cannot be rejected. The sample value of $Q^*_{1,0} = 11.34$ falls short of the 10% critical value of 31.88. The result is consistent with the notion that the I(2) price levels and the exchange rate do not form an equilibrium relationship.

U.S. / Germany: Using the model for I(1) variables, at the 1% significance level, we find a single cointegrating vector. The interpretation of this vector is problematic since the coefficients on the German and U.S. price levels are of the "wrong" sign. In our opinion this is solid evidence against even Weak-PPP. The $\chi^2$-statistic for the null hypothesis that the normalized cointegrating vector can be written such that $\alpha_1 = \alpha_2$ equals 32.87. Using the analysis for I(2) variables, for $r = 1$, the sample value of $Q^*_{1,0} = 127.09$. Hence, the null hypothesis $s = 0$ can be rejected at the 1% significance level. Given that $Q^*_{1,1} = 5.73$, we cannot reject the null hypothesis that $s = 1$. As such, the I(2) analysis reinforces the previous suggestion that for Germany and the U.S. prices and the exchange rate are cointegrated.

U.S. / Japan: At the 1% significance level, we find two cointegrating vectors (i.e., $r = 2$). Thus, there can be a cointegrating vector for the two price levels and a second for a linear combination of prices and the exchange rate. After normalizing, the most significant cointegrating vector implies that the yen appreciates by about 3080% in response to an increase in U.S. prices and by 1353% in response to a decrease in Japanese prices. The $\chi^2$-statistic for the null hypothesis that the normalized cointegrating vector can be written such that $\alpha_1 = \alpha_2$ equals 43.57. Using the analysis for I(2) variables, the result that $r = 2$ is inconsistent with Weak-PPP. However, if one is willing to set $r = 1$, the sample value of $Q^*_{1,0} = 48.64$ so that the null hypothesis $s = 0$ can be rejected at the 1% significance level. Given $Q^*_{1,1} = 9.47$, we cannot reject the null hypothesis that $s = 1$. Hence, in order to support Weak-PPP using the analysis for I(2) variables, it is necessary to ignore a cointegrating vector that has a p-value of less than 1%.

Germany / France: At the 5% significance level, we find a single cointegrating vector. The Franc depreciates by roughly 35% in response to an increase in the French price level and by nearly 65% in response to a decrease in the German price level. The null hypothesis that the normalized cointegrating vector has the form $\alpha_1 = \alpha_2$ is strongly rejected at conventional significance levels. Using the analysis for I(2) variables, the sample value of $Q^*_{1,0} = 11.10$ so that the null hypothesis $s = 0$ cannot be rejected at conventional significance levels.
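The p-values attached to the $\chi^2$ statistics above can be checked by hand. The degrees of freedom of the test are not stated in this section; the sketch below assumes 2 degrees of freedom, for which the upper-tail probability has the closed form $\exp(-x/2)$, because that assumption reproduces the reported p-values. The function name `chi2_upper_tail_2df` is hypothetical.

```python
import math


def chi2_upper_tail_2df(x: float) -> float:
    """P(X > x) for a chi-square variate with 2 degrees of freedom (assumed df)."""
    return math.exp(-x / 2.0)


# (chi-square statistic, reported p-value) as given in the text and Table 4.
reported = {
    "U.S./Canada": (2.96, 0.23),
    "U.S./France": (1.54, 0.46),
    "U.S./Germany": (32.87, 0.00),
    "U.S./Japan": (43.57, 0.00),
    "Germany/France": (23.13, 0.00),
}

for pair, (stat, p_reported) in reported.items():
    p_implied = chi2_upper_tail_2df(stat)
    print(f"{pair}: statistic {stat}, implied p = {p_implied:.2f}, reported p = {p_reported:.2f}")
```

Under this assumption the restriction $\alpha_1 = \alpha_2$ is not rejected for the two pairs with p-values of 0.23 and 0.46 and is rejected for the other three.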

6. Out of sample forecasts

If PPP is to be a guide to the relationship between national price levels and the exchange rate, it should provide reasonable out-of-sample forecasts. Towards this end, we retained 35 months of data (1994:1 through 1996:11) to see how well each of the following four techniques performed:

OLS: The Dickey-Fuller tests may not have sufficient power to reject the false null hypothesis of a non-stationary real exchange rate. As such, we used OLS to estimate each real exchange rate series in the form:

$$r_t = \alpha_0 + \beta r_{t-1} + \dots + \alpha_k \Delta r_{t-k} + \varepsilon_t$$

Since we found no evidence of stationarity with threshold adjustment, we do not report any results for the TAR model.

Beta = 1: The real exchange rates may, in fact, have a unit-root. As such, we used OLS to estimate each real rate series in the form:

$$r_t = \alpha_0 + r_{t-1} + \dots + \alpha_k \Delta r_{t-k} + \varepsilon_t$$

Median Unbiased: We estimated each real exchange rate series using Fuller's median unbiased technique. As such, we used modified weighted


symmetric least squares to estimate each real rate series in the form:

$$r_t = \alpha_0 + \beta r_{t-1} + \dots + \alpha_k \Delta r_{t-k} + \varepsilon_t$$

Weak-PPP: Weak-PPP suggests a cointegrating relationship between price levels and the exchange rate of the form:

$$e_t = \alpha_1 p_t - \alpha_2 p_t^*$$

Overall, the model for I(1) variables is substantially more favorable towards Weak-PPP than is the analysis for I(2) variables. As such, we estimated vector-error correction (VEC) models for the U.S. price level, exchange rate, and price levels of Canada, Germany, and France and between France and Germany using the model for I(1) variables.

Each model was estimated over the 1973:1–1993:12 period and the resulting estimates were used to make out-of-sample forecasts from 1994:1 through 1996:11. For Weak-PPP, we obtained separate forecasts of the two price levels and the nominal exchange rate. We then constructed the forecasts of the real rate in accord with eq. (1).⁶ For each estimation method, Table 5 reports the percentage difference between the forecasted and actual real exchange rates for forecasting horizons of 1 month, 12 months, 24 months and 35 months.

In percentage terms, the forecast errors are generally smallest for a forecast horizon of 1 month. At this horizon, the French / U.S. real rate has the largest percentage forecast error for each of the four models. Nevertheless, the error is little more than one third of 1%. Otherwise, there is little relation between the length of the forecast period and the size of the forecast error. Unlike the real rates involving the U.S., both the

⁶ Alternatively, we could forecast the linear combination represented by: $e_t - \alpha_1 p_t + \alpha_2 p_t^*$. As formulated by Taylor (1990) and Froot and Rogoff (1995), Weak-PPP implies that temporary relative price movements in a particular direction cause the within-sample estimates $\alpha_1$ and $\alpha_2$ to differ from unity. However, in the long-run, the variable of interest is the real exchange rate constructed as $e_t + p_t^* - p_t$. Another alternative is to estimate the price levels and exchange rate using either eq. (15) or eq. (16) and then to construct forecasts of the real exchange rate using eq. (1). Our methodology uses eq. (16) to forecast the price levels and exchange rate, imposing the non-binding restriction represented by the cointegrating relationship.

Table 5. Out of sample forecasts

| Horizon | Real rate | OLS | Beta = 1 | Median | Weak-PPP |
|---|---|---|---|---|---|
| 1 month | $R_{CA}$ | 0.19% | 0.19% | 0.19% | 0.14%* |
| 1 month | $R_{FR}$ | −0.34%* | −0.35% | −0.35% | −0.36% |
| 1 month | $R_{GE}$ | −0.28% | −0.29% | −0.29% | −0.27%* |
| 1 month | $R_{JA}$ | −0.28%* | −0.34% | −0.34% | −0.33% |
| 1 month | $R_{FRGE}$ | −0.14% | −0.09%* | −0.12% | −0.15% |
| 12 months | $R_{CA}$ | −1.38% | −1.32% | −1.32% | −0.12%* |
| 12 months | $R_{FR}$ | 1.46% | 1.41% | 1.42% | 1.06%* |
| 12 months | $R_{GE}$ | 2.08% | 1.93%* | 2.02% | 2.09% |
| 12 months | $R_{JA}$ | 2.93% | 2.08% | 2.08% | 1.81%* |
| 12 months | $R_{FRGE}$ | −0.69% | −0.43% | −0.60% | 0.05%* |
| 24 months | $R_{CA}$ | −1.31% | −1.20% | −1.20% | 0.69%* |
| 24 months | $R_{FR}$ | 3.43% | 3.33% | 3.35% | 2.97%* |
| 24 months | $R_{GE}$ | 4.10% | 3.84%* | 4.00% | 4.04% |
| 24 months | $R_{JA}$ | 2.44% | 0.89% | 0.89% | 0.43%* |
| 24 months | $R_{FRGE}$ | −0.77% | −0.40% | −0.69% | 0.39%* |
| 35 months | $R_{CA}$ | −1.11% | −0.97%* | −0.97%* | 1.19% |
| 35 months | $R_{FR}$ | 2.34% | 2.21% | 2.24% | 1.87%* |
| 35 months | $R_{GE}$ | 2.53% | 2.20% | 2.43% | 0.38%* |
| 35 months | $R_{JA}$ | −0.33%* | −2.35% | −2.35% | −2.93% |
| 35 months | $R_{FRGE}$ | −0.67% | −0.02%* | −0.59% | 0.85% |

*, Best out-of-sample forecast for the real exchange rate in question. Numbers have been rounded to the second decimal place. OLS indicates forecasts using OLS assuming no unit-root. Beta = 1 constrains the real exchange rate to have a unit-root. Median uses the modified WS estimate and Weak-PPP does not constrain the coefficients of the PPP relationship.

Dickey-Fuller and the median unbiased techniques suggest that the French / German rate is stationary. This may explain why, with one exception, the French / German rate has the smallest forecast error for all models at all forecasting horizons. There is no single estimation technique that dominates the others. Among the univariate methods, the unit-root model had the lowest percentage forecast error in 10 cases and was tied with at least one of the others in 6 cases. Overall, the percentage forecast errors resulting from the Weak-PPP model are smallest in twelve of the twenty possible cases. However, the forecasts from the unit-root model (Model 2) are smallest in five cases and simple OLS provides the smallest errors in the remaining three cases. At the


longest forecasting horizon, the unit-root model predicts about as well as the models that assume PPP holds. The median unbiased estimates never provide the smallest percentage forecast error.
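The tallies reported in this section can be recomputed from the rounded entries of Table 5. The sketch below transcribes those entries and counts, for each real rate and horizon, which method has the smallest absolute percentage error; the data layout and the helper name `winners` are hypothetical, and ties at two-decimal rounding are credited to every tied method.

```python
MODELS = ("OLS", "Beta=1", "Median", "Weak-PPP")

# Percentage forecast errors from Table 5: (horizon in months, rate) -> errors by model.
ERRORS = {
    (1, "CA"): (0.19, 0.19, 0.19, 0.14), (1, "FR"): (-0.34, -0.35, -0.35, -0.36),
    (1, "GE"): (-0.28, -0.29, -0.29, -0.27), (1, "JA"): (-0.28, -0.34, -0.34, -0.33),
    (1, "FRGE"): (-0.14, -0.09, -0.12, -0.15),
    (12, "CA"): (-1.38, -1.32, -1.32, -0.12), (12, "FR"): (1.46, 1.41, 1.42, 1.06),
    (12, "GE"): (2.08, 1.93, 2.02, 2.09), (12, "JA"): (2.93, 2.08, 2.08, 1.81),
    (12, "FRGE"): (-0.69, -0.43, -0.60, 0.05),
    (24, "CA"): (-1.31, -1.20, -1.20, 0.69), (24, "FR"): (3.43, 3.33, 3.35, 2.97),
    (24, "GE"): (4.10, 3.84, 4.00, 4.04), (24, "JA"): (2.44, 0.89, 0.89, 0.43),
    (24, "FRGE"): (-0.77, -0.40, -0.69, 0.39),
    (35, "CA"): (-1.11, -0.97, -0.97, 1.19), (35, "FR"): (2.34, 2.21, 2.24, 1.87),
    (35, "GE"): (2.53, 2.20, 2.43, 0.38), (35, "JA"): (-0.33, -2.35, -2.35, -2.93),
    (35, "FRGE"): (-0.67, -0.02, -0.59, 0.85),
}


def winners(errors, columns):
    """Column indices with the smallest absolute error; ties return several indices."""
    best = min(abs(errors[c]) for c in columns)
    return [c for c in columns if abs(errors[c]) == best]


# Smallest error across all four methods.
overall = {m: 0 for m in MODELS}
for errs in ERRORS.values():
    for c in winners(errs, range(4)):
        overall[MODELS[c]] += 1

# Comparison restricted to the three univariate methods (columns 0, 1, 2).
univariate = [winners(errs, range(3)) for errs in ERRORS.values()]
unit_root_alone = sum(w == [1] for w in univariate)
unit_root_tied = sum(1 in w and len(w) > 1 for w in univariate)

print(overall)
print("unit-root model strictly best among univariate methods:", unit_root_alone)
print("unit-root model tied for best among univariate methods:", unit_root_tied)
```

At two-decimal rounding the median-unbiased and unit-root forecasts tie for the Canadian rate at the 35 month horizon, so the median-unbiased column is credited once by this rule even though it is never strictly the best.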

7. Conclusion

Long-run Purchasing Power Parity implies that the rate of foreign currency appreciation should equal the difference between the domestic and foreign inflation rates. For the theory to have any empirical content, deviations from PPP must eventually decay towards zero. We conducted a number of tests to determine whether a number of real exchange rates (constructed such that the real rate measures the deviation from long-run PPP) are stationary. Dickey-Fuller tests, tests for stationarity with asymmetric adjustment, and median unbiased estimates are strongly suggestive of a unit-root in the real exchange rate sequences. Real exchange rates do not indicate any strong evidence of mean reversion so that deviations from PPP are permanent.

One possible alternative is to recognize the possibility of measurement error in the price indices. So-called Weak-PPP argues that national price levels and nominal exchange rates are cointegrated such that foreign currency appreciation need not equal the difference between the domestic and foreign inflation rates. Surprisingly, we do not find especially strong evidence of Weak-PPP. Treating prices and exchange rates as I(1) variables, there does appear to be a cointegrating relationship between national price levels and exchange rates. However, the coefficients in the cointegrating vector are often of the wrong sign or of a magnitude implying a measurement error in excess of 1000%. Treating the price levels as I(2) variables does not remedy the problem. In general, a linear combination of the price levels can be formed that is I(1), but this linear combination is not found to be cointegrated with the I(1) exchange rate.

The out-of-sample forecasts assuming a unit-root in the real rate series outperform forecasting methods assuming a stationary real rate. Moreover, at long forecasting horizons, the forecasts using Weak-PPP are not markedly better than those assuming a unit-root in the real exchange rate series. This is a strong condemnation of a theory that is supposed to hold as a long-run equilibrium relationship.

References

Adler, Michael, Lehman, Bruce, 1983. Deviations from purchasing power parity in the long run. Journal of Finance 38, 1471–1487.
Balke, N., Fomby, T., 1996. Threshold cointegration. International Economic Review, forthcoming.
Davutyan, N., Pippenger, John, 1990. Testing Purchasing Power Parity: some evidence on the effects of transactions costs. Econometric Reviews 9, 211–240.
Dickey, David, Fuller, Wayne, 1979. Distribution of the estimates for autoregressive time series with a unit root. Journal of the American Statistical Association 74, 427–431.
Enders, Walter, 1988. ARIMA and cointegration tests of PPP under fixed and flexible exchange rate regimes. Review of Economics and Statistics 70, 504–508.
Enders, W., Granger, C.W.J., 1997. Unit-root tests and asymmetric adjustment with an example using the term structure of interest rates. Journal of Business and Economic Statistics, forthcoming.
Fischer, E., Park, J., 1991. Testing for purchasing power parity under the null of cointegration. The Economic Journal 101, 1476–1484.
Froot, K.A., Rogoff, K., 1995. Perspectives on PPP and long-run real exchange rates. In: Grossman, G.M., Rogoff, K. (Eds.), Handbook of International Economics, vol III. Elsevier, Amsterdam, pp. 1647–88.
Fuller, W., 1995. Introduction to Statistical Time Series, 2nd ed. John Wiley, New York.
Johansen, S., 1995. Likelihood-Based Inference in Cointegrated Autoregressive Models. Oxford University Press, Oxford.
Jorgensen, C., Kongsted, H.C., Rahbek, A., 1996. Trend-Stationarity in the I(2) Cointegration Model. University of Copenhagen, mimeo.
Jorion, P., Sweeney, R., 1996. Mean reversion in real exchange rates: evidence and implications for forecasting. Journal of International Money and Finance, forthcoming.
Levin, A., Lin, C.F., 1992. Unit Root Tests in Panel Data: Asymptotic and Finite-Sample Properties. University of California at San Diego, mimeo.
Mark, N., 1990. Real exchange rates in the long run. Journal of International Economics 28, 115–136.
O'Connell, P., 1998. The overvaluation of Purchasing Power Parity. Forthcoming. Journal of International Economics, 1, 1–21.
Oh, K.Y., 1996. Purchasing Power Parity and unit root tests using panel data. Journal of International Money and Finance 15, 405–418.
Papell, D., 1997. Searching for stationarity: Purchasing Power Parity under the current float. Forthcoming. Journal of International Economics, 3: 4, 313–32.
Patel, Jayendu, 1990. Purchasing Power Parity as a long-run relation. Journal of Applied Econometrics 5, 367–379.


Pippenger, Michael, Goering, Gregory, 1993. A note on the empirical power of unit root tests under threshold processes. Oxford Bulletin of Economics and Statistics 55, 473–481.
Taylor, Mark, 1990. An empirical examination of long-run Purchasing Power Parity using cointegration techniques. Applied Economics 20, 1369–1381.
Tong, H., 1990. Threshold Models in Non-Linear Time Series Analysis. Springer-Verlag, New York.
Tong, H., 1983. Non-Linear Time Series: A Dynamical Approach. Oxford University Press, Oxford.
Tsay, R.S., 1989. Testing and modeling threshold autoregressive processes. Journal of the American Statistical Association 82, 590–604.
van Dijk, D., Philip, F., 1995. Empirical Specification of Nonlinear Error-Correction Models. Erasmus University (Rotterdam) Working Paper #9544/A, mimeo.
Wei, S.J., Parsley, D., 1995. Purchasing Power Disparity During the Floating Rate Period: Exchange Rate Volatility, Trade Barriers and Other Culprits. NBER paper #5032.
Wu, Y., 1996. Are real exchange rates nonstationary? Evidence from a panel-data test. Journal of Money, Credit and Banking 28, 54–63.

Biographies: Walter ENDERS is a University Professor in the Department of Economics at Iowa State University. His recent publications include two time-series books and papers in the Journal of Business and Economic Statistics and the American Political Science Review. His current research interests focus on non-linear time-series models.

Barry FALK is an Associate Professor in the Department of Economics at Iowa State University. His recent publications include articles in the American Journal of Agricultural Economics, Journal of Monetary Economics and the Journal of Political Economy. His current research focuses on applied macro-econometric issues.