Forecasting unemployment using an autoregression with censored latent effects parameters


International Journal of Forecasting 20 (2004) 255 – 271 www.elsevier.com/locate/ijforecast

Philip Hans Franses a,*, Richard Paap a, Björn Vroomen b

a Econometric Institute, Erasmus University Rotterdam, H11-34, P.O. Box 1738, NL-3000 DR Rotterdam, The Netherlands
b Erasmus Research Institute of Management, Erasmus University Rotterdam, Rotterdam, The Netherlands

* Corresponding author. E-mail address: [email protected] (P.H. Franses).

Abstract

Monthly observed unemployment typically displays explosive behavior in recessionary periods, while there seems to be stationary behavior in expansions. Allowing parameters in an autoregression to vary across regimes, and hence over time, can capture this feature. In this paper, we put forward a new autoregressive time series model with time-varying parameters, where this variation depends on a linear indicator variable. When the value of this variable exceeds a stochastic threshold level, the parameters change. We discuss representation, estimation and interpretation of the model. Also, we analyze its forecasting performance for unemployment series of three G-7 countries, and we compare it with various related models.
© 2004 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.

Keywords: Nonlinear time series; Unemployment; Censored regression

1. Introduction

Several macroeconomic time series variables display features that seem to require nonlinear time series models for descriptive and forecasting purposes. A typical feature of such variables as output and unemployment is that their dynamic properties appear to change over the business cycle; see for example Neftçi (1984), Hamilton (1989) and Teräsvirta and Anderson (1992), among many others. There are many potentially useful nonlinear time series models, including the threshold autoregression [TAR] proposed by Tong (1983), the smooth transition autoregression [STAR] advocated in Granger and Teräsvirta (1993) and Teräsvirta (1994), and the


Markov switching model put forward in Hamilton (1989, 1990). A property of these nonlinear models is that the data can be described by a weighted average of two or more linear autoregressive models. The weights are determined by certain switching functions. As such, these models impose that the AR parameters only take values in fixed intervals and hence that they are bounded from above and below. An additional property of TAR and STAR models is that the switching function incorporates a linear combination of variables, and switching is governed by whether the linear combination is above or below a fixed threshold (TAR) or by a smooth transition function (STAR). To allow for more flexibility, one may consider autoregressive models where the parameters are generated by a stochastic process, see for example Grillenzoni (1993), Leybourne, McCabe, and Tremayne (1996) and Granger and Swanson (1997). A drawback



of these models, however, is that it may be difficult to assign an interpretation to the specific values of the parameters. Therefore, in this paper we put forward a new nonlinear time series model that allows for time-variation in the parameters, while imposing an economically interpretable structure on this variation. The key feature of our new autoregressive time series model is that the values of the AR parameters are a function of (i) an observed variable, for example a leading indicator (a linear combination of lagged explanatory variables is possible too), and (ii) an unobserved stochastic error process. When the sum of (i) and (ii) exceeds a threshold level, the autoregressive parameter in, for example, an AR(1) model changes from ρ to ρ + ρ_t, where the value of ρ_t is a function of the leading indicator variable and an error term. Hence, for some observations the AR(1) parameter is ρ, while for others it is ρ + ρ_t. We call our model an autoregression with a censored latent effects parameter [AR-CLEP].

In this paper we illustrate our time-varying parameter model for monthly unemployment for three G-7 countries, where the focus is on out-of-sample forecasting. As unemployment sometimes experiences sudden increases, thereby displaying explosive behavior in recessions, we assume that ρ_t only takes non-negative values. Of course, other applications may straightforwardly relax this assumption. The resultant AR-CLEP model is rather flexible, as the AR parameters can range from an almost always constant value to approximately full time-variation, depending on the contribution of the explanatory variable and the error term. Interestingly, in our illustration we find that the AR parameters for unemployment are time-varying mainly during recession periods, while in expansion periods the parameters are more or less constant. Hence, as a by-product, our model seems to be useful for identifying business cycle peaks and troughs in unemployment.

The outline of our paper is as follows. In Section 2, we introduce our autoregression with censored latent effects parameters. In Section 3, we outline some possible extensions of our basic model and we compare it with closely related nonlinear models. In Section 4, we illustrate our model in detail for monthly US unemployment, while we also consider Canada and Germany in our out-of-sample forecasting exercise. In Section 5, we conclude with a few remarks.

2. The AR-CLEP model

In this section we start off with a discussion of the representation of the simple AR(1)-CLEP model. A discussion of higher order dynamics is straightforward, and for notational convenience, it is postponed to Section 3. We consider parameter estimation in Section 2.2, and the construction of residuals for diagnostic purposes in Section 2.3. Finally, in Section 2.4 we show how out-of-sample forecasts can be generated.

2.1. The model

Consider the following AR(1) model with a time-varying autoregressive parameter for a time series {y_t}, t = 1, ..., T, that is,

    y_t = μ + (ρ + ρ_t) y_{t−1} + ε_t,    (1)

where ε_t ~ NID(0, σ_ε²), and ρ_t is a censored latent variable defined by

    ρ_t = x_t'β + u_t   if x_t'β + u_t ≥ 0,
    ρ_t = 0             if x_t'β + u_t < 0,    (2)

with u_t ~ NID(0, σ_u²) independent of ε_t, x_t a (k × 1) vector of explanatory variables including a constant, and β an unknown (k × 1) parameter vector. In our illustration below, we take x_t as a single lagged leading indicator variable. The assumptions on ε_t and u_t can be relaxed, but this is not pursued here. The autoregressive parameter of the AR model equals ρ unless x_t'β exceeds the stochastic threshold level −u_t, in which case the AR coefficient equals ρ + ρ_t. Of course, one can replace Eq. (2) by other functions, where for example ρ_t corresponds with negative values, that is, the ≥ and < in Eq. (2) can be changed into > and ≤, or with values that are bounded from above and below. In the present paper, with our particular illustration in mind, we consider Eq. (2). Notice that ρ_t is not fixed, but that it takes values that depend on x_t, β and u_t. The value of zero in Eq. (2) can be replaced by any threshold value τ.
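As an illustration of the data generating process in Eqs. (1) and (2), the following Python sketch simulates an AR(1)-CLEP series. It is a minimal sketch, assuming a user-supplied indicator series x_t and purely illustrative parameter values; the function name and defaults are ours, not the authors'.

```python
import numpy as np

def simulate_ar1_clep(x, mu, rho, beta0, beta1, sigma_e, sigma_u, y0=0.0, seed=0):
    """Simulate y_t = mu + (rho + rho_t) y_{t-1} + e_t with
    rho_t = max(0, beta0 + beta1 * x_t + u_t), cf. Eqs. (1)-(2)."""
    rng = np.random.default_rng(seed)
    T = len(x)
    y = np.empty(T)
    rho_t = np.empty(T)
    y_prev = y0
    for t in range(T):
        u = rng.normal(0.0, sigma_u)                     # latent threshold noise u_t
        rho_t[t] = max(0.0, beta0 + beta1 * x[t] + u)    # censored latent effect
        e = rng.normal(0.0, sigma_e)                     # innovation e_t
        y[t] = mu + (rho + rho_t[t]) * y_prev + e
        y_prev = y[t]
    return y, rho_t

# illustrative use with an artificial leading indicator
x = np.random.default_rng(1).normal(size=200)
y, rho_t = simulate_ar1_clep(x, mu=0.05, rho=0.95, beta0=-0.02, beta1=-0.05,
                             sigma_e=0.1, sigma_u=0.02)
```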


Given that there are no restrictions on the intercept in x_t'β, this threshold is, however, not identified. In case x_t'β is almost always ≥ 0, the AR parameter is almost always time-varying. And, when x_t'β is almost always < 0, then ρ_t = 0 for almost all t. Given the unemployment example below, and assuming that the leading indicator variable has substantial predictive value, we expect to find that x_t'β + u_t ≥ 0 in and around recessions. Finally, if σ_u = 0, Eq. (1) with Eq. (2) reduces to a threshold model with a non-stochastic threshold value. By allowing u_t ≠ 0, we introduce more uncertainty concerning the timing of the changing parameter. Our model amounts to a parsimonious multiplicative regression model, as it involves ρ_t y_{t−1} rather than a full interaction of y_{t−1} with x_t.

As ρ_t in Eq. (1) is a latent variable, we can only make probability statements about its values. The probability that ρ_t = 0 equals the probability that x_t'β + u_t < 0, that is,

    Pr[ρ_t = 0 | x_t; θ] = ∫_{−∞}^{−x_t'β} (1/σ_u) φ(u_t/σ_u) du_t = Φ(−x_t'β/σ_u) ≡ Φ_t,    (3)

where θ = {μ, ρ, σ_ε, β, σ_u} and φ(·) and Φ(·) are the density function and the cumulative distribution function of the standard normal distribution, respectively. The probability that ρ_t > 0 is thus given by

    Pr[ρ_t > 0 | x_t; θ] = ∫_{−x_t'β}^{∞} (1/σ_u) φ(u_t/σ_u) du_t = 1 − Φ(−x_t'β/σ_u) = 1 − Φ_t.    (4)

The expected value of the autoregressive parameter ρ_t equals

    E[ρ_t | x_t; θ] = Pr[ρ_t = 0 | x_t; θ] E[ρ_t | ρ_t = 0, x_t; θ] + Pr[ρ_t > 0 | x_t; θ] E[ρ_t | ρ_t > 0, x_t; θ]
                    = 0 + (1 − Φ_t)(x_t'β + σ_u φ_t/(1 − Φ_t))
                    = x_t'β (1 − Φ_t) + σ_u φ_t,    (5)

where φ_t = φ(−x_t'β/σ_u), which corresponds with the standard censored regression model. The variance of ρ_t is given by

    V[ρ_t | x_t; θ] = σ_u² (1 − Φ_t) + x_t'β E[ρ_t | x_t; θ] − E[ρ_t | x_t; θ]².    (6)

For both cases we use that, if z = max(0, z*) with z* ~ N(m, s²),

    E[z] = Pr[z = 0] E[z | z = 0] + Pr[z > 0] E[z | z > 0]
         = 0 + (1 − Φ(−m/s))(m + s φ(m/s)/(1 − Φ(−m/s)))
         = m (1 − Φ(−m/s)) + s φ(m/s)    (7)

and

    V[z] = s² (1 − Φ(−m/s)) + m E[z] − E[z]²,    (8)

see Johnson and Kotz (1970, pp. 81–83) and Gourieroux and Monfort (1995, p. 483). Note that in our case we have m = x_t'β and s = σ_u.

2.2. Parameter estimation

The parameters in model (1) with (2) can be estimated using maximum likelihood. To derive the likelihood function, we first consider the density of y_t given its past contained in Y_{t−1} = {y_{t−1}, ..., y_1} and given ρ_t, that is,

    f(y_t | Y_{t−1}, ρ_t; θ) = (1/(σ_ε √(2π))) exp(−(y_t − μ − (ρ + ρ_t) y_{t−1})²/(2σ_ε²)).    (9)

To obtain the density of y_t given Y_{t−1} unconditional on ρ_t, we have to integrate over the unobserved variable ρ_t. The unconditional density of y_t given Y_{t−1} and x_t equals

    f(y_t | Y_{t−1}, x_t; θ) = Pr[ρ_t = 0 | x_t; θ] f(y_t | Y_{t−1}, x_t, ρ_t; θ)|_{ρ_t = 0}
                             + ∫_{−x_t'β}^{∞} (1/σ_u) φ(u_t/σ_u) f(y_t | Y_{t−1}, x_t, ρ_t; θ)|_{ρ_t = x_t'β + u_t} du_t,    (10)
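The censored-regression moments in Eqs. (3)–(6) are straightforward to evaluate directly; a minimal sketch, assuming scipy is available and with hypothetical input values:

```python
from scipy.stats import norm

def clep_moments(xb, sigma_u):
    """Moments of rho_t = max(0, xb + u_t), u_t ~ N(0, sigma_u^2); Eqs. (3)-(6)."""
    Phi_t = norm.cdf(-xb / sigma_u)                              # Pr[rho_t = 0], Eq. (3)
    phi_t = norm.pdf(-xb / sigma_u)
    mean = xb * (1.0 - Phi_t) + sigma_u * phi_t                  # E[rho_t | x_t], Eq. (5)
    var = sigma_u**2 * (1.0 - Phi_t) + xb * mean - mean**2       # V[rho_t | x_t], Eq. (6)
    return Phi_t, mean, var

# example: the probability of no regime effect falls as x_t'beta increases
print(clep_moments(xb=0.01, sigma_u=0.024))
```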


where Pr[ρ_t = 0 | x_t; θ] is defined in Eq. (3). The log likelihood function simply equals

    L(Y_T | X_T; θ) = Σ_{t=1}^{T} ln f(y_t | Y_{t−1}, x_t; θ),    (11)

where X_T = (x_T, ..., x_1). The maximum of this likelihood function over the parameter space provides the maximum likelihood estimator for the model parameters, denoted by θ̂. In Appendix A, we derive the first-order condition for this optimization problem. The log likelihood function can be maximized using standard numerical optimization algorithms, like for example Gauss–Newton. In this paper, we opt for the BHHH algorithm of Berndt, Hall, Hall, and Hausman (1974), as it does not require the second-order derivatives of the likelihood function. Standard errors of the maximum likelihood estimates can be obtained via the average outer product of the scores; see Appendix A for details. In the application below, we opt for the heteroskedasticity-consistent covariance matrix estimator of White (1980). To decrease the computational burden of calculating the integral in Eq. (10) and the log likelihood function (Eq. (11)), we use that

    ∫_{b}^{∞} (1/σ_1) φ(z/σ_1) (1/σ_2) φ((z − a)/σ_2) dz
      = ∫_{b}^{∞} (1/(σ_1 √(2π))) exp(−½ (z/σ_1)²) (1/(σ_2 √(2π))) exp(−½ ((z − a)/σ_2)²) dz
      = (σ/(σ_1 σ_2 √(2π))) exp((a²/(2σ_2²))(σ²/σ_2² − 1)) ∫_{b}^{∞} (1/σ) φ((z − σ² a/σ_2²)/σ) dz
      = (σ/(σ_1 σ_2 √(2π))) exp((a²/(2σ_2²))(σ²/σ_2² − 1)) Φ((σ² a/σ_2² − b)/σ),    (12)

where σ² = (σ_1² σ_2²)/(σ_1² + σ_2²). Note that in our case z = u_t, σ_1 = σ_u, σ_2 = σ_ε/y_{t−1}, b = −x_t'β, and a = (y_t − μ − (ρ + x_t'β) y_{t−1})/y_{t−1}.

Once the parameters are estimated, we can estimate the value of the unobserved ρ_t variable. We consider the probability that ρ_t = 0 given Y_t and x_t,

    Pr[ρ_t = 0 | Y_t, x_t; θ] = Pr[ρ_t = 0 | x_t; θ] f(y_t | Y_{t−1}, x_t, ρ_t; θ)|_{ρ_t = 0} / f(y_t | Y_{t−1}, x_t; θ).    (13)

This probability can be used to analyze whether the AR parameter increases at time t. An estimate of the size of this increase follows from the conditional expected value of ρ_t given Y_t and x_t, which is given by

    E[ρ_t | Y_t, x_t; θ] = [∫_{−x_t'β}^{∞} (x_t'β + u_t) (1/σ_u) φ(u_t/σ_u) f(y_t | Y_{t−1}, x_t, ρ_t; θ)|_{ρ_t = x_t'β + u_t} du_t] / f(y_t | Y_{t−1}, x_t; θ).    (14)

With these estimates of the parameters, it is possible to examine the usefulness of the CLEP part of the model.

2.3. Residuals and diagnostics

As the added autoregressive parameter ρ_t in the AR-CLEP model is unobserved, evaluating the model in the maximum likelihood estimates does not directly provide estimates of the residuals. We define residuals as the difference between the series y_t and the conditional mean of y_t, that is,

    e_t = y_t − E[y_t | Y_{t−1}, x_t; θ] = y_t − μ − ρ y_{t−1} − E[ρ_t | x_t; θ] y_{t−1},    (15)

where E[ρ_t | x_t; θ] is defined in Eq. (5). The corresponding fit of the model can be defined as μ̂ + ρ̂ y_{t−1} + E[ρ_t | x_t; θ̂] y_{t−1}. The estimated residuals ê_t are simply the difference between y_t and this fit. Note, however, that the estimated residuals are heteroskedastic, as the conditional variance of y_t equals

    V[y_t | Y_{t−1}, x_t; θ] = E[(y_t − E[y_t | Y_{t−1}, x_t; θ])² | Y_{t−1}, x_t; θ]
                             = E[(ε_t + (ρ_t − E[ρ_t | x_t; θ]) y_{t−1})² | Y_{t−1}, x_t]
                             = σ_ε² + y²_{t−1} V[ρ_t | x_t; θ],    (16)

where V[ρ_t | x_t; θ] is defined in Eq. (6). When constructing diagnostics for the model, one should therefore take account of this heteroskedasticity.
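A sketch of how the conditional density of Eq. (10), the log likelihood of Eq. (11) and the residual quantities of Eqs. (15)–(16) can be evaluated. Here the integral in Eq. (10) is computed by generic numerical quadrature rather than the closed form of Eq. (12), so this is an illustration under our own conventions rather than the authors' implementation.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def cond_density(y_t, y_lag, xb, mu, rho, sigma_e, sigma_u):
    """f(y_t | Y_{t-1}, x_t; theta) of Eq. (10), by numerical integration."""
    def f_given_rho(rho_t):
        return norm.pdf(y_t, loc=mu + (rho + rho_t) * y_lag, scale=sigma_e)   # Eq. (9)
    p0 = norm.cdf(-xb / sigma_u)                                 # Pr[rho_t = 0], Eq. (3)
    integrand = lambda u: norm.pdf(u, scale=sigma_u) * f_given_rho(xb + u)
    tail, _ = quad(integrand, -xb, np.inf)                       # integrate over u_t >= -x_t'beta
    return p0 * f_given_rho(0.0) + tail

def loglik(y, xb, mu, rho, sigma_e, sigma_u):
    """Log likelihood, Eq. (11), conditioning on the first observation."""
    return sum(np.log(cond_density(y[t], y[t - 1], xb[t], mu, rho, sigma_e, sigma_u))
               for t in range(1, len(y)))

def residual_and_variance(y_t, y_lag, xb, mu, rho, sigma_e, sigma_u):
    """Residual e_t of Eq. (15) and conditional variance of Eq. (16)."""
    Phi_t = norm.cdf(-xb / sigma_u)
    phi_t = norm.pdf(-xb / sigma_u)
    e_rho = xb * (1.0 - Phi_t) + sigma_u * phi_t                 # E[rho_t | x_t], Eq. (5)
    v_rho = sigma_u**2 * (1.0 - Phi_t) + xb * e_rho - e_rho**2   # V[rho_t | x_t], Eq. (6)
    resid = y_t - mu - rho * y_lag - e_rho * y_lag
    return resid, sigma_e**2 + y_lag**2 * v_rho
```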


The estimated residuals can be used in diagnostic tests to specify the model and also to analyze possible misspecification. We consider the robust, regression-based diagnostics of Wooldridge (1991) to analyze misspecification in the conditional mean and variance. We consider a robust LM test for AR(P) errors and a robust LM test for ARCH(Q) effects in the residuals. The robust LM test for AR(P) errors can be constructed by running the following two regressions. First, we regress (ê_{t−1}, ..., ê_{t−P}) on the derivative (gradient) of E[y_t | Y_{t−1}, x_t; θ] with respect to (μ, ρ, β, σ_u), evaluated in the ML estimates. This provides the error vector r̂_t = (r̂_{1,t}, ..., r̂_{P,t}). In the second step we regress 1 on r̂_t ê_t; see Wooldridge (1991, p. 16). The test statistic for AR(P) serial correlation equals (T − P) − SSR, where SSR denotes the usual sum of squared residuals of the latter regression. It is asymptotically χ²(P) distributed under the null hypothesis of no serial correlation; see Wooldridge (1991, p. 20) for details. We may also account for the fact that the residuals are heteroskedastic. In that case, we have to divide the estimated residuals ê_t, the lagged estimated residuals ê_{t−i}, i = 1, ..., P, and the first derivative of E[y_t | Y_{t−1}, x_t; θ] by √ŝ_t, where ŝ_t is V[y_t | Y_{t−1}, x_t; θ] evaluated in the ML estimates. The computation of the first derivative of E[y_t | Y_{t−1}, x_t; θ] with respect to μ and ρ is straightforward. Additionally, the derivatives with respect to β and σ_u follow from

    ∂E[ρ_t | x_t; θ]/∂β = (1 − Φ_t) x_t,    ∂E[ρ_t | x_t; θ]/∂σ_u = φ_t,    (17)

where we use that

    ∂Φ_t/∂β = −(φ_t/σ_u) x_t,    ∂Φ_t/∂σ_u = φ_t x_t'β/σ_u²,
    ∂φ_t/∂β = −(φ_t x_t'β/σ_u²) x_t,    ∂φ_t/∂σ_u = φ_t (x_t'β)²/σ_u³.    (18)
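A sketch of the two auxiliary regressions behind the robust LM test for AR(P) errors described above. It assumes the estimated residuals and the gradient of the conditional mean (evaluated at the ML estimates) are already available, and it follows the (T − P) − SSR form of the statistic; the function is illustrative, not the authors' code.

```python
import numpy as np

def robust_lm_ar(resid, grad_mean, P):
    """Robust LM statistic for AR(P) errors (two auxiliary regressions).

    resid     : array (T,)   estimated residuals e_hat_t
    grad_mean : array (T, K) gradient of E[y_t | Y_{t-1}, x_t; theta] wrt the parameters
    """
    T = len(resid)
    # step 1: regress the lagged residuals on the gradient, keep the regression errors
    lags = np.column_stack([resid[P - i:T - i] for i in range(1, P + 1)])   # (T-P, P)
    G = grad_mean[P:]                                                       # (T-P, K)
    coef, *_ = np.linalg.lstsq(G, lags, rcond=None)
    r_hat = lags - G @ coef                                                 # (T-P, P)
    # step 2: regress 1 on r_hat_t * e_hat_t; the statistic is (T - P) - SSR
    Z = r_hat * resid[P:, None]
    b, *_ = np.linalg.lstsq(Z, np.ones(T - P), rcond=None)
    ssr = np.sum((np.ones(T - P) - Z @ b) ** 2)
    return (T - P) - ssr    # asymptotically chi-squared with P degrees of freedom
```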

To obtain the LM test statistic for ARCH(Q) effects in the residuals, we run the following two regressions; see Wooldridge (1991, p. 31). In the first step, we regress (ê²_{t−1}/ŝ_t, ..., ê²_{t−Q}/ŝ_t) on the first derivative of V[y_t | Y_{t−1}, x_t; θ]/ŝ_t with respect to (σ_ε, β, σ_u), evaluated in the ML estimates. The residuals of this regression are given by ω̂_t = (ω̂_{1,t}, ..., ω̂_{Q,t}). The test statistic now equals (T − Q) − SSR, where SSR denotes the usual sum of squared residuals of the regression of 1 on ω̂_t(ê_t²/ŝ_t − 1). The test statistic is asymptotically χ²(Q) distributed under the null hypothesis that the variance of y_t is given by Eq. (16); see again Wooldridge (1991) for details. The derivatives of V[y_t | Y_{t−1}, x_t; θ] with respect to β and σ_u follow from

    ∂V[ρ_t | x_t; θ]/∂β = (σ_u φ_t + E[ρ_t | x_t; θ] + x_t'β (1 − Φ_t) − 2 E[ρ_t | x_t; θ] (1 − Φ_t)) x_t,    (19)

    ∂V[ρ_t | x_t; θ]/∂σ_u = 2σ_u (1 − Φ_t) − 2 E[ρ_t | x_t; θ] φ_t.    (20)

Simulation experiments, designed to analyze the distribution of the robust LM test statistics for serial correlation and ARCH effects under the null hypothesis, suggest that the asymptotic theory is valid. We report on these experiments in Appendix B.

2.4. Forecasting

The AR-CLEP model in Eqs. (1) and (2) uses exogenous variables to explain the parameter variation in the time series model. Forecasts from this model are therefore conditional on the values of these explanatory variables. First, we consider the out-of-sample forecast of the value of the AR coefficient. A forecast for ρ_{T+1} given X_{T+1} is given by the unconditional expectation of ρ_{T+1} given x_{T+1}; see Eq. (5). Multi-step ahead forecasts of ρ_t can be computed in the same way. As the vector x_t contains predetermined explanatory variables, their values are known at time T. However, for multi-step ahead forecasts, it is likely that some values of the explanatory variables are unknown at time T and that they themselves have to be replaced by forecasts. The one-step ahead forecast of the series at time T conditional on Y_T and x_{T+1} follows from

    E[y_{T+1} | Y_T, x_{T+1}; θ̂] = ∫_{−∞}^{∞} y_{T+1} f(y_{T+1} | Y_T, x_{T+1}; θ̂) dy_{T+1}
                                  = μ̂ + (ρ̂ + E[ρ_{T+1} | x_{T+1}; θ̂]) y_T,    (21)


where f(y_{T+1} | Y_T, x_{T+1}; θ̂) and E[ρ_t | x_t; θ̂] are defined in Eqs. (10) and (5), respectively. Hence, the one-step ahead forecast is equal to the fit of the model defined below Eq. (15). The two-step ahead forecast conditional on (x_{T+2}, x_{T+1}) and Y_T is given by

    E[y_{T+2} | Y_T, x_{T+2}, x_{T+1}; θ̂] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} y_{T+2} f(y_{T+2} | Y_{T+1}, x_{T+2}; θ̂) f(y_{T+1} | Y_T, x_{T+1}; θ̂) dy_{T+2} dy_{T+1}.    (22)

This expression simplifies to

    μ̂ + (ρ̂ + E[ρ_{T+2} | x_{T+2}; θ̂]) E[y_{T+1} | Y_T, x_{T+1}; θ̂].    (23)

In general, the h-step ahead forecast, conditional on (x_{T+h}, ..., x_{T+1}) and Y_T, equals

    E[y_{T+h} | Y_T, x_{T+h}, ..., x_{T+1}; θ̂] = ∫_{−∞}^{∞} ... ∫_{−∞}^{∞} y_{T+h} f(y_{T+h} | Y_{T+h−1}, x_{T+h}; θ̂) ... f(y_{T+1} | Y_T, x_{T+1}; θ̂) dy_{T+h} ... dy_{T+1},    (24)

which equals

    μ̂ + (ρ̂ + E[ρ_{T+h} | x_{T+h}; θ̂]) E[y_{T+h−1} | Y_T, x_{T+h−1}, ..., x_{T+1}; θ̂].    (25)
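The h-step ahead forecasts of Eqs. (21)–(25) reduce to a simple recursion once E[ρ_{T+h} | x_{T+h}] is available; a minimal sketch, assuming the future values of x_{T+h}'β (or forecasts of them) are supplied by the user:

```python
import numpy as np
from scipy.stats import norm

def e_rho(xb, sigma_u):
    """E[rho_t | x_t; theta], Eq. (5)."""
    return xb * (1.0 - norm.cdf(-xb / sigma_u)) + sigma_u * norm.pdf(-xb / sigma_u)

def forecast_path(y_T, xb_future, mu, rho, sigma_u):
    """h-step ahead forecasts of Eq. (25): each step multiplies the previous
    forecast by rho + E[rho_{T+h} | x_{T+h}] and adds the intercept."""
    forecasts = []
    y_prev = y_T
    for xb in xb_future:                       # xb_future[h-1] = x_{T+h}' beta
        y_prev = mu + (rho + e_rho(xb, sigma_u)) * y_prev   # Eqs. (21)/(25)
        forecasts.append(y_prev)
    return np.array(forecasts)
```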

3. Extensions and relation to other models

In this section we consider some extensions of our basic model in Eqs. (1) and (2). Next, we relate our model to some popular alternative models for modeling time-varying autoregressive parameters. These models will also be used in the empirical exercise below.

3.1. Extensions

The most obvious extension is to allow for higher order dynamics in the autoregressive part of the model. One may then opt for the AR(p) model

    y_t = μ + (ρ + ρ_t) y_{t−1} + Σ_{i=1}^{p−1} α_i Δy_{t−i} + ε_t,    (26)

where Δ denotes the familiar first-differencing operator. Here, the (ρ + ρ_t) parameter equals the sum of the autoregressive parameters of an AR(p) model in levels. Estimation, inference and forecasting can be done in the same way as for the AR(1)-CLEP model above. Another possibility is to consider

    y_t − (ρ + ρ_t) y_{t−1} − μ = Σ_{i=1}^{p−1} α_i (y_{t−i} − (ρ + ρ_{t−i}) y_{t−1−i} − μ) + ε_t.    (27)

This model assumes that the largest root of the autoregression varies over time. Parameter estimation is now substantially more difficult than for Eq. (26), as y_t directly depends on lagged ρ_t values. This suggests that a recursive method is needed to evaluate the likelihood function. For convenience, we consider Eq. (26) in the empirical illustration below.

3.2. Related time-varying parameter models

Clearly, the AR-CLEP model is a time series model with time-varying parameters. In the past, several other model specifications have been proposed to model parameter variation over time, and some of these models bear similarities with ours. Basically, these models can also be written as

    y_t = μ + (ρ + ρ_t) y_{t−1} + ε_t,    (28)

where now ρ_t takes different forms. The threshold model, advocated in for example Tong (1983), assumes that

    ρ_t = ρ*   if x_t'β ≥ 0,
    ρ_t = 0    if x_t'β < 0.    (29)

Identification and estimation (especially of β) of this threshold model is, however, very complicated, as the value of ρ_t depends on an unknown linear combination of variables exceeding a threshold value; see also Chen (1995). To overcome this drawback, the indicator function in Eq. (29) may be replaced by a smooth function; see Granger and Teräsvirta (1993). This smoothed threshold model allows for a continuum of possible values for the AR parameter in the interval [ρ, ρ + ρ*] by considering

    ρ_t = ρ* F(x_t'β),    (30)


where F(·) is a continuous function which takes values in the region [0, 1]. For example, F can be the logistic function

    F(z) = 1/(1 + exp(z)),    (31)

see also Teräsvirta and Anderson (1992). These threshold models impose an upper bound on the value of the autoregressive parameter, and this is in contrast with our model. For example, if ρ* > 0, the maximum value of the autoregressive parameter is ρ + ρ*. The exponential AR [EAR] model in Haggan and Ozaki (1981) is a model without such a restriction, as it assumes

    ρ_t = exp(x_t'β).    (32)

As ρ_t is now always positive, this model assumes that the autoregressive parameter is always in excess of ρ. Only for a very small value of x_t'β does ρ_t get close to 0, so that the autoregressive parameter approximates ρ. The second difference with our AR-CLEP model is the absence of a stochastic term u_t in the latter model. For the above models, the parameter variation is explained by a linear combination of exogenous variables, x_t'β. There are also models where ρ_t is a time series process, like for example

    ρ_t = c ρ_{t−1} + u_t,   u_t ~ NID(0, σ_u²),    (33)

see Harvey (1981) and Grillenzoni (1993) among others. The autoregressive parameter in Eq. (1) then follows a stationary autoregressive process around the mean ρ if |c| < 1. A special case of this model is analyzed by Leybourne, McCabe, and Mills (1996) and Leybourne, McCabe, and Tremayne (1996). These authors impose ρ to be 1 and discuss several special cases of Eq. (33), including c = 1. These so-called randomized unit root processes allow for time periods with explosive behavior in y_t, where ρ + ρ_t is larger than 1, but also for periods with ρ + ρ_t < 1, corresponding to non-explosive or error-correcting behavior. A related model is considered in Granger and Swanson (1997), who assume that exp(ρ + ρ_t) follows a first-order stationary autoregressive process. Under the restriction that E[exp(ρ + ρ_t)] = 1, they obtain a so-called stochastic unit root process, which also allows for sometimes explosive and sometimes approximately stationary behavior. A common property of these models is that the time-variation in the autoregressive parameters is only explained by an unobserved stochastic process, which is in contrast with our model, where we include observed explanatory variables.

Finally, we mention the Markov switching model; see Hamilton (1989, 1990). A simple Markov switching model assumes that

    ρ_t = ρ* s_t,    (34)

where s_t is either 0 or 1, and where s_t follows an unobserved first-order Markov process with transition probabilities

    Pr[s_t = 0 | s_{t−1} = 0] = p_t,   Pr[s_t = 1 | s_{t−1} = 0] = 1 − p_t,
    Pr[s_t = 1 | s_{t−1} = 1] = q_t,   Pr[s_t = 0 | s_{t−1} = 1] = 1 − q_t.    (35)

Hamilton (1989) assumes that the transition probabilities are constant over time, that is, p_t = p and q_t = q for all t. Filardo (1994) and Diebold, Lee, and Weinbach (1994) relax this assumption and consider a Markov switching model with time-varying probabilities [MSTVP], where the transition probabilities are a function of explanatory variables, that is,

    p_t = 1/(1 + exp(x_t'β_p))   and   q_t = 1/(1 + exp(x_t'β_q)).    (36)

The AR(1) parameter is restricted to be either ρ or ρ + ρ*, but its conditional expectation varies over time. In our empirical section below, we will compare the various models in a forecasting exercise.

4. Forecasting unemployment

In this section we consider the AR-CLEP model and various related models in an out-of-sample forecasting exercise. We first provide some preliminary remarks on modeling. Next, we give the in-sample estimation results for the US data for illustrative purposes. Finally, we compare the forecasting performance of the models for the three countries considered.


4.1. Preliminaries

We use seasonally adjusted monthly total unemployment rates for the United States, Canada and West Germany, all obtained from the Datastream database. All samples range from 1965.01 to 1999.12. We estimate all model parameters for the full sample, but also for subsamples, which allows us to examine the out-of-sample forecasting performance. To explain possible time-variation in the AR parameters, we consider leading indicator variables. We use the country-specific monthly composite leading indicator variable from the OECD, obtained from Datastream.

There are a few practical decisions to be made in order to arrive at the specification of the AR-CLEP model, denoted as

    y_t = μ + (ρ + ρ_t) y_{t−1} + Σ_{i=1}^{p−1} α_i Δy_{t−i} + ε_t,
    ρ_t = max(0, β_0 + β_1 Δ_k x_{t−q} + u_t).    (37)

These concern the order of the autoregression p, the optimal transformation of the leading indicator variable (Δ_k) and the lag structure in the censored regression (q). We determine the optimal value for p (the AR order) by minimizing the BIC of the following AR(p) model,

    y_t = α + Σ_{i=1}^{p} b_i y_{t−i} + ε_t,   p = 1, ..., 12.    (38)

The optimal values for Δ_k and q are set according to the maximum values of the relevant t-values in

    y_t = μ + ρ y_{t−1} + β y_{t−1} Δ_k x_{t−q} + Σ_{i=1}^{p−1} α_i Δy_{t−i} + ε_t,    (39)

with q = 1, ..., 6 and k = 1, 6. Next, we estimate the AR-CLEP model and apply diagnostic tests for serial correlation (of lag 1 and of lags 1 to 4) and for ARCH effects (of order 1 and of orders 1 to 4). Given the simulation experiments in Appendix B, we have confidence in the practical usefulness of these tests. If these diagnostics suggest that there is misspecification, we change the orders of p, q or k until the tests do not reject. The final model orders are given in Table 1.

Table 1
Lag order p in the autoregression for y_t, transformation k of the leading indicator variable, and lag order q of the lagged leading variable in the censored regression

Series           p    k    q
United States    5    6    1
Canada           1    6    1
Germany          5    1    1

In the forecasting study, we compare our model with an autoregressive (AR) model, a multiplicative model (ARXY), an exponential autoregression (EAR), a logistic smooth transition autoregression (LSTR), and a Markov switching model with time-varying transition probabilities (MSTVP). The first four models are written as

    AR:    y_t = μ + ρ y_{t−1} + Σ_{i=1}^{p−1} α_i Δy_{t−i} + ε_t,    (40)

    ARXY:  y_t = μ + ρ y_{t−1} + Σ_{i=1}^{p−1} α_i Δy_{t−i} + β y_{t−1} Δ_k x_{t−q} + ε_t,    (41)

    EAR:   y_t = μ + (ρ + exp(β_0 + β_1 Δ_k x_{t−q})) y_{t−1} + Σ_{i=1}^{p−1} α_i Δy_{t−i} + ε_t,    (42)

    LSTR:  y_t = μ + (ρ + ρ*(1 + exp(β_0 + β_1 Δ_k x_{t−q}))^{−1}) y_{t−1} + Σ_{i=1}^{p−1} α_i Δy_{t−i} + ε_t,    (43)

while the MSTVP model has been discussed earlier. The values of p, q and k in these models are set equal to those in the AR-CLEP model for comparison purposes.
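The AR order selection by BIC in Eq. (38) can be sketched with ordinary least squares; this is a generic illustration only (the exact BIC convention used by the authors' software may differ).

```python
import numpy as np

def bic_ar_order(y, max_p=12):
    """Choose the AR order p in Eq. (38) by minimising the BIC."""
    y = np.asarray(y, dtype=float)
    best_p, best_bic = None, np.inf
    for p in range(1, max_p + 1):
        Y = y[max_p:]                                    # common sample for all p
        X = np.column_stack([np.ones(len(Y))]
                            + [y[max_p - i:len(y) - i] for i in range(1, p + 1)])
        beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
        resid = Y - X @ beta
        T = len(Y)
        sigma2 = resid @ resid / T
        bic = T * np.log(sigma2) + (p + 1) * np.log(T)   # constant plus p lag coefficients
        if bic < best_bic:
            best_p, best_bic = p, bic
    return best_p
```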


4.2. In-sample results for the US

Using the modeling approach described above, we obtain the following model for US unemployment for the full sample, that is,

    ŷ_t = 0.064 (0.031) + (0.978 (0.005) + max(ρ_t, 0)) y_{t−1} − 0.134 (0.046) Δy_{t−1} − 0.002 (0.046) Δy_{t−2}
          + 0.057 (0.044) Δy_{t−3} + 0.128 (0.046) Δy_{t−4} + ε_t,   ε_t ~ NID(0, 0.126² (0.007)),    (44)

with

    ρ_t = 0.007 (0.006) − 0.009 (0.001) Δ_6 x_{t−1} + u_t,   u_t ~ NID(0, 0.024² (0.004)),    (45)

where heteroskedastic-consistent standard errors appear between parentheses. We also considered a version of Eq. (44) without the Δy_{t−2} and Δy_{t−3} variables, but the qualitative results below do not change. The model can be interpreted as follows. When large positive values of changes in the leading indicator occur, that is, when the economy is going up, there is a large probability that ρ_t is zero. And, when the economy is going down, the probability that ρ_t > 0 increases, and increases in the unemployment rate can be expected.

To test whether this model can be simplified to a linear model, we test whether the censored regression part of model (2) is present in the autoregressive structure. The likelihood ratio statistic for the absence of the censored regression part in the model (β = 0 and σ_u² = 0) equals 84.506. As the total restriction contains a single one-sided alternative (σ_u² = 0 versus σ_u² > 0) and two two-sided alternatives (β = 0 versus β ≠ 0), this test statistic is asymptotically (1/2)χ²(2) + (1/2)χ²(3) distributed; see Wolak (1989, pp. 19–20). The 95% percentile of this mixture distribution is 7.073. Hence, linearity is clearly rejected against the alternative of the presence of censored latent effects in the ρ parameter.

To check for possible misspecification, we calculate the residuals defined in Eq. (15). The robust LM test statistic for first-order serial correlation equals 0.812, with a p value of 0.368. If we opt for the version of this test statistic where we specify the heteroskedasticity in the residuals, the test statistic equals 0.838, which is again not significant at the 5% level. Similar tests for first-to-fourth order serial correlation are 5.919 (0.205) and 2.924 (0.571), respectively, where p values appear in parentheses. The LM statistic for first-order ARCH effects in the residuals equals 1.732, which is clearly not significant compared with the 95% percentile of the χ²(1) distribution. The same test for first-to-fourth order ARCH effects equals 3.991, which is also not significant at the 5% level.

Given its empirical adequacy, we may interpret the parameters in Eqs. (44) and (45). If ρ_t = 0, the sum of the autoregressive parameters (or persistence) ρ of the AR model equals 0.978, which, when compared with its standard error, is not equal to 1. This indicates that shocks die out very slowly if ρ_t = 0. The time-variation in persistence is explained by the explanatory variable in Eq. (45). Its coefficient has the expected sign. Finally, the variance of the error term of the censored regression (Eq. (45)) appears to differ significantly from zero, and hence our model does not reduce to a threshold model with a non-stochastic threshold.

Fig. 1 shows the estimated conditional probabilities Pr[ρ_t > 0 | Y_t, x_t; θ̂] defined in Eq. (13). In Fig. 2 we present the conditional expectation ρ + E[ρ_t | Y_t, x_t; θ̂]. Remember that large values of these probabilities indicate periods with higher persistence.
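The 95% percentile of the (1/2)χ²(2) + (1/2)χ²(3) mixture used for the linearity test above can be reproduced numerically; the sketch below solves for the critical value and returns a number close to the 7.073 quoted in the text.

```python
from scipy.stats import chi2
from scipy.optimize import brentq

def mixture_critical_value(alpha=0.05):
    """Critical value of the (1/2) chi2(2) + (1/2) chi2(3) mixture distribution."""
    tail = lambda c: 0.5 * chi2.sf(c, 2) + 0.5 * chi2.sf(c, 3) - alpha
    return brentq(tail, 0.01, 50.0)

print(mixture_critical_value())   # roughly 7.07
```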

Fig. 1. Conditional probabilities Pr[ρ_t > 0 | Y_t, x_t; θ̂], that is, the probability that the persistence parameter of US unemployment exceeds 0.978, due to the addition of the outcome of the censored regression.


Fig. 2. Conditional expectation ρ + E[ρ_t | Y_t, x_t; θ̂], that is, the expected value of the persistence parameter, conditional on the outcome of the censored regression.

The graph in Fig. 2 shows clear indications of periods with explosive increases in unemployment, as the conditional expectation is in excess of 1. Interestingly, several periods with high values of Pr[ρ_t > 0 | Y_t, x_t; θ̂] appear to correspond with periods of increasing unemployment. As the graphical results are perhaps not immediately conclusive, we use the conditional probabilities to investigate business cycle turning points; see Hamilton (1990) for a similar approach in Markov switching models. We may define a recession as six consecutive months for which Pr[ρ_t > 0 | Y_t, x_t; θ̂] > 0.5. A trough then corresponds with the last observation in a recession and a peak with the last observation in an expansion. The first two columns of Table 2 display the peaks and troughs resulting from the conditional probabilities. The peaks and troughs correspond reasonably well with the official NBER peaks and troughs displayed in the last two columns of Table 2. The largest difference is that the AR-CLEP model seems to identify a few more, but shorter, recessions; note, however, that our analysis only concerns unemployment.

Table 2
Peaks and troughs for US unemployment based on conditional probabilities

Unemployment^a                NBER
Peak       Trough             Peak       Trough
1966.07    1967.04
1969.05    1970.12            1969.12    1970.11
1974.03    1975.05            1973.11    1975.03
1979.07    1980.08            1980.01    1980.07
1981.07    1982.11            1981.07    1982.11
1984.06    1985.01
1989.05    1990.01
1990.06    1991.06            1990.07    1991.03

^a A recession is defined by six consecutive months for which Pr[ρ_t > 0 | Y_t, x_t; θ̂] > 0.5. A peak corresponds with the last expansion observation before a recession and a trough with the last observation in a recession.
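The dating rule behind Table 2 (at least six consecutive months with Pr[ρ_t > 0 | Y_t, x_t; θ̂] > 0.5 form a recession; the last observation of the recession is a trough, and the last expansion observation before it is a peak) can be sketched as follows, with the probability series assumed given.

```python
import numpy as np

def date_turning_points(prob, threshold=0.5, min_run=6):
    """Return (peaks, troughs) as index positions from a series of
    conditional probabilities Pr[rho_t > 0 | Y_t, x_t]."""
    above = np.asarray(prob) > threshold
    peaks, troughs = [], []
    t = 0
    while t < len(above):
        if above[t]:
            start = t
            while t < len(above) and above[t]:
                t += 1
            if t - start >= min_run:           # a recession of at least six months
                if start > 0:
                    peaks.append(start - 1)    # last expansion observation before it
                troughs.append(t - 1)          # last observation in the recession
        else:
            t += 1
    return peaks, troughs
```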

4.3. Forecasting competition

To compare our AR-CLEP model with alternative time series models with possibly time-varying autoregressive parameters, we perform an out-of-sample forecast comparison for the three countries of interest. We confine our focus to one-step-ahead forecasts, as we believe it is already quite relevant to be able to forecast recessionary periods 1 month ahead. Of course, multi-step ahead forecasts can be made too, but this would require the construction of a forecasting model for the changes in the leading indicator. As alternative models, we consider the linear and nonlinear models discussed above. The MSTVP model is estimated using the EM algorithm of Dempster, Laird, and Rubin (1977); see Hamilton (1990) and Diebold, Lee, and Weinbach (1994). We impose the same values of p, k and q as for the AR-CLEP model given in Table 1, for comparison purposes. It may be that this decision biases the subsequent analysis in favor of our new model. On the other hand, if we allow these values to vary, we cannot tell in the end whether differences in forecast performance are due to the model specification or to specific parameter settings.

To evaluate the out-of-sample forecast performance of the models, we hold out the last 12, 60 and 120 months. We re-estimate the parameters of the AR-CLEP and the above five models for the new samples. We keep the values of p, q and k the same, and hence, the forecasts are not genuine ex ante forecasts. We generate one-step ahead forecasts and compare the forecasted values with the true values. The one-step ahead forecasts for the AR-CLEP model are generated using Eq. (21), and hence we condition on the explanatory variable.


For the other models, one-step ahead forecasts can be generated in a straightforward way, except for the Markov switching model. The one-step ahead forecast at time T using this model is given by

    E[y_{T+1} | Y_T, x_{T+1}] = μ̂ + k̂_{T+1} ρ̂ y_T + (1 − k̂_{T+1})(ρ̂ + ρ̂*) y_T + α̂ (y_T − y_{T−1}),    (46)

where k_{T+1} = (1 − s_T) p_{T+1} + s_T (1 − q_{T+1}), that is, the probability that s_{T+1} = 0. As the variable s_T is not observed, it has to be replaced by the estimated probability that s_T = 0. We use the filter provided in Hamilton (1989) to compute smoothed conditional probabilities Pr[s_T = 0 | Y_T, x_T].

Table 3 shows the results of the forecast comparison for the US. The first two columns show the root mean squared forecast error [RMSE] and the mean absolute percentage forecast error [MAPE]. The next two columns give the results of the non-parametric binomial sign test. The first of these displays the fraction of periods in which the forecast errors of the AR-CLEP model are smaller in absolute value than the forecast errors of the alternative model; the second shows the p value of the one-sided test of equal forecast accuracy against the alternative hypothesis that the AR-CLEP model is better. The within-sample results for the US, that is, the first panel of Table 3, show that the AR-CLEP model outperforms all other models on MAPE and on the binomial test. For the MAPE, this finding persists for the out-of-sample forecasts, although the binomial test is no longer significant. The closest competitor to the AR-CLEP model is the MSTVP model.

Table 3
United States, forecasting performance

                 Criteria             Sign test               Encompassing tests^a
Model            RMSE×100   MAPE      Fraction    p value     p value I   p value II

Forecasting sample 1965.01–1999.12 (in-sample)
AR-CLEP          15.02      1.95
AR               16.29      2.05      234/420     (0.01)      0.81        0.00
ARXY             16.27      2.05      236/420     (0.00)      0.94        0.00
EAR              16.28      2.05      231/420     (0.02)      0.85        0.00
LSTR             16.29      2.05      234/420     (0.01)      0.81        0.00
MSTVP            15.35      1.97      224/420     (0.08)      0.04        0.00

Forecasting sample 1990.01–1999.12
AR-CLEP          13.28      1.84
AR               13.67      1.92      64/120      (0.21)      0.05        0.00
ARXY             13.72      1.92      65/120      (0.16)      0.06        0.00
EAR              13.67      1.92      64/120      (0.21)      0.06        0.00
LSTR             13.67      1.92      64/120      (0.21)      0.05        0.00
MSTVP            13.54      1.86      65/120      (0.16)      0.25        0.02

Forecasting sample 1995.01–1999.12
AR-CLEP          13.03      2.04
AR               14.17      2.20      32/60       (0.26)      0.05        0.00
ARXY             14.10      2.17      32/60       (0.26)      0.11        0.00
EAR              14.10      2.17      32/60       (0.26)      0.11        0.00
LSTR             14.17      2.20      32/60       (0.26)      0.05        0.00
MSTVP            12.95      2.00      29/60       (0.55)      0.24        0.97

^a The null hypothesis for test I is that the forecasts from the AR-CLEP model cannot be improved by adding those from a competing model. For test II, the null hypothesis is the reverse case.


The final two sets of two columns of Table 3 show the outcomes of forecast encompassing tests. Let f_{T+h} be the forecast from our AR-CLEP model and f̄_{T+h} the forecast from one of the competing models. Then, the forecasts from the AR-CLEP model are said to encompass the forecasts from the competing model if the coefficient δ in the following regression model is zero,

    y_{T+h} − f_{T+h} = δ (f̄_{T+h} − f_{T+h}) + e_{T+h},   for h = 1, ..., H,    (47)

where y_{T+h} is the true value; see Clements and Hendry (1993, p. 634). If δ ≠ 0, the forecasting performance of the AR-CLEP model can be improved by adding some of the features of the competing model. To test for δ = 0, we use an F-statistic. The first two columns of the last four in the tables concern the test of whether the AR-CLEP model encompasses its rival models. The last two columns concern the test of whether the AR-CLEP model is encompassed by the other models. The test results in the first two columns (of the last four) suggest that we cannot reject that forecasts generated by the AR-CLEP model encompass forecasts generated by one of the competing models. Hence, the forecasting performance of the AR-CLEP model cannot be improved by adding some of the features of the other models, except perhaps by features of the MSTVP model for the total sample (p value of 0.04). These results concern all samples. If we consider the encompassing tests in the final columns of the table, we see that we reject that forecasts generated by all competing models encompass the forecasts generated by the AR-CLEP model, for all samples, except for the MSTVP model for the sample 1995.01–1999.12. In general, we find that the AR-CLEP model displays excellent within-sample and out-of-sample results, compared to close competitors.
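A sketch of the encompassing regression of Eq. (47) and an F-test for δ = 0, assuming arrays of realizations and of the two forecast series; the helper name and layout are ours.

```python
import numpy as np
from scipy.stats import f as f_dist

def encompassing_test(actual, f_own, f_rival):
    """Test delta = 0 in  y - f_own = delta (f_rival - f_own) + e,  Eq. (47)."""
    y = np.asarray(actual) - np.asarray(f_own)
    x = np.asarray(f_rival) - np.asarray(f_own)
    delta = (x @ y) / (x @ x)
    resid = y - delta * x
    H = len(y)
    # F statistic for the single restriction delta = 0 (no intercept in the regression)
    ssr_restricted = y @ y
    ssr_unrestricted = resid @ resid
    F = (ssr_restricted - ssr_unrestricted) / (ssr_unrestricted / (H - 1))
    p_value = f_dist.sf(F, 1, H - 1)
    return delta, F, p_value
```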

In Fig. 3 we zoom in on the squared forecast errors of the AR-CLEP and MSTVP models for the US for the most recent recessionary period. We see that for the first four observations of the recession at the beginning of the nineties, that is, 1990.08–1990.11, the AR-CLEP model gives much smaller forecast errors, while, additionally, it also gives smaller errors when the recession is over, that is, in 1991.04 and 1991.05. Hence, the AR-CLEP model works rather well when it really matters.

Fig. 3. Squared forecast errors of the AR-CLEP and MSTVP models for 1990.08–1991.03.

Table 4
Canada, forecasting performance

                 Criteria             Sign test               Encompassing tests^a
Model            RMSE×100   MAPE      Fraction    p value     p value I   p value II

Forecasting sample 1965.01–1999.12 (in-sample)
AR-CLEP          21.79      2.35
AR               22.84      2.41      215/420     (0.30)      0.90        0.00
ARXY             22.84      2.41      214/420     (0.33)      0.90        0.00
EAR              22.75      2.40      214/420     (0.33)      0.72        0.00
LSTR             22.78      2.40      212/420     (0.40)      0.75        0.00
MSTVP            21.51      2.32      200/420     (0.82)      0.00        0.13

Forecasting sample 1990.01–1999.12
AR-CLEP          24.02      1.94
AR               23.91      1.98      59/120      (0.54)      0.02        0.03
ARXY             24.37      2.04      61/120      (0.39)      0.04        0.01
EAR              24.37      2.04      61/120      (0.39)      0.04        0.01
LSTR             25.34      2.08      61/120      (0.39)      0.20        0.00
MSTVP            23.83      1.94      59/120      (0.54)      0.07        0.76

Forecasting sample 1995.01–1999.12
AR-CLEP          21.50      1.87
AR               19.26      1.68      23/60       (0.95)      0.00        0.87
ARXY             19.28      1.68      24/60       (0.92)      0.00        0.86
EAR              19.28      1.68      24/60       (0.92)      0.00        0.86
LSTR             19.90      1.75      24/60       (0.92)      0.00        0.27
MSTVP            19.09      1.68      27/60       (0.74)      0.00        0.39

^a The null hypothesis for test I is that the forecasts from the AR-CLEP model cannot be improved by adding those from a competing model. For test II, the null hypothesis is the reverse case.


In Tables 4 and 5, we report the forecasting results for Canada and West Germany. Perhaps not too surprisingly, we find for Canada roughly the same qualitative results as for the US. For the first two samples, that is, the full sample and the out-of-sample forecast sample 1990.01–1999.12, we observe that the forecasts from the AR-CLEP model cannot be encompassed, while those of the other models can, except again the MSTVP model. The results for the two smaller samples are less conclusive, although for the forecasts for 1995.01–1999.12 we observe that the AR-CLEP model does not perform well at all. It is possible that in this period, where no serious recessionary months occurred, our model gives false signals. Finally, for the West-German data, we find that

Table 5
West Germany, forecasting performance

                 Criteria             Sign test               Encompassing tests^a
Model            RMSE×100   MAPE      Fraction    p value     p value I   p value II

Forecasting sample 1965.01–1999.12 (in-sample)
AR-CLEP          7.73       2.13
AR               7.88       2.20      244/420     (0.00)      0.55        0.00
ARXY             7.72       2.14      221/420     (0.13)      0.23        0.37
EAR              7.72       2.14      216/420     (0.26)      0.24        0.79
LSTR             7.71       2.13      213/420     (0.37)      0.18        0.90
MSTVP            7.76       2.15      230/420     (0.02)      0.16        0.02

Forecasting sample 1990.01–1999.12
AR-CLEP          8.04       0.74
AR               6.70       0.64      53/120      (0.88)      0.00        0.03
ARXY             7.07       0.67      56/120      (0.74)      0.00        0.16
EAR              7.96       0.72      53/120      (0.88)      0.33        0.65
LSTR             7.38       0.69      50/120      (0.96)      0.00        0.00
MSTVP            7.79       0.74      67/120      (0.09)      0.00        0.13

Forecasting sample 1995.01–1999.12
AR-CLEP          6.82       0.57
AR               6.32       0.54      32/60       (0.26)      0.00        0.10
ARXY             6.36       0.52      29/60       (0.55)      0.00        0.44
EAR              6.56       0.54      27/60       (0.74)      0.00        0.03
LSTR             6.51       0.54      26/60       (0.82)      0.00        0.02
MSTVP            6.83       0.57      32/60       (0.26)      0.53        0.53

^a The null hypothesis for test I is that the forecasts from the AR-CLEP model cannot be improved by adding those from a competing model. For test II, the null hypothesis is the reverse case.


there are not many differences between the models, at least for the within-sample period, although here the MSTVP model performs rather poorly. One possible reason for this is that the leading indicator variables for the US and Canada are better predictors for recessions than that of West Germany. For the two larger forecast samples, the AR-CLEP model forecasts do not encompass those of the other models. In sum, the AR-CLEP model can produce better in-sample and out-of-sample forecasts for the US unemployment rate, and also to a lesser extent for Canada, than the linear AR and ARXY models, the nonlinear exponential AR model, the logistic smooth transition regression model and the Markov switching model. For the West-German data, the differences across models are not large.

5. Conclusion

In this paper we proposed a novel, parsimonious time series model with time-variation in the AR parameters. In our detailed illustration for three unemployment series, we showed that the model yields plausible inference (details were given for the US data) and could outperform alternative models in terms of forecasting. An interesting topic for further research is to see whether our model can be extended to a multivariate setting. If so, it is worthwhile to see whether the same or other linear combinations of lagged variables predict recessionary periods for all variables.

Acknowledgements

We thank an associate editor, two anonymous referees and participants at the International Symposium on Forecasting (Dublin, 2002) for many detailed and helpful comments.

Appendix A

To derive the first derivative of the log likelihood function (Eq. (11)), we first consider the partial derivatives of the density f_t = f(y_t | Y_{t−1}, x_t; θ) given in Eq. (10) with respect to the model parameters θ = {μ, ρ, σ_ε, β, σ_u}:

    ∂ln f_t/∂μ = (1/f_t) [ Φ_t (e_t/σ_ε²) f(y_t | Y_{t−1}, ρ_t; θ)|_{ρ_t = 0}
        + ∫_{−x_t'β}^{∞} (1/σ_u) φ(u_t/σ_u) ((e_t − (x_t'β + u_t) y_{t−1})/σ_ε²) f(y_t | Y_{t−1}, ρ_t; θ)|_{ρ_t = x_t'β + u_t} du_t ],

    ∂ln f_t/∂ρ = (1/f_t) [ Φ_t (e_t/σ_ε²) f(y_t | Y_{t−1}, ρ_t; θ)|_{ρ_t = 0}
        + ∫_{−x_t'β}^{∞} (1/σ_u) φ(u_t/σ_u) ((e_t − (x_t'β + u_t) y_{t−1})/σ_ε²) f(y_t | Y_{t−1}, ρ_t; θ)|_{ρ_t = x_t'β + u_t} du_t ] y_{t−1},

    ∂ln f_t/∂σ_ε = (1/f_t) [ Φ_t (e_t²/σ_ε³ − 1/σ_ε) f(y_t | Y_{t−1}, ρ_t; θ)|_{ρ_t = 0}
        + ∫_{−x_t'β}^{∞} (1/σ_u) φ(u_t/σ_u) ((e_t − (x_t'β + u_t) y_{t−1})²/σ_ε³ − 1/σ_ε) f(y_t | Y_{t−1}, ρ_t; θ)|_{ρ_t = x_t'β + u_t} du_t ],

    ∂ln f_t/∂β = (1/f_t) [ ∫_{−x_t'β}^{∞} (1/σ_u) φ(u_t/σ_u) (y_{t−1}(e_t − (x_t'β + u_t) y_{t−1})/σ_ε²) f(y_t | Y_{t−1}, ρ_t; θ)|_{ρ_t = x_t'β + u_t} du_t ] x_t,

and

    ∂ln f_t/∂σ_u = (1/f_t) [ (x_t'β/σ_u²) φ_t f(y_t | Y_{t−1}, ρ_t; θ)|_{ρ_t = 0}
        + ∫_{−x_t'β}^{∞} (u_t²/σ_u⁴ − 1/σ_u²) φ(u_t/σ_u) f(y_t | Y_{t−1}, ρ_t; θ)|_{ρ_t = x_t'β + u_t} du_t ],

where f(y_t | Y_{t−1}, x_t; θ) and f(y_t | Y_{t−1}, ρ_t; θ) are defined in Eqs. (10) and (9), respectively, and where e_t = y_t − μ − ρ y_{t−1}. The total derivative per observation, g_t(θ) = ∂ln f_t/∂θ, is the vector that stacks these partial derivatives (together with the corresponding derivatives with respect to the α_i parameters when the AR(p)-CLEP model of Eq. (26) is used). Hence, the first-order derivative of the log likelihood function (Eq. (11)) equals

    G(θ) = ∂L(Y_T | X_T; θ)/∂θ = Σ_{t=1}^{T} g_t(y_t | Y_{t−1}, x_t; θ).

The maximum likelihood estimator of θ, denoted by θ̂, is the solution of the first-order condition

    ∂L(Y_T | X_T; θ)/∂θ = 0.

To find the maximum of the likelihood function, we opt for the BHHH algorithm of Berndt, Hall, Hall, and Hausman (1974). Given an appropriate starting value θ_0, we iterate over

    θ_n = θ_{n−1} + H(θ_{n−1})^{−1} G(θ_{n−1})

until convergence, where the matrix H is defined as

    H(θ) = (1/T) Σ_{t=1}^{T} g_t(θ) g_t(θ)'.

In general, the maximum likelihood estimator θ̂ is asymptotically normally distributed with mean θ, and the covariance matrix is given by the inverse of the information matrix. The information matrix can be estimated by H(θ̂); see Hamilton (1996, p. 132) for a similar approach in Markov switching time series models.
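The BHHH update above uses only first derivatives: the Hessian is replaced by the average outer product of the per-observation scores. A generic sketch, assuming a user-supplied function that returns the T × K matrix of scores g_t(θ); it is a direct transcription of the update above, and in practice a step-length control is usually added.

```python
import numpy as np

def bhhh(score_fn, theta0, max_iter=200, tol=1e-8):
    """BHHH iterations: theta_n = theta_{n-1} + H(theta_{n-1})^{-1} G(theta_{n-1}),
    with G the summed score and H the average outer product of the scores."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        g = score_fn(theta)                  # (T, K) matrix with rows g_t(theta)
        G = g.sum(axis=0)                    # first derivative of the log likelihood
        H = g.T @ g / g.shape[0]             # outer-product estimate of the information
        step = np.linalg.solve(H, G)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta, H                          # H evaluated at the estimate
```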

Appendix B

To investigate the small sample properties of the ML estimator, we perform a simulation experiment. As the data generating process [DGP], we use the estimated AR-CLEP model (Eqs. (44) and (45)), with the realizations of the explanatory variable x_t in the censored regression and the same number of observations, which is 420. In each replication we estimate an AR-CLEP model with the true lag orders for the autoregression and the explanatory variables, and we calculate 'z-ratios', that is, the difference between the maximum likelihood estimate of each parameter and the true value of the parameter, divided by its estimated standard error.

Table 6 shows the outcomes of this simulation experiment. The second column shows the true values of the parameters in the DGP, while the third column displays the mean of the maximum likelihood estimates. The estimated values correspond well with the true values, and hence there does not seem to be a serious bias in the parameter estimates. The final two sets of three columns show, for each parameter, the simulated size of the distribution of the 'z-ratios' corresponding to the percentiles of the normal distribution. The table suggests that there are no severe size distortions, except for the constant μ̂ and the autoregressive parameter ρ̂, which is in correspondence with their well-known behavior in linear AR models. The distribution of the constant turns out to be skewed to the left, while the distribution of ρ̂ is skewed to the right. Hence, the simulation experiment suggests that we may interpret standard errors in the familiar way, except for the constant and the autoregressive parameter ρ̂.

Table 6
Properties of the maximum likelihood estimator and estimated 'z-ratios'^a

                                      Nominal size 'z-ratios'^c
                                      Left tail               Right tail
Parameter   True value   E[θ̂]^b      0.01   0.05   0.10      0.10   0.05   0.01
μ            0.064        0.085      0.00   0.01   0.03      0.21   0.13   0.04
ρ            0.978        0.974      0.04   0.13   0.22      0.03   0.01   0.00
α1          −0.134       −0.137      0.01   0.05   0.11      0.09   0.04   0.01
α2          −0.002       −0.007      0.01   0.06   0.12      0.08   0.04   0.01
α3           0.057        0.057      0.01   0.06   0.10      0.10   0.05   0.01
α4           0.128        0.128      0.01   0.05   0.10      0.10   0.05   0.01
σ_ε          0.126        0.125      0.02   0.07   0.13      0.09   0.04   0.01
β0           0.007        0.007      0.00   0.03   0.07      0.11   0.06   0.02
β1          −0.009       −0.010      0.01   0.05   0.11      0.10   0.06   0.02
σ_u          0.024        0.023      0.02   0.07   0.13      0.07   0.03   0.00

^a The DGP is given in Eqs. (44) and (45). The number of replications is 10 000 and the sample size is 420.
^b The mean of the maximum likelihood estimates.
^c The cells denote the empirical size of the distribution of the 'z-ratios', defined as (θ̂ − θ)/σ̂(θ̂), where σ̂(θ̂) denotes the estimated heteroskedasticity-consistent standard error of θ̂.


In the same simulation experiment, we investigate the empirical size of the robust LM tests. In each simulation step, we compute the robust LM tests for first and first-to-fourth order serial correlation, with and without the specification of heteroskedasticity in the residuals, denoted by RBLM-serial and RBHLM-serial, and the robust LM tests for ARCH effects [RBLM-ARCH]. Table 7 shows the simulated size of these six tests at the corresponding critical values of the asymptotic distribution. The results are based on 10 000 replications. We see that the actual size is very close to the nominal size for the four tests for serial correlation. The RBLM-ARCH tests are a little oversized. In sum, the simulation experiment suggests that diagnostic checking with the proposed robust LM statistics is valid.

Table 7
Empirical size of the robust LM test statistics for serial correlation and ARCH effects^a

                           Nominal size
Test statistic    Order    0.20   0.10   0.05   0.01
RBLM-serial       1        0.21   0.11   0.06   0.01
                  1–4      0.21   0.11   0.05   0.01
RBHLM-serial      1        0.21   0.11   0.06   0.01
                  1–4      0.22   0.11   0.05   0.01
RBLM-ARCH         1        0.24   0.14   0.08   0.03
                  1–4      0.28   0.16   0.09   0.02

^a The data generating process is given in Eqs. (44) and (45). The number of replications is 10 000 and the sample size is 420.

References

Berndt, E., Hall, B., Hall, E., & Hausman, J. (1974). Estimation and inference in non-linear structural models. Annals of Economic and Social Measurement, 3, 653–665.

Chen, R. (1995). Threshold variable selection in open-loop threshold autoregressive models. Journal of Time Series Analysis, 16, 461–482.
Clements, M., & Hendry, D. (1993). On the limitations of comparing mean squared forecast errors. Journal of Forecasting, 12, 617–637.
Dempster, A., Laird, N., & Rubin, D. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1–38.
Diebold, F., Lee, J., & Weinbach, G. (1994). Regime switching and endogenous transition probabilities. In C. Hargreaves (Ed.), Nonstationary time series analysis and cointegration (pp. 283–302). Oxford: Oxford University Press, Chap. 10.
Filardo, A. (1994). Business-cycle phases and their transitional dynamics. Journal of Business and Economic Statistics, 12, 299–308.
Gourieroux, C., & Monfort, A. (1995). Statistics and econometric models, vol. 2. Cambridge: Cambridge University Press.
Granger, C., & Swanson, N. (1997). An introduction to stochastic unit-root processes. Journal of Econometrics, 80, 35–62.
Granger, C., & Teräsvirta, T. (1993). Modelling nonlinear economic relations. Oxford: Oxford University Press.
Grillenzoni, C. (1993). ARIMA processes with ARIMA parameters. Journal of Business and Economic Statistics, 11, 235–250.
Haggan, V., & Ozaki, T. (1981). Modelling non-linear random vibrations using an amplitude-dependent autoregressive time series model. Biometrika, 68, 189–196.
Hamilton, J. (1989). A new approach to the econometric analysis of nonstationary time series and business cycles. Econometrica, 57, 357–384.
Hamilton, J. (1990). Analysis of time series subject to changes in regime. Journal of Econometrics, 45, 39–70.
Hamilton, J. (1996). Specification testing in Markov-switching time-series models. Journal of Econometrics, 70, 127–158.
Harvey, A. (1981). The econometric analysis of time series. Oxford: Philip Allan.
Johnson, N., & Kotz, S. (1970). Distributions in statistics: Continuous univariate distributions. Boston: Houghton Mifflin.
Leybourne, S., McCabe, B., & Mills, T. (1996). Randomized unit root processes for modelling and forecasting financial time series: Theory and applications. Journal of Forecasting, 15, 253–270.
Leybourne, S., McCabe, B., & Tremayne, A. (1996). Can economic time series be differenced to stationarity? Journal of Business and Economic Statistics, 14, 435–446.
Neftçi, S. (1984). Are economic time series asymmetric over the business cycle? Journal of Political Economy, 92, 307–328.
Teräsvirta, T. (1994). Specification, estimation and evaluation of smooth transition autoregressive models. Journal of the American Statistical Association, 89, 208–218.
Teräsvirta, T., & Anderson, H. (1992). Characterizing nonlinearities in business cycles using smooth transition autoregressive models. Journal of Applied Econometrics, 7, S119–S136.

Tong, H. (1983). Threshold models in non-linear time series analysis. Berlin: Springer.
White, H. (1980). A heteroscedasticity-consistent covariance matrix estimator and a direct test for heteroscedasticity. Econometrica, 48, 817–838.
Wolak, F. (1989). Local and global testing of linear and nonlinear inequality constraints in nonlinear econometric models. Econometric Theory, 5, 1–35.
Wooldridge, J. (1991). On the application of robust, regression-based diagnostics to models of conditional means and conditional variances. Journal of Econometrics, 47, 5–46.

Biographies

Philip Hans FRANSES is Professor of Applied Econometrics and Professor of Marketing Research, both at the Erasmus University Rotterdam. He publishes on his research interests, which are applied econometrics, time series, forecasting, marketing research and empirical finance.


Richard PAAP is Postdoctoral Researcher with the Econometric Institute at the Erasmus University Rotterdam, the Netherlands. He obtained his Ph.D. at the same university. His current research interests are time series analysis and econometric models for household scanner panel data. He has several publications in econometric and economic journals.

Björn L.K. VROOMEN is a Ph.D. candidate at the Erasmus Research Institute of Management of the Erasmus University Rotterdam, the Netherlands. He obtained his M.Sc. in econometrics from the same university. His current research concerns modeling consumer behavior, with a focus on the Internet. He has published in the European Journal of Operational Research.