Testing for the cointegration rank when some cointegrating directions are changing


Journal of Econometrics 124 (2005) 269 – 310

www.elsevier.com/locate/econbase

Philippe Andrade a,∗, Catherine Bruneau b, Stéphane Gregoir c

a THEMA, University of Cergy-Pontoise, 33 bd du Port, Cergy-Pontoise Cedex 95011, France
b THEMA, University of Paris X-Nanterre, 200 av. de la République, Nanterre Cedex 92001, France
c CREST, INSEE, Timbre J301, 15 bd Gabriel Péri, Malakoff 92245, France

Accepted 9 February 2004

Abstract

We develop tests for characterizing the cointegration space of a cointegrated vector autoregressive model when its long-run parameters are modified by a structural break at a known date. We first consider the case in which the break does not affect the loading factors, and then the more general case in which all long-run parameters change. For each configuration, we design procedures to test for the cointegration rank as well as for the number of directions that change between the two regimes. For the simplest case, the cointegration rank test is also extended to the case of an unknown date of shift. © 2004 Elsevier B.V. All rights reserved.

JEL classification: C32

Keywords: Multivariate time series; Cointegration; Structural break; Rank tests

1. Introduction

When modelling multivariate integrated time series, one has to investigate whether or not the series are cointegrated, a concept introduced by Granger (1981). Granger and Weiss (1983), Engle and Granger (1987), Johansen (1988) and Stock and Watson (1988) pointed out that cointegration properties induce a specific representation of the dynamics and require the use of specific estimation methods.

∗ Corresponding author. E-mail addresses: [email protected] (P. Andrade), [email protected] (C. Bruneau), [email protected] (S. Gregoir).
0304-4076/$ - see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.jeconom.2004.02.003


It is also usual in time-series econometrics to check whether the statistical model chosen to describe the data under study is subject to structural breaks, i.e. modifications of the parameters characterizing its probability distribution. When one models drifting time series, structural breaks may, in particular, influence their long-run properties. In a univariate framework, Perron (1989), Banerjee et al. (1992) and Zivot and Andrews (1992) stressed how such breaks could affect the outcome of standard unit-root tests, arguing that time series may sometimes be better described as stationary processes around a breaking level or drift rather than as integrated ones. It turns out that accounting for structural breaks is also crucial for the study of integrated multivariate dynamical systems. Indeed, Gregory et al. (1996) showed that the power of conventional cointegration tests falls sharply when cointegrating relationships are subject to structural changes: one too often fails to reject the null of no cointegration. Lack of careful investigation of these potential structural breaks may thus lead to misspecification of the long-run properties of a dynamical system and to inadequate estimation and testing procedures.

Up to now, most of the literature on structural breaks in cointegrated systems has focused on investigating the stability of the parameters that characterize the cointegrating vectors only (Hansen, 1992; Hansen and Johansen, 1999; Quintos and Phillips, 1993), jointly the cointegrating vectors and the loading factors (Seo, 1998; Hansen, 2000), or the number of cointegrating vectors (Quintos, 1997). All these approaches impose stable cointegration under the null hypothesis. They are thus conducted conditionally on a cointegration property recognized at a previous step. Consequently, they do not allow one to gauge whether or not a rejection of the cointegration property comes from an undetected instability.

By contrast, Gregory and Hansen (1996) propose a test for the null of no cointegration against an alternative of cointegration with a structural break of unknown timing. Nevertheless, their procedure is based on a single-equation approach. This paper develops a statistical procedure to consistently identify the number of cointegration relationships in a cointegrated system when a break occurs at a known date and affects the long-term dynamic parameters, i.e. when at least one direction of the cointegration space is modified. Rank tests for cointegrated systems in the presence of a structural break were already given in Gregoir (1995), Inoue (1999) and Saikkonen and Lütkepohl (2000), but for cases where the break only affects the deterministic components. However, allowing the break to affect the parameters of the cointegration relationships should also be of interest, as it enlarges the range of economically meaningful modifications that can be accounted for.

Along the lines of Johansen (1988), the testing strategy is to investigate a sequence of nested models: we test the hypothesis that there exist at most r cointegrating vectors against the alternative that there are strictly more than r, beginning with r = 0 and ending with r = n − 1, with n the dimension of the system. The specification of cointegration with a structural break as the null hypothesis requires deriving the representation of a cointegrated system with a break in the cointegrating vectors, so as to implement specific estimation and testing procedures.

The paper is organized as follows. Section 2 describes the class of models we work with and gives an extended Granger representation theorem for the particular


dynamics under study. Section 3 deals with the estimation procedure, which is based on principal components analysis. The next two sections are devoted to the rank tests. In Section 4, we first consider the case where the structural break affects the cointegration space but leaves the loading factors unchanged. This simple case is used to present the general principles of our testing procedure. For that particular framework, we also present a test for the number of changing directions of the cointegration space after the break, as well as a test for the cointegration rank when the date of the break is unknown. Identifying the dimension of the cointegration space is more complex in the general case where the loading factors are also allowed to shift, as discussed in Section 5. In particular, it requires specifying the number of common non-stationary directions under the maintained hypothesis in order to get a test statistic whose asymptotic distribution is free of nuisance parameters. Identification of the number of common cointegrating directions across the two regimes therefore follows as a by-product of the cointegration rank tests. In all cases, the limit distributions obtained are free of nuisance parameters but non-standard. In Section 6, we thus provide asymptotic critical values for the different test statistics, which are estimated by response surface regressions and Monte Carlo simulations. We also conduct simple experiments in order to illustrate the finite sample properties of our procedure. Section 7 concludes.

2. The model

Let us consider the following vector error correcting model (VECM) of order p with a structural break affecting its long-run parameters at a known date $t_0$:

$$\Delta X_t = 1_{t \le t_0}\,[\alpha_0 \beta_0' X_{t-1} + \mu_0 D_t] + 1_{t > t_0}\,[\alpha_1 \beta_1' (X_{t-1} - X_{t_0}) + \mu_1 D_t] + \sum_{j=1}^{p} \Gamma_j \Delta X_{t-j} + \varepsilon_t, \quad t = 1, \dots, T, \tag{1}$$

where $X_t$ is an n-dimensional process whose components are all integrated of order one, with given initial values $X_{-(p+1)}, \Delta X_{-p}, \dots, \Delta X_0$. $\alpha_0$, $\alpha_1$, $\beta_0$, $\beta_1$ are n × r full rank matrices. $D_t$ is a d-vector of deterministic regressors and $\mu_0$ and $\mu_1$ are n × d matrices. We restrict our analysis to the cases where the deterministic terms are $D_t = 0$, $D_t = 1$ or $D_t = (1\ t)'$. These terms may or may not be included in the cointegrating vectors. The date of the break is characterized by the fraction $\lambda_0$ of the sample size T: $t_0 = [\lambda_0 T]$, where [·] stands for the integer part function and where $\lambda_0 \in [\underline{\lambda}, \bar{\lambda}]$, with $[\underline{\lambda}, \bar{\lambda}] \subset\ ]0, 1[$, so that the break does not occur at the limit points of the sample. $1_{t \le t_0}$ and $1_{t > t_0}$ select the regime that currently runs at date t and are defined as

$$1_{t \le t_0} = \begin{cases} 1, & \text{if } t \le t_0, \\ 0, & \text{otherwise,} \end{cases} \qquad \text{and} \qquad 1_{t > t_0} = \begin{cases} 1, & \text{if } t > t_0, \\ 0, & \text{otherwise.} \end{cases}$$

The n × n matrices $\Gamma_j$, j = 1, …, p, are assumed to be constant across the two regimes. Nevertheless, allowing the break to affect those parameters would not modify the core of our results. Considering the term $(X_{t-1} - X_{t_0})$ over the second regime may be viewed


as a normalization convention which re-initializes the system at the beginning of the second regime. Thus, over the second regime the error correcting mechanism impinges on the specific path that begins at $t_0 + 1$. The variance of $\beta_1' X_t = \beta_1'(X_t - X_{t_0}) + \beta_1' X_{t_0}$ will otherwise increase linearly in T as soon as the date of break $t_0$ is considered as a fixed fraction of the sample size. The VECM is furthermore assumed to satisfy a "stability" condition over the two regimes.

Assumption 1. The roots of the characteristic polynomials of the VECM (1), namely the solutions of

$$\det\Big[(1 - z) I_n - \alpha_i \beta_i' z - \sum_{j=1}^{p} \Gamma_j (1 - z) z^j\Big] = 0, \quad i = 0, 1,$$

satisfy either z = 1 or |z| > 1.

Lastly, the innovation process $\{\varepsilon_t\}$ is restricted to belong to the following class.

Assumption 2. The n × 1 innovation process $\{\varepsilon_t\}$ is a vector martingale difference sequence with respect to $F_{t-1}$, the σ-field generated by the set $\{X_{t'},\ t' < t\}$, with variance-covariance matrix $E(\varepsilon_t \varepsilon_t' \mid F_{t-1}) = \Sigma_\varepsilon$, t = 1, 2, … Moreover, there exists $\delta > 0$ such that $\sup_t \max_j E(|\varepsilon_{jt}|^{2+\delta} \mid F_{t-1}) < \infty$, j = 1, …, n, t = 1, 2, …

We therefore do not allow for a shift in the innovation covariance matrix $\Sigma_\varepsilon$ across the two regimes. This would split the model into two separate ones if the short-term coefficient matrices were also shifting with the break, and would thus require a separate analysis over the two sub-samples. Similarly, we do not allow for a different number of cointegration relationships over the two regimes. This would correspond to a model where long-term equilibria arise or vanish across regimes and whose analysis, we think, would preferably have to be separated.

Cointegration holds when r linearly independent linear combinations of I(1) series cancel the unit roots present in the processes. In the standard non-breaking case, this property ensures that the linear transformation obtained is (weakly) stationary. As is shown in Appendix A, this is no longer true when a structural break occurs within the sample. Indeed, in that case, the second regime keeps track of the observation at the date of the break, which is generated by the first regime. As a consequence, the process does not have an invariant stationary distribution over the second regime. Nevertheless, it satisfies an asymptotic stationarity property, with an exponential convergence rate, that we now characterize through the following definitions.

Definition 1 (Asymptotic stationarity). A univariate process $\{y_t\}$ is said to be asymptotically weakly stationary if there exists a covariance stationary process $\{x_t\}$ such that

$$\lim_{t \to \infty} \| y_t - x_t \|_{L^2} = 0.$$
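To illustrate Definition 1, the following sketch uses a hypothetical scalar AR(1) process that is not part of the model above: $y_t = \phi y_{t-1} + e_t$ started from a fixed value $y_0 = c$, compared with the stationary solution $x_t$ driven by the same shocks. Under these assumptions $y_t - x_t = \phi^t (c - x_0)$, so the $L^2$ distance has a closed form and vanishes geometrically. The function name and the parameter values are illustrative only.

```python
import math


def l2_gap(phi: float, sigma: float, c: float, t: int) -> float:
    """L2 norm of y_t - x_t for an AR(1) started at the fixed value c.

    y_t = phi * y_{t-1} + e_t with y_0 = c, and x_t is the stationary
    solution driven by the same shocks, so y_t - x_t = phi**t * (c - x_0).
    Since x_0 has mean 0 and variance sigma^2 / (1 - phi^2),
    E[(c - x_0)^2] = c^2 + sigma^2 / (1 - phi^2).
    """
    if abs(phi) >= 1:
        raise ValueError("the stationary solution requires |phi| < 1")
    initial_gap = math.sqrt(c ** 2 + sigma ** 2 / (1.0 - phi ** 2))
    return abs(phi) ** t * initial_gap


# The gap shrinks by the factor |phi| at every date.
dates = list(range(0, 60, 10))
gaps = [l2_gap(phi=0.8, sigma=1.0, c=5.0, t=t) for t in dates]
for t, g in zip(dates, gaps):
    print(f"t={t:2d}  ||y_t - x_t||_L2 = {g:.6f}")
```

In this toy case the bound is attained with equality, which is the kind of exponential rate that the next definition formalizes.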


Definition 2 (Exponential asymptotic stationarity, EAS). A univariate process $\{y_t\}$ is said to be exponentially asymptotically weakly stationary (EAS) if there exist a covariance stationary process $\{x_t\}$, a polynomial a(·) of finite order and a couple $(\gamma, T_0) \in \mathbb{R}^+ \times \mathbb{Z}$ such that, for all $t > T_0$,

$$\| y_t - x_t \|_{L^2} \le a(t)\, e^{-\gamma t}.$$

An extension of this definition to the multivariate case is straightforward and given in the following statement.

Definition 3 (EAS for multivariate processes). An n-dimensional process $\{Y_t\}$ is said to be exponentially asymptotically weakly stationary if every linear combination of its components is exponentially asymptotically weakly stationary.

Given these definitions, we can now focus on the meaning of cointegration in the presence of a structural break.

Definition 4 (Cointegration with a structural break). The r × 2n matrix $\beta' = (\beta_0'\ \beta_1')$ is said to define r cointegrating vectors with a structural break at date $t_0$ if and only if
1. $\mathrm{sp}(\beta_0) \neq \mathrm{sp}(\beta_1)$, where sp(·) stands for the space spanning function,
2. $\beta_0' X_{t-1}$ defines r stationary processes over the first regime (i.e. for $t \le t_0$) and $\beta_1'(X_{t-1} - X_{t_0})$ defines r exponentially asymptotically stationary processes over the second one (i.e. for $t > t_0$),
3. the spectral matrix at frequency zero of $\beta_0' X_{t-1}$ and that of the weakly stationary process that is asymptotically similar, in the $L^2$ metric sense, to $\beta_1'(X_{t-1} - X_{t_0})$ are both of full rank r.

The first point of this definition implies that a structural break really occurred at $t_0$. The EAS property appears in the second point since the distribution of $\beta_1'(X_{t-1} - X_{t_0})$ depends on the values observed at the breaking date. Lastly, the third point assumes that there are r linearly independent cointegrating relationships in each of the two regimes.

Let $\Gamma = I_n - \sum_{j=1}^{p} \Gamma_j$. Let also $\alpha_{i\perp}$ and $\beta_{i\perp}$, i = 0, 1, be n × (n − r) matrices associated with a particular basis of the orthogonal complement of the space spanned, respectively, by the loading factors $\alpha_i$, i = 0, 1, and the cointegrating vectors $\beta_i$, i = 0, 1, such that $\alpha_i' \alpha_{i\perp} = 0$ and $\beta_i' \beta_{i\perp} = 0$, i = 0, 1. We end this section with the following theorem, which is a slight extension of the well-known Granger Representation Theorem and characterizes the solution of Eq. (1).

Theorem 1 (Granger Representation Theorem). Consider the model defined by Eq. (1) and Assumptions 1–2. A necessary and sufficient condition for the r × 2n matrix $\beta' = (\beta_0'\ \beta_1')$ to define r cointegrating vectors with a structural break at date $t_0$ is that the (n − r) × (n − r) matrices $\alpha_{0\perp}' \Gamma \beta_{0\perp}$ and $\alpha_{1\perp}' \Gamma \beta_{1\perp}$ have full rank. In this case


$X_t$ has the following representation

$$X_t = P_{\beta_{0\perp}} X_0 + \bar{C}_0(L)(\varepsilon_t + \mu_0 D_t) + \bar{C}_0 \sum_{j=1}^{t} (\varepsilon_j + \mu_0 D_j), \tag{2}$$

over the first regime, i.e. for t = 1, …, $t_0$, and

$$X_t - X_{t_0} = \bar{C}_{1,t-t_0}(L)(\varepsilon_t + \mu_1 D_t) + \bar{C}_1 \sum_{j=t_0+1}^{t} (\varepsilon_j + \mu_1 D_j) + \omega_t, \tag{3}$$

over the second one, i.e. for t = $t_0$ + 1, …, T, with $P_{\beta_{0\perp}}$ denoting the orthogonal projection onto $\mathrm{sp}(\beta_{0\perp})$,¹ $\bar{C}_0(z)$ and $\bar{C}_{1,t-t_0}(z)$ polynomials whose zeros lie strictly outside the unit circle, $\bar{C}_i = \beta_{i\perp}(\alpha_{i\perp}' \Gamma \beta_{i\perp})^{-1} \alpha_{i\perp}'$, i = 0, 1, and $\omega_t$ a random term which satisfies $\omega_t \xrightarrow{p} 0$.

Proof. See Appendix A.

¹ It can be noticed that the initial values appear only in Eq. (2), through the term $P_{\beta_{0\perp}} X_0$, which is the piece of information necessary to define the values of $X_t$ as a function of the innovations between the dates 1 and t.

3. Estimation

Model (1) imposes non-linear restrictions across the spaces of parameters which characterize the two regimes. If this model is further restricted to a particular innovation distributional assumption, maximum likelihood estimation would possibly not result in efficient estimators. Indeed, maximizing the likelihood function requires resorting to numerical procedures such as switching algorithms, as the estimators of the first and the second regime are related (see Johansen and Juselius, 1992; Boswijk, 1995; Hansen, 2000). These procedures do not ensure that one finally converges to a global maximum. In our approach, we choose to base our estimation procedure on a principal components analysis that gives us only consistent estimators. For finite samples, no theoretical argument is available to compare the efficiency of both methods. Yet, principal components analysis has the comparative advantage of being very simple to implement.

When testing for the cointegration rank in the presence of a structural break, we have to distinguish a simple case, where the loading factors are unchanged across the two regimes, from a more general one, where they may change jointly with the cointegration space. As these two cases involve slightly different estimation procedures, we present each situation in turn.

3.1. Case 1: the break does not affect the loading factors

The long-run parameters $\alpha_i$, $\beta_i$, i = 0, 1, in (1) are not identified, as any alternative long-run parameters $\alpha_i \Psi$, $\Psi^{-1} \beta_i'$, i = 0, 1, with $\Psi$ any non-singular r × r matrix, would give the same probability distribution for the variables. Their estimation requires an


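identification constraint. In this section, where $\alpha_0$ and $\alpha_1$ span the same vector space, $\mathrm{sp}(\alpha)$, a natural identification scheme for the two regimes is given by

$$\alpha' \Sigma_\varepsilon^{-1} \alpha = I_r. \tag{4}$$

Along the lines of Johansen (1988), model (1) can then be rewritten in a more compact form

$$Z_{0t} = \alpha \beta_0' Z_{1t}^{(0)} + \alpha \beta_1' Z_{1t}^{(1)} + \Theta Z_{2t} + \varepsilon_t, \quad t = 1, \dots, T,$$

where the notation depends on the specification of the deterministic terms. When the deterministic terms are unrestricted, i.e. $\alpha_\perp' \mu_i \neq 0$, we have $Z_{0t} = \Delta X_t$, $Z_{1t}^{(0)} = X_{t-1} 1_{t \le t_0}$, $Z_{1t}^{(1)} = (X_{t-1} - X_{t_0}) 1_{t > t_0}$, $Z_{2t} = [\Delta X_{t-1}' \dots \Delta X_{t-p}'\ \ D_t' 1_{t \le t_0}\ \ D_t' 1_{t > t_0}]'$ and $\Theta = [\Gamma_1 \dots \Gamma_p\ \mu_0\ \mu_1]$. When the deterministic terms are cointegrated, so that $\alpha_\perp' \mu_i = 0$, we have $Z_{1t}^{(0)} = [X_{t-1}'\ D_t']' 1_{t \le t_0}$, $Z_{1t}^{(1)} = [(X_{t-1} - X_{t_0})'\ D_t']' 1_{t > t_0}$ and $Z_{2t} = [\Delta X_{t-1}' \dots \Delta X_{t-p}']'$.

We can concentrate out the unrestricted short-term and deterministic parameters, $\Theta$, by projecting $Z_{0t}$, $Z_{1t}^{(0)}$ and $Z_{1t}^{(1)}$ onto the space spanned by $Z_{2t}$. Letting $R_{0t}$, $R_{1t}^{(0)}$ and $R_{1t}^{(1)}$ be the residuals obtained from these regressions and using the Frisch-Waugh theorem, we obtain

$$R_{0t} = \alpha \beta' \begin{pmatrix} R_{1t}^{(0)} \\ R_{1t}^{(1)} \end{pmatrix} + \eta_t, \quad t = 1, \dots, T, \tag{5}$$

with $\beta' = (\beta_0'\ \beta_1')$. Let us introduce the following notation for the variance-covariance matrices of the residuals:

$$S_{00} = T^{-1} \sum_{t=1}^{T} R_{0t} R_{0t}',$$

$$S_{01}^{(0)} \beta_0 = T^{-1} \sum_{t=1}^{t_0} R_{0t} R_{1t}^{(0)\prime} \beta_0, \qquad S_{01}^{(1)} \beta_1 = T^{-1} \sum_{t=t_0+1}^{T} R_{0t} R_{1t}^{(1)\prime} \beta_1,$$

$$\beta_0' S_{11}^{(0)} \beta_0 = T^{-1} \sum_{t=1}^{t_0} \beta_0' R_{1t}^{(0)} R_{1t}^{(0)\prime} \beta_0, \qquad \beta_1' S_{11}^{(1)} \beta_1 = T^{-1} \sum_{t=t_0+1}^{T} \beta_1' R_{1t}^{(1)} R_{1t}^{(1)\prime} \beta_1,$$

and their respective population counterparts $\Sigma_{00}$, $\Sigma_{0\beta_0}^{(0)}$, $\Sigma_{0\beta_1}^{(1)}$, $\Sigma_{\beta_0\beta_0}^{(0)}$ and $\Sigma_{\beta_1\beta_1}^{(1)}$, defined by

$$\Sigma_{00} = E(R_{0t} R_{0t}'),$$

$$\big(\Sigma_{0\beta_0}^{(0)}\ \ \Sigma_{0\beta_1}^{(1)}\big) = E\left[ R_{0t} \begin{pmatrix} \beta_0' R_{1t}^{(0)} \\ \beta_1' R_{1t}^{(1)} \end{pmatrix}' \right],$$

$$\begin{pmatrix} \Sigma_{\beta_0\beta_0}^{(0)} & 0 \\ 0 & \Sigma_{\beta_1\beta_1}^{(1)} \end{pmatrix} = E\left[ \begin{pmatrix} \beta_0' R_{1t}^{(0)} \\ \beta_1' R_{1t}^{(1)} \end{pmatrix} \begin{pmatrix} \beta_0' R_{1t}^{(0)} \\ \beta_1' R_{1t}^{(1)} \end{pmatrix}' \right],$$

where the last equality follows from the fact that, by definition, $R_{1t}^{(0)} R_{1t}^{(1)\prime} = 0$ for all t. As stated in the following lemma, the properties of the particular DGP studied imply that these sample moments lead to consistent estimates of their population counterparts.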
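The construction of the regime-specific regressors, the concentration step and the sample moments can be sketched numerically as follows. This is a minimal sketch under simplifying assumptions of ours: no deterministic terms ($D_t = 0$), at least one lagged difference, and an array layout in which the first p + 1 rows hold the initial values. The function name `concentrated_moments` and the toy data are hypothetical.

```python
import numpy as np


def concentrated_moments(X: np.ndarray, t0: int, p: int) -> dict:
    """Regime-specific moments after partialling out lagged differences.

    Assumed layout (ours): X has shape (p + 1 + T, n); rows 0..p are the
    initial values X_{-p}, ..., X_0 and row p + t holds X_t for t = 1..T.
    """
    if p < 1:
        raise ValueError("this sketch assumes at least one lagged difference")
    T = X.shape[0] - p - 1
    dX = np.diff(X, axis=0)                     # dX[k] = X[k + 1] - X[k]
    t = np.arange(1, T + 1)
    first = t <= t0                             # indicator of the first regime
    second = ~first                             # indicator of the second regime

    Z0 = dX[p:]                                 # Delta X_t, t = 1..T
    lag = X[p:-1]                               # X_{t-1}, t = 1..T
    Z10 = lag * first[:, None]                  # X_{t-1} on the first regime
    Z11 = (lag - X[p + t0]) * second[:, None]   # X_{t-1} - X_{t0} on the second
    Z2 = np.hstack([dX[p - j:p - j + T] for j in range(1, p + 1)])

    def residual(Y: np.ndarray) -> np.ndarray:
        # Least-squares projection on Z2 (the concentration step).
        coef = np.linalg.lstsq(Z2, Y, rcond=None)[0]
        return Y - Z2 @ coef

    R0, R10, R11 = residual(Z0), residual(Z10), residual(Z11)
    return {
        "R0": R0, "R1": (R10, R11), "Z2": Z2,
        "S00": R0.T @ R0 / T,
        "S01": (R0[first].T @ R10[first] / T, R0[second].T @ R11[second] / T),
        "S11": (R10[first].T @ R10[first] / T, R11[second].T @ R11[second] / T),
    }


rng = np.random.default_rng(0)
X = np.cumsum(rng.standard_normal((203, 2)), axis=0)   # toy I(1) data: T = 200, p = 2
out = concentrated_moments(X, t0=100, p=2)
print(out["S00"].shape, out["S01"][0].shape, out["S11"][1].shape)
```

The cross moments are summed over each regime separately and scaled by the full sample size T, following the definitions above.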

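Lemma 1. When T goes to infinity, the following probability limits hold:

$$S_{00} \xrightarrow{p} \Sigma_{00}, \qquad S_{01}^{(0)} \beta_0 \xrightarrow{p} \Sigma_{0\beta_0}^{(0)}, \qquad S_{01}^{(1)} \beta_1 \xrightarrow{p} \Sigma_{0\beta_1}^{(1)},$$

$$\beta_0' S_{11}^{(0)} \beta_0 \xrightarrow{p} \Sigma_{\beta_0\beta_0}^{(0)}, \qquad \beta_1' S_{11}^{(1)} \beta_1 \xrightarrow{p} \Sigma_{\beta_1\beta_1}^{(1)}.$$

Moreover, let $\widehat{\alpha\beta'}$ be the unrestricted OLS estimator of the n × 2n matrix $\alpha\beta'$ in (5) and $\hat{\eta}_t$ the associated residuals. If $\{\varepsilon_t\}$ satisfies Assumption 2, then $\hat{\Sigma}_\varepsilon = T^{-1} \sum_{t=1}^{T} \hat{\eta}_t \hat{\eta}_t'$ is a consistent estimator of $\Sigma_\varepsilon$.

Proof. This results from the fact that $\Delta X_t$, $\beta_0' X_t$ and $\beta_1'(X_t - X_{t_0})$ can (asymptotically for the latter) be expressed as linear filters with exponentially decreasing weights of the process $\{\eta_t\}$ that satisfies Assumption 2, so that a law of large numbers applies.

When $\lambda_0$ is known, regression (5) is equivalent to a reduced rank regression (Anderson, 1951) and estimators of $\alpha$ and $\beta$ can be obtained by a suitable principal components analysis. Let $\hat{V}$ be the n × r matrix of the r eigenvectors associated with the r largest eigenvalues, sorted in decreasing order, of the covariance matrix

$$\hat{\Sigma}_\varepsilon^{-1/2} \big( S_{01}^{(0)}\ \ S_{01}^{(1)} \big) \begin{pmatrix} S_{11}^{(0)} & 0 \\ 0 & S_{11}^{(1)} \end{pmatrix}^{-1} \begin{pmatrix} S_{10}^{(0)} \\ S_{10}^{(1)} \end{pmatrix} \hat{\Sigma}_\varepsilon^{-1/2} = \hat{\Sigma}_\varepsilon^{-1/2} \big[ S_{01}^{(0)} (S_{11}^{(0)})^{-1} S_{10}^{(0)} + S_{01}^{(1)} (S_{11}^{(1)})^{-1} S_{10}^{(1)} \big] \hat{\Sigma}_\varepsilon^{-1/2}. \tag{6}$$

Principal components estimates

$$\hat{\alpha} = \hat{\Sigma}_\varepsilon^{1/2} \hat{V}, \qquad \hat{\beta}_i' = \hat{\alpha}' \hat{\Sigma}_\varepsilon^{-1} S_{01}^{(i)} (S_{11}^{(i)})^{-1}, \quad i = 0, 1,$$

give us consistent estimators of the parameters $\alpha$, $\beta_0$ and $\beta_1$. Moreover, the identification scheme (4) ensures that $\hat{\beta}_i$, i = 0, 1, are of full column rank r, with asymptotic probability one, provided that the limit covariance matrix over which these estimators are built satisfies the following assumption.

Assumption 3. The covariance matrix

$$\Sigma_\varepsilon^{-1/2} \big[ \Sigma_{0\beta_0}^{(0)} (\Sigma_{\beta_0\beta_0}^{(0)})^{-1} \Sigma_{\beta_0 0}^{(0)} + \Sigma_{0\beta_1}^{(1)} (\Sigma_{\beta_1\beta_1}^{(1)})^{-1} \Sigma_{\beta_1 0}^{(1)} \big] \Sigma_\varepsilon^{-1/2}$$

has r distinct strictly positive eigenvalues.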
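The principal components step for Case 1 can be sketched as follows. The function `pc_estimates` is a hypothetical helper of ours that takes an innovation covariance estimate and the two regime-specific moment pairs, forms the matrix in (6), and returns the loading and cointegrating vector estimates. The toy inputs below are stationary draws chosen only to exercise the linear algebra; they are not I(1) data and carry no statistical meaning.

```python
import numpy as np


def inv_sqrt(S: np.ndarray) -> np.ndarray:
    """Symmetric inverse square root of a positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T


def pc_estimates(Sigma, S01, S11, r):
    """Loading and cointegrating vector estimates from the matrix in (6).

    Sigma: (n, n) innovation covariance estimate.
    S01, S11: pairs of regime-specific moments (first regime, second regime).
    """
    Si = inv_sqrt(Sigma)
    # Sum over regimes of S01 S11^{-1} S10, then pre- and post-multiply.
    M = sum(S01[i] @ np.linalg.solve(S11[i], S01[i].T) for i in range(2))
    w, V = np.linalg.eigh(Si @ M @ Si)
    order = np.argsort(w)[::-1]                 # decreasing eigenvalues
    V_r = V[:, order[:r]]
    alpha = np.linalg.inv(Si) @ V_r             # Sigma^{1/2} V
    # beta_i = S11^{-1} S10 Sigma^{-1} alpha, the transpose of beta_i'.
    beta = [np.linalg.solve(S11[i], S01[i].T) @ np.linalg.solve(Sigma, alpha)
            for i in range(2)]
    return alpha, beta


# Toy inputs (assumed values, for illustration only).
rng = np.random.default_rng(0)
T, t0, n, r = 400, 200, 3, 1
alpha_true = np.array([[-0.5], [0.2], [0.0]])
beta0_true = np.array([[1.0], [-1.0], [0.0]])
beta1_true = np.array([[1.0], [0.0], [-1.0]])
R1 = rng.standard_normal((T, n))
R10, R11 = R1.copy(), R1.copy()
R10[t0:] = 0.0                                  # first-regime regressor
R11[:t0] = 0.0                                  # second-regime regressor
R0 = (R10 @ beta0_true + R11 @ beta1_true) @ alpha_true.T + rng.standard_normal((T, n))

W = np.hstack([R10, R11])
resid = R0 - W @ np.linalg.lstsq(W, R0, rcond=None)[0]
Sigma_hat = resid.T @ resid / T                 # unrestricted OLS residual covariance
S01 = [R0[:t0].T @ R10[:t0] / T, R0[t0:].T @ R11[t0:] / T]
S11 = [R10[:t0].T @ R10[:t0] / T, R11[t0:].T @ R11[t0:] / T]
alpha_hat, beta_hat = pc_estimates(Sigma_hat, S01, S11, r)
print(alpha_hat.shape, beta_hat[0].shape, beta_hat[1].shape)
```

By construction the returned loading estimate satisfies the normalization (4) with the estimated innovation covariance, which gives a quick sanity check on any implementation.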


Likewise, let $\hat{V}_\perp$ be the n × (n − r) matrix of the eigenvectors associated with the n − r smallest eigenvalues of (6). Principal components estimates

$$\hat{\alpha}_\perp = \hat{\Sigma}_\varepsilon^{-1/2} \hat{V}_\perp, \qquad \hat{\beta}_{i\perp}' = \hat{\alpha}_\perp' S_{01}^{(i)}, \quad i = 0, 1,$$

provide us with consistent estimators of the spaces spanned respectively by $\alpha_\perp$ and $\beta_{i\perp}$, i = 0, 1. Indeed, they satisfy $\hat{\alpha}_\perp' \hat{\alpha} = 0$, $\hat{\beta}_{i\perp}' \hat{\beta}_i = 0$, i = 0, 1. However, the nullity of the n − r smallest eigenvalues of the covariance matrix appearing in Assumption 3 implies that our identification scheme does not allow us to identify a particular basis for each of these spaces.

While consistency is a welcome property of the estimators, the derivation of the asymptotic distribution of the test statistics that we use in the next sections also requires that one be able to characterize their convergence rate. This is achieved in the following lemma.

Lemma 2. The principal components estimates $\hat{\beta}_i$, $\hat{\beta}_{i\perp}$, i = 0, 1, are consistent for $\beta_i$ and $\mathrm{sp}(\beta_{i\perp})$, i = 0, 1. Moreover, they have the following asymptotic behavior when T goes to infinity:

$$\hat{\beta}_i' - \beta_i' = A_i \Lambda(T) \begin{pmatrix} \beta_i' \\ \beta_{i\perp}' \end{pmatrix}, \qquad \hat{\beta}_{i\perp}' - \tilde{\beta}_{i\perp}' = A_{i\perp} \Lambda_\perp(T) \begin{pmatrix} \beta_i' \\ \beta_{i\perp}' \end{pmatrix}, \quad i = 0, 1,$$

where $A_i$, $A_{i\perp}$, i = 0, 1, are two random matrices, respectively of dimension r × n and (n − r) × n, which converge weakly, $\Lambda(T)$ is an n × n diagonal matrix whose first r diagonal elements equal $T^{-1/2}$ and the remaining ones $T^{-1}$, $\Lambda_\perp(T)$ is an n × n diagonal matrix whose elements all equal $T^{-1}$, and $\tilde{\beta}_{i\perp}$ is equal to $\beta_{i\perp}$ up to a full-rank matrix.

Proof. See Appendix B.

3.2. Case 2: the break affects the cointegrating vectors and the loading factors

We now suppose that the structural break affects the cointegrating vectors as well as the loading factors, so that the spaces spanned by $\alpha_0$ and $\alpha_1$ differ across the two regimes. Using the same notation as in the preceding section, model (1) can be rewritten as

$$Z_{0t} = (\alpha_0\ \alpha_1) \begin{pmatrix} \beta_0' & 0 \\ 0 & \beta_1' \end{pmatrix} \begin{pmatrix} Z_{1t}^{(0)} \\ Z_{1t}^{(1)} \end{pmatrix} + \Theta Z_{2t} + \varepsilon_t, \quad t = 1, \dots, T,$$

and concentrating out the short-term parameters, $\Theta$, we get

$$R_{0t} = (\alpha_0\ \alpha_1) \begin{pmatrix} \beta_0' & 0 \\ 0 & \beta_1' \end{pmatrix} \begin{pmatrix} R_{1t}^{(0)} \\ R_{1t}^{(1)} \end{pmatrix} + \eta_t, \quad t = 1, \dots, T. \tag{7}$$

By definition, $R_{1t}^{(0)} R_{1t}^{(1)\prime} = 0$ for all t. When the date of the break is known, consistent estimators of the parameters of regression (7) can then be obtained from two separate reduced rank


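regressions,

$$R_{0t} = \alpha_0 \beta_0' R_{1t}^{(0)} + \eta_t, \quad t = 1, \dots, t_0,$$

$$R_{0t} = \alpha_1 \beta_1' R_{1t}^{(1)} + \eta_t, \quad t = t_0 + 1, \dots, T.$$

A suitable principal components analysis of these regressions provides us with consistent estimators of $\alpha_i$ and $\beta_i$, i = 0, 1. In contrast to the previous case, this estimation procedure requires imposing identification constraints specific to each regime. Let $\Sigma_{00}^{(0)}$ and $\Sigma_{00}^{(1)}$ respectively denote the second-order moments of $\Delta X_t$, conditional on $Z_{2t}$, over the first and the second regime. From (7), they satisfy the following equation:

$$E(R_{0t} R_{0t}') = \alpha_0 E\big(\beta_0' R_{1t}^{(0)} R_{1t}^{(0)\prime} \beta_0\big) \alpha_0' + \alpha_1 E\big(\beta_1' R_{1t}^{(1)} R_{1t}^{(1)\prime} \beta_1\big) \alpha_1' + \Sigma_\varepsilon$$

$$= \alpha_0 \Sigma_{\beta_0\beta_0}^{(0)} \alpha_0' + \lambda_0 \Sigma_\varepsilon + \alpha_1 \Sigma_{\beta_1\beta_1}^{(1)} \alpha_1' + (1 - \lambda_0) \Sigma_\varepsilon = \Sigma_{00}^{(0)} + \Sigma_{00}^{(1)},$$

with $\Sigma_{00}^{(0)} = \alpha_0 \Sigma_{\beta_0\beta_0}^{(0)} \alpha_0' + \lambda_0 \Sigma_\varepsilon$ and $\Sigma_{00}^{(1)} = \alpha_1 \Sigma_{\beta_1\beta_1}^{(1)} \alpha_1' + (1 - \lambda_0) \Sigma_\varepsilon$. A convenient identification scheme is now to impose that

$$\alpha_i' (\Sigma_{00}^{(i)})^{-1} \alpha_i = I_r, \quad i = 0, 1,$$

which ensures that the parameters of interest are identified once Assumption 3 is modified for this more general class of models as follows.

Assumption 4. Each covariance matrix

$$(\Sigma_{00}^{(i)})^{-1/2}\, \Sigma_{0\beta_i}^{(i)} (\Sigma_{\beta_i\beta_i}^{(i)})^{-1} \Sigma_{\beta_i 0}^{(i)}\, (\Sigma_{00}^{(i)})^{-1/2}, \quad i = 0, 1,$$

has r distinct strictly positive eigenvalues.

Now consider $\hat{V}_i$ and $\hat{V}_{i\perp}$, the matrices of the eigenvectors associated respectively with the r largest and the n − r smallest eigenvalues, sorted in decreasing order, of the covariance matrix

$$(S_{00}^{(i)})^{-1/2}\, S_{01}^{(i)} (S_{11}^{(i)})^{-1} S_{10}^{(i)}\, (S_{00}^{(i)})^{-1/2}, \quad i = 0, 1,$$

with $S_{00}^{(0)} = T^{-1} \sum_{t=1}^{t_0} R_{0t} R_{0t}'$ and $S_{00}^{(1)} = T^{-1} \sum_{t=t_0+1}^{T} R_{0t} R_{0t}'$. Principal components estimates

$$\hat{\alpha}_i = (S_{00}^{(i)})^{1/2} \hat{V}_i, \qquad \hat{\alpha}_{i\perp} = (S_{00}^{(i)})^{-1/2} \hat{V}_{i\perp},$$

$$\hat{\beta}_i' = \hat{\alpha}_i' (S_{00}^{(i)})^{-1} S_{01}^{(i)} (S_{11}^{(i)})^{-1}, \qquad \hat{\beta}_{i\perp}' = \hat{\alpha}_{i\perp}' S_{01}^{(i)}, \quad i = 0, 1,$$

give us consistent estimators of $\alpha_i$, $\beta_i$ and of the spaces spanned by $\alpha_{i\perp}$ and $\beta_{i\perp}$, i = 0, 1.

Remark 1. Lemma 2 still holds in that case.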
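For Case 2 the same computation is carried out regime by regime, with the regime-specific second moment of the concentrated differences replacing the innovation covariance in the normalization. A minimal sketch for a single regime follows; `regime_estimates` is a hypothetical helper and the inputs are toy stationary draws (an assumption of ours) used only to check the algebraic identities.

```python
import numpy as np


def regime_estimates(S00_i, S01_i, S11_i, r):
    """Principal components estimates for one regime (Case 2 normalization)."""
    w, V = np.linalg.eigh(S00_i)
    S_half = V @ np.diag(np.sqrt(w)) @ V.T            # S00^{1/2}
    S_inv_half = V @ np.diag(1.0 / np.sqrt(w)) @ V.T  # S00^{-1/2}
    M = S_inv_half @ S01_i @ np.linalg.solve(S11_i, S01_i.T) @ S_inv_half
    lam, U = np.linalg.eigh(M)
    order = np.argsort(lam)[::-1]                     # decreasing eigenvalues
    U_r, U_perp = U[:, order[:r]], U[:, order[r:]]
    alpha = S_half @ U_r
    alpha_perp = S_inv_half @ U_perp
    # Transposes of beta' = alpha' S00^{-1} S01 S11^{-1} and beta_perp' = alpha_perp' S01.
    beta = np.linalg.solve(S11_i, S01_i.T) @ np.linalg.solve(S00_i, alpha)
    beta_perp = S01_i.T @ alpha_perp
    return alpha, beta, alpha_perp, beta_perp


# Toy single-regime inputs (assumed values, for illustration only).
rng = np.random.default_rng(1)
T, n, r = 300, 3, 1
R1 = rng.standard_normal((T, n))
a_true = np.array([[-0.5], [0.3], [0.1]])
b_true = np.array([[1.0], [-1.0], [0.5]])
R0 = R1 @ b_true @ a_true.T + rng.standard_normal((T, n))
S00, S01, S11 = R0.T @ R0 / T, R0.T @ R1 / T, R1.T @ R1 / T

a_hat, b_hat, a_perp, b_perp = regime_estimates(S00, S01, S11, r)
print(a_hat.shape, b_hat.shape, a_perp.shape, b_perp.shape)
```

The orthogonality relations between the estimates and their complements hold exactly in this construction because the eigenvectors of a symmetric matrix are mutually orthogonal.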


4. Testing for the cointegration rank (and more) when the break does not affect the loading factors

4.1. Testing for the cointegration rank

With this first test we aim at identifying the (common) number of cointegrating relationships over the two regimes or, equivalently, the rank of the matrices $\beta_0$ and $\beta_1$ in model (1). This is performed by testing sequentially the nested null hypotheses that the changing cointegrating spaces, which are supposed to differ across regimes, are at most of dimension r against the alternative that they are of dimension greater than r, beginning with r = 0 and ending with r = n − 1. More precisely, we test

$$H_0: \mathrm{sp}(\beta_0) \neq \mathrm{sp}(\beta_1) \text{ and } \mathrm{rk}(\beta_0) = \mathrm{rk}(\beta_1) \le r,$$

against

$$H_a: \mathrm{sp}(\beta_0) \neq \mathrm{sp}(\beta_1) \text{ and } \mathrm{rk}(\beta_0) = \mathrm{rk}(\beta_1) > r,$$

where sp(·) and rk(·) stand, respectively, for the vector space spanning and the rank functions.

Along the lines of Gregoir and Laroque (1994), we introduce in (1), for a given r, the I(1) terms $\beta_{0\perp}' X_{t-1}$ and the asymptotically I(1) terms $\beta_{1\perp}'(X_{t-1} - X_{t_0})$, to get

$$\Delta X_t = \rho_0 \beta_{0\perp}' X_{t-1} + \alpha \beta_0' X_{t-1} + \sum_{j=1}^{p} \Gamma_j \Delta X_{t-j} + \varepsilon_t, \quad t \le t_0, \tag{8}$$

$$\Delta X_t = \rho_1 \beta_{1\perp}' (X_{t-1} - X_{t_0}) + \alpha \beta_1' (X_{t-1} - X_{t_0}) + \sum_{j=1}^{p} \Gamma_j \Delta X_{t-j} + \varepsilon_t, \quad t > t_0, \tag{9}$$

where, according to the DGP properties, the null hypothesis $H_0$ is then equivalent to the following parametric restriction:

$$H_0: \rho_0 = \rho_1 = 0.$$

In practice, $\beta_i$, $\beta_{i\perp}$, i = 0, 1, are rarely known by the modeler. However, one can replace them by their consistent estimates obtained from the principal components analysis described in the previous section. Therefore, instead of testing $H_0$ in (8)–(9), we can consider the same null hypothesis in the regression equations

$$\Delta X_t = \rho_0 \hat{\beta}_{0\perp}' X_{t-1} + \alpha \hat{\beta}_0' X_{t-1} + \sum_{j=1}^{q} \Gamma_j \Delta X_{t-j} + \tilde{\varepsilon}_t, \quad t \le t_0, \tag{10}$$

$$\Delta X_t = \rho_1 \hat{\beta}_{1\perp}' (X_{t-1} - X_{t_0}) + \alpha \hat{\beta}_1' (X_{t-1} - X_{t_0}) + \sum_{j=1}^{q} \Gamma_j \Delta X_{t-j} + \tilde{\varepsilon}_t, \quad t > t_0, \tag{11}$$

where q > p. Let $\hat{\rho}_0$ and $\hat{\rho}_1$ respectively denote the OLS estimators of $\rho_0$ and $\rho_1$ in regressions (10) and (11). The null hypothesis $\rho_0 = \rho_1 = 0$ is tested by using a


multivariate Fisher-type test statistic

$$\mathrm{tr}\left\{ (\hat{\rho}_0\ \hat{\rho}_1) \begin{pmatrix} \sum_{t=0}^{t_0} \hat{\beta}_{0\perp}' X_{t-1} X_{t-1}' \hat{\beta}_{0\perp} & 0 \\ 0 & \sum_{t=t_0+1}^{T} \hat{\beta}_{1\perp}' (X_{t-1} - X_{t_0})(X_{t-1} - X_{t_0})' \hat{\beta}_{1\perp} \end{pmatrix} (\hat{\rho}_0\ \hat{\rho}_1)'\, \hat{\Sigma}_\varepsilon^{-1} \right\}.$$

The asymptotic distribution of this test statistic has no nuisance parameters when the true cointegrating directions are known. Yet, it can be shown (cf. Appendix C) that the first-step error in the estimation of $\beta_0$ and $\beta_1$ introduces a nuisance term in the limit distribution that belongs to the space spanned by the loading factors, $\alpha$. To get rid of it, we propose to consider instead the following test statistic

$$\xi_T(\lambda_0) = \mathrm{tr}\left\{ \hat{\alpha}_\perp' \hat{\rho}_0 \Big[ \sum_{t=0}^{t_0} \hat{\beta}_{0\perp}' X_{t-1} X_{t-1}' \hat{\beta}_{0\perp} \Big] \hat{\rho}_0' \hat{\alpha}_\perp (\hat{\alpha}_\perp' \hat{\Sigma}_\varepsilon \hat{\alpha}_\perp)^{-1} \right\} + \mathrm{tr}\left\{ \hat{\alpha}_\perp' \hat{\rho}_1 \Big[ \sum_{t=t_0+1}^{T} \hat{\beta}_{1\perp}' (X_{t-1} - X_{t_0})(X_{t-1} - X_{t_0})' \hat{\beta}_{1\perp} \Big] \hat{\rho}_1' \hat{\alpha}_\perp (\hat{\alpha}_\perp' \hat{\Sigma}_\varepsilon \hat{\alpha}_\perp)^{-1} \right\},$$

whose asymptotic distribution is free of any nuisance parameter, as stated in the following theorem.

Theorem 2. Consider the model defined by Eq. (1), with $\alpha_0 = \alpha_1 = \alpha$, and Assumption 2. Under $H_0$, when T goes to infinity,

$$\xi_T(\lambda_0) \xrightarrow{d} (\lambda_0)^{-1}\, \mathrm{tr}\left\{ \int_0^{\lambda_0} dW_0\, G_0' \Big[ \int_0^{\lambda_0} G_0 G_0'\, du \Big]^{-1} \int_0^{\lambda_0} G_0\, dW_0' \right\} + (1 - \lambda_0)^{-1}\, \mathrm{tr}\left\{ \int_{\lambda_0}^{1} dW_0\, G_1' \Big[ \int_{\lambda_0}^{1} G_1 G_1'\, du \Big]^{-1} \int_{\lambda_0}^{1} G_1\, dW_0' \right\},$$

with $W_0(u)$ a standard Wiener process of dimension n − r and $G_0(u)$ and $G_1(u)$ two processes whose specification depends on the deterministic terms of the model. More precisely, letting $W_0^{\lambda_0}(u) = W_0(u) - W_0(\lambda_0)$, and writing $\mu_i = (\mu_i^{c}\ \mu_i^{\tau})$ for the constant and trend coefficients when $D_t = (1\ t)'$, we have the following specifications:
1. when $D_t = 0$: $G_0(u) = W_0(u)$ and $G_1(u) = W_0^{\lambda_0}(u)$,
2. when $D_t = 1$ and $\alpha_\perp' \mu_i = 0$, i = 0, 1: $G_0(u) = [W_0(u), 1]$ and $G_1(u) = [W_0^{\lambda_0}(u), 1]$,
3. when $D_t = 1$ and $\alpha_\perp' \mu_i \neq 0$, i = 0, 1: $G_0(u) = \big[\tilde{W}_0(u) - \int_0^{\lambda_0} \tilde{W}_0(u)\,du,\ u - \tfrac{\lambda_0}{2}\big]$ and $G_1(u) = \big[\tilde{W}_0^{\lambda_0}(u) - \int_{\lambda_0}^{1} \tilde{W}_0^{\lambda_0}(u)\,du,\ u - \tfrac{(1 - \lambda_0)}{2}\big]$, where $\tilde{W}_0(u)$ is a standard Wiener process of dimension n − r − 1,
4. when $D_t = (1\ t)'$ and $\alpha_\perp' \mu_i^{\tau} = 0$, i = 0, 1: $G_0(u) = \big[W_0(u) - \int_0^{\lambda_0} W_0(u)\,du,\ u - \tfrac{\lambda_0}{2}\big]$ and $G_1(u) = \big[W_0^{\lambda_0}(u) - \int_{\lambda_0}^{1} W_0^{\lambda_0}(u)\,du,\ u - \tfrac{(1 - \lambda_0)}{2}\big]$,
5. when $D_t = (1\ t)'$ and $\alpha_\perp' \mu_i^{\tau} \neq 0$, i = 0, 1: $G_0(u) = [\tilde{W}_0(u) - e_0 - f_0 u,\ u^2 - g_0 - h_0 u]$ and $G_1(u) = [\tilde{W}_0^{\lambda_0}(u) - e_1 - f_1 u,\ u^2 - g_1 - h_1 u]$, where $e_0$, $e_1$, $f_0$ and $f_1$ are four vectors of dimension n − r − 1 obtained by regressing $\tilde{W}_0(u)$ and $\tilde{W}_0^{\lambda_0}(u)$ on a constant and a linear trend for $t \le t_0$ and $t > t_0$. Likewise, $g_0$, $g_1$, $h_0$ and $h_1$ are four scalars obtained by regressing $u^2$ on a constant and a linear trend for $t \le t_0$ and $t > t_0$.

Proof. See Appendix C.

Property 1. The sequential test procedure adopted gives a consistent way to determine the cointegration rank, in the sense that it leads to selecting the true number of cointegration relationships with asymptotic probability 1 − l, with l the size of the test.

Proof. The proof closely follows that of Theorem 12.3 in Johansen (1995). Details are available from the authors on request.

4.2. Testing for the number of changing directions

Once we have identified the number of cointegration relationships, say r, it can be useful to know how many among them, say r′, do not change with the regime shift. Identification of r′ can be achieved by testing sequentially the nested null hypotheses

$$H_0': \mathrm{rk}(\beta_0) = \mathrm{rk}(\beta_1) = r \text{ and } \dim[\mathrm{sp}(\beta_0) \cap \mathrm{sp}(\beta_1)] \le r',$$

against the alternative that

$$H_a': \mathrm{rk}(\beta_0) = \mathrm{rk}(\beta_1) = r \text{ and } \dim[\mathrm{sp}(\beta_0) \cap \mathrm{sp}(\beta_1)] > r'.$$

Note that the fact that $\mathrm{sp}(\beta_i)$ and $\mathrm{sp}(\beta_{i\perp})$, i = 0, 1, are two supplementary subspaces restricts the smallest possible number of common cointegrating directions and leads to a sequential test procedure that starts with r′ = max(2r − n, 0) and ends with r′ = r − 1.
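The sequential logic shared by the rank test and the test for the number of common directions can be sketched as a simple loop. The callables `statistic` and `critical_value` are hypothetical placeholders for a test statistic and its asymptotic critical value at a given hypothesized value; neither is specified here, and the numbers in the usage example are made up.

```python
from typing import Callable


def sequential_select(start: int, stop: int,
                      statistic: Callable[[int], float],
                      critical_value: Callable[[int], float]) -> int:
    """Return the first k in start..stop whose null is not rejected.

    If every null is rejected, return stop + 1.
    """
    for k in range(start, stop + 1):
        if statistic(k) <= critical_value(k):
            return k
    return stop + 1


def common_directions_range(n: int, r: int) -> range:
    """Values of r' visited by the test for the number of common directions."""
    return range(max(2 * r - n, 0), r)


# Rank test: r runs from 0 to n - 1 (made-up statistics and critical value).
n = 3
toy_stats = {0: 30.0, 1: 12.0, 2: 1.0}
r_selected = sequential_select(0, n - 1, toy_stats.__getitem__, lambda k: 10.0)
print("selected rank:", r_selected)

# Common directions: with n = 4 and r = 3, at least 2 directions are common.
print("r' values tested:", list(common_directions_range(4, 3)))
```

When all nulls are rejected in the second application, the returned value equals r, the case in which every cointegrating direction is common to the two regimes.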
Note also that rejecting the null at the last step is equivalent to saying that there is no structural break. According to the properties of the DGP under study, the term $\beta_1' \Delta X_t$ has a Wold decomposition over the first regime. Therefore, we can consider the following


autoregressive approximation of $\beta_1' \Delta X_t$ for $t \le t_0$:

$$\beta_1' \Delta X_t = \Pi \beta_1' X_{t-1} + \tilde{\mu}_0 D_t + \sum_{j=1}^{p} \tilde{\Gamma}_j \beta_1' \Delta X_{t-j} + \nu_t, \tag{12}$$

where $p = O_p(T^{1/3})$, $\Pi$ and $\tilde{\Gamma}_j$, j = 1, …, p, are r × r matrices, $\tilde{\mu}_0$ is an r × d matrix and $\nu_t$ is a martingale difference sequence. Under the null hypothesis $H_0'$, the term $\beta_1' X_{t-1}$ can be split into r′ stationary directions and r − r′ non-stationary ones over the first regime. As $\beta_1' \Delta X_t$ is weakly stationary over the first regime, its representation involves the r′ components of $\beta_1' X_{t-1}$ which are also stationary for $t \le t_0$. This implies that the r × r matrix $\Pi$ in regression (12) is of reduced rank r′, which is equivalent to saying that it can be decomposed into $\Pi = \varphi \psi'$, with $\varphi$ and $\psi$ two r × r′ full rank matrices. Let $\psi_\perp$ be an r × (r − r′) full-rank matrix such that $\psi' \psi_\perp = 0$ and extend the set of regressors of regression (12) so as to include the components of $\beta_1' X_{t-1}$ that are non-stationary over the first regime:

$$\beta_1' \Delta X_t = \kappa \psi_\perp' \beta_1' X_{t-1} + \varphi \psi' \beta_1' X_{t-1} + \tilde{\mu}_0 D_t + \sum_{j=1}^{p} \tilde{\Gamma}_j \beta_1' \Delta X_{t-j} + \nu_t, \quad t = 1, \dots, t_0. \tag{13}$$

The null hypothesis $H_0'$ is then equivalent to the parametric restriction $\kappa = 0$ in the preceding equation. Testing for the significance of $\kappa$ in (13) requires consistent estimates of the parameters $\psi$ and $\psi_\perp$. Such estimates can be obtained by a principal components analysis of $\beta_1' \Delta X_{t-1}$ over the first regime, where the unknown matrix $\beta_1$ is replaced by its consistent principal components estimator $\hat{\beta}_1$ and for which a convenient normalization constraint is now $\varphi' \Sigma_\nu^{-1} \varphi = I_{r'}$, with $\Sigma_\nu = E(\nu_t \nu_t')$. This ensures that the parameters of interest are identified if one further imposes an assumption equivalent to Assumptions 3 or 4 for the particular context at hand. Given those estimates, we can consider the regression

$$\hat{\beta}_1' \Delta X_t = \kappa \hat{\psi}_\perp' \hat{\beta}_1' X_{t-1} + \varphi \hat{\psi}' \hat{\beta}_1' X_{t-1} + \tilde{\mu}_0 D_t + \sum_{j=1}^{q} \tilde{\Gamma}_j \hat{\beta}_1' \Delta X_{t-j} + \tilde{\nu}_t, \quad t = 1, \dots, t_0, \tag{14}$$

with q > p, and test the null hypothesis $H_0'$: $\kappa = 0$ by use of the following test statistic

$$\zeta_{t_0} = \mathrm{tr}\left\{ \hat{\varphi}_\perp' \hat{\kappa} \Big[ \sum_{t=1}^{t_0} \hat{\psi}_\perp' \hat{\beta}_1' X_{t-1} X_{t-1}' \hat{\beta}_1 \hat{\psi}_\perp \Big] \hat{\kappa}' \hat{\varphi}_\perp (\hat{\varphi}_\perp' \hat{\Sigma}_\nu \hat{\varphi}_\perp)^{-1} \right\},$$

where the projection onto the space spanned by $\hat{\varphi}_\perp$ ensures that the limit distribution is free of the nuisance parameter which stems from the first-step error in the estimation of the stationary directions common to the two regimes, $\psi$, as stated in the following theorem.


Theorem 3. Consider the model described by Eq. (12). Under H0 , when T goes to in4nity,    −1    1 1 1 d ?t0 → tr dUF  FF  du FdU  ;   0 0 0 with U (u) a standard Wiener process of dimension r − r  and F(u) a process which is de4ned 1. when Dt = 0, F(u) = U (u), 2. when Dt = 1 and <⊥ ˜ 1 = 0, F(u) 1], $ # = [U (u); !1 ˜ ˜ 3. when Dt = 1, <⊥ 1 = 0, F(u) = U (u) − 0 U˜ (u)du; u − 12 where U˜ (u) is a stan-

dard Wiener process of dimension r − r  − 1, $ # !1 4. when Dt =(1 t) , ˜ 1 =(4˜1 ˜1 ) and <⊥ ˜1 =0: F(u)= U (u) − 0 U (u)du; u − 12 , 5. when Dt =(1 t) , ˜ 1 =(4˜1 ˜1 ) and <⊥ ˜1 = 0: F(u)=[U˜ (u)−e−fu; u2 −g−hu], where e and f are two vectors of dimension r − r  − 1 obtained by regressing U˜ (u) on a constant and a linear trend and g and h are two scalars obtained by regressing u2 on a constant and a linear trend. Proof. We can use the same arguments as for Theorem 2. Indeed the order in probability of the estimation error ˆ1 − 1 ensures that the blocks of the covariance matrix over which estimates of = and =⊥ are built converge with the same rate than those of the covariance matrix involved in the estimation of the stationary and non-stationary directions of the two regimes, i , i⊥ , i = 0; 1. Remark 2. This testing procedure is equivalent to Johansen’s (1988, 1991) ones but for the transformed system associated with the 3rst regime. One should be cautious because this procedure is potentially subject to poor in-sample performances since the resulting estimation errors come not only from the principal components estimate of 1 , but also from that of =. In particular, when the date of the break occurs near the beginning of the sample, one should recommend using an (asymptotic) autoregressive approximation of 0 L Xt for t ¿ t0 despite the fact that the moving average representation of L Xt over the second regime involves time varying coeScients (see Theorem 1). 4.3. Testing for the cointegration rank with an unknown date of break We end this section by extending the previous cointegration rank test to the more general case where the date of break is unknown. Yet, we do not deal here with the question of inference about the unknown timing of the break 0 . This test is thus only viewed as a cointegration rank test robust to the presence of a structural break. For r given, it can be implemented by using the test statistic 0∗T = max 0T (0 ); 0 ∈[ ;P ]

[ ; ] P ⊂ = ]0; 1[; −

whose asymptotic distribution is given in the following theorem.
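As a rough illustration of how a statistic of this kind can be evaluated, the following Python sketch maximizes a user-supplied rank statistic over all candidate break dates in a trimmed interval. The names `sup_statistic` and `stat_at_break` are hypothetical, and the default trimming bounds (0.15 and 0.85) are simply the values used in Section 6.1 for the simulated critical values; the rank statistic itself is not implemented here.

```python
import math
from typing import Callable

def sup_statistic(stat_at_break: Callable[[int], float], T: int,
                  lower: float = 0.15, upper: float = 0.85) -> float:
    """Maximum of stat_at_break(t0) over break dates t0 in [lower*T, upper*T]."""
    if not 0.0 < lower < upper < 1.0:
        raise ValueError("need 0 < lower < upper < 1")
    # Small tolerance guards against floating-point noise in lower*T and upper*T.
    first = math.ceil(lower * T - 1e-9)
    last = math.floor(upper * T + 1e-9)
    return max(stat_at_break(t0) for t0 in range(first, last + 1))

# Toy objective (not a cointegration statistic) that peaks at t0 = 60.
value = sup_statistic(lambda t0: -float((t0 - 60) ** 2), T=100)
```

Only the maximum is returned, in line with the remark above that the test is not meant to deliver inference on the break date itself.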


Theorem 4. Consider the model de4ned by Eq. (1), with 0 = 1 = , and Assumption 2. Under H0 , when T goes to in4nity,     0 −1  0 0    −1 ∗ d G0 G0 du G0 dW0 dW0 G0 0T → sup (0 ) tr 0 ∈[ ;P ]

0

+ (1 − 0 )−1 tr

  

1

0

0

dW0 G1



1

0

G1 G1 du

0

−1 

1

0

  G1 dW0  ; 

with W0 (u) a standard Wiener process of dimension n − r and G0 (u) and G1 (u) two processes de4ned as in Theorem 2. Proof. The asymptotic properties used in the proof of Theorem 2 hold uniformly over 0 (see Gregory and Hansen, 1996). Indeed, the weak convergence results of Property 3 (given in Appendix C) are with respect to the uniform metric over 0 ∈[ ; P ] ⊂ = ]0; 1[. − Remark 3. Using the same arguments than for Property 1, we can establish that the bottom-up testing strategy adopted in Section 4.1 still provides us here with a consistent way of determining the cointegration rank. 5. Testing for the cointegration rank when the break a%ects the cointegrating vectors and the loading factors Besides the estimation details described in Section 3.2, identifying the cointegration rank in the more general case where sp(0 ) = sp(1 ); requires specifying more precisely the maintained hypothesis of the test. As explained in the next section this is necessary in order to built a test statistic whose asymptotic distribution is free of any nuisance parameters. 5.1. Nuisance parameters and correction If one extends the test statistic that was previously used to this more general case one gets  t    0    ˆ     −1 0T (0 ) = tr ˆ0⊥ 0 ˆ0⊥ Xt−1 Xt−1 ˆ0⊥ ˆ0 ˆ0⊥ (ˆ0⊥ ˆ j ˆ0⊥ )  +tr

ˆ1⊥ ˆ1

t=0



T 

 ˆ1⊥ (Xt−1

− Xt0 )(Xt−1 − Xt0 ) ˆ1⊥ 

t=t0 +1

% × ˆ1 ˆ1⊥ (ˆ1⊥ ˆ j ˆ1⊥ )−1 :

(15)

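Let $W_0$ and $W_1$ be two standard Wiener processes of dimension $n - r$. Let also $B$ denote the long-run covariance matrix of $(\Delta X_{t-1}' \beta_{0\perp} \;\; \Delta X_{t-1}' \beta_{1\perp})'$, defined (when no deterministic terms are present in the model) by

$$B = V_{LT}\begin{pmatrix} \beta_{0\perp}' \Delta X_{t-1} \\ \beta_{1\perp}' \Delta X_{t-1} \end{pmatrix} = \lim_{T \to \infty} V\left[ \frac{1}{\sqrt{T}} \begin{pmatrix} \beta_{0\perp}' X_T \\ \beta_{1\perp}' X_T \end{pmatrix} \right] = E\left\{ \begin{pmatrix} W_0 \\ W_1 \end{pmatrix} (W_0' \;\; W_1') \right\},$$

and partition it with respect to $(W_0' \;\; W_1')'$ into

$$B = \begin{pmatrix} B_{00} & B_{01} \\ B_{10} & B_{11} \end{pmatrix}.$$

Let us finally define $W_{1.0}$, a Wiener process independent of $W_0$, by orthogonal projection:

$$W_{1.0} = W_1 - B_{10} B_{00}^{-1} W_0.$$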
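The orthogonal projection above can be illustrated numerically. The sketch below is only an analogy under stated assumptions: it replaces the long-run covariance blocks by plain sample second moments of two simulated series, and the function name `project_out` is hypothetical. It checks that the projected series is uncorrelated in sample with the series projected upon.

```python
import numpy as np

def project_out(w1, w0):
    """Residual of w1 after linear projection on w0 (rows are observations)."""
    b00 = w0.T @ w0 / len(w0)          # sample analogue of B00
    b10 = w1.T @ w0 / len(w0)          # sample analogue of B10
    # Row-wise version of w1 - B10 B00^{-1} w0.
    return w1 - w0 @ np.linalg.solve(b00, b10.T)

rng = np.random.default_rng(0)
w0 = rng.standard_normal((500, 2))
w1 = 0.6 * w0 + rng.standard_normal((500, 2))   # correlated with w0 by construction
w10 = project_out(w1, w0)
cross = w10.T @ w0 / len(w0)                     # sample cross moment after projection
```

In an actual application the blocks would come from a long-run covariance estimator rather than from raw second moments.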

In the absence of deterministic terms, the limit distribution of the 0T (0 ) test statistic above is given by   −1  0  0 0 (0 )−1 tr W0 W0 W0 dW0 dW0 W0 0

0

+ (1 − 0 )−1 tr

  

1

0

dW1 W1

0



1

0

W1 W1

−1 

1

0

−1  W1 (dW1:0 + dW0 B00 B01 )

  

:

Therefore if one simply extends the previous analysis to this more general framework, one gets a test statistic whose asymptotic distribution is not free of nuisance parameters   as soon as the long-run covariance between 0⊥ L Xt−1 and 1⊥ L Xt−1 di4ers from zero. Yet, in the spirit of Phillips and Hansen (1990) we can get rid of this nuisance parameter by replacing, in the test statistic formula (15), the non-stationary terms of  the second regime with a modi3ed regressor, 1⊥ (Xt−1 − Xt0 )† , such that its long-term covariance with the non-stationary terms of the 3rst regime is equal to zero, namely   CovLT [0⊥ L Xt−1 ; (1⊥ L Xt−1 )† ] ' & 1  1  = lim Cov √ 0⊥ XT ; √ 1⊥ (XT − Xt0 )† = 0: T →∞ T T Such a modi3ed regressor could be obtained by summing −1    (1⊥ L X−1 )† = 1⊥ L X−1 − B10 B00 0⊥ L X−1 ;

from  = t0 + 1 to t, for each date t belonging to the second regime. However, such  a correction is too drastic. Indeed, it is equivalent to projecting 1⊥ L Xt−1 onto the   orthogonal supplement of sp(0⊥ ). Let r be the number of common cointegrating  ) is spanned by r −r  directions which belong directions across the two regimes. sp(1⊥     to sp(0 ) and by n−r−(r−r )=n−2r+r directions which belong to sp(0⊥ ). Thus, the †   n − r vector (1⊥ L Xt−1 ) contains only r − r non-zero components associated with the non-stationary directions that di4er across the two regimes. With this correction adopted, the term T    T −2 1⊥ (Xt−1 − Xt0 )† [1⊥ (Xt−1 − Xt0 )† ] ; t=t0 +1


which would be involved in the computation of the modi3ed test statistic would not have a full rank limit leading to a degenerate asymptotic distribution of the test statistic. A convenient modi3cation of the test statistic should therefore only correct for the  r − r  non-stationary terms in 1⊥ L Xt−1 which are speci3c to the second regime. To  do so, we partition 1⊥ L Xt−1 as the long-run covariance matrix B10 into    1 b1⊥ B10  1⊥ L Xt−1 = L X ; ; B = t−1 10 0 b01⊥ B10  with b1⊥ an (r−r  )×n matrix such that sp(b1⊥ ) * sp(0⊥ ) and b01⊥ an (n−2r+r  )×n   1  L Xt−1 ) matrix such that sp(b01⊥ ) ⊆ sp(0⊥ ), and where B10 = CovLT (b1⊥ L Xt−1 ; 0⊥  0  and B10 = CovLT (b01⊥ L Xt−1 ; 0⊥ L Xt−1 ) are two matrices respectively of dimension (r − r  ) × (n − r) and (n − 2r + r  ) × (n − r). De3ning −1  1 (b1⊥ L Xt−1 )† = b1⊥ L Xt−1 − B10 B00 0⊥ L Xt−1 ;

a suitable correction of the test statistic is achieved by replacing the non-stationary   terms 1⊥ (Xt−1 − Xt0 ) in (15) with a modi3ed regressor 1⊥ (Xt−1 − Xt0 )‡ obtained by summing  † b1⊥ L X−1  ‡ ; (1⊥ L X−1 ) = b01⊥ L X−1 from  = t0 + 1 to t, for each date t between t0 + 1 and T . A consequence of the preceding discussion is that, when the structural break a4ects the loading factors, testing for the number of cointegrating vectors in (1) requires specifying the number of common cointegrating directions across the two regimes in the maintained hypothesis of the test. Namely the hypotheses of the test have to be speci3ed as H0† : sp(0 ) = sp(1 ); dim{sp(0 ) ∩ sp(1 )} = r  and rk(0 ) = rk(1 ) 6 r; against Ha† : sp(0 ) = sp(1 ); dim{sp(0 ) ∩ sp(1 )} = r  and rk(0 ) = rk(1 ) ¿ r: Such a test is equivalent to a signi3cance test in the following regression equations p   L Xt = d0 0⊥ Xt−1 + 0 0 Xt−1 + 0 Dt +

j L Xt−j + jt ; (16) j=1

t 6 t0 and, for t ¿ t0 , L Xt = d1 b1⊥ (Xt−1 − Xt0 )† + d01 b01⊥ (Xt−1 − Xt0 ) + 1 1 (Xt−1 − Xt0 ) + 1 Dt +

p 

j L Xt−j + jt :

(17)

j=1

Indeed, from the properties of the DGP studied, the null hypothesis H0† implies that d0 = d1 = d01 = 0 in (16)–(17). Testing this requires knowledge of 0 , 0⊥ , 1 , b1⊥ and


b01⊥ , which are rarely known in practical situations. Nevertheless, in a similar way to what we have done in the previous sections, the testing strategy remains asymptotically valid if one replaces them by consistent estimates. The principal components analysis of Section 2.2 provides us with such estimators for 0 , 0⊥ and 1 . We now develop a procedure to get consistent estimators of b01⊥ and b1⊥ . 5.2. Practical implementation and test statistic Estimation of the unknown speci3c and common stationary directions b1⊥ and b01⊥ rely on a suitable principal components analysis. We 3rst decompose the term ˆ1⊥ (Xt−1 − Xt0 ) in the two supplementary subspaces sp(ˆ0 ) and sp(ˆ0⊥ ), yielding ˆ (Xt−1 − Xt ) = ˆ P ˆ (Xt−1 − Xt ) + ˆ P ˆ (Xt−1 − Xt ); t ¿ t0 ; 1⊥

0

1⊥ 0

0

1⊥ 0⊥

0

with Pˆ0 = ˆ0 (ˆ0 ˆ0 )−1 ˆ0 and Pˆ0⊥ = ˆ0⊥ (ˆ0⊥ ˆ0⊥ )−1 ˆ0⊥ . The non-zero components of ˆ1⊥ (Xt−1 − Xt0 ) in the cointegration space of the 3rst regime correspond to the r − r  relations that are stationary over the 3rst regime and non-stationary over the second one. Conversely, the projection of ˆ1⊥ (Xt−1 − Xt0 ) onto the non-stationary directions of the 3rst regime gives us the n−2r+r  directions that are non-stationary across the two regimes. Thus the (n−r)×(n−r) matrices corresponding to the empirical second-order moments, appropriately normalized, of ˆ1⊥ Pˆ0 (Xt−1 −Xt0 ) and ˆ1⊥ Pˆ0⊥ (Xt−1 − Xt0 ), are respectively of reduced rank r − r  and n − 2r + r  , with asymptotic probability one. A principal components analysis of ˆ1⊥ Pˆ0 (Xt−1 − Xt0 ) gives us bˆ1⊥ as the matrix of the eigenvectors associated with the r − r  largest eigenvalues of T −2

T  t=t0 +1

ˆ1⊥ Pˆ0 (Xt−1 − Xt0 )(Xt−1 − Xt0 ) Pˆ0 ˆ1⊥ :

Likewise, a principal components analysis of Pˆ0⊥ ˆ1⊥ (Xt−1 − Xt0 ) provides us with bˆ01⊥ which is composed of the eigenvectors associated with the n − 2r + r  largest eigenvalues of T −2

T  t=t0 +1

ˆ1⊥ Pˆ0⊥ (Xt−1 − Xt0 )(Xt−1 − Xt0 ) Pˆ0⊥ ˆ1⊥ :
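The two principal components steps above share the same mechanics: form the second-moment matrix of a projected series, scale it by T^{-2}, and keep the eigenvectors associated with the largest eigenvalues. A minimal Python sketch of that step follows; the function name `leading_eigvecs` and the simulated three-dimensional series (one stochastic trend plus stationary noise) are assumptions made for illustration, not the paper's data or code.

```python
import numpy as np

def leading_eigvecs(z, k):
    """Eigenvectors for the k largest eigenvalues of T^{-2} * sum_t z_t z_t'."""
    T = z.shape[0]
    moment = z.T @ z / T**2
    eigval, eigvec = np.linalg.eigh(moment)      # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:k]         # indices of the k largest
    return eigvec[:, order], eigval[order]

rng = np.random.default_rng(1)
walk = np.cumsum(rng.standard_normal((400, 1)), axis=0)      # one stochastic trend
direction = np.array([[1.0, 2.0, -1.0]]) / np.sqrt(6.0)      # unit-length loading
z = walk @ direction + 0.1 * rng.standard_normal((400, 3))   # trend plus stationary noise
vecs, vals = leading_eigvecs(z, k=1)
```

Because the non-stationary component dominates the scaled second moments, the leading eigenvector is close (up to sign) to the direction along which the trend loads.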

Remark that by construction the estimators bˆ1⊥ and bˆ01⊥ belong to sp(ˆ1⊥ ) so that they still satisfy ˆ1 bˆ1⊥ = 0 and ˆ1 bˆ01⊥ = 0. Estimation of the modi3ed regressor b1⊥ (Xt−1 − Xt0 )† ; that we denote by bˆ1⊥ (Xt−1 − Xt0 )† , can then be achieved by summing  ( LT (b1⊥ L X−1 ; 0⊥ (bˆ1⊥ L X−1 )† = bˆ1⊥ L X−1 − Cov L X−1 )  × Vˆ LT (0⊥ L X−1 )ˆ0⊥ L X−1 ;

from  = t0 + 1 to t, for each date t between t0 + 1 and T , where estimates of the long-run covariance matrices involved can be obtained by selecting the appropriate


blocks in a non-parametric kernel estimator (see Phillips (1995)) of the non-stationary terms long-run variance    0⊥ L Xt−1     VLT   b1⊥ L Xt−1  : b01⊥ L Xt−1 Let dˆ0 , dˆ1 and dˆ01 , respectively denote the OLS estimators of d0 , d1 and d01 in the regression equations q 

j L Xt−j + j˜t ; L Xt = d0 ˆ0⊥ Xt−1 + 0 ˆ0 Xt−1 + 0 Dt + j=1

for t 6 t0 , and

L Xt = d1 bˆ1⊥ (Xt−1 − Xt0 )† + d01 bˆ01⊥ (Xt−1 − Xt0 ) + 1 ˆ1 (Xt−1 − Xt0 ) + 1 Dt +

q 

j L Xt−j + j˜t ;

j=1

for t ¿ t0 , with q ¿ p. The null hypothesis H0† is then tested with the following multivariate Fisher type test statistic  t    0  †  ˆ     −1 ˆ ˆ ˆ ˆ 0T (0 ) = tr ˆ0⊥ d0 0⊥ Xt−1 Xt−1 0⊥ d0 ˆ0⊥ (ˆ0⊥ j ˆ0⊥ ) t=0

     T    dˆ1    ‡  ‡ + tr ˆ1⊥ ˆ1⊥ (Xt−1 − Xt0 ) [ˆ1⊥ (Xt−1 − Xt0 ) ]  dˆ01 t=t +1 0

 ×

dˆ1 dˆ01



ˆ1⊥ (ˆ1⊥ ˆ j ˆ1⊥ )−1

;

where estimates of the corrected non-stationary regressors of the second regime are   bˆ1⊥ (Xt−1 − Xt0 )†  ‡ : ˆ1⊥ (Xt−1 − Xt0 ) = bˆ01⊥ (Xt−1 − Xt0 ) The asymptotic distribution of this test statistic is free of any nuisance parameter as stated in the following theorem. Theorem 5. Consider the model de4ned by Eq. (1), with 0 = 1 , and Assumption 2. Under H0† , when T goes to in4nity,    0 −1  0 0 d † −1    dW0 G0 G0 G0 du G0 dW0 0T (0 )→(0 ) tr 0

+ (1 − 0 )−1 tr

0

  

1

0

dW1 G1



0

1

0

G1 G1 du

−1 

1

0

G1 dW1

  

;


where W0 (u) =

)

W00 (u) W01 (u)

*

and W1 (u) =

)

W11:0 (u)−W11:0 (0 ) W01 (u)−W01 (0 )  

*


, with W01 (u), W00 (u) and

W11:0 (u) denoting respectively, n − 2r + r , r − r and r − r  dimensional standard    Wiener processes which verify E(W00 W11:0 ) = 0, E(W00 W01 ) = 0, and E(W01 W11:0 ) = 0, so that  0 0  : E{W0 W1 } = 0 In−2r+r  Moreover G0 (u) and G1 (u) are two processes of dimension n − r whose expression depends on the deterministic terms in the model speci4cation. More precisely, 1. when Dt = 0, G0 (u) = W0 (u);

G1 (u) = W1 (u);

2. when Dt = 1 and i⊥ i = 0, i = 0; 1, G0 (u) = [W0 (u); 1];

G1 (u) = [W1 (u); 1];

3. when Dt = 1, and i⊥ i = 0, i = 0; 1, ' &  0 0 ˜ ˜ ; W 0 (u)du; u − G0 (u) = W 0 (u) − 2 0    1 (1 − 0 ) ˜ ˜ G1 (u) = W 1 (u) − ; W 1 (u) du; u − 2 0 with W˜ 0 (u) and W˜ 1 (u) denoting Wiener processes of dimension n − r − 1, 4. when Dt = (1 t) , i = (4i i ) and i⊥ i = 0, i = 0; 1, ' &  0 0 ; W0 (u) du; u − G0 (u) = W0 (u) − 2 0    1 (1 − 0 ) G1 (u) = W1 (u) − ; W1 (u) du; u − 2 0 t) , i = (4i i ) and i⊥ i = 0, i = 0; 1, G0 (u) = [W˜ 0 (u) − e0 − f0 u; u2 − g0 − h0 u];

5. when Dt = (1

G1 (u) = [W˜ 1 (u) − e1 − f1 u; u2 − g1 − h1 u]; with W˜ 0 (u) and W˜ 1 (u) denoting Wiener n − r − 1 dimensional processes and e0 , e1 , f0 and f1 four vectors of dimension n − r − 1 obtained by regressing W˜ 0 (u) and W˜ 1 (u) on a constant and a linear trend for t 6 t0 and t ¿ t0 . Likewise, g0 , g1 , h0 and h1 are four scalars obtained by regressing u2 on a constant and a linear trend for t 6 t0 and t ¿ t0 . Proof. See Appendix D. Be the common cointegrating directions number, r  , given the sequential test procedure adopted in the preceding sections would give a consistent way of determining


the cointegration rank r. However, from a practical point of view, that parameter is in general unknown. Yet it is still possible in that case to build a procedure that leads to a consistent determination of the cointegration rank, as stated in the following property, whose proof is available from the authors on request.

Property 2. The procedure that consists of testing sequentially for H0† against Ha†, starting with r = 0 and ending with r = n − 1, and, for each value of r, implementing the test for the different possible values of r′, starting with the lowest one (the lower bound for r′ being max(0, 2r − n), as pointed out in Section 4.2) and ending with r − 1, gives a consistent way to determine the cointegration rank r (and jointly the number of common cointegrating directions r′).
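The ordering of the tests in Property 2 can be sketched as a simple double loop. The Python code below is a hedged illustration: the function names are hypothetical, the test itself is passed in as a callable, the single test performed at r = 0 (where no common direction can exist) and the rule of stopping at the first non-rejection are assumptions in the spirit of the bottom-up strategy of Section 4.1 rather than details stated in the property.

```python
from typing import Callable, Iterator, Optional, Tuple

def candidate_pairs(n: int) -> Iterator[Tuple[int, int]]:
    """Yield (r, r_common) in the order described in Property 2."""
    for r in range(n):                            # r = 0, ..., n - 1
        if r == 0:
            yield (0, 0)                          # assumption: one test when r = 0
            continue
        low = max(0, 2 * r - n)                   # lower bound for r'
        for r_common in range(low, r):            # r' = low, ..., r - 1
            yield (r, r_common)

def select_rank(n: int, rejects: Callable[[int, int], bool]) -> Optional[Tuple[int, int]]:
    """Return the first (r, r') whose null is not rejected, or None if all are rejected."""
    for r, r_common in candidate_pairs(n):
        if not rejects(r, r_common):              # assumption: stop at first non-rejection
            return (r, r_common)
    return None

pairs_n4 = list(candidate_pairs(4))
```

For n = 4 the sequence of tested pairs is (0, 0), (1, 0), (2, 0), (2, 1), (3, 2).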

6. Critical values and in-sample size and power

6.1. Critical values

The asymptotic distributions of the two cointegration rank test statistics, given in Theorems 2 and 5 respectively, are characterized by the same number of independent white noise processes. As a consequence, asymptotic critical values for the cointegration rank tests are the same whether the break affects the loading factors or not. Following MacKinnon (1996) and MacKinnon et al. (1999), we estimate these critical values by response surface regressions. We provide results for the 5 different specifications of the deterministic terms investigated in the text, 4 different relative timings of the break (0.2, 0.3, 0.4 and 0.5) and up to 6 non-stationary directions. In each case, we ran 25 experiments, each involving 10,000 replications of the test statistic (details on the simulation experiments are available from the authors), for 13 different sample sizes T: 100, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900, 1000 and 1250. Let $q^p(T_i)$ denote the estimate of the p quantile of the test statistic based on the ith experiment, for which the sample size is $T_i$. We thus have 13 × 25 = 325 observations at hand, and consider the response surface regression

$$q^p(T_i) = \theta_\infty^p + \theta_1^p T_i^{-1} + \cdots + \theta_d^p T_i^{-d} + u_i,$$

in which the first parameter $\theta_\infty^p$ corresponds to the p quantile of the asymptotic distribution of the test statistic. Estimates of this parameter are obtained following the methodology of MacKinnon (1996), which proposes a GMM estimator to handle the heteroskedasticity of the error term as well as a specification test to choose the set of regressors. Depending on the cases treated here, the maximum degree of the polynomial in the inverse of the sample size was d = 2 or 3. The results are presented in Table 1.

For computational cost reasons, critical values for the limit distribution of the test statistic defined in Theorem 4 are estimated by a single experiment for each of the 5 different specifications of the deterministic terms investigated and up to 6 non-stationary directions. We drew 10,000 replications of the test statistic in random samples of size T = 250 and 1000 for all possible timings of the break included in [0.15 × T, 0.85 × T].
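To make the response surface idea concrete, the sketch below fits a polynomial in the inverse sample size to simulated quantiles and reads off the intercept as the estimate of the asymptotic quantile. It is a simplified illustration: plain least squares is used instead of the GMM estimator of MacKinnon (1996), the quantiles are synthetic numbers generated inside the example, and the function name `response_surface` is hypothetical.

```python
import numpy as np

def response_surface(sample_sizes, quantiles, degree=2):
    """OLS fit of q(T) = theta_inf + theta_1 / T + ... + theta_d / T^d."""
    T = np.asarray(sample_sizes, dtype=float)
    X = np.column_stack([T ** (-k) for k in range(degree + 1)])   # 1, 1/T, ..., 1/T^d
    coef, *_ = np.linalg.lstsq(X, np.asarray(quantiles, dtype=float), rcond=None)
    return coef                                   # coef[0] estimates the asymptotic quantile

# 13 sample sizes x 25 experiments = 325 observations, as in the design described above.
sizes = np.repeat([100, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900, 1000, 1250], 25)
rng = np.random.default_rng(2)
simulated = 35.0 + 120.0 / sizes + rng.normal(0.0, 0.05, sizes.size)   # synthetic quantiles
theta = response_surface(sizes, simulated, degree=2)
```

With heteroskedastic simulation errors, a weighted or GMM-type fit as in the cited methodology would be the more appropriate choice; the least-squares version is kept here only for brevity.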


Table 1 Cointegration with break—asymptotic critical values for 0T () and 0†T () n−r

80%

90%

95%

99%

No deterministic terms, Dt = 0 0 = 0:20 1 2 3 4 5 6

21.42 65.20 132.53 224.37 340.37 481.74

28.40 76.29 147.59 243.33 363.41 508.41

16.02 46.63 93.08 155.68 234.50 330.34

35.24 86.63 161.06 260.12 383.27 531.58

51.20 108.59 188.80 294.20 423.05 577.28

17.47 51.67 103.93 174.77 263.94 372.70

19.98 53.02 101.78 166.69 247.80 345.98

23.75 58.78 109.45 176.29 259.13 359.18

32.11 70.77 124.91 195.37 281.70 385.36

15.93 45.97 91.25 152.21 228.72 321.48 0 = 0:30

1 2 3 4 5 6

26.19 69.94 132.53 214.19 314.32 434.18

43.38 105.60 192.23 304.08 439.81 600.71

51.46 116.98 207.15 322.34 460.98 625.16

69.45 141.06 237.17 358.86 503.14 673.94

0 = 0:40 1 2 3 4 5 6

22.75 60.83 115.30 186.30 273.70 378.30

99%

22.31 59.46 114.46 188.08 280.00 391.48

27.04 66.49 123.92 199.80 293.96 407.64

37.83 81.33 143.13 223.30 321.97 439.57

19.72 51.93 99.41 162.57 241.22 336.14

23.29 57.33 106.56 171.53 251.82 348.64

31.23 68.56 120.91 189.29 272.93 372.81

31.97 78.73 144.09 228.57 331.52 454.03

37.50 86.60 154.38 241.20 346.29 471.21

49.73 102.98 175.11 266.27 375.68 504.71

26.15 64.93 119.25 189.57 275.62 378.14

30.21 70.81 126.99 199.08 286.78 391.24

38.89 82.95 142.32 217.94 308.55 416.51

11.16 47.44 99.59 169.68 258.41 366.59

14.91 53.86 108.59 181.11 272.07 382.26

24.09 67.25 126.86 203.67 298.80 413.78

0 = 0:50 27.40 67.93 124.68 198.15 287.80 394.73

31.75 74.28 132.86 208.34 299.91 408.82

41.19 87.29 149.65 228.40 323.30 435.74

21.81 58.29 110.44 178.53 262.50 362.73

Non-cointegrated constant, Dt = 1, i⊥ i = 0, i = 0; 1 0 = 0:20

0 = 0:30

1 2 3 4 5 6

7.56 40.44 89.45 156.87 242.78 348.18

9.88 52.12 116.47 205.33 318.66 457.36

95%

0 = 0:50

Cointegrated constant, Dt = 1 and i⊥ i = 0, i = 0; 1 0 = 0:20 34.96 92.97 175.71 283.29 415.14 572.36

90%

0 = 0:30

0 = 0:40 1 2 3 4 5 6

80%

15.16 62.57 131.18 224.02 341.12 483.71

20.74 72.05 144.19 240.34 360.84 506.37

34.59 92.56 171.40 273.65 399.98 551.22


Table 1 (continued) n−r

80%

90%

95%

99%

80%

0 = 0:40 1 2 3 4 5 6

6.69 35.77 78.77 137.53 212.63 304.55

9.62 41.42 86.90 147.98 225.41 319.70

12.61 46.53 94.02 157.05 236.43 332.49

19.67 57.39 108.59 175.42 257.82 357.67

t) , i = (4i

1 2 3 4 5 6

53.77 120.03 210.06 325.02 463.54 627.90

25.81 64.69 119.43 190.51 278.00 382.42

45.69 108.13 194.92 306.44 442.08 603.11

67.54 153.00 262.33 395.56 552.82 734.96

71.57 145.02 241.04 362.33 506.55 676.58

30.44 72.03 129.12 202.49 292.20 398.95

34.73 78.52 137.53 212.83 304.35 413.00

78.73 168.68 282.06 419.24 580.25 766.07

43.92 91.87 154.62 233.36 328.19 440.54

t) , i = (4i

89.08 182.64 299.27 439.67 603.77 792.81

110.99 211.20 334.18 480.11 649.75 845.12

0 = 0:40 1 2 3 4 5 6

44.00 100.46 172.72 261.09 365.73 487.18

6.45 34.59 76.21 132.75 204.95 293.12

9.23 39.86 83.75 142.42 216.81 307.25

12.01 44.66 90.40 150.80 226.96 319.23

18.40 54.72 103.79 167.68 247.00 342.58

29.01 73.28 135.91 217.45 317.61 437.22

34.83 82.33 147.92 232.28 335.20 457.57

40.24 90.41 158.25 245.00 349.98 474.63

52.01 107.08 179.28 270.76 379.74 508.22

29.49 69.46 124.40 194.73 280.92 383.26

33.50 75.48 132.19 204.43 292.15 396.37

42.09 87.64 147.89 223.56 314.28 421.72

0 = 0:50

Non-cointegrated trend, Dt = (1 0 = 0:20 1 2 3 4 5 6

99%

i ) and i⊥ i = 0, i = 0; 1 0 = 0:30

0 = 0:40 1 2 3 4 5 6

95%

0 = 0:50

Cointegrated trend, Dt = (1 0 = 0:20 37.05 94.95 177.80 285.11 416.97 574.19

90%

25.09 62.71 115.41 183.60 267.67 367.75

i ) and i⊥ i = 0, i = 0; 1 0 = 0:30 50.72 115.49 198.39 299.75 419.43 558.39

58.44 126.30 212.25 316.18 438.70 580.51

65.39 135.92 224.37 330.68 455.27 599.24

80.22 155.54 248.01 359.29 487.79 635.85

48.06 104.58 176.08 262.85 365.49 484.25

53.33 111.79 185.05 273.76 377.98 498.60

64.09 126.40 202.88 295.10 402.65 526.60

0 = 0:50 50.25 109.31 183.95 274.61 381.47 505.27

55.91 117.03 193.70 286.20 395.01 520.64

67.73 132.90 213.02 309.15 421.24 550.01

42.18 96.26 165.58 250.30 350.78 467.14


This yields 10,000 estimates of the test statistic viewed as a function of the break date. We then estimated the limit distribution of the test statistic of Theorem 4 by retaining, in each replication, the maximum of the estimated statistic over the admissible break dates, from which we selected the relevant quantiles. Table 2 presents the results. Lastly, and as remarked in the text, the asymptotic distribution of the test statistic defined in Theorem 3 is the same as that of the Johansen (1988, 1991) trace statistic. Estimates of the critical values are available in Johansen (1995, Chapter 15) and MacKinnon et al. (1999).

6.2. In-sample size and power

To evaluate the finite-sample properties of the proposed tests, we design some simple Monte Carlo experiments when the date of break is known. To analyze in-sample size performances we simulate a two-dimensional non-cointegrated VAR model with a break fixed at mid-sample (relative break date 0.5, so that $t_0 = [0.5\,T]$) and a lag number p = 1, namely

$$\begin{pmatrix} \Delta X_{1t} \\ \Delta X_{2t} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ L & 1 \end{pmatrix} \begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{pmatrix} \quad \text{for } t \le t_0,$$

$$\begin{pmatrix} \Delta X_{1t} \\ \Delta X_{2t} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -2L & 1 \end{pmatrix} \begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{pmatrix} \quad \text{for } t > t_0, \qquad (18)$$

with $(\varepsilon_{1t} \;\; \varepsilon_{2t})' \sim$ iid $N(0, I_2)$, for three different sample sizes, T = 100, 200 and 500. Table 3 reports rejection frequencies of the null hypothesis, H0: r = 0, at the 5% level in 5000 replications for our test (the statistic defined in Section 4.1), as well as for Johansen's test over the whole sample (Tr statistic) and over each of the two sub-samples (Tr0 and Tr1 statistics), using the relevant critical values from Table 1 and from MacKinnon et al. (1999). The results show that, for small sample sizes, our test is biased towards the null of non-cointegration while the two sub-sample Johansen tests are biased towards the alternative of cointegration. These biases are reduced when the sample size increases. The whole-sample Johansen test performs better, as its in-sample size is very close to the nominal level, even for small samples.

To gauge the in-sample power of our test and to compare it with alternative procedures, we simulate a cointegrated two-dimensional VECM with a break fixed at the same relative date 0.5 and a lag number p = 1, namely

$$\begin{pmatrix} \Delta X_{1t} \\ \Delta X_{2t} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ L & 1 - L \end{pmatrix} \begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{pmatrix} \quad \text{for } t \le t_0,$$

$$\begin{pmatrix} \Delta X_{1t} \\ \Delta X_{2t} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -2L & 1 - L \end{pmatrix} \begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{pmatrix} \quad \text{for } t > t_0, \qquad (19)$$

with $(\varepsilon_{1t} \;\; \varepsilon_{2t})' \sim$ iid $N(0, I_2)$, for three sample sizes T = 100, 200 and 500.
Cointegrating vectors for the first and second sub-samples are, respectively, $\beta_0 = (-1 \;\; 1)'$ and $\beta_1 = (2 \;\; 1)'$.
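The cointegrated design can be simulated in a few lines. The sketch below assumes that DGP (19) is read as a moving-average representation, that is, ΔX1t = ε1t and ΔX2t = c ε1,t−1 + ε2t − ε2,t−1 with c = 1 up to the break and c = −2 afterwards; it also assumes zero pre-sample shocks and X0 = 0. The function name `simulate_dgp19` is hypothetical. Under this reading, X2t − X1,t−1 equals ε2t in the first regime, and 2X1,t−1 + X2t differs from ε2t only by a constant in the second regime, which matches the cointegrating vectors quoted above.

```python
import numpy as np

def simulate_dgp19(T, break_frac=0.5, seed=0):
    """Simulate the two-regime cointegrated system under the stated assumptions."""
    rng = np.random.default_rng(seed)
    t0 = int(break_frac * T)
    eps = np.zeros((T + 1, 2))                    # eps[0] is the (zero) pre-sample shock
    eps[1:] = rng.standard_normal((T, 2))
    dX = np.zeros((T + 1, 2))
    for t in range(1, T + 1):
        c = 1.0 if t <= t0 else -2.0              # coefficient on the lagged first shock
        dX[t, 0] = eps[t, 0]
        dX[t, 1] = c * eps[t - 1, 0] + eps[t, 1] - eps[t - 1, 1]
    X = np.cumsum(dX[1:], axis=0)                 # X[k] holds X_{k+1}, with X_0 = 0
    return X, eps[1:], t0

X, eps, t0 = simulate_dgp19(T=200)
# First regime: X2_t - X1_{t-1} for t = 2, ..., t0.
gap_first = X[1:t0, 1] - X[0:t0 - 1, 0]
# Second regime: 2 * X1_{t-1} + X2_t for t = t0 + 1, ..., T.
combo_second = 2.0 * X[t0 - 1:-1, 0] + X[t0:, 1]
```

Feeding such simulated paths to a rank test, over many replications, gives rejection frequencies of the kind reported in the tables of this section.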


Table 2 Cointegration with break—critical values for 0∗T . Sample sizes: T = 250 (top 3gure) and T = 1000 (bottom 3gure) n−r

1

2

No deterministic terms, Dt = 0 80% 61.22 144.27 71.29 163.99 90% 75.08 164.30 86.85 186.21 95% 88.97 181.83 100.87 206.01 99% 119.53 222.60 131.40 247.49

3

4

260.70 296.53 286.85 323.42 310.65 348.72 360.43 401.47

408.85 466.55 440.91 498.45 469.13 529.67 529.45 593.58

583.25 675.29 622.40 716.78 657.94 752.41 729.34 828.85

788.30 922.81 832.68 973.79 871.26 1013.63 940.29 1102.20

i = 0; 1 477.02 539.30 509.60 576.59 539.33 606.32 596.77 678.17

665.23 767.25 704.45 807.76 739.84 846.42 819.27 917.01

879.86 1029.84 927.27 1081.92 965.32 1122.42 1042.84 1206.34

542.03 606.40 578.26 644.95 608.80 681.31 679.11 754.13

743.31 847.91 784.21 890.58 822.61 928.18 903.70 1009.13

Cointegrated constant, Dt = 1 and ⊥ i = 0, 80% 86.66 183.89 314.92 97.23 205.81 354.73 90% 102.14 206.16 342.61 114.06 228.99 385.69 95% 116.93 226.23 365.17 129.47 249.60 412.28 99% 150.58 268.04 417.03 169.15 292.49 466.50

Non-cointegrated constant, Dt = 1, ⊥ i = 0, i = 0; 1 80% 39.98 111.87 224.41 368.74 42.40 124.21 247.98 406.72 90% 52.37 131.03 249.12 397.46 54.24 144.21 274.88 438.10 95% 64.72 147.51 268.58 423.84 66.93 161.50 298.28 468.47 99% 94.04 186.82 315.13 477.03 95.56 200.36 349.91 528.00 Cointegrated trend, Dt = (1 t) , i = (4i 80% 79.95 181.97 314.61 93.27 206.43 355.71 90% 95.85 202.82 342.86 110.88 228.81 385.99 95% 111.35 222.92 365.98 125.95 250.40 414.18 99% 149.76 264.06 417.67 162.24 294.75 466.06 Non-cointegrated trend, Dt = (1 80% 129.99 252.62 157.55 307.71 90% 148.24 276.09 177.63 333.56 95% 165.25 295.68 196.03 357.88 99% 200.93 339.31 236.10 409.83

5

6

i ) and ⊥ i = 0, i = 0; 1 477.95 666.52 882.36 542.97 770.56 1036.96 510.41 706.71 928.39 578.07 815.66 1087.05 541.12 744.09 967.35 608.98 853.88 1131.27 599.52 815.73 1043.65 671.80 929.07 1217.64

t) , i = (4i i ) and ⊥ i = 0, 401.58 575.15 771.54 492.30 712.29 971.45 430.14 610.93 812.04 525.44 751.23 1015.96 456.57 644.06 849.18 555.40 785.73 1054.51 512.89 707.54 923.92 616.15 861.05 1139.50

i = 0; 1 991.90 1268.17 1039.51 1319.36 1084.38 1366.81 1164.55 1457.44

P. Andrade et al. / Journal of Econometrics 124 (2005) 269 – 310

295

Table 3 In-sample size T = 100 Probability to reject the null (r = 0) with DGP 0T (0 ) 0.026 Tr 0.061 Tr0 0.093 0.080 Tr1 +1 Tr 0.090

T = 200 (18)a

0.045 0.049 0.070 0.068 0.070

T = 500 0.061 0.045 0.057 0.051 0.052

a Rejection frequency at the 5% level using critical values from Table 1 and from MacKinnon et al. (1999). There are 5000 replications in each experiments. 0T (0 ) denotes the test statistic de3ned in Section 4.1. Tr, + 1 , respectively denote Johansen’s trace test statistics for the whole sample, each of the two Tr0 , Tr1 and Tr sub-samples, and the second one only but with a (cointegrated) constant in the VECM.

Table 4
In-sample power: probability to reject the null (r = 0) with DGP (19) (a)

Statistic                 T = 100    T = 200    T = 500
Test of Section 4.1       0.688      0.933      0.988
                          (0.805)    (0.937)    (0.985)
Tr                        0.062      0.067      0.988
                          (0.053)    (0.069)    (0.985)
Tr0                       0.872      0.939      0.988
                          (0.833)    (0.932)    (0.985)
Tr1                       0.087      0.091      0.108
                          (0.070)    (0.083)    (0.107)
Tr1 with constant         0.997      1.00       1.00
                          (0.991)    (1.00)     (1.00)

(a) Rejection frequency at the 5% level of significance using critical values from Table 1 and from MacKinnon et al. (1999). In parentheses are the size-adjusted rejection frequencies based on critical values from Table 3. There are 5000 replications in each experiment.

Table 4 reports the rejection frequencies of the null hypothesis, H0: r = 0, in 5000 replications at the 5% level of significance for the same four tests. In light of the different in-sample size biases of these tests, we provide both nominal power (based on asymptotic critical values) and size-adjusted power (based on the finite-sample critical values obtained from the size experiments). Our test now does a much better job than Johansen's test over the whole sample. This result simply confirms the finding of Gregory et al. (1996), who showed that the in-sample power of standard cointegration tests falls sharply when cointegrating vectors are subject to a structural break. Moreover, if one conducts two separate Johansen tests over the two sub-samples defined by the (known) timing of the break, one also rejects cointegration too often over the second sub-sample, even though it actually holds. The reason lies in the impact of the second-regime initial value, which is drawn from the first-regime DGP and acts as a


Table 5
In-sample size and power with the break date misspecified (a)

Statistic                 T = 100    T = 200    T = 500

Probability to reject the null (r = 0) with DGP (18) (b)
Test of Section 4.1       0.007      0.011      0.020
Tr0                       0.102      0.071      0.063
Tr1                       0.093      0.084      0.075
Tr1 with constant         0.266      0.212      0.198

Probability to reject the null (r = 0) under Ha with DGP (19) (c)
Test of Section 4.1       0.281      0.858      0.969
                          (0.701)    (0.928)    (0.983)
Tr0                       0.836      0.919      0.976
                          (0.763)    (0.906)    (0.973)
Tr1                       0.095      0.094      0.105
                          (0.062)    (0.067)    (0.086)
Tr1 with constant         0.739      0.674      0.657
                          (0.340)    (0.371)    (0.381)

(a) True value of the relative break date: 0.5. Test specification: 0.4.
(b) Rejection frequency at the 5% level of significance using critical values from Table 1 and from MacKinnon et al. (1999). There are 5000 replications in each experiment.
(c) Rejection frequency at the 5% level of significance using critical values from Table 1 and from MacKinnon et al. (1999). In parentheses are the size-adjusted rejection frequencies based on critical values from Table 3. There are 5000 replications in each experiment.

(cointegrated) constant that is not accounted for in the regression of the Johansen test implemented over the second sub-period. One may wonder whether introducing such a constant term into the second-regime test regression would solve the problem. We thus also report in Tables 3 and 4, as a modified version of the Tr1 statistic, the in-sample size and power performances of a Johansen test, with a cointegrating constant introduced into the VECM, applied to the second regime. While such a test suffers from small-sample size bias, it solves, both in nominal and size-adjusted terms, the low in-sample power associated with the Tr1 test statistic. However, if the date of the regime shift is misspecified, conducting two separate Johansen tests over each sub-sample (provided deterministic terms adjustments are made) yields poorer in-sample performance than our whole-sample approach. We illustrate this claim by conducting the same experiments as before, yet letting the econometrician choose a relative break date of 0.4 instead of the true value 0.5. The results are given in Table 5. The in-sample size bias of each testing procedure is amplified compared to the case of exact timing of the break. The deterioration is particularly marked for the Tr1 test statistic. Yet a size-adjusted power comparison of the different tests shows that our global approach should be preferred to separate Johansen-type analyses of the sub-samples (with or without a constant term in the second-regime test regression). Indeed, when the sample size increases, misspecifying the regime shift date introduces more data from the first regime into the second one. This leads to more erroneous rejection frequencies of the null hypothesis over the second

P. Andrade et al. / Journal of Econometrics 124 (2005) 269 – 310

297

regime. Rather the probability to reject the null for the 3rst regime rises with the sample size as usually. Intuitively, our global approach has the comparative advantage to compensate for the low frequency of rejecting the null over the second regime with the high probability to reject over the 3rst regime.

7. Conclusion

In this paper we propose different tests to identify the cointegration rank of a multivariate non-stationary system when the cointegration space is subject to a structural change of known timing. As a by-product of the testing strategy, we also provide an estimation method based on principal components analysis. Once the cointegration rank is consistently identified, this estimation method yields consistent, but inefficient, estimates of the cointegrating vectors as well as consistent, asymptotically normal estimators of the loading factors and short-term dynamics parameters. Though the efficiency property is not met, this method is very simple to implement compared with the numerical methods that maximum-likelihood estimation would require.

The tests developed here should be viewed as statistical tools complementary to several already existing in the literature which also focus on long-run parameter instability in cointegrated systems but where cointegration is assumed in the maintained hypothesis. These latter procedures are thus conducted conditionally on a known cointegration rank. Our testing procedures should therefore be helpful to test for the value of this parameter. Indeed, as has been previously documented, in-sample experiments show that standard cointegration tests suffer from very low power when applied to cointegrated models with shifting long-run relationships. In-sample experiments also illustrate that a testing strategy which relies on a global modeling of the two regimes, as proposed here, may be preferred to two separate sub-sample analyses.

While we also deal here with the case of an unknown timing of the break, in the simple case where the loading factors do not shift, we do not provide a statistical device to estimate this parameter. This could be done by following Bai et al. (1998), who design inference methods for dating a break in stationary and cointegrated multivariate systems. Yet their methodology relies on a triangular representation of the system dynamics (and on some stronger assumptions on the innovation terms). Implementing it therefore requires a priori knowledge of the cointegration rank as well as of the variables that enter the cointegration relationships with a non-zero coefficient. The rank test for any timing of the break developed here should thus be useful to confront the presumed number of cointegrating directions with the available sample information.

Acknowledgements

We thank one editor and two anonymous referees for their helpful comments on an earlier draft. All remaining errors are ours.


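Appendix A. Proof of Theorem 1

For notational convenience, we only deal with the simplest case where $D_t = 0$ in (1). Let $P_i$ and $P_{i\perp}$ be the matrices of orthogonal projections onto $\mathrm{sp}(\beta_i)$ and its supplement $\mathrm{sp}(\beta_{i\perp})$, $i = 0, 1$. From the property of supplementary subspaces, $\Delta X_t = (P_i + P_{i\perp})\Delta X_t$, $i = 0, 1$. Also let $\bar\beta_i = \beta_i(\beta_i'\beta_i)^{-1}$ and $\bar\beta_{i\perp} = \beta_{i\perp}(\beta_{i\perp}'\beta_{i\perp})^{-1}$, $i = 0, 1$, so that $P_i = \bar\beta_i\beta_i'$ and $P_{i\perp} = \bar\beta_{i\perp}\beta_{i\perp}'$. Summing $\Delta X_j$ from $j = 1, \ldots, t$ leads to

$$X_t = P_{0\perp}X_0 + \bar\beta_0\beta_0'X_t + \bar\beta_{0\perp}\sum_{j=1}^{t}\beta_{0\perp}'\Delta X_j, \qquad t \le t_0. \tag{A.1}$$

The same holds for the second regime, $X_t = P_{1\perp}X_{t_0} + \bar\beta_1\beta_1'X_t + \bar\beta_{1\perp}\sum_{j=t_0+1}^{t}\beta_{1\perp}'\Delta X_j$, $t > t_0$, which is equivalent to

$$X_t - X_{t_0} = \bar\beta_1\beta_1'(X_t - X_{t_0}) + \bar\beta_{1\perp}\sum_{i=t_0+1}^{t}\beta_{1\perp}'\Delta X_i, \qquad t > t_0. \tag{A.2}$$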
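The splitting of $\Delta X_t$ along a cointegrating basis and its orthogonal complement, which underlies (A.1) and (A.2), can be checked numerically. The sketch below is only an illustration under stated assumptions: the matrix `beta` is a randomly drawn, hypothetical set of cointegrating vectors, and the helper names `orth_complement` and `bar` are not from the paper.

```python
import numpy as np

def orth_complement(beta):
    """Return an orthonormal basis of the orthogonal complement of span(beta)."""
    n, r = beta.shape
    # Full SVD: the last n - r left singular vectors span the complement.
    u, _, _ = np.linalg.svd(beta, full_matrices=True)
    return u[:, r:]

def bar(m):
    """Return m (m'm)^{-1}, so that m' bar(m) is the identity."""
    return m @ np.linalg.inv(m.T @ m)

rng = np.random.default_rng(0)
n, r = 4, 2
beta = rng.standard_normal((n, r))      # hypothetical cointegrating vectors
beta_perp = orth_complement(beta)

# The two projections add up to the identity matrix.
identity = bar(beta) @ beta.T + bar(beta_perp) @ beta_perp.T

# Hence any increment splits exactly into its two components.
dx = rng.standard_normal(n)
recombined = bar(beta) @ (beta.T @ dx) + bar(beta_perp) @ (beta_perp.T @ dx)
```

Summing the recombined increments over time is what produces the level term in the stationary directions and the cumulated term in the non-stationary ones.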

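Eq. (1) can also be rewritten as

$$(P_0 + P_{0\perp})\Delta X_t = \alpha_0\beta_0'X_{t-1} + \sum_{j=1}^{p}\Gamma_j(P_0 + P_{0\perp})\Delta X_{t-j} + \varepsilon_t$$

$$= (\alpha_0 + \Gamma_1\bar\beta_0)\beta_0'X_{t-1} - \Gamma_1\bar\beta_0\beta_0'X_{t-2} + \sum_{j=2}^{p}\Gamma_j\bar\beta_0\beta_0'\Delta X_{t-j} + \sum_{j=1}^{p}\Gamma_j\bar\beta_{0\perp}\beta_{0\perp}'\Delta X_{t-j} + \varepsilon_t, \qquad t \le t_0,$$

which is also equivalent to

$$\bar\beta_0\beta_0'X_t + \bar\beta_{0\perp}\beta_{0\perp}'\Delta X_t = (\alpha_0 + (I_n + \Gamma_1)\bar\beta_0)\beta_0'X_{t-1} + \sum_{j=2}^{p}(\Gamma_j - \Gamma_{j-1})\bar\beta_0\beta_0'X_{t-j} - \Gamma_p\bar\beta_0\beta_0'X_{t-p-1} + \sum_{j=1}^{p}\Gamma_j\bar\beta_{0\perp}\beta_{0\perp}'\Delta X_{t-j} + \varepsilon_t, \qquad t \le t_0.$$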

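Pre-multiplying the latter expression by $(\beta_0 \;\; \beta_{0\perp})'$, we get the following companion form:

$$\mathbf{Z}^0_t = \mathbf{A}_0\mathbf{Z}^0_{t-1} + E^0_t, \qquad t \le t_0, \tag{A.3}$$

with

$$\mathbf{Z}^0_t = (X_t'\beta_0 \;\; \Delta X_t'\beta_{0\perp} \;\; X_{t-1}'\beta_0 \;\; \Delta X_{t-1}'\beta_{0\perp} \;\; \cdots \;\; X_{t-p+1}'\beta_0 \;\; \Delta X_{t-p+1}'\beta_{0\perp} \;\; X_{t-p}'\beta_0)',$$

$$E^0_t = (\varepsilon_t'\beta_0 \;\; \varepsilon_t'\beta_{0\perp} \;\; 0 \;\; \cdots \;\; 0)', \qquad \mathbf{A}_0 = \begin{pmatrix} A_0 & B_0 \\ C & D \end{pmatrix},$$

where

$$A_0 = \begin{pmatrix} \beta_0'(I_n + \Gamma_1)\bar\beta_0 + \beta_0'\alpha_0 & \beta_0'\Gamma_1\bar\beta_{0\perp} \\ \beta_{0\perp}'(I_n + \Gamma_1)\bar\beta_0 + \beta_{0\perp}'\alpha_0 & \beta_{0\perp}'\Gamma_1\bar\beta_{0\perp} \end{pmatrix},$$

$$B_0 = \begin{pmatrix} \beta_0'(\Gamma_2 - \Gamma_1)\bar\beta_0 & \beta_0'\Gamma_2\bar\beta_{0\perp} & \cdots & \beta_0'(\Gamma_p - \Gamma_{p-1})\bar\beta_0 & \beta_0'\Gamma_p\bar\beta_{0\perp} & -\beta_0'\Gamma_p\bar\beta_0 \\ \beta_{0\perp}'(\Gamma_2 - \Gamma_1)\bar\beta_0 & \beta_{0\perp}'\Gamma_2\bar\beta_{0\perp} & \cdots & \beta_{0\perp}'(\Gamma_p - \Gamma_{p-1})\bar\beta_0 & \beta_{0\perp}'\Gamma_p\bar\beta_{0\perp} & -\beta_{0\perp}'\Gamma_p\bar\beta_0 \end{pmatrix},$$

$$C = \begin{pmatrix} I_r & 0 \\ 0 & I_{n-r} \\ 0 & 0 \end{pmatrix}, \qquad D = \begin{pmatrix} 0 & 0 \\ I_{n(p-2)+r} & 0 \end{pmatrix}.$$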

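The same steps also give us a companion form for the second regime:

$$\mathbf{Z}^1_t = \mathbf{A}_1\mathbf{Z}^1_{t-1} + E^1_t, \qquad t > t_0, \tag{A.4}$$

with

$$\mathbf{Z}^1_t = ((X_t - X_{t_0})'\beta_1 \;\; \Delta X_t'\beta_{1\perp} \;\; (X_{t-1} - X_{t_0})'\beta_1 \;\; \Delta X_{t-1}'\beta_{1\perp} \;\; \cdots \;\; (X_{t-p+1} - X_{t_0})'\beta_1 \;\; \Delta X_{t-p+1}'\beta_{1\perp} \;\; (X_{t-p} - X_{t_0})'\beta_1)',$$

$$E^1_t = (\varepsilon_t'\beta_1 \;\; \varepsilon_t'\beta_{1\perp} \;\; 0 \;\; \cdots \;\; 0)', \qquad \mathbf{A}_1 = \begin{pmatrix} A_1 & B_1 \\ C & D \end{pmatrix},$$

where $A_1$ and $B_1$ are defined as $A_0$ and $B_0$ with $\alpha_1$, $\beta_1$, $\bar\beta_1$, $\beta_{1\perp}$ and $\bar\beta_{1\perp}$ replacing, respectively, $\alpha_0$, $\beta_0$, $\bar\beta_0$, $\beta_{0\perp}$ and $\bar\beta_{0\perp}$. Solving (A.3) and (A.4) leads to

$$\mathbf{Z}^0_t = \mathbf{A}_0^{t}\mathbf{Z}^0_0 + \sum_{j=0}^{t-1}\mathbf{A}_0^{j}E^0_{t-j}, \qquad t \le t_0,$$

$$\mathbf{Z}^1_t = \mathbf{A}_1^{t-t_0}\mathbf{Z}^1_{t_0} + \sum_{j=0}^{t-t_0-1}\mathbf{A}_1^{j}E^1_{t-j}, \qquad t > t_0.$$

Choosing $\mathbf{Z}^0_0$ to have the invariant distribution given by $(\mathbf{Z}^0_0)^* = \sum_{j=0}^{\infty}\mathbf{A}_0^{j}E^0_{-j}$ provides us with the following representation: $\mathbf{Z}^0_t = \sum_{j=0}^{\infty}\mathbf{A}_0^{j}E^0_{t-j} = \mathbf{C}_0(L)E^0_t$. By contrast, it is not possible to choose $\mathbf{Z}^1_{t_0}$ such that $\mathbf{Z}^1_t$ has an invariant distribution. Indeed, $\mathbf{Z}^1_{t_0}$ is determined from the first-regime characterization of the DGP. The process $\mathbf{Z}^1_t$ is therefore non-stationary. However, it is asymptotically exponentially stationary. Indeed, when $t$ goes to infinity, the influence of the initial conditions given by $\mathbf{A}_1^{t-t_0}\mathbf{Z}^1_{t_0}$ vanishes and $\mathbf{Z}^1_t - (\mathbf{Z}^1_t)^* \xrightarrow{L^2} 0$, where $(\mathbf{Z}^1_t)^*$ has the invariant distribution $(\mathbf{Z}^1_t)^* = \sum_{j=0}^{+\infty}\mathbf{A}_1^{j}E^1_{t-j} = \mathbf{C}_1(L)E^1_t$.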
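The closed-form solution of the companion recursion, and the way the contribution of the initial condition dies out, can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not from the paper: the matrix `a` is a randomly drawn stand-in for a stable companion matrix (rescaled to spectral radius 0.8), and the shocks are Gaussian.

```python
import numpy as np

rng = np.random.default_rng(1)
m, t_end = 3, 60

# Hypothetical stable companion matrix: rescale to spectral radius 0.8.
a = rng.standard_normal((m, m))
a *= 0.8 / np.max(np.abs(np.linalg.eigvals(a)))

shocks = rng.standard_normal((t_end + 1, m))   # shocks[t] plays the role of E_t
z0 = 10.0 * rng.standard_normal(m)             # arbitrary initial condition

# Iterate Z_t = A Z_{t-1} + E_t for t = 1, ..., t_end.
z = z0.copy()
for t in range(1, t_end + 1):
    z = a @ z + shocks[t]

# Closed form: Z_t = A^t Z_0 + sum_{j=0}^{t-1} A^j E_{t-j}.
closed = np.linalg.matrix_power(a, t_end) @ z0 + sum(
    np.linalg.matrix_power(a, j) @ shocks[t_end - j] for j in range(t_end)
)

# Size of the initial-condition term A^t Z_0, which shrinks geometrically in t.
init_effect = np.linalg.norm(np.linalg.matrix_power(a, t_end) @ z0)
```

In the second regime the starting value is inherited from the first regime rather than drawn from the invariant distribution, which is why only asymptotic stationarity can be claimed there.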

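Applying the selection matrix $K = (I_n \;\; 0)$ to $\mathbf{Z}^0_t$ gives us

$$\begin{pmatrix} \beta_0'X_t \\ \beta_{0\perp}'\Delta X_t \end{pmatrix} = K\sum_{j=0}^{\infty}\mathbf{A}_0^{j}K'\begin{pmatrix} \beta_0' \\ \beta_{0\perp}' \end{pmatrix}\varepsilon_{t-j}$$

$$= \sum_{j=0}^{\infty}K\mathbf{A}_0^{j}K'\begin{pmatrix} \beta_0' \\ \beta_{0\perp}' \end{pmatrix}\varepsilon_t - \sum_{j=0}^{\infty}K\mathbf{A}_0^{j}K'\begin{pmatrix} \beta_0' \\ \beta_{0\perp}' \end{pmatrix}(\varepsilon_t - \varepsilon_{t-j})$$

$$= K(I - \mathbf{A}_0)^{-1}K'\begin{pmatrix} \beta_0' \\ \beta_{0\perp}' \end{pmatrix}\varepsilon_t + (1 - L)K\mathbf{A}_0^{*}(L)K'\begin{pmatrix} \beta_0' \\ \beta_{0\perp}' \end{pmatrix}\varepsilon_t$$

$$= C_0(1)\varepsilon_t + (1 - L)C_0^{*}(L)\varepsilon_t = C_0(L)\varepsilon_t, \qquad t \le t_0, \tag{A.5}$$

with $\mathbf{A}_0^{*}(L) = \sum_{j=0}^{\infty}\tilde{\mathbf{A}}_{0j}L^{j}$ and $\tilde{\mathbf{A}}_{0j} = -\sum_{i=j+1}^{\infty}\mathbf{A}_0^{i}$ satisfying $\sum_{j=0}^{\infty}\mathrm{tr}(\tilde{\mathbf{A}}_{0j}\tilde{\mathbf{A}}_{0j}')^{1/2} < \infty$.

Things are a little more complicated for the second regime. Applying the same selection matrix we obtain

$$\begin{pmatrix} \beta_1'(X_t - X_{t_0}) \\ \beta_{1\perp}'\Delta X_t \end{pmatrix} = K\mathbf{A}_1^{t-t_0}\mathbf{Z}^1_{t_0} + K\sum_{j=0}^{t-t_0-1}\mathbf{A}_1^{j}K'\begin{pmatrix} \beta_1' \\ \beta_{1\perp}' \end{pmatrix}\varepsilon_{t-j}$$

$$= K\mathbf{A}_1^{t-t_0}\mathbf{Z}^1_{t_0} + K(I - \mathbf{A}_1)^{-1}(I - \mathbf{A}_1^{t-t_0})K'\begin{pmatrix} \beta_1' \\ \beta_{1\perp}' \end{pmatrix}\varepsilon_t + \sum_{j=0}^{t-t_0}K\tilde{\mathbf{A}}_{1j,t-t_0}K'\begin{pmatrix} \beta_1' \\ \beta_{1\perp}' \end{pmatrix}\Delta\varepsilon_{t-j}, \qquad t > t_0,$$

with $\tilde{\mathbf{A}}_{1j,t-t_0} = -\sum_{i=j+1}^{t-t_0+1}\mathbf{A}_1^{i}$. Remarking that $K\mathbf{A}_1^{t-t_0}\mathbf{Z}^1_{t_0} \sim o_p(1)$, one gets, for $t > t_0$,

$$\begin{pmatrix} \beta_1'(X_t - X_{t_0}) \\ \beta_{1\perp}'\Delta X_t \end{pmatrix} = K(I - \mathbf{A}_1)^{-1}K'\begin{pmatrix} \beta_1' \\ \beta_{1\perp}' \end{pmatrix}\varepsilon_t + (1 - L)K\mathbf{A}_{1,t-t_0}^{*}(L)K'\begin{pmatrix} \beta_1' \\ \beta_{1\perp}' \end{pmatrix}\varepsilon_t + \varpi_t$$

$$= C_1(1)\varepsilon_t + (1 - L)C_{1,t-t_0}^{*}(L)\varepsilon_t + \varpi_t = C_{1,t-t_0}(L)\varepsilon_t + \varpi_t, \tag{A.6}$$

with $\varpi_t \sim o_p(1)$.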
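The decomposition $C(L) = C(1) + (1 - L)C^{*}(L)$ used in (A.5) and (A.6) can be verified on a simple case. The sketch below assumes moving-average coefficients of the form $A^{j}$ for a small, hypothetical stable matrix `a`, and sets the shocks to zero before date 0 so that all sums are finite; none of these choices comes from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
m, t = 2, 40
a = np.array([[0.5, 0.2], [-0.1, 0.3]])   # hypothetical stable matrix
eps = rng.standard_normal((t + 1, m))     # eps[s] for s = 0..t, zero before date 0

inv = np.linalg.inv(np.eye(m) - a)
c_one = inv                               # C(1) = sum_{j>=0} A^j

def c_star(j):
    """Return C*_j = -sum_{i>j} A^i = -A^{j+1} (I - A)^{-1}."""
    return -np.linalg.matrix_power(a, j + 1) @ inv

# First differences of the shocks, with eps[-1] = 0 by convention.
d_eps = np.vstack([eps[:1], np.diff(eps, axis=0)])

# Direct moving-average sum versus long-run term plus differenced remainder.
direct = sum(np.linalg.matrix_power(a, j) @ eps[t - j] for j in range(t + 1))
bn = c_one @ eps[t] + sum(c_star(j) @ d_eps[t - j] for j in range(t + 1))
```

The first term carries the long-run impact of a shock, while the differenced remainder is stationary and disappears from cumulated sums up to end effects.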


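Using the partitioned matrix inversion formula, it turns out that, for $i = 0, 1$, we have $K(I - \mathbf{A}_i)^{-1}K' = [(I - A_i) - B_i(I - D)^{-1}C]^{-1}$. Some algebra moreover shows that, for $i = 0, 1$,

$$K(I - \mathbf{A}_i)^{-1}K' = \begin{pmatrix} -\beta_i'\alpha_i & -\beta_i'\Big(\sum_{j=1}^{p}\Gamma_j\Big)\bar\beta_{i\perp} \\ -\beta_{i\perp}'\alpha_i & I_{n-r} - \beta_{i\perp}'\Big(\sum_{j=1}^{p}\Gamma_j\Big)\bar\beta_{i\perp} \end{pmatrix}^{-1} = \begin{pmatrix} -\beta_i'\alpha_i & \beta_i'\Psi\bar\beta_{i\perp} \\ -\beta_{i\perp}'\alpha_i & \beta_{i\perp}'\Psi\bar\beta_{i\perp} \end{pmatrix}^{-1},$$

with $\Psi = (I_n - \sum_{j=1}^{p}\Gamma_j)$ and, for $i = 0, 1$,

$$(-\alpha_i \;\; \Psi\bar\beta_{i\perp})^{-1} = \begin{pmatrix} -I_r & \bar\alpha_i'\Psi\bar\beta_{i\perp} \\ 0 & \bar\alpha_{i\perp}'\Psi\bar\beta_{i\perp} \end{pmatrix}^{-1}\begin{pmatrix} \bar\alpha_i' \\ \bar\alpha_{i\perp}' \end{pmatrix} = \begin{pmatrix} -(\alpha_i'\alpha_i)^{-1}\alpha_i'[I_n - \Psi\beta_{i\perp}(\alpha_{i\perp}'\Psi\beta_{i\perp})^{-1}\alpha_{i\perp}'] \\ (\beta_{i\perp}'\beta_{i\perp})(\alpha_{i\perp}'\Psi\beta_{i\perp})^{-1}\alpha_{i\perp}' \end{pmatrix}. \tag{A.7}$$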
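The partitioned matrix inversion step can be checked numerically on an arbitrary block matrix. In the sketch below, `big` is a randomly drawn stand-in for the companion matrix (an assumption made only for illustration), partitioned into blocks named `a`, `b`, `c`, `d` in the same positions as $A_i$, $B_i$, $C$ and $D$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, q = 3, 5   # size of the top-left block and of the remaining block

# Hypothetical companion-type matrix, scaled so that I - big is invertible.
big = 0.2 * rng.standard_normal((n + q, n + q))
a, b = big[:n, :n], big[:n, n:]
c, d = big[n:, :n], big[n:, n:]

# Selection matrix K = (I_n 0) picks the top-left block of the inverse.
k = np.hstack([np.eye(n), np.zeros((n, q))])
lhs = k @ np.linalg.inv(np.eye(n + q) - big) @ k.T

# Same block from the inverse of the Schur complement of I - d.
schur = (np.eye(n) - a) - b @ np.linalg.inv(np.eye(q) - d) @ c
rhs = np.linalg.inv(schur)
```

Working with the Schur complement reduces the long-run impact matrix to an $n \times n$ problem, whatever the lag length $p$.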

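If there is cointegration with a break, then, by application of Definition 4, $C_i(z)$, $i = 0, 1$, has all its roots strictly outside the unit circle and thus $C_i(1)$ is invertible. Therefore,

$$A_i(1) = C_i(1)^{-1} = \left[\begin{pmatrix} -I_r & \bar\alpha_i'\Psi\bar\beta_{i\perp} \\ 0 & \bar\alpha_{i\perp}'\Psi\bar\beta_{i\perp} \end{pmatrix}^{-1}\begin{pmatrix} \bar\alpha_i' \\ \bar\alpha_{i\perp}' \end{pmatrix}\right]^{-1}, \qquad i = 0, 1,$$

is such that $|A_i(1)| \neq 0$, which implies that $(\alpha_{i\perp}'\Psi\beta_{i\perp})$ is of full rank $n - r$. Conversely, if $A_i(z) = C_i(z)^{-1}$ has no unit root, i.e. $(\alpha_{i\perp}'\Psi\beta_{i\perp})$ is of full rank $n - r$, then

$$(X_t'\beta_0 \;\; \Delta X_t'\beta_{0\perp})' \quad \text{and} \quad ((X_t - X_{t_0})'\beta_1 \;\; \Delta X_t'\beta_{1\perp})'$$

have stationary and asymptotically stationary representations (A.5) and (A.6), so that cointegration with a structural break holds. This shows that the property for $(\alpha_{i\perp}'\Psi\beta_{i\perp})$ to be of full rank $n - r$ is a necessary and sufficient condition to ensure cointegration in model (1). Finally, remark that (A.5) and (A.6) can be rewritten as

$$\begin{pmatrix} \beta_0'X_t \\ \beta_{0\perp}'\Delta X_t \end{pmatrix} = \begin{pmatrix} -(\alpha_0'\alpha_0)^{-1}\alpha_0'[I_n - \Psi\beta_{0\perp}(\alpha_{0\perp}'\Psi\beta_{0\perp})^{-1}\alpha_{0\perp}'] \\ (\beta_{0\perp}'\beta_{0\perp})(\alpha_{0\perp}'\Psi\beta_{0\perp})^{-1}\alpha_{0\perp}' \end{pmatrix}\varepsilon_t + (1 - L)C_0^{*}(L)\varepsilon_t, \tag{A.8}$$

for $t \le t_0$ and, for $t > t_0$,

$$\begin{pmatrix} \beta_1'(X_t - X_{t_0}) \\ \beta_{1\perp}'\Delta X_t \end{pmatrix} = \begin{pmatrix} -(\alpha_1'\alpha_1)^{-1}\alpha_1'[I_n - \Psi\beta_{1\perp}(\alpha_{1\perp}'\Psi\beta_{1\perp})^{-1}\alpha_{1\perp}'] \\ (\beta_{1\perp}'\beta_{1\perp})(\alpha_{1\perp}'\Psi\beta_{1\perp})^{-1}\alpha_{1\perp}' \end{pmatrix}\varepsilon_t + (1 - L)C_{1,t-t_0}^{*}(L)\varepsilon_t + \varpi_t. \tag{A.9}$$

Now inserting these expressions into Eqs. (A.1) and (A.2) and making use of (A.5) and (A.6) gives us

$$X_t = P_{0\perp}X_0 + \bar\beta_0(I_r \;\; 0)C_0(L)\varepsilon_t + \bar C_0\sum_{i=1}^{t}\varepsilon_i + \bar\beta_{0\perp}(0 \;\; I_{n-r})C_0^{*}(L)\varepsilon_t, \qquad t \le t_0,$$

$$X_t - X_{t_0} = \bar\beta_1(I_r \;\; 0)C_1(L)\varepsilon_t + \bar C_1\sum_{i=t_0+1}^{t}\varepsilon_i + \bar\beta_{1\perp}(0 \;\; I_{n-r})C_{1,t-t_0}^{*}(L)\varepsilon_t + \omega_t, \qquad t > t_0,$$

with $\bar C_i = \beta_{i\perp}(\alpha_{i\perp}'\Psi\beta_{i\perp})^{-1}\alpha_{i\perp}'$, $i = 0, 1$, and $\omega_t \sim o_p(1)$. Let

$$\bar C_0(L) = \bar\beta_0(I_r \;\; 0)C_0(L) + \bar\beta_{0\perp}(0 \;\; I_{n-r})C_0^{*}(L),$$

$$\bar C_{1,t-t_0}(L) = \bar\beta_1(I_r \;\; 0)C_1(L) + \bar\beta_{1\perp}(0 \;\; I_{n-r})C_{1,t-t_0}^{*}(L);$$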

one 3nally gets Eqs. (2) and (3). Appendix B. Proof of Lemma 2 From the normalization constraint ˆ ˆ −1 j ˆ = Ir , ˆ is an n × r sub-matrix of the matrix square root of ˆ j determined up to an r × r unitary transformation. Our identi3cation scheme allows us to select one particular√square root of ˆ j . −1 −1 1 T (ˆ −1 As ˆ −1 j converges to j at rate √T : j − j ) = Op (1), and as eigenvectors are continuous functions of their associated matrix (see Theorem 8 of Chapter 8 in Magnus and Neudeucker, 1988), this rate of convergence also applies to estimates of the eigenvectors. Thus we can write   1    √ ⊥ ; ˆ = Op (1) + Op T matrix where the identi3cation constraint chosen ensures that the limit #of the O)p (1)*$  1 √ in the above equation is Ir . Equivalently, we also have ˆ = Ir + Op  + T ) *  Op √1T ⊥ . Post-multiplication by the stationary matrix ˆ −1 leaves the rate of conj vergence unchanged so that        1 1  ˆ −1 j−1 : Op √ ˆ j = Ir + Op √  T T ⊥ Now remark that estimates of the i parameters, i = 0; 1, obtained by principal (i) (i) −1 components analysis over the two regimes ˆi = ˆ ˆ −1 j S01 (S11 ) ; can be stacked into


* ) * ) (0)  (0) (1)  0 S11 −1 ˆ ˆ ˆ . Let = ˆ = ˆ ˆ −1 (1)  j S01 S11 , with  =(0 1 ), S01 =(S01 S01 ) and S11 = 0 S11 ⊥ )  *  0 1 −1 , the matrix product S01 S11 expresses as   1⊥ 0⊥       −1               −1 = S01  1    1   S11  1     1   S01 S11 √ ⊥ √ ⊥ √ ⊥ √ ⊥ T T T T −1   1 √  S11 ⊥     1 T    1 = S01  √ S01 ⊥  :    1  1  T √ ⊥ √ ⊥ S11  ⊥ S11 ⊥ T T T Using the partitioned matrix inversion formula and the rates of convergence of the terms involved, one has  −1 1 √  S11 ⊥  S11    T    1  1   √ ⊥ S11  ⊥ S11 ⊥ T T       1 1 −1  √ √ ( −O ) + O S p 11 p   T T   ; =     −1    1  1 1 −Op √ + Op √  S11 ⊥ T ⊥ T T so that the principal components estimator of the cointegrating directions rewrites as         1 1 1  −1 ˆ √ √ √  = Ir + O p Op  j (P + P⊥ ) S01  S01 ⊥ T T T        1 1 −1  Op √   ( S11 ) + Op √T  T    ×  1  ;     1 √ ⊥ Op √ Op (1) T T   where P = ( )−1  and P⊥ = ⊥ (⊥ ⊥ )−1 ⊥ . Remarking that     1 1      S01  =   S11  + Op √ ; ⊥ S01  = Op √ ; T T     1  1 1 1  1   √  S01 ⊥ =   √  S11 ⊥ + Op √ ; √ ⊥ S01 ⊥ = Op √ ; T T T T T 





 S11 

and using the identi3cation constraint  j−1  = Ir ; we 3nally get      1 1    ˆ ⊥  =  + Op √  + Op ; T T which, according to the de3nitions of the terms involved, applies for each of the two regimes and thus proves the lemma for the stationary directions.


Let ˆ⊥ = (ˆ0⊥ ˆ1⊥ ). By de3nition of the terms involved, we have ˆ⊥ ˆ = 0. Convergence of the stationary directions estimates thus implies that the non-stationary ones converge to ⊥ up to a permutation matrix. Indeed p lim ˆ⊥ ˆ = p lim (ˆ⊥ ) = 0;  with # a full-rank (n − r) × (n − r) matrix. so that p lim (ˆ⊥ ) = #⊥ To establish the rate of convergence for the non-stationary directions estimates, we start by noting that, given the expressions of ˆ0⊥ and ˆ1⊥ , ˆ⊥ diagonalizes the matrix −1 S11 : −1 −1 ˆ ˆ⊥ S11 S10 ˆ j−1=2 Vˆ ⊥ = Iˆ ⊥ ; ⊥ = Vˆ ⊥ ˆ j−1=2 S01 S11

with Iˆ ⊥ a diagonal matrix composed of the n−r smallest eigenvalues of the covariance matrix −1 ˆ j−1=2 S01 S11 S10 ˆ j−1=2 :

Note also that



−1 ˆ p lim ˆ⊥ S11 ⊥ = p lim ˆ⊥





 

 ⊥

 S11 

 S11 ⊥

 ⊥ S11 

 ⊥ S11 ⊥

−1 





 ⊥

 ˆ⊥ 

   = #⊥ ⊥ p lim[(⊥ S11 ⊥ )−1 ]⊥ ⊥ # = 0;

while −1 ˆ  p lim T ˆ⊥ S11 ⊥ p lim ⊥ = #⊥



1   S11 ⊥ T ⊥

−1 

 ⊥ ⊥ # = Op (1):

We thus conclude that Iˆ ⊥ converges to I⊥ =0 and that T Iˆ ⊥ =Op (1): By use of Lemma  + Op (T −1 ) . 5.4 in Gregoir and Laroque (1994) we then have that ˆ⊥ = Op (1)⊥ Moreover, the fact that I⊥ =0 implies that only the space spanned by the non-stationary directions is identi3ed with our identi3cation scheme. Thus, the Op (1) matrix in the preceding equation converges to # a full-rank (n − r) × (n − r) matrix. We thus 3nally get     1 1    ˆ ˜  ⊥ −  ⊥ = Op ⊥ + O p  ; T T  with ˜⊥ = #⊥ .

Appendix C. Proof of Theorem 2

Property 3. Let $\{\eta_t\}$ be a unit-variance $n$-dimensional martingale difference sequence with respect to $\mathcal{F}_{t-1}$, the $\sigma$-field generated by the set $\{\eta_{t'}, t' < t\}$, $t = 1, 2, \ldots$ Moreover assume that $\exists\, \delta > 0$ such that $\max_t \max_j E(|\eta_{jt}|^{2+\delta} \mid \mathcal{F}_{t-1}) < \infty$, $j = 1, \ldots, n$, $t = 1, 2, \ldots$ Let $S_{[\tau T]}\eta$ denote the sum of $\eta_t$ over $t = 1, \ldots, [\tau T]$. When $T$ goes to infinity,

$$T^{-1/2}S_{[\tau T]}\eta \xrightarrow{d} W(\tau), \tag{C.1}$$

$$T^{-2}\sum_{t=[\tau T]}^{T}S_t\eta\,(S_t\eta)' \xrightarrow{d} \int_{\tau}^{1}W(u)W(u)'\,du. \tag{C.2}$$

Furthermore, let $C(L)$ be an $n \times n$ absolutely summable matrix polynomial in the lag operator $L$,

$$T^{-1}\sum_{t=[\tau T]}^{T}S_t\eta\,[C(L)\eta_{t+1}]' \xrightarrow{d} \int_{\tau}^{1}W(u)\,dW(u)' + (1 - \tau)\Lambda, \tag{C.3}$$

where $\Lambda = \lim_T T^{-1}\sum_{t=1}^{T}E\{S_t\eta\,[C(L)\eta_{t+1}]'\}$.
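A small Monte Carlo experiment gives a feel for (C.1) and (C.2) in the scalar case. The sketch below is an illustration under assumptions that are not from the paper: the martingale differences are i.i.d. standard normal, the sample fraction is fixed at 0.5 (named `frac` in the code), and only the first two moments implied by the limits are compared, namely a variance equal to `frac` for (C.1) and a mean equal to $(1 - \texttt{frac}^2)/2$ for (C.2).

```python
import numpy as np

rng = np.random.default_rng(5)
T, reps, frac = 500, 2000, 0.5
start = int(frac * T)                       # integer part of frac * T

# Unit-variance martingale differences (here simply i.i.d. Gaussian draws).
eta = rng.standard_normal((reps, T))
partial = np.cumsum(eta, axis=1)            # partial[:, t - 1] holds S_t

# (C.1): T^{-1/2} S_[frac T], whose limit W(frac) has variance frac.
scaled_sum = partial[:, start - 1] / np.sqrt(T)

# (C.2): T^{-2} times the sum of S_t^2 over t = [frac T], ..., T.
quad = (partial[:, start - 1:] ** 2).sum(axis=1) / T**2

var_c1 = scaled_sum.var()    # compare with frac
mean_c2 = quad.mean()        # compare with (1 - frac**2) / 2
```

Only moments are compared here; a fuller check would compare the simulated distributions with those of the limiting functionals.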

Eq. (C.1) is called the invariance principle, (C.2) is obtained by application of the continuous mapping theorem and (C.3) is shown in Hansen (1992, Theorem 4.1).

Lemma 3. Let $\{y_t\}$ be an exponentially asymptotically covariance stationary stochastic process and $\{x_t\}$ the associated weakly stationary process, such that $y_t - x_t \xrightarrow{L^2} 0$. When $T$ goes to infinity,

$$T^{-1}\sum_{t=[\tau T]}^{T}(S_t y)z_t' \quad \text{and} \quad T^{-2}\sum_{t=[\tau T]}^{T}S_t y\,(S_t y)'$$

have the same weak limit as, respectively,

$$T^{-1}\sum_{t=[\tau T]}^{T}(S_t x)z_t' \quad \text{and} \quad T^{-2}\sum_{t=[\tau T]}^{T}S_t x\,(S_t x)',$$

with $\{z_t\}$ any covariance stationary process.

Proof. The proof is immediate and left to the reader.

We now give a re-parameterization of the noises that generate the non-stationary directions of the model, $\beta_{0\perp}'X_t$ and $\beta_{1\perp}'X_t$. Indeed, we show that these noises are obtained as a full-rank linear combination of some independent martingale difference sequences of unit variance. This will be useful for the derivation of the asymptotic limit of the test statistic. Let

$$M_i = \begin{pmatrix} M^i_{00} & M^i_{01} \\ 0 & M^i_{11} \end{pmatrix}, \qquad i = 0, 1,$$

be full-rank upper triangular matrices of dimension $n \times n$ and

$$M_i^{-1} = \begin{pmatrix} M_i^{00} & M_i^{01} \\ 0 & M_i^{11} \end{pmatrix}, \qquad i = 0, 1,$$

their inverse, where the matrices $M_i^{00}$, $M_i^{01}$ and $M_i^{11}$, $i = 0, 1$, are of full rank. From the Gram–Schmidt orthogonalization procedure, there exist such matrices $M_0$ and $M_1$ so that $\eta^0_t = M_0C_0(1)\varepsilon_t$, $t \le t_0$, and $\eta^1_t = M_1C_1(1)\varepsilon_t$, $t > t_0$, are unit-variance martingale difference sequences. Partitioning $\eta^0_t$ and $\eta^1_t$ with respect to $M_0$ and $M_1$, $\eta^0_t = ((\eta^0_{0t})' \;\; (\eta^0_{1t})')'$ and $\eta^1_t = ((\eta^1_{0t})' \;\; (\eta^1_{1t})')'$, from (A.5) and (A.6) we have

$$\begin{pmatrix} \beta_0'X_t \\ \beta_{0\perp}'\Delta X_t \end{pmatrix} = \begin{pmatrix} M_0^{00}\eta^0_{0t} + M_0^{01}\eta^0_{1t} \\ M_0^{11}\eta^0_{1t} \end{pmatrix} + (1 - L)C_0^{*}(L)\varepsilon_t, \qquad t \le t_0, \tag{C.4}$$

$$\begin{pmatrix} \beta_1'(X_t - X_{t_0}) \\ \beta_{1\perp}'\Delta X_t \end{pmatrix} = \begin{pmatrix} M_1^{00}\eta^1_{0t} + M_1^{01}\eta^1_{1t} \\ M_1^{11}\eta^1_{1t} \end{pmatrix} + (1 - L)C_{1,t-t_0}^{*}(L)\varepsilon_t + o_p(1), \qquad t > t_0. \tag{C.5}$$
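The existence of an upper triangular normalizing matrix can be illustrated numerically. The sketch below is not the Gram–Schmidt construction itself but an equivalent route through a Cholesky factorization of the inverse covariance; the matrices `c_one` and `omega`, standing in for $C_i(1)$ and for the innovation covariance, are hypothetical random draws.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
c_one = rng.standard_normal((n, n))       # stands in for C_i(1)
half = rng.standard_normal((n, n))
omega = half @ half.T + n * np.eye(n)     # positive definite innovation covariance
sigma = c_one @ omega @ c_one.T           # covariance of C_i(1) eps_t

# Cholesky of the inverse: sigma^{-1} = L L' with L lower triangular,
# so M = L' is upper triangular and M sigma M' is the identity.
low = np.linalg.cholesky(np.linalg.inv(sigma))
m_mat = low.T
cov_eta = m_mat @ sigma @ m_mat.T         # covariance of eta_t = M C_i(1) eps_t
```

Because `m_mat` is upper triangular, its lower-right block alone normalizes the last $n - r$ components, which is the feature exploited when isolating the noises of the non-stationary directions.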

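We can thus re-parameterize the noises of the non-stationary directions of the model in a suitable manner which ensures that they are a full-rank linear combination of $n - r$ martingale difference sequences of unit variance. The following lemma relates these unit-variance martingale difference sequences to the innovation process of the model.

Lemma 4. There exist full rank $(n - r) \times (n - r)$ matrices $N_i$, $i = 0, 1$, such that their inverses satisfy $N_i^{-1}\alpha_{i\perp}'\varepsilon_t = \eta^i_{1t}$, $i = 0, 1$. Explicit expressions are $N_i^{-1} = M^i_{11}(\beta_{i\perp}'\beta_{i\perp})(\alpha_{i\perp}'\Psi\beta_{i\perp})^{-1}$, $i = 0, 1$.

Proof. We give a proof for the first regime. It also applies to the second one provided that the required notation changes are made. Let us first partition $C_0(1)$ into $(C_{00}(1)' \;\; C_{01}(1)')'$, with $C_{01}(1)$ a full rank matrix of dimension $(n - r) \times n$. From (A.5) we have, for $t \le t_0$,

$$\begin{pmatrix} \beta_0' \\ \beta_{0\perp}' \end{pmatrix}\Delta X_t = \begin{pmatrix} (1 - L)I_r & 0 \\ 0 & I_{n-r} \end{pmatrix}\begin{pmatrix} \beta_0'X_t \\ \beta_{0\perp}'\Delta X_t \end{pmatrix} = \left[\begin{pmatrix} (1 - L)I_r & 0 \\ 0 & I_{n-r} \end{pmatrix}C_0(1) + (1 - L)\tilde C_0^{*}(L)\right]\varepsilon_t, \tag{C.6}$$

with

$$\tilde C_0^{*}(L) = \begin{pmatrix} (1 - L)I_r & 0 \\ 0 & I_{n-r} \end{pmatrix}C_0^{*}(L).$$

Now note that, for $t \le t_0$, Eq. (1) rewrites as

$$\Big(I_n - \sum_{j=1}^{p}\Gamma_jL^{j}\Big)\left[\begin{pmatrix} \beta_0' \\ \beta_{0\perp}' \end{pmatrix}\right]^{-1}\begin{pmatrix} \beta_0' \\ \beta_{0\perp}' \end{pmatrix}\Delta X_t = \alpha_0\beta_0'X_{t-1} + \varepsilon_t,$$

which, pre-multiplied by $\alpha_{0\perp}'$, gives

$$\alpha_{0\perp}'\Big(I_n - \sum_{j=1}^{p}\Gamma_jL^{j}\Big)\left[\begin{pmatrix} \beta_0' \\ \beta_{0\perp}' \end{pmatrix}\right]^{-1}\begin{pmatrix} \beta_0' \\ \beta_{0\perp}' \end{pmatrix}\Delta X_t = \alpha_{0\perp}'\varepsilon_t.$$

Replacing $(\beta_0 \;\; \beta_{0\perp})'\Delta X_t$ by its expression in (C.6) and evaluating the equality at $L = 1$ we have

$$\alpha_{0\perp}'\Psi\left[\begin{pmatrix} \beta_0' \\ \beta_{0\perp}' \end{pmatrix}\right]^{-1}\begin{pmatrix} 0 \\ C_{01}(1) \end{pmatrix}\varepsilon_t = \alpha_{0\perp}'\varepsilon_t,$$

with $\Psi = (I_n - \sum_{j=1}^{p}\Gamma_j)$. Finally, remarking that $[(\beta_0 \;\; \beta_{0\perp})']^{-1} = (\bar\beta_0 \;\; \bar\beta_{0\perp})$ we get

$$\alpha_{0\perp}'\Psi\beta_{0\perp}(\beta_{0\perp}'\beta_{0\perp})^{-1}C_{01}(1)\varepsilon_t = \alpha_{0\perp}'\varepsilon_t,$$

which states that $\alpha_{0\perp}'\varepsilon_t$ is a full rank linear combination of the components of $C_{01}(1)\varepsilon_t$. From (C.4), $C_{01}(1)\varepsilon_t = M_0^{11}\eta^0_{1t}$, and therefore $\alpha_{0\perp}'\varepsilon_t$ is also a full rank linear combination of the white noise process $\eta^0_{1t}$.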

Remark 4. When the loading factors are not affected by the structural break, i.e. when $\alpha_{0\perp} = \alpha_{1\perp} = \alpha_{\perp}$, then Lemma 4 implies that $\eta^1_{1t} = N_1^{-1}N_0\eta^0_{1t}$. In that particular case, the dominant terms of $\beta_{0\perp}'\Delta X_t$ and $\beta_{1\perp}'\Delta X_t$ are defined from the same unit-variance white noises.

The remainder of the proof consists in deriving the asymptotic limit of the terms involved in the OLS estimators of the parameters of regression (10)–(11), using their preceding reparametrization. It is not included for lack of space and is available from the authors on request.

Appendix D. Proof of Theorem 5

By use of (A.8) and (A.9), the terms of dominant order in the process

$$(\beta_{1\perp}'\Delta X_t)^{\ddagger} = \begin{pmatrix} (b_{1\perp}'\Delta X_t)^{\dagger} \\ b_{01\perp}'\Delta X_t \end{pmatrix} = \begin{pmatrix} b_{1\perp}'\Delta X_t - B_{10}B_{00}^{-1}\beta_{0\perp}'\Delta X_t \\ b_{01\perp}'\Delta X_t \end{pmatrix}$$

can be rewritten as

=



L11 L12



  1⊥ jt



L01 0

 0⊥ jt ;

with L11 and L01 two (r −r  )×(n−r) full rank matrices and L12 a (n−2r +r  )×(n−r)   − L01 0⊥ )jt selects the full rank matrix. Moreover, by construction, the term (L11 1⊥  components of (1⊥ L Xt ) that are speci3c to the second regime, so that it can also  jt ; with L†11 an (r − r  ) × (n − r) full-rank matrix. Therefore be written as L†11 1⊥ ‡ †   (1⊥ L Xt ) = L 1⊥ jt ; with L† = ((L†11 ) L12 ) , an (n − r) × (n − r) full-rank matrix. Accordingly, by application of Lemma 4, there exists a full rank (n − r) × (n − r) 1 †  1 † matrix, (M111 )† , such that (41t ) = (M111 )† L† 1⊥ jt , where (41t ) is a martingale di4erence sequence of dimension n − r and unit variance. 1 † We want to show that it is always possible to decompose this process (41t ) into 11  01   01   (41t 41t ) with 41t a process of dimension n − 2r + r which is common to the


 11  01   ) ∩ sp(1⊥ )) and 41t a process of dimension r − r  speci3c two regimes (41t ∈ sp(0⊥ to the second regime.  One can decompose 1⊥ jt into two components associated with the two supple  ). The preceding relation can then be rewritten mentary subspaces sp(0 ) and sp(0⊥ as 1 †     ) = K1⊥ 0 (0 0 )−1 0 jt + K1⊥ 0⊥ (0⊥ 0⊥ )−1 0⊥ jt ; (41t

where K = (M111 )† L† is an (n − r) × (n − r) full rank matrix. When n − 2r + r  directions of the space of the common trends do not change with the structural break,    0⊥ (0⊥ 0⊥ )−1 is of rank n − 2r + r  and 1⊥ 0 (0 0 )−1 is of rank r − r  . Thus 1⊥ the decomposition of the n − r independent white noises of unit variance expresses  1 † jt , with K0 and L0 two (r − r  ) × (n − r) full rank (41t ) = KK0 L0 0 jt + KK1 L1 0⊥ matrices and K1 and L1 two (n − 2r + r  ) × (n − r) full rank matrices. 1 † ) over the supplementary It is always possible to modify the components of (41t   subspaces sp(0 ) and sp(0⊥ ) by changing the bases that span those subspaces, namely by introducing two full rank matrices, ’ and , respectively of dimension r × r and (n − r) × (n − r), such that 1 † ) = KK0 L0 ’−1 ’0 jt + KK1 L1 (41t

−1

 0⊥ jt :

We are looking for such a re-parameterization which ensures that the r − r  compo1 † ) ∈ sp(0 ) correspond to its r −r  3rst rows and the n−2r +r  components nents of (41t 1 †  of (41t ) ∈ sp(0⊥ ) to its n − 2r + r  last rows, that is we want to 3nd   Ir−r  0 0 0  −1  −1 ; ; KK1 L1 = KK0 L0 ’ = 0 0 0 In−2r+r  or equivalently K0 L0

=K

 −1

Ir−r 

0

0

0



’;

K1 L1

=K

Partitioning K −1 , ’ and as follows,   11 ’11 K 12 K −1 ; ’= K = 21 22 ’21 K K the requested conditions imply that  11 K ’11 K 11 ’12  K0 L0 = ; K 21 ’11 K 21 ’12

−1

’12

=

0

0

In−2r+r 



 ;

’22 

K1 L1

0

=

:

11

12

21

22

K 12

21

K 12

22

K 22

21

K 22

22

;

;

which is always satis3ed once one chooses K0 = ((K 11 ) L1 = (

21

(K 21 ) ); 22 ):

L0 = (’11

’12 );

K1 = ((K 12 )

(K 22 ) );


The K matrix is given. But the ’ and matrices are free of any constraint (besides the full rank requirement) and K0 , L0 , K1 and L1 are de3ned up to a full rank linear 1 † transformation. Therefore, the n − r noises (41t ) can always be rewritten as   ’11 0 0 12 1 †  0⊥ jt : 0 jt + ) = (41t ’21 0 0 22 Imposing ’21 = 0 and 12 = 0 does not introduce any supplementary constraint to the program, since those parameters are not involved into the choice of L0 and L1 . We 1 † can thus decompose (41t ) into r − r  independent white noises of unit variance which belong to the space spanned by the column vectors of 0 and n − 2r + r  independent white noises of unit variance which belong to the space spanned by the column vectors 1 † 11  01   of 0⊥ , that is (41t ) = ((41t ) (41t )). 0 The same kind of results are obtained for the noises 41t of the 3rst regime, that we 0 00  01   11 00 ; 41t )=0. Finally, the consequently can split into 41t =((41t ) (41t ) ) , with CovLT (41t  1 † jt arguments developed into the proof of Theorem 2 still hold for (41t ) = (M111 )† L† 1⊥ 1 11   −1  and 40t = M0 0⊥ 0⊥ (0⊥ 0⊥ ) 0⊥ jt and thus give us the asymptotic distribution of the 0†t (0 ) test statistic under the speci3ed null hypothesis. References Anderson, T.W., 1951. Estimating linear restrictions on regression coeScients for multivariate normal distributions. Annals of Mathematical Statistics 85, 813–823. Bai, J., Lumsdaine, R.L., Stock, J.H., 1998. Testing for and dating common breaks in multivariate time series. Review of Economic Studies 65, 395–432. Banerjee, A., Lumsdaine, R.L., Stock, J.H., 1992. Recursive and sequential tests of the unit-root and trend-break hypothesis: theory and international evidence. Journal of Business and Economic Statistics 10, 271–287. Boswijk, H.P., 1995. Identi3ability of cointegrated systems. Working Paper, Tinbergen Institute. Engle, R.F., Granger, C.W.J., 1987. Co-integration and error correction: representation, estimation, testing. 
Econometrica 55, 251–276. Granger, C.W.J., 1981. Some properties of time series data and their use in econometric model specification. Journal of Econometrics 16, 121–130. Granger, C.W.J., Weiss, A.A., 1983. Time series analysis of error correcting models. In: Karlin, S., Amemiya, T. (Eds.), Studies in Econometrics, Time Series and Multivariate Statistics. Academic Press, New York, NY, pp. 225–278. Gregoir, S., 1995. Polynomial cointegration in presence of a break at an unknown date. WCES (Tokyo), Mimeo, INSEE. Gregoir, S., Laroque, G., 1994. Polynomial cointegration: estimation and test. Journal of Econometrics 63, 183–214. Gregory, A.W., Hansen, B.E., 1996. Residuals-based tests for cointegration in models with regime shifts. Journal of Econometrics 70, 99–126. Gregory, A.W., Nason, J.M., Watt, D.G., 1996. Testing for structural breaks in cointegrated relationship. Journal of Econometrics 71, 321–341. Hansen, B.E., 1992. Test for parameters instability with I(1) processes. Journal of Business and Economic Statistics 10, 321–335. Hansen, H., Johansen, S., 1999. Some tests for parameters constancy in cointegrated VAR-models. Econometrics Journal 2, 306–333. Hansen, P.R., 2000. Structural breaks in the cointegrated vector autoregressive model. Mimeo, University of California, San Diego.


Inoue, A., 1999. Test of cointegrating rank with a trend-break. Journal of Econometrics 90, 215–237. Johansen, S., 1988. Statistical analysis of cointegration vectors. Journal of Economic Dynamics and Control 12, 231–254. Johansen, S., 1991. Estimation and hypothesis testing of cointegration vectors in Gaussian vector autoregressive models. Econometrica 59, 1551–1580. Johansen, S., 1995. Likelihood-Based Inference in Cointegrated Vector Auto-Regressive Models. Advanced Texts in Econometrics. Oxford University Press, Oxford, UK. Johansen, S., Juselius, K., 1992. Testing structural hypotheses in a multivariate cointegration analysis of the PPP and the UIP for UK. Journal of Econometrics 53, 211–244. MacKinnon, J.G., 1996. Numerical distribution functions for unit root and cointegration tests. Journal of Applied Econometrics 11, 601–618. MacKinnon, J.G., Haug, A.A., Michelis, L., 1999. Numerical distribution functions of likelihood ratio tests for cointegration. Journal of Applied Econometrics 14, 563–577. Magnus, J.R., Neudecker, H., 1988. Matrix Differential Calculus. Wiley, New York, NY. Perron, P., 1989. The great crash, the oil price shock, and the unit root hypothesis. Econometrica 57, 1361–1401. Phillips, P.C.B., 1995. Fully modified least squares and vector autoregression. Econometrica 63, 1023–1078. Phillips, P.C.B., Hansen, B.E., 1990. Statistical inference in instrumental variables regression with I(1) processes. Review of Economic Studies 57, 99–125. Quintos, C.E., 1997. Stability tests in error correction models. Journal of Econometrics 82, 289–315. Quintos, C.E., Phillips, P.C.B., 1993. Parameters constancy in cointegrating regression. Empirical Economics 18, 675–703. Saikkonen, P., Lütkepohl, H., 2000. Testing for the cointegrating rank of a VAR process with structural shifts. Journal of Business and Economic Statistics 18, 451–464. Seo, B., 1998. Tests for structural change in cointegrated systems. Econometric Theory 14, 222–259.
Stock, J.H., Watson, M.W., 1988. Testing for common trends. Journal of the American Statistical Association 83, 1097–1107. Zivot, E., Andrews, D.W.K., 1992. Further evidence on the great crash, the oil-price shock and the unit-root hypothesis. Journal of Business and Economic Statistics 10, 251–270.