Journal of Econometrics 1 (1973) 3-16. © North-Holland Publishing Company

A MARKOV MODEL FOR SWITCHING REGRESSIONS*

Stephen M. GOLDFELD and Richard E. QUANDT
Princeton University
1. Introduction
Consider a situation in which observations, indexed by $i$ ($i = 1, \ldots, n$), are available on a dependent variable $y_i$ and on $k$ independent variables $x_{1i}, \ldots, x_{ki}$. Assume that the $i$th observation on $y$ is generated by one or the other of two¹ true regression equations, i.e., either

$$y_i = \sum_{j=1}^{k} \beta_{1j} x_{ji} + u_{1i} = x_i'\beta_1 + u_{1i} \tag{1}$$

or

$$y_i = \sum_{j=1}^{k} \beta_{2j} x_{ji} + u_{2i} = x_i'\beta_2 + u_{2i}, \tag{2}$$

where $\beta_1$ and $\beta_2$ are the vectors of the coefficients $\beta_{1j}$, $\beta_{2j}$, where $x_i$ is the vector of the $i$th observation on the independent variables, and where $u_{1i}$ and $u_{2i}$ satisfy the classical assumptions made about error terms and are, for convenience, assumed to be distributed as $N(0, \sigma_1^2)$ and $N(0, \sigma_2^2)$ respectively. If it is further assumed that $(\beta_1, \sigma_1^2) \neq (\beta_2, \sigma_2^2)$, the regression system given by (1) and (2) may be thought to be switching between the two equations or 'regimes'. This raises the question of estimating the parameters of (1) and (2) and of testing the null hypothesis that $(\beta_1, \sigma_1^2) = (\beta_2, \sigma_2^2)$.

It is crucial for what follows that the investigator is assumed to have no definitive a priori knowledge about how to classify the data between the two regimes. If he did have such knowledge the problem of testing the null hypothesis is solved by the Chow test (Chow, 1960). In the absence of such knowledge, the problem can be successfully attacked only by imposing some further structure on it.

The simplest type of structure consists of the assumption that there is at most one switch in the data series; i.e., that the first $m$ ($m$ unknown) observations in a time series are generated by regime 1 and the remaining $n - m$ observations by regime 2. Problems of this type have been analyzed in various ways by Brown and Durbin (1968), Farley and Hinich (1970) and Quandt (1958, 1960).

This simple model, permitting only one switch, is clearly unrealistic in some economic contexts. A more complex situation arises if it is assumed that the system may switch back and forth between the two regimes. Accordingly the first $m_1$ observations may come from regime 1, the next $m_2$ from regime 2, the next $m_3$ from regime 1 again, etc., with $m_1, m_2, \ldots$ (which sum to $n$) being unknown. Under this assumption it is theoretically possible for the system to switch between regimes every time that a new observation is generated.

The basic purpose of this paper is to introduce a model that allows for numerous switches. In sect. 2 we briefly describe two of our earlier approaches to the many-switch case that will be important for what follows. These approaches have the basic characteristic that the probability of a switch does not depend, at any time, on what regime is in effect. In sect. 3 we describe a new statistical model that explicitly allows for such a dependence. In sect. 4 we apply it to a concrete economic illustration. In sect. 5 we make some suggestions for further extensions.

* We would like to thank Ray C. Fair for reading an earlier version of this paper. We gratefully acknowledge financial support from the National Science Foundation.
¹ It is only for simplicity's sake that we confine our attention to the case of only two generating equations. Although some of the algebraic details will differ if there are more than two, there is no difference in principle.
2. The multiswitch problem

2.1. The D-method
The first of two recent approaches to this problem has been introduced in Goldfeld and Quandt (1972). In this approach it is assumed that ignorance about which regime generates an observation is only partial; specifically it is assumed that there exist observations on some exogenous variables² $z_{1i}, z_{2i}, \ldots, z_{pi}$ ($i = 1, \ldots, n$), and that nature selects regime 1 for generating the $i$th observation on the dependent variable if an unknown function, possibly linear, of the $z$'s is less than or equal to zero. Thus,

$$y_i = x_i'\beta_1 + u_{1i} \quad \text{if} \quad \sum_{j=1}^{p} \pi_j z_{ji} \le 0$$

and

$$y_i = x_i'\beta_2 + u_{2i} \quad \text{if} \quad \sum_{j=1}^{p} \pi_j z_{ji} > 0, \tag{3}$$

where the $\pi_j$ are unknown coefficients to be estimated. Introduce the unit step function

$$d(z_i) = 0 \quad \text{if} \quad \sum_{j=1}^{p} \pi_j z_{ji} \le 0, \qquad d(z_i) = 1 \quad \text{if} \quad \sum_{j=1}^{p} \pi_j z_{ji} > 0, \tag{4}$$

and denote by $D$ the diagonal matrix of order $n$ which has $d(z_i)$ in the $i$th position of the main diagonal. Let $Y$ be the $n \times 1$ vector of observations on the dependent variable and $X$ be the $n \times k$ matrix of observations on the independent variables. The problem of estimating the two separate regimes is then equivalent to estimating the $2k$ $\beta$'s, 2 $\sigma^2$'s and $n$ $d$'s of the composite regression equation

$$Y = (I - D) X\beta_1 + D X\beta_2 + W, \tag{5}$$

where $W = (I - D)U_1 + DU_2$ is the vector of unobservable and heteroscedastic error terms. Denoting the covariance matrix of $W$ by

$$\Omega = (I - D)^2 \sigma_1^2 + D^2 \sigma_2^2, \tag{6}$$

maximum likelihood estimates are obtained by maximizing

$$L = \text{constant} - \tfrac{1}{2} \log|\Omega| - \tfrac{1}{2} \left\{ [Y - (I - D)X\beta_1 - DX\beta_2]' \, \Omega^{-1} \, [Y - (I - D)X\beta_1 - DX\beta_2] \right\}. \tag{7}$$

² The $z$'s may, of course, include some or all the regressors.
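To make the composite log-likelihood (7) concrete, the following minimal sketch evaluates it for a given diagonal of $D$. Because $\Omega$ is diagonal, (7) reduces to a sum over observations. The function name `d_method_loglik`, the list-based argument layout and the choice of $-\tfrac{n}{2}\log 2\pi$ for the constant are assumptions of this illustration and do not come from the paper.

```python
import math


def d_method_loglik(y, X, d, beta1, beta2, sigma1_sq, sigma2_sq):
    """Evaluate the composite log-likelihood (7) for a given diagonal d of D.

    y: list of n responses; X: list of n regressor rows, each of length k;
    d: list of n weights d(z_i); beta1, beta2: coefficient lists of length k.
    All names are illustrative.
    """
    n = len(y)
    loglik = -0.5 * n * math.log(2.0 * math.pi)  # assumed form of the constant
    for y_i, x_i, d_i in zip(y, X, d):
        fit1 = sum(b * x for b, x in zip(beta1, x_i))  # x_i' beta_1
        fit2 = sum(b * x for b, x in zip(beta2, x_i))  # x_i' beta_2
        # i-th element of Y - (I - D) X beta_1 - D X beta_2
        resid = y_i - (1.0 - d_i) * fit1 - d_i * fit2
        # i-th diagonal element of Omega, eq. (6)
        omega_i = (1.0 - d_i) ** 2 * sigma1_sq + d_i ** 2 * sigma2_sq
        loglik -= 0.5 * math.log(omega_i) + 0.5 * resid ** 2 / omega_i
    return loglik
```

With every $d(z_i)$ equal to 0 the function returns the ordinary Gaussian log-likelihood of regime 1, and with every $d(z_i)$ equal to 1 that of regime 2; fractional weights are accepted as well.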
Estimating parameters by maximizing (7) in its given form is not practical because of the combinatorial aspects of the unit step function and is not likely to yield consistent estimates.³ The problem becomes tractable from the practical point of view and the method yields consistent estimates if the unit step function is replaced by a continuous approximation, say by

$$d(z_i) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\sum_{j=1}^{p} \pi_j z_{ji}} \exp\left( -\frac{\xi^2}{2\sigma^2} \right) d\xi, \tag{8}$$

where $\sigma$ is a new parameter and must be estimated along with the $\pi_j$. Maximum likelihood estimates are then obtained by maximizing (7), subject to replacing $d(z_i)$ in (7) by the expression in (8), with respect to the $2k$ $\beta$'s, $p$ $\pi$'s and $\sigma$.⁴

³ If the number of parameters to be estimated increases with the number of observations, maximum likelihood estimates are not, in general, consistent. See Kendall and Stuart (1961), p. 61.
⁴ For further discussion see Goldfeld and Quandt (1972), ch. 9. In addition, one may use the estimated $\pi_j$ [or the estimated $d(z_i)$] to separate the sample and then apply standard estimating techniques.

2.2. The λ-method

An alternative method, introduced in Quandt (1972), assumes that nature chooses between regimes 1 and 2 with probabilities $\lambda$ and $1 - \lambda$, where $\lambda$ is unknown to the investigator. Then the conditional density of $y_i$ is

$$h(y_i|x_i) = \lambda f_1(y_i|x_i) + (1 - \lambda) f_2(y_i|x_i) = \frac{\lambda}{\sqrt{2\pi}\,\sigma_1} \exp\left\{ -\frac{(y_i - x_i'\beta_1)^2}{2\sigma_1^2} \right\} + \frac{1 - \lambda}{\sqrt{2\pi}\,\sigma_2} \exp\left\{ -\frac{(y_i - x_i'\beta_2)^2}{2\sigma_2^2} \right\}, \tag{9}$$

from which the loglikelihood function is

$$L = \sum_{i=1}^{n} \log h(y_i|x_i), \tag{10}$$

which may be maximized with respect to the $\beta$'s, $\sigma^2$'s and $\lambda$.⁵ Both the D-method and the λ-method have been found to have acceptable sampling properties.⁶
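The mixture density (9) and the loglikelihood (10) translate almost line by line into code. The following is a minimal sketch under the normality assumption stated in sect. 1; the names `normal_pdf` and `lambda_method_loglik` and the list-based data layout are hypothetical choices for illustration only.

```python
import math


def normal_pdf(resid, sigma_sq):
    """Density of N(0, sigma_sq) evaluated at resid."""
    return math.exp(-0.5 * resid ** 2 / sigma_sq) / math.sqrt(2.0 * math.pi * sigma_sq)


def lambda_method_loglik(y, X, beta1, beta2, sigma1_sq, sigma2_sq, lam):
    """Loglikelihood (10) built from the two-component mixture density (9)."""
    total = 0.0
    for y_i, x_i in zip(y, X):
        resid1 = y_i - sum(b * x for b, x in zip(beta1, x_i))  # y_i - x_i' beta_1
        resid2 = y_i - sum(b * x for b, x in zip(beta2, x_i))  # y_i - x_i' beta_2
        # h(y_i | x_i) = lam * f_1 + (1 - lam) * f_2
        h_i = lam * normal_pdf(resid1, sigma1_sq) + (1.0 - lam) * normal_pdf(resid2, sigma2_sq)
        total += math.log(h_i)
    return total
```

In practice this function would be handed to a numerical optimizer with `lam` constrained to the unit interval and both variances kept positive; that step is omitted here.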
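3. A Markov model

The essence of the λ-method as stated in the previous section is that the probability that nature selects regime 1 or 2 at the $i$th trial is independent of what state the system was in on the previous trial. We shall explicitly relax this assumption and introduce the matrix $T$ of transition probabilities, the $(r,s)$th element of which ($r, s = 1, 2$) is the probability that the system will make a transition from state $r$ to state $s$, i.e., that if in period $i - 1$ regime $r$ was in effect then in period $i$ regime $s$ will be in effect. This interpretation makes the regime switching process a Markov chain. Denote the probability vector that the system is initially in one or the other of the two states by $\lambda_0' = (\lambda_{10}, \lambda_{20})$ (where of course $\lambda_{10} = 1 - \lambda_{20}$) and the corresponding probability vector at the $i$th stage by $\lambda_i' = (\lambda_{1i}, \lambda_{2i})$. Then

$$\lambda_i' = \lambda_{i-1}' T \quad \text{and} \quad \lambda_i' = \lambda_0' T^i. \tag{11}$$

In the conditional density function (9) for $y_i$ we now replace $\lambda$ by $\lambda_{1i}$:

$$h(y_i|x_i) = \lambda_{1i} f_1(y_i|x_i) + (1 - \lambda_{1i}) f_2(y_i|x_i) = \lambda_i' f_i, \tag{12}$$

where $f_i$ is the vector

$$f_i = \begin{bmatrix} f_1(y_i|x_i) \\ f_2(y_i|x_i) \end{bmatrix}.$$

Hence the loglikelihood function is in general

$$L = \sum_{i=1}^{n} \log \left( \lambda_0' T^i f_i \right). \tag{13}$$

⁵ An immediate extension of this method is obtained by allowing $\lambda$ to be a function of the previously introduced exogenous variables $z$.
⁶ See Goldfeld and Quandt (1972) and Quandt (1972), as well as Goldfeld et al. (1971), for a favorable contrast with a nonlinear least-squares procedure.

3.1. The basic model

We now let the matrix of transition probabilities be

$$T = \begin{bmatrix} \tau_1 & 1 - \tau_1 \\ 1 - \tau_2 & \tau_2 \end{bmatrix}.$$

Defining $t^i_{rs}$ as the $(r,s)$th element of $T^i$, the application of (11) then yields the recursions

$$t^i_{11} = \tau_1 t^{i-1}_{11} + (1 - \tau_1) t^{i-1}_{21}, \qquad t^i_{21} = (1 - \tau_2) t^{i-1}_{11} + \tau_2 t^{i-1}_{21}, \tag{14}$$

with the initial conditions $t^1_{11} = \tau_1$ and $t^1_{21} = 1 - \tau_2$. The general solution of the difference equation system (14) is then

$$t^i_{11} = \frac{(1 - \tau_2) + (1 - \tau_1)(\tau_1 + \tau_2 - 1)^i}{2 - \tau_1 - \tau_2}, \qquad t^i_{21} = \frac{(1 - \tau_2)\left[ 1 - (\tau_1 + \tau_2 - 1)^i \right]}{2 - \tau_1 - \tau_2}, \tag{15}$$

with $t^i_{12} = 1 - t^i_{11}$ and $t^i_{22} = 1 - t^i_{21}$ because the rows of $T^i$ sum to one. Since, as before, $\lambda_i' = (\lambda_{10} t^i_{11} + \lambda_{20} t^i_{21}, \; \lambda_{10} t^i_{12} + \lambda_{20} t^i_{22})$, substitution in (13) yields the requisite likelihood function as a function of the $\beta$'s, $\sigma^2$'s, $\lambda_{10}$, $\tau_1$ and $\tau_2$. The method just described will be referred to as the τ-method.

3.2. An extension

The elements of the transition matrix may be made functions of an extraneous variable $z$. Then we can write

$$T_i = \begin{bmatrix} \tau_1(z_i) & 1 - \tau_1(z_i) \\ 1 - \tau_2(z_i) & \tau_2(z_i) \end{bmatrix}$$

and also

$$\lambda_i' = \lambda_0' \prod_{j=1}^{i} T_j.$$

If, for example, one assumed for theoretical reasons that large values of $z$ are associated with high probabilities of entering the first state, the $\tau$ functions might be defined by

$$\tau_1(z_i) = \frac{1}{\sqrt{2\pi}\,\sigma_{11}} \int_{-\infty}^{z_i} \exp\left[ -\frac{1}{2} \left( \frac{\xi - z_{01}}{\sigma_{11}} \right)^2 \right] d\xi$$

and

$$\tau_2(z_i) = \frac{1}{\sqrt{2\pi}\,\sigma_{22}} \int_{z_i}^{\infty} \exp\left[ -\frac{1}{2} \left( \frac{\xi - z_{02}}{\sigma_{22}} \right)^2 \right] d\xi.$$

The likelihood function can then be derived and would have to be maximized with respect to the usual parameters as well as $z_{01}$, $z_{02}$, $\sigma_{11}$, $\sigma_{22}$. This last method will be referred to as the τ(z) method.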
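The τ-method of sect. 3.1 can be sketched in a few lines. The code below propagates the regime-1 probability through $\lambda_i' = \lambda_{i-1}' T$, offers the closed form implied by the solution of (14) as a cross-check, and evaluates the loglikelihood (13) under normal errors. All function names and the list-based data layout are assumptions made for illustration; no optimizer is included.

```python
import math


def normal_pdf(resid, sigma_sq):
    """Density of N(0, sigma_sq) evaluated at resid."""
    return math.exp(-0.5 * resid ** 2 / sigma_sq) / math.sqrt(2.0 * math.pi * sigma_sq)


def regime1_probs(lam10, tau1, tau2, n):
    """Return [lambda_{1i} for i = 1..n] using lambda_i' = lambda_{i-1}' T."""
    probs = []
    lam1 = lam10
    for _ in range(n):
        # first element of lambda_{i-1}' T, T = [[tau1, 1-tau1], [1-tau2, tau2]]
        lam1 = lam1 * tau1 + (1.0 - lam1) * (1.0 - tau2)
        probs.append(lam1)
    return probs


def regime1_prob_closed_form(lam10, tau1, tau2, i):
    """lambda_{1i} = lambda_10 * t11^i + lambda_20 * t21^i for integer i >= 1.

    Uses the closed-form solution of the recursions (14); requires
    tau1 + tau2 < 2 so that the denominator is nonzero.
    """
    denom = 2.0 - tau1 - tau2
    root_i = (tau1 + tau2 - 1.0) ** i
    t11 = ((1.0 - tau2) + (1.0 - tau1) * root_i) / denom
    t21 = (1.0 - tau2) * (1.0 - root_i) / denom
    return lam10 * t11 + (1.0 - lam10) * t21


def tau_method_loglik(y, X, beta1, beta2, sigma1_sq, sigma2_sq, lam10, tau1, tau2):
    """Loglikelihood (13): sum over i of log(lambda_i' f_i)."""
    total = 0.0
    probs = regime1_probs(lam10, tau1, tau2, len(y))
    for y_i, x_i, lam1 in zip(y, X, probs):
        f1 = normal_pdf(y_i - sum(b * x for b, x in zip(beta1, x_i)), sigma1_sq)
        f2 = normal_pdf(y_i - sum(b * x for b, x in zip(beta2, x_i)), sigma2_sq)
        total += math.log(lam1 * f1 + (1.0 - lam1) * f2)
    return total
```

When $\tau_1 = 1 - \tau_2$ the two rows of $T$ coincide, the regime-1 probability is constant from the first period on, and the function returns the same value as the λ-method loglikelihood with $\lambda = \tau_1$.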
4. An economic example

Recently Fair and Jaffee (1973) have proposed a model for a housing market in disequilibrium in which the demand and supply functions are specified as

$$y_t = \alpha_0 + \alpha_1 x_{1t} + \alpha_2 x_{2t} + \alpha_3 x_{3t} + u_{1t} \tag{16}$$

and

$$y_t = \beta_0 + \beta_1 x_{1t} + \beta_4 x_{4t} + \beta_5 x_{5t} + \beta_6 x_{6t} + u_{2t}, \tag{17}$$

where $y_t$ is the observed number of housing starts in month $t$, $x_{1t}$ is a time trend, $x_{2t}$ a measure of the stock of houses, $x_{3t}$ the mortgage rate lagged two periods, $x_{4t}$ a moving average of private deposit flows in savings and loan associations and mutual savings banks lagged one period, $x_{5t}$ a moving average of borrowings by savings and loan associations from the Federal Home Loan Bank lagged two periods and $x_{6t} = x_{3,t+1}$. If there is an excess demand, the observed point lies on the supply function and if there is an excess supply, it lies on the demand function; at least formally, therefore, this is a two-regime problem.

Among other methods, Fair and Jaffee estimated the model by segregating data points into two subgroups according to whether a point belonged to a period of rising or falling price (mortgage rate): if price was rising there must have been an excess demand and the corresponding points may be used to estimate the supply function, and conversely for points belonging to periods of falling price.⁷ In effect they thus posited the existence of a $z$-variable with an implied approximation of the form of (8) given by

$$d(z_i) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{z_i} \exp\left[ -\frac{1}{2} \left( \frac{\xi - z_0}{\sigma} \right)^2 \right] d\xi, \tag{18}$$

where $z_0$ was explicitly assumed to be zero. Their model was reestimated in Goldfeld and Quandt (1972), employing the D-method where $z_0$ was taken to be a parameter to be estimated.⁸

It was pointed out by Fair and Kelejian (1972) that neither the procedure in Fair and Jaffee (1973) nor that in Goldfeld and Quandt (1972) is legitimate, since both imply data segregation determined by price change, which is itself endogenous. The resulting estimates are hence biased. An obvious alternative, therefore, is to neglect the extraneous information about the $z$'s and make use of the λ-method, and this was accomplished in Quandt (1972). As previously indicated, according to this method the probability of observing a period of excess demand does not depend on whether the previous period was one of excess demand or not. It thus seems natural to reestimate the model using the τ-method of sect. 3, but before doing so we need one further elaboration.

It had been previously observed that the error terms in the two regimes are autocorrelated. It was therefore necessary to derive a likelihood function that would permit the estimation of autocorrelation coefficients $\rho_1$ and $\rho_2$. Unfortunately there are numerous ways of specifying the error generating mechanism in the presence of autocorrelation and switching regimes. The method adopted here was the simplest, although not the most reasonable one, according to which the equations are written

$$y_i - \rho_1 y_{i-1} - (x_i - \rho_1 x_{i-1})'\beta_1 = \varepsilon_{1i}$$

and

$$y_i - \rho_2 y_{i-1} - (x_i - \rho_2 x_{i-1})'\beta_2 = \varepsilon_{2i}, \tag{19}$$

and where $\varepsilon_{1i}$ and $\varepsilon_{2i}$ are posited to be independent; the τ-method is then applied to the conditional densities of $y_i - \rho_1 y_{i-1}$ and $y_i - \rho_2 y_{i-1}$. The resulting likelihood function was maximized with respect to 16 parameters (9 $\alpha$'s and $\beta$'s, 2 $\sigma^2$'s, 2 $\rho$'s, $\lambda_{10}$, $\tau_1$ and $\tau_2$). The estimates and their ratios to the square roots of their asymptotic variances together with the corresponding figures for the λ-method are displayed in table 1. From $\tau_1$ and $\tau_2$ and (15) the limit probability of state 1 is 0.192. This is remarkably close to the previous estimate of the fraction of observations associated with regime 1, which was given by $\lambda = 0.181$.

The reasonableness of the estimates for $\tau_1$ and $\tau_2$ can be checked in another way as well. Let $T$ be the transition matrix, and let $A$ be the matrix which has the $i$th limit probability repeated as its $i$th column. Define

$$Z = (I - (T - A))^{-1},$$

and let $Z_{dg}$ be the matrix $Z$ with its off-diagonal elements replaced by 0; let $E$ be a matrix of all 1's and $D$ a diagonal matrix with $i$th diagonal element equal to the reciprocal of the $i$th limit probability. Then the mean first passage matrix $M$ is given by⁹

$$M = (I - Z + E Z_{dg}) D$$

and the variance of the first passages by

$$V = M(2 Z_{dg} D - I) + 2(ZM - E(ZM)_{dg}).$$

Let $S$ be the matrix containing the square roots of the elements of $V$.

⁷ Many observations were associated with zero price change. These are then presumably equilibrium observations and were treated in various ways by Fair and Jaffee.
⁸ This is clearly equivalent, in the terms of (4), to having two $z$'s with $z_{1i}$ having values of unity, $z_{2i}$ being the price changes, $\pi_1 = -z_0$ and $\pi_2 = 1$.
⁹ See Kemeny and Snell (1960), pp. 79-83.
Table 1

                 λ-method                         τ-method
Parameter        Estimate   Estimate/approx.      Estimate   Estimate/approx.
                            std. dev.                        std. dev.
$\tau_1$         -          -                     0.916      5.38
$\tau_2$         -          -                     0.980      28.82
$\lambda_{10}$   0.181      2.13                  0.120      -*

Entries whose row labels were lost in extraction, in the order in which they appear: 190.09, 2.25, 4.23, 193.533, 3.15, 3.26, 2.116, -2.13, -, 0.020, -0.195, -2.50, -, 0.193, 0.247, 0.93, -0.80, 1.78, -0.14, -37.900, -0.84, -2.66, -, -2.58, 0.330, 0.057, 6.39, 0.059, 4.93, 0.033, 3.79, 0.035, 3.50, 2.72, 0.199, 2.66, 27.53, 2.04, 72.890, 0.66, 34.45, 4.85, 42.591, 2.68, 0.143, 0.538, 5.16, 0.513, 2.54, 7.62, 9.71, 0.717, 0.698, -0.016, -4.58.

* No standard deviation is reported since the optimum was achieved by scanning over alternative values of $\lambda_{10}$ and maximizing jointly with respect to the remaining parameters.

Computing these quantities in the present case yields estimates of $M$ and $S$. Thus, if we are on the demand function, it takes an average of 5.19 steps to be on the demand function again; if we are on the supply function it takes only 1.24 steps.
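As a numerical check on the first passage formulas, here is a minimal sketch for the two-state case, using $\tau_1 = 0.916$ and $\tau_2 = 0.980$ as they appear among the τ-method estimates in table 1. The helper names are illustrative. One assumption should be flagged clearly: for a two-state chain the expression given for $V$ can be verified by hand to equal the second moment of the first passage time (for instance $(2 - q)/q^2$ when leaving a state is a geometric trial with success probability $q$), so the sketch stores that expression as `W` and subtracts the squared means before taking square roots. This is the sketch's reading, not a statement of what the authors computed.

```python
import math


def mat_mul(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse_2x2(m):
    """Inverse of a nonsingular 2 x 2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def first_passage(tau1, tau2):
    """Return (limit_probs, M, W, S) for T = [[tau1, 1-tau1], [1-tau2, tau2]]."""
    T = [[tau1, 1.0 - tau1], [1.0 - tau2, tau2]]
    denom = 2.0 - tau1 - tau2
    a = [(1.0 - tau2) / denom, (1.0 - tau1) / denom]  # limit probabilities
    eye = [[1.0, 0.0], [0.0, 1.0]]
    # Z = (I - (T - A))^{-1}, where column j of A repeats a[j]
    Z = inverse_2x2([[eye[i][j] - T[i][j] + a[j] for j in range(2)]
                     for i in range(2)])
    # M = (I - Z + E Z_dg) D, i.e. m_ij = (delta_ij - z_ij + z_jj) / a_j
    M = [[(eye[i][j] - Z[i][j] + Z[j][j]) / a[j] for j in range(2)]
         for i in range(2)]
    ZM = mat_mul(Z, M)
    # M(2 Z_dg D - I) + 2(ZM - E (ZM)_dg), written element by element
    W = [[M[i][j] * (2.0 * Z[j][j] / a[j] - 1.0) + 2.0 * (ZM[i][j] - ZM[j][j])
          for j in range(2)] for i in range(2)]
    # Assumption of this sketch: subtract squared means before the square root
    S = [[math.sqrt(W[i][j] - M[i][j] ** 2) for j in range(2)]
         for i in range(2)]
    return a, M, W, S


limit_probs, M, W, S = first_passage(0.916, 0.980)
```

With these inputs the limit probability of state 1 is about 0.192 and the diagonal of `M` is about 5.2 and 1.24, in line with the 5.19 and 1.24 steps quoted in the text; the small gap in the first figure presumably reflects rounding of the tabulated τ's.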
A totally different estimate of $M$ can be obtained by utilizing the extraneous variable 'price change', as suggested by Fair and Jaffee. The application of the D-method results in estimates of $z_0$ and $\sigma^2$ which lead to $d(z_i)$-values that allocate observations with zero price change essentially to the supply function. Thus one can count directly from the time series the average number of periods that it takes to enter any regime from any regime. This yields a new estimate $M^*$ for the mean first passage matrix which is based on considerations quite independent of the τ-method.

The diagonal elements of $M$ and $M^*$ show remarkable agreement. The off-diagonal elements clearly agree less well. However, the difference between them is still well within a one-$\sigma$ range, as is shown by the elements of $S$.

On the whole, the τ-method is both computationally feasible and yields reasonable results. In the present example the results are quite similar to those of the ordinary λ-method (this is not surprising in view of the closeness of the estimated $\tau$'s to unity), but in other applications this need hardly be the case.
5. Some possible extensions

Sects. 2 and 3 contain a variety of methods for estimating switching regressions that allow for multiple switches in regimes. These methods allow basically two kinds of patterns in the switching mechanism. The essence of the first pattern is that the probability of choosing a regime is either a constant or depends only on the immediately previous state of the system (λ- and τ-methods). The second pattern permits a deterministic choice of regimes depending upon an extraneous variable (D-method). These two patterns may be combined, yielding a probabilistic choice between regimes which depends on an extraneous variable (as in the λ(z) and τ(z) methods).

None of these methods allows the choice of a regime to depend on the historical record of regimes, in the sense that the current selection of regimes is influenced neither by the number of times a given regime has prevailed nor by the temporal pattern of regime choices. There are many economic applications where such inertial dependence might be desirable. For example, in the application discussed in sect. 4, periods of excess demand (or supply) may have a tendency to persist. One might thus like a model which allowed the probability of entering a period of excess demand to tend to rise if excess demand has prevailed in the past.¹⁰ One of several possible ways of accomplishing this is to specify $\lambda_i$ (the probability of choosing the first regime) as

$$\lambda_i = \rho(1 - \delta_{i-1}) + (1 - \rho)\lambda_{i-1}, \tag{20}$$

where $0 < \rho < 1$ and where $\delta_{i-1} = 0$ if in the previous period the first regime prevailed and $\delta_{i-1} = 1$ otherwise. For $\rho = 0$, the procedure implied by (20) is simply the λ-method. For $\rho = 1$, the system perpetually remains in the regime in which it was in the initial period. For intermediate values, the higher the value of $\rho$ the more $\lambda_i$ tends to reflect predominantly the state of the system in the previous period.

As a matter of practical implementation, it is clear from (20) that the $\lambda_i$ can take on $2^{i-1}$ different values, each expressible in terms of $\lambda_0$, $\delta_0$ and $\rho$. The probabilities corresponding to these values can be computed as functions of these same parameters. The values of the $\lambda_i$ and their probabilities can then be combined in the usual manner to form the likelihood function, which is then expressed entirely in terms of the $\beta$'s, $\sigma$'s and $\lambda_0$, $\delta_0$ and $\rho$.

Another extension which allows switching between states that involves temporal persistence may be specified. This extension has the feature that in some periods the system obeys regime 1, in some others regime 2, and in still some others it obeys a transitional hybrid between the two regimes. Let $D$ be a diagonal matrix as in (5) but do not require its elements $d_i$ to be 0 or 1. Of course if $d_i$ is 0 or 1 we are entirely in one or the other regime; otherwise nature is assumed to generate $y_i$ from a hybrid regime. Let $\gamma(z_i)$ be defined by

$$\gamma(z_i) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{z_i} \exp\left[ -\frac{1}{2} \left( \frac{\xi - z_0}{\sigma} \right)^2 \right] d\xi.$$

We now let

$$d_i = \rho d_{i-1} + (1 - \rho)\gamma(z_i), \tag{21}$$

where $0 \le \rho < 1$ and with the initial condition $d_0 = 1$. It follows from (21) and the initial condition that

$$d_i = \rho^i + (1 - \rho) \sum_{j=1}^{i} \rho^{i-j} \gamma(z_j). \tag{22}$$

Replacing the elements of $D$ in (5) by (22), the likelihood function is expressed as a function of the unknown $\rho$ and of the unknown $z_0$ and $\sigma$. In the extreme case when $\rho = 0$ there is no temporal persistence in the switching process and the degree of hybridization depends only on $z$. Otherwise, the degree of hybridization depends on the value of $\rho$.

¹⁰ This could, of course, come about for τ(z), λ(z) or D-methods if the $z$-variable behaved appropriately, but it is obviously desirable to allow for this possibility more generally.
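The hybrid weights can be generated either by iterating (21) or from the unrolled sum that follows from (21) with $d_0 = 1$. The sketch below does both so that the two can be compared. It is illustrative only: the function names are hypothetical, and $\gamma$ is computed from the error function in Python's `math` module as the normal distribution function with mean $z_0$ and standard deviation $\sigma$.

```python
import math


def gamma(z, z0, sigma):
    """Normal distribution function with mean z0 and std. dev. sigma, at z."""
    return 0.5 * (1.0 + math.erf((z - z0) / (sigma * math.sqrt(2.0))))


def hybrid_weights(z, rho, z0, sigma, d0=1.0):
    """Iterate d_i = rho * d_{i-1} + (1 - rho) * gamma(z_i) for i = 1..n."""
    weights = []
    prev = d0
    for z_i in z:
        prev = rho * prev + (1.0 - rho) * gamma(z_i, z0, sigma)
        weights.append(prev)
    return weights


def hybrid_weight_closed_form(z, i, rho, z0, sigma):
    """d_i = rho^i + (1 - rho) * sum_{j=1..i} rho^(i-j) * gamma(z_j), d_0 = 1."""
    tail = sum(rho ** (i - j) * gamma(z[j - 1], z0, sigma)
               for j in range(1, i + 1))
    return rho ** i + (1.0 - rho) * tail
```

For $\rho = 0$ both routes return $d_i = \gamma(z_i)$, the case without temporal persistence; values of $\rho$ closer to 1 make $d_i$ decay more slowly from its initial value of 1.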
References

Brown, R.L. and J. Durbin, 1968, Methods of investigating whether a regression relationship is constant over time, Paper presented at the European Statistical Meeting, Amsterdam.
Chow, G., 1960, Tests of the equality between two sets of coefficients in two linear regressions, Econometrica 28, 591-605.
Fair, R.C. and D.M. Jaffee, 1973, Methods of estimation for markets in disequilibrium, Econometrica, forthcoming.
Fair, R.C. and H.H. Kelejian, 1972, Methods of estimation for markets in disequilibrium: A further study, Econometric Research Program, Research Memo. No. 135, Princeton University, Jan. 1972.
Farley, J.U. and M.J. Hinich, 1970, A test for a shifting slope coefficient in a linear model, Journal of the American Statistical Association 65, 1320-1329.
Goldfeld, S.M., H.H. Kelejian and R.E. Quandt, 1971, Least squares and maximum likelihood estimation of switching regressions, Econometric Research Program, Research Memo. No. 130, Princeton University, November 1971.
Goldfeld, S.M. and R.E. Quandt, 1972, Nonlinear Methods in Econometrics (North-Holland Publishing Co., Amsterdam).
Kemeny, J.G. and J.L. Snell, 1960, Finite Markov Chains (D. Van Nostrand, Inc.).
Kendall, M.G. and A. Stuart, 1961, The Advanced Theory of Statistics, Vol. II (Hafner, New York).
Quandt, R.E., 1958, The estimation of the parameters of a linear regression system obeying two separate regimes, Journal of the American Statistical Association 53, 873-880.
Quandt, R.E., 1960, Tests of the hypothesis that a linear regression system obeys two separate regimes, Journal of the American Statistical Association 55, 324-330.
Quandt, R.E., 1972, A new approach to estimating switching regressions, Journal of the American Statistical Association 67, 306-310.