An aggregate time-series analysis of urban transit demand: The Montreal case

Transpn Res. Vol. 9. pp. 249-258.

Pergamon Press 1975.

Printed in Great Britain

AN AGGREGATE TIME-SERIES ANALYSIS OF URBAN TRANSIT DEMAND: THE MONTREAL CASE

MARC GAUDRY
Centre de Recherche sur les Transports, Université de Montréal, Case Postale 6128, Station A, Montréal, Québec H3C 3J7, Canada

(Received 20 September 1974)

Abstract: This paper shows how readily available monthly time-series data may be used to explain the aggregate demand for public transit in particular urban areas in terms of the prices of public and private transportation, the price of non-transportation goods, service characteristics of the competing modes, comfort levels, income and socio-economic variables, etc. Parameter values pertain to the adult market of the Montreal Urban Community Transit Commission over the period December 1956 to December 1971. Estimates are obtained by using linear regression techniques in conjunction with the Box-Jenkins procedures for the specification of the Rth-order autoregressive process of the error terms.

THE DEARTH OF TIME-SERIES STUDIES OF TRAVEL DEMAND

The general travel literature contains relatively few time-series analyses. Fisher (1962) has examined the survival of the passenger train between Boston and New York; Bieber and Aurignac (1971) and Meyer and Straszheim (1971) have reported on a small number of airline demand studies, while Steilberg (1973) has studied intercity travel by bus and rail in the Netherlands. Except for Steilberg, who used quarterly data, analyses are based on annual data and none uses more than 4 or 5 explanatory variables. In the strictly urban travel literature, the neglect of time-series analyses contrasts with the abundance of cross-sectional analyses and suggests the presence of a conspiracy. Indeed, when one consults surveys of a methodological nature, such as Schmidt and Campbell (1956), Martin, Memmott and Bone (1965) or Quandt (1970), the absence of time-series analysis is obvious.

INTRODUCTION

The purpose of this paper is to formulate an inexpensive approach to urban transit demand estimation which can be of some use in planning the level of public transit services. This approach consists in estimating the aggregate demand for public transit over a metropolitan area as a whole, or over the greatest portion thereof, from monthly time-series data which are typically at the disposal of public transit authorities or which can be easily obtained from existing government agencies in most North American cities. The procedure will be tailored to the case of the Montreal Urban Community Transit Commission (henceforth MUCTC) which, as can be seen in Table 1 from data corresponding to the beginning and end of our sample period, operates a reasonably large system. In order to shorten the paper, the results presented will concern the adult market: the interesting differences with the schoolchildren demand function will be left to a later date.

Table 1. Some data concerning the MUCTC system and the area served

                                                          1956            1971
MUCTC network length (miles)                               540          1,080a
MUCTC annual vehicle mileage                       49,236,901b      64,498,802
MUCTC average waiting time (min.)                          2.4             4.8
MUCTC average (commercial) speed (m.p.h.)                 10.4            12.8
MUCTC main adult fare                                   $0.125          $0.300
Urbanized area served (acres)                           46,000          ?1,000
Revenue passengers                                295,442,624c     264,212,887
Population of area served                            1,482,600       1,918,700
Population served as % of metropolitan population           85              70
Car ownership in area served                           172,365         476,303

The same is true of more descriptive and state-of-the-problem surveys such as Meyer, Kain and Wohl (1965), Meyer and Straszheim (1971) and Brand and Manheim (1973). Whether they be mode-specific or mode-abstract, subject to sequential or “direct” estimation and whatever the level of data aggregation (individual, zonal, etc.) used, urban travel demand models are estimated from cross-sectional data; there are even cases of demand functions “estimated” from trips never actually taken, such as Ben-Akiva (1973), where the “data” are cross-sectional, the ultimate proof of the existence of the conspiracy!

Yet, as stated by Quandt (1970), time-series models are not in principle more difficult to estimate than cross-sectional models; they also have a relative advantage in predicting the short-run effect of changes in independent variables and, contrary to cross-sectional models which must explain the levels of the dependent variable, can be formulated in such a way as to explain the changes in that variable: such a procedure removes the influence of long-run factors and obviates the formulation of long-run theories of economic behavior. Of course, available time-series data are rarely at a sufficient level of disaggregation to address many of the micro-level questions which transit operators have to answer.

Still, some specific studies use a limited and rather heuristic time-series approach in looking at “trends”: visual graph analysis (using perhaps a moving average) is common in financial statements of transit authorities and yearly statistics of traffic departments of many cities. A variation of this simple form of time-series analysis also frequently occurs: it consists in examining the short-run effect of a particular change in the value of one of the variables pertaining to a given market situation; a case in point is that of a New York City transit fare increase as it is explained by Lassow (1968). Analogous and sometimes more complex procedures have been used by Rainville (1948), Carsten and Csanyi (1968) and more recently by Boyd and Nelson (1973), Kemp (1973), Martin et al. (1973), Smidt (1973) and Smith and McIntosh (1973). Despite these promising beginnings, the overwhelming majority of papers have not made sufficient use of existing longitudinal data sets, a state of affairs which the simple approach taken here hopes to remedy.

THE MODEL

A. The demand function

Let us write the aggregate demand function for public transit trips during period t, $X_t^d$, as

\[ X_t^d = D(P_t, T_t, C_t, Y_t, A_t), \tag{1} \]

where P is the vector of prices of trips by the various modes and of non-transportation goods, T and C are vectors of time and comfort characteristics of relevant modes, Y denotes income and A is a vector of levels of activities whose performance requires passenger transportation. That function is similar to that proposed by Ruiter (1973). It does not depart from conventional practice and can be classified as a “direct” demand function because it explains trip generation and modal split simultaneously.

The presence of P, T, C and Y as arguments of the function is consonant with a Lancaster-like (1966) maximization process where consumers’ demand functions concern a set of activity levels (transportation, work, school, shopping, etc.) chosen for their utility-yielding characteristics. The presence of some of the quantities demanded, A, as arguments in the demand function for another, X, is not a perversion of demand theory: in this partial equilibrium analysis, A represents exogenously given information on levels of activities presumed to be jointly demanded with transport activities and to that extent highly correlated with them. The vector A would be out of place if trips were made without a purpose and for their own sake.

Demand functions should ideally be estimated for each subgroup of individuals of similar tastes (or socioeconomic characteristics) and for each trip purpose (or given sort of activity $A_i$). Our aggregate data on a single institutional mode impose an aggregate single-mode specification incorporating all $A_1, \ldots, A_n$ activities and all individuals. Do we simply collect data and estimate the function? No, because a number of conditions have to be realized for the estimation to be possible.

B. Market equilibrium

The number of trips actually taken on the MUCTC network is the result of the simultaneous actions of consumers and of the MUCTC. If the number of people who want to travel is larger than the capacity of the network to carry them, excess demand prevails: some travellers are left “in the street” and the number of trips observed during that period will not correspond to the desired number of trips which the demand function is meant to estimate. Another implication of excess demand is that waiting time for MUCTC vehicles cannot be computed from operated mileage, because the queuing which accompanies excess demand is tantamount to an increase in waiting time which will not appear in data on vehicle operations.

The frequency of excess demand depends on the MUCTC’s supply behavior. In Montreal, excess capacity exists on both the MUCTC’s profitable and unprofitable lines because it is possible to achieve desired levels of profitability by line without filling the vehicles. In consequence we write that, on average,

\[ X_t^s \geq X_t^d. \tag{2} \]

This statement, that the quantity of trips supplied is on the average greater than or equal to the quantity demanded, is crucial to demand estimation in transportation markets where quantities supplied and demanded are not brought to equality by price (fare) variations, as is deemed to occur in most markets of the economy. In Toronto, for instance, some transit authorities believe that, since 1972, excess demand occurs frequently during the morning peak hours. A similar situation in our case would mean that observed trip totals are on the supply curve but not on the demand curve: this would complicate the task of estimating the demand function!
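The role of condition (2) can be illustrated with a minimal sketch. It assumes, as the text suggests, that observed trips can never exceed capacity, so that an observation lies on the demand function only when supply is at least as large as demand. The function names and the monthly figures are hypothetical and serve only as an illustration.

```python
def observed_trips(demand, supply):
    """Trips actually observed: nobody can ride beyond available capacity."""
    return min(demand, supply)


def on_demand_curve(demand, supply):
    """True when condition (2) holds, so the observation equals desired trips."""
    return supply >= demand


# Hypothetical monthly figures (millions of trips), for illustration only.
periods = [
    {"demand": 22.0, "supply": 30.0},  # excess capacity: observed = demand
    {"demand": 31.0, "supply": 30.0},  # excess demand: observed = capacity
]

for p in periods:
    x = observed_trips(p["demand"], p["supply"])
    print(x, on_demand_curve(p["demand"], p["supply"]))
```

In the second hypothetical period the observed total understates desired trips, which is the case the Montreal data are argued to avoid.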


C. Is the demand function identified?

All observed points are on the demand function. But they are also on the supply function, so if the supplier’s behavior depends on the same variables consumers’ demand behavior depends on, how are we to know that parameter estimates of a function relating trips to those variables will yield estimates of the demand function parameters rather than estimates of the supply function parameters, or even a mixture of the two? From extensive conversation with MUCTC officials, it has been possible to determine that, over our sample period, the MUCTC’s short-run supply behavior has consisted in adjusting the capacity of the network during any period t in relation to four main factors, the first one of which determined relative supply levels among periods while the last three constrained absolute supply levels:

(i) observed trip levels twelve months earlier: recommendations for supply level changes are typically made for the corresponding period of the following year;
(ii) the accumulated deficit at the time of the budget;
(iii) the expected deficit at the time of the budget: that deficit has been computed by using expected unit costs to compute total costs and by using, to forecast revenues, the assumption that passenger levels in the coming year would be the same as they were in the ending year (sometimes accounting for trend) and the assumption of an elasticity of demand with respect to price and waiting time smaller than unity;
(iv) the willingness of local authorities to subsidize public transit.

Because absolute supply levels are in general set and fixed once a year at budget time, the quantity supplied during any period t depends primarily on past values of the dependent variable and of some of the variables present in the demand function, on expected costs and on other factors. A fair mathematical representation of the supply decision process might be

\[ X_t^s = S(X_{t-12}, P_t, P_{t-n}, T_t, T_{t-n}, C_{t-n}, A_{t-n}, Y_{t-n}, \text{OTHER}), \tag{3} \]

where $n \geq 3$ because of the position of the budget month each year and where $X_{t-12}$ denotes the observed number of passengers twelve months earlier. As a result, the quantity supplied is not closely tied to expected demand on a day-to-day or week-to-week basis: demand and supply functions shift independently, or nearly so. These independent shifts of the supply function make it possible to identify the demand function.

D. The speed of adjustment

The system we are interested in consists in a demand (1), a market clearing (2) and a supply equation (3). We shall assume that changes in the values of the dependent variables occur within the period t over which observations are defined.

MORE DETAILS ON THE SPECIFICATION OF THE DEMAND EQUATION

There are many ways of specifying equation (1): any change in the list of variables or any combination among them is in effect a different specification. Let us briefly discuss three different classes of specification points pertaining to our problem.

A. The treatment of distance

A classification of urban travel demand models can be made according to their presentation of distance. To see this, let us consider the elements of the vector of time characteristics of the various m modes:

\[ T = (TW_m, TTDC_m, TF_m), \quad m = 1, 2, 3, \tag{4} \]

where the arguments successively refer to waiting time, in-transit time and time on foot associated with a trip by mode m. Neglecting $TW_m$ and $TF_m$ for the sake of presentation, in-transit time consists in $TTDC_m = D/V_m$, namely in the ratio of distance (among activities associated with travel) to velocity of the mth mode. A first type of demand problem formulation, which we shall call the economic formulation EF, incorporates the vector T as defined in (4) or modifies it slightly if the effect of changes in speed are to be distinguished from those of changes in distance:

\[ T = \left( TW_m,\; TT_m = \frac{K}{V_m},\; D,\; TF_m \right), \quad K \text{ any constant}. \tag{4'} \]

Under the economic approach and dropping time subscripts, demand equation (1) can be:

\[ X^d = D(P, (TW_m, TTDC_m, TF_m), C, Y, A). \tag{EF} \]

Completely different from EF is what we may call the physical formulation PF, the underlying intention of which is to consider the “true” dimension of travel X and activities A: because they occur over a certain area AR, these variables are densities and equation (1) is written

\[ \frac{X^d}{AR} = D\!\left(P, (TW_m, TT_m, TF_m), C, Y, \frac{A}{AR}\right). \tag{PF} \]

The crucial difference between those polar cases is that under PF it is not possible to distinguish between the impact of changes in the level of activities A and the impact of changes in the distance (a function of AR) among them. An example of the PF approach is given in a U.S. Dept. of Transportation/Federal Highway Administration publication (1967); examples of the EF formulation can be found in Sharpe et al. (1958) and in Nakkash and Greco (1972). Most studies consist in a mixture of the two approaches and typically explain the level of trips observed (or “generated”, typically among pairs of zones) by activity densities (which incorporate distance) and sometimes by an explicit distance factor as well. The rationale for this is not clear: it may be an intuitive (but incorrect) way of controlling for heteroskedasticity of the disturbances, a frequent problem in cross-sectional studies. We shall adopt the EF approach because we think that consumers make decisions about trips rather than about trip densities.
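A small numerical sketch may help to see why the PF formulation cannot separate activity levels from distance. It assumes the definitions given in the text (in-transit time as distance over speed, activity expressed as a density over the area served); the function names and all figures are hypothetical.

```python
import math


def in_transit_time(distance_miles, speed_mph):
    """TTDC = D / V, in hours: the EF treatment of distance."""
    return distance_miles / speed_mph


def activity_density(activity_level, area_acres):
    """A / AR: the activity regressor under the physical formulation (PF)."""
    return activity_level / area_acres


# Hypothetical city before and after it doubles both its jobs and its area.
before = {"jobs": 500_000, "area": 46_000, "distance": 5.0, "speed": 10.0}
after = {"jobs": 1_000_000, "area": 92_000, "distance": 7.0, "speed": 10.0}

# PF regressor: identical in both situations, so the change is invisible.
pf_before = activity_density(before["jobs"], before["area"])
pf_after = activity_density(after["jobs"], after["area"])
print(math.isclose(pf_before, pf_after))

# EF regressors: the activity level and the in-transit time both move.
print(before["jobs"], after["jobs"])
print(in_transit_time(before["distance"], before["speed"]),
      in_transit_time(after["distance"], after["speed"]))
```

Under these assumed figures the density is unchanged although both the number of jobs and the typical trip length have grown, which is the confounding the text attributes to PF.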


B. Variables specific to time-series models

From a single cross-section of travel behavior, it would be impossible to test for the significance of lags in adjustments to changes in some of the variables (the fare, waiting time, etc.) and very difficult to consider simultaneously the impact of many weather-related comfort variables (we included four: temperature, rainfall, snowfall, cumulated snowfall). Some variables, such as the price of non-transportation goods, are easier to incorporate in a time-series study.

C. Variables specific to Montreal

Some of the variables we have included among the regressors are specific to the local situation. Expo variables represent fairs held every summer since 1967; transit workers’ strike variables are self-explanatory and some dummy variables account for particular accounting procedures such as the double counting of passengers using more than one fare zone. Such adjustments are unusual in urban transportation studies but they are familiar in macroeconomic model building.

ESTIMATION PROBLEMS

Although it would have been appropriate to formulate only the demand equation, an analysis of the full model was needed to assure us that such a function is identified. Similarly, estimation of the parameters of that demand equation requires explicit consideration of the system in which it is embedded if we are to have a good idea of the properties of our parameter estimates. Let us then rewrite our system under the assumption that a linear relationship holds among the variables of the demand and supply functions; using conventional econometric symbols, equations (1), (3) and (2) become



where TW is waiting time for MUCTC vehicles, $\beta_k$ and $\gamma_k$ are parameters, $u_t$ and $w_t$ are disturbance terms and f is a factor which transforms seat-miles supplied into service frequency: this transformation expresses the fact that, over most of our sample period, service frequency has been the simple consequence of changes in capacity and has in any case been a function of $y_{t-12}$ and of the $z$’s. What problems is the estimation of equation (5) likely to pose?

A. Interdependence between regressors and error term u

A first source of problems could arise from interdependence between regressors and the error term, specifically from the apparent determination of $TW_t$ by the system: were $TW_t$ so determined, it would introduce a simultaneous equations bias in the estimation of the parameters of the demand function. Fortunately, as argued in the previous section, $TW_t$ is set and fixed once a year, at budget time, and is consequently not the source of simultaneous equations bias. Neither does such bias arise from any of the other $x_t$ variables, like the fare, which are typically simultaneously determined in economic systems but which are fixed or predetermined in our problem.

But interdependence between $TW_t$ and $u_t$ could occur through another channel: $TW_t$ depends on $y_{t-12}$ and consequently on $u_{t-12}$; if, as argued in the following sub-section, $u_t$ and $u_{t-12}$ are correlated, then $TW_t$ and $u_t$ are contemporaneously correlated and least-squares estimates of the $\beta_k$’s will be biased and inconsistent (i.e. a bias persists for infinitely large samples). We may control for autocorrelation between $u_t$ and $u_{t-12}$ but we cannot reestablish independence between $TW_t$ and $u_{t-12}$: we are implicitly considering a lagged dependent variable model (it is as if $y_{t-12}$ were one of the regressors of the demand equation) and least-squares estimates will be biased even if they regain the asymptotic properties of consistency and efficiency when autocorrelation between $u_t$ and $u_{t-12}$ is controlled for; it is not possible to ascertain the importance of this bias. Neither is it easy to say much about a similar bias due to errors of measurement of variables, except that it is likely to occur in a situation such as ours where many explanatory variables are proxies and estimates.

B. Nonsphericalness of disturbances

Another possible source of difficulty in the estimation of linear equations is nonsphericalness of the distribution of the variance-covariance matrix of the error terms. If it is nonspherical, one has to find what shape it has and how to make it spherical: a combination of a priori considerations and specific tests will give us a fair idea of how to do that. In our problem, autocorrelation of the residuals, rather than heteroskedasticity, is expected to be a serious problem because, due to the absence of data, at least one important variable is absent from the list of regressors: even though our list accounts for the most important trip generating activities (work, school, shopping), it neglects trips made for other purposes. If we call that residual category of trips “visiting” (it accounted for approximately 10% of urban trips in the area served by the MUCTC in 1970), we may say that the absence of visiting activity from our regressor list is likely to introduce first and twelfth order serial correlation, because it is likely to be correlated with its own value during the previous period and, with monthly data, during the corresponding period of the previous year as well, due to the seasonality of such an activity. We also know that the absence of other relevant explanatory variables (parking prices, comfort levels of transit vehicles, etc.) would also be likely to cause interdependence among the residuals and bias the coefficient estimates. We shall therefore assume that the residuals are determined by an Rth-order autoregressive process

\[ u_t = \sum_{r=1}^{R} \rho_r u_{t-r} + e_t, \tag{8} \]

where the $e_t$’s are independent and identically distributed with zero mean and constant variance. As pointed out in the Appendix, where a Box-Jenkins approach is used to identify the significant $\rho$’s, equation (8) is a fair approximation of what is in effect a mixed autoregressive-moving average process.
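To make the assumed error structure concrete, the following sketch simulates an autoregressive process with non-zero coefficients at lags 1 and 12 only and computes sample autocorrelations of the kind inspected in a Box-Jenkins identification step. The coefficient values, the sample length and the function names are assumptions chosen for illustration; they are not estimates from the paper.

```python
import numpy as np


def simulate_ar_errors(rho, n, rng, burn_in=200):
    """Simulate u_t = sum_r rho[r] * u_{t-r} + e_t with e_t iid N(0, 1).

    `rho` maps each lag r to its coefficient, e.g. {1: 0.4, 12: 0.3}.
    """
    max_lag = max(rho)
    total = n + burn_in + max_lag
    e = rng.standard_normal(total)
    u = np.zeros(total)
    for t in range(max_lag, total):
        u[t] = sum(c * u[t - r] for r, c in rho.items()) + e[t]
    return u[-n:]  # drop the burn-in so start-up zeros do not matter


def sample_autocorr(u, lag):
    """Sample autocorrelation of the series u at the given lag."""
    u = u - u.mean()
    return float(np.dot(u[lag:], u[:-lag]) / np.dot(u, u))


rng = np.random.default_rng(0)
u = simulate_ar_errors({1: 0.4, 12: 0.3}, n=5000, rng=rng)
for lag in (1, 6, 12):
    print(lag, round(sample_autocorr(u, lag), 2))
```

With monthly data, a pronounced autocorrelation at lag 12 alongside lag 1 is the pattern the omitted seasonal "visiting" activity would be expected to leave in the residuals.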

ESTIMATION PROCEDURE ADOPTED

A particular autoregressive process described by equation (8) is combined with demand equation (5) to yield the following complete demand equation

\[ y_t - \sum_r \rho_r y_{t-r} = \sum_k \beta_k \Bigl( x_{kt} - \sum_r \rho_r x_{k,t-r} \Bigr) + e_t, \]

which is often written

\[ y_t^* = \sum_k \beta_k x_{kt}^* + e_t, \tag{11} \]

where starred values $y_t^*$ and $x_{kt}^*$ respectively denote $y_t - \sum_r \rho_r y_{t-r}$ and $x_{kt} - \sum_r \rho_r x_{k,t-r}$.

Equation (11) is typically called a linear regression model on the transformed variables $y^*$ and $x^*$ or, in short, “the transformed problem”. It is of course a non-linear equation in the coefficients $\beta_k$ and $\rho_r$. We will solve this equation by using the technique developed by Cochrane and Orcutt (1949) and generalized by Schmidt (1968) (see Pierce, 1971), which converges to at least a local minimum. This procedure dominates Durbin’s (1960) two-step procedure because it can be shown to converge to a maximum likelihood estimate of the parameters (Fair, 1971).
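The iterative idea behind a Cochrane-Orcutt type procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used for the paper: it assumes autoregressive coefficients at lags 1 and 12 only, alternates between least squares on the transformed problem (11) and least squares of the residuals on their own lags, and is run on synthetic data. The function names, tolerances and simulated values are all hypothetical.

```python
import numpy as np


def quasi_difference(z, rho, lags):
    """Return z*_t = z_t - sum_r rho_r z_{t-r}; the first max(lags) rows are lost."""
    m = max(lags)
    out = z[m:].astype(float)
    for r, p in zip(lags, rho):
        out -= p * z[m - r: len(z) - r]
    return out


def cochrane_orcutt(y, X, lags=(1, 12), tol=1e-8, max_iter=200):
    """Alternate between OLS for beta (given rho) and OLS for rho (given beta)."""
    rho = np.zeros(len(lags))
    m = max(lags)
    for _ in range(max_iter):
        # Step 1: regress y* on X* (the transformed problem) to update beta.
        beta, *_ = np.linalg.lstsq(quasi_difference(X, rho, lags),
                                   quasi_difference(y, rho, lags), rcond=None)
        # Step 2: regress residuals u_t on their own lagged values to update rho.
        u = y - X @ beta
        U = np.column_stack([u[m - r: len(u) - r] for r in lags])
        new_rho, *_ = np.linalg.lstsq(U, u[m:], rcond=None)
        converged = np.max(np.abs(new_rho - rho)) < tol
        rho = new_rho
        if converged:
            break
    return beta, rho


# Synthetic data with known coefficients, for illustration only.
rng = np.random.default_rng(1)
n = 600
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
e = rng.standard_normal(n)
u = np.zeros(n)
for t in range(12, n):
    u[t] = 0.5 * u[t - 1] + 0.3 * u[t - 12] + e[t]
y = X @ np.array([2.0, -1.5]) + u

beta_hat, rho_hat = cochrane_orcutt(y, X)
print(beta_hat, rho_hat)
```

The alternation stops when the autoregressive coefficients no longer change; as the text notes for the procedure it describes, such a scheme is only guaranteed to reach a local minimum, so different starting values for `rho` may be worth trying.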

Table 2. Adult market coefficient estimates, by equation code [coefficient values not recoverable; surviving row labels: P: MUCTC real fare (FAD), number of cars (NC), consumer price index (P); Y: real average earnings; T: MUCTC wait-time, MUCTC transit-time, car trip time]

GENERAL COMMENTS ON RESULTS

Table 3. Adult market t-statistics [table not reproduced]

Tables 2 and 3 present parameter estimates and various statistics pertaining to three different equations. The first one, 53-C, is our basic equation; the others show the impact on equation 53-C of dropping the income variable RAWEM (54-C) and of adding the price of non-transportation goods P to the list of regressors (55-C). Some general comments are in order:

(i) all variables (24 to 26 depending on the equation considered) have expected signs and levels of statistical significance;

(ii) first and twelfth order serial correlation coefficients are significantly different from zero: as could be expected from considerations given in the Appendix, an attempt to test for the presence of third and fourth order correlation using our basic equation yielded statistically insignificant values for these coefficients and made little difference to the other parameter values. Examination of the autocorrelation and partial autocorrelation functions of the errors of the transformed problem (11) revealed that no significant autocorrelation remained in any of the three equations. This means that t-statistics are not biased upwards;

(iii) in fact, the presence of multicollinearity would lower the value of t-statistics. Should we measure the degree of multicollinearity? Not only are “conventional methods of testing for it not very precise and exceedingly tedious to administer”, as Kane (1968) has pointed out, but practical solutions to the problem may not be profitable: with 181 observations, an increase in sample size yields diminishing returns, and deleting variables which have a place in the equation on grounds of specification loses information (on the sign of the deleted variable perhaps) without easing the task of interpreting the remaining coefficients; it may even introduce biases which were not present before! It seems altogether wiser to proceed, remembering that there would be no point in using multiple regression techniques for the estimation of the coefficients if all regressors were orthogonal. In our problem, the matrix of simple correlations among the variables of the transformed problem 53-C, which is often taken as a rough indicator of the importance of the problem, shows that, among 300 distinct pairs of regressors, 11 have pairwise correlation coefficients greater than 0.50 and only 4 have coefficients larger than 0.60 (0.69, 0.88, 0.92, 0.95). Let us note that dropping the variable RAWEM [it is correlated with NC (0.88) and T2TDC (0.92)] from equation 53-C hardly affects the results (54-C) and loses a crucial indication of whether public transit is an inferior good;

(iv) the sample of 181 observations from the period December 1956 to December 1971 could be extended: detailed procedures on the construction of the series are given in Gaudry (1973).
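The rough multicollinearity screen described in comment (iii), counting regressor pairs whose simple correlation exceeds a threshold, can be sketched as below. The data are synthetic and the function name is hypothetical; only the dimensions (181 observations, 25 regressors, hence 300 distinct pairs) echo the figures quoted in the text.

```python
import numpy as np


def count_high_correlations(X, thresholds=(0.50, 0.60)):
    """Count distinct regressor pairs whose |correlation| exceeds each threshold."""
    corr = np.corrcoef(X, rowvar=False)
    # Keep each pair once: strictly upper triangle of the correlation matrix.
    upper = np.abs(corr[np.triu_indices_from(corr, k=1)])
    return len(upper), {th: int(np.sum(upper > th)) for th in thresholds}


# Synthetic regressor matrix, for illustration only.
rng = np.random.default_rng(0)
X = rng.standard_normal((181, 25))
X[:, 1] = X[:, 0] + 0.3 * rng.standard_normal(181)  # one deliberately collinear pair

n_pairs, counts = count_high_correlations(X)
print(n_pairs, counts)
```

Such a count is only the "rough indicator" the text mentions: it looks at pairs of regressors and cannot detect near-linear dependence involving three or more variables at once.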

COMMENTS ON SPECIFIC VARIABLES

Let us examine the parameter estimates of equation 53-C (unless stated otherwise) by grouping them in simple categories.

A. Price and income variables

If we define the elasticity of demand with respect to a variable as minus the percentage change in the quantity of trips demanded divided by the percentage change in that variable, we obtain (calculated at the mean of the observations):

price (fare) elasticity of demand: 0.15 (a)
car index cross-elasticity of demand: 0.10 (b)
other goods price (equation 55-C) cross-elasticity of demand: 0.16 (c)
income elasticity of demand: -0.08 (d)

and, apart from the reasonable relative orders of magnitude of these elasticities, we note the following: (i) the real fare (the nominal fare deflated by the consumer price index) is used as regressor; the result implies that a profit-maximizing firm would have raised the fare. Substitution of the nominal fare for the real fare variable yielded slightly poorer results and suggested that the public does not suffer from money illusion; (ii) the number of cars in the territory served by the MUCTC, NC, is interpreted as an index of the price of car trips in the sense that increased car ownership can be thought of as reducing the per mile cost of car trips to the new car owners and consequently as decreasing the cost of car trips on average in that territory; (iii) the negative sign of P implies that the income effect is larger than the substitution effect associated with changes in the general price level; (iv) public transit is not an inferior good but the income elasticity of demand is small.

B. Time variables

Analogous computations of elasticity with respect to time characteristics yield:

waiting time elasticity of demand: 0.54 (e)
in-transit time elasticity of demand: 0.27 (f)
car in-transit time cross-elasticity of demand: -0.42 (g)
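For a linear demand equation, elasticities "at the mean" and fare-equivalent values of time follow directly from the regression coefficients. The sketch below assumes the sign convention stated above (minus the percentage change in trips over the percentage change in the variable) and defines the value of time as the fare change that would cost as many passengers as one extra hour. The coefficients and sample means are hypothetical round numbers, not the paper's estimates.

```python
import math


def elasticity_at_mean(coef, x_mean, y_mean):
    """Minus (dy/dx) * (x_mean / y_mean), following the sign convention in the text."""
    return -coef * x_mean / y_mean


def value_of_time(time_coef, fare_coef):
    """Fare change (dollars) producing the same passenger loss as one extra hour."""
    return time_coef / fare_coef


# Hypothetical coefficients and means, for illustration only.
mean_trips = 20_000_000   # trips per month
mean_fare = 0.25          # dollars per trip
fare_coef = -10_000_000   # change in monthly trips per extra dollar of fare
wait_coef = -40_000_000   # change in monthly trips per extra hour of waiting

print(elasticity_at_mean(fare_coef, mean_fare, mean_trips))
print(value_of_time(wait_coef, fare_coef))
```

Because both coefficients enter the ratio, the implied value of time does not depend on the level of trips, only on how strongly passengers respond to time relative to money.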


A comparison between time and price elasticities shows that people are more sensitive to time than they are to money, a result which is hardly surprising; moreover, the relative orders of magnitude of time and price elasticities of public transit variables are similar to those obtained in cross-sectional studies elsewhere, as in Boston (Domencich and Kraft, 1970). It is not clear whether (g) should be expected to be higher or lower than (f), even if it is reasonable to find that it is lower than (e), the highest of the price and time elasticities. These measures of the public’s sensitivity are confirmed if we compute the implicit marginal trade-offs between time and money by asking what change in the fare would have produced the same loss of passengers as a change of one hour in a pertinent time variable. We find $9.02 per hour of waiting time, $0.70 per hour of in-transit time and $2.32 per hour of car in-transit time. These rates compare with a real average hourly wage rate of approximately $2.50 per hour over the sample period.

C. Comfort variables

Because the variables used for employment (EANFP), college and university (RCUTP) and shopping (SHONF) activities are constructed in such a way as to incorporate changes in the level of these activities over time, the comfort variables reflect changes in the modal split more than changes in activity levels. We have computed that, on average, passenger losses due to temperature loss (TLK: the absolute value of the difference between desired and endured temperatures), rainfall (RFK) and new snowfall (SFK) added up to 246,177 passengers per month, while the gains imputable to cumulated snowfall (CSFMR: old + new snow) amounted to 207,724 passengers per month: bad weather on average caused a net loss of passengers to the MUCTC. It is interesting to note that snow had two distinct and opposite effects.

D. Activity levels

All activity level variables have expected levels of statistical significance. In previous specifications (Gaudry, 1973, where detailed procedures for transforming raw data into analytical variables are given), the impact of vacations (workers’, students’) was distinguished from the total impact of the employment and schooling variables, a procedure which is feasible only with time-series studies; different specifications of the employment variable were also tried successfully: the use of a presence-at-work index which excluded absenteeism due to bad weather increased the significance of weather variables, and the use of an index which accounted for all absenteeism lowered the significance of the weather variables. Employment, schooling and shopping series are constructed in such a way as to take into account the probability that both the origins and the destinations of trips taken for these purposes be in the territory served by the MUCTC: no attempt was made to distinguish between the effects of independent changes in these probabilities. The results which pertain to transitory activities conform to expectations: the various fairs (Expo variables) are gradually losing significance; transit workers’ strikes and the temporary reorganization of lines due to the metro fire in 1971 caused statistically significant inconvenience to the public and resulted in passenger losses.

E. Accounting and aggregation variables

Except for ANT and ANTL1, which denote the accounting effect of purchases of school tickets in anticipation of a fare increase and which have expected signs, these variables are straightforward: double counting of interzonal passengers (ZA) resulted in a significant overstatement of the true number of passengers; months are not homogeneous periods and the variables which make the aggregation process explicit (WD, SAT, SHD) have expected levels of relative significance.

CONCLUSION

Could the model be used to forecast the number of adults likely to use the MUCTC’s services every month? Yes, with some modifications. Plots of residuals $ show that the model systematically overestimates the number of passengers in August and September of every year, probably because of poor knowledge of the distribution of workers’ vacations used in constructing the employment level series. Better data or dummy variables for these months would remedy this situation. Such dummy variables have not been used in our model because they would have no unambiguous interpretation and would raise the explanatory power of the equation at the cost of biasing the estimates of other parameters. For forecasting purposes, however, this need not deter us. A second modification would consist in correcting some of the series used to describe activity levels because, though their fluctuations may be correct, they are constructed from approximate benchmarks and may have slightly incorrect slopes. Small changes in the slopes or trend lines of these variables are not crucial for the purposes of this study; for forecasting purposes the survival of the public transit network may however depend on small differences in the slopes of these trend lines. A third task, albeit different from the data improvement implied by the first two modifications, might consist in introducing new variables in the equation, such as parking prices or the changes in the spatial distribution of activities. The absence of relevant regressors biases parameter estimates in a fashion which depends on the importance of the neglected variable and on statistical considerations. 
For instance, the bias introduced by the absence of a variable to account for changes in the spatial distribution of activities flowing from the construction of new metro lines is probably very small because such changes were probably highly correlated with congestion (T2TDC), which reached its minimum in 1966, the year during which the new metro came into service. By contrast, the absence of an index for parking or gas prices may introduce a more serious bias, the size of which must remain unknown until new data are gathered and hypotheses concerning their significance tested.

Data improvements notwithstanding, most transit authorities have the information required to construct the series needed for a study like this one. They can also adjust the necessary effort to the yield expected from each of the variables. The general approach developed in this paper amounts to a systematization of the intuitive procedure used in heuristic time-series forecasting. It has the advantage of bringing to bear techniques which measure the weight of factors transit operators and policy makers like to view as control variables without losing the information contained in the meaningless parameters which describe (in the sense that they fit the data) the error terms. There is therefore no reason for such time-series demand studies not to have a better place in the urban planner's tool box.

Acknowledgements: Some of the materials used in this paper overlap with the author's Ph.D. dissertation (1973), which benefited from the advice of E. S. Mills, R. E. Quandt and E. C. Blankmeyer at Princeton University. M. G. Dagenais and R. J. Levesque of l'Université de Montréal also made pertinent and helpful comments. Unpublished data were supplied by The Montreal Urban Community Transit Commission and by many Ville de Montréal, Gouvernement du Québec and Gouvernement du Canada agencies. Princeton University, Le Conseil des Arts du Canada, L'Université de Montréal and The Ford Motor Company of Canada have contributed in various ways to support research for this paper, a first version of which was presented at the Canadian Transportation Research Forum meeting, Quebec City, May 1974. Any misuse of these resources or of Y. B. Sabourin's programming ability is the author's responsibility.

BIBLIOGRAPHY

Ben-Akiva Moshe E. (1973) Structure of Passenger Travel Demand Models. Ph.D. Dissertation, Massachusetts Institute of Technology.
Bieber A. and Aurignac A. (1971) La prévision de la demande de transport aérien: Considérations méthodologiques, in L'Accès aux Aéroports. O.C.D.E., Paris, 39-54.
Boyd J. H. and Nelson G. R. (1973) Demand for Urban Bus Transit: Two Studies of Fare and Service Elasticities. Paper P983, Arlington, Va., The Institute for Defense Analyses.
Box G. E. P. and Jenkins G. M. (1970) Time-Series Analysis, Forecasting and Control. Holden-Day, San Francisco.
Brand D. and Manheim M. L. (1973) editors of Urban Travel Demand Forecasting. Highway Research Board, Special Report 143, Washington.
Carstens R. L. and Csanyi L. H. (1968) A model for estimating transit usage in cities in Iowa. Highway Research Record 213. Highway Research Board.
Cochrane D. and Orcutt G. H. (1949) Application of least-squares regression to relationships containing autocorrelated error terms. J. Amer. Statistical Association 44, 32-61.
Domencich Thomas A. and Kraft Gerard (1970) Free Transit. Heath Lexington Books, Mass.
Durbin J. (1960) Estimation of parameters in time-series regression models. J. R. Statistical Society B22, 139-159.
Fair R. C. (1971) A Comparison of Alternative Estimators of Macroeconomic Models. Research Memorandum No. 121, Econometric Research Program, Princeton University.
Fisher F. M. (1962) A Priori Information and Time-Series Analysis. North-Holland Publishing Co.
Gaudry Marc (1973) The Demand for Public Transit in Montreal and its Implications for Transportation Planning and Cost-Benefits Analysis. Ph.D. Dissertation, Princeton University.
Kane E. J. (1968) Economic Statistics and Econometrics. Harper and Row, New York.
Kemp M. A. (1973) Transit Improvements in Atlanta: The Effects of Fare and Service Changes. Urban Institute Paper 1212-2, Washington, D.C.
Lancaster Kevin J. (1966) A new approach to consumer theory. J. Political Economy 74, 132-157.
Lassow W. (1968) The Effect of the 1966 Fare Increase on the Level of Transit Riding of the New York City Transit System. Highway Research Board Annual Meeting (unpublished paper).
Martin B. V., Memmott F. W. and Bone A. J. (1965) Principles and Techniques of Predicting Future Demand for Urban Area Transportation. M.I.T. Report No. 3.
Martin Fernand, Lagana A. and Nepveu J. (1973) Essai d'estimation de la demande de transport en commun dans la région métropolitaine de Montréal. Cahier No. 3, C.R.D.E., Université de Montréal.
Meyer John R., Kain J. F. and Wohl M. (1965) The Urban Transportation Problem. Harvard University Press, Cambridge, Mass.
Meyer John R. and Straszheim Mahlon R. (1971) Pricing and Project Evaluation. The Brookings Institution, Washington, D.C.
Nakkash T. Z. and Greco W. L. (1972) Accessibility models of trip generation. Highway Research Record No. 392.
Nelson C. R. (1973) Applied Time-Series Analysis for Managerial Forecasting. Holden-Day, San Francisco.
Pierce D. A. (1971) Least-squares estimation in the regression model with autoregressive-moving average errors. Biometrika 58, 299-312.
Quandt R. E. (1970) The Demand for Travel: Theory and Measurement. Heath Lexington Books, Lexington, Mass.
Rainville W. S., Jr. (1948) Transit Riding, Revenues and Fare Structures: Basic Approach and Computations by Statistical Laboratory, Division of Industrial Cooperation of The Massachusetts Institute of Technology. American Transit Association, Washington, D.C.
Ruiter Earl R. (1973) Analytical Structures. Report, in Brand and Manheim.
Schmidt R. E. and Campbell M. E. (1956) Highway Traffic Estimation. Eno Foundation for Traffic Control, Saugatuck, Conn.
Sharpe G. B., Hansen W. G., Hammer J. and Lamelle B. (1958) Factors affecting trip generation of residential land-use areas. Public Roads 30 (4), 88-99.
Smidt B. R. (1973) Effect of alternative fares systems on operational efficiency: continental experience. In Symposium on Public Transport Fare Structure: Papers and Discussions. TRRL Supplementary Report 37UC, Crowthorne, U.K., Transport and Road Research Laboratory.
Smith M. G. and McIntosh P. T. (1973) Fares elasticity: interpretation and estimation. In Symposium on Public Transport Fare Structure: Papers and Discussions.
Steilberg C. J. (1973) The Development and Application of Demand Functions for Intercity Travel by Bus and Rail in the Netherlands. Ministry of Transport, The Hague.
U.S. Department of Transportation/Federal Highway Administration (1967) Guidelines for Trip Generation Analysis. Bureau of Public Roads.
Zellner A. and Palm F. (1974) Time-series analysis and simultaneous equation models. J. Econometrics 2 (1), 17-54.

†The author wishes to thank Claude Montmarquette for use of his pertinent program and comments concerning techniques which happen to confirm our a priori considerations on the nature of error term interdependence.

APPENDIX†

Determining the error term structure

The estimation of demand equation (10) calls for an identification of process (8) which generates the error terms $u_t$. Box and Jenkins (1970) provide one approach to the problem. It consists in considering a general class of linear models of the form

$$u_t = \sum_{i=1}^{p} \phi_i u_{t-i} - \sum_{j=1}^{q} \theta_j a_{t-j} + a_t \qquad \text{(i)}$$

where the $\phi_i$'s are weights applied to previous observations and the $\theta_j$'s are weights applied to previous values of $a$, a randomly generated independent shock term of zero mean and constant variance. The $\phi_i$'s constitute the parameters of an autoregressive (AR) process of order $p$; the $\theta_j$'s are the parameters of a moving average (MA) process of order $q$. Box and Jenkins difference a series such as $u_t$ until it is stationary; they then calculate estimates of the autocorrelations and partial autocorrelations of the stationary series to determine the orders $p$ and $q$ of the mixed ARMA process written in (i). They finally estimate the parameters of the identified ARMA process. Nelson (1973) as well as Zellner and Palm (1974) are among those who have applied and criticized this technique. We will use it to identify the structure which underlies the residuals $\hat{u}_t$ from least-squares applied to (5): these residuals are produced by the first iteration of the Cochrane-Orcutt procedure when the initial values of all autocorrelation coefficients are assumed to equal zero. An appropriate process having been identified, one may proceed to solve the complete demand equation, equation (10).

Table 4. Autocorrelation and partial autocorrelation functions of the estimated errors
[Only fragments of the table body survive. Each row of the original covers twelve lags; shown are the first entry of each row, two further legible entries whose positions within the row are not recoverable, and the standard error of the row in parentheses. "?" marks an illegible digit.]

First residual series
                            Lags     First   Other entries   (S.E. of row)
Autocorrelations            1-12     .39     .14, .31        (.07)
                            13-24    .05     -.18, .??       (.09)
                            25-36    -.09    -.31, .?1       (.11)
                            37-48    .12     -.15, .33       (.12)
                            49-60    .20     -.05, .14       (.12)
Partial autocorrelations    1-12     .39     -.0?, .33       (.07)

Second residual series
                            Lags     First   Other entries   (S.E. of row)
Autocorrelations            1-12     .39     .13, .37        (.07)
                            13-24    .05     -.1?, .??       (.09)
                            25-36    -.09    -.31, .21       (.11)
                            37-48    .1?     -.15, .32       (.12)
                            49-60    .20     -.05, .14       (.12)
Partial autocorrelations    1-12     .39     -.0?, .34       (.07)

Further partial autocorrelation entries (series and lag not recoverable): .09, .01, .02, -.00, .03, -.04, .01, -.05, .17

Table 4 presents the estimated autocorrelation and partial autocorrelation functions for the $\hat{u}_t$ series of equations (53) and (55) listed in Tables 2 and 3. Despite different regressor lists, the functions of the residuals are very similar: both series are clearly generated by a stationary process; both autocorrelation functions decrease rapidly and increase again with the eleventh and twelfth lags, while the partial autocorrelation functions have a cut-off after the first order and increase significantly at the eleventh and twelfth orders. This suggests in both cases an ARMA process (1, 0, 12), namely a process that is stationary in the 0th difference of the variables $\hat{u}_t$, with an AR part of order 1 and an MA part of order 12:

$$u_t = \phi_1 u_{t-1} - \theta_{12} a_{t-12} + a_t \qquad \text{(ii)}$$

This ARMA process is very much what we expected on a priori grounds. Indeed let us simplify (ii) at a small cost. We may write

$$a_{t-12} = u_{t-12} - \phi_1 u_{t-13} + \theta_{12} a_{t-24} \qquad \text{(iii)}$$

and substitute the right-hand side for $a_{t-12}$ in (ii) to yield

$$u_t = \phi_1 u_{t-1} - \theta_{12} u_{t-12} + \phi_1 \theta_{12} u_{t-13} - \theta_{12}^2 a_{t-24} + a_t \qquad \text{(iv)}$$

In our problem, the magnitude of $\phi_1$ and $\theta_{12}$ is approximately 0.3; if we consider the third and fourth terms on the right-hand side of (iv) as negligible (their coefficients are $\approx 0.10$), we may write that equation as

$$u_t = \phi_1 u_{t-1} - \theta_{12} u_{t-12} + a_t \qquad \text{(v)}$$

or as

$$u_t = \sum_{l=1}^{r} \rho_l u_{t-l} + a_t, \qquad r = p + q, \qquad \text{(vi)}$$

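which is an approximation of an ARMA process by an AR process of the same number of parameters: $\rho_1$ and $\rho_{12}$ correspond to $\phi_1$ and $-\theta_{12}$. More generally, this simplification consists in writing

$$a_{t-k} = u_{t-k} - \sum_{i=1}^{p} \phi_i u_{t-i-k} + \sum_{j=1}^{q} \theta_j a_{t-j-k} \qquad \text{(vii)}$$

and in substituting for the value of $a_{t-k}$ in (i) to yield

$$u_t = \sum_{i=1}^{p} \phi_i u_{t-i} - \sum_{k=1}^{q} \theta_k u_{t-k} + \sum_{k=1}^{q} \sum_{i=1}^{p} \theta_k \phi_i u_{t-i-k} - \sum_{k=1}^{q} \sum_{j=1}^{q} \theta_k \theta_j a_{t-j-k} + a_t \qquad \text{(viii)}$$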

or, neglecting products of parameters,

$$u_t = \sum_{i=1}^{p} \phi_i u_{t-i} - \sum_{i=1}^{q} \theta_i u_{t-i} + a_t \qquad \text{(ix)}$$

which we have written in the autoregressive form of (vi), where $r = p + q$ and $I$ is the maximum of $p$ and $q$. By repeated substitution of $a_{t-k}$ terms one could write any ARMA process of orders $p$ and $q$ as an AR process of infinite order and drop negligible terms: this is the specification implicit in autoregressive specifications of error terms. The Box-Jenkins techniques give us a reasonable idea of the costs involved in making such simplifications.
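The cost of replacing the ARMA (1, 0, 12) process by an AR process with lags 1 and 12 can be checked numerically. The following is a minimal sketch, not the program used for the paper: the function names are hypothetical, the shocks are assumed Gaussian with unit variance, the parameter values (0.3 in magnitude, with a negative $\theta_{12}$ chosen here so that the lag-12 autocorrelation is positive) are illustrative assumptions, and the simulated series is deliberately much longer than the monthly sample so that sampling noise is small.

```python
import random

def simulate_arma_1_12(n, phi1, theta12, seed=0, burn_in=200):
    """Simulate u_t = phi1*u_{t-1} - theta12*a_{t-12} + a_t (assumed Gaussian shocks)."""
    rng = random.Random(seed)
    total = n + burn_in
    a = [rng.gauss(0.0, 1.0) for _ in range(total)]
    u = [0.0] * total
    for t in range(total):
        ar_part = phi1 * u[t - 1] if t >= 1 else 0.0
        ma_part = theta12 * a[t - 12] if t >= 12 else 0.0
        u[t] = ar_part - ma_part + a[t]
    return u[burn_in:]  # discard start-up values

def autocorrelation(u, lag):
    """Sample autocorrelation of the series u at the given lag."""
    n = len(u)
    mean = sum(u) / n
    dev = [x - mean for x in u]
    c0 = sum(d * d for d in dev)
    ck = sum(dev[t] * dev[t - lag] for t in range(lag, n))
    return ck / c0

def fit_ar_1_12(u):
    """Least-squares fit of u_t = rho1*u_{t-1} + rho12*u_{t-12} + a_t (no intercept)."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for t in range(12, len(u)):
        x1, x2, y = u[t - 1], u[t - 12], u[t]
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        b1 += x1 * y
        b2 += x2 * y
    det = s11 * s22 - s12 * s12  # determinant of the 2x2 normal equations
    rho1 = (b1 * s22 - b2 * s12) / det
    rho12 = (b2 * s11 - b1 * s12) / det
    return rho1, rho12

# Illustrative values only (assumptions, not estimates from the paper).
phi1, theta12 = 0.3, -0.3
u = simulate_arma_1_12(20000, phi1, theta12, seed=1)
rho1, rho12 = fit_ar_1_12(u)

print("autocorrelation at lags 1, 6, 12:",
      [round(autocorrelation(u, k), 3) for k in (1, 6, 12)])
print("fitted rho1, rho12:", round(rho1, 3), round(rho12, 3))
print("ARMA parameters phi1, -theta12:", phi1, -theta12)
```

Under these assumptions the fitted $\rho_1$ and $\rho_{12}$ should come out close to $\phi_1$ and $-\theta_{12}$ without matching them exactly; the gap is the price of dropping the product terms in (iv).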