The use of dummy variables in trip generation analysis


Transpn Res. Vol. 6, pp. 131-142. Pergamon Press 1972. Printed in Great Britain

KENNETH W. HEATHINGTON
Purdue University, Lafayette, Indiana

and

EDWARD ISIBOR
Cleveland State University, Cleveland, Ohio

(Received 1 December 1970)

INTRODUCTION

The development of models requires an understanding and appreciation of the various techniques applicable to the problem at hand. In mathematical modeling in particular, one needs to employ techniques that allow for simplicity yet are sufficiently well formulated to be representative. Mathematical modeling in the transportation area seems to have become more complex, yet at times a sufficient degree of representation still seems to be lacking. This is partially, but not totally, due to the basic formulation of the data. It does not seem unfair to say that a linear formulation is often made when some non-linear formulation would be more appropriate. At other times a continuous scaling is assumed when in reality a true continuum does not exist.

In many instances of model formulation, an approximation is assumed for simplicity in computation. Most often these assumptions are made with the knowledge of probable discrepancy. However, substituting simplicity for complexity is not always a profitable trade-off. This is particularly true when using linear regression techniques. Regression analysis is used extensively in the transportation planning area, perhaps because of its simplicity of comprehension and computation. Often one sees many of the assumptions of linear regression analysis violated with little or no justification for so doing. It is mainly for that reason that this article has been prepared. Specifically, it is intended to review methodology for the treatment of (1) non-linear and (2) discrete relationships in linear regression analysis. An example problem from the area of urban transportation planning will be used for illustrative purposes.

The use of “dummy” variables in regression analysis provides the analyst with a single method of handling discontinuous and/or non-linear variables.
Such categorical variables as sex, occupation, race, etc. can be afforded proper treatment without violating the basic assumptions underlying regression techniques. Of course, the technique of dummy variables is not new; at least one brief article, by Suits (1957), was in the literature in the late 1950s. The U.S. Department of Transportation (1967) applied the technique to trip generation analysis. However, the dummy variable technique possesses some unique advantages which should be brought into current thinking. As mentioned earlier, it provides a method for including categorical data in regression analysis. Also, unlike the cross-classification procedure, it does not require a relatively large number of observations.

This technique can also be applied to introduce non-linear independent variables into a regression model. These non-linear variables are transformed into a dummy format by stratifying them into a number of discrete classes, each indicating a range within which the original variable falls. A dummy variable is then assigned to each of the discrete classes. Because only the dummy class into which an observation falls is indicated, errors made in reporting and recording the explanatory variables become less disturbing in their effect on the response variable. This is of particular importance in the urban transportation planning area. For instance, if a dummy variable is used to represent income, where a discrete class may range from $3000 to $5000, it becomes insignificant if an individual income that is actually $3500 is reported as $3900.

The dummy variable technique helps to eliminate the biases that are usually involved in assuming the form of the functional relationship between the variables. This is so because it allows the identification of non-linear relationships without any prior assumption as to the character of the non-linearity. Thus such assumptions as the linearity assumption do not have to be made.

THE DUMMY VARIABLE TECHNIQUE

In this technique a value of one is assigned to the dummy variable corresponding to the class within which the value of the observed independent variable falls; all other dummy variables in that set take on the value of zero. The estimating equation generally takes the form

\[ \hat{Y} = b_0 + \sum_{i=1}^{n} b_i X_i + \sum_{k=1}^{s} \sum_{j=1}^{c} b_{jk} Z_{jk} \tag{1} \]

where
\( \hat{Y} \) = the estimate of the dependent variable
\( b_0 \) = regression constant or intercept
\( b_i \) = regression coefficient for the ith linear independent variable, denoted by \( X_i \)
\( b_{jk} \) = coefficient of the jth dummy variable in the kth dummy variable set
\( Z_{jk} \) = 1 if the observed variable falls into the jth class, and 0 if not in the jth class
n = total number of linear variables
s = total number of dummy variable sets
c = number of classes within a dummy variable set

The least squares criterion can then be used to evolve a regression surface where the sum of the squared deviations of all the observations about it is a minimum. Solution of the resulting “normal” equations yields the desired estimates of the intercept and the regression coefficients. However, the introduction of all classes of each set into the regression model via dummy variables renders the normal equations indeterminate. This indeterminacy is present because there are more coefficients than there are independent normal equations based on the least squares criterion.

Two constraints are generally imposed to remove the existing indeterminacy. The first approach constrains one of the coefficients in each of the groups to equal zero. For an illustration of this constraint let us assume a simplified case in which the response variable is Y, there is one linear variable \( X_1 \), and the two sets of dummy variables have four classes each. Then the estimating equation is

\[ \hat{Y} = b_0 + b_1 X_1 + \sum_{k=1}^{2} \sum_{j=1}^{4} b_{jk} Z_{jk} \tag{2} \]

As stated earlier, there is no unique solution to the above equation. The inherent indeterminacy can be eliminated if we pre-assign one of the coefficients a value of zero in each of the two dummy variable sets. Thus, one can arbitrarily set

\[ b_{41} = 0 \quad \text{and} \quad b_{42} = 0 \]

The reduced estimating equation is now

\[ \hat{Y} = b_0' + b_1' X_1 + \sum_{k=1}^{2} \sum_{j=1}^{3} b_{jk}' Z_{jk} \tag{3} \]
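The indeterminacy and its removal can be checked numerically. The sketch below is an illustration only: the 16 observations are made-up values (not data from this paper), and the helper name `one_hot` is hypothetical. It builds the design matrix with every class of both four-class sets included, then the reduced one with the last class of each set dropped, and compares the number of columns with the matrix rank.

```python
import numpy as np

# Hypothetical toy data (not from the paper): 16 observations, one linear
# variable and two dummy sets with four classes each (labels 0..3).
x1 = np.array([1, 2, 5, 3, 3, 5, 2, 1, 2, 5, 3, 3, 5, 2, 1, 2], dtype=float)
set1 = np.arange(16) % 4          # class of each observation in set 1
set2 = np.arange(16) // 4         # class of each observation in set 2


def one_hot(labels, n_classes):
    """Return an (n_obs, n_classes) 0/1 matrix with a single 1 per row."""
    return np.eye(n_classes)[labels]


intercept = np.ones((len(x1), 1))
z1, z2 = one_hot(set1, 4), one_hot(set2, 4)

# All classes of both sets included: more coefficients than independent equations.
full = np.hstack([intercept, x1[:, None], z1, z2])
# Last class of each set dropped, i.e. its coefficient is fixed at zero.
reduced = np.hstack([intercept, x1[:, None], z1[:, :3], z2[:, :3]])

full_rank = np.linalg.matrix_rank(full)
reduced_rank = np.linalg.matrix_rank(reduced)
print("full:    columns =", full.shape[1], " rank =", full_rank)
print("reduced: columns =", reduced.shape[1], " rank =", reduced_rank)
```

In the full matrix the dummy columns of each set sum to the intercept column, so the rank falls two short of the column count; dropping one class per set leaves a matrix of full column rank, for which the normal equations have a unique solution.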

Note that each estimated \( b_{jk}' \) measures the net effect that membership in the jth dummy class has on the dependent variable when compared to membership in the excluded class. In using this first type of constraint, all estimated regression coefficients are interpreted with reference to the omitted dummy class. An example will be used for illustration later in this presentation.

However, it might be more desirable to obtain regression coefficients which indicate differences from the overall mean of the dependent variable rather than from the base classes. The second type of constraint leads to the estimation of coefficients of this kind. To achieve this, the weighted sum of the coefficients in each dummy set representing a single factor is set equal to zero, as indicated by Melichar (1965). Applying this constraint to our example we have

\[ \sum_{j=1}^{4} b_{j1}^{*} Z_{j1}' = 0 \tag{4} \]

\[ \sum_{j=1}^{4} b_{j2}^{*} Z_{j2}' = 0 \tag{5} \]

where \( Z_{jk}' \) = number of observations in dummy class j of the kth dummy set. Solving for \( b_{41}^{*} \) and \( b_{42}^{*} \) from equations (4) and (5) respectively we obtain

\[ b_{41}^{*} = -\frac{1}{Z_{41}'} \sum_{j=1}^{3} b_{j1}^{*} Z_{j1}' \tag{6} \]

\[ b_{42}^{*} = -\frac{1}{Z_{42}'} \sum_{j=1}^{3} b_{j2}^{*} Z_{j2}' \tag{7} \]

Substituting for \( b_{41}^{*} \) and \( b_{42}^{*} \) from equations (6) and (7) in equation (2) we get

\[ \hat{Y} = b_0^{*} + b_1^{*} X_1 + \sum_{k=1}^{2} \sum_{j=1}^{3} b_{jk}^{*} \left( Z_{jk} - \frac{Z_{jk}'}{Z_{4k}'} Z_{4k} \right) \tag{8} \]

Now if we define new variables

\[ Z_{jk}^{*} = Z_{jk} - \frac{Z_{jk}'}{Z_{4k}'} Z_{4k} \tag{9} \]

we can rewrite equation (8) as

\[ \hat{Y} = b_0^{*} + b_1^{*} X_1 + \sum_{k=1}^{2} \sum_{j=1}^{3} b_{jk}^{*} Z_{jk}^{*} \tag{10} \]

The above equation can be solved using the usual least squares approach to obtain estimates of the regression coefficients. The resulting values of the regression coefficients are then substituted into equations (6) and (7) to determine the remaining regression coefficients. Examining equation (10), it is evident that each of the new independent variables \( Z_{jk}^{*} \) has a mean of zero. This implies that the constant term \( b_0^{*} \) possesses a value equal to that of the mean of the dependent variable (strictly, when the linear variables are also measured about their means). Therefore each estimated regression coefficient for a dummy class is the net difference from the grand mean which is attributable to membership in that dummy class.

Although the two types of constraints discussed produce different regression coefficients when they are each imposed on the normal equations, they are both effective in displaying the differences among the coefficients. The only dissimilarity is the basis of reference used by each type of constraint to measure this difference. Therefore we can transform the results from the use of the first constraint to those obtained from the second type of constraint.

The similarity in the final equations (3) and (10), developed by using the first and second constraints respectively, is of some importance. The only difference is in the magnitudes of the constant term and the various regression coefficients. We can therefore see that, within a given dummy set, the coefficients \( b_{jk}' \) and \( b_{jk}^{*} \) developed from using the first and second constraints respectively differ by a constant, c. Therefore we can write

\[ b_{jk}^{*} = b_{jk}' + c \tag{11} \]

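Returning to our sample situation we have

\[ b_{j1}^{*} = b_{j1}' + c \tag{12} \]

Applying the first constraint, in which we set the coefficient of one of the dummy classes equal to zero, in this case \( b_{41}' \), we have

\[ b_{41}^{*} = b_{41}' + c = 0 + c, \qquad b_{41}^{*} = c \tag{13} \]

Now applying the second constraint

\[ \sum_{j=1}^{4} \left( b_{j1}' + c \right) Z_{j1}' = 0 \]

therefore

\[ c = -\frac{\sum_{j=1}^{4} b_{j1}' Z_{j1}'}{\sum_{j=1}^{4} Z_{j1}'} \tag{14} \]

The adjusted regression coefficient \( (b_{jk}^{*}) \) in this case becomes

\[ b_{j1}^{*} = b_{j1}' - \sum_{m=1}^{4} P_{m1} b_{m1}' \]

where

\[ P_{j1} = \frac{Z_{j1}'}{\sum_{m=1}^{4} Z_{m1}'} \]

Therefore, in general, we can write

\[ b_{jk}^{*} = b_{jk}' - \sum_{m=1}^{c} P_{mk} b_{mk}' \]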


where
\( b_{jk}^{*} \) = the adjusted regression coefficient of the jth dummy variable in the kth dummy variable set, indicating the influence of the jth dummy class in the kth dummy variable set on the response variable with respect to the overall mean
\( b_{jk}' \) = the uncorrected regression coefficient, indicating the relative influence of the jth dummy class in the kth dummy variable set on the response variable as compared to that of the omitted dummy class in the set
\( P_{jk} \) = a weighting factor indicating the fraction of the sample in the jth dummy class in the kth dummy variable set

An illustrative example

An illustrative example will be used to help clarify the above discussion. The data for this example are found in Table 1 and do not represent any actual case study. The data have been arbitrarily chosen so that the dummy variable technique can be readily interpreted. In Table 1, the data set contains trips per dwelling unit, family income, residential type of dwelling in each zone, automobiles per dwelling unit and household size for 50 zones. All of these values are taken to be zonal averages.

There appears at this time only one dummy variable set, i.e. the residential type of dwelling. It is noticed that a (1) appears in the appropriate column for the residential type of dwelling and a (0) appears in the remaining two columns. We can further define family income and car ownership in terms of dummy variable sets. This, of course, does not have to be done but will serve a useful purpose in this example. Let us break down the family income into four classes as below.

Class    Income ($)
1        < 5000
2        5000-7499
3        7500-10,000
4        > 10,000

Other groupings than these could have been made. The number of classes in each dummy variable set is arbitrarily chosen by the analyst. One can also define car ownership in terms of a dummy variable set. The three groups used for this set are

Group    Range
1        1.0 or less
2        1.1-2.0 cars
3        Over 2.0 cars

One can now structure the matrix in terms of the linear variables and the three dummy variable sets that have been defined. The matrix would take on the appearance shown in Table 2. Whenever an observation falls into a given class of a given dummy variable set, a (1) is entered in that cell. The other classes of that set for the given observation will contain a (0). One can now formulate a proposed regression model for use in estimating trips per dwelling unit. The regression would have the following form

\[ \hat{Y} = b_0 + b_1 X_1 + b_{11} Z_{11} + b_{21} Z_{21} + b_{31} Z_{31} + b_{12} Z_{12} + b_{22} Z_{22} + b_{13} Z_{13} + b_{23} Z_{23} \]
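The coding step can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function names `income_class`, `car_class` and `dummy_row` are hypothetical, and the class limits are the ones listed in the two groupings above.

```python
def income_class(income):
    """Return the income class (1-4) for a zonal average income in dollars."""
    if income < 5000:
        return 1
    if income < 7500:
        return 2
    if income <= 10000:
        return 3
    return 4


def car_class(cars):
    """Return the car ownership group (1-3) for average cars per dwelling unit."""
    if cars <= 1.0:
        return 1
    if cars <= 2.0:
        return 2
    return 3


def dummy_row(class_number, n_classes):
    """Return the 0/1 coding of one observation: a single 1 in its own class."""
    return [1 if j == class_number else 0 for j in range(1, n_classes + 1)]


# An income of $3500 misreported as $3900 stays in class 1, so the coding
# (and hence the regression input) is unchanged.
print(dummy_row(income_class(3500), 4), dummy_row(income_class(3900), 4))
# A zone averaging $11,000 income and 2.0 cars per dwelling unit.
print(dummy_row(income_class(11000), 4), dummy_row(car_class(2.0), 3))
```

Dropping the last entry of each coded row gives the columns that enter the regression when the last class of each set is the omitted one.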

TABLE 1. DATA FOR TRIP GENERATION ANALYSIS USING DUMMY VARIABLE TECHNIQUES

[Table 1: one row per zone, zones 1-50. Columns: Zone; Trips/D.U.; Family income ($); Residential type (Single family, Combination, Apartment complex), each coded 1 or 0; Cars/D.U.; Household size.]

TABLE 2. CODING OF DATA INTO DUMMY VARIABLE FORMAT

[Table 2: one row per zone, zones 1-50. Columns: Zone; Trips/D.U.; Family income classes 1-4, each coded 1 or 0; Residential type classes 1-3, each coded 1 or 0; Car ownership classes 1-3, each coded 1 or 0; Household size.]

where
\( \hat{Y} \) = trips per dwelling unit
\( b_0 \) = intercept
\( X_1 \) = household size
\( Z_{11} \) = income class 1
\( Z_{21} \) = income class 2
\( Z_{31} \) = income class 3
\( Z_{12} \) = residential type class 1
\( Z_{22} \) = residential type class 2
\( Z_{13} \) = car ownership class 1
\( Z_{23} \) = car ownership class 2
\( b_1, b_{jk} \) = coefficient of the linear variable and of the appropriate class of the appropriate dummy variable set

or one can abbreviate the notation to

\[ \hat{Y} = b_0 + b_1 X_1 + \sum_{j=1}^{3} b_{j1} Z_{j1} + \sum_{j=1}^{2} b_{j2} Z_{j2} + \sum_{j=1}^{2} b_{j3} Z_{j3} \]

where
\( Z_{j1} \) = income class
\( Z_{j2} \) = residential type class
\( Z_{j3} \) = car ownership class

One notices that one class of each set has been deleted from the model. In this particular example, the last class was deleted. However, any class may be omitted. These data sets were inserted into the BMD02R program for determining the parameters. All of the variables were forced into the model for illustrative purposes. The resulting parameters were found to be

\( b_0 = 1.80 \), \( b_1 = 1.56 \),
\( b_{11} = -1.25 \), \( b_{21} = -0.97 \), \( b_{31} = -0.32 \),
\( b_{12} = 0.65 \), \( b_{22} = 0.32 \),
\( b_{13} = -0.74 \), \( b_{23} = -0.31 \)

Thus the model used for estimating trips per dwelling unit would be

Trips/D.U. = 1.80 + 1.56 (household size) - 1.25 (income class 1) - 0.97 (income class 2) - 0.32 (income class 3) + 0.65 (residential type class 1) + 0.32 (residential type class 2) - 0.74 (car ownership class 1) - 0.31 (car ownership class 2)

In the regression model, one has one linear variable (i.e. household size) and three dummy variable sets (i.e. income, residential type and car ownership). The dummy variable sets contain various numbers of classes and class sizes. One can now begin to interpret the model that has been formulated. Each class of each set will first be interpreted in terms of the class that has been deleted. Thus we find that, when holding all other variables constant (other than income), those dwelling units having an income less than $5000 (class 1) will make 1.25 fewer trips than those dwelling units having an income in excess of $10,000 (class 4, the one deleted). Likewise, those dwelling units having an income of $5000-7499 will make 0.97 fewer trips than those dwelling units having an income in excess of $10,000, assuming all other variables are held constant. Thus, from viewing the three classes of the income set, one finds that as the income increases, the trips per dwelling unit increase.

One can now interpret the residential type dummy variable set in exactly the same manner. It is seen that those dwelling units which are of a single family unit type will make 0.65 more trips than the deleted class of apartment complex. It is further seen that the combination class will make 0.32 more trips than the apartment complex class, assuming all other variables are held constant. Car ownership can also be interpreted in exactly the same manner. In general, the fewer the automobiles the fewer the trips to be made. Those dwelling units having one car or less will make 0.74 fewer trips than those having more than 2.0 cars.

One can use this model for estimating the trips per dwelling unit. Suppose that a given zone has the following characteristics:

Average household size = 4.2
Average income = $11,000
Residential type = single family unit
Average car ownership = 2.0

The equation would then take the form

Trips/D.U. = 1.80 + 1.56(4.2) + 0.65(1) - 0.31(1)
Trips/D.U. = 1.80 + 6.55 + 0.65 - 0.31
Trips/D.U. = 8.69

Since the income of $11,000 fell into the fourth class of the income set, it was not included. Thus one can take any combination of values for the variables, whether of the linear or dummy variable form, and insert them into the model for estimating the value of the dependent variable. The general form of the model will be

\[ \hat{Y} = b_0 + \sum_{i=1}^{n} b_i X_i + \sum_{k=1}^{s} \sum_{j=1}^{c} b_{jk} Z_{jk} \]

where
\( \hat{Y} \) = estimate of the dependent or response variable
\( b_0 \) = intercept
\( b_i \) = partial regression coefficient for the ith linear variable
n = total number of linear variables
\( b_{jk} \) = partial regression coefficient for the jth class of the kth dummy variable set
\( Z_{jk} \) = takes on value of 1 or 0
s = total number of dummy variable sets
c = total number of classes within a given dummy variable set
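The worked zone estimate can be reproduced with a short function. This is a sketch under stated assumptions: the coefficients are the fitted values reported for the example model, while the function and dictionary names are hypothetical and the omitted class of each set is given a coefficient of zero.

```python
# Fitted values reported for the example model; omitted classes carry 0.
INTERCEPT = 1.80
B_HOUSEHOLD_SIZE = 1.56
B_INCOME = {1: -1.25, 2: -0.97, 3: -0.32, 4: 0.0}
B_RESIDENTIAL = {"single family": 0.65, "combination": 0.32, "apartment complex": 0.0}
B_CARS = {1: -0.74, 2: -0.31, 3: 0.0}


def income_class(income):
    """Income class (1-4) using the class limits of the example."""
    if income < 5000:
        return 1
    if income < 7500:
        return 2
    if income <= 10000:
        return 3
    return 4


def car_class(cars):
    """Car ownership group (1-3) using the group limits of the example."""
    if cars <= 1.0:
        return 1
    if cars <= 2.0:
        return 2
    return 3


def trips_per_dwelling_unit(household_size, income, residential_type, cars):
    """Evaluate the model: intercept + linear term + one coefficient per dummy set."""
    return (
        INTERCEPT
        + B_HOUSEHOLD_SIZE * household_size
        + B_INCOME[income_class(income)]
        + B_RESIDENTIAL[residential_type]
        + B_CARS[car_class(cars)]
    )


estimate = trips_per_dwelling_unit(4.2, 11000, "single family", 2.0)
print(round(estimate, 2))  # 8.69, matching the worked example
```

Because exactly one class of each set is active for any observation, evaluating the double sum over the dummy variables reduces to one dictionary lookup per set.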

Some further interpretation

There are some problems in leaving the model at this point. One cannot perform a statistical test on the partial regression coefficient of each class omitted from the model. Thus the traditional t-test, used for testing whether a given partial regression coefficient is significantly different from some stated value \( \beta \) (usually taken as zero), cannot be performed. However, this disadvantage must be evaluated in terms of the advantages gained from using this technique. We can make an estimate of the contribution of all classes of a given set by making some adjustments to the partial regression coefficients. A new partial regression coefficient \( B_j \) can be defined:

\[ B_j = b_j - \sum_{m=1}^{c} P_m b_m \]

where
\( B_j \) = adjusted partial regression coefficient of class j of a given dummy variable set
\( b_j \) = unadjusted partial regression coefficient of class j (zero for the omitted class)
\( P_j \) = proportion of the sample in class j of a given dummy variable set
c = total number of classes in the set

Using the income set as an example, one finds that for class 1

\( B_1 = -1.25 - [(0.28)(-1.25) + (0.30)(-0.97) + (0.26)(-0.32) + (0.16)(0)] \)
\( B_1 = -1.25 + 0.724 = -0.53 \)

and for the remaining classes

\( B_2 = -0.97 + 0.724 = -0.25 \)
\( B_3 = -0.32 + 0.724 = 0.40 \)
\( B_4 = 0 + 0.724 = 0.72 \)

The adjusted partial regression coefficients for all three dummy variable sets are shown in Table 3. One can now visualize the dwelling unit with an income in excess of $10,000 making 0.72 more trips than the average income dwelling unit. Likewise, the dwelling unit with an income of less than $5000 will make 0.53 fewer trips than the average income dwelling unit, assuming all other variables are held constant.

TABLE 3. ADJUSTED PARTIAL REGRESSION COEFFICIENT FOR DUMMY VARIABLE SETS

            Income                 Residential type         Car ownership
Class    b_j     P_j     B_j      b_j     P_j     B_j      b_j     P_j     B_j
1       -1.25    0.28   -0.53     0.65    0.40    0.32    -0.74    0.18   -0.45
2       -0.97    0.30   -0.25     0.32    0.22   -0.01    -0.31    0.50   -0.02
3       -0.32    0.26    0.40     0       0.38   -0.33     0       0.32    0.29
4        0       0.16    0.72
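The adjustment is easy to verify numerically. The following sketch uses the unadjusted coefficients and sample proportions reported for the three sets; the function name `adjust_coefficients` is hypothetical. It subtracts the proportion-weighted mean of a set's coefficients from each coefficient, which is the adjustment defined above.

```python
def adjust_coefficients(b, p):
    """Shift a set's coefficients so their proportion-weighted sum is zero.

    b: unadjusted coefficients, with 0 for the omitted class.
    p: proportion of the sample in each class (should sum to 1).
    """
    weighted_mean = sum(pj * bj for pj, bj in zip(p, b))
    return [bj - weighted_mean for bj in b]


income = adjust_coefficients([-1.25, -0.97, -0.32, 0.0], [0.28, 0.30, 0.26, 0.16])
residential = adjust_coefficients([0.65, 0.32, 0.0], [0.40, 0.22, 0.38])
cars = adjust_coefficients([-0.74, -0.31, 0.0], [0.18, 0.50, 0.32])

print([round(v, 2) for v in income])       # [-0.53, -0.25, 0.4, 0.72]
print([round(v, 2) for v in residential])  # [0.32, -0.01, -0.33]
print([round(v, 2) for v in cars])         # [-0.45, -0.02, 0.29]
```

The differences between classes within a set are unchanged by the adjustment; only the reference point moves from the omitted class to the sample average.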

The classes of residential type can be interpreted in like manner. The single family unit makes 0.32 more trips per dwelling unit than the average type of dwelling unit for all zones. The dwelling unit that has one car or less will make 0.45 fewer trips than the dwelling units with an average number of cars.

One can now extend the interpretation of the dummy variable technique one step further. A beta coefficient can be determined for each set to estimate that set's contribution in the overall model. That is, one can evaluate the contribution that income, car ownership and residential type make in the estimation of the dependent variable. One now interprets sets rather than classes. The following relationship is defined

\[ B_S = \frac{1}{S_y} \left[ \frac{\sum_{j=1}^{c} n_j B_j^{2}}{\sum_{j=1}^{c} n_j} \right]^{1/2} \]

where
\( B_S \) = beta coefficient of dummy variable set S
\( S_y \) = standard deviation of the dependent variable
\( n_j \) = number of observations in each class of dummy variable set S
\( B_j \) = adjusted partial regression coefficient

The standard deviation \( (S_y) \) of the dependent variable, trips per dwelling unit, is 2.40. Thus the beta coefficient for the income set \( (B_I) \) is equal to

\[ B_I = \left[ \frac{(14)(-0.53)^2 + (15)(-0.25)^2 + (13)(0.40)^2 + (8)(0.72)^2}{14 + 15 + 13 + 8} \right]^{1/2} \div 2.40 \]

and therefore

\( B_I = 0.196 \), \( B_R = 0.120 \) and \( B_C = 0.105 \)

TABLE 4. BETA COEFFICIENT FOR DUMMY VARIABLE SETS

            Income            Residential type      Car ownership
Class    B_j      n_j         B_j      n_j          B_j      n_j
1       -0.53     14          0.32     20          -0.45      9
2       -0.25     15         -0.01     11          -0.02     25
3        0.40     13         -0.33     19           0.29     16
4        0.72      8
Beta     B_I = 0.196          B_R = 0.120           B_C = 0.105
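The three beta coefficients can be recomputed from the adjusted coefficients and class counts. The sketch below assumes the relationship is the count-weighted root mean square of a set's adjusted coefficients divided by the standard deviation of the dependent variable, which is the form the income calculation above follows; the function name `set_beta` is hypothetical.

```python
import math


def set_beta(adjusted, counts, s_y):
    """Beta coefficient of one dummy set.

    adjusted: adjusted coefficients B_j of the set's classes.
    counts:   number of observations n_j in each class.
    s_y:      standard deviation of the dependent variable.
    """
    mean_square = sum(n * b * b for n, b in zip(counts, adjusted)) / sum(counts)
    return math.sqrt(mean_square) / s_y


S_Y = 2.40  # standard deviation of trips per dwelling unit in the example

beta_income = set_beta([-0.53, -0.25, 0.40, 0.72], [14, 15, 13, 8], S_Y)
beta_residential = set_beta([0.32, -0.01, -0.33], [20, 11, 19], S_Y)
beta_cars = set_beta([-0.45, -0.02, 0.29], [9, 25, 16], S_Y)

print(round(beta_income, 3), round(beta_residential, 3), round(beta_cars, 3))
# 0.196 0.12 0.105
```

Sorting the sets by this value gives the ranking of their contributions to the estimate of the dependent variable.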

It is seen that income has the greatest effect upon the trips per dwelling unit and that car ownership has the least effect of the three dummy variable sets.

Some additional comments

If one is developing the “best” model for use in estimating the value of the dependent variable, one will probably use one of the standard methods of regression analysis, such as the step-wise regression technique. Generally, one does not desire to force all the variables into the model as has been done in this example. Many times a given class of a given set may not have a sufficient F-level to enter the model. This creates no major problems. It indicates, at first observation, that the classes arbitrarily chosen have not been appropriate ones. This requires a regrouping of classes, usually into a smaller number than first used. However, if after eventually reducing the number of classes to two, which cannot be further reduced, one still finds that the class will not enter the equation, then the inclusion of this variable in the model has to be justified in some manner other than the traditional one.

In many instances the altering of the range of each class will have an effect upon the class entering the model. By analysing the simple correlation matrix, one can begin to select judiciously the classes or the regrouping of classes. At times it is more appropriate to break the given set into many classes and, after viewing the correlation matrix, restructure the range of the classes. Classes that tend to have correlation coefficients in the same range and are correlated in like manner (i.e. positive or negative correlation) can many times be regrouped into one class.

There are some problems that the inexperienced may find in using the dummy variable technique. In regression models, one often finds that an independent variable contributes in a manner opposite to sound logic. For example, one might on certain occasions find that in a given regression model an increase in income would decrease the number of trips each dwelling unit might make. This is normally not a logical formulation. Oftentimes this discrepancy occurs because of collinearity.
Just as this discrepancy can occur with linear variables, it can occur quite frequently with the use of dummy variables. That is to say, the resulting model with dummy variable sets will show some classes to make a contribution in an illogical manner. Therefore, one has to use as much judgment in the interpretation of the model, and in the acceptance of the model as being valid, as one does with any regression model using only linear variables.

SUMMARY

In the beginning of this presentation, the use of dummy variables was suggested to overcome problems of non-linearity and category-type variables. Where these problems do not exist, it is not recommended that this particular technique be used. The dummy variable technique is simply a tool to aid the analyst in model building. This technique, like most others, has both advantages and disadvantages. It does seem to aid in the interpretation of more complex non-linear and category-type variables. In many cases a clearer and more concise analysis can be made using the dummy variable technique than through the transgeneration of the independent variables.

The particular approach to the use of dummy variables taken in this presentation represents only one of several approaches. The reader who would like to explore this technique in more depth is advised to review the references listed in this presentation. The approach taken will be influenced by the objective of the particular study.

REFERENCES

MELICHAR E. (1965) Least squares analysis of economic survey data. Proc. Business and Economic Statistics Section, Am. Statist. Assoc., pp. 373-385.
SUITS D. B. (1957) Use of dummy variables in regression equations. J. Am. Statist. Assoc. 52, 548-551.
U.S. DEPARTMENT OF TRANSPORTATION (1967) Guidelines for Trip Generation Analysis. U.S. Government Printing Office, Washington, D.C.