Transpn. Res.-A. Vol. 24A, No. 3, pp. 187-200, 1990. Printed in Great Britain.
0191-2607/90 $3.00+.00 © 1990 Pergamon Press plc
A GENERAL LINEAR MODEL FRAMEWORK FOR ESTIMATING WORK TRIP RATES FOR HOUSEHOLDS IN KUWAIT
GALAL M. SAID
Department of Civil Engineering, Kuwait University, P.O. Box 5969, Kuwait

and

DAVID H. YOUNG
Department of Statistics, Kuwait University, P.O. Box 5969, Kuwait

(Received 25 May 1989; in revised form 10 October 1989)

Abstract-The 1985 census of Kuwait has shown that there are great variations among households of the same nationality and between households of different nationality groups. As such, routine applications of conventional trip generation procedures such as the cross-classification analysis approach may not be appropriate. Alternative statistical methods of analysis based on a generalized linear model (GLM) framework are discussed. This framework provides a flexible range of statistical models for representing the dependence of mean household trip rates on explanatory variables of interest and for selecting the distribution of trip rates of households within individual classification cells. Seven major household groups could be identified from the 1985 census. One of these groups, Kuwaiti households living in villas, is used for some illustrative GLM analysis in which the results of an extensive home interview survey conducted in 1988 are utilized. This analysis showed that work trip rates of this household group are influenced by car ownership, household size, and the interactive effect of these two variables.
1. INTRODUCTION

The trip generation component is an essential part of most transportation planning studies. In the two specialized transportation studies conducted in Kuwait in 1977 and 1989 and in the transportation analysis components of the Master Plan Updates, the cross-classification analysis approach was used in the trip generation analysis (Kuwait Municipality, 1977, 1983). In those cross-classification analyses, the classifying variables covered nationality, house type, household size, and car ownership level. A rather arbitrary selection of the categories of these variables was made. A total of 54 household groups was used in the analysis, and trip rates were calculated by dividing the total trips made by the households in each category by the number of households in that category.

The disadvantages of the cross-classification analysis when applied in a routine manner have been discussed in the literature (Kassoff and Deutschman, 1969; Stopher and McDonald, 1983). The most important of these disadvantages are the absence of a goodness-of-fit measure of performance and the variation in the reliability of trip rate values due to the variation in the number of households available in each cell for calibration. Also, there is no well-established way to choose among classifying variables or to find the best groupings for each variable. Another disadvantage is the loss of information when all households within each cell are treated alike.

The difficulties that could arise from the routine application of conventional trip generation procedures, such as the cross-classification analysis, in the Kuwait context have been discussed by Said (1990). The major sources of difficulty are related to the wide range of household characteristics in Kuwait that need to be reflected in the classification. Kuwait has essentially three different population groups: Kuwaitis, Arab non-Kuwaitis and Asian non-Kuwaitis. Households of these groups differ significantly in size, house type, occupation of working individuals, labour force participation rates, car ownership, income, and family structure. In addition to the differences observed between household groups of different nationalities, there are wide variations between households of the same nationality. For example, Kuwaiti households vary in size between 1 and 50 persons, in car ownership between 0 and 9 cars per household and in house type from modest apartments to palace-type villas. A cross-classification analysis approach for trip generation that accounts for several of these variables, and that covers the wide range observed for each, may be expected to result in low frequencies of households in many cells, thus damaging the reliability of the trip rate estimates.

Said (1990) has advanced the idea of using the generalized linear model (GLM) framework to model work trip generation rates for Kuwait. In this paper a discussion of the alternative statistical methods of analysis of trip data based on the GLM framework is presented. A major improvement of this approach is that the estimated mean trip rates for cells of the cross-classified table utilize a model fit based on data from all cells and not just the data from the given cell, which may be sparse. Other advantages are that formal tests of significance may be used and the goodness-of-fit of competing models may be compared statistically.

The basic characteristics of the different population groups, households and labour force in Kuwait that are likely to influence trip rates are outlined. The GLM framework for the modelling and statistical analysis of mean work trip rates is then discussed. This is followed by a preliminary analysis of trip data from a comprehensive home interview survey. An important finding from the analysis is that a transformation of trip rates is needed if standard regression/ANOVA methods are to be employed. Finally, a GLM analysis of the trip data for Kuwaiti households living in villas is presented.
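The conventional cross-classification calculation described earlier in this section (total trips in a cell divided by the number of households in that cell) can be sketched in a few lines of Python. The household records and the names `households` and `cell_rates` are hypothetical, invented purely for illustration; they are not data from the Kuwait studies.

```python
from collections import defaultdict

# Hypothetical records: (household-size class, cars owned, work trips).
households = [
    ("1-4", 1, 1), ("1-4", 1, 2), ("1-4", 1, 1),
    ("5-8", 1, 2), ("5-8", 2, 3), ("5-8", 2, 2),
    ("9+", 2, 4),  # a cell that holds a single household
]

def cell_rates(records):
    """Return {cell: (trip rate, number of households)} for each non-empty cell."""
    trips = defaultdict(int)
    counts = defaultdict(int)
    for size_class, cars, n_trips in records:
        cell = (size_class, cars)
        trips[cell] += n_trips
        counts[cell] += 1
    return {cell: (trips[cell] / counts[cell], counts[cell]) for cell in counts}

rates = cell_rates(households)
for cell, (rate, n) in sorted(rates.items()):
    print(cell, f"rate={rate:.2f}", f"households={n}")
```

Each rate depends only on the households in its own cell, so a cell such as ("9+", 2) in this toy example rests on a single observation. This is the sparse-cell weakness that the model-based approach is intended to reduce.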
2. POPULATION, LABOR FORCE, AND HOUSEHOLD CHARACTERISTICS
The characteristics of the different population groups in Kuwait have been described in Said (1990) and Hutchinson and Said (1990). A brief background is presented in this section. The basic sources of the material in this section are the 1985 census and the results of a home interview survey carried out in 1987 and 1988.

According to the 1985 census, the population of Kuwait reached 1.697 million, with 40.1% Kuwaitis, 37.9% non-Kuwaitis of Arab origin, 21.0% non-Kuwaitis of Asian origin, and 1% of other nationalities. The age-sex distributions of Kuwaitis, Arabs and Asians are shown in Fig. 1. The figure demonstrates clearly the domination of the Kuwaiti population by the younger age groups, with almost 50% of the Kuwaiti population under 15 years of age. Among Arabs, the higher proportion of males than females is evident from Fig. 1, in particular for those of working age. The Asian population group is dominated by individuals in the working age group, and the number of males is almost double the number of females.

The labour force participation rates by sex and age group for persons over 15 years of age for the different nationality groups are shown in Fig. 2. The figure shows a number of features such as: (a) the low labour participation rate of Kuwaiti females compared to other groups; (b) the extremely high labour participation rates of Arab and Asian males, which are a result of the labour laws that govern the residence of non-Kuwaitis in Kuwait; and (c) the relatively high labour participation rates of the Asian female group compared to Arab and Kuwaiti females.

The number of households recorded in the 1985 census was 236,729. Of these, there were 85,113 Kuwaiti households, 104,037 Arab households, 33,568 Asian households, 4567 of other nationalities and 9441 collective households with members of mixed non-Kuwaiti nationalities. Figure 3 shows the household size of the different nationality groups. The average sizes of Kuwaiti, Arab, and Asian households are 9.2, 5.4, and 3.9, respectively. Kuwaiti households are mostly large in size and of the extended family type, with the majority of household members in the school-age group. Non-Kuwaiti Arab households are mostly of the family type, of medium to large size. Asian households are small to medium in size with very few members of pre-work age.

Table 1 shows the labour force by nationality and sex in 1985. The average number of workers per household is also shown in the table, which indicates that Kuwaiti households have an average of 1.48 working persons compared to 2.00 in Arab households and 3.30 in Asian households. The number of workers per household reflects the labour force participation rate differences and the age-sex composition illustrated earlier. There were 9441 collective households recorded in the 1985 census with an average number of workers of 14.4 per household.

The Kuwaiti male labour force is concentrated in the services (mainly civil service), clerical and professional-technical occupation categories. The non-Kuwaiti male labour force is heavily concentrated in the production worker-labour occupation groups and to a lesser extent in the service and professional-technical groups. More than 80% of Asian males are in either the services or production occupations. Kuwaiti female workers are concentrated in the professional-technical and clerical occupations, while more than 65% of the non-Kuwaiti females are employed in service occupations such as teachers, nursery and domestic workers. About 88% of the Asian component of the non-Kuwaiti female workers are involved in domestic occupations. These occupation differences suggest that the majority of Kuwaitis work in medium to high salary occupations. Arab non-Kuwaitis work in low to medium wage occupations while Asians work in low wage occupations.
The home interview survey conducted in 1987 and 1988 has revealed other significant differences between Kuwaiti and non-Kuwaiti households and between households of the same nationality as well. A detailed account of these differences and their impact on work trip rates has been given in Said (1990). The above discussion shows that the number of workers per household differs greatly between nationalities, which could result in markedly different trip rates as well. Also, workers in these different types of households have different occupation types and mode choice characteristics.
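As a rough check on the workers-per-household figures, the short Python sketch below recomputes them from the Table 1 totals. It assumes that workers living in collective households and live-in domestic workers (the approximate figures in the Table 1 footnotes) are excluded before dividing by the number of private households. That reading of the table is an assumption made here, not a procedure stated in the text, and all variable names are illustrative.

```python
# Figures copied from Table 1 (1985 census). The 45,000, 80,000 and 91,000
# adjustments are the approximate values quoted in the table footnotes.
labour_force = {"Kuwaiti": 126410, "Arab": 252978, "Asian": 282843}
private_households = {"Kuwaiti": 85113, "Arab": 104037, "Asian": 33565}
in_collective = {"Kuwaiti": 0, "Arab": 45000, "Asian": 91000}
live_in_domestic = {"Kuwaiti": 0, "Arab": 0, "Asian": 80000}
collective_households = 9441

def workers_per_private_household(group):
    """Workers living in private households divided by private households."""
    workers = labour_force[group] - in_collective[group] - live_in_domestic[group]
    return workers / private_households[group]

for group in labour_force:
    print(group, round(workers_per_private_household(group), 2))

# Assumed: collective-household workers are the Arab and Asian footnote totals.
collective_rate = sum(in_collective.values()) / collective_households
print("Collective", round(collective_rate, 1))
```

Under this assumption the ratios come out close to the Table 1 values for Kuwaiti, Arab, Asian and collective households.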
3. THE GENERALIZED LINEAR MODEL FRAMEWORK
This section discusses alternative statistical methods of analysis of work trip data based on a generalized linear model (GLM) framework, which are likely to be particularly useful in the Kuwait context. Using this approach, statistical models reflecting the dependence of average household work trip rates on factors of interest such as household size, household income, number of cars owned, nationality, house type, etc., are proposed. The models are then fitted by appropriate techniques such as weighted least squares, or by maximum likelihood if a specific distributional form for household work trip rates is assumed. The goodness-of-fit of competing models may be examined, the aim being to find a concise model which provides a good fit to the observed work trip data. An introduction to GLM is given by Dobson (1983) while a more comprehensive treatment is provided by McCullagh and Nelder (1983).

Fig. 1. Age-sex distribution of population groups in Kuwait, 1985. (Panels: Kuwaitis, Arabs, Asians; females and males by five-year age group from 0-4 to 60+; horizontal axis: population in thousands.)
Fig. 2. Percent population in labour force by age and sex for Kuwaitis, Arabs and Asians, 1985. (Panels: Kuwaitis, Arabs, Asians; five-year age groups from 15-19 to 60+.)
Fig. 3. Household size distribution of different nationality groups, 1985. (Panels: Kuwaitis, Arabs, Asians; horizontal axis: household size.)

Table 1. Labour force and households classified by nationality groups, 1985

                      Labor force                    Private households      Collective households
Nationality        Male     Female    Total        Number    Workers/HH     Number    Workers/HH
Kuwaiti           101607     24803    126410        85113       1.48
Non-Kuwaiti
  Arabs           219989     32989    252978(a)    104037       2.00
  Asians          210338     72505    282843(b)     33565       3.33
  Others            6413      1831      8244         4570       1.8
  Total N.K.      436650    107325    543975                                  9441       14.4
Total             538347    132128    613485       227285       N/A           9441        -

(a) Includes about 45,000 in collective households.
(b) Includes about 80,000 who work as live-in domestic workers and roughly 91,000 in collective households.

The types of models to be examined are of the ANOVA form when all factors are qualitative, of the regression form when all factors are quantitative, but most commonly are of the covariance form when a mixture of qualitative and quantitative factors occurs. To simplify the discussion, we shall suppose that three factors are to be examined. The first factor is a qualitative factor representing the nationality type, with the number of levels equal to $a$. For example, $a$ can take the value three if the nationalities to be considered are Kuwaitis, non-Kuwaiti Arabs and Asians. The other two factors are quantitative factors: $x_1$ = household size and $x_2$ = number of cars owned per household. Letting $x_{11}, x_{12}, \ldots, x_{1b}$ and $x_{21}, x_{22}, \ldots, x_{2c}$ denote the observed levels of $x_1$ and $x_2$, respectively, cell $(i,j,k)$ will correspond to the grouping of households of nationality type $i$ with $x_1 = x_{1j}$ and $x_2 = x_{2k}$, $i = 1, \ldots, a$, $j = 1, \ldots, b$, $k = 1, \ldots, c$.

Let $\mu_{ijk}$ denote the true mean and $\sigma^2_{ijk}$ denote the true variance of household work trip numbers in cell $(i,j,k)$. Our primary object is to obtain reliable estimates of the $\{\mu_{ijk}\}$. Suppose that a sample of $N$ households is taken which contains $n_{ijk}$ households in cell $(i,j,k)$, where $\sum_i \sum_j \sum_k n_{ijk} = N$. We let $Y_{ijkl}$, $l = 1, 2, \ldots, n_{ijk}$, denote the observed household numbers of work trips in cell $(i,j,k)$ and let $\bar{Y}_{ijk.} = \sum_l Y_{ijkl}/n_{ijk}$ and $s^2_{ijk} = \sum_l (Y_{ijkl} - \bar{Y}_{ijk.})^2/(n_{ijk} - 1)$ denote the observed mean and variance, respectively, for the cell. Using the standard statistical model notation, then

$$Y_{ijkl} = \mu_{ijk} + \varepsilon_{ijkl} \qquad (1)$$

for $i = 1, \ldots, a$, $j = 1, \ldots, b$, $k = 1, \ldots, c$, $l = 1, \ldots, n_{ijk}$, where the $\{\varepsilon_{ijkl}\}$ are the true residuals with zero expectations.

To construct a suitable statistical model and to develop the associated analysis, the problem is (i) to obtain a suitable representation of the dependence of $\mu_{ijk}$ on $x_{1j}$, $x_{2k}$ and the effect of nationality type $i$, and (ii) to select a satisfactory approximating distribution for the household trip numbers $\{Y_{ijkl}\}$ within each cell, or at least for the cell means $\{\bar{Y}_{ijk.}\}$. The standard approach in the literature appears to use the classical linear model of the form:

$$\mu_{ijk} = \mu + \alpha_i + \beta_{1i} x_{1j} + \beta_{2i} x_{2k} \qquad (2)$$
where $\mu$ is an overall mean, $\alpha_i$ is the effect of nationality type $i$, and $\beta_{1i}$ and $\beta_{2i}$ are regression coefficients allowing for assumed linear effects of $x_1$ and $x_2$. It is usually assumed that the true within-cell variances $\sigma^2_{ijk}$ are constant, or at least approximately so. With these assumptions, the cell mean $\bar{Y}_{ijk.}$ is taken to be approximately normally distributed with true mean $\mu_{ijk}$ and variance $\sigma^2/n_{ijk}$, where $\sigma^2$ denotes the assumed constant variance. Estimates of $\mu$, $\{\alpha_i\}$, $\{\beta_{1i}\}$ and $\{\beta_{2i}\}$ are then obtained by minimizing the weighted sum of squares $\sum_i \sum_j \sum_k w_{ijk} (\bar{Y}_{ijk.} - \mu - \alpha_i - \beta_{1i} x_{1j} - \beta_{2i} x_{2k})^2$, where the weights are $w_{ijk} = n_{ijk}$. Simpler models with some of the parameters $\{\alpha_i\}$, $\{\beta_{1i}\}$ and $\{\beta_{2i}\}$ set equal to zero are also fitted by weighted least squares to test if more concise models may be used.

Two criticisms may be made of this standard approach. First, the mean representation given by (2) will have a limited range of application because some of the $\{\mu_{ijk}\}$ will be relatively small but all must be positive. With typical data sets, a few of the estimates of the $\{\mu_{ijk}\}$ could even turn out to be negative if this model were used. Second, and more important, the assumption of constant variance within cells is unlikely to be satisfied in practice, with cells having the higher mean trip rates being likely to exhibit larger variances.

A useful way to overcome the first problem is to adopt a logarithmic model for the means with the form:

$$\log \mu_{ijk} = \mu + \alpha_i + \beta_{1i} x_{1j} + \beta_{2i} x_{2k} \qquad (3)$$

This ensures that $\mu_{ijk} = \exp(\mu + \alpha_i + \beta_{1i} x_{1j} + \beta_{2i} x_{2k})$ is always positive whatever the values of the parameters.
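The first criticism can be illustrated with a small numerical sketch in Python. It fits a straight line to hypothetical cell means by weighted least squares with weights equal to the cell counts, once on the original scale as in (2) and once on the logarithmic scale as in (3), using a single explanatory variable. The data and all names (`wls_line`, `cars`, `cell_mean`, `n_households`) are invented for illustration. Fitting the log of the cell means by least squares is a simplification chosen here for brevity; it is not the maximum likelihood fit discussed later.

```python
import math

def wls_line(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ybar - b * xbar, b

# Hypothetical cell means that grow multiplicatively with cars owned.
cars = [0, 1, 2, 3, 4, 5]
cell_mean = [0.1, 0.2, 0.4, 0.8, 1.6, 3.2]
n_households = [10, 40, 60, 40, 20, 5]

# Linear mean model, as in (2): fitted values may fall below zero.
a_lin, b_lin = wls_line(cars, cell_mean, n_households)
linear_fit = [a_lin + b_lin * x for x in cars]

# Logarithmic mean model, as in (3): fitted values are exp(...) > 0.
a_log, b_log = wls_line(cars, [math.log(m) for m in cell_mean], n_households)
log_fit = [math.exp(a_log + b_log * x) for x in cars]

print("linear fit:", [round(v, 3) for v in linear_fit])
print("log fit:   ", [round(v, 3) for v in log_fit])
```

With these made-up numbers the straight-line fit gives a negative estimated mean for the zero-car cell, whereas every estimate from the logarithmic form is positive by construction.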
A simple way of checking the model given by (3) would be to compute:

$$\bar{L}_{ij.} = \sum_k \log \bar{Y}_{ijk.} / n_{ij.} \quad \text{and} \quad \bar{L}_{i.k} = \sum_j \log \bar{Y}_{ijk.} / n_{i.k} \qquad (4)$$

where $n_{ij.} = \sum_k n_{ijk}$ and $n_{i.k} = \sum_j n_{ijk}$. Separate plots of the $\{\bar{L}_{ij.}\}$ against $x_{1j}$ and the $\{\bar{L}_{i.k}\}$ against $x_{2k}$, for each nationality type $i$, would give approximate straight line relations if (3) is correct.

To examine the second problem of variance heterogeneity among cells, plots of the sample variances $\{s^2_{ijk}\}$ against the means $\{\bar{Y}_{ijk.}\}$ should be made. If the plots suggest straight line relations with the lines passing through the origin and having slopes approximately equal to one, this indicates that the distribution of the number of work trips is approximately Poisson. Such an approximation is not unreasonable given that the number of work trips may only assume small non-negative integer values. To obtain variance stabilisation with Poisson observations, the square root transformation is used, the transformed observations being $Y^*_{ijkl} = (Y_{ijkl} + 1)^{1/2}$. If $\bar{Y}^*_{ijk.}$ and $s^*_{ijk}$ denote the observed mean and standard deviation of the transformed observations within cell $(i,j,k)$, then the values of $\{s^*_{ijk}\}$ should be approximately constant and independent of the values of $\{\bar{Y}^*_{ijk.}\}$. An alternative transformation which might lead to greater variance stabilisation is the logarithmic, in which the transformed observations $\log(Y_{ijkl} + 1)$ may be used. However, this is based on an empirical approach and, unlike the square root transformation, cannot be justified through the choice of a particular distribution for the within-cell work trip numbers.

If graphical inspection of the variance-mean plots based on the $\{Y_{ijkl}\}$ and the standard deviation-mean plots based on the $\{Y^*_{ijkl}\}$ indicates that the Poisson model is satisfactory, the model may be fitted as follows. Let $Y_{ijk.} = n_{ijk} \bar{Y}_{ijk.}$ denote the total number of work trips for households in cell $(i,j,k)$. Then $Y_{ijk.}$ has the Poisson distribution with mean

$$\lambda_{ijk} = n_{ijk} \mu_{ijk} = n_{ijk} \exp(\mu + \alpha_i + \beta_{1i} x_{1j} + \beta_{2i} x_{2k}) \qquad (5)$$

using (3). The model is then fitted by maximum likelihood, for example, using the statistical package GLIM (Baker and Nelder, 1978). The fit provides estimates $\hat{\mu}$, $\{\hat{\alpha}_i\}$, $\{\hat{\beta}_{1i}\}$ and $\{\hat{\beta}_{2i}\}$ of the parameters, from which the estimates of the cell true mean trip rates are given by

$$\hat{\mu}_{ijk} = \exp(\hat{\mu} + \hat{\alpha}_i + \hat{\beta}_{1i} x_{1j} + \hat{\beta}_{2i} x_{2k}) \qquad (6)$$
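For readers without access to GLIM, the maximum likelihood fit of (5) can be sketched directly. The Python function below is a minimal illustration, not the authors' procedure. It assumes one nationality group and a single explanatory variable, so that the cell total is Poisson with mean $n_k \exp(a + b x_k)$, and it solves the likelihood equations by Newton-Raphson iteration. The function name `fit_poisson_loglinear` and the cell data are hypothetical.

```python
import math

def fit_poisson_loglinear(x, totals, n, tol=1e-10, max_iter=50):
    """Maximum likelihood fit of totals[k] ~ Poisson(n[k] * exp(a + b * x[k])).

    Uses Newton-Raphson (equivalently Fisher scoring); returns (a, b).
    """
    a = math.log(sum(totals) / sum(n))  # start from the overall mean rate
    b = 0.0
    for _ in range(max_iter):
        lam = [nk * math.exp(a + b * xk) for nk, xk in zip(n, x)]
        # Score vector: first derivatives of the log-likelihood.
        u_a = sum(yk - lk for yk, lk in zip(totals, lam))
        u_b = sum(xk * (yk - lk) for xk, yk, lk in zip(x, totals, lam))
        # Information matrix: minus the second derivatives.
        i_aa = sum(lam)
        i_ab = sum(xk * lk for xk, lk in zip(x, lam))
        i_bb = sum(xk * xk * lk for xk, lk in zip(x, lam))
        det = i_aa * i_bb - i_ab * i_ab
        step_a = (i_bb * u_a - i_ab * u_b) / det
        step_b = (i_aa * u_b - i_ab * u_a) / det
        a += step_a
        b += step_b
        if abs(step_a) < tol and abs(step_b) < tol:
            break
    return a, b

# Hypothetical cells: cars owned, households per cell, total work trips per cell.
cars = [0, 1, 2, 3, 4, 5]
n_households = [10, 40, 60, 40, 20, 5]
trip_totals = [6, 31, 60, 51, 33, 11]

a_hat, b_hat = fit_poisson_loglinear(cars, trip_totals, n_households)
# Estimated mean trip rate per household in each cell, in the spirit of (6).
fitted_rates = [math.exp(a_hat + b_hat * xk) for xk in cars]
print(round(a_hat, 3), round(b_hat, 3), [round(r, 2) for r in fitted_rates])
```

The term $\log n_k$ enters as a fixed offset, which is why the cell counts multiply the exponential rather than appearing as a fitted coefficient. Every fitted rate uses the data from all cells through the two estimated parameters.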
Standard tests can then be made to test hypotheses of interest such as (i) $\beta_{1i} = 0$, $i = 1, \ldots, a$ ($x_1$, household size, has no effect); (ii) $\beta_{2i} = 0$, $i = 1, \ldots, a$ ($x_2$ has no effect); (iii) $\beta_{1i} = \beta_1$, $\beta_{2i} = \beta_2$, $i = 1, \ldots, a$ (effects of $x_1$ and $x_2$ are the same for all nationalities); (iv) $\alpha_i = 0$, $i = 1, \ldots, a$ (no differences among nationality groups), etc.

An alternative approach, still within the GLM framework, for handling grouped data is to use ANOVA models; see for example Dobson (1976). Assume a cross-classification table with grouped data has been constructed in which rows ($j$) represent household size and columns ($k$) represent car ownership level. In such a case the mean trip rate of cell $(j,k)$ could be expressed as:

$$\bar{Y}_{jk.} = m + m_j + m_k + m_{jk} + \sum_l e_{jkl}/n_{jk} \qquad (7)$$

in which $m$ is the grand mean of the true cell means, $m_j$ are deviations of true row means about the grand mean, $m_k$ are deviations of true column means about the grand mean and $m_{jk}$ represent deviations from additivity of row and column effects about the grand mean.

The use of regression and ANOVA models with grouped data solves the problems related to the difficulty of forecasting household characteristics at the level of detail required for regression models with ungrouped data. However, the use of ANOVA does not take into account the quantitative nature of the two variables used in this example.

4. PRELIMINARY ANALYSIS OF TRIP RATES OF HOUSEHOLDS IN KUWAIT
The home interview survey provides the basis of the analysis in the rest of this paper. The interview survey included some 6270 households with 2,270 Kuwaiti and 4,000 non-Kuwaiti households. The data collected were organized in three major data files; (a) households data file (6270 records); (b) personal data file (60,000 records); and (c) trip data file (57,000 records). Household information included size, age-sex specification for household members, occupation groups of working household members, house-type, and license-car access status of each member in the household. In addition, information on morning work trips was collected. A master file that contains households by nationality, size, car ownership, income, house type, and several other classifying variables in addition to work trips made by individual households has been compiled. The data provide the basis for further analysis using the GLM framework.
The discussion on modelling in Section 3 stressed the importance of examining the relation between measures of variability and mean work trip rates to see if the assumption of constant variance was reasonable. For the Kuwait data, this aspect has been examined for the different nationality groups separately using household classifications based on likely influential variables such as household size, number of cars owned and number of persons of working age in the household. For example, for Kuwaiti households the household size ranged from 1 to 23. For each of the 23 values of the variable, the mean, standard deviation and variance of the numbers of morning work trips were calculated for the relevant set of households. Scatter diagram plots of standard deviation against mean and variance against mean were then constructed. The standard deviation-mean plot is shown in Fig. 4 together with the corresponding plots for the other two nationality groups, non-Kuwaiti Arabs and Asians. The plots show that there is a systematic increase in the standard deviation as the mean increases.

If an underlying Poisson distribution for the household number of trips is assumed, the standard deviation-mean scatter diagram plots based on the transformed observations $(1 + \text{number of work trips})^{1/2}$ should show little or no systematic effect. Figure 5 shows these plots when households are again classified according to household size. A comparison of Fig. 5 with Fig. 4 shows the strong variance stabilizing effect, particularly in the case of Kuwaiti households. Similar patterns were observed using classifications based on car ownership and number of adults.

Before embarking on any GLM analysis, it is useful to perform some preliminary investigative work to see which variables, when treated individually, have the greatest influence on mean trip rates.
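The variance-stabilising effect of the square root transformation discussed above can be checked numerically. The Python sketch below takes the Poisson assumption as given and computes, for a few illustrative mean trip rates, the exact standard deviation of the raw count and of the transformed count $(1 + Y)^{1/2}$. The chosen means and the function names are assumptions made for this illustration and are not survey results.

```python
import math

def poisson_pmf(y, mean):
    """Probability that a Poisson variable with the given mean equals y."""
    return math.exp(-mean + y * math.log(mean) - math.lgamma(y + 1))

def sd_raw_and_transformed(mean, y_max=100):
    """SD of Y and of sqrt(Y + 1) for Y ~ Poisson(mean), summing up to y_max."""
    probs = [poisson_pmf(y, mean) for y in range(y_max + 1)]
    e_y = sum(p * y for y, p in enumerate(probs))
    e_y2 = sum(p * y * y for y, p in enumerate(probs))
    e_t = sum(p * math.sqrt(y + 1) for y, p in enumerate(probs))
    e_t2 = sum(p * (y + 1) for y, p in enumerate(probs))  # E[(sqrt(Y+1))^2]
    return math.sqrt(e_y2 - e_y ** 2), math.sqrt(e_t2 - e_t ** 2)

# Illustrative mean trip rates spanning a range similar to that of Fig. 4.
means = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
results = {m: sd_raw_and_transformed(m) for m in means}
for m in means:
    raw_sd, transformed_sd = results[m]
    print(m, round(raw_sd, 3), round(transformed_sd, 3))
```

Under the Poisson assumption the raw standard deviation grows as the square root of the mean, while the transformed standard deviation varies far less over the same range.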
The idea is to try to limit the number of variables that enter into the models to be fitted, so that a more compact cross-classifying scheme is used which leads to reasonable household frequency levels. Within each of the three nationality groups, house type provides a clear indication of several of the household characteristics such as income, social status, and household size. Households have therefore been classified according to nationality type and house type. A total of seven groups has been considered, with Kuwaiti households classified as living in villas, government built houses (NHA villas) and other dwelling types, and both Arab and Asian non-Kuwaiti households classified as living in apartments and other dwelling unit types. For each group, simple linear regression analyses were made using individual explanatory variables from a list of household variables. Here we shall report only the results based on the use of household size ($x_1$), cars owned per household ($x_2$), and number of adults ($x_3$). The household response variable was $Y = (1 + \text{number of work trips})^{1/2}$, and simple unweighted least squares analyses were made, the aim being to screen the variables to see which accounted for most of the variation in the transformed numbers of work trips.

Values of the coefficients of determination $R^2$ obtained from the regression analysis are shown in Table 2. For Kuwaiti households, the $R^2$ values using household size change markedly among the three house types. This effect can be seen clearly from the scatter diagram plots which are shown in Fig. 6. However, the use of $x_2$ or $x_3$ accounts for higher percentages of the variation in trip rates over all house types.

Fig. 4. Standard deviation-mean trip rate relationship for various household size levels. (Panels: Kuwaitis, Arabs, Asians; horizontal axis: mean trip rate.)
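The screening step described above can be sketched as follows: each candidate variable is used on its own in a simple unweighted least squares regression of the transformed response, and the resulting $R^2$ values are compared. The household records below are hypothetical and serve only to show the calculation. They do not reproduce Table 2, and the names `r_squared`, `sample` and `scores` are illustrative.

```python
import math

def r_squared(x, y):
    """Coefficient of determination for the least squares line of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)

# Hypothetical households: (household size, cars owned, adults, work trips).
sample = [
    (3, 1, 2, 1), (5, 1, 2, 1), (6, 2, 3, 2), (8, 2, 3, 2),
    (9, 3, 4, 3), (11, 2, 4, 2), (12, 4, 5, 4), (14, 3, 6, 3),
]

# Transformed response Y = (1 + number of work trips)^(1/2).
response = [math.sqrt(1 + trips) for *_, trips in sample]

scores = {
    "household size": r_squared([h[0] for h in sample], response),
    "cars owned": r_squared([h[1] for h in sample], response),
    "adults": r_squared([h[2] for h in sample], response),
}
for name, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: R^2 = {value:.3f}")
```

Variables with the larger $R^2$ values would be carried forward into the joint models, which keeps the cross-classification compact.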
For non-Kuwaiti Arab households, it is seen that household size is not a very effective variable for predicting the mean work trip rates. This probably reflects the fact that the larger households are mainly Palestinian and dominated by school-age children who have no real impact on work trips. Number of adults gives a high correlation for both house types. For Asian households the picture is less clear cut, with the number of adults being the most effective variable for apartments but the least effective variable for other dwelling types. The correlations using household size are greater than those using number of cars owned, but neither variable explains a high percentage of the variation in the trip rates.

Fig. 5. Standard deviation-mean trip rate relationship for various household size levels with square root transformation. (Panels: Kuwaitis, Arabs, Asians; horizontal axis: mean trip rate.)

5. ILLUSTRATIVE GLM ANALYSIS
In this section we shall illustrate the GLM approach for modelling mean work trip rates using data for Kuwaiti households living in villas. We restrict attention to two key explanatory variables, $x_1$ = household size and $x_2$ = household car ownership; more variables could of course be considered. The household size ($x_{1j}$) values ranged from 1 to 23 and the car ownership ($x_{2k}$) values ranged from 0 to 9. Of the 230 cells formed by the cross-classification of the values of $x_{1j}$ and $x_{2k}$, 161 contained at least one household. We shall use the notation adopted in Section 3 except that the suffix $i$ is dropped because only one nationality type is examined. For example, $Y_{jkl}$ and $Y^*_{jkl} = (Y_{jkl} + 1)^{1/2}$ now denote the number of work trips and its transformed value, respectively, for household $l$ in cell $(j,k)$, $l = 1, \ldots, n_{jk}$. The following full models and associated methods of fit are considered. In each case, reduced models are also considered to assess the effect of dropping $x_1$ and $x_2$ from the model.

Table 2. Values of $R^2$ for simple linear regression analysis

Nationality   House type      x1      x2      x3
Kuwaitis      Villas         .936    .960    .941
              NHA housing    .637    .970    .941
              Others         .325    .912    .883
Arabs         Apartments     .560    .930    .940
              Others         .360    .755    .940
Asians        Apartments     .523    .442    .935
              Others         .621    .511    .490

x1 = household size; x2 = cars per household; x3 = number of adults.

5.1 Model 1: Classical model for untransformed data

For individual households the assumed model is

$$Y_{jkl} = \mu + \beta_1 x_{1j} + \beta_2 x_{2k} + \beta_3 x_{1j} x_{2k} + \varepsilon_{jkl} \qquad (8)$$

where $E(\varepsilon_{jkl}) = 0$, $\mathrm{var}(\varepsilon_{jkl}) = \sigma^2$. The term $\beta_3 x_{1j} x_{2k}$ is included in the model to allow for an interaction effect between the variables. It should be stressed that our earlier analysis has indicated that the assumption of constant variance is not really appropriate here. For cell means, averaging the model form given by (8) over all households within the cell gives the model:

$$\bar{Y}_{jk.} = \mu + \beta_1 x_{1j} + \beta_2 x_{2k} + \beta_3 x_{1j} x_{2k} + \bar{\varepsilon}_{jk.} \qquad (9)$$

with $E(\bar{\varepsilon}_{jk.}) = 0$, $\mathrm{var}(\bar{\varepsilon}_{jk.}) = \sigma^2/n_{jk}$. The model is fitted by weighted least squares as indicated in Section 3 and an overall measure of goodness of fit of the model is provided by the deviance $\sum_j \sum_k n_{jk} (\bar{Y}_{jk.} - \hat{\mu} - \hat{\beta}_1 x_{1j} - \hat{\beta}_2 x_{2k} - \hat{\beta}_3 x_{1j} x_{2k})^2$ with $161 - 4 = 157$ degrees of freedom.

5.2 Model 2: Classical model for transformed data

If $\bar{Y}^*_{jk.}$ denotes the mean of the transformed observations for cell $(j,k)$, the model is:

$$\bar{Y}^*_{jk.} = \mu + \beta_1 x_{1j} + \beta_2 x_{2k} + \beta_3 x_{1j} x_{2k} + \bar{\varepsilon}^*_{jk.} \qquad (10)$$

where $E(\bar{\varepsilon}^*_{jk.}) = 0$, $\mathrm{var}(\bar{\varepsilon}^*_{jk.}) = \sigma^{*2}/n_{jk}$. The model is again fitted by weighted least squares and goodness of fit measured by the deviance with 157 degrees of freedom.

5.3 Model 3: Poisson model, logarithmic link

In the final model, we specify the distribution of the $\{Y_{jkl}\}$ in cell $(j,k)$, which is taken to be approximately Poisson with mean $\mu_{jk}$. Figure 7 shows plots of $\bar{Y}_{jk.}$ and $\log \bar{Y}_{jk.}$ against $x_{1j}$ and against $x_{2k}$. The plots show fairly similar patterns but the logarithmic plots show rather stronger linear relations. We therefore work with the model

$$\log \mu_{jk} = \mu + \beta_1 x_{1j} + \beta_2 x_{2k} + \beta_3 x_{1j} x_{2k} \qquad (11)$$

If $Y_{jk.}$ denotes the total number of trips in cell $(j,k)$, then $Y_{jk.}$ is approximately distributed as Poisson with mean $\lambda_{jk} = n_{jk} \mu_{jk}$, so

$$\log \lambda_{jk} = \log n_{jk} + \mu + \beta_1 x_{1j} + \beta_2 x_{2k} + \beta_3 x_{1j} x_{2k} \qquad (12)$$

This model is fitted by the method of maximum likelihood, specifying the $\{Y_{jk.}\}$ as Poisson observations with logarithmic link function for their means given by eq. (12). The goodness of fit is measured by the deviance, which for this model is

$$2 \sum_j \sum_k Y_{jk.} \log [ Y_{jk.} / \hat{\lambda}_{jk} ]$$

with 157 degrees of freedom, where $\hat{\lambda}_{jk} = n_{jk} \hat{\mu}_{jk}$ is the fitted mean of the cell total.

The three full models given by eqs. (9), (10), and (11) and their reduced forms with $\beta_3 = 0$, $\beta_1 = \beta_3 = 0$, and $\beta_2 = \beta_3 = 0$ were fitted to the data and their observed deviances are shown in Table 3. The associated degrees of freedom are shown in brackets. Deviance comparisons may be made within a given model type but not between different model types. An examination of the differences between the deviances within each model type shows that the omission of the term $\beta_2 x_{2k}$ in a model leads to a very large increase in deviance, showing that the car ownership level has a very marked influence on the mean work trip rate.

For model 1, tests of significance are not appropriate since the assumption of constant variance is not reasonable. For model 2, to test if the interaction effect is zero ($\beta_3 = 0$), we refer the value of $(19.19 - 18.71)/(18.71/157) = 4.03$ to the $F_{1,157}$ distribution. The result is just significant at the 5% level, indicating the need to retain the full model for the square root transformed data. For model 3, the interaction effect is tested by referring the deviance difference $(135.90 - 135.30) = 0.60$ to $\chi^2_1$. The result shows no evidence against the hypothesis $\beta_3 = 0$. To test the effect of $x_1$, the value of $(141.20 - 135.90) = 5.3$ is referred to $\chi^2_1$. The result is significant at the 2½% level, showing that household size has some linear effect on mean trip rate, although the effect is much less important than that of car ownership.

Estimates of the cell true mean trip rates can be obtained using the following fitted model equations:

Model 1: $\hat{\mu}_{jk} = 0.4087 - 0.0134 x_1 + 0.2568 x_2 + 0.0109 x_1 x_2$

Model 2: $\hat{\mu}_{jk} = (1.193 - 0.00244 x_1 + 0.0834 x_2 + 0.00254 x_1 x_2)^2 - 0.88$

Model 3: $\hat{\mu}_{jk} = \exp(-0.5059 + 0.0137 x_1 + 0.2093 x_2)$
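The test statistics quoted above and the three fitted equations can be evaluated with a short Python sketch. The deviances and coefficients are copied from the text. The helper `chi2_1df_pvalue` is introduced here for illustration; it relies on the standard identity that the upper tail of a chi-square variable with one degree of freedom equals $\mathrm{erfc}(\sqrt{x/2})$.

```python
import math

def chi2_1df_pvalue(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Test statistics built from the deviances quoted in the text.
f_interaction_model2 = (19.19 - 18.71) / (18.71 / 157)   # referred to F(1, 157)
p_interaction_model3 = chi2_1df_pvalue(135.90 - 135.30)  # test of beta_3 = 0
p_household_size_model3 = chi2_1df_pvalue(141.20 - 135.90)  # test of the x1 effect

def model1(x1, x2):
    """Classical linear model fitted to untransformed cell means."""
    return 0.4087 - 0.0134 * x1 + 0.2568 * x2 + 0.0109 * x1 * x2

def model2(x1, x2):
    """Linear model on the square root scale, back-transformed to trip rates."""
    root = 1.193 - 0.00244 * x1 + 0.0834 * x2 + 0.00254 * x1 * x2
    return root ** 2 - 0.88

def model3(x1, x2):
    """Poisson model with logarithmic link (no interaction term)."""
    return math.exp(-0.5059 + 0.0137 * x1 + 0.2093 * x2)

print("F statistic, model 2 interaction:", round(f_interaction_model2, 2))
print("p-value, model 3 interaction:", round(p_interaction_model3, 3))
print("p-value, model 3 household size:", round(p_household_size_model3, 4))
for x1 in (4, 8, 12):
    for x2 in (1, 3, 5):
        estimates = [round(m(x1, x2), 2) for m in (model1, model2, model3)]
        print(f"x1={x1:2d} x2={x2}: {estimates}")
```

For a household of size 8 with two cars, all three equations give an estimated mean close to one work trip, in line with the remark that the three models produce broadly similar estimates.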
Table 4 shows the observed mean trip rates and the three model estimates for $x_1$ = 4, 6, 8, 10, 12 and $x_2$ = 1, 2, 3, 4, 5, 6. The numbers of households are shown in brackets. The results show that the agreement between observed and estimated mean trip rates is generally satisfactory, the only major discrepancy occurring when $x_1 = 4$, $x_2 = 6$, for which only two households were available. The estimates of the three models are relatively similar, indicating that there is flexibility in the choice of a particular model for the data.

It should be stressed that the above fitted models describe observed statistical relationships at the present time. Current policies towards car ownership might well change in the future, so caution would be needed in applying the fitted models in a long range forecasting context.

The fitted models obtained previously used nearly all combinations of the observed values of the two explanatory variables. When more variables are studied jointly, fairly broad groupings of the variable values are needed to give cell frequencies of households of reasonable size. The need for grouping also arises because of the practical difficulty of making a large number of forecasts for future household densities when many cells are defined. The regression approach to modelling used earlier can easily be modified to deal with grouping, by using variable values equal to central values of the groups. For example, if the household size was grouped as 1-3, 4-6, 7-12, 13-20, values $x_1$ = 2, 5, 9.5, 16.5 would be used.

Fig. 6. Observed and fitted work trip rates of Kuwaiti households in different dwelling unit types. (Panels: (A) villas, (B) N.H.A. housing, (C) other dwelling unit type; horizontal axis: household size.)

Fig. 7. Scatter diagram plots for mean and log (mean) trip rates. (Panels: against household size and against car ownership levels.)

6. CONCLUDING REMARKS
Alternative statistical methods of analysis of morning work trip data based on the generalized linear model framework have been described and are believed to be particularly useful in the Kuwait context. These models are of the regression, ANOVA or covariance type, depending on the nature of the variables used to describe household characteristics. They are fitted by weighted least squares, or by maximum likelihood if a specific distribution for the household work trip data is assumed. A major advantage of this approach, and one of particular relevance to Kuwait, is that the estimated mean trip rates for cells of the cross-classification table utilize a model fit based on data from all cells and not just the data from the given cell, which may be sparse.
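To make the maximum likelihood fitting step concrete, the following sketch fits a Poisson log-linear model by Fisher scoring, which is equivalent to iteratively reweighted least squares. It is an assumption-laden illustration rather than the procedure used for the paper: the household records are invented (they are not survey data), the function names are hypothetical, and the interaction term is omitted for brevity. Every household contributes to the estimate for every cell through the shared coefficients, which is how sparse cells borrow information from the rest of the data.

```python
import math

def solve(a, b):
    """Solve a small square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fitted_means(design, beta):
    """Mean trip rates under the log link: mu_i = exp(x_i' beta)."""
    return [math.exp(sum(b * v for b, v in zip(beta, row))) for row in design]

def score(design, y, beta):
    """Gradient of the Poisson log-likelihood: X'(y - mu)."""
    mu = fitted_means(design, beta)
    return [sum(row[j] * (yi - mi) for row, yi, mi in zip(design, y, mu))
            for j in range(len(beta))]

def fit_poisson_loglinear(design, y, n_iter=25):
    """Fisher scoring; the first design column is assumed to be the constant 1."""
    p = len(design[0])
    beta = [math.log(sum(y) / len(y))] + [0.0] * (p - 1)  # intercept-only start
    for _ in range(n_iter):
        mu = fitted_means(design, beta)
        # Expected information matrix X' diag(mu) X
        info = [[sum(row[j] * row[k] * mi for row, mi in zip(design, mu))
                 for k in range(p)] for j in range(p)]
        step = solve(info, score(design, y, beta))
        beta = [b + s for b, s in zip(beta, step)]
    return beta

# Invented records (household size, cars, morning work trips); NOT survey data.
households = [(4, 1, 1), (4, 2, 1), (6, 1, 0), (6, 2, 1), (6, 3, 2),
              (8, 2, 1), (8, 3, 1), (8, 4, 2), (10, 3, 1), (10, 4, 2),
              (10, 5, 3), (12, 4, 1), (12, 5, 2), (12, 6, 3)]
design = [[1.0, size, cars] for size, cars, _ in households]
trips = [t for _, _, t in households]
beta_hat = fit_poisson_loglinear(design, trips)
print([round(b, 4) for b in beta_hat])
```

At convergence the score vector is numerically zero, so the fitted means sum to the observed total number of trips.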
Table 3. Deviance values for fits of three model forms to trip data for Kuwaiti households in villas (degrees of freedom in brackets)

Model structure            Model 1        Model 2        Model 3
Full                       208.2 (157)    18.71 (157)    135.30 (157)
$\beta_3 = 0$              217.0 (158)    19.19 (158)    135.90 (158)
$\beta_2 = \beta_3 = 0$    685.1 (159)    60.18 (159)    405.40 (159)
$\beta_1 = \beta_3 = 0$    227.4 (159)    19.90 (159)    141.20 (159)
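The deviance comparisons reported for models 2 and 3 can be reproduced from the Table 3 entries. The sketch below is a minimal illustration with hypothetical names. It computes the F ratio for the interaction term in the square root model and the two deviance differences for the Poisson model, using the identity that the upper tail of a chi-square variable with one degree of freedom equals erfc(sqrt(x/2)).

```python
import math

# Deviance and degrees of freedom as printed in Table 3.
MODEL2 = {"full": (18.71, 157), "no_interaction": (19.19, 158)}
MODEL3 = {"full": (135.30, 157), "no_interaction": (135.90, 158),
          "no_size_no_interaction": (141.20, 159)}

def f_statistic(reduced, full):
    """F ratio for nested normal-theory models: deviance drop per extra
    parameter divided by the residual mean deviance of the full model."""
    (dev_r, df_r), (dev_f, df_f) = reduced, full
    return ((dev_r - dev_f) / (df_r - df_f)) / (dev_f / df_f)

def chi2_1_tail(x):
    """Upper tail probability of chi-square with 1 d.f.: P(Z^2 > x)."""
    return math.erfc(math.sqrt(x / 2.0))

# Model 2: interaction term, referred to F(1, 157).
f_interaction = f_statistic(MODEL2["no_interaction"], MODEL2["full"])

# Model 3: deviance differences, each referred to chi-square with 1 d.f.
drop_interaction = MODEL3["no_interaction"][0] - MODEL3["full"][0]
drop_size = MODEL3["no_size_no_interaction"][0] - MODEL3["no_interaction"][0]

print(f"F for interaction (model 2): {f_interaction:.2f}")
print(f"Interaction (model 3): {drop_interaction:.2f}, p = {chi2_1_tail(drop_interaction):.3f}")
print(f"Household size (model 3): {drop_size:.2f}, p = {chi2_1_tail(drop_size):.3f}")
```

The tail probability for the household size term falls between 1% and 2.5%, in line with the significance level quoted in the text, while the interaction term in model 3 is far from significant.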
An illustrative GLM analysis was described in which trip rates of Kuwaiti households living in villas were utilized. Three regression-type models were fitted: a classical model for untransformed data, a classical model with a square root transformation of household trip data, and a model that assumes a Poisson distribution of individual household trip rates within cross-classification cells. The explanatory variables used in these illustrative runs were household size and car ownership, and an allowance was made for the interaction effects of these two variables. Other variables may be investigated in a full analysis, and these need not be of the quantitative type. Further analysis is planned in which models of the ANOVA and covariance types will be used for the seven major household groups.

It was concluded from the analysis that: (a) the three models produce generally adequate fits; (b) only cross-classification cells with very low frequencies showed significant discrepancies; (c) the differences between the mean trip rate estimates from the three models are relatively small, indicating that there is flexibility in the choice of a particular model for the data; and (d) both the household size and car ownership variables are needed in explaining variations in work trip rates, although the car ownership variable is more powerful.

The difficulties in the routine use of the cross-classification analysis approach discussed in the Kuwait context could also exist in other environments. These may include communities with distinctly different ethnic groups or where a mixture of households with wide variations in income exists. The regression and ANOVA techniques of the GLM framework described in this paper offer an alternative methodology for the calculation of trip rates in these cases, since they make it possible to (a) identify the variables that significantly influence trip rates; (b) establish which of the single or combined effects of these variables need to be included in the model form; and (c) identify the appropriate groupings where grouped data are used. The trip rate estimates will be based on the entire data set, thus improving the reliability of trip rates for household categories with low frequencies.

Table 4. Observed and estimated mean trip rates (number of households in brackets)

                            Household size
Cars   Estimate        4       6       8       10      12
1      Observed       0.75    0.43    0.92    0.57    0.71
       Model 1        0.66    0.65    0.65    0.64    0.64
       Model 2        0.75    0.74    0.75    0.75    0.75
       Model 3        0.79    0.81    0.83    0.85    0.88
       Households     (8)     (7)     (12)    (7)     (7)
2      Observed       0.60    1.19    0.83    0.77    0.88
       Model 1        0.96    0.97    0.99    1.01    1.02
       Model 2        0.99    1.01    1.02    1.04    1.08
       Model 3        0.97    0.99    1.02    1.05    1.08
       Households     (20)    (57)    (35)    (13)    (9)
3      Observed       1.11    1.23    1.36    1.37    1.09
       Model 1        1.26    1.29    1.33    1.37    1.41
       Model 2        1.25    1.29    1.32    1.35    1.38
       Model 3        1.19    1.23    1.26    1.30    1.33
       Households     (18)    (26)    (28)    (16)    (21)
4      Observed       1.50    1.70    1.76    1.75    1.60
       Model 1        1.55    1.62    1.68    1.74    1.79
       Model 2        1.54    1.59    1.64    1.70    1.74
       Model 3        1.47    1.51    1.55    1.60    1.64
       Households     (4)     (20)    (17)    (20)    (10)
5      Observed       2.00    1.78    2.05    2.87    2.60
       Model 1        1.86    1.94    2.02    2.10    2.19
       Model 2        1.85    1.91    1.98    2.05    2.12
       Model 3        1.81    1.86    1.92    1.97    2.02
       Households     (1)     (9)     (20)    (15)    (5)
6      Observed       1.00    1.50    2.89    2.00    2.63
       Model 1        2.16    2.26    2.37    2.47    2.57
       Model 2        2.16    2.25    2.34    2.44    2.53
       Model 3        2.24    2.30    2.36    2.43    2.50
       Households     (2)     (2)     (9)     (7)     (8)

Acknowledgments-This research is funded by Kuwait University (Project EV 030) and the Kuwait Foundation for the Advancement of Sciences (Project 870803). Data were made available by Kuwait Municipality. Bruce Hutchinson of the University of Waterloo provided some editorial suggestions on an earlier version of the paper. Computations were made by Hassan Khalil. Reda Alfi typed the manuscript.

REFERENCES

Baker, R. and Nelder, J. (1978) GLIM Manual (Release 3). Numerical Algorithms Group, Oxford, U.K.
Dobson, A. J. (1983) An introduction to statistical modeling. Chapman and Hall, London.
Dobson, R. (1976) The general linear model analysis of variance: Its relevance to transportation planning and research. Socio-Economic Planning Sciences, 10, 231-235.
Hutchinson, B. G. and Said, G. M. (1990) Spatial differentiation, transport demands and transport model design in Kuwait. Transport Reviews, 10(2), 91-112.
Kassoff, H. and Deutschman, H. (1969) Trip generation: A critical appraisal. Highway Research Record 297, Highway (now Transportation) Research Board, Washington, DC, pp. 15-30.
Kuwait Municipality (1977) Kuwait Transportation Study. Final Report, Kuwait.
Kuwait Municipality (1983) Master Plan for Kuwait: Second Review. Final Report, Kuwait.
McCullagh, P. and Nelder, J. A. (1983) Generalized Linear Models. Chapman and Hall, London.
Said, G. M. (1990) Work trip rates and the socio-economic characteristics of households in Kuwait. Australian Road Research, 20(1).
Stopher, P. and McDonald, K. (1983) Trip generation by cross-classification: An alternative methodology. Transportation Research Record 944, Transportation Research Board, Washington, DC, pp. 84-91.