ARTICLE IN PRESS Transport Policy 16 (2009) 325–334
Transport Policy journal homepage: www.elsevier.com/locate/tranpol
Using odometer readings to assess VKT changes associated with a voluntary travel behaviour change program
Rita Seethaler, Geoff Rose
Institute of Transport Studies, Department of Civil Engineering, Monash University, Australia
Article info
Available online 30 October 2009
Abstract
In order to detect changes in daily average vehicle kilometres travelled (VKT) induced by a large-scale TravelSmart intervention in Melbourne, a panel of households was asked to complete before and after surveys, which included week-long odometer readings. In contrast to results reported from previous TravelSmart applications, the Melbourne program did not induce a statistically significant change in the average daily VKT when measured 1 year after the intervention. Multiple regressions revealed that the variability in change in VKT was better explained by socio-demographic variables than by the TravelSmart treatment. The change in VKT was also found to be strongly negatively correlated with the average daily vehicle kilometres recorded in the before survey, indicating the possibility of the ‘regression-to-the-mean’ effect well known in the road safety literature. The conditions under which the regression-to-the-mean effect may create the illusion of a positive TravelSmart program impact on the reduction in daily average VKT are examined. It is concluded that, in the context of voluntary travel behaviour change evaluations, greater attention should be paid to instrument reactivity arising from the impact of the before travel survey on TravelSmart uptake and/or on change in VKT, and to regression-to-the-mean effects. © 2009 Elsevier Ltd. All rights reserved.
Keywords: TravelSmart; Evaluation; Vehicle kilometres; Odometer reading; Panel survey; Sample size; Instrument reactivity; Regression-to-the-mean effect; Response bias
1. Introduction
In 2004, as part of its travel behaviour change program, the Victorian Department of Infrastructure (DOI) rolled out a large-scale community-based voluntary travel behaviour change (VTBC) initiative known as TravelSmart. The initiative targeted some 30,000 private households in the Local Government Area of Darebin, located in north-eastern Melbourne. The aim of this intervention was to achieve a reduction of approximately 10% in car trips and car kilometres across the target population (Richardson, 2005). A private contractor was responsible for the community-wide implementation of TravelSmart using its individualised marketing method IndiMark® (SocialData Australia, 2004). DOI also commissioned an independent evaluation that involved before and after travel surveys conducted in March 2004 and March 2005.
In parallel with the TravelSmart intervention in Darebin, a doctoral research project examined the recruitment phase of TravelSmart and its impacts on car usage (Seethaler and Rose, 2005, 2006). The doctoral research project focussed on a sub-area of Darebin containing approximately 800 households and is referred to here as the sub-regional TravelSmart research project to distinguish it from the regional TravelSmart initiative described above. Consistent with the official TravelSmart program, the research project employed a before and after evaluation using a very similar survey method. This paper draws on the data collected in the research project to address two fundamental research questions: can a one-off community-based TravelSmart campaign engage households over a short period of time to induce lasting behaviour change, and how can any change be reliably evaluated?
The structure of this paper is as follows. Section 2 outlines the features of an independent evaluation of a community-based travel behaviour change program conducted in Melbourne in 2004, including a comparison between the survey instruments used in evaluating the regional intervention and the sub-regional research project. Section 3 compares the daily average vehicle kilometres travelled (VKT) collected before and after the TravelSmart intervention and examines the factors that explain the change in VKT. Section 4 deals with the regression-to-the-mean effect that, when combined with a response bias, can lead to an overestimation of the TravelSmart impact. Finally, Section 5 summarises the findings and outlines a set of recommendations that have relevance to future studies seeking to evaluate the impact of voluntary travel behaviour change programs.
2. Melbourne VTBC program evaluation
2.1. Methods used to assess average daily VKT
Corresponding author. E-mail address: [email protected] (G. Rose). 0967-070X/$ - see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.tranpol.2009.10.006
For the community-based TravelSmart intervention in Darebin, the DOI commissioned a three-pronged evaluation program
consisting of a before and after travel diary survey (called the North Eastern Suburbs Travel Survey or NESTS), an odometer survey that formed part of the household travel survey, and a trend analysis of secondary data sources using traffic count and public transport usage data. Each of the evaluation methods is presented in detail in technical reports, with a summary of the methodology and results provided in the final evaluation report (Richardson, 2005).
In terms of recording the travel behaviour of individual households, the before survey was conducted in March 2004. The design of the NESTS survey instrument drew on the Victorian Activity and Travel Survey (e.g. as described in Richardson et al., 1995) and included a household form that collected details of people and vehicles, and a one-day diary for each household member 5 years of age and older, in which the stages of each trip were recorded in detail (e.g. departure and arrival time, mode choice, purpose, size of party, location). In addition, as a new component, the first series of odometer readings was recorded on the household form, which was then picked up together with the travel diaries the day after the ‘travel recording day’. For the second odometer readings 1 week later, separate odometer cards were sent out to all the NESTS households. For those households that had responded in the before survey, exactly the same procedure was repeated 1 year later, in March 2005, for the after survey.
The ‘before’ evaluation survey was sent to a random selection of the households located in the TravelSmart intervention area. The contractor implementing the TravelSmart program aimed to contact all households in the study area as part of the recruitment process. Some households that were sent the ‘before’ evaluation survey later participated in TravelSmart. Importantly, for the agency that conducted the TravelSmart intervention, the evaluation survey was conducted “blind”.
At the time of the TravelSmart recruitment, the field staff had no knowledge of whether a household was part of an evaluation survey or not.
The DOI decided not to require the use of a control group in the evaluation methodology. An experimental design working with a test and control group must ensure that all subjects are the same in all relevant aspects before and after the intervention except for the specific treatment introduced by the intervention. Normally, a test and control study design then randomly assigns subjects to either the test or the control group. When investigating TravelSmart programs in the natural setting of a field test, these requirements are almost impossible to meet. If all subjects are chosen from the same area and randomly assigned to the test and control group, equity issues may arise where control households feel left out from a TravelSmart intervention in which they would have liked to take part in order to benefit from the free services provided. Also, locating test and control households in the same area can result in cross-contamination, where control households learn about more efficient travel behaviour through friends and neighbours who are part of the test group. To prevent equity issues and cross-contamination, the control group could be recruited from a location other than the TravelSmart area. However, differences in road and public transport infrastructure, quality and type of services provided, and policy interventions by different local authorities make it difficult to meet the ‘ceteris paribus’ condition, as these parameters all affect travel behaviour. Instead of a control group, a trend analysis of secondary data sources using traffic count and public transport usage data was used to compare the TravelSmart intervention results to a benchmark without the intervention (Richardson et al., 2005).
Following the design of the regional TravelSmart intervention evaluation, the sub-regional research project also used a before and after travel diary survey. Whilst personal, household and vehicle characteristics were recorded in exactly the same way as in the NESTS survey, the mode choice of household members was recorded in a simplified manner. Because the Monash research project was primarily focusing on vehicle kilometres travelled (VKT) by privately owned motor vehicles, it was sufficient to collect information on general mode use as a binary variable (mode used/not used on a given day), but to extend the recording period over eight consecutive days (e.g. from one Wednesday to the following Wednesday). In this way, the survey form would be completed gradually over a 1-week period, and could therefore include both odometer readings on the same form: one at the very start and one at the end of the survey period. In contrast to the official NESTS survey, a separate mail-out of a second odometer card to all households was therefore not necessary. The personal pick-up of the survey form was done at the very end of the travel week, instead of having to send staff out twice to collect two separate forms. It could be argued that using the same form for both odometer readings makes it easier for people to “invent” the second result based on the first one rather than taking a new reading. On the other hand, sending out a separate form for the second reading increases the possibility of item non-response and thus renders it more difficult to obtain valid pairs of odometer readings. In order to increase the response levels and secure quality in response, in the Monash research project a reminder phone call was made on the evening before the first odometer reading and a reminder card was placed in the letter box of the households 2 days before the second odometer-recording day.
Table 1 presents a comparison of the survey instruments used in NESTS (for the whole of Darebin) and the research project (which focussed on the Fairfield area in Darebin). While the odometer readings were recorded in a slightly different way in the sub-regional research project, the average daily vehicle kilometres at a household level were comparable with the ones recorded by the NESTS survey (Seethaler, 2006).
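As an illustration only, the household-level measure derived from the two odometer readings can be sketched in Python as follows. The class and function names are hypothetical, and the sketch assumes that seven days elapse between the first and second reading of every vehicle; households that report a different interval would need their own value for `days_between_readings`.

```python
from dataclasses import dataclass


@dataclass
class OdometerPair:
    """First and second odometer reading (km) for one household vehicle."""
    first_km: float
    second_km: float


def average_daily_household_vkt(vehicles, days_between_readings=7):
    """Sum the distance driven by all household vehicles, then divide by the interval."""
    if days_between_readings <= 0:
        raise ValueError("days_between_readings must be positive")
    total_km = 0.0
    for vehicle in vehicles:
        if vehicle.second_km < vehicle.first_km:
            # A lower second reading points to a recording error, not to travel.
            raise ValueError("second reading is lower than the first")
        total_km += vehicle.second_km - vehicle.first_km
    return total_km / days_between_readings


# Hypothetical two-car household: 280 km + 70 km over one week.
household = [OdometerPair(45210.0, 45490.0), OdometerPair(120300.0, 120370.0)]
print(average_daily_household_vkt(household))  # 350 km / 7 days -> 50.0
```

Summing over vehicles before dividing reflects the choice to work at a household level rather than at a vehicle or person level.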
In the sub-regional research project, the vehicle kilometres from the week-long odometer readings were aggregated to a household level. This approach was selected to calculate average daily household VKT because it reduced the problem of variability between household members and between week days that occurs in one-day travel diaries. Also, the starting day of the odometer-recording week was spread evenly over each weekday throughout the random sample of households.
In designing the before and after survey methodology, careful consideration was given to the implications for the required sample size. This was done to ensure that sufficient households were included in the research project in order to statistically test for the anticipated change in VKT. Based on the vehicle kilometres recorded in the cross-sectional Victorian Activity and Travel Survey (VATS) and the 6-week German MobiDrive household panel, the impact of a range of survey design parameters on required sample size was assessed (see Seethaler, 2006 for details). Table 2 summarises the required sample sizes that would be needed to ‘detect’ a change in VKT as a function of:
- two different types of survey (panel vs. cross-sectional);
- two different time periods (weekly vs. daily);
- different levels of aggregation (household vs. person level); and
- different probabilities of making a Type I error (α) and a Type II error (β).
The calculations assume a target population of N = 1500 households, thus requiring application of a finite population factor. In addition, for each of the survey design combinations shown in Table 2, the sample sizes are indicated for cases where the detected difference between the before and after survey has been set to 5%, 10% and 15% of the mean.
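The way these design parameters interact can be illustrated with a textbook sample size calculation. The sketch below is not the procedure used to produce Table 2, which is documented in the cited sources; it assumes a one-sided normal approximation, a hypothetical coefficient of variation `cv` for household VKT, a hypothetical before/after correlation `rho` for the panel case, and a standard finite population correction.

```python
import math
from statistics import NormalDist


def required_sample_size(cv, rel_diff, alpha, beta, population, rho=None):
    """Households needed per wave to detect a relative change in mean VKT.

    cv         : coefficient of variation of household VKT (hypothetical input)
    rel_diff   : detectable difference as a fraction of the mean (0.10 = 10%)
    alpha, beta: probabilities of a Type I and a Type II error
    population : size of the target population (finite population correction)
    rho        : before/after correlation for a panel; None = cross-sectional
    """
    z = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(1 - beta)
    # Variance of the before/after difference, in units of the squared mean.
    if rho is None:
        var_diff = 2 * cv ** 2              # two independent samples
    else:
        var_diff = 2 * cv ** 2 * (1 - rho)  # same households measured twice
    n0 = z ** 2 * var_diff / rel_diff ** 2  # infinite-population sample size
    n = n0 / (1 + (n0 - 1) / population)    # finite population correction
    return math.ceil(n)


# Hypothetical inputs, chosen only to show the direction of the effects.
panel = required_sample_size(0.9, 0.10, 0.05, 0.05, 1500, rho=0.7)
cross = required_sample_size(0.9, 0.10, 0.05, 0.05, 1500)
print(panel, cross)
```

Under these assumptions a positive correlation between waves reduces the variance of the difference, which is why a panel design needs fewer households than a cross-sectional design for the same detectable difference.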
Table 1
Survey instrument contents used in the regional and sub-regional studies.
Form | NESTS (regional TravelSmart intervention in Darebin) | Sub-regional research project (Fairfield)
Household form | Number of people living in the household; Type of dwelling; Ownership status of dwelling; Length of residence at current address; Number of bicycles in the household; Number of dogs in the household; Contact phone number of the household | Number of people living in the household; Type of dwelling; Ownership status of dwelling; Length of residence at current address; Number of bicycles in the household; Number of dogs in the household; Contact phone number of the household
Person form | Person’s first name (as link to travel diary and to second wave); Year of birth; Gender; Relationship to person 1 (oldest person); Country of birth; License holding status; Current employment status | Person’s first name (as link to travel diary and to second wave); Year of birth; Gender; Relationship to person 1 (oldest person); Country of birth; License holding status; Current employment status
Travel recording | One-day diary based on “stops” for each person of 5 years and older: Did the person travel that day or not?; Nature of stop and location; Purpose; Who (from the household) travelled with the person to the stop; Mode use; Additional questions for private vehicle trips; Type of vehicle used; Trip as driver or passenger; Size of the party; Vehicle from vehicle form or other?; Departure and arrival time; Was there further travel to a next stop? | Binary modal choices recording for ‘travel week’ (8 consecutive days) for each person: Car use as a driver (yes/no) on each of the 8 week days; Public transit use (yes/no) on each of the 8 week days; Bicycle use (yes/no) on each of the 8 week days
Vehicle form | Type of vehicle; Make; Model; Year of fabrication; Number of cylinders; Fuel type; Ownership (private, company, government); First odometer reading | Type of vehicle; Make; Model; Year of fabrication; Number of cylinders; Fuel type; Ownership (private, company, government); First odometer reading; Second odometer reading
Separate card | Second separate odometer card, 1 week after ‘travel day’ | (not required)
The results in Table 2 show that the 10% reduction in VKT reported in recent TravelSmart interventions in Australian cities can be ‘detected’ at a household level for a week-long before and after panel survey with a sample of 242 households for alpha and beta levels of 5%. In contrast, using a cross-sectional survey design would require a sample size of 758 households for the same task.
It is necessary to adjust these sample size estimates to account for the assumed response rate for the first wave (a response rate of 60% was assumed for the before survey) and the response rate of the second wave (a response rate of 75% was assumed for the after survey). In addition, residential mobility of residents moving out of and into the area during the year had to be taken into account (estimated to be 15% based on official records of the Australian Bureau of Statistics population census). Thus, taking all three factors into account, the initial sample size estimates above would have to be augmented by a combined adjustment factor of 2.6 [= 1/(0.6 × 0.75 × 0.85)] in order to end up with the required number of responding households needed for the comparison. Therefore, given that a final sample of 242 households would be required if a week-long panel survey was used (as noted above), 629 households (= 242 × 2.6) would need to be recruited. Given that the target area for the sub-regional research project contained 800 households, this suggested that sufficient completed surveys could be obtained to detect a 10% reduction in VKT arising from the TravelSmart intervention.
Whilst the recording of week-long travel data with odometer readings from a panel rather than a cross-section of households is advantageous in terms of reduced variability for the before and after comparison of average daily VKT, other problems arise that have the potential to compromise the quality of the results. Those problems, which are discussed in the following subsections, include the changing composition of households over time, incomplete odometer readings, and instrument reactivity.
2.2. Changing composition of panel households over time
Over a year-long period, some households that participated in the before survey had changed their composition due, for example, to the arrival of a baby or adult children leaving the parental household. Households where more than half of household members were identical in both waves were kept in the sample and used for analysis.
This rule was also applied for shared households (e.g. a group of students living together). In contrast, households where more than half of the household members had changed in the second wave were excluded from the sample. With respect to change in vehicle fleet, households without a car in both waves were kept in the sample but were excluded from the calculations of change in daily average VKT at a household level. All other households, where the composition of the vehicle fleet remained the same in both waves or where cars were replaced by new ones or where the number of motorized vehicles was increased/decreased from the first to the second wave, were included in the calculations of change in daily average VKT.
2.3. Partly incomplete odometer readings
Ideally, household members record two odometer readings for each motorized vehicle in the household on the designated recording days. In some cases, however, householders misunderstood the question and recorded the kilometres travelled on the recording day, or they chose to record the odometer readings on a day other than the designated recording day. In the regional NESTS survey and the sub-regional research project, households with missing odometer recordings were excluded from the analysis. However, before exclusion in the research project, clarification phone calls were made to obtain a correct reading even if it was for a day other than the designated recording day. Reporting odometer readings for a period longer than the designated survey week may introduce bias, for example if householders, after having been away on the second recording day, are ‘adding’ another weekend with high mileage to the recording interval. In this case, average daily VKT at a household level is biased upwards (Richardson, 2005). However, analysing the data collected in the region-wide NESTS and the sub-regional research project showed that average daily VKT did not differ significantly when households with longer reporting periods were added (Seethaler, 2006).
Table 2
Required sample sizes to measure a change in car kilometres travelled. Source: Extract from Richardson (2002, p. 27)
Survey type | Measurement unit | Measurement period | Detectable difference (%) | Alpha = Beta = 5% | 10% | 15% | 20%
Panel | Household | Week | 5 | 653 | 477 | 351 | 251
Panel | Household | Week | 10 | 242 | 157 | 106 | 72
Panel | Household | Week | 15 | 118 | 74 | 49 | 33
Cross-sect. | Household | Week | 5 | 1205 | 1068 | 927 | 774
Cross-sect. | Household | Week | 10 | 758 | 573 | 432 | 315
Cross-sect. | Household | Week | 15 | 468 | 323 | 228 | 159
Panel | Household | Day | 5 | 1184 | 1042 | 897 | 742
Panel | Household | Day | 10 | 726 | 544 | 406 | 295
Panel | Household | Day | 15 | 442 | 302 | 213 | 147
Cross-sect. | Household | Day | 5 | 1332 | 1242 | 1138 | 1012
Cross-sect. | Household | Day | 10 | 998 | 819 | 660 | 512
Cross-sect. | Household | Day | 15 | 703 | 523 | 389 | 281
Note: Results include application of a finite population factor for N = 1500.
2.4. Instrument reactivity
An important problem that reaches beyond the immediate concerns with the quality of an evaluation is the effect that the survey evaluation can have on the TravelSmart intervention itself. Known as ‘instrument reactivity’ (De Vaus, 2001), the before survey can affect how the responding households react to the TravelSmart intervention to which they are exposed subsequent to the survey. To date little, if any, consideration has been given to the potential impact of instrument reactivity in the context of community-based TravelSmart evaluation. As described below, four scenarios are possible, with some enhancing, and others decreasing, the TravelSmart intervention effect (Seethaler, 2005).
2.4.1. Scenario 1: the before survey affects travel behaviour before TravelSmart
By filling in the travel diaries, the household members become aware of the amount of their (motorized) travel. This awareness inclines them to organise their travel patterns more efficiently. Consequently, before the TravelSmart intervention has even started, their VKT at a household level decreases because of their participation in the before survey. As a result, the reduction in VKT due to TravelSmart appears to be larger because it ‘harvests’ some of the travel behaviour change that was actually initiated by the before survey.
2.4.2. Scenario 2: the before survey increases the uptake of TravelSmart
In some households, the before travel survey can raise awareness of daily travel, which in turn can raise the householders’ propensity to participate in TravelSmart once the campaign has started. Households who would not have taken up TravelSmart do so now, because of their exposure to the before travel survey.
2.4.3. Scenario 3: the before survey decreases the uptake of TravelSmart
In some households, the request to fill in a travel diary for each member of the household may result in household members reacting with fatigue to the TravelSmart intervention. Households who might have participated in TravelSmart no longer do so, because of their exposure to the before travel survey.
2.4.4. Scenario 4: the before survey has no effect on travel behaviour nor on participation in TravelSmart
The before travel survey has no effect on travel behaviour, nor does it affect the propensity of the householders to participate in TravelSmart. From an evaluation point of view, this scenario is of course the preferred one because it allows the result of the sample to be expanded to the whole target population without correcting for the effect of survey instrument reactivity.
In both the regional TravelSmart intervention and the sub-regional research project, a number of strategies were employed in order to prevent instrument reactivity from occurring or to minimise its potential effect if it did occur. For example, the before travel survey was conducted several weeks prior to the start of the TravelSmart campaign. The survey announcement letter and materials carried different logos and had a different visual appearance from the TravelSmart materials. Also, the evaluation survey and the TravelSmart intervention were announced by different institutions and carried out by different field staff. Whether all these measures combined were able to eliminate survey reactivity and its impact on TravelSmart is difficult to establish.
Examination of the sub-regional research project data indicated that the mitigation strategies described above seemed to have worked only partially. Although the design of this field test was not set up to fully measure and investigate the different facets of instrument reactivity, data on the before travel survey response and the TravelSmart uptake for the test area revealed interesting results, as shown in Fig. 1. In the test area there is a statistically significant difference in TravelSmart uptake (p = 0.001) between those households who were included in the before travel survey (57.1% yes) and those who were not surveyed (44.7% yes).
Because the survey households were sampled randomly from the same test area, the likelihood of a systematic difference in socio-demographics is reduced. This result therefore suggests that the before travel survey has a positive impact on TravelSmart uptake, as described by Scenario 2 above. This conclusion, however, is somewhat superficial. If the before travel survey households are further examined in terms of their level of response, it appears that respondents to the before travel survey have a much stronger TravelSmart uptake (65.5%) than households that did not respond to the before travel survey (43.8%). This difference of 21.7 percentage points in TravelSmart uptake between respondents and non-respondents to the before travel survey is statistically significant (p < 0.0001), and it points to the presence of a third type of factor (e.g. attitudinal characteristics) positively affecting both response to the before travel survey and TravelSmart uptake.
Fig. 1. TravelSmart uptake by before travel survey outcome. [Figure: households in the test area are split into those included in the before survey (TS-uptake 57.1%) and those with no before survey (TS-uptake 44.7%), z = 3.255 (p = 0.001); surveyed households are further split into before survey response (TS-uptake 65.5%) and before survey non-response (TS-uptake 43.8%), z = 4.147 (p = 3.336E-05).]
In conclusion, the data of the sub-regional research project indicate that some instrument reactivity to the before travel survey must be expected even if mitigation measures are taken (e.g. a large time interval between the survey and the TravelSmart intervention, different organisations, different appearance in materials, etc.). Whilst the present field test was able to detect the problem, further research is necessary to fully understand the different types of instrument reactivity (scenarios) and their underlying mechanisms.
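The uptake comparisons in Fig. 1 are reported as z statistics. One common way to obtain such a statistic is a pooled two-proportion z-test, sketched below as an assumption; the text does not state which test was applied. The group sizes behind Fig. 1 are not reported here, so the counts in the example are hypothetical and the function names are illustrative.

```python
import math
from statistics import NormalDist


def two_sided_p(z):
    """Two-sided p-value of a z statistic under the standard normal."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return z, two_sided_p(z)


# Hypothetical counts only: uptake among surveyed vs. non-surveyed households.
z, p = two_proportion_z_test(200, 350, 200, 450)
print(f"z = {z:.3f}, p = {p:.5f}")

# The z values shown in Fig. 1 map to p-values of roughly the reported size.
print(f"{two_sided_p(3.255):.4f}", f"{two_sided_p(4.147):.2e}")
```

A significant z statistic of this kind establishes only that uptake differs between the groups; as noted above, it cannot separate an effect of the survey from a common underlying factor.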
3. Results for daily average VKT at a household level
Table 3
Descriptive statistics for average daily VKT, at a household level, from the sub-regional research project.
Statistic | Average daily VKT at household level, 2004 | Average daily VKT at household level, 2005 | Change in average daily vehicle kilometres at household level from 2004 to 2005 (calculated as VKT2005 - VKT2004)
Mean | 50.2 | 52.7 | 2.51
Standard error | 2.7 | 4.2 | 4.37
Median | 40.2 | 38.9 | 0.36
Standard deviation | 47.4 | 74.2 | 77.56
Variance | 2245.3 | 5512.4 | 6061.45
Minimum | 0.0 | 0.0 | -325.00
Maximum | 400.4 | 772.0 | 741.28
Kurtosis | 13.0 | 52.6 | 42.21
Skewness | 2.8 | 6.4 | 4.95
Count | 318 | 318 | 318
3.1. Comparison of daily average VKT recorded in 2004 and 2005
During the recruitment process of TravelSmart following the IndiMark® method (SocialData, 2004), each household is assigned a TravelSmart status in one of the following categories:
- I: a household that was Interested in TravelSmart, because they indicated what materials (maps, timetables, etc.) they required from TravelSmart.
- Rw: a household that indicated that they were a Regular user of public transport or non-motorized transport, with need for further information (using the service sheet to order materials from TravelSmart).
- Rwo: a household that indicated that they were a Regular user of public transport or non-motorized transport, without the need for further information.
- N: a household that was Not interested in TravelSmart, and would not like to receive further information.
To assess whether the TravelSmart intervention had an impact on average daily VKT at a household level, the evaluation results of the sub-regional research project were analysed using a simplified TravelSmart status categorisation. Households who requested, and therefore received, materials aimed at changing their travel behaviour (IndiMark® group ‘I’ for Interested and ‘Rw’ for regular user of public transport and non-motorized transport with further information needs) were coded as taking up TravelSmart, that is, as a successful outcome. A negative outcome was coded for Non-Participants who did not actively participate in TravelSmart, consisting of the IndiMark® groups ‘N’ for not interested and ‘Rwo’ for regular users of public transport and non-motorized transport without further information needs, along with Refusals and Sample Loss in the TravelSmart recruitment database. In this context:
- A Refusal occurred when a household refused to participate in the TravelSmart interview process of the recruitment phone call.
- Sample loss occurred in two cases: first, if an address from the residential rates database of the Darebin Council proved not to be a valid occupied residential address and, secondly, if the TravelSmart recruitment team did not contact the householder, e.g. because there was no phone number to contact the household and initiate the recruitment process.
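The simplified categorisation described above amounts to a binary recoding of the recruitment status. A minimal sketch follows; the status labels are taken from the text, while the constant and function names are hypothetical and do not describe the actual recruitment database.

```python
# Statuses coded as a successful outcome (household took up TravelSmart).
UPTAKE_STATUSES = {"I", "Rw"}
# Statuses coded as a negative outcome (non-participants).
NON_PARTICIPANT_STATUSES = {"N", "Rwo", "Refusal", "Sample loss"}


def took_up_travelsmart(status):
    """Return True for TravelSmart uptake and False for non-participation."""
    if status in UPTAKE_STATUSES:
        return True
    if status in NON_PARTICIPANT_STATUSES:
        return False
    # Fail loudly rather than silently assigning an unknown status to a group.
    raise ValueError(f"unknown TravelSmart status: {status!r}")


statuses = ["I", "Rwo", "Refusal", "Rw", "N"]
print([took_up_travelsmart(s) for s in statuses])
# -> [True, False, False, True, False]
```

Note that ‘Rwo’ households are already regular users of public or non-motorized transport, yet they fall on the non-participant side because they received no materials.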
Table 3 presents the main descriptive statistics of the average daily vehicle kilometres at a household level in 2004 and 2005. The data in Table 3 come from all households from the sub-regional research project that completed the odometer recordings in the before and after evaluation survey. In investigating the change in motorized travel, the variable ‘change’ in VKT is of particular interest. It represents the change in average daily vehicle kilometres at a household level from 2004 to 2005.
A negative value indicates that the average daily vehicle kilometres at a household level have decreased, and a positive value indicates that they have increased from 2004 to 2005. Despite using an odometer-recording interval of 1 week and aggregating vehicle kilometres to a household level, the inherent variability of average daily vehicle kilometres is still very large and dominated by outliers (a finding also made by Stopher et al. (2009) in an evaluation of a TravelSmart program in Western Adelaide). For example, when a family makes a trip over the long weekend along the coast, or into the country, in 1 year but not in the other, this one-off holiday trip can result in a change of more than 100 km in average daily vehicle kilometres. While TravelSmart intends to modify daily travel patterns, it is not necessarily expected to alter the mode choice of these one-off holiday journeys. In order to control for the impact of such events, the key results were therefore calculated with both the full data set and a 5% trimmed data set. In the 5% trimmed data set, the top 5% and bottom 5% of values of the composite variable ‘change’ in VKT (from before to after) are removed before calculations are performed. For example, in a data set of 160 households the records with the eight highest and the eight lowest values of the variable ‘change’ are eliminated. A household with a very large negative value for ‘change’ recorded a very high VKT value in the first year, indicating a one-off holiday trip. In contrast, a household with a very high positive value for ‘change’ recorded a very high VKT value in the second year. This procedure was therefore able to control outliers in both years. While it does not affect the median, the mean values, the standard deviation and the variance do change from the untrimmed to the trimmed data set. Although the trimming process reduces sample size, the samples of the different groups are sufficiently large to perform t-tests.
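The trimming step can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the study: the function names are hypothetical, the example data are synthetic, and the t statistic shown is the ordinary paired form, which for before and after data reduces to a one-sample t statistic on ‘change’.

```python
import math
from statistics import mean, median, stdev


def trim_by_change(changes, percent=5):
    """Drop the lowest and highest `percent` of the 'change' values."""
    k = len(changes) * percent // 100      # records removed at each end
    ordered = sorted(changes)
    return ordered[k:len(ordered) - k]


def paired_t_statistic(changes):
    """t statistic and degrees of freedom for H0: mean 'change' equals zero."""
    n = len(changes)
    standard_error = stdev(changes) / math.sqrt(n)
    return mean(changes) / standard_error, n - 1


# Synthetic 'change' values for 160 households (not study data).
changes = [float(c) for c in range(-80, 80)]
trimmed = trim_by_change(changes)

print(len(changes), len(trimmed))          # 160 -> 144 (eight removed per tail)
print(median(changes) == median(trimmed))  # symmetric trimming keeps the median
t, df = paired_t_statistic(trimmed)
print(round(t, 3), df)
```

Because the same number of records is removed from each tail, the median is unchanged, while the mean, standard deviation and variance are recomputed on the retained households.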
Rather than eliminating the trimmed values, an alternative is to replace the original values by the values recorded at the 5th and the 95th percentile respectively (Richardson, 2005). This type of imputation was not adopted here, in order to get a clearer picture of the key statistics without the influence of the long-weekend holiday journeys. In an experiment conducted in a laboratory setting, where the effects of different treatments are to be measured, the group size for each treatment group would normally be held constant. In a case like that, analysis of a continuous variable, such as change in daily average VKT, with respect to categorical data can be conducted using Analysis of Variance (ANOVA) techniques. However, in an experiment conducted in a natural setting, as is the focus of this research, the sizes of different groups may vary and ANOVA is no longer appropriate, and hence paired t-tests are used. Table 4 presents the comparisons in daily average VKT from 2004 and 2005 using the paired t-test across all households
participating in the before and after survey and across survey households that did and did not participate in the TravelSmart program. The results are provided for untrimmed and for 5% trimmed data. The ‘All households’ block of Table 4 shows the result from the paired t-test for untrimmed and 5% trimmed data comparing daily average VKT at a household level in 2004 and 2005 for all households that participated in the before and after survey, regardless of their TravelSmart status. In the untrimmed data set no statistically significant difference in daily average VKT between 2004 and 2005 is apparent (p > 0.05). For the 5% trimmed data, the one-tail t-test indicates that the reduction of 2.6 km in daily average VKT from 2004 to 2005 is just significant (p = 0.05) at a 95% confidence level. When the comparison of untrimmed and 5% trimmed 2004 and 2005 data is repeated separately for the participating and the non-participating households, no statistically significant reduction in daily average VKT can be detected (p > 0.05). These comparisons of daily average VKT from odometer readings show that the TravelSmart program was not able to induce a statistically significant reduction in car mileage at the 95% confidence level among the program participants of the population of the sub-regional research project.

3.2. Factors influencing ‘change’ in average daily VKT

Since there was no statistically significant reduction in daily average VKT, the analysis then considered whether socio-demographic variables were better predictors of change in daily VKT than knowing whether or not the household participated in the TravelSmart program. Multiple regression was used to examine the strength of different influencing factors on change in daily VKT. In this process, the TravelSmart participation status was one of the independent variables along with a number of other socio-demographic variables.
Data preparation to comply with the requirements of linearity, multivariate normality and homoscedasticity, and to screen for multivariate outliers (using Mahalanobis distance), identified an appropriate transformation for the independent variables (IV) and suggested that five households be removed from the data set (trimmed version of ‘change’ outside ±200 km) (Seethaler, 2006). Following this extensive data screening process, the appropriate choice of independent variables with respect to collinearity and singularity was determined. Also, the transformation of the independent variables and the removal of outliers resulted in the assumptions of normality and homoscedasticity required for multiple regression being met (Seethaler, 2006). Three different multiple regression techniques were employed to explore the factors influencing the change in daily average VKT at a household level: standard multiple regression, sequential
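Table 4 Paired t-test for mean daily VKT in 2004 and 2005 by TravelSmart (TS) status.

All households

| Statistic | Untrimmed VKT04 | Untrimmed VKT05 | Trimmed VKT04 | Trimmed VKT05 |
| --- | --- | --- | --- | --- |
| Mean | 50.22 | 52.73 | 43.59 | 41.01 |
| Variance | 2245 | 5512 | 1045 | 928 |
| Observations | 318 | 318 | 286 | 286 |
| Change in VKT | 2.51 | | 2.58 | |
| Pearson’s correlation | 0.24 | | 0.66 | |
| Hypothesized mean difference | 0 | | 0 | |
| df | 317 | | 285 | |
| t-Stat | 0.57 | | 1.69 | |
| P(T ≤ t) one-tail | 0.28 | | 0.05 | |

TS participation = ‘YES’

| Statistic | Untrimmed VKT04 | Untrimmed VKT05 | Trimmed VKT04 | Trimmed VKT05 |
| --- | --- | --- | --- | --- |
| Mean | 54.64 | 60.62 | 47.97 | 45.86 |
| Variance | 2624 | 7093 | 1322 | 1096 |
| Observations | 202 | 202 | 182 | 182 |
| Change in VKT | 5.98 | | 2.11 | |
| Pearson’s correlation | 0.18 | | 0.63 | |
| Hypothesized mean difference | 0 | | 0 | |
| df | 201 | | 181 | |
| t-Stat | 0.90 | | 0.95 | |
| P(T ≤ t) one-tail | 0.18 | | 0.17 | |

TS participation = ‘NO’ or unknown

| Statistic | Untrimmed VKT04 | Untrimmed VKT05 | Trimmed VKT04 | Trimmed VKT05 |
| --- | --- | --- | --- | --- |
| Mean | 42.51 | 38.97 | 37.34 | 34.37 |
| Variance | 1509 | 1082 | 780 | 610 |
| Observations | 116 | 116 | 104 | 104 |
| Change in VKT | 3.54 | | 2.97 | |
| Pearson’s correlation | 0.57 | | 0.72 | |
| Hypothesized mean difference | 0 | | 0 | |
| df | 115 | | 103 | |
| t-Stat | 1.14 | | 1.51 | |
| P(T ≤ t) one-tail | 0.13 | | 0.07 | |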
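As a rough cross-check of how the rows of Table 4 relate to each other, the paired t statistic can be recomputed from the reported means, variances, Pearson correlation and number of observations. The sketch below is illustrative only; the function name `paired_t_from_summary` is hypothetical, and the inputs are the rounded values printed in the table, so the result agrees with the reported t-Stat only to about two decimal places.

```python
import math


def paired_t_from_summary(mean_a, mean_b, var_a, var_b, r, n):
    """Paired t statistic and degrees of freedom from summary statistics.

    Uses Var(a - b) = Var(a) + Var(b) - 2 * r * sd(a) * sd(b).
    """
    var_diff = var_a + var_b - 2.0 * r * math.sqrt(var_a * var_b)
    std_error = math.sqrt(var_diff / n)
    return (mean_a - mean_b) / std_error, n - 1


# All households, 5% trimmed data (values as printed in Table 4).
t_stat, df = paired_t_from_summary(43.59, 41.01, 1045, 928, 0.66, 286)
print(f"t = {t_stat:.2f}, df = {df}")
```

The calculation also shows why the before/after correlation matters: the higher the correlation between the 2004 and 2005 readings, the smaller the variance of the difference and the easier it is for a panel design to detect a given change.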
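Table 5 Beta coefficients for the regression of ‘change’ in daily average VKT from 2004 to 2005.

| Model | Variable | Unstandardized coeff.: Beta | Unstandardized coeff.: Std error | Std. coeff.: Beta | t | Sig. | Collinearity statistics: Tolerance | Collinearity statistics: VIF |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | (Constant) | 106.78 | 10.86 | | 9.83 | 0.00 | | |
| 1 | L10VKT04 | 65.69 | 6.41 | 0.50 | 10.25 | 0.00 | 1.00 | 1.00 |
| 2 | (Constant) | 119.74 | 11.50 | | 10.42 | 0.00 | | |
| 2 | L10VKT04 | 77.11 | 7.31 | 0.59 | 10.55 | 0.00 | 0.75 | 1.34 |
| 2 | BVEH04 | 13.59 | 4.37 | 0.17 | 3.11 | 0.02 | 0.75 | 1.34 |
| 3 | (Constant) | 118.72 | 11.40 | | 10.41 | 0.00 | | |
| 3 | L10VKT04 | 78.12 | 7.25 | 0.60 | 10.77 | 0.00 | 0.75 | 1.34 |
| 3 | BVEH04 | 12.70 | 4.34 | 0.16 | 2.92 | 0.04 | 0.74 | 1.35 |
| 3 | DOGS04 | 10.55 | 4.12 | 0.12 | 2.56 | 0.11 | 0.98 | 1.02 |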
Notation: L10VKT04—logarithmic transformation of daily average VKT2004; BVEH04—dichotomous variable, one or less motorized vehicles coded as ‘0’, two or more motorized vehicles coded as ‘1’; DOGS04—dichotomous variable, household without dogs coded as ‘0’, household with one or more dogs coded as ‘1’.
regression and stepwise regression. Insight from the first two approaches is discussed qualitatively, while empirical results are presented for the stepwise regression. Standard multiple regression is a typical screening process in which every IV is entered at once and beta coefficients reflect the unique contribution of every IV to the regression of the dependent variable (DV). The variance shared between two or more IV’s is not included in the beta coefficients but is part of R2 (Tabachnick and Fidell, 2001). Standard multiple regression with transformed data and untrimmed/trimmed DV ‘change’ revealed that the daily VKT in 2004 (VKT04) was the only significant predictor of the change in vehicle kilometres (Seethaler, 2006). The dependent variable ‘change’ was computed as the difference between vehicle kilometres recorded in 2005 minus vehicle kilometres recorded in 2004. If the recorded VKT in 2004 was larger than in 2005, ‘change’ is negative, reflecting a decrease. A high VKT recording in 2004 is more likely to result in reduction; that is a negative ‘change’. Under a ‘regression-towards-the-mean’ effect, a high recording of VKT in 2004 has a considerable chance of being followed by a lower recording of VKT in 2005. Similarly, a low recording of VKT in 2004 is more likely followed by a higher VKT recording in 2005; that is a positive ‘change’. Sequential regression also includes those parts of the variance that are covered by more than one independent variable jointly. This time, however, the analyst selects the specific sequence in which variables are entered into the equation and that will determine the extent to which an overlapping part of the variance is attributed to each of them. The shared part of the variance between two (or more) variables will be assigned to the variable entered into the equation first. When socio-demographics were entered first only household size (transformed) became just significant (sig. 0.03, beta coefficient 0.174). 
This influence however disappeared once the daily VKT04 was entered in the next sequence. The reverse model with transport variables entered first before socio-demographics also identified VKT04 as the strongest predictor. Thus, both sequential regression models confirmed the findings of the standard multiple regression runs: the daily average VKT2004 was found to be the strongest predictor of the variable ‘change’ even though a small part of its importance was built on variance shared with socio-demographic variables (Seethaler, 2006). When using stepwise regression it is not the analyst who determines the order of entering the independent variables into the equation. Instead the choice of priority inclusion is made on the basis of which variable has the greater overall correlation with
the dependent variable. Even if this order is determined only based on the second or third decimal place, the variable that enters first gets credit for its unique contribution and for the correlation it shares with the second variable. The second variable may then drop back several ranks and other variables are included beforehand because their correlations with the DV are now larger compared to the second variable’s unique contribution even though its overall correlation with the DV was initially much larger (Tabachnick and Fidell, 2001). Letting ‘statistical measures of correlation’ determine the order in which variables are to be entered (or not entered) into the equation, the first independent variable to enter was, as expected, the average daily VKT in 2004 (log transformed). Then, however the average household age (average age across all household members, log transformed) entered next with a significant negative beta coefficient. No other independent variable was further included in the sequential process after that. In multiple regression models including dichotomous IV’s , following the principle of parsimony, the best model was found to be restricted to average daily VKT2004 (transformed), the bivariate number of vehicles in a household (0 or 1 vehicle; 2 or more vehicles) and the presence of dogs in the household. Average household age and household size along with other dichotomous variables reflecting household composition, household income or public transport use were not statistically significant in explaining change in VKT from 2004 to 2005. Also, the variable TravelSmart status had no significant impact on the dependent variable ‘change’ in average daily VKT. Table 5 presents the unstandardized and standardized beta coefficients for each of the variables’ stepwise entry into the regression of ‘change’ in average daily VKT from 2004 to 2005. 
As expected, the (transformed) average daily VKT in 2004 had the strongest (negative) relationship with ‘change’, followed by the dichotomized number of vehicles in the household (bveh04) and the presence of dogs in the household (dogs04). The latter two variables both show a positive association with the variable ‘change’ in daily VKT from 2004 to 2005. Also, the Tolerance and Variance Inflationary Factor (VIF) values indicate a very low collinearity between the independent variables. Without stipulating a causal relationship, the positive beta coefficients of the two dichotomous variables indicate that with two or more vehicles and with dogs present in the household, it was more likely that the household would record an increase in daily average vehicle kilometres from 2004 to 2005 (increase in ‘change’) in contrast to households that only had one or no motorized vehicle and no dogs.
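The columns of Table 5 are linked by two standard relationships: the t value is the unstandardized coefficient divided by its standard error, and the VIF is the reciprocal of the tolerance. The following sketch recomputes both for Model 3 from the magnitudes printed in the table; the dictionary `MODEL_3` and the helper names are hypothetical, and small differences from the printed values are to be expected because the inputs are rounded.

```python
# Model 3 rows of Table 5: (unstandardized coefficient, std error, tolerance).
# Magnitudes are copied as printed in the table.
MODEL_3 = {
    "(Constant)": (118.72, 11.40, None),
    "L10VKT04": (78.12, 7.25, 0.75),
    "BVEH04": (12.70, 4.34, 0.74),
    "DOGS04": (10.55, 4.12, 0.98),
}


def t_ratio(coefficient, std_error):
    """t value of a regression coefficient."""
    return coefficient / std_error


def vif(tolerance):
    """Variance inflation factor implied by a tolerance value."""
    return 1.0 / tolerance


for name, (coef, se, tol) in MODEL_3.items():
    vif_text = f"{vif(tol):.2f}" if tol is not None else "n/a"
    print(f"{name:<11} t = {t_ratio(coef, se):.2f}  VIF = {vif_text}")
```

VIF values close to 1 mean that almost none of a predictor's variance is explained by the other predictors, which is the basis for the statement that collinearity among the three retained variables is low.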
[Figure 2: scatter plot titled ‘Change’ in VKT by ‘VKT04’. Horizontal axis: daily average VKT 2004 (0 to 300). Vertical axis: ‘change’ in VKT (-200 to 200). Fitted trendline: y = -0.6223x + 33.755.]
Fig. 2. Change in VKT from 2004 to 2005 as a function of VKT2004. Note: Excludes data from households where no cars were owned in the before and after travel survey, and from households where the ‘change’ in VKT from 2004 to 2005 was outside plus/minus 200 km.
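Part of the negative slope in a plot of this kind is built in by construction, because ‘change’ is defined as VKT05 minus VKT04 and therefore contains the horizontal-axis variable with a negative sign. For an ordinary least squares fit of ‘change’ on VKT04, the slope equals r × (sd05 / sd04) − 1, where r is the before/after correlation. The sketch below evaluates this identity with the summary statistics for all households in Table 4; the function name `implied_change_slope` is hypothetical, and because Fig. 2 is based on a somewhat different subset of households the results are indicative only.

```python
import math


def implied_change_slope(r, var_before, var_after):
    """OLS slope of (after - before) regressed on before.

    slope = cov(after - before, before) / var(before)
          = r * sd_after / sd_before - 1
    """
    return r * math.sqrt(var_after / var_before) - 1.0


# All households in Table 4: untrimmed and 5% trimmed data.
print(round(implied_change_slope(0.24, 2245, 5512), 3))
print(round(implied_change_slope(0.66, 1045, 928), 3))
```

With equal variances in both years the slope reduces to r − 1, so any before/after correlation below 1 produces a downward-sloping trendline even when no household has changed its underlying behaviour.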
4. The ‘regression-to-the-mean’ problem

The statistically significant negative relationship between daily average vehicle kilometres driven in 2004 and the change in VKT from 2004 to 2005, found in the regression models discussed above, raises suspicion of the presence of the ‘regression-to-the-mean’ effect. This effect, well known in the road safety literature (see footnote 1), refers to the fact that, given random variation over time, a high initial value is more likely to be followed by a lower value than by an even higher value, and a low initial value is more likely to be followed by a higher value than by an even lower value. Fig. 2 plots the average daily VKT measured in the before travel survey against the ‘change’ in VKT from 2004 to 2005. The trendline indicates that a high starting value of VKT in 2004 is more likely to be followed by a negative value of ‘change’, that is, by a lower value of VKT in 2005, and vice versa. This is classic evidence of a regression-to-the-mean effect. In order to overcome a regression-to-the-mean effect, an evaluator wanting to identify change in daily average VKT would therefore need to ensure that the before and after samples reflected the full range of initial VKT values. If households with high levels of before VKT were overrepresented, so would be the negative values for ‘change’, which would then create an illusion that the TravelSmart campaign had been particularly successful. This was confirmed in a simulation study involving a synthetically generated sample of 1000 households. The household VKTs for the before and after surveys were randomly generated from the distribution of VKT in the field test survey. A positive linear relationship between survey response and the initial daily VKT value was assumed, meaning that a higher response rate for households with higher initial VKT values resulted in a negative value of ‘change’ in VKT from 1 year to the next (Seethaler, 2006).
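A simulation of this kind can be sketched as follows. This is not the simulation reported in Seethaler (2006): the lognormal distribution and its parameters stand in for the empirical VKT distribution of the field survey, the response rule (probability proportional to before VKT, capped at 1 at 200 km per day) is only one possible form of a positive linear relationship, and the function name `simulate_response_bias` is hypothetical. No treatment effect is included, so any apparent reduction among respondents arises purely from selection.

```python
import random
import statistics


def simulate_response_bias(n_households=1000, seed=1):
    """Mean 'change' for all households and for respondents only."""
    rng = random.Random(seed)
    all_changes, respondent_changes = [], []
    for _ in range(n_households):
        # Before and after VKT are drawn independently from the same
        # distribution (assumed lognormal), i.e. no true change in behaviour.
        before = rng.lognormvariate(3.6, 0.7)
        after = rng.lognormvariate(3.6, 0.7)
        change = after - before
        all_changes.append(change)
        # Assumed response rule: probability rises linearly with before VKT.
        if rng.random() < min(1.0, before / 200.0):
            respondent_changes.append(change)
    return (statistics.mean(all_changes),
            statistics.mean(respondent_changes),
            len(respondent_changes))


mean_all, mean_respondents, n_respondents = simulate_response_bias()
print(f"all households: {mean_all:.1f} km/day")
print(f"respondents only (n = {n_respondents}): {mean_respondents:.1f} km/day")
```

Under these assumptions the mean ‘change’ across all simulated households stays close to zero, while the mean among respondents is clearly negative, which mirrors the mechanism described in the text.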
1 Road safety engineers often select sites with high crash values for treatment and often report a subsequent reduction in crash rates. However, as noted by Hauer (1980) this outcome may simply be a statistical consequence of having selected high crash rate sites; random variation in crash occurrences would result in a reduction in crash rates at sites which started out with high rates even without any safety treatment.

In the absence of any impact from a TravelSmart campaign or any other external source, the regression-to-the-mean effect combined with a bias of higher response
rates for more mobile households ‘produced’ a reduction in VKT across the simulated surveyed population. In real world conditions the problem of selection bias does occur, either as a function of a policy objective or accidentally. For example, some TravelSmart campaigns deliberately concentrate on those households with high mobility because they are the ones who could most benefit from TravelSmart being able to reduce their travel, thus giving the ‘most bang for the buck’ for limited TravelSmart funding. The situation can also occur ‘accidentally’ because people with different mobility patterns also tend to have different response rates for the travel surveys. For example, throughout the before and after travel surveys conducted in the research project field work, anecdotal evidence suggested that elderly people and households without a car were more inclined to refuse to participate in the travel survey. To avoid a response bias, field staff was therefore specially advised to explain to these persons that their participation was of special importance to the survey results in order to equally represent all situations. To summarize, the empirical data of the sub-regional research project have highlighted the importance of recognizing two factors in the evaluation of TravelSmart interventions by means of before and after travel surveys. First, the potential existence of a regression-to-the-mean effect, and second, a potential bias that could arise if segments of the population, i.e. households with a low initial daily VKT value, are systematically omitted from respondents in the before or the after travel survey. This has demonstrated the importance of representing all segments of a population in an evaluation survey, or where this is not possible, the importance of obtaining knowledge of the characteristics of the non-respondents. 
This is particularly important for a panel survey, where the data from the before and after travel survey are used to create the composite variable of ‘change’. Bonsall (2009) emphasises the need to collect information about respondent characteristics and discusses methods of correcting for nonresponse in another paper in this Special Issue. To avoid any bias, in the sub-regional research project, special efforts were undertaken at the personal delivery stage to explain to householders with low mobility (e.g. elderly without car) the importance of their participation in the survey. In this way, an attempt was made to ensure that households with low mobility would not be underrepresented. Similarly, at the personal pick-up stage of the survey forms, special efforts were undertaken to
ensure that survey forms from highly mobile households could be retrieved.

The results from the trimmed and untrimmed comparisons in average daily VKT (Section 3.1) and the regression of ‘change’ in VKT from 2004 to 2005 (Section 3.2) both suggest that the TravelSmart intervention of the sub-regional research project was not able to produce a significant reduction in daily average VKT at a household level. This is in contrast with findings published previously for other Australian TravelSmart programs that reported a reduction in VKT of the order of 6–14% across the population where the TravelSmart intervention took place (James and Brög, 1998; Brög et al., 1999; Rose and Ampt, 2001). In this context, the question therefore arises as to whether we are dealing with two different approaches that are used to evaluate the same thing, namely the reduction in VKT before and after TravelSmart, and whether those approaches are reaching very different conclusions because of the problems of sample bias and the regression-to-the-mean effect. In early Travel Blending® and Indimark® programs the participants of the program were surveyed and non-participants were assumed to be constant over time in terms of travel patterns and vehicle kilometres travelled. However, if the TravelSmart participants are in any way different, for instance in terms of higher mobility compared to non-participants, then the regression-to-the-mean effect is more likely to ‘produce’ a reduction in VKT (negative ‘change’) instead of or in addition to the TravelSmart treatment. Empirical data from this research project indicate that this could well be the case, as the average daily VKT recorded in the before survey is higher for participants of TravelSmart than for non-participants (Group I = 49 km and Group Rw = 34 km for the participants; Group Rwo = 11 km and Group N = 24 km for the non-participating households).
Thus, the effect of response bias in combination with the regression-to-the-mean effect may be creating an illusion of a strong positive treatment effect. Further research is needed to examine whether the regression-to-the-mean effect combined with a biased coverage was present in past evaluation surveys and how this important problem can be avoided with appropriate measures in the field. Also, given the high variability in the data (daily VKT from travel diaries, average daily VKT from week-long odometer recordings), it is still not clear how long-term trends in behaviour change can be reliably measured. Thus, another important question for future research is whether two point observations that may be subject to high short-term fluctuations are suitable for detecting changes in long-term trends due to a TravelSmart treatment.
5. Conclusions and recommendations

As part of a large-scale TravelSmart program in a local government area of Melbourne, a research project successfully tested a travel survey design applied to a panel of households randomly selected from the TravelSmart project area. This survey instrument allowed the recording of odometer readings of all motorized vehicles in the survey households over a week-long interval. Sample size considerations indicated that this method had an advantage over one-day travel diaries: it reduced the variability in average daily VKT at a household level and thus lowered the sample size requirements. Whilst the use of a panel reduces variability and hence the sample size required to ‘detect’ change in the outcome variables of interest, a number of problems need to be overcome in order to achieve a high quality result. The changing composition of panel households must be addressed, and sample size considerations
need to include the migration rate of households out of a project area between the two survey waves, along with response rates for the before and after survey. A particular problem that is not (yet) well understood is instrument reactivity between the before travel survey and the recruitment process of the TravelSmart intervention. It is worth noting that this problem may affect panel and cross-sectional evaluation surveys alike. Several scenarios are possible, at least one of which may induce a reduction in daily average VKT in households due to the before travel survey that is then ‘harvested’ by the TravelSmart intervention. In neither the research project nor the region-wide TravelSmart intervention were socio-demographic data recorded during the TravelSmart recruitment process. Consequently, it could not be determined from the data whether any positive instrument reactivity or confounding occurred; that is, whether the before travel survey helped to promote the TravelSmart program by raising awareness of daily travel patterns, or whether the respondents to both the before travel survey and TravelSmart belonged to a socio-demographic stratum that is more inclined to engage in such programs in the first place (effect of confounding variables). In future programs, it would be advantageous to collect some socio-demographic information during the TravelSmart recruitment process, including for households that are not part of the evaluation survey. This information would allow checks for the presence of confounding to be undertaken. Data from the sub-regional research project suggested that some positive instrument reactivity could have been present, improving the TravelSmart uptake due to participation in the before travel survey. However, the increase in TravelSmart uptake did not translate into a substantial decrease in average daily VKT for the participants in TravelSmart. Thus, the TravelSmart participation did not translate into a change in travel behaviour.
Instead, the empirical results revealed that, based on odometer readings recorded over a week-long interval in the before and after travel survey, the TravelSmart program was not able to induce a statistically significant reduction in daily average VKT at a household level. Subsequent regression analysis consistently showed that by far the strongest predictor for the variable ‘change’ in average daily VKT was the vehicle mileage measured in the before survey. It is therefore recommended that future evaluations of voluntary travel behaviour change interventions conduct statistical analysis that explicitly shows the comparative strength between sociodemographic and ‘treatment’ variables in predicting behaviour change. The statistically significant and strong negative relationship between average daily VKT of the before travel survey and the variable ‘change’ in average daily VKT raises suspicion of a regression-to-the-mean effect which combined with a sample bias of excluding households with lower average daily VKT in the before survey has the potential to create an illusion of a strong positive treatment effect of TravelSmart. In future evaluations of voluntary travel behaviour change programs, it is therefore of utmost importance to include all segments of the population in the before and after surveys and to gain socio-demographic and travel information about the non-responding households (e.g. by conducting non-response surveys). The reports from the regional TravelSmart project have not been publicly released yet. However, in contrast to the results reported from previous TravelSmart applications in various Australian cities, the Melbourne program did not induce a statistically significant change in average daily vehicle kilometres when measured 12 months after the intervention. While the analysis reported here was conducted without a proper control group, and thus it was impossible to judge whether the lack of any
measurable impact of TravelSmart on VKT might be the result of any effect having been neutralised by an ambient increase in VKT, the results, nevertheless, raise doubts about whether a one-off community-based TravelSmart campaign is able to induce lasting reductions in motor vehicle use. This study has identified a number of issues which need careful consideration in the context of future TravelSmart evaluations. In addition, the absence of a program impact suggests that further research is needed to design voluntary travel behaviour change initiatives in order to engage households in a way that induces the desired changes in travel behaviour.
Acknowledgment

The authors acknowledge the insightful and constructive feedback received from the reviewers and the editors of this special edition which helped us to refine the final version of this paper.

References

Bonsall, P., 2009. Do we know whether personal travel planning really works? Transport Policy 16, 306–314, doi:10.1016/j.tranpol.2009.10.002.
Brög, W., Erl, E., Funke, S., James, B., 1999. Potential for increased public transport, cycling and walking trips. In: 23rd Australasian Transport Research Forum, vol. 23, pp. 287–301.
De Vaus, D., 2001. Research Design in Social Research. Sage Publications Inc., London, Thousand Oaks, New Delhi.
Hauer, E., 1980. Selection for treatment as a source of bias in before and after studies. Traffic Engineering and Control 21, 419–422.
James, B., Brög, W., 1998. Changing travel behaviour through individualised marketing: Application and lessons from South Perth. In: 22nd Australasian Transport Research Forum ATRF, vol. 22, pp. 635–647.
Richardson, A.J., 2005. North-Eastern Suburbs Travel Survey (NESTS)—2005 survey results report. Report to Victorian Department of Infrastructure, TUTI Report 48-2005, The Urban Transport Institute, Victoria.
Richardson, A.J., 2002. Sample Size Design Options for Before & After Survey Evaluation of Transport Programs. Taggerty, AU, The Urban Transport Institute, TUTI Report 13-2002 (Paper presented at the annual Conference of the Transport Research Board TRB, Washington DC).
Richardson, A.J., Roddis, S., Arblaster, D., Attwood, D., 2005. The role of trend analysis in the evaluation of a TravelSmart Program. TUTI Report 42-2005, Paper submitted for Presentation at the 28th Australasian Transport Research Forum, Sydney, September 2005.
Richardson, A.J., Ampt, E.S., Meyburg, A.H., 1995. Survey Methods for Transport Planning. Eucalyptus Press, Melbourne, AU.
Rose, G., Ampt, E.S., 2001. Travel Blending®: an Australian travel awareness initiative. Transportation Research Part D 6, 95–110.
Seethaler, R.K., 2006. Application of persuasion principles to a community based travel behaviour change program. Doctoral Thesis, Monash University, Department of Civil Engineering, Clayton Victoria.
Seethaler, R.K., Rose, G., 2005. Using the six principles of persuasion to promote travel behaviour change—preliminary findings of two TravelSmart field experiments. TUTI Report 44-2005, Paper Submitted for Presentation at the 28th Australasian Transport Research Forum, Sydney, September 2005.
Seethaler, R.K., 2005. Evaluating community-based TravelSmart in Melbourne. TUTI Report 40-2005, Paper Submitted for Presentation at the 2005 Annual Meeting of the Institute of Transportation Engineers, Melbourne, August 2005.
Seethaler, R.K., Rose, G., 2006. Six principles of persuasion to promote community-based travel behaviour change. Transportation Research Record 1956, Washington DC, pp. 42–51.
SocialData Australia, 2004. Final report TravelSmart Darebin. Report to the Department of Infrastructure, Mount Waverley, Australia, SocialData Australia Pty Ltd.
Stopher, P.R., Clifford, E., Swann, N., Zhang, Y., 2009. Evaluating voluntary travel behaviour change—suggested guidelines and case studies. Transport Policy 16, 315–324, doi:10.1016/j.tranpol.2009.10.007.
Tabachnick, B.G., Fidell, L.S., 2001. Using Multivariate Statistics. Allyn and Bacon, Boston.