Accid. Anal. & Prev. Vol. 14, No. 5, pp. 381-386, 1982. Printed in Great Britain.
0001-4575/82/050381-06$03.00/0 © 1982 Pergamon Press Ltd.
DEVELOPMENT OF THE MICHIGAN DISAGGREGATE DATABASE ON DRIVING EXPOSURE
MARTIN E. H. LEE
Lee-Gosselin Associés Limitée, 115 Chemin du Bout de l'Île, Ste-Pétronille I.O., Québec, Canada G0A 4C0
Abstract--Recent influences on the design of automobile driving exposure surveys are briefly discussed relative to other areas of travel demand research. The case for the collection of micro data is argued as background to the development of a major driving exposure survey effort, known as the Michigan Driving Experience Survey, which interviewed 7581 drivers statewide during all of 1976. A summary of sampling, interview, data preparation and quality control methodologies is given, with a description of studies so far released and examples of findings. In commenting on Michigan's experience with the methodology, which was implemented at low cost entirely with the sponsoring state agency's own personnel, survey management is identified as the area most critical to success.
The recent increase of substantial efforts to monitor driving exposure is, in part, a response to national and sub-national governments as they attempt to weigh the relative merits of their interventions on behalf of road safety, energy conservation and environmental quality. Attempts to measure the impact of these interventions have led to important changes in the perceived importance of exposure data. Among the most significant are (1) the growing understanding that changes in travel demand (as well as accident frequency) may result from implemented traffic policies, and (2) the increasing sensitivity to the impact on population subgroups of alternative policies. Past perspectives reflected the need to assess only aggregate effects of policies; in these circumstances, exposure was something to be "controlled for". In accident analysis today, driving exposure is emerging as a class of dependent variables, deserving as much attention within different cross sections of the population as the accidents themselves.
DRIVING EXPOSURE RELATIVE TO RECENT TRAVEL DEMAND RESEARCH
Increasing dissatisfaction with aggregate data has not been confined to driving exposure; it has affected research in all areas of travel demand. There are in general two strategies for improving travel data. The first is to modify aggregate models of travel behavior by making them more detailed. The recent history of disaggregate travel demand modelling has demonstrated wide recognition of the need for flexibility to account for more types of human choice than was implicit in classical traffic models (trip generation and distribution, modal split and routing assignment). But as noted by Heggie [1977], the prevalent disaggregate models assume rational choice behavior and stable tastes; and it is under these assumptions that the models are manipulated to evaluate hypothetical policies. Although the coefficients of these models are calibrated to cross-sectional data, there has been little, if any, effort to validate the deductions which have been made from the models using controlled studies in which policies are actually applied. Thus attempts to make the aggregate models more detailed have enabled them at best to represent small stable cross-sections of travellers.
The second of the two strategies for developing new travel data capabilities avoids starting from mathematical models with questionable behavioral bases, and amounts instead to building detailed information about travel choices from surveys of individual "micro" behaviors. The many variations of this strategy use a wide range of quantitative approaches. For example, in one of the most recent, aspects of Monte Carlo simulation and logit models are combined: travel choices are specified for groups with common "decision profiles". The method, known as microsimulation, has been calibrated to attitude survey data and validated with considerable success in a British car pooling scheme by Bonsall [1980].
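The general idea of combining a logit model with Monte Carlo simulation can be sketched as follows. This is only an illustration of the technique as described above, not a reconstruction of Bonsall's model: the group names, the two travel options, the utility values and the group sizes are all hypothetical.

```python
import math
import random

def logit_probabilities(utilities):
    """Multinomial logit: P(k) = exp(U_k) / sum over j of exp(U_j)."""
    top = max(utilities.values())  # subtract the maximum for numerical stability
    exp_u = {k: math.exp(u - top) for k, u in utilities.items()}
    total = sum(exp_u.values())
    return {k: v / total for k, v in exp_u.items()}

def simulate_choices(profiles, group_sizes, rng):
    """Monte Carlo step: draw one choice per simulated traveller in each group."""
    counts = {}
    for group, utilities in profiles.items():
        probs = logit_probabilities(utilities)
        options = list(probs)
        draws = rng.choices(options,
                            weights=[probs[o] for o in options],
                            k=group_sizes[group])
        counts[group] = {o: draws.count(o) for o in options}
    return counts

# Hypothetical "decision profiles": one utility per option for each group.
profiles = {
    "fixed_hours_commuter": {"drive_alone": 0.2, "car_pool": 0.5},
    "irregular_hours": {"drive_alone": 1.0, "car_pool": -0.5},
}
group_sizes = {"fixed_hours_commuter": 1000, "irregular_hours": 1000}

result = simulate_choices(profiles, group_sizes, random.Random(1976))
for group, tally in result.items():
    print(group, tally)
```

Seeding the generator makes a run repeatable; in practice the utilities would have to be calibrated against survey data rather than assumed.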
By contrast, another variation assumes that the complexities of human choice are beyond mathematical description, and suggests the use of individuals (or households) to investigate travel decision-making directly. This is notably achieved through time/space budget studies [after Hägerstrand--see Godkin and Emker, 1975]. Such data are sometimes collected using gaming techniques, which demand novel insight on the part of researchers and planners who use the data. Examples are Jones [1976] and Dix [1977].
The variation represented by most large driving exposure studies falls somewhere between these two examples. At least implicitly, they have viewed behavioral utility models as premature in the absence of sound descriptions of current travel activity, distinguished by the careful recording of travel data over time in considerable detail. Early methodological work, such as that by Carroll et al. [1971], Burg [1973], Waller et al. [1973] and Foldvary [1975], was particularly focussed on problems of accident rate, but provided the basis for sound descriptions of automobile travel activity. Since that time, major exposure efforts have been undertaken in Canada, the United Kingdom, New Zealand, the United States and elsewhere, with an additional policy issue on the agenda--energy consumption. It is perhaps this development which has hastened the "dependent variable" view of driving exposure, as well as the exchange of techniques with other areas of travel demand research.
WHY MICRO DATA ON EXPOSURE?
Driving exposure information is fundamentally a record of individual travel choices made over a given time frame. Only a computerized, micro data base permits analyses of requisite complexity. The requirements for an adequate data base on driving activity, logged or reconstructed for individuals over time, are similar to those for micro data in a number of other public policy fields. Cohen [1973] provides one of the first descriptions of the prerequisites for computerized micro data, developed for policy makers needing to test the limits of proposed programs and policies, and applied in that case to labor market information.
The prerequisites include structuring the data so that individuals are classified on a large number of different characteristics, each of which potentially could be combined with any other to define a subset of special interest. Recent improvements in computer software have made it possible to structure and retrieve data in this form without special programming, and to create new variables which simplify the future identification of subgroups which are "discovered" in the data to have one or several characteristics in common. Thus analyses may proceed either in a "deductive" or "inductive" manner--either assuming which sets of individuals are important, or seeking and characterizing those with interesting commonalities (which latter may amount to finding "target groups" for policies).
BRIEF HISTORY OF THE DATABASE
During 1974, the Michigan Department of State (DOS), whose responsibilities include driver and vehicle licensing, identified the lack of exposure data as a serious handicap in its efforts to study the driving population. Considerable effort had already been invested in constructing a series of sample data bases on driver behavior, using official records of accidents, traffic violation convictions and actions (such as suspension) taken against repeat offenders. To provide exposure data, the Department contracted with the University of Michigan Highway Safety Research Institute (HSRI) in May 1975 to design a survey of driver, vehicle ownership and vehicle usage characteristics. The project, known as the Michigan Driving Experience Survey (MDES), was designed as:
(a) a baseline data-set on vehicle ownership and usage throughout Michigan covering a 12-month period;
(b) a micro-database which would permit extensive cross-sectional analyses as well as an exploratory study of vehicle-usage data collection methods.
The survey was implemented by DOS during all of 1976 in 30 locations throughout Michigan.
Initial processing of the survey data was handled by HSRI until March 1977, after which the project was conducted by DOS research staff. Extensive cross checking and editing of this large data set continued throughout 1977, and a comprehensive program of analysis has been in progress since early 1978.
SUMMARY OF METHODOLOGY DEVELOPED
This project focussed on the performance over time of people, as opposed to vehicles or segments of the highway network. Moreover, it examined people only as drivers of motor vehicles, and not as passengers. Based partly on a review of previous travel demand survey methods, the following objectives were set for data collection:
--use of personal interviews of driver license renewal applicants in local license offices
--training of existing licensing agency personnel to conduct interviews of randomly selected respondents
--oversampling of drivers from rural areas
--emphasis on reconstruction of all trips driven on a recent day to obtain vehicle usage data (with the intention of re-aggregation by population subgroup)
--development of detailed quality control procedures, including close auditing of performance on random selection of respondents.
No known existing travel data sources provided precisely the type of data sought, and seven months of development and testing of the instrument and procedures were necessary before the survey could be implemented on a large scale within the operations of a state government agency. A summary of the major aspects of the design follows; a detailed methodological report is available [Lee, 1980b].
Sampling. The Michigan Driving Experience Survey consisted of 7581 personal interviews of driver license renewal applicants conducted throughout Michigan during all of 1976. It utilized a controlled selection procedure due to Groves [1975] for random selection of sites within two dimensions--level of urbanization, and gasoline sales per capita (the latter being the only indicator available of gross personal travel activity). Because of the scarcity of rural tripmaking data, the rural areas were deliberately oversampled, both in the number of sites and in per-capita sampling rates within sites. All data are capable of being weighted to compensate for designed sampling rates, for variations owing to the day of week of the interview, and for the level of non-response.
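The three weighting adjustments just mentioned can be combined into a single case weight. The following is a minimal sketch of one common way to do so, assuming a multiplicative form; it is not the weighting scheme documented for the MDES. The site labels, sampling rates, response rates, record layout and the equal-share rule for days of the week are all hypothetical.

```python
from collections import Counter

def composite_weights(records, sampling_rate, response_rate):
    """One weight per record: design x day-of-week x non-response factors."""
    day_counts = Counter(r["weekday"] for r in records)
    # Assumption: every weekday observed should carry an equal share of cases.
    target_per_day = len(records) / len(day_counts)
    weights = []
    for r in records:
        design = 1.0 / sampling_rate[r["site"]]        # undo designed oversampling
        day_adj = target_per_day / day_counts[r["weekday"]]
        nonresponse = 1.0 / response_rate[r["site"]]   # inflate for refusals
        weights.append(design * day_adj * nonresponse)
    return weights

def weighted_mean(values, weights):
    """Weighted average of values using the supplied case weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical interviews: rural sites sampled at twice the urban rate.
records = [
    {"site": "rural", "weekday": "Mon", "miles": 40.0},
    {"site": "rural", "weekday": "Tue", "miles": 20.0},
    {"site": "urban", "weekday": "Mon", "miles": 10.0},
    {"site": "urban", "weekday": "Mon", "miles": 30.0},
]
sampling_rate = {"rural": 0.02, "urban": 0.01}
response_rate = {"rural": 0.8, "urban": 0.8}

weights = composite_weights(records, sampling_rate, response_rate)
mean_miles = weighted_mean([r["miles"] for r in records], weights)
print(round(mean_miles, 1))
```

Because the oversampled rural cases receive smaller design weights, statewide estimates are not pulled toward rural travel patterns, while the unweighted rural cases remain available for subgroup analysis.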
Within the 30 sites, a random number system, beyond the control of the employees, was used to select seven or eight interviewees per office per week from among all driver license renewal applicants. Because the system used a meaningless sequence number which becomes a transaction identifier in an audit trail, it was possible to verify later that none of the (unannounced) eligible drivers had been missed. Follow-up procedures, which were more time-consuming than an interview done at the time the driver was in the local bureau, helped keep administrative response very high. Overall, this provided a representative sampling of the Michigan driver population, but it must be noted that drivers under the age of 19 are not represented because they are not old enough to renew a driver's license.
Interview procedures. The interviews were conducted by the managers of the local license bureaus, who generally have excellent public contact skills. They received training in the interview procedures in regional seminars immediately before the survey commenced, and during training visits to every site by the project director during the ensuing four-week start-up period. The interview took place as part of the renewal process, at the manager's discretion. Permission was obtained to offer respondents a waiver of the written renewal test as an incentive and time-saver, and this was used to advantage. In general it was the year-long management of this highly decentralized data collection effort, using the telephone, personal visits and a newsletter, which was the most demanding aspect of the methodology developed.
Instrument. A robust, factual interview form was required. The emphasis of the survey was on the careful reconstruction of a recent trip-day, usually the previous day, and on the complete set of vehicles to which the respondent had access.
Within a time limit of 15-20 min, it proved possible to include:
--"housekeeping" information on each interview or refusal
--personal characteristics available from the license application
--gross mileage estimates for the past year and week
--characteristics of all vehicles normally used
--household count of all vehicles available, including non-road vehicles, trailers, and bicycles
--non-road vehicle, trailer, and bicycle use
--summary reconstruction of all trips driven in a designated day, including: origin, destination, purposes, timing, stops, distance, road type most used, previously described vehicle used, passenger load and relative age, and the availability of public transportation alternatives
--background characteristics, including household composition, employment, education, household income range, type and length of residence, and map location of residence.
Eight versions of the instrument were developed during the project, including one which was used in all sites for a full scale trial varying from two to four weeks.
Data preparation. The procedure for coding and editing of interview forms was designed not only to prepare data for analysis, but to provide early warning of problems with respondent selection or interview procedures in specific sites. Particular effort was made to develop new and flexible classifications of a number of items, such as trip purpose, where the confusion of customary labels (e.g. "personal business") was avoided by using open responses. Cross checking, even though partly automated, took until the end of 1977 to complete. An elaborate system of warning flags in the data base identifies areas of questionable response, editing or coding: such data may be included or excluded at the analyst's discretion.
The survey data have been integrated with the individual accident and traffic conviction records from the files of the sponsoring agency (while individual identity has been deleted). Cross reference capability has been established with selected socioeconomic characteristics of the traffic zones (used by state transportation modellers) in which the respondents resided. Certain socioeconomic characteristics are also available by zip code of residence. In addition, the interview itself provides basic biographical information on the respondent and her/his household. Considerable effort has been made to build two verified summary files.
In a driver file, all time and distance information has been aggregated over the trip day for all trip attributes, including algorithm-assigned travel by purpose in multi-purpose trips; thus, the total time and distance driven by each respondent are expressed in terms of the travel under different trip regimes, purposes, light conditions, road types, vehicles used and passenger load. In a second file, each trip is treated as a separate case, with driver descriptors repeated; this file greatly facilitates analyses of the distribution of all miles driven in the state, or of subclasses of exposure. The files were built primarily with OSIRIS IV software for use with both the OSIRIS and the MIDAS software packages on the University of Michigan computing system.
RESULTS OF THE SURVEY
Overall response was very high, being 85% of those asked to participate. The number of usable interview forms (7581) represents 72% of the number of interviews predicted from the workload of the 30 local driver license bureaus selected for the survey. The difference between the two percentages primarily represents some continuity gaps inevitable in the conduct of a decentralized survey operating over an entire year. Extensive analyses of drop-out from refusals and dubious data found very few significant differences between demographic characteristics of the residual sample and those of the driving population as a whole.
To date, various studies using these data have been published on energy conservation and road safety. Topics covered include: the impact of gasoline rationing, commuter carpooling and the use of smaller automobiles; the ownership and usage characteristics (including driving record) of those driving vans and pickups; the differences in driving trip characteristics between those with high and low accident rates per mile driven; and a detailed investigation of miles driven and gasoline consumed by population subgroups. Many useful findings have been reported.
Among them is the discovery that, despite high overall small car exposure in (especially youthful) high accident rate groups, disaggregate data show the subgroup normally driving small cars to have lower accident rates than the remainder. A recent development has been the extensive analysis of the efficiency with which different population groups "produce" occupant-miles from the gasoline they use; the discovery of differences in occupant-miles per gallon between groups using similar amounts of gasoline per driver has interesting implications for fuel allocation and rationing policies. These two discoveries are both examples of the type of result which only micro data can provide.
SOME OBSERVATIONS ON MICHIGAN'S EXPERIENCE WITH THIS METHODOLOGY
Major factors contributing to the success of this project included the considerable attention paid to organizational issues and to the needs of field personnel, which the continuity of the direction of the survey from conception to analysis made possible. In addition, it was felt that the enormous amount of manual effort required of project staff to maintain detailed quality control--especially over sampling--was well spent.
There are a number of advantages to this methodology. First, a number of design features occur in combination to great advantage. The high degree of detail in the data is enhanced considerably by the ability to cross match data for each respondent from official driving records (accidents and violations), and for each residence location (socioeconomic characteristics by highway network zone or zip code). The decentralized collection of data continuously over an entire year and an entire state captured seasonal differences and permitted the measurement of travel activity in rural areas, often disregarded because of the high cost of putting dedicated interviewers in low-volume locations.
Second, the overall costs ($44,000 in consulting fees for design assistance and initial data preparation) were unusually low, especially relative to the benefits of the data set. This was because of the co-option of locally-knowledgeable government employees, whose interpersonal skills make them preferable to paid interviewers. This approach is particularly viable in times of budget cuts and growing public resistance to unidentifiable survey takers.
Last, the state level scale of the survey was a meaningful precedent. While national surveys are of great informational value, it is the state level at which the majority of road safety policy decisions are made. The survey was also a precedent because it demonstrated that, with design assistance, state government could develop its own tool to get information relevant to a set of known policy environments, without hiring "outsiders".
The potential exists for such a tool to be maintained at a high level of quality by multiple sponsors and users (the different departments of the state).
The Michigan Driving Experience Survey continues to fill a major data need for policy analysis in road safety and other areas. It is hoped that such efforts can be repeated periodically. Present plans include the retrieval of 1981 vehicle ownership data from licensing records for survey respondents in order to examine the safety and energy aspects of automobile downsizing and other issues.
Acknowledgements--The MDES project was initiated by the author while on the staff of the University of Michigan Highway Safety Research Institute; various HSRI personnel assisted with the project, notably Arthur C. Wolfe. The author gratefully acknowledges major contributions to the assembly of the database by Matthew F. Glover, formerly of the Department of State, now with the Minnesota Department of Commerce. Part of the start-up costs were provided by the U.S. Dept. of Transportation (NHTSA), through the Michigan Office of Highway Safety Planning. Views expressed in this paper are not necessarily those of the Michigan Secretary of State, the U.S. Dept. of Transportation or the Michigan Office of Highway Safety Planning.
PAPERS RELEASED USING DATA FROM THE MICHIGAN DRIVING EXPERIENCE SURVEY
Lee M. E., Driver- versus vehicle-based gasoline rationing and the potential for a white market between different income groups in Michigan. Transportation Research Board 58th Annual Meeting, Washington D.C., January 1979 (Abridgement in Transpn Res. Rec. 731, 29-33), 1979(a).
Lee M. E., Some Basic Results from the Michigan Driving Experience Survey. Michigan Department of State, Lansing, January 1979(b).
Lee M. E., Personal Characteristics and Patterns of Vehicle Usage among Van and Pick-up Drivers in Michigan. Society of Automotive Engineers paper No. 790379, Warrendale, Pennsylvania, March 1979(c).
Lee M. E. H. and Glover M. F., The use of disaggregate data to evaluate gasoline conservation policies: smaller cars and carpooling. Transpn Res. Rec. 794, 16-23, 1980.
Lee M. E. H., Glover M. F. and Eavy P. W., Differences in the trip attributes of drivers with low and high accident rates. In: Accident Causation (SP-461). Society of Automotive Engineers, Warrendale, Pennsylvania, February 1980. Also to be published in S.A.E. Proceedings, October 1981.
Lee M. E. H., Driver- versus vehicle-based rationing and the potential for coupon sales between different income groups in Michigan. In: Considerations in Transportation Energy Contingency Planning (SR-191), pp. 138-144. Transportation Research Board, Washington D.C., 1980(a). (Revised and expanded version of the 1979 TRB paper, with a preface on recent findings.)
Lee M. E. H., Methodology for a Disaggregate Statewide Survey of Motor Vehicle Ownership and Usage, Technical Report. Michigan Department of State, Lansing, May 1980(b).
Lee M. E. H., Lawson J. J., Brög W. and Meyburg A. H., Three Viewpoints of the Collection of Travel Behaviour Data over Time. Proceedings Joint Statistical Meetings, American Statistical Association, Houston, Texas, August 1980 (In press).
Lee M. E. H., The Quantity of Gasoline Used, and the Amount of Personal Automotive Travel Produced, by Different Population Sectors in Michigan. Center for Research on the Utilization of Scientific Knowledge, Institute for Social Research, in cooperation with the Michigan Department of State, Ann Arbor, Michigan, 1981 (In press).
Berg M. R., Ray P. H. and Lee M. E. H., with Meany D. P. and Winer M. K., Analyses of Emergency Gasoline Conservation Options. Energy Policy Group, Center for Research on the Utilization of Scientific Knowledge, Institute for Social Research, University of Michigan, 1981 (In review).
REFERENCES
Bonsall P., Microsimulation of organised car sharing: the model and its calibration. Washington: Transportation Research Board 59th Annual Meeting, January 1980.
Burg A., The Effects of Exposure to Risk on Driving Record. Institute of Transportation and Traffic Engineering, University of California, Los Angeles, June 1973.
Carroll P. S., Carlson W. L., McDole T. L., Smith D. W. and Samarco P. F., Acquisition of Information On Exposure and on Non-Fatal Crashes (Volumes I-IV). Ann Arbor, Michigan: Highway Safety Research Institute, The University of Michigan, 1971.
Cohen M. S., On the Feasibility of a Labor Market Information System Draft--Final Report. Volume I. Ann Arbor, Michigan: Institute of Labor and Industrial Relations, University of Michigan-Wayne State University, 1 December 1973.
Dix M. C., Report on Investigations of Household Travel Decision-Making Behaviour. Oxford, England: Transport Studies Unit/University of Oxford, April 1977.
Foldvary L. A., Road accident involvement per miles travelled--I. Accid. Anal. Prev. 7, 191-205, 1975.
Godkin M. and Emker I., Time-Space Budget Studies In Sweden: A Review and Evaluation. Lund University, Lund, Sweden, 1975.
Groves R., CONSEL: Controlled probability selection sampling program. Ann Arbor, Michigan: Institute for Social Research, University of Michigan. Mimeograph, 1975.
Heggie I. G., Putting Behaviour Into Behavioural Models of Travel Choice. Oxford, England: Transport Studies Unit/University of Oxford, January 1977(b).
Jones P. M., A Gaming Approach to the Study of Travel Behaviour, Using a Human Activity Framework. Oxford, England: Transport Studies Unit/University of Oxford, September 1976.
Waller P. F., Reinfurt D. W., Freeman J. L. and Imrey P. B., Method for measuring exposure to automobile accidents. 101st Annual Meeting of Amer. Public Health Assoc., November 1973.