Accident Analysis and Prevention 45 (2012) 173–179
Analytic choices in road safety evaluation: Exploring second-best approaches
Rune Elvik a,b,∗
a Institute of Transport Economics, Gaustadalléen 21, NO-0349 Oslo, Norway
b Aalborg University, Department of Development and Planning, Fibigerstræde 13, DK-9220 Aalborg, Denmark
Article info
Article history: Received 22 May 2011; received in revised form 11 December 2011; accepted 17 December 2011
Keywords: Road safety; Evaluation study; Second best approach; Empirical Bayes approach; Methodological study
Abstract
Conducting rigorous before-and-after studies is essential for improving knowledge regarding the effects of road safety measures. However, state-of-the-art approaches like the empirical Bayes or fully Bayesian techniques cannot always be applied, as the data required by these approaches may be missing or unreliable. The choice facing researchers in such a situation is to either apply “second-best” approaches or abstain from doing an evaluation study. An objection to applying second-best approaches is that these approaches do not control as well for confounding factors as state-of-the-art approaches. This paper explores the implications of choice of study design by examining how the findings of several evaluation studies made in Norway depend on choices made with respect to:
1. Using the empirical Bayes approach versus using simpler approaches;
2. Use or non-use of a comparison group;
3. The choice of comparison group when there is more than one candidate.
It is found that the choices made with respect to these points can greatly influence the estimates of safety effects in before-and-after studies. Two second-best techniques (i.e. techniques other than the empirical Bayes approach) for controlling for confounding factors were tested. The techniques were found not to produce unbiased estimates of effect and their use is therefore discouraged.
© 2011 Elsevier Ltd. All rights reserved.
1. Introduction
Methods for conducting observational before-and-after studies of road safety measures have developed considerably in the past 15–20 years. The empirical Bayes (EB) approach (Hauer, 1997) has been extensively applied and come to be regarded almost as the “gold standard” for before-and-after studies (Persaud and Lyon, 2007). The EB-approach is recommended and explained in detail in the Highway Safety Manual (2010). Recently, however, fully Bayesian (FB) approaches have been proposed as an equally rigorous method for performing before-and-after studies of road safety measures (Persaud et al., 2010). Both EB and FB approaches require fairly extensive data and computations if applied in their most rigorous form. These data may not always be available or easy to collect. Problems of data availability may prevent application of the most rigorous techniques for before-and-after studies. A case in point is the evaluation
∗ Corresponding author at: Institute of Transport Economics, Gaustadalléen 21, NO-0349 Oslo, Norway. E-mail address:
[email protected] 0001-4575/$ – see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.aap.2011.12.006
of the lowering of the legal limit for blood alcohol concentration (BAC-limit) in Norway from 0.05 percent to 0.02 percent in 2001 (Assum, 2010). The new BAC-limit was introduced in the whole country; a comparison group retaining the old BAC-limit did not exist. No roadside surveys of the amount of drinking and driving had been made; effects on behaviour could therefore only be assessed in terms of self-reported behaviour. Finally, accidents involving drivers who had been drinking were not recorded; the evaluation had to rely on surrogate accidents that tend to involve a high proportion of drinking drivers (e.g. single vehicle accidents at night). In short, many compromises that reduced study quality had to be made. The dilemma facing analysts in such situations is whether to try to perform a “second best” evaluation study, or refrain from doing an evaluation study at all, given the risk that findings could be misleading. The objective of this paper is to explore the use of second best approaches to road safety evaluation studies in order to gain an impression of whether such approaches are likely to be so erroneous as to be discouraged altogether, or whether they can be applied when certain conditions are fulfilled. Analysis proceeds in two stages. The first stage compares different study designs to determine the extent to which findings based on less rigorous study
[Figure 1: tree diagram of the choice of study design. Branches: empirical Bayes approach with comparison group (large comparison group; matched comparison group) and without comparison group (change in traffic volume; local change in traffic volume); comparison group approach (large comparison group; matched comparison group); before–after based on accident rates; simple before–after. Below the tree, the figure lists the confounding factors controlled by each study design: regression-to-the-mean, long-term trend (comparison group), local changes in traffic volume, or none.]
Fig. 1. Study designs compared.
designs are influenced by confounding factors not controlled for by these designs. The second stage of analysis discusses the possibilities of controlling for potentially confounding factors by means of simpler methods than those applied by state-of-the-art techniques. The empirical Bayes (EB) approach, several versions of which will be compared in this paper, is treated as the state-of-the-art approach. The EB-approach is applied in abridged form (Hauer et al., 2002), i.e. without annual multipliers to adjust for year-to-year changes in the expected number of accidents. This was necessary because a low count of accidents in the before-period in most of the studies implies that year-to-year changes in the number of accidents are mostly random and do not reliably reflect long-term trends.

2. Analytic choices in road safety evaluation
2.1. Study designs compared
The empirical Bayes (EB) approach to road safety evaluation has come to be widely applied because it controls for well-known confounding factors in before-and-after studies, such as regression-to-the-mean, long-term trends and changes in traffic volume (Persaud and Lyon, 2007). In its most advanced form, however, the EB approach requires data that may not always be available, including:
1. The count of accidents at treated sites for several years before and after treatment.
2. Traffic volume at each treated site each year before and after treatment.
3. The model-predicted (normal) number of accidents for each year before treatment, obtained by means of an accident prediction model accounting for as many sources of systematic variation in the number of accidents as possible.
4. The share of systematic variation in accident counts explained by the accident prediction model, stated in terms of the dispersion parameter of the model.
As described by Hauer et al. (2002) and the Highway Safety Manual (2010), the EB-method does not use a comparison group.
Long-term trends are accounted for in terms of yearly multipliers, which are estimated either on the basis of yearly traffic volume data or by means of coefficients estimated by means of the accident prediction model. One reason for preferring this approach to the use of a comparison group is the fact that it may be difficult to choose an appropriate comparison group, as discussed by Hauer et al. (1991) and Hauer (1991). Eight study designs, including four versions of the EB-approach, are compared in this paper:
1. The empirical Bayes approach. This approach was applied in four versions:
   (a) with a large comparison group, which gives the most reliable estimates of long-term trends in the number of accidents;
   (b) with a smaller, matched comparison group, intended to be more similar to the treated sites than the large comparison group;
   (c) without a comparison group, adjusting for changes in traffic volume at treated sites from before to after treatment by relying on coefficients estimated in accident prediction models;
   (d) without a comparison group, adjusting for changes in traffic volume at treated sites if these changes deviate from general changes in traffic volume in the area where the treated sites were located.
2. The comparison group approach. This is a traditional before–after study employing a comparison group. The same two comparison groups (large, not matched and smaller, matched) were used as in the empirical Bayes approach.
3. The before–after approach based on accident rates. Data on traffic volume before and after treatment were used to estimate accident rates and state effects in terms of changes in accident rates.
4. A simple before-and-after comparison based on the recorded number of accidents and not adjusting for any confounding factors.
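The text does not state formulas for the three traditional designs (items 2 to 4). The Python sketch below uses the conventional ratio forms as an assumption, ignoring small-sample corrections and assuming before- and after-periods of equal length. The function names and all numbers are hypothetical and are not taken from the studies.

```python
def simple_before_after(before: float, after: float) -> float:
    """Accident modification factor with no adjustment at all."""
    return after / before


def comparison_group(before: float, after: float,
                     comp_before: float, comp_after: float) -> float:
    """Accident modification factor where the change in the comparison
    group stands in for the long-term trend (assumed ratio form)."""
    return (after / before) / (comp_after / comp_before)


def accident_rate(before: float, after: float,
                  volume_before: float, volume_after: float) -> float:
    """Accident modification factor based on accidents per unit of
    traffic volume, i.e. assuming accidents are proportional to volume."""
    return (after / volume_after) / (before / volume_before)


# Hypothetical treated sites: 40 accidents before, 30 after.
print(simple_before_after(40, 30))            # no adjustment
print(comparison_group(40, 30, 1000, 900))    # comparison group fell by 10 percent
print(accident_rate(40, 30, 10000, 11000))    # traffic volume grew by 10 percent
```

None of these three forms compares the recorded before-count to an estimate of its long-term expected value, which is why none of them controls for regression-to-the-mean.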
Fig. 1 shows these study designs and the potentially confounding factors each design controls for. The empirical Bayes design using a comparison group controls for regression-to-the-mean, long-term trends and local changes in traffic volume. The effect
Table 1
Key data for evaluation studies included.
Study | Measure evaluated | Number of sites | Number of accidents at treated sites before | Number of accidents at treated sites after | Mean duration of before-period (years) | Mean duration of after-period (years)
Stigre (1991) | Priority roads | 5 | 17 | 8 | 4 | 1
Stigre (1993) | Priority roads | 5 | 52 | 45 | 4 | 4
Buran et al. (1995) | Priority roads | 1 | 22 | 13 | 4 | 4
Giæver (1999) | Signing sections | 16 | 61 | 28 | 3 | 2.88
Giæver (1999) | Signing curves | 19 | 38 | 17 | 3 | 2.95
Giæver (1999) | Signing junctions | 12 | 48 | 22 | 3 | 2.83
Kristiansen (1992) | Roundabouts | 26 | 67 | 25 | 5.62 | 5.38
Odberg (1996) | Roundabouts | 22 | 89 | 42 | 4.59 | 3.73
Elvik et al. (2001) | Bypass roads | 20 | 374 | 363 | 6.25 | 6.10
Grendstad et al. (2003) | Environmental streets | 16 | 72 | 62 | 4.69 | 4.50
on accidents of local change in traffic volume is defined, according to Hirst et al. (2004a), as:

\[
\text{local change in traffic volume} = \left( \frac{\text{traffic volume after at treated sites} \,/\, \text{traffic volume before at treated sites}}{\text{traffic volume after in comparison group} \,/\, \text{traffic volume before in comparison group}} \right)^{\text{coefficient}}
\]

The coefficient reflecting the effect on accidents of changes in traffic volume is taken from the accident prediction models developed to estimate the model-predicted (normal) number of accidents in the before-period. To estimate effects of local change in traffic volume this way, general changes in traffic volume in a larger area must be known. Local change is defined as change that deviates from the general changes. Note that if changes in traffic volume in a larger area are not known, changes in traffic volume will be defined as changes from before to after treatment at the treated sites only. The traditional comparison group design does not control for regression-to-the-mean. The before–after design using accident rates assumes that there is a linear relationship between traffic volume and the number of accidents. A simple before-and-after study does not control for any confounding factors.

2.2. Sample of studies
Studies that have evaluated the effects of the following road safety treatments in Norway were used as cases in this study. All these studies permit the comparison of different study designs. The studies are:

1. A set of studies (Stigre, 1991, 1993; Buran et al., 1995) that evaluated the effects of signing urban arterial roads as priority roads in the cities of Hamar, Bærum and Trondheim. These studies were re-analysed and their findings combined by applying techniques of meta-analysis (Elvik et al., 2009).
2. A study evaluating the effects of warning signs at hazardous road locations (Giæver, 1999). This study was also re-analysed and estimates of effect are reported separately for road sections, curves and junctions.
3. Two studies (Kristiansen, 1992; Odberg, 1996) evaluating the effects of converting junctions to roundabouts. Both studies were re-analysed, but their findings are reported separately, as the studies were made in different geographical regions.
4. A study (Elvik et al., 2001) evaluating the effects of bypass roads.
5. A study (Grendstad et al., 2003) evaluating the effects of converting urban main streets to environmental streets. The study was re-analysed. An environmental street is a street that has been redesigned mainly in order to reduce the speed of traffic.

Some key data for these studies are listed in Table 1.
Except for the study evaluating bypass roads, all studies were based on comparatively small accident samples. The before- and after-periods were usually between 3 and 6 years. The empirical Bayes analyses relied on accident prediction models developed for road sections (Ragnøy et al., 2002) and junctions (Elvik, 2011; based on data in Kvisberg, 2003). The models were based on more recent data than the before-periods of most studies. The method suggested by Hirst et al. (2004b) was applied to adjust model predictions – although in this study the adjustments were not made because the accident prediction models were outdated, but rather because they were “too recent” compared to the before-periods of the evaluation studies that were included. Model parameters, their standard errors and P-values are reported in Table 2.
2.3. Application of the EB approach
The EB-approach was applied in abridged form, treating all before-years as a single period, as annual accident counts were too low to reliably identify a long-term trend based on accident history for the treated sites in the before-period. Control for long-term trends was obtained by using a comparison group. The weight given to the model-predicted normal number of accidents when estimating the expected number of accidents in the before-period was defined as:

\[
\alpha = \frac{1}{1 + \lambda / k} \tag{1}
\]

Here, $\lambda$ is the number of accidents predicted by the model and $k$ is the inverse dispersion parameter. The expected number of accidents, controlling for regression-to-the-mean, was obtained as:

\[
\text{expected number of accidents} = \alpha \cdot \lambda + (1 - \alpha) \cdot x \tag{2}
\]

In Eq. (2), $x$ denotes the recorded number of accidents. The inverse dispersion parameter for the model used in the evaluation of environmental streets was 2.99. This parameter was adjusted for varying street length, as suggested by Hauer (2001).
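Eqs. (1) and (2), together with the local change in traffic volume defined in Section 2.1, can be sketched in Python as follows. This is a minimal illustration and not the computation used in the re-analyses: the function and argument names are hypothetical, `predicted` stands for the model-predicted number of accidents, and the site values in the example are invented. Only the inverse dispersion parameter (2.288) and the Ln(AADT) coefficient (0.923) are taken from Table 2.

```python
def eb_weight(predicted: float, k: float) -> float:
    """Eq. (1): weight given to the model-predicted number of accidents.
    k is the inverse dispersion parameter of the prediction model."""
    return 1.0 / (1.0 + predicted / k)


def eb_expected(predicted: float, recorded: float, k: float) -> float:
    """Eq. (2): weighted average of the model prediction and the
    recorded count, which corrects for regression-to-the-mean."""
    alpha = eb_weight(predicted, k)
    return alpha * predicted + (1.0 - alpha) * recorded


def local_traffic_factor(after_treated: float, before_treated: float,
                         after_comparison: float, before_comparison: float,
                         coefficient: float) -> float:
    """Effect on accidents of traffic growth at the treated sites that
    deviates from the general growth in the comparison area."""
    relative_growth = ((after_treated / before_treated)
                       / (after_comparison / before_comparison))
    return relative_growth ** coefficient


# Invented junction: model predicts 4 accidents, 9 were recorded.
print(eb_weight(4.0, 2.288))
print(eb_expected(4.0, 9.0, 2.288))

# Invented volumes: treated sites grew 10 percent, the wider area 5 percent.
print(local_traffic_factor(11000, 10000, 10500, 10000, 0.923))
```

The larger the model prediction is relative to k, the smaller the weight on the model and the closer the EB estimate stays to the recorded count.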
Table 2
Accident prediction models used in EB approach for junctions converted to roundabouts and streets converted to environmental streets.

Accident prediction model for junctions
Variable | Coefficient (standard error) | P-value
Constant term | −12.357 (0.924) | 0.000
Ln(entering volume from major road) | 0.765 (0.077) | 0.000
Ln(entering volume from minor road) | 0.303 (0.057) | 0.000
Number of legs | 0.811 (0.163) | 0.000
Speed limit | 0.017 (0.005) | 0.001
Dispersion parameter | 0.437 (0.115) | 0.000
Inverse dispersion parameter | 2.288 |

Accident prediction model for road sections
Variable | Coefficient (standard error) | P-value
Constant term | −6.112 (0.105) | 0.000
Ln(AADT) | 0.923 (0.009) | 0.000
Ln(number of lanes + 1) | −0.092 (0.088) | 0.295
Ln(junction/km + 1) | 0.189 (0.017) | 0.000
Speed limit 50 km/h | Reference |
Speed limit 60 km/h (dummy) | −0.523 (0.033) | 0.000
Speed limit 70 km/h (dummy) | −0.560 (0.046) | 0.000
Speed limit 80 km/h (dummy) | −0.730 (0.030) | 0.000
Motorway class A speed limit 90 km/h | −1.459 (0.281) | 0.000
Motorway class B speed limit 90 km/h | −1.178 (0.083) | 0.000
Other road with speed limit 90 km/h | −1.105 (0.057) | 0.000
Dummy for trunk road status | −0.084 (0.020) | 0.000
Dispersion parameter | 0.335 (0.011) | 0.000
Inverse dispersion parameter | 2.990 |

3. Results
3.1. Comparison of study designs
Accident modification factors estimated by means of the different study designs, and the standard errors of these factors, are presented in Table 3. An accident modification factor, for example 0.80, shows that the number of accidents was reduced by 20 percent. A total of eight comparisons of the estimates of effect obtained by means of the various study designs can be made. When the mean estimates of effect obtained by means of any of the EB-approaches are compared to the mean estimates of effect obtained by means of any of the traditional approaches, it is seen that a smaller effect is attributed to the safety treatment in five cases (signing hazardous road sections, signing hazardous curves, signing hazardous junctions, conversion to roundabouts in Vestfold and environmental streets). In the remaining three cases the EB-approaches tend to attribute a larger effect to the safety treatment than the traditional approaches, although the difference is very small in the case of bypass roads.
The fact that traditional designs attribute larger effects to the signing of hazardous road locations than the EB-approaches is not surprising. The traditional study designs do not control for regression-to-the-mean, which is likely to be a confounding factor when treatment is introduced at locations known to have a bad safety record. In general, the various versions of the traditional approaches produce very similar estimates of effect, differing at most by a few percentage points. Before–after studies based on accident rates tend to produce larger estimates of effect for roundabouts in Vestfold and bypass roads. The choice of comparison group has a large influence on the estimate of the effect of environmental streets. The estimates of effect obtained by means of the various versions of the EB-approach are also quite close to each other. A tendency can be seen for larger effects to be attributed to the treatments when changes in traffic volume at the treated sites are accounted for by means of coefficients of the accident prediction models. The choice between a large and a matched comparison group has a large influence on the estimate of effect for environmental streets. In general, the comparisons indicate that the traditional study designs do not always control as well for confounding factors as the
Table 3
Comparison of estimates of effect for eight road safety measures based on four versions of the empirical Bayes method and four traditional approaches to before-and-after studies. Accident modification factors (standard errors in parentheses).
Model | Priority roads | Signing hazardous sections | Signing hazardous curves | Signing hazardous junctions | Conversion to roundabouts Akershus | Conversion to roundabouts Vestfold | Bypass roads | Environmental streets
EB large comparison group | 0.77 (0.15) | 0.66 (0.22) | 0.82 (0.30) | 0.57 (0.16) | 0.23 (0.05) | 0.67 (0.13) | 0.77 (0.06) | 0.79 (0.11)
EB matched comparison group | 0.76 (0.21) | 0.67 (0.23) | 0.85 (0.32) | 0.56 (0.16) | 0.21 (0.05) | 0.75 (0.15) | 0.81 (0.08) | 1.04 (0.14)
EB treated site traffic volume | 0.68 (0.18) | 0.62 (0.23) | 0.70 (0.29) | 0.47 (0.15) | 0.21 (0.06) | 0.55 (0.11) | 0.61 (0.05) | 0.73 (0.10)
EB local change in traffic volume | 0.72 (0.28) | 0.73 (0.28) | 0.82 (0.35) | 0.54 (0.18) | 0.29 (0.09) | 0.68 (0.21) | 0.72 (0.06) | 1.07 (0.15)
Mean for EB-approaches | 0.73 (0.21) | 0.67 (0.24) | 0.80 (0.32) | 0.54 (0.16) | 0.24 (0.06) | 0.66 (0.15) | 0.73 (0.06) | 0.91 (0.13)
Before–after large comparison group | 0.81 (0.20) | 0.46 (0.11) | 0.47 (0.14) | 0.50 (0.13) | 0.32 (0.08) | 0.61 (0.11) | 0.76 (0.05) | 0.79 (0.13)
Before–after matched comparison group | 0.83 (0.28) | 0.45 (0.11) | 0.48 (0.14) | 0.49 (0.13) | 0.30 (0.07) | 0.69 (0.13) | 0.77 (0.06) | 1.04 (0.17)
Before–after based on accident rates | 0.81 (0.16) | 0.43 (0.11) | 0.43 (0.13) | 0.51 (0.21) | 0.28 (0.07) | 0.55 (0.10) | 0.63 (0.07) | 0.80 (0.14)
Simple before–after | 0.76 (0.19) | 0.47 (0.11) | 0.46 (0.13) | 0.49 (0.13) | 0.34 (0.08) | 0.62 (0.12) | 0.82 (0.06) | 0.80 (0.14)
Mean for traditional before–after studies | 0.80 (0.21) | 0.45 (0.11) | 0.46 (0.14) | 0.50 (0.15) | 0.31 (0.08) | 0.62 (0.12) | 0.75 (0.06) | 0.86 (0.15)
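As a reading aid for Table 3, the Python sketch below recomputes the two mean rows from the eight individual rows and counts the measures for which the EB mean lies above the traditional mean, i.e. for which the EB-approaches attribute the smaller effect. Treating the mean rows as unweighted arithmetic means is an assumption made here, and the variable names are illustrative; the accident modification factors are those printed in the table.

```python
measures = [
    "Priority roads", "Signing hazardous sections", "Signing hazardous curves",
    "Signing hazardous junctions", "Roundabouts Akershus",
    "Roundabouts Vestfold", "Bypass roads", "Environmental streets",
]

# Accident modification factors from Table 3, one list per study design.
eb = {
    "EB large comparison group":         [0.77, 0.66, 0.82, 0.57, 0.23, 0.67, 0.77, 0.79],
    "EB matched comparison group":       [0.76, 0.67, 0.85, 0.56, 0.21, 0.75, 0.81, 1.04],
    "EB treated site traffic volume":    [0.68, 0.62, 0.70, 0.47, 0.21, 0.55, 0.61, 0.73],
    "EB local change in traffic volume": [0.72, 0.73, 0.82, 0.54, 0.29, 0.68, 0.72, 1.07],
}
traditional = {
    "Before-after large comparison group":   [0.81, 0.46, 0.47, 0.50, 0.32, 0.61, 0.76, 0.79],
    "Before-after matched comparison group": [0.83, 0.45, 0.48, 0.49, 0.30, 0.69, 0.77, 1.04],
    "Before-after based on accident rates":  [0.81, 0.43, 0.43, 0.51, 0.28, 0.55, 0.63, 0.80],
    "Simple before-after":                   [0.76, 0.47, 0.46, 0.49, 0.34, 0.62, 0.82, 0.80],
}


def column_means(rows):
    """Unweighted mean of each column across the given rows."""
    return [sum(column) / len(column) for column in zip(*rows)]


eb_mean = column_means(list(eb.values()))
trad_mean = column_means(list(traditional.values()))

# A higher accident modification factor means a smaller estimated effect.
eb_smaller_effect = [m for m, e, t in zip(measures, eb_mean, trad_mean) if e > t]

for m, e, t in zip(measures, eb_mean, trad_mean):
    print(f"{m:30s} EB {e:.2f}   traditional {t:.2f}")
print(len(eb_smaller_effect), "measures where EB attributes the smaller effect")
```

Under these assumptions the count is five, matching the five cases named in Section 3.1.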
[Figure 2: percentage difference between recorded and model-predicted number of accidents in the before-period (y-axis) plotted against length of before-period in years (x-axis). Vertical bars = plus or minus one standard error.]
Fig. 2. Percentage difference between recorded number of accidents in before-period and model-predicted number of accidents – roundabouts.
EB-approaches. Inadequate control for confounding factors is more likely to be associated with an overestimate of the effect of a safety measure than the opposite. The findings lend support to the view that second-best approaches entail a considerable risk of bias when estimating the effect of a safety measure. The question is therefore if analytic choices can be made within second-best approaches to minimise the risk of bias when estimating the effects of road safety measures.

3.2. Analytic choices in second-best approaches
How can second-best approaches to road safety evaluation be strengthened in order to reduce the chances of confounding? In discussing this question, it will be assumed that the EB-approach cannot be applied, for example because an adequate accident prediction model cannot be developed. Furthermore, it will be assumed that data on traffic volume are not available. Is it still possible to perform a before-and-after study that controls adequately for regression-to-the-mean, long-term trends and changes in traffic volume? With respect to regression-to-the-mean, there are two options:
1. Using a very long before-period, based on the assumption that random variation in the number of accidents is typically short-term and that the mean value of a long time-series of accident counts will tend to be unbiased (i.e. close to the long-term mean). Theoretical arguments supporting this have been put forward by Nicholson (1988).
2. Deleting the worst year in the before-period, based on the assumption that the highest count of accidents was abnormal. This approach was first proposed by Brüde and Larsson (1982).
The first option was applied to two of the studies that form the basis of this paper. In the study by Kristiansen (1992), the before-periods varied from 2 to 8 years. It was therefore possible to test if the recorded number of accidents in the longer before-periods was closer to the (presumably unknown) model-predicted estimate of the expected number of accidents than in the shorter before-periods. The results are reported in Fig. 2. Fig. 2 shows the percentage difference between the recorded number of accidents in the before-period and the model-predicted number of accidents in the before-period as a function of the length of the before-period. The standard error of the percentage difference is shown by means of vertical bars. If using a longer before-period reduces bias, one would expect the differences between the recorded number of accidents and the model-predicted number of accidents to become smaller as the before-period becomes longer. A weak tendency in this direction can be seen in Fig. 2, with an exception for a before-period of 6 years. It should be noted that nearly all the junctions studied by Kristiansen (1992) had an abnormally low number of accidents in the before-period.

In the study of bypass roads (Elvik et al., 2001), the length of the before-period varied from 3 years to 16 years. Fig. 3 shows the percentage difference between the recorded number of accidents in the before-period and the model-predicted number of accidents as a function of the length of the before-period. There is no clear tendency for bias to be reduced as the before-period gets longer. This is somewhat surprising, since the sites had a considerably higher recorded number of accidents before the bypasses were built (374) than the junctions that were converted to roundabouts (67). It would therefore appear that using a long before-period does not necessarily eliminate regression-to-the-mean. Bias may exist even if the before-period is as long as 16 years.

Another way of trying to control for regression-to-the-mean was proposed by Brüde and Larsson (1982). They suggested deleting the worst year in the before-period or the two worst years, depending on how much higher the recorded number of accidents in the before-period was compared to model estimates.
Can this method be applied even when there are no model estimates to which the recorded number of accidents can be compared? An attempt was made to apply the method to the data in the study reported by Odberg (1996). As a group, the junctions he studied had an abnormally high recorded number of accidents in
[Figure 3: percentage difference between recorded number of accidents before treatment and model-predicted number of accidents before treatment (y-axis) plotted against length of before-period in years (x-axis). Vertical bars = plus or minus one standard error.]
Fig. 3. Percentage difference between recorded number of accidents in before-period and model-predicted number of accidents – bypass roads.
the before-period. The method was applied as follows: The highest accident count was deleted when more than 1 year in the before-period had a positive (i.e. not zero) accident count. If, say, the count of accidents during 5 years was: 4 – 2 – 4 – 3 – 4 one of the counts of 4 was deleted and the remaining retained. Only one count was deleted. Fig. 4 shows the results. As can be seen from Fig. 4, deleting the worst year reduced bias in seven cases. It also increased bias in seven cases. In the remaining eight cases, no year was deleted. These results are not very encouraging. While the method may make more sense when its use is
informed by the possibility of comparing the recorded number of accidents to model estimates, as proposed by Brüde and Larsson (1982), a “blind” use of the method seems to produce entirely fortuitous results. Indeed, there is no basis for using the method unless one has reason to believe that the recorded number of accidents was abnormally high in the before-period. Such a reason can only be provided by comparing the recorded number of accidents to an estimate of the long-term mean number of accidents. In short, the method makes no sense unless a model-based estimate of the expected number of accidents is available, but then the method is superfluous.
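The deletion rule described above for the Odberg (1996) data can be sketched in Python as follows. The function name is hypothetical. The text does not specify how the shortened before-period was then scaled or compared to the model estimate, so the sketch stops at the deletion step.

```python
def delete_worst_year(counts):
    """Drop the single highest annual count, but only when more than one
    year in the before-period has a positive accident count.
    If the highest count is tied, only one of the tied years is dropped."""
    years_with_accidents = sum(1 for c in counts if c > 0)
    if years_with_accidents <= 1:
        return list(counts)            # rule does not apply: nothing deleted
    remaining = list(counts)
    remaining.remove(max(remaining))   # list.remove drops the first match only
    return remaining


# The example from the text: one of the three counts of 4 is deleted.
print(delete_worst_year([4, 2, 4, 3, 4]))

# Only one year has accidents, so no year is deleted.
print(delete_worst_year([0, 0, 3, 0]))
```

The rule looks only at the recorded counts themselves. It deletes a year whether or not the before-period was in fact abnormally high, which is the weakness discussed above.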
[Figure 4: difference between recorded and model-predicted number of accidents in the before-period (y-axis) for each junction, numbered 1 through 22 (x-axis), shown without and with deletion of the worst year. Annotations mark the junctions where deletion of the worst year reduces bias and those where it increases bias.]
Fig. 4. Difference between recorded and model-predicted number of accidents in before-period without and with deletion of worst year.
The preliminary conclusion is that neither of the two methods discussed above (using a long before-period or deleting the worst accident count) ensures adequate control for regression-to-the-mean. The validity of these methods remains unproven. Control for regression-to-the-mean can only be obtained if the recorded number of accidents can somehow be compared to an estimate, model-based or otherwise, of the long-term expected number of accidents. Studies that have not controlled for regression-to-the-mean this way are not to be trusted.

How about the other confounding factors mentioned above, namely long-term trends and changes in traffic volume? In general, using a large comparison group and fairly long before- and after-periods (5–6 years) will control for these confounding factors, unless local changes in traffic volume deviate greatly from general trends. This can be seen by comparing the estimates of effect based on traditional before–after studies with a large comparison group and before–after studies based on accident rates in Table 3. These estimates are, in most cases, very close to each other.

4. Discussion
Is it feasible to perform a methodologically adequate before-and-after study of a road safety measure when the empirical Bayes approach cannot be applied? The comparisons that have been made in this paper suggest that the answer is “no”. In the first place, it was found that study design does indeed matter. It is not the case that simpler study designs, like a simple before-and-after study, produce the same estimates of effect as more sophisticated study designs. On the contrary, simple study designs are likely to overstate the effect of a safety measure by not controlling for important confounding factors, in particular regression-to-the-mean. In the second place, trying to control for regression-to-the-mean by means of simpler methods, like prolonging the before-period or deleting the worst year of the before-period, does not seem to eliminate bias.
Deleting the worst year is an entirely arbitrary method unless it is known that accident counts were abnormally high. If, however, that is known to be the case, it is almost certainly possible to apply a more sophisticated and statistically justified method for controlling for regression-to-the-mean. The second-best methods that have been explored in this paper are therefore non-starters. While easy to apply, the accuracy of the methods is totally unknown unless quite detailed information can be obtained regarding the long-term expected number of accidents. When such information is available, it would in most cases be possible to apply at least an elementary version of the empirical Bayes approach to control for regression-to-the-mean. Unless data can be used to argue that regression-to-the-mean, long-term trends and changes in traffic volume are not likely to confound a study it would appear to be better not to perform a second-best study than to apply the ad hoc techniques discussed in this paper.
5. Conclusions The following points summarise the main conclusions of the study reported in this paper: 1. Different study designs for before-and-after studies tend to produce different estimates of the effect of the road safety measures being evaluated. In most cases, traditional approaches attribute larger effects to road safety measures than the empirical Bayes approach. 2. Since the empirical Bayes approach controls better for potentially confounding factors than traditional approaches, it is likely
179
that estimates of effect based on traditional study designs are influenced by confounding factors not controlled for.
3. Two simple methods for controlling for regression-to-the-mean were tested. The methods were: (a) to use a very long before-period and (b) to delete the year with the highest count of accidents in the before-period. Neither of these methods was found to reliably control for regression-to-the-mean. Their use must therefore be discouraged.
4. Second-best studies, relying on the traditional designs for before-and-after studies, can be defended if it can be shown that neither regression-to-the-mean, long-term trends nor changes in traffic volume are likely to confound study results. If this cannot be shown, second-best studies should not be made.
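The difference between the simple methods named in conclusion 3 and the empirical Bayes approach can be illustrated with a minimal sketch. It assumes the commonly used weighting form of the empirical Bayes estimate, in which the recorded count is combined with a model prediction using a weight of 1 / (1 + predicted / k). The function names, the accident counts, the model prediction and the inverse overdispersion parameter k are all hypothetical and are not taken from the studies evaluated in this paper.

```python
from statistics import mean


def eb_expected(observed, predicted, k):
    """Empirical Bayes estimate of the expected accident count in the before-period.

    observed  -- recorded number of accidents in the before-period
    predicted -- count predicted for the same period by an accident model
                 (assumed to be available)
    k         -- inverse overdispersion parameter of that model (assumed value)
    """
    weight = 1.0 / (1.0 + predicted / k)  # weight given to the model prediction
    return weight * predicted + (1.0 - weight) * observed


def mean_without_peak_year(annual_counts):
    """Method (b): mean annual count after deleting the year with the highest count."""
    trimmed = sorted(annual_counts)[:-1]  # drops one occurrence of the maximum
    return mean(trimmed)


# Hypothetical before-period data for one treated site (not from the paper).
before_counts = [3, 7, 2, 4]   # accidents per year
predicted_per_year = 2.0       # hypothetical model prediction
k = 2.0                        # hypothetical inverse overdispersion parameter

years = len(before_counts)
naive_per_year = mean(before_counts)
trimmed_per_year = mean_without_peak_year(before_counts)
eb_per_year = eb_expected(sum(before_counts), predicted_per_year * years, k) / years

print(f"simple mean of before-period: {naive_per_year:.2f}")
print(f"highest year deleted:         {trimmed_per_year:.2f}")
print(f"empirical Bayes estimate:     {eb_per_year:.2f}")
```

With these made-up inputs the three baselines are 4.0, 3.0 and about 3.6 accidents per year. The sketch shows only that the baseline, and hence the estimated effect of a measure, depends on the method chosen. It does not by itself show which estimate is unbiased; that conclusion rests on the empirical comparisons reported in the paper.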