Developing local safety performance functions versus calculating calibration factors for SafetyAnalyst applications: A Florida case study


Safety Science 65 (2014) 93–105


Safety Science journal homepage: www.elsevier.com/locate/ssci

Jinyan Lu a, Kirolos Haleem b,⇑, Priyanka Alluri b, Albert Gan c, Kaiyu Liu b

a HBC Engineering Company, 13155 SW 134th Street, Suite 207, Miami, FL 33186, United States
b Lehman Center for Transportation Research, Department of Civil and Environmental Engineering, Florida International University, 10555 West Flagler Street, EC 3680, Miami, FL 33174, United States
c Department of Civil and Environmental Engineering, Florida International University, 10555 West Flagler Street, EC 3680, Miami, FL 33174, United States

Article info

Article history: Received 13 November 2013; received in revised form 10 January 2014; accepted 17 January 2014; available online 8 February 2014.

Keywords: Safety performance functions; SafetyAnalyst; Freeway interchange influence areas; Crash prediction; Calibration factors

Abstract

Safety performance functions (SPFs) are a required input to the newly released SafetyAnalyst software tool. Although SafetyAnalyst provides default SPFs that were developed based on data from multiple states in the United States, agencies have the option to calibrate local SPFs to better reflect their local conditions. However, the benefit from local calibration of SPFs is unclear and may vary from state to state. Using statistical goodness-of-fit measures and visual plots, this paper compares the performance of locally-calibrated SPFs using Florida data with the default SPFs from SafetyAnalyst for both freeway interchange influence areas and basic segments. An interchange influence area is one that extends 0.3 miles upstream and downstream of the respective gore point. Unlike for intersections, an automatic process for segmenting the influence areas for interchanges based on the above definition is relatively complex. Therefore, this paper also describes a spatial method to automatically segment a freeway facility into interchange influence areas and basic segments. Using four years of local crash data (2007–2010) from Florida, SPFs for both types of segments for both urban and rural areas were developed using the negative binomial regression model. The results showed that Florida-specific SPFs generally produced better-fitted models than the calibrated SafetyAnalyst default SPFs. This was clear in that the majority of Florida-specific models had higher Freeman–Tukey R-square ($R^2_{FT}$), as well as lower mean absolute deviance (MAD) and mean square prediction error (MSPE) estimates. Overall, the results suggest that agencies implementing SafetyAnalyst could improve their crash prediction by developing local-specific SPFs.

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

A comprehensive highway safety analysis software system known as SafetyAnalyst (AASHTO, 2010a) was released in 2010 as an AASHTOWare product by the American Association of State Highway and Transportation Officials (AASHTO). The result of a major cooperative effort between the Federal Highway Administration (FHWA) and 27 states in the U.S., SafetyAnalyst is designed to account for the regression-to-the-mean (RTM) bias in the traditional practice of selecting locations for safety improvements (Lu et al., 2009). The RTM bias results in the overestimation of crash reduction and, thus, of the effectiveness of safety measures. To address this bias, SafetyAnalyst implements the empirical Bayes (EB) method, which requires the use of safety performance functions (SPFs) (Gan et al., 2012).

⇑ Corresponding author. Tel.: +1 (321) 276 7889; fax: +1 (305) 348 2802. E-mail addresses: [email protected] (J. Lu), khaleemm@fiu.edu (K. Haleem), palluri@fiu.edu (P. Alluri), gana@fiu.edu (A. Gan), liuk@fiu.edu (K. Liu). http://dx.doi.org/10.1016/j.ssci.2014.01.004 0925-7535/© 2014 Elsevier Ltd. All rights reserved.

SPFs are regression models that are used to predict the average crash frequency of roadway segments or intersections. Traditionally, SPFs are a function of both traffic and geometric characteristics. An example of such an SPF can take on the following general functional form:

$$\mathrm{SPF} = \exp\left(a + b \cdot \ln(\mathrm{AADT}) + b_1 X_1 + b_2 X_2 + \cdots + b_n X_n\right) \qquad (1)$$

where SPF is the predicted crash frequency, AADT is the annual average daily traffic, $X_1, X_2, \ldots, X_n$ are n roadway geometric variables, and $a, b, b_1, b_2, \ldots, b_n$ are the regression coefficients. This specific functional form assumes that crash occurrences follow the negative binomial (NB) distribution. A general problem with such an SPF is that it is subject to correlation among the traffic and geometric variables. As such, SafetyAnalyst adopts a form of SPF that includes only traffic volume. This is referred to as a "traffic" SPF, or "simple" SPF, as opposed to the "full" SPF shown in Eq. (1), which includes geometric variables in addition to traffic volume. A simple SPF based on the same NB distribution takes on the following functional form (Harwood et al., 2010):

$$\text{Simple SPF} = \exp\left(a + b \cdot \ln(\mathrm{AADT})\right) \qquad (2)$$
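As a minimal sketch of Eq. (2), the simple SPF can be evaluated directly. The coefficient values below are hypothetical and chosen only for illustration; they are not taken from SafetyAnalyst or from the Florida models.

```python
import math


def simple_spf(aadt, a, b):
    """Predicted crash frequency from a traffic-only SPF, Eq. (2)."""
    if aadt <= 0:
        raise ValueError("AADT must be positive")
    # exp(a + b * ln(AADT)) is equivalent to exp(a) * AADT ** b
    return math.exp(a + b * math.log(aadt))


# Hypothetical coefficients, for illustration only.
a, b = -8.0, 1.0
for aadt in (20_000, 40_000, 80_000):
    print(aadt, round(simple_spf(aadt, a, b), 2))
```

Because the form is equivalent to exp(a) × AADT^b, the coefficient b acts as an elasticity: with b = 1, doubling the AADT doubles the predicted crash frequency.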

Because a simple SPF does not include geometric variables, it must be calibrated for some base geometric conditions such as 12-ft lane width, 6-ft shoulder width, etc. To predict crashes for a specific site, a set of crash modification factors (CMFs) can be applied to account for the effects from the prevailing geometric conditions that may have deviated from the base conditions from which the SPF was calibrated. This is implemented in the Highway Safety Manual (HSM) (AASHTO, 2010b) as follows:

$$N_{\text{Predicted, HSM}} = \text{Simple SPF} \times \mathrm{CMF}_1 \times \mathrm{CMF}_2 \times \cdots \times \mathrm{CMF}_n \qquad (3)$$

where $N_{\text{Predicted, HSM}}$ is the predicted crash frequency after accounting for the CMFs in the HSM, $\mathrm{CMF}_1$ is the crash modification factor for variable 1 (e.g., lane width), $\mathrm{CMF}_2$ is the crash modification factor for variable 2 (e.g., shoulder width), and $\mathrm{CMF}_n$ is the crash modification factor for the nth variable. Each CMF accounts for the effect on crash occurrence of a change (increase or decrease relative to a base value) in a specific geometric characteristic. For example, if the CMF for "lane width = 11 ft" is 1.1 and it was calibrated against a base lane width of 12 ft, a roadway segment with 11-ft lanes is expected to experience 10% more crashes than a comparable segment with 12-ft lanes.

Unlike the HSM, SafetyAnalyst includes a set of default SPFs that were developed using crash data from northern and western U.S. states, including California, Minnesota, North Carolina, Ohio, and Washington. Note that the default SPFs in SafetyAnalyst were developed considering traffic as the sole predictor of crashes without accounting for variations in geometric conditions, i.e., no CMFs were used. To reflect local crash experience when applying SafetyAnalyst, local jurisdictions can choose to calibrate SPFs using local data, which can be a major undertaking. An alternative approach is to make use of the default SPFs and apply a respective "calibration factor", C, as follows:

$$N_{\text{Predicted, SafetyAnalyst}} = \text{Default Simple SPF} \times C \qquad (4)$$

where C is calculated as follows:

$$C = \frac{\sum_{\text{all sites}} \text{Observed crashes}}{\sum_{\text{all sites}} \text{Predicted crashes}} \qquad (5)$$

Thus, if a local jurisdiction experienced crashes that were 10% higher than the national average, as predicted by the default SPF, the calibration factor will be 1.1. The use of calibration factors to adjust to local conditions assumes that the crash distribution of the local jurisdiction has the same shape as that of the national average, and that their crash experiences differ only by a constant factor (i.e., the calibration factor). This assumption is not valid when the national and local crash distributions differ not by a factor, but in the underlying trend itself.

Would it be beneficial to develop local SPFs instead of applying the default SPFs with calibration factors for SafetyAnalyst applications? Using four years of crash data from 2007 to 2010, this paper first develops SPFs for urban and rural freeway facilities in Florida. Crash predictions from these local SPFs are then compared with those using local calibration factors with default SPFs from SafetyAnalyst to assess the potential benefits of calibrating local SPFs. Given that the crash and traffic flow characteristics of interchange influence areas and basic freeway segments are considerably different (Kiattikomol, 2005), this paper includes separate models for these two categories of freeway facilities. Unlike for intersections, the process for segmenting the influence areas for interchanges is more involved. Thus, a process to perform the segmentation using a Geographic Information System (GIS) approach is also described in this paper.
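The calibration step in Eqs. (4) and (5) can be sketched in a few lines. This is an illustrative example only: the site data and the default coefficients are hypothetical, and the function names are not part of SafetyAnalyst.

```python
import math


def default_spf(aadt, a, b):
    """Traffic-only SPF of Eq. (2) with default (non-local) coefficients."""
    return math.exp(a + b * math.log(aadt))


def calibration_factor(observed, predicted):
    """Eq. (5): total observed crashes over total predicted crashes."""
    total_predicted = sum(predicted)
    if total_predicted <= 0:
        raise ValueError("total predicted crashes must be positive")
    return sum(observed) / total_predicted


# Hypothetical local sites: (AADT, observed crashes).
sites = [(30_000, 12), (55_000, 20), (90_000, 41)]
a, b = -8.0, 1.0  # hypothetical default coefficients

observed = [obs for _, obs in sites]
predicted = [default_spf(aadt, a, b) for aadt, _ in sites]

c = calibration_factor(observed, predicted)
calibrated = [c * p for p in predicted]  # Eq. (4)
print(round(c, 3), [round(p, 1) for p in calibrated])
```

By construction, the calibrated predictions sum to the observed total, but every site is scaled by the same factor, so the shape of the crash-versus-AADT curve is unchanged. That is the assumption questioned in the paragraph above.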

2. Literature review

Poisson and NB are the two most common models for SPF development. Jovanis and Chang (1986) indicated that the distribution of crash occurrences is positively skewed, and that the normal distributional assumption underlying linear regression is therefore undesirable. As an alternative to multiple linear regression models, Poisson regression models became widely used for modeling crashes and their influencing factors. Kraus et al. (1993) explored the relationship between crashes by type and independent variables such as geometric features, time of day, and traffic flow rate by developing a non-linear prediction model that assumed a Poisson distribution for crashes on urban freeway sections, without considering their locations in relation to interchanges. Khan et al. (1999) also developed a Poisson regression model, but they focused on the relationship between crashes stratified by severity, traffic volume, segment length, and vehicle miles traveled.

A known limitation of the Poisson regression model is that the variance is constrained to equal the mean of the data (Dean and Lawless, 1989). However, the variance of crash counts often exceeds its mean, a condition referred to as "overdispersion". When data are overdispersed, the Poisson regression model will therefore result in biased and inconsistent parameter estimates of the relation between crash frequency and exposure (Lord and Mannering, 2010). The NB model is an extension of the Poisson model that accounts for the overdispersion that occurs when the variance is greater than the mean (Shankar et al., 1995). The NB distribution has been widely used in safety studies (e.g., Chang, 2005; AASHTO, 2010b; Lu et al., 2012; Manan et al., 2013; Lu et al., 2013).
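The overdispersion condition described above can be checked with a simple variance-to-mean ratio. The sketch below uses hypothetical crash counts; the NB variance function assumes the common parameterization Var = μ + kμ², where k is the overdispersion parameter, which the text does not state explicitly.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean; about 1 for Poisson data."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean crash count must be positive")
    return variance(counts) / m


def nb_variance(mu, k):
    """NB variance under the assumed form Var = mu + k * mu**2."""
    return mu + k * mu ** 2


# Hypothetical crash counts on ten segments.
crashes = [0, 1, 1, 2, 3, 5, 8, 14, 21, 2]
print(round(dispersion_index(crashes), 2))  # well above 1: overdispersed
print(nb_variance(mu=5.7, k=0.5))           # exceeds the Poisson variance of 5.7
```

A ratio well above 1 suggests that a Poisson model would understate the variability in the data and that an NB model is the more appropriate choice.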
To obtain the relationship between crash frequency and exposure using regression models, the traditional full SPF, which relates crash occurrence to both roadway geometric characteristics and traffic characteristics, was used. However, one main problem with the full SPF is the correlation between independent variables, which led to the adoption of the simple SPF currently in use in the HSM (AASHTO, 2010b) and SafetyAnalyst. Applications of simple SPFs based on the NB distribution can be found in several previous studies (e.g., Kiattikomol et al., 2008; Sacchi et al., 2012; Young and Park, 2012).

Since the release of recent safety analysis systems such as the HSM and SafetyAnalyst in 2010, several studies have been undertaken to calibrate the default models (see, for example, Bornheimer et al., 2012; Young and Park, 2012; Brimley et al., 2012; Sacchi et al., 2012). For example, Bornheimer et al. (2012) compared locally-developed SPFs for rural two-lane highways in Kansas with the default SPFs in the HSM that were calibrated to Kansas data. They found that both models performed similarly. In another study, Young and Park (2012) developed intersection-specific SPFs for the City of Regina and compared these models with the calibrated default HSM SPFs. Four types of intersections were used: 3-leg unsignalized, 4-leg unsignalized, 3-leg signalized, and 4-leg signalized. They found that the jurisdiction-specific SPFs provided the best fit to the data using goodness-of-fit measures such as cumulative residual (CURE) plots and the mean square error.

A clear conclusion from the review of previous studies is that research comparing local-specific SPFs with calibrated default SPFs for freeways is relatively limited. Moreover, it was observed that studies developing SPFs for freeways handled interchanges in one of two ways. One was to develop SPFs while disregarding the existence and influence of interchanges (i.e., freeway

Fig. 1. Definition of interchange influence areas. [Schematic: each freeway interchange influence area extends 0.3 mi upstream and 0.3 mi downstream of the gore points; the freeway between two such areas is typical mainline, not within the interchange influence area.]

Fig. 2. Freeway segments with interchange influence areas. [Map labels: basic freeway segments, ramps, dissolved buffers (0.3 mi), interchange influence area segments, crossing points.]

Table 1. Summary statistics of freeway segments in Florida.

Category                                       Lanes      Total segment    Max. AADT   No. of   No. of crashes (2007–2010)
                                                          length (miles)               sites    Total      FI
Urban freeways
  Basic freeway segments                       4 lanes    319.26           141,915     375      8592       4223
                                               6 lanes    198.12           189,846     272      10,317     4694
                                               8+ lanes   42.28            278,884     75       2229       1010
  Segments within interchange influence area   4 lanes    280.58           119,852     620      11,210     5404
                                               6 lanes    263.71           217,596     558      27,115     11,851
                                               8+ lanes   125.46           305,011     330      25,748     12,311
Rural freeways
  Basic freeway segments                       4 lanes    442.86           73,415      264      6222       3587
                                               6+ lanes   181.73           114,373     101      2400       994
  Segments within interchange influence area   4 lanes    49.70            82,526      156      717        423
                                               6+ lanes   38.38            118,500     69       502        228

Table 2. Summary statistics of freeway segments in SafetyAnalyst (Harwood et al., 2010).

Category                                       Lanes      Total segment    Max. AADT
                                                          length (miles)
Urban freeways
  Basic freeway segments                       4 lanes    126              151,038
                                               6 lanes    35               241,255
                                               8+ lanes   15               223,088
  Segments within interchange influence area   4 lanes    156              241,255
                                               6 lanes    83               255,154
                                               8+ lanes   31               233,323
Rural freeways
  Basic freeway segments                       4 lanes    379              60,621
                                               6+ lanes   201              190,403
  Segments within interchange influence area   4 lanes    90               60,621
                                               6+ lanes   238              197,798

Table 3. Freeway SPFs. IIA = segments within interchange influence area; a and b are the coefficients of Eq. (2); k is the overdispersion parameter; C is the calibration factor of Eq. (5).

                                            Florida-specific SPFs         SafetyAnalyst default SPFs calibrated to Florida data
Category                        Severity    a         b       k           a         b       k       C
Urban freeways
  4-lane basic freeway segments Total       -9.372    1.086   0.633       -7.850    1.000   0.990   0.514
                                FI          -10.745   1.144   0.565       -8.820    1.020   1.150   0.561
  4-lane IIA                    Total       -11.656   1.302   0.355       -11.230   1.300   0.810   0.651
                                FI          -12.143   1.281   0.310       -12.890   1.380   0.790   0.793
  6-lane basic freeway segments Total       -13.407   1.458   0.645       -5.960    0.780   0.480   1.372
                                FI          -14.548   1.487   0.611       -7.600    0.850   0.540   1.504
  6-lane IIA                    Total       -15.088   1.602   0.364       -11.250   1.280   0.600   0.926
                                FI          -15.820   1.595   0.307       -13.620   1.420   0.550   0.844
  (8+)-lane basic freeway seg.  Total       -6.847    0.907   0.725       -16.240   1.670   0.450   0.939
                                FI          -7.239    0.872   0.707       -19.160   1.850   0.520   0.918
  (8+)-lane IIA                 Total       -5.430    0.791   0.520       -26.760   2.580   0.520   0.667
                                FI          -7.544    0.903   0.445       -25.630   2.420   0.530   0.625
Rural freeways
  4-lane basic freeway segments Total       -11.412   1.238   0.233       -6.820    0.810   0.170   1.027
                                FI          -11.024   1.145   0.213       -8.820    0.890   0.160   1.856
  4-lane IIA                    Total       -10.572   1.184   0.312       -7.760    0.970   0.150   0.638
                                FI          -10.467   1.119   0.244       -8.860    0.960   0.240   1.212
  (6+)-lane basic freeway seg.  Total       -11.522   1.234   0.231       -8.280    0.940   0.090   1.087
                                FI          -13.991   1.379   0.168       -10.250   1.030   0.090   1.054
  (6+)-lane IIA                 Total       -11.610   1.273   0.316       -9.630    1.060   0.210   1.741
                                FI          -12.063   1.244   0.249       -10.480   1.040   0.200   1.934

Table 4. SPF goodness-of-fit statistics. IIA = segments within interchange influence area.

                                            Florida-specific SPFs         SafetyAnalyst default SPFs calibrated to Florida data
Category                        Severity    MAD     MSPE     R2FT         MAD     MSPE     R2FT
Urban freeways
  4-lane basic freeway segments Total       3.12    31.64    0.318        3.17    31.86    0.305
                                FI          1.88    10.21    0.311        1.92    10.17    0.284
  4-lane IIA                    Total       3.05    25.74    0.402        3.06    26.04    0.400
                                FI          1.09    8.19     0.456        1.14    8.55     0.454
  6-lane basic freeway segments Total       3.06    25.90    0.339        3.10    26.32    0.304
                                FI          1.05    8.12     0.324        1.09    8.35     0.235
  6-lane IIA                    Total       3.75    49.82    0.486        3.87    49.88    0.445
                                FI          2.19    13.25    0.444        2.23    13.67    0.452
  (8+)-lane basic freeway seg.  Total       6.35    94.91    0.052        6.30    101.42   0.039
                                FI          2.88    19.90    0.114        2.94    19.83    0.092
  (8+)-lane IIA                 Total       3.27    36.94    0.232        3.35    37.09    0.123
                                FI          2.00    11.37    0.248        2.21    11.93    0.188
Rural freeways
  4-lane basic freeway segments Total       7.83    120.61   0.210        7.89    118.95   0.204
                                FI          2.90    20.92    0.144        3.15    29.33    0.151
  4-lane IIA                    Total       8.01    133.33   0.252        8.22    133.60   0.247
                                FI          3.09    27.43    0.181        3.67    29.64    0.177
  (6+)-lane basic freeway seg.  Total       8.57    192.78   0.234        9.04    196.31   0.233
                                FI          3.10    28.55    0.214        3.02    29.08    0.226
  (6+)-lane IIA                 Total       9.33    257.48   0.162        9.27    257.58   0.134
                                FI          3.18    32.85    0.203        4.05    33.71    0.203

analysis segments include the entire interchange influence area) (Kononov and Allery, 2004). The other, more rational way was to develop SPFs using freeway segments identified from the center of one interchange to the next (Persaud and Dzbik, 1993; Konduri and Sinha, 2002).

Very few previous studies on freeways have considered the interchange influence area as a specific analysis category. Kiattikomol et al. (2008) developed planning-level prediction models for interchange and non-interchange segments (i.e., basic freeway segments) in both North Carolina and Tennessee. However, the interchange length was omitted from the prediction model because all interchange segments in North Carolina had the same length of approximately 3000 ft. Even though Tennessee uses varying interchange lengths, the underlying procedure for segment length estimation was not documented.

In Florida's roadway inventory system, freeway interchange influence areas are not explicitly identified. The authors also found that most state Departments of Transportation (DOTs) do not specifically segregate interchange influence areas from the basic freeway segments, which makes separate analysis difficult. The Model Inventory of Roadway Elements (MIRE) does identify and recommend the collection and maintenance of several data elements for freeway interchanges, including the mileposts of all crossing points (Lefler et al., 2010). However, it may take a few years for states to collect and maintain meaningful interchange data. Therefore, a substitute approach is needed to develop state-specific SPFs for both basic freeway segments and freeway segments within interchange influence areas.

SafetyAnalyst provides a definition of the freeway interchange influence area, but it does not provide a specific method to achieve the separation. According to the SafetyAnalyst user's manual, the interchange influence area is defined as shown in the schematic sketch of Fig. 1. The interchange influence area of a particular interchange covers the length of the freeway section extending from approximately 0.3 miles upstream of the gore point of the first exit/entrance ramp to approximately 0.3 miles downstream of the gore point of the last entrance/exit ramp of the same interchange. The area between two successive interchange influence areas is defined as a basic freeway segment.

Fig. 3. Predicted crashes vs. AADT for urban 4-lane freeways.
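The SafetyAnalyst definition above can be expressed as a simple milepost calculation. The sketch below is illustrative only: it assumes gore points are given as mileposts along a single roadway with mileposts increasing in the direction of travel, and the gore-point values are hypothetical.

```python
INFLUENCE_MILES = 0.3  # reach upstream and downstream of the gore points


def influence_area(gore_mileposts, reach=INFLUENCE_MILES):
    """Return (begin, end) mileposts of one interchange influence area.

    The area runs from `reach` miles upstream of the first gore point
    to `reach` miles downstream of the last gore point.
    """
    if not gore_mileposts:
        raise ValueError("at least one gore point is required")
    return (min(gore_mileposts) - reach, max(gore_mileposts) + reach)


# Hypothetical gore points of the four ramps of one interchange.
gores = [12.10, 12.35, 12.62, 12.80]
begin, end = influence_area(gores)
print(round(begin, 2), round(end, 2))
```

Everything between the end of one such interval and the beginning of the next would then be treated as a basic freeway segment.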

3. Data processing

Four years of crash data from 2007 to 2010 along freeways were extracted from the Florida Department of Transportation's (FDOT's) Crash Analysis Reporting (CAR) system. The geometric, roadway (e.g., facility type, number of lanes, and land use), and traffic data were obtained from FDOT's Roadway Characteristic Inventory (RCI) database. As a part of data cleaning, segments with null AADT and unrealistic AADT (i.e., traffic volumes that were too high or too low compared to nearby segments) were excluded from the analysis.

SafetyAnalyst categorizes urban freeways into six subtypes: basic freeway segments with 4 lanes, 6 lanes, and 8+ lanes, and their counterparts within interchange influence areas. Rural freeways are categorized into four subtypes: basic freeway segments with 4 lanes and 6+ lanes, and their counterparts within interchange influence areas. The steps used for the identification and segregation of interchange influence areas using the GIS tool are as follows:

(1) Geometric Data Acquisition: Geometric data for all freeway segments and all access ramps were extracted from the roadway shapefiles maintained by FDOT.
(2) Interchange Influence Area Identification: A 0.3-mile buffer for each ramp of the interchange was created and the overlapped buffers were dissolved. The dissolved buffer areas were considered to be the interchange influence areas. Note that if an overlap existed between two nearby interchanges, the midpoint of the overlap distance was set as the separation point.
(3) Milepost Estimation: The milepost of each crossing point between the dissolved buffer and the connected basic segment was estimated. In this step, the dissolved buffer layer and the freeway layer were intersected as shown in Fig. 2.
(4) Basic Freeway Segments and Freeways within Interchange Influence Area Identification: Interchange influence areas were identified by spatially comparing the coordinates of the original freeway segments with the coordinates of the identified interchanges.

Roadway characteristics databases, consisting of the roadway ID and the begin and end mileposts of segments, were created separately for urban and rural basic freeway segments. The same was done for urban and rural freeway segments within interchange influence areas. The separated segments were categorized into urban and rural areas using the RCI variable "LAND USE". Afterwards, crashes from 2007 to 2010 were assigned to the segments in each of the four databases (urban basic segments, urban interchange influence areas, rural basic segments, and rural interchange influence areas). The crash assignment was performed using the roadway ID and the begin and end mileposts. Note that crashes were categorized into total and fatal and injury (FI) crash severity levels. The four databases were further separated into 20 sub-databases after classification by number of lanes (e.g., 4, 6, 8, etc.) and crash type (total or FI).

Table 1 shows summary statistics of basic freeway segments and segments within the interchange influence area for both urban and rural areas in Florida. A random 70% of each dataset was used for calibration, while the remaining 30% was used for validation. For comparison purposes, Table 2 shows summary statistics of the freeway segments used in developing the default SPFs in SafetyAnalyst (Harwood et al., 2010).

It is observed from Tables 1 and 2 that there are major differences between the AADT values of the freeway segments in Florida and those in the states used to develop the default SPFs for SafetyAnalyst. The closest AADT values are found for the urban 4-lane basic freeway segments subtype. Additionally, all six urban freeway segment subtypes in Florida cover more segment miles than those used to develop the default SPFs for SafetyAnalyst. On the other hand, the rural freeway segment subtypes in the states used to develop the default SPFs for SafetyAnalyst cover more segment miles than those in Florida, except for rural 4-lane basic segments.

Fig. 4. Predicted crashes vs. AADT for urban 6-lane freeways.

Fig. 5. Predicted crashes vs. AADT for urban (8+)-lane freeways.
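The segmentation logic in steps (2) to (4) and the milepost-based crash assignment can be sketched in one dimension, without a GIS. This is a simplified illustration rather than the authors' GIS procedure: it assumes a single roadway, influence areas already expressed as (begin, end) milepost intervals, and hypothetical data; the function names are also hypothetical.

```python
def split_overlaps(areas):
    """Separate overlapping neighbouring areas at the overlap midpoint."""
    ordered = [list(area) for area in sorted(areas)]
    for prev, nxt in zip(ordered, ordered[1:]):
        if prev[1] > nxt[0]:  # the two influence areas overlap
            midpoint = (prev[1] + nxt[0]) / 2.0
            prev[1] = midpoint
            nxt[0] = midpoint
    return [tuple(area) for area in ordered]


def basic_segments(route_begin, route_end, areas):
    """Return the gaps between influence areas as basic freeway segments."""
    segments = []
    cursor = route_begin
    for begin, end in areas:
        if begin > cursor:
            segments.append((cursor, min(begin, route_end)))
        cursor = max(cursor, end)
    if cursor < route_end:
        segments.append((cursor, route_end))
    return segments


def count_crashes(crash_mileposts, segment):
    """Count crashes whose milepost falls within [begin, end)."""
    begin, end = segment
    return sum(1 for mp in crash_mileposts if begin <= mp < end)


# Hypothetical route from milepost 0 to 15 with three interchanges.
areas = split_overlaps([(4.7, 6.1), (5.9, 7.4), (10.2, 11.5)])
basics = basic_segments(0.0, 15.0, areas)
crashes = [1.0, 5.0, 8.3, 9.9, 12.0]
print(areas)
print(basics)
print([count_crashes(crashes, seg) for seg in basics])
```

In practice the same assignment would also be keyed on the roadway ID, since mileposts are only unique within a roadway.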

4. Methodology

4.1. Model development

A simple NB regression model taking the functional form in Eq. (2) was used to develop a total of 20 local SPFs for Florida, covering each of the urban and rural freeway categories for total and FI crashes. The SafetyAnalyst default SPFs were then calibrated to Florida data by applying calibration factors, which were calculated using Eq. (5).

4.2. Goodness-of-fit measures

The Florida-specific SPFs were compared with the SafetyAnalyst default SPFs calibrated to Florida data using visual plots and statistical goodness-of-fit measures. The visual plots included the plots of predicted crashes against AADT and the CURE plots. The goodness-of-fit measures included the mean absolute deviance (MAD), mean square prediction error (MSPE), and Freeman–Tukey R-square ($R^2_{FT}$). The lower the MAD and MSPE values, the better the model. MAD and MSPE were calculated as follows:

$$\mathrm{MAD} = \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \mu_i \right| \qquad (6)$$

$$\mathrm{MSPE} = \frac{1}{n}\sum_{i=1}^{n} \left( y_i - \mu_i \right)^2 \qquad (7)$$

where n is the sample size of segments in the prediction dataset, $y_i$ is the observed crash frequency for segment i, and $\mu_i$ is the predicted crash frequency for segment i.

The $R^2_{FT}$ value is the main goodness-of-fit criterion used in SafetyAnalyst and describes how well the model fits the data. A higher value indicates a better fit. This specific R-square statistic was used in this study to allow a fair comparison between the Florida-specific SPFs and the SafetyAnalyst default models. According to Fridstrom et al. (1995), the $R^2_{FT}$ was introduced by Freeman and Tukey (1950) and is estimated as follows:

$$R^2_{FT} = 1 - \frac{\sum_i \hat{e}_i^2}{\sum_i \left( f_i - \bar{f} \right)^2} \qquad (8)$$

where $\hat{e}_i$ denotes the residuals and $\bar{f}$ denotes the average of $f_i$ for the sites considered. $f_i$ in Eq. (8) can be estimated as follows:

$$f_i = \sqrt{y_i} + \sqrt{y_i + 1} \qquad (9)$$

where $y_i$ denotes the observed crashes at site i. Note that the $f_i$ statistic is approximately normally distributed with mean $x_i$ and unit variance, as follows:

$$x_i = \sqrt{4\lambda_i + 1} \qquad (10)$$

where $\lambda_i$ denotes the mean of observed crashes at site i. The residuals $\hat{e}_i$ in Eq. (8) can be estimated as follows:

$$\hat{e}_i = f_i - \sqrt{4\hat{y}_i + 1} = \sqrt{y_i} + \sqrt{y_i + 1} - \sqrt{4\hat{y}_i + 1} \qquad (11)$$

where $\hat{y}_i$ denotes the fitted (predicted) crashes at sites similar to site i.

Apart from the statistical goodness-of-fit measures, the fitted SPFs were assessed using visual plots, e.g., the CURE plot. The CURE plot is used to assess how well the predictions fit the data over the full range of an independent variable. As documented by Hauer and Bamfo (1997), the cumulative residuals are plotted against an independent variable (e.g., AADT). The cumulative residuals are defined as the cumulative differences between the observed and predicted crashes. These differences are plotted in increasing order of AADT. A model is considered well fitted when the cumulative residuals oscillate around zero and fall within two standard deviations of the cumulative residual values.

5. Analysis results

5.1. Model comparison using goodness-of-fit measures

Table 3 shows the developed Florida-specific SPFs and the SafetyAnalyst default SPFs calibrated to Florida data. The right portion of the table shows the calibration factors estimated for Florida using the default regression coefficients in SafetyAnalyst. A calibration factor greater than 1 indicates that the SafetyAnalyst default SPFs underestimate the crash frequency. This is the case for the majority of rural freeway SPFs. A calibration factor less than 1 indicates that the SafetyAnalyst default SPFs overestimate the crash frequency, which is the case for most urban freeway SPFs. It is also observed from the table that, as expected, increasing the AADT increases total and FI crashes along both urban and rural freeways.

Table 4 shows the results of the goodness-of-fit statistics for the Florida-specific SPFs and the SafetyAnalyst default SPFs calibrated to Florida data. The measures include MAD, MSPE, and $R^2_{FT}$, all of which were computed on the 30% validation data. The highlighted cells in the table show the SPFs that performed better for each measure. The Florida-specific SPFs generally yielded better prediction performance than the SafetyAnalyst default SPFs calibrated to Florida data, although in some cases the differences in the performance measures of the two models were minor. This is evident in the lower estimates of MAD and MSPE, as well as the higher $R^2_{FT}$ values.

Fig. 6. Predicted crashes vs. AADT for rural 4-lane freeways.

Fig. 7. Predicted crashes vs. AADT for rural (6+)-lane freeways.
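The three goodness-of-fit measures defined in Eqs. (6) to (11) can be computed as in the following sketch. The observed and predicted values are hypothetical and are not taken from the Florida validation data.

```python
import math


def mad(observed, predicted):
    """Mean absolute deviance, Eq. (6)."""
    return sum(abs(y - m) for y, m in zip(observed, predicted)) / len(observed)


def mspe(observed, predicted):
    """Mean square prediction error, Eq. (7)."""
    return sum((y - m) ** 2 for y, m in zip(observed, predicted)) / len(observed)


def freeman_tukey_r2(observed, predicted):
    """Freeman-Tukey R-square, Eqs. (8), (9) and (11)."""
    f = [math.sqrt(y) + math.sqrt(y + 1) for y in observed]       # Eq. (9)
    f_bar = sum(f) / len(f)
    residuals = [fi - math.sqrt(4 * y_hat + 1)                    # Eq. (11)
                 for fi, y_hat in zip(f, predicted)]
    ss_residual = sum(e ** 2 for e in residuals)
    ss_total = sum((fi - f_bar) ** 2 for fi in f)
    return 1 - ss_residual / ss_total                             # Eq. (8)


# Hypothetical validation data: observed and predicted crashes per site.
observed = [2, 5, 9, 14]
predicted = [3.0, 4.0, 10.0, 12.0]

print(mad(observed, predicted))
print(mspe(observed, predicted))
print(round(freeman_tukey_r2(observed, predicted), 3))
```

Running the same three functions on the predictions of two competing models over one validation set gives the kind of side-by-side comparison reported in Table 4.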

5.2. Model comparison using visual plots In addition to model comparison using goodness-of-fit statistical measures, visual plots were also employed. These visual plots mainly include the plot of the predicted crashes against AADT for all freeway facilities. Figs. 3–7 show the plots of the predicted annual crash frequency per mile against AADT for urban 4-lane, urban 6-lane, urban (8+)-lane, rural 4-lane, and rural (6+)-lane freeways, respectively. The observed annual crash frequency per mile is also shown on the figures. In the figures, the solid red line represents the Florida-specific SPF, the dashed blue line represents the default SafetyAnalyst SPFs without calibration, and the dashed

green line represents the SafetyAnalyst default SPFs calibrated to Florida data. From Fig. 3, the observed crashes are well represented by both Florida-specific SPFs and SafetyAnalyst default SPFs calibrated to Florida data for total and FI crashes. On the other hand, SafetyAnalyst default SPFs without calibration overestimate the crash frequency. This shows the importance of calibrating SafetyAnalyst default SPFs to reflect each agency’s conditions. Overall, the plots of the SafetyAnalyst default models calibrated to Florida data and Florida-specific SPFs are well matched for both basic segments and interchange areas. Fig. 4 displays the SPFs for urban 6-lane freeways. The plots show closer fitting of Florida-specific and calibrated SafetyAnalyst default SPFs for freeway segments within interchange areas. However, for both total and FI crashes along basic segments, there exist some discrepancies between Florida-specific SPFs and calibrated SafetyAnalyst default SPFs. At the same level of AADT, the predicted crash frequency for segments within interchange influence areas is higher than that for basic freeway segments. This is likely as a result of complex conflict points due to the high tendency of weaving (merging and diverging) maneuvers within these interchange

102

J. Lu et al. / Safety Science 65 (2014) 93–105

Fig. 8. CURE plots for urban 4-lane freeways.

influence areas. This shows the importance of considering interchange influence area as a separate category instead of merging both basic freeway segments and interchange influence areas into one category. Fig. 5 shows the SPFs for urban (8+)-lane freeways. A distinctive observation from this figure is that urban (8+)-lane segments (both basic freeway segments and interchange influence

areas) exhibit very different fitting of Florida-specific and SafetyAnalyst default SPFs. This might be due to the complexity of traffic characteristics at those wide facilities. For rural 4-lane freeway segments in Fig. 6, the plots of the SafetyAnalyst default models calibrated to Florida data and Floridaspecific SPFs are well matched for interchange areas for both total


Fig. 9. CURE plots for rural 4-lane freeways.

and FI crashes. The largest discrepancy between the SafetyAnalyst default SPFs calibrated to Florida data and the Florida-specific SPFs occurs for basic segments for total crashes. Overall, however, the Florida-specific SPFs and the calibrated SafetyAnalyst default SPFs match relatively closely. Moreover, the majority of SafetyAnalyst default SPFs without calibration underestimate the crash frequency. As previously observed, at the same level of AADT, the predicted crash frequency for segments within interchange influence areas is higher than that for basic freeway segments.

Fig. 7 displays the SPFs for rural (6+)-lane freeways. As observed from urban (6+)-lane freeway segments, the SafetyAnalyst default models calibrated to Florida data and the Florida-specific SPFs follow different fitting trends. However, there is a slightly closer fit between Florida-specific and calibrated SafetyAnalyst default SPFs for


freeway segments within interchange areas. Again, SafetyAnalyst default SPFs without calibration underestimate the crash frequency.

From the visual plots presented in this section, it can be concluded that several freeway subtypes exhibited large discrepancies between the trends of the Florida-specific SPFs and those of the calibrated SafetyAnalyst default SPFs. It was also observed that the calibrated SafetyAnalyst default SPFs did not represent the actual total and FI crash trends well for the following subtypes; thus, development of local SPFs is strongly encouraged for them:

• Urban 6-lane basic segments.
• Urban (8+)-lane basic segments.
• Urban (8+)-lane interchange influence area segments.
• Rural 4-lane basic segments.
• Rural (6+)-lane basic segments.
• Rural (6+)-lane interchange influence area segments.
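The calibrated curves compared in this section come from scaling a default SPF by a single calibration factor. The following is a minimal sketch of that step, assuming the common segment form N = exp(a) × AADT^b × L and the usual ratio definition of the calibration factor C (total observed crashes divided by total predicted crashes). The function names and all coefficient values are hypothetical and are not taken from SafetyAnalyst or from this study.

```python
import math


def spf_prediction(aadt, length_mi, a, b, years=1):
    """Predicted crashes for one segment, assuming N = exp(a) * AADT^b * L * years.

    The functional form and the coefficients a and b are assumptions
    made for illustration only.
    """
    return math.exp(a) * (aadt ** b) * length_mi * years


def calibration_factor(observed, predicted):
    """Ratio of total observed crashes to total uncalibrated predictions."""
    total_predicted = sum(predicted)
    if total_predicted <= 0:
        raise ValueError("total predicted crashes must be positive")
    return sum(observed) / total_predicted


def calibrate(predicted, c):
    """Scale every uncalibrated prediction by the calibration factor."""
    return [c * p for p in predicted]


if __name__ == "__main__":
    # Hypothetical segments: (AADT, length in miles, observed crashes).
    segments = [(40000, 0.6, 7), (55000, 0.3, 5), (72000, 0.5, 11)]
    a, b = -6.0, 0.8  # hypothetical default coefficients
    predicted = [spf_prediction(q, length, a, b) for q, length, _ in segments]
    observed = [n for _, _, n in segments]
    c = calibration_factor(observed, predicted)
    print("C =", round(c, 3))
    print("calibrated:", [round(p, 2) for p in calibrate(predicted, c)])
```

Because C only rescales the curve, a calibrated default SPF keeps the default shape in AADT. This is one way to read the subtypes listed above: the calibrated totals can match while the trend across AADT still differs from the local data.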

5.3. CURE plots

Since Florida-specific SPFs were found to fit better than calibrated SafetyAnalyst default SPFs, it is important to visualize what the cumulative residual (CURE) plots of the Florida-specific SPFs look like. Figs. 8 and 9 show two sample CURE plots against AADT for urban 4-lane freeways and rural 4-lane freeways, respectively. Note that the solid blue line represents the cumulative residuals, the dashed red line represents the upper limit of two standard deviations (+2σ) above the cumulative residual estimates, and the dashed green line represents the lower limit of two standard deviations (−2σ) below the cumulative residual estimates.

It can be seen from Fig. 8(a–d) that the cumulative residuals continuously fluctuate around zero and do not stray beyond the ±2σ boundaries, which is an indication of well-fitted models (Hauer and Bamfo, 1997). Fig. 9(a and b) for total and FI crashes for rural 4-lane basic segments shows that the cumulative residuals do not stray beyond the ±2σ boundaries; however, the residuals do not continuously fluctuate around zero, specifically for higher AADT values. Furthermore, Fig. 9(c and d) for total and FI crashes for rural 4-lane interchange influence areas shows that the cumulative residuals do not stray beyond the ±2σ boundaries; however, the residuals do not oscillate around zero. Thus, in general, Florida-specific SPFs for rural 4-lane freeways are well fitted, although not as well as those for urban 4-lane freeways. This is likely due to the more representative sample size of urban segments compared to their rural counterparts. These findings confirm the results from Table 4, where urban 4-lane freeway SPFs generally have lower MAD and MSPE values, and higher R²_FT values, than their rural counterparts.

The CURE plots for urban 6-lane freeways were very similar to those for urban 4-lane freeways. Moreover, the CURE plots for rural (6+)-lane freeways were very similar to those for rural 4-lane freeways.
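To make the construction of a CURE plot concrete, the sketch below computes the cumulative residuals and the ±2σ limits for a set of segments. It assumes the limit formula commonly attributed to Hauer and Bamfo (1997), σ*²(n) = σ²(n) × (1 − σ²(n)/σ²(N)), where σ²(n) is the running sum of squared residuals. Whether the study used exactly this form is an assumption, and the function names are hypothetical.

```python
import math


def cure_data(aadt, observed, predicted):
    """Return (sorted AADT, cumulative residuals, 2-sigma limits).

    Residuals are observed minus predicted crashes, accumulated in
    order of increasing AADT.
    """
    order = sorted(range(len(aadt)), key=lambda i: aadt[i])
    cum_resid, cum_sq = [], []
    running, running_sq = 0.0, 0.0
    for i in order:
        r = observed[i] - predicted[i]
        running += r
        running_sq += r * r
        cum_resid.append(running)
        cum_sq.append(running_sq)

    total_sq = cum_sq[-1] if cum_sq else 0.0
    limits = []
    for s in cum_sq:
        # Assumed variance of the cumulative residual at this point.
        var = s * (1.0 - s / total_sq) if total_sq > 0 else 0.0
        limits.append(2.0 * math.sqrt(max(var, 0.0)))
    return [aadt[i] for i in order], cum_resid, limits


def exceedances(aadt_sorted, cum_resid, limits):
    """AADT values at which the cumulative residual leaves the 2-sigma band."""
    return [q for q, c, lim in zip(aadt_sorted, cum_resid, limits) if abs(c) > lim]


if __name__ == "__main__":
    # Hypothetical data for three segments.
    x, cum, lim = cure_data([30000, 10000, 20000], [5, 2, 3], [4.0, 2.5, 3.5])
    for q, c, l in zip(x, cum, lim):
        print(q, round(c, 3), round(l, 3))
    print("outside band at AADT:", exceedances(x, cum, lim))
```

A well-fitted model in the sense used here would yield an empty exceedance list and a cumulative residual curve that crosses zero repeatedly rather than drifting to one side.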
A distinctive observation was that the cumulative residuals for urban (8+)-lane freeways strayed beyond the ±2σ boundaries, which shows the relatively poor model fit for these facilities. This finding was also observed from the goodness-of-fit measures and confirms the complexity of traffic characteristics at those facilities. Overall, the CURE plots are successful in confirming the findings of the statistical goodness-of-fit measures for comparing different SPFs.

6. Summary and conclusions

This study developed simple local SPFs for Florida using the NB distribution for both urban and rural basic freeway segments and interchange influence areas. The developed Florida-specific SPFs were compared to the SafetyAnalyst default SPFs calibrated to Florida data through calibration factors (C) to assess the potential merits of developing local SPFs. The results showed that Florida-specific SPFs generally produced better-fitted models than the calibrated SafetyAnalyst default SPFs; the majority of Florida-specific models had higher R²_FT, as well as lower MAD and MSPE estimates. However, in some cases, there were only minor differences in the performance measures of the two models. The CURE plots confirmed the results of the statistical goodness-of-fit measures for comparing different SPFs (i.e., MAD, MSPE, and R²_FT).

The plots of predicted crashes against AADT revealed closely matching crash patterns for Florida-specific SPFs and calibrated SafetyAnalyst default SPFs for specific types of freeways. These included urban 4-lane (basic segments and interchange areas), urban 6-lane (interchange areas), and rural 4-lane (interchange areas). On the other hand, development of local SPFs is strongly encouraged for the following freeway subtypes: urban 6-lane basic segments, urban (8+)-lane segments (basic segments and interchange areas), rural 4-lane basic segments, and rural (6+)-lane segments (basic segments and interchange areas). These freeway subtypes exhibited large discrepancies between the trends of the Florida-specific SPFs and those of the calibrated SafetyAnalyst default SPFs.

Furthermore, at the same level of AADT, the predicted crash frequency for segments within interchange influence areas was higher than that for basic freeway segments, especially for urban 6-lane, urban (8+)-lane, rural 4-lane, and rural (6+)-lane freeways. Moreover, the SafetyAnalyst default SPFs calibrated to Florida data for basic segments and interchange influence areas in urban areas fit the observed crash data better than those in rural areas. This is likely a result of the more representative sample size of urban segments.
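For readers who want to reproduce this kind of comparison, the three goodness-of-fit measures can be sketched as follows. The MAD and MSPE definitions are the usual means of absolute and squared prediction errors. The Freeman–Tukey R² uses the commonly cited form based on the transformation f = √y + √(y + 1) with residual e = f − √(4ŷ + 1); that this matches the exact expressions used in the study is an assumption, and the function names are hypothetical.

```python
import math


def mad(observed, predicted):
    """Mean absolute deviance between predicted and observed crashes."""
    n = len(observed)
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / n


def mspe(observed, predicted):
    """Mean square prediction error."""
    n = len(observed)
    return sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n


def r2_freeman_tukey(observed, predicted):
    """Freeman-Tukey R-square (assumed form, see lead-in)."""
    # Variance-stabilizing transform of the observed counts.
    f = [math.sqrt(y) + math.sqrt(y + 1) for y in observed]
    # Freeman-Tukey residuals against the predicted means.
    e = [fi - math.sqrt(4 * p + 1) for fi, p in zip(f, predicted)]
    f_mean = sum(f) / len(f)
    total = sum((fi - f_mean) ** 2 for fi in f)
    if total == 0:
        raise ValueError("observed counts have no variation")
    return 1.0 - sum(ei ** 2 for ei in e) / total


if __name__ == "__main__":
    # Hypothetical observed counts and predictions from two candidate SPFs.
    obs = [0, 4, 2, 7]
    local = [0.8, 3.5, 2.4, 6.1]
    calibrated_default = [1.9, 2.6, 2.9, 4.8]
    for name, pred in (("local", local), ("calibrated default", calibrated_default)):
        print(name, round(mad(obs, pred), 3), round(mspe(obs, pred), 3),
              round(r2_freeman_tukey(obs, pred), 3))
```

Lower MAD and MSPE and a higher Freeman–Tukey R² indicate the better-fitting model, which is the criterion applied in the comparison summarized above.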
The results from this study suggest that agencies implementing SafetyAnalyst could improve their crash prediction by developing local-specific SPFs. Further research could extend the comparison of agency-specific SPFs and calibrated SafetyAnalyst default models to other facilities, such as multilane arterials and intersections. Also, the comparison of different SPFs could be made for each crash severity level individually to examine how the results vary across severity levels.

Acknowledgment

The authors are grateful to the Florida Department of Transportation (FDOT) Research Center for the funding provided to conduct this study.

References

American Association of State Highway and Transportation Officials (AASHTO), 2010a. SafetyAnalyst. (accessed April 2011).
American Association of State Highway and Transportation Officials (AASHTO), 2010b. Highway Safety Manual, First Edition. Transportation Research Board of the National Academies, Washington, DC.
Bornheimer, C., Schrock, S., Wang, M., Lubliner, H., 2012. Developing a regional safety performance function for rural two-lane highways. Proceedings of the 91st Annual Meeting of the Transportation Research Board, Washington, D.C.
Brimley, B., Saito, M., Schultz, G., 2012. Calibration of the highway safety manual safety performance function and development of new models for rural two-lane two-way highways. Proceedings of the 91st Annual Meeting of the Transportation Research Board, Washington, D.C.
Chang, L., 2005. Analysis of freeway accident frequencies: negative binomial regression versus artificial neural network. Safety Sci. 43, 541–557.
Dean, C., Lawless, F., 1989. Tests for detecting overdispersion in Poisson regression models. J. Am. Stat. Ass. 84, 467–472.
Freeman, M., Tukey, J., 1950. Transformations related to the angular and the square root. Ann. Math. Stat. 21, 607–611.
Fridstrom, L., Ifver, J., Ingebrigtsen, S., Kulmala, R., Thomsen, L., 1995. Measuring the contribution of randomness, exposure, weather, and daylight to the variation in road accident counts. Accident Anal. Prevent. 27, 1–20.
Gan, A., Haleem, K., Alluri, P., Lu, J., Wang, T., Ma, M., Diaz, C., 2012. Preparing Florida for deployment of SafetyAnalyst for all roads. Final Report BDK80 977–07, Florida Department of Transportation, Tallahassee, FL.
Harwood, D., Torbic, D., Richard, K., Meyer, M., 2010. SafetyAnalyst: software tools for safety management of specific highway sites. Report FHWA-HRT-10-063, Federal Highway Administration, Office of Safety, McLean, VA.
Hauer, E., Bamfo, J., 1997. Two tools for finding what function links the dependent variable to the explanatory variables. Proceedings of International Co-operation on Theories and Concepts in Traffic Safety (ICTCT) Conference, Lund, Sweden.
Jovanis, P., Chang, H., 1986. Modeling the relationship of accident to miles traveled. Transport. Res. Record 1068, 42–51.
Khan, S., Shanmugam, R., Hoeschen, B., 1999. Injury, fatal, and property damage accident models for highway corridors. Transport. Res. Record 1665, 84–92.
Kiattikomol, V., 2005. Freeway crash prediction models for long-range urban transportation planning. Doctoral Dissertation, University of Tennessee, Knoxville.
Kiattikomol, V., Chatterjee, A., Hummer, J., Younger, M., 2008. Planning level regression models for prediction of crashes on interchange and noninterchange segments of urban freeways. ASCE J. Transport. Eng. 134 (3), 111–117.
Konduri, S., Sinha, K., 2002. Statistical models for prediction of freeway incidents. Proceedings of the 7th International Conf. on Applications of Advanced Technology in Transportation Engineering, Cambridge, MA, pp. 167–174.
Kononov, J., Allery, B., 2004. Explicit consideration of safety in transportation planning and project scoping. Proceedings of the 83rd Annual Meeting of the Transportation Research Board, Washington, D.C.
Kraus, J., Anderson, C., Arzemanian, S., Salatka, M., Hemyari, P., Sun, G., 1993. Epidemiological aspects of fatal and severe injury urban freeway crashes. Accident Anal. Prevent. 25 (3), 229–239.
Lefler, N., Council, F., Harkey, D., Carter, D., McGee, H., Daul, M., 2010. Model Inventory of Roadway Elements-MIRE, Version 1.0. Report FHWA-SA-10-018, Federal Highway Administration, Office of Safety, McLean, VA.
Lord, D., Mannering, F., 2010. The statistical analysis of crash-frequency data: a review and assessment of methodological alternatives. Transport. Res. Part A 44, 291–305.
Lu, J., Gan, A., Haleem, K., Alluri, P., Liu, K., 2012. Comparing locally-calibrated and SafetyAnalyst-default safety performance functions for Florida’s urban freeways. Proceedings of the 91st Annual Meeting of the Transportation Research Board, Washington, D.C.
Lu, J., Gan, A., Haleem, K., Wu, W., 2013. Clustering-based roadway segment division for the identification of high crash locations. J. Transport. Safety Security 5 (3), 224–239.
Lu, L., Lu, J., Lin, P., Wang, Z., Chen, H., 2009. Development of an interface between FDOT’s crash analysis reporting system and the SafetyAnalyst. Final Report BDK84 977–01, Florida Department of Transportation, Tallahassee, FL.
Manan, M., Jonsson, T., Varhelyi, A., 2013. Development of a safety performance function for motorcycle accident fatalities on Malaysian primary roads. Safety Sci. 60, 13–20.
Persaud, B., Dzbik, L., 1993. Accident prediction models for freeways. Transport. Res. Record 1401, 55–60.
Sacchi, E., Persaud, B., Bassani, M., 2012. Assessing international transferability of highway safety manual crash prediction algorithm and its components. Transport. Res. Record 2279, 90–98.
Shankar, V., Mannering, F., Barfield, W., 1995. Effect of roadway geometrics and environmental factors on rural freeway accident frequencies. Accident Anal. Prevent. 27 (3), 371–389.
Young, J., Park, P., 2012. Comparing the Highway Safety Manual’s safety performance functions with jurisdiction-specific functions for intersections in Regina. Proceedings of the Annual Conference of the Transportation Association of Canada, Fredericton, New Brunswick.