Determinants of the congestion caused by a traffic accident in urban road networks


Accident Analysis and Prevention 136 (2020) 105327

Contents lists available at ScienceDirect

Accident Analysis and Prevention journal homepage: www.elsevier.com/locate/aap


Zhenjie Zheng a, Zhengli Wang a, Liyun Zhu b, Hai Jiang a,*

a Department of Industrial Engineering, Tsinghua University, Beijing 100084, China
b BTI Smart Tech Co., Ltd., Beijing 100073, China

ARTICLE INFO

Keywords: Traffic accident; Urban road networks; Congestion; Determinants; Mixed-effects model

ABSTRACT

Non-recurrent congestion is frustrating to travelers because it often causes unexpected delays, which can result in missed meetings or appointments. Major causes of non-recurrent congestion include adverse weather conditions, natural hazards, and traffic accidents. Although many studies investigate how adverse weather conditions and natural hazards affect congestion in urban road networks, studies that look into the determinants of the congestion caused by a traffic accident are scarce. This research fills this gap in the literature. When a traffic accident occurs on an urban link, the congestion propagates to and affects adjacent links. We develop a modified version of Dijkstra's algorithm to identify the set of links in the neighborhood of the accident. We first measure the level of congestion caused by the traffic accident as the reduction in traveling speed on those links. As the impact of congestion varies in both space and time, we then estimate a generalized linear mixed-effects model with spatiotemporal panel data to identify its determinants. Finally, we conduct a case study using real data from Beijing. We find that: (1) the level of congestion is mostly associated with the types of the traffic accidents, the types of vehicles involved, and the occurrence time; (2) for the three types of traffic accidents, namely, scrape among vehicles, collisions with fixed objects, and rear-end collisions, the levels of congestion associated with the first two types are comparable, while that associated with the third type is 8.43% more intense; (3) for the types of vehicles involved, the congestion caused by accidents involving buses/trucks is 6.03% more intense than that caused by accidents involving only cars; (4) for the occurrence time, the levels of congestion associated with morning peaks and afternoon peaks are 5.87% and 6.57% more intense than that associated with off-peak hours, respectively.

1. Introduction

It has been widely recognized that about half of traffic congestion can be attributed to non-recurrent traffic events (Schrank et al., 2002, 2012; Charles, 2005). For example, Schrank et al. (2012) find that non-recurrent congestion contributes 52–58% of the total traffic congestion in urban areas of the United States. Charles (2005) estimates that 50% of the traffic congestion in Australia's major cities is caused by non-recurrent congestion. Schrank et al. (2002) find that about 55% of the 67.5 billion cost of traffic congestion in cities of the United States is attributed to non-recurrent congestion. Therefore, it is of great importance to understand the causes of non-recurrent congestion and to design effective countermeasures. Major causes of non-recurrent congestion in urban road networks include adverse weather conditions, natural hazards, and traffic accidents (Chen et al., 2012; Tsapakis et al., 2013; Anbaroglu et al., 2014; Koesdwiady et al., 2016).

There has been a proliferation of studies that investigate how the first two causes, that is, adverse weather conditions (Agarwal et al., 2005; Tsapakis et al., 2013; Koesdwiady et al., 2016) and natural hazards (Pregnolato et al., 2016; Kermanshah and Derrible, 2016; Kontou et al., 2017), impact the level of congestion in urban road networks. For example, Tsapakis et al. (2013) investigate the level of congestion with respect to rainfall, snowfall, and temperature levels. The results from seven weather stations in London show that the level of congestion increases with the intensity of rain and snow, while it is not influenced by temperature. Pregnolato et al. (2016) develop a simulation framework to measure the level of congestion caused by floods under different management strategies. The case study of Newcastle in northeast England shows that both green roof infrastructure and traditional engineering interventions, such as drainage channels and sewerage systems, decrease the level of congestion caused by floods. Kontou et al. (2017) investigate the level of



Corresponding author. E-mail address: [email protected] (H. Jiang).

https://doi.org/10.1016/j.aap.2019.105327 Received 14 May 2019; Received in revised form 17 September 2019; Accepted 11 October 2019 0001-4575/ © 2019 Elsevier Ltd. All rights reserved.


congestion caused by Hurricane Sandy in 2012. They find that commuters who normally commute by personal automobile experience more intense traffic congestion than those using other modes. Moreover, the impacts of special events (baseball games) and infrastructure (bike-share stations) on traffic congestion are considered in Kwon et al. (2006) and Hamilton and Wichman (2018).

The studies that look into how traffic accidents impact road congestion focus exclusively on separate freeway stretches or major roads. Early research on the impact of traffic accidents is primarily based on Kinematic Wave Theory and Deterministic Queuing Theory (Lighthill and Whitham, 1955; Lawson et al., 1997; Li et al., 2006; Mfinanga and Fungo, 2013). More recently, studies have adopted a variety of approaches, such as integer programming models (Chung and Recker, 2012, 2015; Wang et al., 2018), K-nearest neighbor clustering (Chen et al., 2016) and fuzzy c-means (Yang et al., 2017). However, these studies cannot be directly applied to urban road networks because of their complicated traffic conditions, such as traffic lights, pedestrian crossings, and side-street parking. The measurement and determinants of the congestion caused by a traffic accident in urban road networks have not been carefully investigated. Moreover, there are some remotely related studies that investigate how traffic congestion affects accident frequency and severity (Wang et al., 2009; Chen and Chen, 2011; Stipancic et al., 2017).

In this research, we propose to investigate how a traffic accident affects the level of congestion in urban road networks, which is new to the literature. When a traffic accident occurs on an urban link, the congestion propagates to and affects adjacent links. We develop a modified version of Dijkstra's algorithm to identify the set of links in the neighborhood of the accident. We first measure the level of congestion caused by the traffic accident as the reduction in traveling speed on those links. As the level of congestion varies in both space and time, we then systematically vary the size of the neighborhood and the time period, which results in a panel of congestion measurements. These data are analyzed using a generalized linear mixed-effects model to quantify how the types of the accidents, the types of vehicles involved, and the occurrence time affect the level of congestion. Using real data from Beijing, we find that: (1) the level of congestion is mostly associated with the types of the traffic accidents, the types of vehicles involved, and the occurrence time; (2) for the three types of traffic accidents, namely, scrape among vehicles, collisions with fixed objects, and rear-end collisions, the levels of congestion associated with the first two types are comparable, while that associated with the third type is 8.43% more intense; (3) for the types of vehicles involved, the congestion induced by accidents involving buses/trucks is 6.03% more intense than that induced by accidents involving only cars; (4) for the occurrence time, the levels of congestion associated with morning peaks and afternoon peaks are 5.87% and 6.57% more intense than that associated with off-peak hours, respectively. We summarize the contributions of this research as follows:

• to the best of the authors' knowledge, we are the first to look into the determinants of the congestion caused by a traffic accident in urban road networks;
• we develop a modified version of Dijkstra's algorithm to identify the set of links in the neighborhood of the accident;
• we employ a generalized linear mixed-effects model for panel data to identify the determinants of the congestion caused by the traffic accident; and
• we find that the level of congestion caused by the traffic accident is mostly associated with the types of the accidents, the types of vehicles involved and the occurrence time.

The remainder of this paper is organized as follows. In Section 2, we develop an approach to measuring the level of congestion caused by a traffic accident in urban road networks. In Section 3, we present the generalized linear mixed-effects model for panel data. In Section 4, we conduct a case study using real data from Beijing and analyze its results. Finally, we conclude and outline potential future research directions in Section 5.

2. Measuring the level of congestion caused by a traffic accident

In this section, we develop an approach to measure the level of congestion caused by a traffic accident in urban road networks. In Section 2.1, we measure the level of congestion as the reduction in traveling speed on a set of links in the neighborhood of the accident. In Section 2.2, we present our approach to identifying the set of links used to calculate the level of congestion.

2.1. Measurement of the level of congestion

We take the urban road network in Fig. 1 as an example to illustrate the congestion caused by a traffic accident. In this figure, the red mark O on link HI indicates the accident location. The arrows beside the links indicate the driving directions of vehicles. The color indicates whether the traveling speed is affected by the traffic accident after its occurrence. Specifically, a green link indicates that the traveling speed on this link is not affected, and a red link indicates that the traveling speed on this link is reduced by the accident. Moreover, the links UV and YX are two overpasses that are not directly connected to other links. In Fig. 1, the traveling speed is reduced on the adjacent upstream links HI, GH, MH and CH. Accordingly, we conclude that the traffic accident reduces the traveling speed on adjacent upstream links in its neighborhood and therefore leads to congestion in the urban road network. Such congestion is defined as the average reduction in traveling speed on a set of links in the neighborhood of the traffic accident.

Fig. 1. Illustration of the congestion caused by a traffic accident in the urban road network.

For a given traffic accident, suppose that the set of links 1, …, n, …, N is identified in its neighborhood. The time period is discretized into intervals labeled 1, …, m, …, M. Let $v_{nm}$ indicate the traveling speed on link n in interval m. Based on the historical observations of $v_{nm}$ when there are no accidents, we can obtain the mean of $v_{nm}$, denoted as $\bar{v}_{nm}$, that is, the accident-free speed. When a traffic accident occurs, $v_{nm}$


is referred to as the accident-induced speed, denoted as $\hat{v}_{nm}$. The ratio between the accident-induced speed and the accident-free speed is used to measure the level of congestion. The congestion on link n in interval m caused by the traffic accident is expressed as follows:

$$y_{nm} = \frac{\hat{v}_{nm}}{\bar{v}_{nm}}. \tag{1}$$

We then measure the level of congestion caused by the traffic accident using the weighted average reduction in traveling speed on the set of links. Let $\omega_n$ represent the weight of link n, which is determined by its length. When the level of congestion caused by the traffic accident is measured in time interval m, it can be expressed as follows:

$$y_m = \frac{\sum_{1 \le n \le N} \omega_n \, y_{nm}}{\sum_{1 \le n \le N} \omega_n}. \tag{2}$$

A large value of $y_m$ indicates that the level of congestion is slight, and a small value of $y_m$ indicates that the level of congestion is intense.

2.2. Identification of the set of links

According to Section 2.1, the key to measuring the level of congestion caused by a traffic accident is to identify the set of links, denoted as the link set L. The typical method used to measure the impact of adverse weather on traffic congestion is to draw a circle centered at the weather station with a predefined radius (Agarwal et al., 2005; Tsapakis et al., 2013). The links that fall within or intersect with the circle are used to calculate the level of congestion. It may be tempting to draw a circle centered at the accident location with a predefined radius and calculate the reduction in traveling speed on those links. However, this approach is not appropriate, as illustrated in Fig. 2(a). In this figure, the urban road network is the same as that in Fig. 1. For the sake of clarity, the lengths of the links are labeled and the traveling speeds are omitted. Assuming that the radius of the circle used to identify the set of links is 500, we obtain the link set L that includes all black links in Fig. 2(a) by drawing a circle. When the time intervals affected by the accident are known, the level of congestion can be calculated using Eqs. (1) and (2). Although this method is convenient to implement in practice, it may cause large errors in measuring the level of congestion caused by the traffic accident. The errors come from two aspects. First, the circle may include extra unconnected links, such as the overpasses UV and YX. In real life, this situation happens when a traffic accident occurs under the overpasses: the overpasses are not affected by the traffic accident but are included in the circle. Second, the traffic accident only affects the traveling speed on upstream adjacent links, while extra downstream links are also included in the circle. These two aspects illustrate that the circle-based measurement is not appropriate when a traffic accident occurs on an urban link.

We address these issues by developing a modified version of Dijkstra's algorithm. Let the distance of a node V be the shortest distance from node V to the accident location O. Note that we only search for links upstream of the accident location. The modified version of Dijkstra's algorithm is designed to identify the link set L in which each link satisfies the property that the distance from any point on the link to the accident location O is no more than r. The idea of the algorithm is first to find the nodes whose distances are no more than r. After obtaining these nodes, we put them into a node set S and record their distances. Next, each link n whose ending node V is in S is put into the link set L. Since the distance from node V to the accident location O is no more than r, part of or the whole of link n satisfies the property. When the whole of link n satisfies the property, the following inequality holds:

$$\mathrm{dist}(V) + \mathrm{len}(n) \le r, \tag{3}$$

where $\mathrm{dist}(V)$ is the distance of node V and $\mathrm{len}(n)$ is the length of link n. When only part of link n satisfies the property, we first identify a point Z on link n whose distance is exactly r. We then truncate link n from point Z to the ending node. Let $\hat{n}$ denote the truncated link; the following equation holds:

$$\mathrm{dist}(Z) = \mathrm{dist}(V) + \mathrm{len}(\hat{n}) = r, \tag{4}$$

where $\mathrm{len}(\hat{n})$ is the length of the truncated link. Finally, we define $\omega_n$, the weight of link n, as the length of the part of the link satisfying the property, which can be expressed as follows:

$$\omega_n = \min\{r,\ \mathrm{dist}(V) + \mathrm{len}(n)\} - \mathrm{dist}(V). \tag{5}$$
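As a minimal sketch of how Eqs. (1), (2) and (5) fit together, the following Python fragment computes link weights and the length-weighted speed ratio for one time interval. The function names, the second link's length (400) and all speed values are hypothetical and chosen only for illustration; they are not taken from the case study.

```python
def link_weight(dist_v, length, r):
    """Eq. (5): length of the part of a link that lies within distance r."""
    return min(r, dist_v + length) - dist_v


def congestion_level(weights, accident_speeds, free_speeds):
    """Eqs. (1)-(2): weighted mean of the ratios y_nm for one interval m."""
    # Eq. (1): ratio of accident-induced speed to accident-free speed per link.
    ratios = [v_hat / v_bar for v_hat, v_bar in zip(accident_speeds, free_speeds)]
    # Eq. (2): average of the ratios, weighted by the link weights.
    return sum(w * y for w, y in zip(weights, ratios)) / sum(weights)


# Hypothetical two-link example with r = 500 and dist(V) = 250 for both links.
r = 500
weights = [link_weight(250, 200, r), link_weight(250, 400, r)]
y_m = congestion_level(weights, accident_speeds=[20.0, 30.0], free_speeds=[40.0, 40.0])
print(weights, round(y_m, 4))
```

A value of `y_m` close to 1 means speeds are near their accident-free level, while smaller values mean more intense congestion.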

We illustrate the above algorithm with Fig. 2(b). Assume that the traffic accident occurs at point O on link HI and that r = 500. To implement the algorithm, we divide the original link HI into two new links HO and OI. The length of link HO is 250. We first find the upstream nodes whose distances to point O are less than 500 and put them into the node set S: S = {O: 0, H: 250, G: 450}. Then, we get the links whose ending nodes are O, H or G and put them into the link set L: L = {HO, GH, CH, MH, IH, BG, FG, LG, HG}. Finally, we calculate the weight of each link in set L by Eq. (5). If $\mathrm{dist}(V) + \mathrm{len}(n) \le 500$, the whole of link n satisfies the property, as with link GH. In this case, the weight of link GH is its length (200). Otherwise, only part of the link satisfies the property, as with link MH. We then truncate the link from point Z to the ending node H, which gives the link ZH. In this case, the weight of link MH is the length of the truncated link ZH (250). In Fig. 2(b), all links in the link set L are drawn in black. Note that the downstream links and the unconnected overpasses UV and YX are not included in L.

Fig. 2. Example of drawing a circle and the modified version of the Dijkstra's algorithm.

Algorithm 1 presents the modified version of Dijkstra's algorithm. Assume that the accident occurs at a point O on link HI. To implement the algorithm, link HI is divided into two links HO and OI. Let link(U, V) represent the link from node U to node V and pre(V) represent the predecessors of node V. Let Q and S represent the unvisited node set and the visited node set, respectively. Note that the urban road network of a city may include billions of nodes. To avoid taking up huge space, we start with an empty unvisited node set and update it iteratively. The maximum distance is set to r. The preprocessing is implemented in lines 1–5. In lines 6–19, we find each node whose distance is no more than r and put it into the visited node set S. In lines 20–24, we get the link set L and calculate the weight of each link. Finally, in line 25 we return the link set L. Using the link set L and the time interval m, the congestion caused by the traffic accident can be calculated by Eqs. (1) and (2).
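The procedure described above can be sketched in Python as follows. This is an illustrative reading of the prose description, not a transcription of Algorithm 1: the function name `upstream_links`, the `(start, end, length)` link format, and every link length except HO = 250 and GH = 200 are assumptions made for the example. The search runs over incoming links only, so downstream links and unconnected overpasses are never reached, and the heap plays the role of the unvisited set Q that starts nearly empty and grows lazily. Dropping zero-weight links is also a choice made here.

```python
import heapq


def upstream_links(links, accident_node, r):
    """Return {(start, end): weight} for links within distance r upstream."""
    # pre(V): incoming links of each node, so the search only moves upstream.
    incoming = {}
    for u, v, length in links:
        incoming.setdefault(v, []).append((u, length))

    dist = {}                       # visited node set S with final distances
    heap = [(0.0, accident_node)]   # unvisited node set Q, filled on demand
    while heap:
        d, v = heapq.heappop(heap)
        if v in dist:
            continue
        dist[v] = d
        for u, length in incoming.get(v, []):
            if u not in dist and d + length <= r:
                heapq.heappush(heap, (d + length, u))

    # Every link ending at a visited node gets the weight of Eq. (5).
    weights = {}
    for v, d in dist.items():
        for u, length in incoming.get(v, []):
            w = min(r, d + length) - d
            if w > 0:               # assumption: skip links with no usable part
                weights[(u, v)] = w
    return weights


# Toy network loosely following Fig. 2(b); most lengths are hypothetical.
toy_links = [
    ("H", "O", 250), ("G", "H", 200), ("M", "H", 400), ("C", "H", 300),
    ("B", "G", 300),
    ("O", "I", 250),   # downstream of the accident, must be ignored
    ("U", "V", 100),   # unconnected overpass, must be ignored
]
result = upstream_links(toy_links, "O", 500)
print(result)
```

With these inputs the visited nodes are O, H and G at distances 0, 250 and 450, matching the node set S in the worked example, and the partially covered link MH receives weight 250.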

Algorithm 1. Modified version of the Dijkstra's algorithm

3. Generalized linear mixed-effects model for panel data

In order to better identify the determinants of the congestion caused by a traffic accident, panel data are employed in our generalized linear mixed-effects model. Panel data are multi-dimensional data involving measurements over time; that is, panel data contain observations obtained over multiple time periods for the same individuals (Wooldridge, 2010; Hsiao, 2014). In contrast, cross-section data are usually collected by observing individuals at the same point in time (Wooldridge, 2010). The panel data in our study are collected over time and distance. Following the steps in Section 2, the congestion caused by a traffic accident k is measured in a spatiotemporal area characterized by the maximum distance r and the time interval m, and is denoted as $y_{rmk}$. As the level of congestion varies in both space and time, we systematically vary the size of the neighborhood and the time interval, which results in a panel of observations. Specifically, when we fix the distance r and the time interval m, a group of cross-section data is obtained over different accidents. When we vary the distance r or the time interval m, different cross-section data sets are obtained, which together constitute the panel data.

In the field of traffic accident analysis, a number of studies have used panel data to estimate regression models (Li et al., 2015; Coruh et al., 2015; Chen et al., 2018; Soro and Wayoro, 2018). For example, Li et al. (2015) use panel data to investigate the effects of changes in road network characteristics on road casualties. Coruh et al. (2015) use a random parameters negative binomial model with panel data to analyze factors affecting accident frequency. Soro and Wayoro (2018) use a mixed-effects negative binomial model with panel data to identify the determinants of road mortalities. Compared with cross-section data, panel data possess two main advantages (Hsiao, 2014; Li et al., 2015): (1) panel data give the researcher a larger number of observations, which increases the degrees of freedom and reduces the collinearity among independent variables, thus improving the parameter estimates; and (2) panel data allow a researcher to answer questions that cannot be completely addressed using cross-section data. In our research, cross-section data cannot identify the influence of the distance r and the time interval m on the level of congestion caused by a traffic accident. However, such influence can be regarded as spatial and temporal fixed effects in the model for panel data.
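The way cross-sections stack into a panel can be sketched as below. The helper `build_panel` and the `measure` callable are hypothetical names introduced for this illustration; `measure(k, r, m)` stands in for the congestion measurement of Section 2, and the distance and interval grids are the ones used later in the case study.

```python
from itertools import product

DISTANCES = [300, 500, 700, 900]                 # maximum distance r, in meters
INTERVALS = ["0-10", "10-20", "20-30", "30-40"]  # interval m, minutes after the accident


def build_panel(accident_ids, measure):
    """Return one observation y_rmk per (accident k, distance r, interval m)."""
    return [
        {"accident": k, "r": r, "m": m, "y": measure(k, r, m)}
        for k, r, m in product(accident_ids, DISTANCES, INTERVALS)
    ]


# Placeholder measurement: a constant ratio, only to show the data layout.
panel = build_panel([1, 2], measure=lambda k, r, m: 0.8)

# Fixing (r, m) recovers one cross-section across accidents.
cross_section = [row for row in panel if row["r"] == 300 and row["m"] == "0-10"]
print(len(panel), len(cross_section))
```

Each accident contributes 16 rows, one per combination of distance and interval, which is what lets the model separate the spatial and temporal fixed terms from accident-level effects.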


We now introduce the models for panel data, including the fixed-effects model, the random-effects model, and the mixed-effects model (Hsiao, 2014). The fixed-effects model is one in which researchers make inferences conditional on the effects that are in the sample. In our research, we select the distance r and the time interval m to observe how the traffic accident k impacts the level of congestion $y_{rmk}$. We want to investigate how $y_{rmk}$ varies with respect to the different distances r and time intervals m we selected. For the traffic accident k, all observations of the selected distance r and time interval m are in the sample. Therefore, we use fixed terms to capture how the distance r and the time interval m impact the level of congestion. The random-effects model is one in which researchers make marginal or unconditional inferences with respect to the population of all effects. In our research, we collect numerous accident reports in Beijing. However, the collected accidents are still only a sample of all accidents in Beijing. We want to make inferences, based on the collected accidents, about how any accident occurring in Beijing affects the level of congestion. Therefore, we use random terms to capture these effects. The mixed-effects model contains both fixed effects and random effects. It is widely adopted in traffic accident analysis, for example in the mixed-effects negative binomial model (Soro and Wayoro, 2018), the mixed-effects logit model (Dong et al., 2018), and the generalized linear mixed-effects model (Mussone et al., 2017). Since both fixed effects and random effects are incorporated in our research and the dependent variable (the level of congestion) is continuous, a generalized linear mixed-effects model for panel data is employed. The model can be expressed as follows (Hsiao, 2014):

$$\log(y_{rmk}) = \mathbf{x}_{rmk}^{\top}\boldsymbol{\beta} + \mu_{rm} + \gamma_k + \varepsilon_{rmk}, \tag{6}$$

where $\log(y_{rmk})$ indicates the logarithm of the level of congestion caused by the traffic accident k given the distance r and the time interval m. $\mathbf{x}_{rmk}$ is the vector of independent variables, including the types of accidents, types of vehicles involved, injury conditions, occurrence time, occurrence day, occurrence area, road surface conditions, traffic speed and road types. $\boldsymbol{\beta}$ is the coefficient vector of the variables. $\mu_{rm}$ is a fixed term for distance r and time interval m, which captures the spatial and temporal evolution of congestion. $\gamma_k$ is a random term for accident k, which captures the unobserved heterogeneity in traffic accidents. $\varepsilon_{rmk}$ is a random term for extra interaction effects. This model is a generalized linear mixed-effects model that belongs to the Gaussian family, that is, a linear regression. The model specification can be expressed as follows:

$$E\begin{bmatrix} \boldsymbol{\gamma} \\ \boldsymbol{\varepsilon} \end{bmatrix} = \begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \tag{7}$$

$$\mathrm{Var}\begin{bmatrix} \boldsymbol{\gamma} \\ \boldsymbol{\varepsilon} \end{bmatrix} = \begin{bmatrix} G & 0 \\ 0 & R \end{bmatrix}, \tag{8}$$

where $\boldsymbol{\gamma}$ is the vector of $\gamma_k$, $\boldsymbol{\varepsilon}$ is the vector of $\varepsilon_{rmk}$, and G and R are the covariance matrices of $\boldsymbol{\gamma}$ and $\boldsymbol{\varepsilon}$, respectively. Let Z be the matrix of all fixed terms including $\mathbf{x}_{rmk}$ and $\mu_{rm}$, $\mathbf{y}$ be the vector of $\log(y_{rmk})$, and Cov be the covariance matrix of $\mathbf{y}$. We can conclude that

$$\mathrm{Cov} = Z G Z^{\top} + R. \tag{9}$$

Because of the random vector $\boldsymbol{\gamma}$, least squares is no longer the best estimation method, and generalized least squares (GLS) is more appropriate. Based on GLS, $\boldsymbol{\beta}$ and $\boldsymbol{\mu}$ can be estimated by minimizing

$$(\mathbf{y} - X\boldsymbol{\beta} - \boldsymbol{\mu})^{\top}\, \mathrm{Cov}^{-1}\, (\mathbf{y} - X\boldsymbol{\beta} - \boldsymbol{\mu}), \tag{10}$$

where X is the matrix of $\mathbf{x}_{rmk}$ and $\boldsymbol{\mu}$ is the vector of $\mu_{rm}$. In order to minimize Eq. (10), Cov is required. Both the maximum likelihood (ML) and restricted maximum likelihood (REML) estimators can be used to estimate it. However, because the ML result is biased (Yu et al., 2007), the unbiased REML is used to estimate Cov. In order to evaluate the overall performance of the model specification, we employ the likelihood ratio test $R^2$ as the goodness-of-fit measure (Magee, 1990). The $R^2$ is defined as follows:

$$R^2 = 1 - \exp\left(-\frac{2}{K}\left(\log L - \log L_0\right)\right), \tag{11}$$

where $\log L$ is the log-likelihood of our model, $\log L_0$ is the log-likelihood of the intercept-only model and K is the number of observations. Moreover, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) are also provided.

4. Case study

In this section, we conduct the case study using real data from Beijing. In Section 4.1, we give a brief description of the data. In Section 4.2, we present the independent variables. In Section 4.3, we apply the generalized linear mixed-effects model to identify the determinants of the congestion caused by a traffic accident. Finally, we summarize the main findings and discuss them in Section 4.4. The computer programs are written in Python 3.6.3 (Python, 2017) and SAS 9.4 (SAS, 2015). All computations are performed on a desktop computer with a 3.70 GHz Intel CPU and 16 GB of memory.

4.1. Data description

We conduct the case study in the urban road network of northern Beijing. A total of 907 accidents in April 2016 are used in our research. The locations of these accidents are illustrated in Fig. 3. The data used in this study contain four parts: (1) GPS data reported by probe vehicles, which record the traveling speed, location, and heading of the probe vehicles at regular intervals. GPS data are one of the most important data sources, with high spatial and temporal accuracy (Shen et al., 2013), and have been widely adopted in numerous studies (Wang et al., 2013, 2018; Tang et al., 2016). The GPS data in our study are collected from about 55,000 probe vehicles in Beijing. The average reporting interval of the probe vehicles is about 30 s, and the reports are aggregated into 5-min intervals; (2) the accident reports in April 2016, which record the occurrence time, location, and a detailed description of each individual accident. From the description, we obtain the types of the accidents, the injury conditions of the drivers and the types of the vehicles involved; (3) the topological structure of the urban road network, which provides detailed information on 40,703 links in total. The information includes the link ID, the length and width of the link, the class of the link (such as main road or branch), the link coordinates and the number of traffic lanes; and (4) environmental data in April 2016, which provide the road surface conditions.

Fig. 3. Locations of the accidents in April 2016 in Northern Beijing.

4.2. Independent variables

In Table 1, we summarize the independent variables used in the model. In this table, Columns 1 and 2 report the category of the variables. The first category is the accident characteristics, which include the types of the accidents, the types of the vehicles involved, the injury conditions of the drivers, the occurrence time and the occurrence day. Note that scrape among vehicles mainly indicates sideswipe collisions, as well as other types of minor collisions. The two other accident types, rear-end collisions and collisions with fixed objects, are excluded from the scrape type. The second category is the regional characteristic, which indicates the urban area the accident location belongs to. Boundaries of the urban areas in Beijing are defined by the 2nd, 3rd, 4th and 5th rings. The third category is the environmental characteristics, consisting of wet and dry road surface conditions. The fourth category is the traffic characteristics, which include the linear and squared terms of the average speed in the 30 min before the occurrence of the accident. We include the squared term because the average speed might be non-linearly related to the level of congestion. The last category is the areal geometric characteristics. Given the accident location O and the maximum distance r, the link set L can be obtained according to the modified Dijkstra's algorithm in Section 2.2. Two variables of areal geometric characteristics are defined based on L: (1) the ratio of the cumulative length of the minor links, which have 1 or 2 traffic lanes, to the cumulative length of all links in L; and (2) the ratio of the cumulative length of the secondary links (branch or bypass) to the cumulative length of all links in L. Column 3 reports the meanings of the variables. Columns 4 and 5 report the number of accidents and the variable type, respectively. The reference levels of the dummy variables are marked in the table.

Table 1
Summary of the independent variables.

| Category | | Variable a | Accidents | Type |
|---|---|---|---|---|
| Accident characteristics | Types of the accidents | Collision with fixed object | 20 | Dummy |
| | | Rear-end collision | 45 | Dummy |
| | | Scrape among vehicles (ref.) | 842 | Dummy |
| | Types of the vehicles | Buses/trucks | 80 | Dummy |
| | | Car (ref.) | 827 | Dummy |
| | Injury conditions | Injuries | 121 | Dummy |
| | | Property damage only (ref.) | 786 | Dummy |
| | Occurrence time | Morning peak (7:00–9:00) | 148 | Dummy |
| | | Afternoon peak (17:00–19:00) | 160 | Dummy |
| | | Off-peak hours (ref.) | 599 | Dummy |
| | Occurrence day | Weekdays | 745 | Dummy |
| | | Weekends (ref.) | 162 | Dummy |
| Regional characteristics | Occurrence area | In the 2nd ring | 39 | Dummy |
| | | Between the 2nd and 3rd ring | 162 | Dummy |
| | | Between the 3rd and 4th ring | 346 | Dummy |
| | | Between the 4th and 5th ring | 280 | Dummy |
| | | Out of the 5th ring (ref.) | 80 | Dummy |
| Environment characteristics | Road surface condition | Wet | 170 | Dummy |
| | | Dry (ref.) | 737 | Dummy |
| Traffic characteristics | Speed (km/h) | Average speed | 907 | Continuous |
| | | Square of the average speed | 907 | Continuous |
| Geometric characteristics | Road types | Fraction of the minor links (%) | 907 | Continuous |
| | | Fraction of the secondary links (%) | 907 | Continuous |

a Reference levels of the dummy variables are marked (ref.).

As the impact of congestion varies in both space and time, we choose different values of the distance r and the time interval m to obtain the panel data. The values of r are set to 300, 500, 700 and 900 meters, and the values of m are set to 0–10, 10–20, 20–30 and 30–40 min after the occurrence of the accident. Note that the variables associated with r and m are also dummies. We let the combination of 900 meters and 30–40 min be the reference level.
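One possible way to encode the fixed terms for distance and time interval is sketched below: each of the 16 combinations of r and m becomes a 0/1 dummy, and the reference combination (900 meters, 30–40 min) is left out so that it is absorbed by the intercept. The function name `cell_dummies` and the column naming scheme are assumptions made for this illustration.

```python
DISTANCES = (300, 500, 700, 900)                 # r, in meters
INTERVALS = ("0-10", "10-20", "20-30", "30-40")  # m, minutes after the accident
REFERENCE = (900, "30-40")                       # reference level stated in the text


def cell_dummies(r, m):
    """Return the 15 dummy variables for one (distance, interval) observation."""
    return {
        f"{t}min_{d}m": int((d, t) == (r, m))
        for d in DISTANCES
        for t in INTERVALS
        if (d, t) != REFERENCE
    }


row = cell_dummies(300, "0-10")
reference_row = cell_dummies(900, "30-40")
print(sum(row.values()), sum(reference_row.values()))
```

An observation at the reference combination has all 15 dummies equal to zero, so each estimated fixed term is read relative to congestion measured within 900 meters during 30–40 min after the accident.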

4.3. Results

The coefficients of the mixed-effects model are reported in Table 2. In this table, Columns 1 and 2 report the category of the variable. Columns 3 and 4 report the coefficient and the standard error of the variable, respectively. Since we take the logarithm of $y_{rmk}$, the coefficient of a variable can be regarded as the percentage change in the dependent variable for a 1-unit increase in the independent variable (Benoit, 2011). Moreover, high values of the dependent variable $y_{rmk}$ imply lower congestion, so a negative coefficient indicates more intense congestion. Therefore, the coefficients of the model can be interpreted as follows: (1) for continuous variables, we take the fraction of the minor links as an example: the coefficient is −0.0006, which means that the level of congestion is expected to increase by 0.06% for a 1-unit increase in the fraction of the minor links; and (2) for dummy variables, we take the variable rear-end collision as an example: the coefficient is −0.0843, which means that the level of congestion associated with rear-end collision is 8.43% more intense than that

Accident Analysis and Prevention 136 (2020) 105327

Z. Zheng, et al.

values of the dependent variable yrmk imply lower congestion. The level of congestion refers to the average reduction in traveling speed on a set of links in the neighborhood of the traffic accident.

Table 2 The results of the generalized linear mixed-effects model. Category

Variables

Coefficient

Standard error

Types of the accidents

Collision with fixed object Rear-end collision Scrape among vehicles Buses/trucks Cars Injuries Property damage Morning peak Afternoon peak Off-peak hours Weekdays Weekends In the 2nd ring Between the 2nd and 3rd ring Between the 3rd and 4th ring Between the 4th and 5th ring Out of the 5th ring Wet

−0.0071

0.0178

−0.0843*** –

0.0122 –

−0.0603*** – 0.0008 – −0.0587*** −0.0657*** – −0.0384*** – 0.0045 0.0152

0.0092 – 0.0077 – 0.0072 0.0070 – 0.0071 – 0.0154 0.0108

0.0081

0.0098

−0.0088

0.0100

– 3pt 0.0032

– 0.0069

– 0.0402*** −0.0009***

– 0.0028 0.0001

−0.0006***

0.0001

−0.0001

0.0001

−0.2935*** −0.2175*** −0.1661*** −0.1342*** −0.1241*** −0.1608*** −0.1094*** −0.0686*** −0.0665*** −0.1193*** −0.0707*** −0.0342*** −0.0291*** −0.0893*** −0.0452*** −0.0082 –

0.0198 0.0053 0.0054 0.0054 0.0054 0.0050 0.0050 0.0050 0.0050 0.0050 0.0051 0.0051 0.0051 0.0050 0.0050 0.0050 –

Types of the vehicles Injury conditions Occurrence time Occurrence day Occurrence area

Road surface condition Speed (km/h) Road types

Intercept Fixed terms

Dry Average speed Square of the average speed Fraction of the minor links (%) Fraction of the secondary links (%) Constant term 0–10 min, 300 m 10–20 min, 300 m 20–30 min, 300 m 30–40 min, 300 m 0–10 min, 500 m 10–20 min, 500 m 20–30 min, 500 m 30–40 min, 500 m 0–10 min, 700 m 10–20 min, 700 m 20–30 min, 700 m 30–40 min, 700 m 0–10 min, 900 m 10–20 min, 900 m 20–30 min, 900 m 30–40 min, 900 m

• For the three types of traffic accidents, namely, scrape among ve-





• •

hicles, collision with fixed objects, and rear-end collision, the level of congestion associated with the first two types are comparable, while that associated with the third type is 8.43% more intense. Since the traffic accidents induced by scrape among vehicles and collision with fixed object usually cause slight damage, the drivers will reach an agreement and pull vehicles over quickly, therefore, the traffic movement will be slightly affected. However, the traffic accidents induced by rear-end collision usually cause severe damage (Kim et al., 2007; Kusano and Gabler, 2012), and the vehicles will not be moved until the intervention of traffic police, therefore, the traffic movement will be seriously affected. For the types of vehicles involved, the level of congestion involving buses/trucks is 6.03% more intense than those involving only cars. This is because that the traffic accidents with buses/trucks involved will block more traffic lanes due to its weight and size (Chung and Recker, 2015; Uddin and Huynh, 2017), thus causing more intense congestion. For the occurrence time, the level of congestion associated with the morning peaks and afternoon peaks are 5.87% and 6.57% more intense than that associated with the traffic accidents occurring in off-peak hours, respectively. We also find that the level of congestion associated with afternoon peak is more intense than that associated with morning peak. The reason is that the traffic in afternoon peak is heavier than that in morning peak. For the occurrence day, the level of congestion associated with weekdays is 3.84% more intense than that associated with weekends. The reason is that lots of people are on duty on weekdays and the traffic is heavier than that on weekends, which makes more intense congestion on weekdays. Both the average speed and its square are statistically significant. Let x v and yv indicate the average speed and its effect on the dependent variable, respectively. 
According to the regression results in Table 2, we can get the following equation:

yv = 0.0402x v

The R2 of the model is 0.3966. The AIC and BIC of the model are −20393.2 and −20383.6, respectively. The variance of the random effect is 0.0053 and it is statistically significant at the 0.01 level.



Notes: The dependent variable is log(yrmk ) which indicates the logarithm of the level of congestion caused by the traffic accident k given the distance r and time interval m . b The bold values indicate the reference levels of the dummy variables. *** Indicate the statistical significance at the 0.01 level.



associated with scrape among vehicles. The fixed effects of the distance r and time interval m are also reported in Table 2. We take the value of 0–10 min and 300 m as an example. The value of the fixed term is 0.2175, which means that the level of congestion measured within 0–10 min and 300 m is 21.75% more intense than that measured within 30–40 min and 900 m.

0.0009x v2 .

(12)

The relationship between the dependent variable and the average speed is inverted U-shaped. This means that the congestion is intense when the average speed is very low or high, which can be interpreted as follows: When the average speed is very low, it indicates that the traffic congestion already exists. Occurrence of the traffic accident will make the congestion much more intense. When the average speed is very high, the traffic condition is vulnerable, occurrence of the traffic accident will reduce the speed sharply. We expect to see that the level of congestion will increase by 0.06% with 1-unit increase in the fraction of minor links. The larger fraction of minor links indicates lower traffic capacity in the neighborhood of the accident, therefore the congestion will be more intense when the traffic accident occurs. According to the fixed effects in Table 2, we expect to see that the level of congestion will decrease when we extend the predefined area (r ) or the time interval (m ). This means that the impact of traffic accidents will decrease with the increase of distance or time.

4.5. Discussion Our study can be used for the management of the non-recurrent congestion caused by a traffic accident. Such congestion is typified by unpredictable locations, times, types, and severity (Anbaroglu et al., 2014, 2015). Effective management of the congestion relies heavily on the timely response of traffic management system. Based on the

4.4. Findings According to the results in Table 2, we summarize the main findings and give the corresponding discussions in this section. Note that high 7

Accident Analysis and Prevention 136 (2020) 105327

Z. Zheng, et al.

aforementioned findings, the potential congestion associated with the accident can be estimated. Then, the traffic management agencies can allocate resources in the most efficient way (Chung and Recker, 2012, 2015). For example, there exists the situation where two accidents occur at near times and they have similar characteristics but different accident types. One accident is a rear-end collision, while the other is a scrape among vehicles. According to our findings, the level of congestion associated with the rear-end collision is likely to be more intense than that associated with the scrape type. Therefore, we should put more priority to clear up the accident of the rear-end collision. Moreover, dissemination of the information on the estimated congestion via a real time traffic information source (Farradyne, 2000; Gordon, 2016), such as in-vehicle or personal devices and roadside variable message signs, will allow the travellers to adjust their traveling plans or alter the driving routes to avoid the congestion.
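The prioritization example in this discussion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout, the key names, and the functions `relative_congestion_pct` and `prioritize` are hypothetical, the coefficients are copied from Table 2, and a coefficient b on the log-transformed dependent variable is read as a 100·b % change, following the interpretation used in Section 4.3.

```python
# Coefficients copied from Table 2; reference levels are set to 0.
COEF = {
    "accident_type": {
        "scrape": 0.0,
        "fixed_object": -0.0071,  # not statistically significant in Table 2
        "rear_end": -0.0843,
    },
    "vehicle": {"cars": 0.0, "bus_truck": -0.0603},
    "time": {"off_peak": 0.0, "morning_peak": -0.0587, "afternoon_peak": -0.0657},
    "day": {"weekend": 0.0, "weekday": -0.0384},
}


def relative_congestion_pct(accident):
    """Approximate % increase in congestion versus an all-reference accident.

    Lower values of the dependent variable mean more congestion, so the
    summed coefficients are negated before being read as a percentage.
    """
    total = sum(COEF[key][accident[key]] for key in COEF)
    return -100.0 * total


def prioritize(accidents):
    """Order accidents from most to least expected congestion."""
    return sorted(accidents, key=relative_congestion_pct, reverse=True)


rear_end = {"accident_type": "rear_end", "vehicle": "cars",
            "time": "off_peak", "day": "weekend"}
scrape = {"accident_type": "scrape", "vehicle": "cars",
          "time": "off_peak", "day": "weekend"}
ranked = prioritize([scrape, rear_end])  # the rear-end collision comes first
```

Summing coefficients across categories and reading the sum as a percentage is the same small-coefficient approximation used in the text; it is meant for ranking accidents, not for precise prediction.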

5. Conclusions and future research directions

In this paper, we investigate how a traffic accident affects the level of congestion in urban road networks, which is new to the literature. When a traffic accident occurs on an urban link, the congestion propagates to and affects adjacent links. We develop a modified version of Dijkstra's algorithm to identify the set of links in the neighborhood of the accident. We first measure the level of congestion caused by the traffic accident as the reduction in traveling speed on those links. As the level of congestion varies both in space and in time, we then systematically vary the size of the neighborhood and the time period, which results in a panel of congestion measurements. These data are analyzed using a generalized linear mixed-effects model to quantify how the types of the accidents, the types of vehicles involved, and the occurrence time affect the level of congestion. Using real data in Beijing, we find that the determinants of the congestion caused by a traffic accident include the types of the accidents, the types of vehicles involved, the occurrence time, the occurrence day, the average speed, and the number of traffic lanes. Moreover, our method can be easily applied to other regions when the data are available. For regions with traffic conditions similar to those in Beijing, the coefficient results would be comparable. For regions with different traffic conditions, our method can be used as well, and the different results it may produce can be analyzed following the same logic. We outline the following possible future research directions:

• conduct spatial analysis using the topological structure of the urban road networks. This would allow for spatial autocorrelation testing and for examining how much the congestion of a link is influenced by the congestion of proximal links. Therefore, it would be interesting to examine the spatial effects on the level of congestion;

• collect more accident-related data, such as the gender and age of the driver, traffic volume, weather conditions, and so on. Because of data limitations, some factors related to the accident are not considered in the model. It is of great interest to investigate how these factors affect the level of congestion; and

• develop models that are capable of predicting the propagation of the congestion caused by a traffic accident in urban road networks. To the best of the authors' knowledge, the existing literature employs only simulation models to predict congestion propagation at the aggregate level. However, such results cannot be used to predict the congestion propagation of a specific traffic accident. It is, therefore, of great importance to predict the propagation of such congestion. The results can be used by navigation systems to plan drivers' routes effectively by avoiding the congested area.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research is supported by the Natural Science Foundation of China [Grant 71622006, 71761137003] and the Center for Data-Centric Management in the Department of Industrial Engineering at Tsinghua University.

References

Agarwal, M., Maze, T.H., Souleyrette, R., 2005. Impacts of Weather on Urban Freeway Traffic Flow Characteristics and Facility Capacity. pp. 18–19.
Anbaroglu, B., Cheng, T., Heydecker, B., 2015. Non-recurrent traffic congestion detection on heterogeneous urban road networks. Transportmetrica A 11 (9), 754–771.
Anbaroglu, B., Heydecker, B., Cheng, T., 2014. Spatio-temporal clustering for non-recurrent traffic congestion detection on urban road networks. Transp. Res. Part C 48, 47–65.
Benoit, K., 2011. Linear regression models with logarithmic transformations. Lond. School Econ. Lond. 22 (1), 23–36.
Charles, P., 2005. Effective implementation of a regional transport strategy: traffic incident management case study. WIT Trans. Built Environ. 77, 609–618.
Chen, B.Y., Lam, W.H., Sumalee, A., Li, Q., Li, Z.-C., 2012. Vulnerability analysis for large-scale and congested road networks with demand uncertainty. Transp. Res. Part A 46 (3), 501–516.
Chen, F., Chen, S., 2011. Injury severities of truck drivers in single- and multi-vehicle accidents on rural highways. Accid. Anal. Prev. 43 (5), 1677–1688.
Chen, F., Chen, S., Ma, X., 2018. Analysis of hourly crash likelihood using unbalanced panel data mixed logit model and real-time driving environmental big data. J. Saf. Res. 65, 153–159.
Chen, Z., Liu, X.C., Zhang, G., 2016. Non-recurrent congestion analysis using data-driven spatiotemporal approach for information construction. Transp. Res. Part C 71, 19–31.
Chung, Y., Recker, W.W., 2012. A methodological approach for estimating temporal and spatial extent of delays caused by freeway accidents. IEEE Trans. Intell. Transp. Syst. 13 (3), 1454–1461.
Chung, Y., Recker, W.W., 2015. Frailty models for the estimation of spatiotemporally maximum congested impact information on freeway accidents. IEEE Trans. Intell. Transp. Syst. 16 (4), 2104–2112.
Coruh, E., Bilgic, A., Tortum, A., 2015. Accident analysis with aggregated data: the random parameters negative binomial panel count data model. Anal. Methods Accid. Res. 7, 37–49.
Dong, B., Ma, X., Chen, F., Chen, S., 2018. Investigating the differences of single-vehicle and multivehicle accident probability using mixed logit model. J. Adv. Transp.
Farradyne, P., 2000. Traffic Incident Management Handbook. Prepared for Federal Highway Administration. Office of Travel Management.
Gordon, R., 2016. Non-recurrent congestion: improvement of time to clear incidents. Intelligent Transportation Systems 41–90.
Hamilton, T.L., Wichman, C.J., 2018. Bicycle infrastructure and traffic congestion: evidence from DC's Capital Bikeshare. J. Environ. Econ. Manag. 87, 72–93.
Hsiao, C., 2014. Analysis of Panel Data. No. 54. Cambridge University Press.
Kermanshah, A., Derrible, S., 2016. A geographical and multi-criteria vulnerability assessment of transportation networks against extreme earthquakes. Reliab. Eng. Syst. Saf. 153, 39–49.
Kim, D.-J., Park, K.-H., Bien, Z., 2007. Hierarchical longitudinal controller for rear-end collision avoidance. IEEE Trans. Ind. Electron. 54 (2), 805–817.
Koesdwiady, A., Soua, R., Karray, F., 2016. Improving traffic flow prediction with weather information in connected cars: a deep learning approach. IEEE Trans. Vehic. Technol. 65 (12), 9508–9517.
Kontou, E., Murray-Tuite, P., Wernstedt, K., 2017. Duration of commute travel changes in the aftermath of Hurricane Sandy using accelerated failure time modeling. Transp. Res. Part A 100, 170–181.
Kusano, K.D., Gabler, H.C., 2012. Safety benefits of forward collision warning, brake assist, and autonomous braking systems in rear-end collisions. IEEE Trans. Intell. Transp. Syst. 13 (4), 1546–1555.
Kwon, J., Mauch, M., Varaiya, P., 2006. Components of congestion: delay from incidents, special events, lane closures, weather, potential ramp metering gain, and excess demand. Transp. Res. Rec. 1959 (1), 84–91.
Lawson, T., Lovell, D., Daganzo, C., 1997. Using input–output diagram to determine spatial and temporal extents of a queue upstream of a bottleneck. Transp. Res. Rec.: J. Transp. Res. Board 1572, 140–147.
Li, H., Graham, D.J., Majumdar, A., 2015. Effects of changes in road network characteristics on road casualties: an application of full Bayes models using panel data. Saf. Sci. 72, 283–292.
Li, J., Lan, C.-J., Gu, X., 2006. Estimation of incident delay and its uncertainty on freeway networks. Transp. Res. Rec. 1959 (1), 37–45.
Lighthill, M.J., Whitham, G.B., 1955. On kinematic waves. II. A theory of traffic flow on long crowded roads. Proc. R. Soc. Lond. Ser. A Math. Phys. Sci. 229 (1178), 317–345.
Magee, L., 1990. R2 measures based on Wald and likelihood ratio joint significance tests. Am. Stat. 44 (3), 250–253.
Mfinanga, D., Fungo, E., 2013. Impact of incidents on traffic congestion in Dar Es Salaam city. Int. J. Transp. Sci. Technol. 2 (2), 95–108.
Mussone, L., Bassani, M., Masci, P., 2017. Analysis of factors affecting the severity of crashes in urban road intersections. Accid. Anal. Prev. 103, 112–122.
Pregnolato, M., Ford, A., Robson, C., Glenis, V., Barr, S., Dawson, R., 2016. Assessing urban strategies for reducing the impacts of extreme weather on infrastructure networks. R. Soc. Open Sci. 3 (5), 160023.
Python, 2017. Python Language Reference (Version 3.6.3).
SAS, 2015. Base SAS 9.4 Procedures Guide. SAS Institute.
Schrank, D., Eisele, B., Lomax, T., 2012. TTI's 2012 Urban Mobility Report. Texas A&M Transportation Institute. The Texas A&M University System 4.
Schrank, D., Lomax, T., et al., 2002. The 2002 Urban Mobility Report. Texas Transportation Institute College Station.
Shen, Y., Kwan, M.-P., Chai, Y., 2013. Investigating commuting flexibility with GPS data and 3D geovisualization: a case study of Beijing, China. J. Transp. Geogr. 32, 1–11.
Soro, W.L., Wayoro, D., 2018. A mixed effects negative binomial analysis of road mortality determinants in sub-Saharan African countries. Transp. Res. Part F 52, 120–126.
Stipancic, J., Miranda-Moreno, L., Saunier, N., 2017. Impact of congestion and traffic flow on crash frequency and severity: application of smartphone-collected GPS travel data. Transp. Res. Rec. 2659 (1), 43–54.
Tang, J., Jiang, H., Li, Z., Li, M., Liu, F., Wang, Y., 2016. A two-layer model for taxi customer searching behaviors using GPS trajectory data. IEEE Trans. Intell. Transp. Syst. 17 (11), 3318–3324.
Tsapakis, I., Cheng, T., Bolbol, A., 2013. Impact of weather conditions on macroscopic urban travel times. J. Transp. Geogr. 28, 204–211.
Uddin, M., Huynh, N., 2017. Truck-involved crashes injury severity analysis for different lighting conditions on rural and urban roadways. Accid. Anal. Prev. 108, 44–55.
Wang, C., Quddus, M.A., Ison, S.G., 2009. Impact of traffic congestion on road accidents: a spatial analysis of the M25 motorway in England. Accid. Anal. Prev. 41 (4), 798–808.
Wang, Z., Lu, M., Yuan, X., Zhang, J., Van De Wetering, H., 2013. Visual traffic jam analysis based on trajectory data. IEEE Trans. Visual. Comput. Graph. 19 (12), 2159–2168.
Wang, Z., Qi, X., Jiang, H., 2018. Estimating the spatiotemporal impact of traffic incidents: An integer programming approach consistent with the propagation of shockwaves. Transp. Res. Part B 111, 356–369.
Wooldridge, J.M., 2010. Econometric Analysis of Cross Section and Panel Data. MIT Press.
Yang, H., Wang, Z., Xie, K., Dai, D., 2017. Use of ubiquitous probe vehicle data for identifying secondary crashes. Transp. Res. Part C 82, 138–160.
Yu, J., Chou, E.Y., Luo, Z., 2007. Development of linear mixed effects models for predicting individual pavement conditions. J. Transp. Eng. 133 (6), 347–354.
