Transportation Research Part C 57 (2015) 30–41
Transportation Research Part C journal homepage: www.elsevier.com/locate/trc
Development of a maximum likelihood regression tree-based model for predicting subway incident delay
Jinxian Weng a,b,⇑, Yang Zheng b, Xiaobo Qu c, Xuedong Yan b
a College of Transport and Communications, Shanghai Maritime University, Shanghai 201306, China
b MOE Key Laboratory for Urban Transportation Complex Systems Theory and Technology, Beijing Jiaotong University, Beijing 100044, China
c Griffith School of Engineering, Griffith University, Gold Coast, 4222 QLD, Australia
Article info
Article history: Received 10 September 2014; Received in revised form 11 February 2015; Accepted 1 June 2015; Available online 17 June 2015
Keywords: Subway incidents; Delay; Maximum likelihood regression tree; Accelerated failure time
Abstract
This study aims to develop a maximum likelihood regression tree-based model to predict subway incident delays, which are major negative impacts caused by subway incidents from the commuter’s perspective. Using the Hong Kong subway incident data from 2005 to 2009, a tree comprising 10 terminal nodes is selected to predict subway incident delays in a case study. An accelerated failure time (AFT) analysis is conducted separately for each terminal node. The goodness-of-fit results show that our developed model outperforms the traditional AFT models with fixed and random effects because it can overcome the heterogeneity problem and over-fitting effects. The developed model is beneficial for subway engineers looking to propose effective strategies for reducing subway incident delays, especially in super-large-sized cities with huge public travel demand.
© 2015 Elsevier Ltd. All rights reserved.
1. Introduction
Subway transportation is an efficient and green transportation mode that can produce low carbon emissions. In general, it plays a vital role in the urban public transportation service. However, breakdowns of subway services caused by system component failures (e.g., power failures), which are defined as subway incidents, can cause great disruption to commuters even though their likelihood of occurrence is low. From the commuter’s perspective, the subway incident delay is the major consequence resulting from a subway incident, measured as the difference between the scheduled and actual subway train departure times. Note that the delay is not the only factor, but is a major factor affecting commuters’ trips. For example, the cancellation of trains may sometimes influence the frequency of trains. However, this is a rare event because subway companies typically have a few spare or reserve trains in case there is an emergency. In fact, the subway delay is able to capture the exposure of train cancellations. For example, the subway delay that results from a train cancellation is generally larger than that when there is no train cancellation. It is thus necessary for public transportation authorities to implement effective management strategies for clearing subway incidents as quickly as possible. The quick clearance of a subway incident usually requires an efficient allocation of resources so as to dispatch a crew in a timely manner. In an attempt to achieve this objective, there is a critical need to develop a model to comprehensively explore the influencing factors and predict the delays caused by subway incidents (Chung, 2010).
⇑ Correspondence to: Dr. Jinxian Weng, Professor, Shanghai Maritime University, China.
E-mail addresses: [email protected], [email protected] (J. Weng).
http://dx.doi.org/10.1016/j.trc.2015.06.003
0968-090X/© 2015 Elsevier Ltd. All rights reserved.
J. Weng et al. / Transportation Research Part C 57 (2015) 30–41
So far, many studies have been conducted on the analysis of the causes of subway incidents. For example, fire may be a disaster in an underground subway system. Based on the archived subway fire incident data, Cheng et al. (2001) investigated the major causes of fire accidents and then provided some suggestions to reduce the occurrence likelihood of subway fire incidents. Some researchers have also examined the major causes of terrorist attacks (Staten, 1997; Okumura et al., 2005; Mumpower et al., 2013) and drivers’ mental state (Mishara, 1999). In general, subway incidents might be induced by failures of multiple factors including the power system, subway vehicles, ventilation and smoke exhaust systems, the water supply and drainage system, and the communication and signal system (He et al., 2005). Providing information to the public about delays caused by subway incidents could alert commuters to the necessity of rescheduling their trips. Therefore, it is of the utmost importance to accurately predict subway incident delays. To date, various parametric models, such as the accelerated failure time (AFT) models (Chung et al., 2010; Tavassoli et al., 2013), have been developed to estimate freeway incident delay. However, parametric models have predetermined functional forms and assumptions that are not usually met in the case of subway incidents, especially given the large number of influencing factors. In order to eliminate the drawbacks of parametric models, many researchers (e.g., Wei and Lee, 2007; Ma and Kockelman, 2008) have proposed nonparametric models for estimating subway/freeway incident delay. However, it should be noted that these nonparametric models make it difficult to examine the marginal effect of influencing factors on the incident delay, information that is useful for subway staff in establishing priorities for reducing subway incident delays.
Therefore, this study aims to develop a maximum likelihood regression tree-based model to predict the delays caused by subway incidents. Each terminal node of the regression tree is assigned a single AFT model to describe the distribution of the subway incident delay. The contribution of this study is twofold. First, it makes an initial attempt to build a model that will compensate for the weak points of parametric AFT models. Unlike the AFT models, our developed model is able to account for the heterogeneity effect as well as avoid the over-fitting problem. Second, the developed model could help subway staff to quickly implement the most effective strategies for reducing subway incident delays, especially in the super-large-sized cities with huge public travel demand.
2. Literature review
To date, various parametric models have been proposed for the analysis of traffic incident durations. Among these parametric models, the AFT models have been the most widely used in previous studies (Haque and Washington, 2015). For example, Giuliano (1989) developed a lognormal distributed AFT model to determine the traffic incident duration. Chung (2010) found that the log-logistic AFT model fitted the freeway incident duration best. Other parametric models, including time sequential models (Khattak et al., 1995), Bayesian models (Ma and Kockelman, 2008; Kim and Chang, 2012), linear regression models (Valenti et al., 2010), and the mechanism-based approach (Hou et al., 2013) have also been developed for the analysis of traffic incidents. However, it should be pointed out that parametric models have predetermined functional forms and assumptions that are sometimes violated in reality. Thus, many nonparametric models, such as fuzzy logic models (Kim and Choi, 2001), artificial neural networks (Wang et al., 2005; Wei and Lee, 2007), tree-style classification models (Smith and Smith, 2001; Ma and Kockelman, 2008), and the text analysis approach (Pereira et al., 2013), have been proposed in an attempt to eliminate the drawbacks of parametric models. Ferdous et al. (2011) discussed the event tree and fault tree methods. Fuzzy set theory and evidence theory were used to describe the uncertainties of the inputs associated with the event occurrence likelihoods so as to mitigate the impact of the assumption that all events are independent. Because of their simplicity and high prediction accuracy, many tree-based models have been developed for analyzing accident injury severity. Nevertheless, it is difficult to interpret the marginal effect of influencing factors using nonparametric models, whereas it is a key step for the transportation authority toward establishing priorities for shortening subway incident delays.
Some other researchers have employed least square tree methods (Quinlan, 1992) to analyze accident injury severity. Kuhnert et al. (2000) utilized multivariate adaptive regression splines (MARS), classification and regression trees (CART), and logistic regression to analyze motor vehicle injury data. Yan et al. (2010) adopted the hierarchical tree-based regression (HTBR) method to investigate train–vehicle crashes at highway–rail grade crossings. Weng et al. (2013) presented a tree-based logistic regression approach to assess work zone casualty risk. Their results showed that the proposed approach outperformed the decision tree approach and the logistic regression approach. However, one shortcoming of the least square tree approach is that it cannot be applied to analyze a dependent variable with a large variance. To overcome this shortcoming, Torgo (2000) built a least absolute deviation regression tree as an extension of the least square regression tree. It was found that the built tree can alleviate the effect of large variance to some extent. Moreover, it performed well at handling data that followed a skewed distribution. Nevertheless, both least absolute deviation regression trees and least square regression trees have poor stability. Hence, some researchers (e.g., Su, 2002; Mohamed et al., 2013) have proposed a maximum likelihood regression tree and found that it has a rigorous mathematical justification and better tree stability than the least square regression tree. In summary, the existing literature clearly shows that both parametric AFT approaches and tree-based methods are available for traffic safety analysis. However, these approaches cannot be applied to predict subway incident delay for the following reasons. First, the parametric AFT approaches may yield biased and implausible results even though they can be used to interpret the marginal effects of influencing factors on subway incident delay. This is because AFT approaches cannot explain
the fact that some influencing factors may present various or opposing effects under different circumstances. Second, there is a big uncertainty (variance) associated with subway incident delay, whereas the least square tree methods can only be applied to analyze dependent variables with small variances. Although the maximum likelihood regression tree approach is a better choice for describing subway incident delay, the existing approach still needs to be improved because in the past a constant mean has been assigned to each terminal node. However, the mean subway incident delay is affected by various external factors. In order to produce better model performance (i.e., goodness-of-fit), each terminal node in the regression tree should be assigned a parametric AFT model.

3. Methodology

3.1. Maximum likelihood regression tree-based model
In decision analysis, a decision tree can be used to visually and explicitly represent decisions and decision making. In a decision tree, the root node that contains the entire data sample is at the top and the terminal nodes are at the bottom. In general, decision trees can be divided into two types, namely classification trees and regression trees. Since the subway incident delay is a continuous target variable, the developed tree in this study is actually a regression tree. In a regression tree, the tree grows as a result of the testing of partitioning data at the parent nodes. The outgoing branches of a parent node correspond to all the possible outcomes of the test at the node. To figure out the node to which a data point belongs, one can start at the root node of the tree and trace a path down the tree according to the features of the data point. In addition, each terminal node will be assigned an AFT model to describe the distribution of subway incident delays.
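The root-to-terminal-node tracing described above can be sketched in a few lines. This is a minimal illustration only: the `Node` class, the `route` function and the toy variable and branch names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    """A tree node (hypothetical structure, for illustration only)."""
    name: str
    split_var: Optional[str] = None               # None marks a terminal node
    children: Optional[Dict[str, "Node"]] = None  # branch label -> child node


def route(node: Node, record: Dict[str, str]) -> str:
    """Start at the given node and follow matching branches to a terminal node."""
    while node.split_var is not None:
        node = node.children[record[node.split_var]]
    return node.name


# Toy two-level tree; the variable and label names are illustrative assumptions.
tree = Node("root", "door", {
    "DoorFail": Node("terminal A"),
    "NoDoorFail": Node("node 1", "power", {
        "PowFail": Node("terminal B"),
        "NoPowFail": Node("terminal C"),
    }),
})

print(route(tree, {"door": "NoDoorFail", "power": "PowFail"}))
```

In a model of the kind described here, the terminal node reached by a record would then supply the AFT model used to describe that record's delay distribution.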
Similarly to previous studies (Golob et al., 1987; Giuliano, 1989), the lognormal distributed AFT model is used in this study and the probability density function can be expressed by
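$$f(y; \mu, \sigma) = \frac{1}{y\sqrt{2\pi}\,\sigma}\exp\left[-\frac{1}{2}\left(\frac{\ln y - \mu}{\sigma}\right)^{2}\right] \qquad (1)$$

$$\mu = X\beta + \varepsilon \qquad (2)$$

$$\varepsilon \sim N(0, \sigma^{2}) \qquad (3)$$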
where $\mu$ is the mean of the logarithm of the subway incident delay, $\sigma$ is the standard deviation of the logarithm of the subway incident delay, $X = (x_1, x_2, \ldots, x_n)$ is a vector of explanatory variables with respect to the factors influencing the subway incident delay, $\varepsilon$ is a random error term and $\beta = (\beta_0, \beta_1, \ldots, \beta_n)^T$ is a vector of coefficients that can be calibrated using the maximum likelihood estimation technique.

3.2. Construction of maximum likelihood tree structure
A two-step procedure is implemented to construct the structure of the maximum likelihood regression tree, namely tree growing and tree pruning. The tree construction procedures are graphically depicted in Fig. 1 and discussed in detail in the following sections.

3.2.1. Tree growing
The principle behind growing a maximum likelihood regression tree is to recursively partition the target variable so that the data in the descendant nodes always have bigger likelihoods than the data in the parent node. When the training data enter the root node of a maximum likelihood tree, a test is performed to search for all possible splits among all variables using a maximum likelihood splitting (MLS) algorithm. Here, the MLS algorithm is a greedy search algorithm that aims to maximize the log-likelihood of the tree-based model by splitting the data in a parent node into several subgroups. Based on the MLS algorithm, the following five steps are implemented to determine which variable can be used to best split a parent node.
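As a concrete reference for the steps that follow, the greedy search can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' implementation: each node is given an intercept-only lognormal model (no covariates), every split is a binary "one level versus the rest" partition of a categorical variable, and the function names, the `min_size` guard and the toy data are all hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple


def node_loglik(delays: List[float]) -> float:
    """Maximised lognormal log-likelihood of one node (intercept-only mu)."""
    logs = [math.log(y) for y in delays]
    n = len(logs)
    mu = sum(logs) / n                                       # MLE of mu
    var = max(sum((v - mu) ** 2 for v in logs) / n, 1e-12)   # MLE of sigma^2, guarded
    return sum(
        -v - 0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
        for v in logs
    )


def best_split(
    rows: List[Dict[str, str]], delays: List[float], min_size: int = 2
) -> Tuple[float, Optional[str], Optional[str]]:
    """Search all variables and levels; return (gain, variable, level)."""
    parent = node_loglik(delays)
    best: Tuple[float, Optional[str], Optional[str]] = (0.0, None, None)
    for var in rows[0]:
        for level in sorted({r[var] for r in rows}):
            left = [y for r, y in zip(rows, delays) if r[var] == level]
            right = [y for r, y in zip(rows, delays) if r[var] != level]
            if len(left) < min_size or len(right) < min_size:
                continue
            # Log-likelihood increment of the candidate split
            gain = node_loglik(left) + node_loglik(right) - parent
            if gain > best[0]:
                best = (gain, var, level)
    return best


# Toy data (invented for illustration): door failures give short, tight delays.
rows = [
    {"door": "DoorFail", "peak": "Peak"},
    {"door": "DoorFail", "peak": "Offpeak"},
    {"door": "DoorFail", "peak": "Peak"},
    {"door": "NoDoorFail", "peak": "Offpeak"},
    {"door": "NoDoorFail", "peak": "Peak"},
    {"door": "NoDoorFail", "peak": "Offpeak"},
]
delays = [9.0, 10.0, 9.5, 30.0, 60.0, 18.0]
gain, variable, level = best_split(rows, delays)
print(variable, level, round(gain, 2))
```

A node for which no candidate split yields a positive gain would be returned as `(0.0, None, None)`, which corresponds to declaring it a terminal node.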
Step 1. For a given node k, we first determine the parameters $(\hat{\mu}_k, \hat{\sigma}_k)$ of the AFT model shown in Eqs. (1)–(3) using maximum likelihood estimation techniques. The maximum log-likelihood for the node k, denoted by LL(k), is calculated by

$$LL(k) = \sum_{i=1}^{n_k} \ln f(y_i \mid \hat{\mu}_k, \hat{\sigma}_k) \qquad (4)$$
where $n_k$ is the number of observations in node k, and $y_i$ is the subway incident delay of the ith observation.

Fig. 1. A flowchart for constructing a maximum likelihood regression tree. [Flowchart with two stages: tree growing on the training data, followed by tree pruning on the validation data.]

Step 2. For all possible splits in the values of an explanatory variable x, the maximum log-likelihood increment on the parent node k caused by a split s is computed by

$$\Delta LL(x, s, k) = LL(k_R) + LL(k_L) - LL(k) \qquad (5)$$
where $k_R$ and $k_L$ are the right and left child nodes of the parent node k, respectively. The split s can be used only if $\Delta LL(x, s, k) > 0$, which implies that the data in the descendent nodes are purer than those in the parent node. Through exhaustive searching of all possible splits of the explanatory variable x, the best split $s^*$ that causes the maximum increment in log-likelihood is finally selected. Namely,

$$\Delta LL(x, s^*, k) = \max_{s \in S} \Delta LL(x, s, k) \qquad (6)$$
where S represents the set of all possible splits for the variable x.

Step 3. The best splits for all explanatory variables are determined by repeating Step 2. Among these best splits, the split $s^*$ of the variable $x^*$ that causes the global maximum increment in log-likelihood can be determined by

$$\Delta LL(x^*, s^*, k) = \max_{x \in X} \Delta LL(x, s^*, k) \qquad (7)$$
where X represents the explanatory variable set.

Step 4. If $\Delta LL(x^*, s^*, k) \le 0$, the parent node k will be considered as the terminal node. Otherwise, we choose the variable $x^*$ and split $s^*$ to split the parent node k.

Step 5. Splitting stops when one of the following two stopping rules is satisfied: (a) none of the nodes at the bottom can be split further because $\Delta LL(x^*, s^*, k) \le 0$; (b) the current tree depth has reached the preset maximum tree depth. Otherwise, go to Step 1.

3.2.2. Tree pruning
Overly large trees may result in lower prediction accuracy on new data. In this situation, there is a critical need to remove those parts of branches that do not contribute to the prediction accuracy for new data. In this study, we employ a cost-complexity pruning approach to prune the grown tree. According to the study of Akaike (1974), we can use the Akaike information criterion (AIC) statistic to represent the cost complexity of a maximum likelihood regression tree T. Namely,

$$AIC(T) = -2LL(T) + 2(|\tilde{T}| + 1) \qquad (8)$$
where AIC(T) is the AIC statistic for the tree T, LL(T) is the maximum log-likelihood for the tree T, and $|\tilde{T}| + 1$ is the total number of parameters for the AFT models in the tree T. The small tree with the best performance can be selected by minimizing the AIC statistic. In general, the cost-complexity pruning procedure can be described as follows: start from the bottom of the tree and examine each node and subtree. If the replacement of this subtree with a terminal node will produce a lower AIC statistic for the new data, then prune the tree accordingly. In detail, the procedures are as follows:

Step P1. We start with the largest initial tree $T_j$. For any parent node h, we can determine the subtree that contains no descendent nodes associated with the parent node h, denoted by $T_j - T_h$, and then compute the corresponding AIC statistic (i.e., $AIC(T_j - T_h)$) using the validation data and Eq. (8).

Step P2. Using this method, we can search for the best parent node $h^*$ which produces the smallest $AIC(T_j - T_h)$ for the validation data, namely,

$$AIC(T_j - T_{h^*}) = \min_{h}\{AIC(T_j - T_h)\} \qquad (9)$$
Step P3. Let the optimal tree be $T^* = T_j - T_{h^*}$. If $AIC(T^*) \ge AIC(T_j)$, stop and output the final optimal tree. Otherwise, set the initial tree as $T_j = T_j - T_{h^*}$ and j = j + 1, and go to Step P1.

4. Case study

4.1. Subway incident data
We collected Hong Kong subway incident data between 2005 and 2009 for the case study analysis. These data were mainly drawn from the database published by the Legislative Council of Hong Kong (LegCo, 2013). A Google search was used as a second channel to obtain more information regarding incomplete Hong Kong subway incident data from various incident reports published online. One sample of the incident records collected was as follows: “A subway train of the East Rail Line collides with a man jumping from the Tai Po Market Station at 10:41 pm on the March 1 of 2005, which causes a subway incident delay of 18 minutes”. From the collected data, we were able to obtain the information regarding: (i) the subway incident occurrence time, (ii) the subway line, (iii) the causes of subway incident, and (iv) the subway incident delay. Two variables, namely the time of day and the day of the week, were used to represent the subway incident occurrence time in this study. The eight major causes of subway incidents are power failure, door failure, vehicle failure, emergency incident, signal failure, collision with falling objects/passengers, rail track failure and operational problems. More detailed explanations of these eight major causes are provided in Table 1. Note that some subway incidents were induced by multiple factors, such as a power failure together with a signal failure.

Table 1
Variable descriptions.
Variables | Descriptions | Values
The day of week | Weekends | 0 = Weekend
 | Weekdays | 1 = Weekday
The time of day | Off-peak hours of the day | 0 = Offpeak
 | Peak hours of the day | 1 = Peak
Power failure | No power related failure | 0 = NoPowFail
 | Power failure due to cable problems | 1 = PowCab
 | Power failure due to equipment problems | 2 = PowEqu
Door failure | No door related failure | 0 = NoDoorFail
 | Platform screen door or vehicle door failure | 1 = DoorFail
Vehicle failure | No subway vehicle related failure | 0 = NoVehFail
 | Subway vehicle failure | 1 = VehFail
Emergency incident | No emergency incidents occurred | 0 = NoEmergency
 | Emergency call is made by passengers | 1 = Emergency
Signal failure | No signal related failure | 0 = NoSigFail
 | Signal failure | 1 = SigFail
Collision with falling objects/passengers | No | 0 = No
 | Yes | 1 = Yes
Rail track failure | No rail track related failure | 0 = NoRailFail
 | Rail track turnout failure | 1 = Turnout
 | Rail track mechanical abrasion | 2 = MechAbr
Operational problem | No operation related problems | 0 = NoOperPro
 | Working staff operational problems | 1 = Operation
Delay | Subway incident delay | Continuous

It should also be pointed out that only Hong Kong mass transit
railway (MTR) subway incidents were recorded before 2007. Since the Kowloon–Canton Railway was merged into the MTR in December of 2007, subway incidents occurring on the railroad crossings of the original Kowloon–Canton Railway lines have also been reported to the Legislative Council of Hong Kong. A total of 1332 subway incident records were collected. Note that the delay in each subway incident was no less than 8 min because the Legislative Council of Hong Kong only requires the MTR to record subway incidents causing at least 8 min of delay. It should be pointed out that the collected subway incident records are trustworthy. The delay has a wide range (from 8 to 418 min), with a standard deviation (18.2 min) larger than the corresponding mean (14.6 min). Fig. 2 presents the distribution of subway incident records with respect to different variable values. Fig. 2(a) shows that subway incidents are more likely to occur on weekdays. Also, the occurrence frequency is higher during the off-peak periods. In order to build a maximum likelihood regression tree-based model, the collected data were divided into two groups: one for training and the other for validation. More specifically, 80% of the collected data (1066 subway incident records) were randomly selected for training and 20% of the data (266 subway incident records) were used for tree validation.

Fig. 2. Frequency distribution of subway incidents. [Bar charts of frequency by variable value: (a) the day of week, (b) the time of day, (c) power failure, (d) door failure, (e) vehicle failure, (f) emergency incidents, (g) signal failure, (h) collision with falling objects/passengers, (i) rail track failure, (j) operational problems.]

4.2. Results
The learning process of the maximum likelihood tree-based model using the 1066 training data records and 266 validation data records is depicted in Fig. 3. It can be seen that the negative log-likelihood for the training data monotonically
decreases with the number of terminal nodes. However, reducing the number of terminal nodes below 10 does not further decrease the AIC value for the validation data. Therefore, the maximum likelihood tree comprising 10 terminal nodes is considered the optimal tree, as shown in Fig. 4. The interpretation of the maximum likelihood tree is straightforward. With the training data, the initial split at the root node is based on the door failure that causes the largest increment in the log-likelihood. The fact that the largest log-likelihood increment is caused by the door failure can be attributed to the fact that the delays of subway incidents involving door failures are tightly concentrated, whereas non-door-failure incidents have a large standard deviation of delay ($\sigma$ = 3.03 min for door failures vs. $\sigma$ = 15.50 min for non-door failures). The tree directs the door failure attribute “DoorFail” to the left forming Terminal Node 1, and the attribute “NoDoorFail” to the right forming Node 1. The maximum likelihood splitting algorithm then continues splitting Node 1 based on the power failure into two groups (Nodes 2 and 3). Node 3 is further split based on the factors of collision with falling objects/passengers, vehicle failure, operational problems, the time of day, and signal failure. The final maximum likelihood tree structure is depicted in Fig. 4.

As mentioned above, each terminal node should be assigned its own AFT model. Table 2 gives the AFT model parameters, which have been calibrated using the maximum likelihood estimation method for each terminal node. Since not all of the factors exhibit significant effects on incident delays at each node, the backward elimination method, in which variables are tested for removal from the model one by one based on the significance level of the likelihood ratio, was used to choose the variables for the AFT model of a given terminal node.
Initially, the AFT model contained all 10 explanatory variables. The least significant variable with a significance level >0.10 was repeatedly removed. After each removal, the remaining variables were tested again until all variables kept in the model were statistically significant at the level of 0.10. It can be seen from Table 2 that the mean subway incident delay in Terminal Node 1 is only affected by vehicle failure. For Terminal Nodes 3 and 7, the day of the week shows negative effects on the mean subway incident delay. The subway delay in Terminal Node 3 is also influenced by operational problems. The subway incident delays in Terminal Nodes 4 and 6 are negatively affected by the time of day. Using the same training data, we also developed 10 traditional AFT models with different distribution types such as log-logistic, normal, gamma, lognormal and Weibull, considering fixed and random effects. Table 3 gives the statistics of the best-fitting AFT models for the different distribution types. It can clearly be seen that our maximum likelihood regression tree-based model (MLRT) provides the smallest AIC statistic. Although the log-logistic AFT model with random parameters has a slightly higher log-likelihood value than the MLRT model, it may be over-fitted. In this study, the AIC pruning criterion is selected for the MLRT model so that the over-fitting problem can be avoided. Table 4 compares the goodness-of-fits of the validation data for different models. It can clearly be seen that the MLRT provides a higher log-likelihood value but a smaller AIC statistic for the validation data, compared with the log-logistic AFT model with random parameters. In addition, two measures including the mean absolute percentage error (MAPE) and root mean square error (RMSE) are further used to compare the accuracy of predicted values for the validation data from the MLRT model and other models. 
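The two accuracy measures named above follow their usual definitions (mean absolute percentage error and root mean square error of predicted versus observed delays). A minimal sketch is given below; the function names and the three-point example are illustrative assumptions and are not data from the study.

```python
import math
from typing import Sequence


def mape(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, expressed in percent."""
    n = len(observed)
    return 100.0 * sum(abs(p - y) / y for y, p in zip(observed, predicted)) / n


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((p - y) ** 2 for y, p in zip(observed, predicted)) / n)


# Invented delays in minutes, for illustration only.
observed = [10.0, 20.0, 40.0]
predicted = [12.0, 18.0, 40.0]
print(mape(observed, predicted), rmse(observed, predicted))
```

MAPE weights each error relative to the observed delay, so it is sensitive to misses on short delays, whereas RMSE is dominated by large absolute misses on long delays; reporting both gives a fuller picture of prediction accuracy.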
It can be clearly seen from Table 4 that both the RMSE and the MAPE from the MLRT model are smaller than those from the other models. This implies that the MLRT performs better on the new datasets than other models, which confirms the superiority of the MLRT model in predicting subway incident delays. It should be pointed out that our MLRT model also performs better than other non-parametric models (e.g., artificial neural network and least square tree) in terms of prediction accuracy. For example, the RMSE of the predicted values for validation data are 26.9 from the artificial neural network and 26.3 from the least square tree, which are larger than that from our MLRT model (4.9).

Fig. 3. Learning process of the maximum likelihood tree-based model. [Negative log-likelihood of the training data (left axis) and AIC of the validation data (right axis) plotted against the number of terminal nodes, 1 to 13.]

4.3. Discussions

4.3.1. Impact analysis
The graphical display of the tree-based model in Fig. 4 makes it easier to understand the results. Using the tree-based model, we can easily examine the effects of influencing factors on the subway incident delay.
Fig. 4. Maximum likelihood regression tree for subway incidents. [Tree diagram: each node reports the number of records N, the log-likelihood LL and the mean delay in minutes; each terminal node also shows the fitted pdf of the subway incident delay over 8 to 24 min.]
Root Node (N=1066, LL=-675.8, Mean=14.3), split on door failure
  DoorFail: Terminal Node 1 (N=179, LL=7.6, Mean=10.2)
  NoDoorFail: Node 1 (N=887, LL=-621.5, Mean=15.1), split on power failure
    Power failure: Node 2 (N=90, LL=-95.6, Mean=23.4)
      PowCab: Terminal Node 2 (N=24, LL=-32.1, Mean=38.1)
      PowEqu: Terminal Node 3 (N=66, LL=-56.4, Mean=18)
    NoPowFail: Node 3 (N=797, LL=-497.9, Mean=14.2), split on collision with falling objects/passengers
      Yes: Terminal Node 4 (N=159, LL=-129.8, Mean=17.6)
      No: Node 4 (N=638, LL=-354.5, Mean=13.3), split on vehicle failure
        VehFail: Terminal Node 5 (N=204, LL=-138.1, Mean=13.6)
        NoVehFail: Node 5 (N=434, LL=-210, Mean=13.2), split on operational problems
          Operation: Terminal Node 6 (N=150, LL=-75.6, Mean=12.6)
          NoOperPro: Node 6 (N=284, LL=-128.7, Mean=13.5), split on emergency incidents
            Emergency: Terminal Node 7 (N=24, LL=-13.3, Mean=9.3)
            NoEmergency: Node 7 (N=260, LL=-127.4, Mean=13.9), split on the time of day
              Offpeak: Node 8 (N=160, LL=-75.8, Mean=13.6), split on signal failure
                NoSigFail: Terminal Node 8 (N=80, LL=-26.3, Mean=13.1)
                SigFail: Terminal Node 9 (N=80, LL=-46.8, Mean=14.1)
              Peak: Terminal Node 10 (N=100, LL=-50.3, Mean=14.3)
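As a worked check of Eq. (5), the log-likelihood increments of the first two splits can be recomputed from the node statistics printed in Fig. 4. This assumes that the LL value shown at each node is the node log-likelihood LL(k) of Eq. (4); the subscripts below are shorthand introduced here, not notation from the paper.

$$\Delta LL_{\text{Root}} = LL(\text{Terminal Node 1}) + LL(\text{Node 1}) - LL(\text{Root}) = 7.6 + (-621.5) - (-675.8) = 61.9$$

$$\Delta LL_{\text{Node 1}} = LL(\text{Node 2}) + LL(\text{Node 3}) - LL(\text{Node 1}) = -95.6 + (-497.9) - (-621.5) = 28.0$$

Both increments are positive, so each of these splits satisfies the criterion $\Delta LL > 0$ required in Step 2 of the tree-growing procedure.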
Table 2 shows that vehicle failure has a positive coefficient in the AFT model for Terminal Node 1. This suggests that, for subway incidents involving door failures, the additional presence of vehicle failure could increase the subway incident delay by 93.9% (=e0.6623 1). For non-door-failure incidents, power failure has the biggest influence on the subway incident delay. Fig. 4 shows that the observed mean of the subway delays caused by power failures is 23.4 min, which is 64.8% longer than that of delays not caused by power failures. The wide range in the subway delays caused by incidents involving power failures could be attributed to the fact that power failures usually induce other subway exponent failures such as vehicle failures and signal failures (Baysari et al., 2009). Obviously, it will take much longer to recover the subway service when a subway incident is caused by multiple exponent failures. Another possible reason might be that the causes of power failures (e.g.,
38
J. Weng et al. / Transportation Research Part C 57 (2015) 30–41
Table 2 AFT model in each terminal node. Terminal node #
AFT models
1 2 3 4 5 6 7 8 9 10
l
r
l = 2.2821 + 0.6623 vehicle failure l = 2.7642 + 1.8350 operational problem l = 2.7843 0.1876 the day of week + 0.7736 operational problem l = 2.691 0.1214 the time of day + 0.3909 operational problem l = 2.4251 l = 2.442 0.1181 the time of day + 0.4198 rail track failure l = 2.3502 0.2126 the day of week l = 2.5127 l = 2.5306 l = 2.5652
0.2336 0.9505 0.5752 0.5529 0.4787 0.4018 0.1395 0.3374 0.4369 0.4132
Table 3 Comparisons of best-fitting models with different distribution types.

| Distribution type | Log-likelihood | AIC |
|---|---|---|
| Lognormal | -686.4 | 1396.8 |
| Log-logistic | -603.1 | 1230.2 |
| Weibull | -1049.7 | 2123.4 |
| Gamma | -669.2 | 1362.4 |
| Weibull with Gamma heterogeneity | -1117 | 2258 |
| Lognormal with random parameters | -591.9 | 1246.6 |
| Log-logistic with random parameters | -523.5 | 1095 |
| Weibull with random parameters | -911 | 1751 |
| Gamma with random parameters | -708 | 1352 |
| Our model | -534 | 1089 |
Table 4 Goodness-of-fits of validation data for different models.

| Model | Log-likelihood | AIC | MAPE (%) | RMSE |
|---|---|---|---|---|
| Lognormal | -134.8 | 361.2 | 32.8 | 8.8 |
| Log-logistic | -129.9 | 309.2 | 29.0 | 9.0 |
| Weibull | -189.7 | 502.2 | 61.3 | 10.8 |
| Gamma | – | – | – | – |
| Weibull with Gamma heterogeneity | -264.6 | 551.1 | 41.5 | 8.7 |
| Lognormal with random parameters | -169.6 | 291.6 | 32.0 | 8.8 |
| Log-logistic with random parameters | -143.6 | 281.8 | 29.0 | 8.9 |
| Weibull with random parameters | -240.1 | 401.4 | 53.5 | 9.1 |
| Gamma with random parameters | – | – | – | – |
| Our model | -118 | 257 | 27.2 | 4.9 |

– No convergence.
$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n} \frac{|y_i - \hat{y}_i|}{y_i}$; $\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n}}$, where $n$ is the number of datasets, $y_i$ is the observed value for the $i$th dataset, and $\hat{y}_i$ is the estimated value for the $i$th dataset.
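The two error measures defined in the Table 4 footnote can be sketched in a few lines of Python. This is only an illustration: the function names and the sample delays are hypothetical and are not taken from the Hong Kong data, and the MAPE is multiplied by 100 on the assumption that Table 4 reports it as a percentage.

```python
import math


def mape(observed, estimated):
    """Mean absolute percentage error, returned in percent (assumed scaling)."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(y - y_hat) / y for y, y_hat in zip(observed, estimated))
    return 100.0 * total / len(observed)


def rmse(observed, estimated):
    """Root mean squared error, in the same unit as the inputs."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((y_hat - y) ** 2 for y, y_hat in zip(observed, estimated))
    return math.sqrt(total / len(observed))


# Hypothetical observed and estimated delays in minutes (illustration only).
observed_delays = [10.0, 20.0, 40.0]
estimated_delays = [12.0, 18.0, 40.0]

print(f"MAPE = {mape(observed_delays, estimated_delays):.1f}%")
print(f"RMSE = {rmse(observed_delays, estimated_delays):.2f} min")
```

Because the MAPE divides each error by the observed delay, it weights errors on short delays more heavily than the RMSE does, which is one reason the two measures can rank models differently.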
power cable disconnections and fractures) are not easy to detect (Reis, 2006), which will cause a longer delay. In addition, Table 2 shows that the subway incident delay will increase significantly (by 527% = $e^{1.835} - 1$) when subway incidents involving power failures also involve operational problems. Similarly to the case of power failure, collisions with falling objects or passengers also play an important role in increasing subway incident delays. By comparing Node 4 and Terminal Node 4, we can see that the mean of observed delays caused by collision-related subway incidents is 32.3% longer than that for incidents that do not involve collisions. Many previous studies (e.g., Aberg, 1988; Wigglesworth and Uber, 1991) have found that the installation of active protection systems such as flashing lights, barrier booms, gates and constant-warning-time displays can greatly reduce the likelihood of collisions with falling objects or passengers. For subway incidents involving operational problems, the presence of rail track failures can cause a significant increase in the subway incident delay. From Table 2, it can be seen that a rail track turnout communication disruption can result in a 52.2% (= $e^{0.4198} - 1$) longer subway delay compared to incidents not involving rail track failures. This result is consistent with the findings of Pender et al. (2012). The comparison of Terminal Nodes 8 and 9 shows that subway incidents involving signal failures cause an incident delay of 14.1 min on average, which is about 7.6% longer than that caused by incidents that do not involve signal failures. The time of day and the day of the week also influence the subway incident delay. As shown in Table 2, the negative signs of the coefficients associated with the day of the week in the AFT models for Terminal Nodes 3 and 7 show that the subway incident delays on weekdays are generally shorter than those during weekends.
The subway incident delays during peak periods are also shorter by 5.2% on average, compared with off-peak periods. The shorter delays associated with weekdays and peak periods can be explained by the fact that more staff and resources are put into the subway system, and the incident clearance efficiency is thus higher at these times.
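The percentage effects quoted in this discussion follow from the AFT coefficients through the relation exp(coefficient) - 1. A minimal Python sketch of that arithmetic is given below; the helper name `percent_change` is hypothetical, and the coefficients are the ones cited from Table 2.

```python
import math


def percent_change(coefficient):
    """Percentage change in delay when a 0/1 indicator switches from 0 to 1.

    In a log-linear AFT model the indicator multiplies the delay by
    exp(coefficient), so the relative change is exp(coefficient) - 1.
    """
    return (math.exp(coefficient) - 1.0) * 100.0


# Coefficients cited in the text (Table 2).
cited = [
    ("vehicle failure, Terminal Node 1", 0.6623),
    ("operational problem, Terminal Node 2", 1.8350),
    ("rail track failure, Terminal Node 6", 0.4198),
]

for label, coefficient in cited:
    print(f"{label}: {percent_change(coefficient):+.1f}%")
```

A negative coefficient, such as those for the day of the week in Terminal Nodes 3 and 7, gives a negative percentage under the same formula, that is, a shorter delay.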
4.3.2. Merits of the MLRT model

Compared with the existing parametric AFT models and tree-based regression models, our MLRT model is considered superior in the following aspects.

From the viewpoint of practical applications, the MLRT model has the merit of ease of use. Using the developed MLRT model, subway staff could easily determine the distribution of subway delays by tracing a path down the tree to a terminal node according to the characteristics of the subway incident. The graphical display of the MLRT model could help subway staff to make quick and reasonable emergency response plans.

From the methodological viewpoint, the developed MLRT model compensates for the weak points of the previous tree-based regression models for predicting accident durations/delays. In the past, a simple sample mean value was assigned to each terminal node of the tree-based model. However, the mean subway incident delay might be significantly affected by various external factors. In order to account for possible effects of external factors, our MLRT model assigns an AFT model instead of the sample mean value to each terminal node.

Theoretically, the developed MLRT model also overcomes the shortcomings of traditional AFT models. One fundamental assumption of AFT models is that all explanatory variables are independent of one another. In addition, the effect of any specific explanatory variable in the AFT model is assumed to be the same under different circumstances. Obviously, these two assumptions are inconsistent with actual cases. In fact, the influencing factors may exhibit various effects or hidden effects on the subway delay. For example, Table 2 shows that the coefficients of operational problems in the AFT models for Terminal Nodes 2 and 4 are 1.8350 and 0.3909, respectively. This implies that the presence of operational problems could have a bigger effect on incidents that also involve power cable failures than on incidents that involve no power failure.
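The idea of attaching a separate AFT model to each terminal node can be illustrated with a short Python sketch. It is not the authors' implementation. It assumes the usual lognormal AFT form, ln(delay) ~ Normal(μ, σ²), so that the median is exp(μ) and the mean is exp(μ + σ²/2). The class name, the dictionary keys and the routing function are hypothetical, and only the signal-failure split that is legible in Fig. 4 is encoded. The coefficients are copied from Table 2.

```python
import math
from dataclasses import dataclass, field


@dataclass
class LognormalAFT:
    """Lognormal AFT model: ln(delay) ~ Normal(mu, sigma^2), mu linear in indicators."""
    intercept: float
    sigma: float
    coefs: dict = field(default_factory=dict)

    def mu(self, incident):
        # Indicators missing from the incident record are treated as 0.
        return self.intercept + sum(
            beta * incident.get(name, 0) for name, beta in self.coefs.items()
        )

    def median(self, incident):
        return math.exp(self.mu(incident))

    def mean(self, incident):
        return math.exp(self.mu(incident) + self.sigma ** 2 / 2)


# Coefficients from Table 2; the key names and the 0/1 coding are assumptions.
TERMINAL_MODELS = {
    2: LognormalAFT(2.7642, 0.9505, {"operational_problem": 1.8350}),
    4: LognormalAFT(2.691, 0.5529,
                    {"time_of_day": -0.1214, "operational_problem": 0.3909}),
    8: LognormalAFT(2.5127, 0.3374),
    9: LognormalAFT(2.5306, 0.4369),
}


def route_signal_split(incident):
    """Hypothetical routing for the one split legible in Fig. 4 (Node 8)."""
    return 9 if incident.get("signal_failure", 0) else 8


incident = {"signal_failure": 1}
node = route_signal_split(incident)
print(f"Terminal Node {node}: model mean = "
      f"{TERMINAL_MODELS[node].mean(incident):.1f} min")

# The same indicator has a node-specific effect on the median delay.
for node_id in (2, 4):
    model = TERMINAL_MODELS[node_id]
    ratio = model.median({"operational_problem": 1}) / model.median({})
    print(f"Terminal Node {node_id}: operational problem multiplies "
          f"the median delay by {ratio:.2f}")
```

Under these assumptions the model-implied means for Terminal Nodes 8 and 9 come out close to the observed means shown in Fig. 4, and the operational-problem multiplier differs markedly between Terminal Nodes 2 and 4, which is the heterogeneity a single pooled AFT model cannot express.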
However, the effects of operational problems are treated as being the same in the traditional AFT models. Hence, the developed MLRT model provides a better goodness-of-fit (i.e., a smaller AIC value) than the AFT models for both the training data and the validation data, as shown in Tables 3 and 4. Unlike the traditional AFT models, AFT models with random parameters are partially able to account for the heterogeneity problem. However, these models may result in an over-fitting problem. The developed MLRT model avoids the over-fitting problem because it employs the AIC criterion to prune the tree. Therefore, the developed MLRT model outperforms the AFT models with random parameters because the former produces a smaller AIC value, as shown in Table 4.

4.3.3. Limitations of the MLRT model

It should be pointed out that the development of the MLRT model requires a large data sample. This is because, firstly, the data sample needs to be divided into training and validation subsamples to determine the tree structure. Secondly, each terminal node in the MLRT model requires adequate data so that the AFT model for each terminal node can be developed. Insufficient data will give rise to biased or incorrect results. More specifically, the MLRT model will change significantly if a small number of additional records are added to an insufficient data sample. In reality, the subway incident data may sometimes be of poor quality because some subway incidents are recorded incorrectly. Theoretically, poor data quality may result in low model performance and biased results. In general, the model
[Fig. 5 plot: AIC (vertical axis, 200 to 1600) against the proportion of incorrect/inaccurate data (horizontal axis, 0% to 16%), with separate curves for the training data and the validation data.]
Fig. 5. Effects of inaccurate/incorrect data on the MLRT model performance.
stability may be affected by the proportion of inaccurate/incorrect data. Assuming all the collected subway incident records are correct, we can evaluate the effects of incorrect/inaccurate data by randomly changing the attributes of a subset of the subway incident records. Fig. 5 shows the effects of poor quality data on the MLRT model performance. It can be seen that the AIC would increase from 1089 to 1397 for the training data and from 257 to 384 for the validation data if a small proportion of inaccurate/incorrect data (e.g., 15% of the data) were used for the model development. The relatively big change in the AIC might be because the tree structure of the MLRT model was altered significantly. The number of terminal nodes can be reduced from 10 to 2 when 15% of the data are inaccurate or incorrect. Hence, it is essential to make sure that subway incidents are recorded completely and correctly.

5. Conclusions

From the commuter’s perspective, the delay is the main negative impact of a subway incident. If a subway incident delay is too long, it may attract serious complaints from the public. It is of utmost importance to develop a model to accurately predict subway incident delay because publishing such information could alert commuters to the necessity of rescheduling their trips. Therefore, we developed a maximum likelihood regression tree-based model to predict the subway incident delay, and to examine the effects of possible influencing factors. A case study was created using Hong Kong subway incident data between 2005 and 2009 to train and validate the developed MLRT model. Finally, a tree comprising 10 terminal nodes was selected to predict subway incident delays. We also built a single lognormal AFT model for each terminal node.
For the purpose of comparison, we also developed ten widely used AFT models, including log-logistic, normal, gamma, lognormal and Weibull AFT models, considering both fixed and random effects using the same training data. The results of the comparison show that our developed MLRT model produces a smaller AIC statistic than the traditional AFT models. This confirms the superiority of our developed model in predicting subway incident delays. Our model results clearly indicate that the presence of a power failure can significantly increase the subway incident delay. Longer subway delays are also associated with signal failures and collisions with falling objects/passengers. A subway incident occurring during the peak period will cause a slightly shorter delay than one occurring during off-peak periods. According to the model results, we can suggest some strategies. For example, additional power supply systems should be established to deal with incidents involving power failures (e.g., power cable failures, power-related equipment failures). Unlike the traditional AFT models, our MLRT model is able to account for the heterogeneity effect and provides a better goodness-of-fit in predicting subway incident delay. In addition, the MLRT model outperforms the AFT model with random parameters because the former can avoid the over-fitting problem owing to the use of the AIC-based pruning criterion. This demonstrates that the developed MLRT model is a good alternative for predicting subway incident delays. Although the developed MLRT model exhibits the ability to predict subway incident delays as well as examine the factors that influence subway delays, it does have the following two limitations. First, the model was developed based on Hong Kong subway incident data. The causes and impacts of subway incidents might be different in other countries. Further research will be conducted to collect more subway incident data from other cities so that a more generalized model can be developed.
Second, the lognormal distributed AFT model is assigned to each terminal node. Although this study has shown that our maximum likelihood tree incorporating lognormal distributed AFT models is already better than the traditional AFT models with fixed and random parameters, we will nevertheless investigate in the future whether the model’s performance can be improved by assigning AFT models with other distribution types to different terminal nodes.

Acknowledgements

The authors sincerely thank the editor and anonymous referees for their helpful comments and valuable suggestions, which considerably improved the exposition of this work. Special thanks are given to Ms. Yang Nan from the MTR for her advice and assistance in improving the paper quality. This study is supported by the National Natural Science Foundation of China (Grant Nos. 51308037, 51338008, 71210001).

References

Aberg, L., 1988. Driver behavior at flashing light, rail-highway crossings. Accid. Anal. Prev. 20, 59–65.
Akaike, H., 1974. A new look at the statistical model identification. IEEE Trans. Autom. Control 19 (6), 716–723.
Baysari, M.T., Caponecchia, C., McIntosh, A.S., Wilson, J.R., 2009. Classification of errors contributing to rail incidents and accidents: a comparison of two human error identification techniques. Saf. Sci. 47 (7), 948–957.
Cheng, L.H., Ueng, T.H., Liu, C.W., 2001. Simulation of ventilation and fire in the underground facilities. Fire Saf. J. 36 (6), 597–619.
Chung, Y., 2010. Development of an accident duration prediction model on the Korean Freeway Systems. Accid. Anal. Prev. 42 (1), 282–289.
Chung, Y., Walubita, L.F., Choi, K., 2010. Modeling accident duration and its mitigation strategies on South Korean Freeway Systems. Transport. Res. Rec.: J. Transport. Res. Board 2178 (1), 49–57.
Ferdous, R., Khan, F., Sadiq, R., Amyotte, P., Veitch, B., 2011. Fault and event tree analyses for process systems risk analysis: uncertainty handling formulations. Risk Anal. 31 (1), 86–107.
Giuliano, G., 1989. Incident characteristics, frequency, and duration on a high volume urban freeway. Transport. Res. Part A: Gen. 23 (5), 387–396.
Golob, T.F., Recker, W.W., Leonard, J.D., 1987. An analysis of the severity and incident duration of truck-involved freeway accidents. Accid. Anal. Prev. 19 (5), 375–395.
Haque, M.M., Washington, S., 2015. The impact of mobile phone distraction on the braking behavior of young drivers: a hazard-based duration model. Transport. Res. Part C 50, 13–27.
He, L., Zong, M., Deng, Y., 2005. Analysis on risk factors of urban subway. J. Saf. Sci. Technol. 1 (3), 33–42.
Hou, L., Lao, Y., Wang, Y., Zhang, Z., Zhang, Y., Li, Z., 2013. Modeling freeway incident response time: a mechanism-based approach. Transport. Res. Part C 28, 87–100.
Khattak, A.J., Schofer, J.L., Wang, M.H., 1995. A simple time sequential procedure for predicting freeway incident duration. IVHS J. 2 (2), 113–138.
Kim, W., Chang, G.L., 2012. Development of a hybrid prediction model for freeway incident duration, a case study in Maryland. Int. J. Intell. Transport. Syst. Res. 10 (1), 22–33.
Kim, H., Choi, H., 2001. A comparative analysis of incident service time on urban freeways. J. Int. Assoc. Traffic Saf. Sci. 25 (1), 62–72.
Kuhnert, P.M., Do, K.-A., McClure, R., 2000. Combining non-parametric models with logistic regression: an application to motor vehicle injury data. Comput. Stat. Data Anal. 34 (3), 371–386.
LegCo, 2013. Legislative Council of Hong Kong. (accessed 05.08.13).
Ma, J., Kockelman, K.M., Damien, P., 2008. A multivariate Poisson-lognormal regression model for prediction of crash counts by severity: using Bayesian methods. Accid. Anal. Prev. 40 (3), 964–975.
Mishara, B.L., 1999. Suicide in the Montreal subway system: characteristics of the victims, antecedents, and implications for prevention. Can. J. Psychiatry 44, 690–696.
Mohamed, M.G., Saunier, N., Miranda-Moreno, L.F., Ukkusuri, S.V., 2013. A clustering regression approach: a comprehensive injury severity analysis of pedestrian–vehicle crashes in New York, US and Montreal, Canada. Saf. Sci. 54, 27–37.
Mumpower, J.L., Shi, L., Stoutenborough, J.W., Vedlitz, A., 2013. Psychometric and demographic predictors of the perceived risk of terrorist threats and the willingness to pay for terrorism risk management programs. Risk Anal. 33 (10), 1802–1811.
Okumura, T., Hisaoka, T., Yamada, A., Naito, T., Isonuma, H., Okumura, S., Suzuki, K., 2005. The Tokyo subway sarin attack-lessons learned. Toxicol. Appl. Pharmacol. 207 (2), 471–476.
Pender, B., Currie, G., Delbosc, A., Shiwakoti, N., 2012. Planning for the unplanned: an international review of current approaches to service disruption management of railways. In: Australasian Transport Research Forum (ATRF), 35th, 2012, Perth, Western Australia, Australia.
Pereira, F.C., Rodrigues, F., Ben-Akiva, M., 2013. Text analysis in incident duration prediction. Transport. Res. Part C 37, 177–192.
Quinlan, J., 1992. Learning with continuous classes. In: Proceedings of the 5th Australian Joint Conference on Artificial Intelligence, Singapore, World Scientific, 1992.
Reis, M., 2006. DC traction power negative cable monitoring system. In: ASME/IEEE 2006 Joint Rail Conference, Atlanta, Georgia, USA, April 4–6, 2006.
Smith, K., Smith, B., 2001. Forecasting the Clearance Time of Freeway Accidents. Research Report STL-2001-01. Center for Transportation Studies, University of Virginia, Charlottesville, VA.
Staten, C.L., 1997. Emergency Response to Chemical/Biological Terrorist Incidents. Emergency Response & Research Institute.
Su, X., 2002. Maximum likelihood regression trees. In: ASA Proceedings of the Joint Statistical Meetings. American Statistical Association, pp. 3379–3383.
Tavassoli, H.A., Ferreira, L., Washington, S., Charles, P., 2013. Hazard based models for freeway traffic incident duration. Accid. Anal. Prev. 52, 171–181.
Torgo, L., 2000. Inductive learning of tree-based regression models. AI Commun. 13 (2), 137–138.
Valenti, G., Lelli, M., Cucina, D., 2010. A comparative study of models for the incident duration prediction. Eur. Transp. Res. Rev. 2 (2), 103–111.
Wang, W., Chen, H., Bell, M.C., 2005. Vehicle breakdown duration modelling. J. Transport. Stat. 8 (1), 75–84.
Wei, C.H., Lee, Y., 2007. Sequential forecast of incident duration using Artificial Neural Network models. Accid. Anal. Prev. 39 (5), 944–954.
Weng, J., Meng, Q., Wang, D., 2013. Tree-based logistic regression approach for work zone casualty risk assessment. Risk Anal. 33 (3), 1539–6924.
Wigglesworth, E.C., Uber, C.B., 1991. An evaluation of the railway level crossing boom barrier program in Victoria Australia. J. Saf. Res. 22, 133.
Yan, X., Richards, S., Su, X., 2010. Using hierarchical tree-based regression model to predict train–vehicle crashes at passive highway-rail grade crossings. Accid. Anal. Prev. 42 (1), 64–74.