Expert Systems with Applications 38 (2011) 10420–10424
Traffic safety forecasting method by particle swarm optimization and support vector machine
Ren Gang ⇑, Zhou Zhuping
School of Transportation, Southeast University, Nanjing 210096, China
Keywords: Support vector machine; Evaluation indexes; Traffic safety forecasting; Influencing factors
Abstract
Forecasting the development trend of traffic accidents from the traffic safety data of former years is an important basis for traffic safety planning decisions. To overcome the drawbacks of the BP neural network, a novel approach that combines particle swarm optimization and support vector machine (PSO–SVM) is presented for traffic safety forecasting. First, the influencing factors and evaluation indexes of traffic safety are analyzed; then a PSO–SVM traffic safety forecasting model is established from the influencing factors. Finally, traffic safety data for China from 1970 to 2006 are used to examine the forecasting ability of the proposed method. The experimental results show that traffic safety forecasting by PSO–SVM is better than that by BP neural network.
© 2011 Elsevier Ltd. All rights reserved.
1. Introduction
In recent years, with the improvement of road infrastructure in China and the growth of both the number of motor vehicles (mobile car retention quantity) and the road mileage in use (roadway track in use), the traffic accident rate has remained at a high level (Bener et al., 2008; Berg, Gregersen, & Laflamme, 2004; Hubacher & Allenbach, 2004). Forecasting the development trend of traffic accidents from the traffic safety data of former years is therefore an important basis for traffic safety planning decisions (Hayakawa, Fischbeck, & Fischhoff, 2000; Lee, Chung, & Son, 2008). Artificial neural networks, especially the BP neural network, are common traffic safety forecasting methods. However, the forecasting results of a BP neural network suffer from its inherent drawbacks, such as convergence to local optima (Sadeghi, 2000). Support vector machine (SVM) is an emerging machine learning method that is commonly applied to forecasting problems with small samples and non-linearity (Alenezi, Moses, & Trafalis, 2008; Rajasekaran et al., 2008). The main difficulty in constructing an SVM model is selecting its training parameter values, since inappropriate parameter settings lead to poor forecasting results. Because the parameters strongly influence the generalization performance of SVM, particle swarm optimization is applied to search for the parameters of SVM in the global space. The technique is derived from social behavior such as bird flocking and fish schooling, and can efficiently find optimal or near-optimal solutions in search spaces (Pedrycz, Park, & Pizzi, 2009; Salman, Ahmad, & Al-Madani, 2002; Tasgetiren, Liang, Sevkli, & Gencyilmaz, 2007). Thus, a novel approach that combines particle swarm optimization and support vector machine (PSO–SVM) is presented for traffic safety forecasting. First, the influencing factors and evaluation indexes of traffic safety are analyzed; then a PSO–SVM traffic safety forecasting model is established from the influencing factors. Traffic safety data for China from 1970 to 2006 are used to examine the forecasting ability of the proposed method: the data from 1970 to 1998 are used as the training data and the data from 1999 to 2006 as the testing data.

⇑ Corresponding author. E-mail address: [email protected] (R. Gang). doi:10.1016/j.eswa.2011.02.066

2. Support vector machine
Consider a set of data $\{(x_i, y_i)\}_{i=1}^{n}$, where $x_i$ is the input and $y_i$ is the corresponding target value (Jain, Rahman, & Kulkarni, 2007). The function $f(x)$ is represented as a linear function in the feature space:
$$ f(x) = w \cdot \varphi(x) + b \tag{1} $$
where $w$ is the weight vector and $b$ denotes the bias. The optimization problem solved by the SVM can be formulated as:
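$$
\begin{aligned}
\min \quad & \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \left(\xi_i + \xi_i^*\right) \\
\text{s.t.} \quad & y_i - w \cdot \varphi(x_i) - b \le \varepsilon + \xi_i, \\
& w \cdot \varphi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \\
& \xi_i \ge 0, \quad \xi_i^* \ge 0,
\end{aligned} \tag{2}
$$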
where $C > 0$ determines the trade-off between the empirical risk and the regularization term, and $\xi_i$ and $\xi_i^*$ are two non-negative slack variables that measure the deviation $(y_i - f(x_i))$ from the boundaries of the $\varepsilon$-insensitive zone. By constructing the Lagrange function, the dual problem can be given as:
$$
\begin{aligned}
\max \quad & -\frac{1}{2}\sum_{i,j=1}^{n} \left(\alpha_i - \alpha_i^*\right)\left(\alpha_j - \alpha_j^*\right) K(x_i, x_j) + \sum_{i=1}^{n} y_i \left(\alpha_i - \alpha_i^*\right) - \varepsilon \sum_{i=1}^{n} \left(\alpha_i + \alpha_i^*\right) \\
\text{s.t.} \quad & \sum_{i=1}^{n} \left(\alpha_i - \alpha_i^*\right) = 0, \\
& 0 \le \alpha_i, \ \alpha_i^* \le C,
\end{aligned} \tag{3}
$$
where $\alpha_i$ and $\alpha_i^*$ are the Lagrange multipliers obtained by solving the dual optimization problem, and $K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j)$ is called the kernel function. The most widely used kernel function is the radial basis function, where $\sigma$ is the width of the radial basis function. Finally, the regression function takes the following form:
$$ f(x) = \sum_{i=1}^{n} \left(\alpha_i - \alpha_i^*\right) K(x_i, x) + b. \tag{4} $$

3. Traffic safety forecasting by PSO–SVM
3.1. Establish the training sets
The traffic safety forecasting model is a non-linear forecasting model driven by the influencing factors. The evaluation indexes of traffic safety are mainly the number of incidents, the death toll and the number of injuries; the influencing factors are mainly roadway track in use, mobile car retention quantity, population size, passenger turnover volume and turnover volume of freight. The five influencing factors form the input vector of the forecasting model, and the three evaluation indexes are its output values. The training sets are established accordingly, as shown in Fig. 1.
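One way to picture these training sets is as pairs of a five-element input vector and a three-element output vector, one pair per year. The Python sketch below is only an illustration under stated assumptions: the names `YearRecord`, `split_by_year` and `to_xy` are hypothetical and do not come from the paper, the four rows are copied from Table 1, and the split year follows the 1970–1998 training and 1999–2006 testing periods given in the Introduction.

```python
from typing import List, NamedTuple, Tuple


class YearRecord(NamedTuple):
    """One row of Table 1 (hypothetical container, not from the paper)."""
    year: int
    factors: Tuple[float, ...]   # 5 influencing factors (model inputs)
    indexes: Tuple[float, ...]   # 3 evaluation indexes (model outputs)


# Four rows copied from Table 1. Factor order: roadway track in use,
# mobile car retention quantity, population size, passenger turnover
# volume, turnover volume of freight. Index order: number of incidents,
# death toll, number of injuries.
records = [
    YearRecord(1978, (890191.8650, 1358375.3790, 96257.4553, 521.30, 274.14),
               (107251, 19096, 77471)),
    YearRecord(1998, (1278504.827, 13192752.88, 124807.3378, 5942.8, 5483.4),
               (346129, 78067, 222721)),
    YearRecord(1999, (1351751.499, 14529144.65, 125911.3178, 6199.2, 5724.3),
               (412860, 83529, 286080)),
    YearRecord(2006, (3457000, 36973500, 131448, 10135.9, 8895.2),
               (378781, 89455, 431139)),
]


def split_by_year(rows: List[YearRecord], last_training_year: int = 1998):
    """Rows up to last_training_year are training data, later rows are testing data."""
    train = [r for r in rows if r.year <= last_training_year]
    test = [r for r in rows if r.year > last_training_year]
    return train, test


def to_xy(rows: List[YearRecord]):
    """Turn records into an input matrix X (n x 5) and an output matrix Y (n x 3)."""
    X = [list(r.factors) for r in rows]
    Y = [list(r.indexes) for r in rows]
    return X, Y


train, test = split_by_year(records)
X_train, Y_train = to_xy(train)
X_test, Y_test = to_xy(test)
print(len(X_train), "training rows,", len(X_test), "testing rows")
```

Because the model has three outputs, one plausible arrangement (an assumption, since the paper does not say) is to fit one regressor per column of `Y`.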
Fig. 1. Establishing the training sets of traffic safety forecasting model.
Fig. 2. The process of support vector machine optimized by particle swarm optimization.
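The search loop named in the Fig. 2 caption can be sketched in plain Python using the velocity and position updates of Eqs. (5) and (6) and Steps 1–4 of Section 3.2.2. This is a minimal sketch, not the authors' implementation. Assumptions: `pso_minimize` and `toy_fitness` are hypothetical names; the toy quadratic stands in for "train an SVM and return its MAPE", which the paper does not give in code; the inertia weight `w`, the coefficients `c1` and `c2`, the swarm size, the search bounds and the clipping of positions to those bounds are illustrative choices (setting `w=1.0` gives Eq. (5) as printed).

```python
import random


def pso_minimize(fitness, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize fitness(position) over box bounds with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)

    # Step 1: random initial positions inside the bounds, zero velocities.
    X = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]

    # Step 2: evaluate the fitness of every particle.
    P = [x[:] for x in X]                  # previous best position of each particle
    P_fit = [fitness(x) for x in X]
    g = min(range(n_particles), key=P_fit.__getitem__)
    G, G_fit = P[g][:], P_fit[g]           # best position among all particles

    # Step 4 (termination criterion): a fixed number of iterations.
    for _ in range(n_iter):
        for i in range(n_particles):
            # Step 3: velocity update (Eq. (5)) and position update (Eq. (6)).
            for a in range(dim):
                r1, r2 = rng.random(), rng.random()
                V[i][a] = (w * V[i][a]
                           + c1 * r1 * (P[i][a] - X[i][a])
                           + c2 * r2 * (G[a] - X[i][a]))
                lo, hi = bounds[a]
                # Clipping to the bounds is an added assumption.
                X[i][a] = min(max(X[i][a] + V[i][a], lo), hi)
            f = fitness(X[i])
            if f < P_fit[i]:
                P[i], P_fit[i] = X[i][:], f
                if f < G_fit:
                    G, G_fit = X[i][:], f
    return G, G_fit


def toy_fitness(params):
    """Stand-in for the SVM fitness: smallest at C=10, sigma=2, epsilon=0.1 (made-up values)."""
    C, sigma, eps = params
    return (C - 10.0) ** 2 + (sigma - 2.0) ** 2 + (eps - 0.1) ** 2


# Hypothetical search ranges for (C, sigma, epsilon).
best, best_fit = pso_minimize(toy_fitness, [(0.1, 100.0), (0.01, 10.0), (0.001, 1.0)])
print(best, best_fit)
```

In an actual run, `toy_fitness` would be replaced by a function that trains an SVM with the candidate values of C, σ and ε and returns its mean absolute percentage error on held-out data.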
Table 1 The data about traffic safety in China from 1970 to 2006.

| Date | Roadway track in use (km) | Mobile car retention quantity (10^4) | Population size (10^4) | Passenger turnover volume (t km) | Turnover volume of freight (billion people km) | Number of incidents | Death toll | Number of injuries |
|---|---|---|---|---|---|---|---|---|
| 1970 | 636660.4603 | 424120.3127 | 82990.4958 | 240.06 | 138.05 | 55437 | 9654 | 37128 |
| 1971 | 675347.7591 | 494391.0390 | 85227.2552 | 266.94 | 150.97 | 69975 | 11331 | 52119 |
| 1972 | 699923.3137 | 577429.5339 | 87176.0199 | 293.83 | 163.89 | 77465 | 11849 | 58738 |
| 1973 | 715571.0300 | 672659.8078 | 89211.1830 | 320.71 | 176.81 | 71192 | 13215 | 53827 |
| 1974 | 737842.6264 | 785649.7598 | 90852.7533 | 347.60 | 189.73 | 81672 | 15599 | 66498 |
| 1975 | 783633.7964 | 917071.3778 | 92417.5249 | 374.48 | 202.65 | 91606 | 16862 | 71776 |
| 1976 | 823377.0760 | 1067020.6600 | 93713.5014 | 423.42 | 226.48 | 101878 | 19441 | 81908 |
| 1977 | 855632.4914 | 1199306.2620 | 94971.0786 | 472.36 | 250.41 | 112222 | 20427 | 84779 |
| 1978 | 890191.8650 | 1358375.3790 | 96257.4553 | 521.30 | 274.14 | 107251 | 19096 | 77471 |
| 1996 | 1185770.508 | 11000632.61 | 122388.1817 | 4908.8 | 5011.2 | 287685 | 73655 | 174447 |
| 1997 | 1226377.772 | 12190723.04 | 123626.5592 | 5541.4 | 5271.5 | 304217 | 73681 | 190128 |
| 1998 | 1278504.827 | 13192752.88 | 124807.3378 | 5942.8 | 5483.4 | 346129 | 78067 | 222721 |
| 1999 | 1351751.499 | 14529144.65 | 125911.3178 | 6199.2 | 5724.3 | 412860 | 83529 | 286080 |
| 2000 | 1402630.577 | 16088732.39 | 126573.7058 | 6657.4 | 6129.4 | 616971 | 93853 | 418721 |
| 2001 | 1698000 | 18020400 | 127627 | 7207.1 | 6330.4 | 754919 | 105930 | 546485 |
| 2002 | 1765200 | 20531700 | 128453 | 7805.8 | 6782.5 | 773137 | 109381 | 562074 |
| 2003 | 1809800 | 23829300 | 129227 | 7695.6 | 7099.5 | 667507 | 104372 | 494174 |
| 2004 | 1870700 | 26937100 | 129988 | 8748.4 | 7840.9 | 567753 | 99217 | 451810 |
| 2005 | 3345200 | 31596600 | 130756 | 9292.1 | 8693.2 | 450254 | 98738 | 469911 |
| 2006 | 3457000 | 36973500 | 131448 | 10135.9 | 8895.2 | 378781 | 89455 | 431139 |

Table 2 The forecasting values of number of incidents.

| Date | Actual values | Forecasting values (PSO–SVM) | Forecasting values (BPNN) | Error (PSO–SVM) | Error (BPNN) |
|---|---|---|---|---|---|
| 1999 | 412860 | 419410 | 429970 | 0.0159 | 0.0414 |
| 2000 | 616971 | 627460 | 584320 | 0.0170 | 0.0529 |
| 2001 | 754919 | 726260 | 762510 | 0.0380 | 0.0101 |
| 2002 | 773137 | 791400 | 679110 | 0.0236 | 0.1216 |
| 2003 | 667507 | 684590 | 623690 | 0.0256 | 0.0656 |
| 2004 | 567753 | 581140 | 586570 | 0.0236 | 0.0331 |
| 2005 | 450254 | 446410 | 434050 | 0.0085 | 0.0360 |
| 2006 | 378781 | 355610 | 394650 | 0.0612 | 0.0419 |

Table 3 The forecasting values of death toll.

| Date | Actual values | Forecasting values (PSO–SVM) | Forecasting values (BPNN) | Error (PSO–SVM) | Error (BPNN) |
|---|---|---|---|---|---|
| 1999 | 83529 | 85525 | 79827 | 0.0239 | 0.0443 |
| 2000 | 93853 | 90294 | 88834 | 0.0379 | 0.0535 |
| 2001 | 105930 | 103120 | 98473 | 0.0265 | 0.0704 |
| 2002 | 109381 | 113310 | 102280 | 0.0359 | 0.0649 |
| 2003 | 104372 | 97919 | 108320 | 0.0618 | 0.0378 |
| 2004 | 99217 | 100450 | 90662 | 0.0124 | 0.0862 |
| 2005 | 98738 | 101570 | 101180 | 0.0287 | 0.0247 |
| 2006 | 89455 | 87371 | 94961 | 0.0233 | 0.0616 |

Table 4 The forecasting values of number of injuries.

| Date | Actual values | Forecasting values (PSO–SVM) | Forecasting values (BPNN) | Error (PSO–SVM) | Error (BPNN) |
|---|---|---|---|---|---|
| 1999 | 286080 | 284330 | 275400 | 0.0061 | 0.0373 |
| 2000 | 418721 | 409400 | 399790 | 0.0223 | 0.0452 |
| 2001 | 546485 | 554460 | 575780 | 0.0146 | 0.0536 |
| 2002 | 562074 | 556480 | 584910 | 0.0100 | 0.0406 |
| 2003 | 494174 | 479160 | 476960 | 0.0304 | 0.0348 |
| 2004 | 451810 | 426800 | 421350 | 0.0554 | 0.0674 |
| 2005 | 469911 | 483850 | 448110 | 0.0297 | 0.0464 |
| 2006 | 431139 | 410920 | 448760 | 0.0469 | 0.0409 |

3.2. PSO–SVM traffic safety forecasting model
3.2.1. Particle swarm optimization
The technique is derived from social behavior such as bird flocking and fish schooling, where the individuals are defined as particles. The particles evaluate their positions according to the fitness at every iteration. In the $A$-dimensional space, particle $i$ is represented as $X_i = (x_{i1}, x_{i2}, \ldots, x_{iA})$, its velocity as $V_i = (v_{i1}, v_{i2}, \ldots, v_{iA})$, and its previous best position as $P_i = (p_{i1}, p_{i2}, \ldots, p_{iA})$. The best position among all particles is represented as $P_g = (p_{g1}, p_{g2}, \ldots, p_{gA})$. The velocity and position of the particle are computed as follows:
$$ v_{ia}(t+1) = v_{ia}(t) + c_1 r_1 \left(p_{ia}(t) - x_{ia}(t)\right) + c_2 r_2 \left(p_{ga}(t) - x_{ia}(t)\right), \tag{5} $$
$$ x_{ia}(t+1) = x_{ia}(t) + v_{ia}(t+1), \tag{6} $$
where $p_{ia}$ represents the previous best position of particle $i$; $p_{ga}$ is the best position among all particles; $c_1$ is the cognitive parameter and $c_2$ is the social parameter; $r_1$ and $r_2$ are random numbers between 0 and 1; and $t$ is the current iteration number.

3.2.2. Support vector machine optimized by particle swarm optimization
Inappropriate values of the SVM parameters $C$, $\sigma$ and $\varepsilon$ can lead to over-fitting. Because these parameters strongly influence the generalization performance of SVM, particle swarm optimization is applied to search for them in the global space, as shown in Fig. 2. The optimization process is as follows:
Step 1: Initialize the parameters of particle swarm optimization, including the number of evolutionary generations, the population size and the inertia weight, and randomly generate a population of particles, each composed of $\varepsilon$, $C$ and $\sigma$.
Step 2: Use the selected parameters to train an SVM model. The testing samples are used to measure the forecasting ability of the SVM model. The fitness of the model is the mean absolute percentage error, $\frac{1}{N}\sum_{i=1}^{N}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$, where $y_i$ is the actual value and $\hat{y}_i$ is the forecasting value.
Step 3: Compute the velocity and position of each particle with Eqs. (5) and (6), respectively.
Step 4: Stop the algorithm if the termination criterion is satisfied; the best SVM has then been obtained. Otherwise, produce new particles according to Eqs. (5) and (6).

Fig. 3. The comparison of number of incidents forecasting values between PSO–SVM and BP neural network.
Fig. 4. The comparison of death toll forecasting values between PSO–SVM and BP neural network.

4. Case studies
The data about traffic safety in China from 1970 to 2006, shown in Table 1, are used to examine the forecasting ability of the proposed method. The data from 1970 to 1998 are used as the training data and the data from 1999 to 2006 as the testing data. The feature attributes are roadway track in use, mobile car retention quantity, population size, passenger turnover volume and turnover volume of freight, which are used as the input of the PSO–SVM model; the decision-making attributes are number of incidents, death toll and number of injuries, which are used as the output of the PSO–SVM model. The forecasting values of number of incidents, death toll and number of injuries are given in Tables 2–4. The forecasting values of PSO–SVM and BP neural network are compared in Figs. 3–5, and Table 5 compares their mean absolute percentage error (MAPE). The MAPE values of PSO–SVM and BP neural network are 0.0267 and 0.0503 for number of incidents, 0.0313 and 0.0554 for death toll, and 0.0269 and 0.0458 for number of injuries, respectively, which shows that traffic safety forecasting by PSO–SVM is better than that by BP neural network.
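The MAPE figures quoted here can be checked directly from the actual and forecast values listed in Table 2. The short Python sketch below does that for the number of incidents; the function name `mape` is an illustrative choice, and the three lists are copied from Table 2.

```python
def mape(actual, forecast):
    """Mean absolute percentage error: (1/N) * sum(|(y_i - yhat_i) / y_i|)."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    return sum(abs((y - f) / y) for y, f in zip(actual, forecast)) / len(actual)


# Number of incidents, 1999-2006, copied from Table 2.
actual = [412860, 616971, 754919, 773137, 667507, 567753, 450254, 378781]
pso_svm = [419410, 627460, 726260, 791400, 684590, 581140, 446410, 355610]
bpnn = [429970, 584320, 762510, 679110, 623690, 586570, 434050, 394650]

mape_pso_svm = mape(actual, pso_svm)
mape_bpnn = mape(actual, bpnn)
print(f"PSO-SVM: {mape_pso_svm:.4f}  BPNN: {mape_bpnn:.4f}")
```

Both results should land within rounding distance of the 0.0267 and 0.0503 reported in Table 5. The same function applies unchanged to the death toll and injury columns of Tables 3 and 4.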
Fig. 5. The comparison of number of injuries forecasting values between PSO–SVM and BP neural network.

Table 5 The comparison of mean absolute percentage error between PSO–SVM and BP neural network.

| MAPE | PSO–SVM | BPNN |
|---|---|---|
| Number of incidents | 0.0267 | 0.0503 |
| Death toll | 0.0313 | 0.0554 |
| Number of injuries | 0.0269 | 0.0458 |

5. Conclusion
A novel approach that combines particle swarm optimization and support vector machine is presented for traffic safety forecasting. The feature attributes are roadway track in use, mobile car retention quantity, population size, passenger turnover volume and turnover volume of freight, which are used as the input of the PSO–SVM model; the decision-making attributes are number of incidents, death toll and number of injuries, which are used as the output of the PSO–SVM model. Traffic safety data for China from 1970 to 2006 are used to examine the forecasting ability of the proposed method. The MAPE values of PSO–SVM and BP neural network are 0.0267 and 0.0503 for number of incidents, 0.0313 and 0.0554 for death toll, and 0.0269 and 0.0458 for number of injuries, respectively. The experimental results show that traffic safety forecasting by PSO–SVM is better than that by BP neural network.

Acknowledgement
This research is jointly supported by the National Road Traffic Safety Science and Technology Action Program of China (No. 2009BAG13A05) and the National Natural Science Foundation of China (No. 51078086).

References
Alenezi, A., Moses, S. A., & Trafalis, T. B. (2008). Real-time prediction of order flowtimes using support vector regression. Computers & Operations Research, 35(11), 3489–3503.
Bener, A., Al Maadid, M. G. A., Özkan, T., Al-Bast, D. A. E., Diyab, K. N., & Lajunen, T. (2008). The impact of four-wheel drive on risky driver behaviours and road traffic accidents. Transportation Research Part F: Traffic Psychology and Behaviour, 11(5), 324–333.
Berg, H.-Y., Gregersen, N. P., & Laflamme, L. (2004). Typical patterns in road-traffic accidents during driver training: An explorative Swedish national study. Accident Analysis and Prevention, 36(4), 603–608.
Hayakawa, H., Fischbeck, P. S., & Fischhoff, B. (2000). Traffic accident statistics and risk perceptions in Japan and the United States. Accident Analysis and Prevention, 32(6), 827–835.
Hubacher, M., & Allenbach, R. (2004). Prediction of accidents at full green and green arrow traffic lights in Switzerland with the aid of configuration-specific features. Accident Analysis and Prevention, 36(5), 739–747.
Jain, P., Rahman, I., & Kulkarni, B. D. (2007). Development of a soft sensor for a batch distillation column using support vector regression techniques. Chemical Engineering Research and Design, 85(2), 283–287.
Lee, J.-Y., Chung, J.-H., & Son, B. (2008). Analysis of traffic accident size for Korean highway using structural equation models. Accident Analysis and Prevention, 40(6), 1955–1963.
Pedrycz, W., Park, B. J., & Pizzi, N. J. (2009). Identifying core sets of discriminatory features using particle swarm optimization. Expert Systems with Applications, 36(3), 4610–4616.
Rajasekaran, S., Gayathri, S., & Lee, T.-L. (2008). Support vector regression methodology for storm surge predictions. Ocean Engineering, 35(16), 1578–1587.
Sadeghi, B. H. M. (2000). A BP-neural network predictor model for plastic injection molding process. Journal of Materials Processing Technology, 103(3), 411–416.
Salman, A., Ahmad, I., & Al-Madani, S. (2002). Particle swarm optimization for task assignment problem. Microprocessors and Microsystems, 26(8), 363–371.
Tasgetiren, M. F., Liang, Y.-C., Sevkli, M., & Gencyilmaz, G. (2007). A particle swarm optimization algorithm for makespan and total flowtime minimization in the permutation flowshop sequencing problem. European Journal of Operational Research, 177(3), 1930–1947.