Accepted Manuscript

Traffic flow prediction based on combination of support vector machine and data denoising schemes

Jinjun Tang, Xinqiang Chen, Zheng Hu, Fang Zong, Chunyang Han, Leixiao Li

PII: S0378-4371(19)30226-2
DOI: https://doi.org/10.1016/j.physa.2019.03.007
Reference: PHYSA 20642
To appear in: Physica A
Received date: 10 May 2018
Revised date: 26 September 2018

Please cite this article as: J. Tang, X. Chen, Z. Hu et al., Traffic flow prediction based on combination of support vector machine and data denoising schemes, Physica A (2019), https://doi.org/10.1016/j.physa.2019.03.007
Highlights

- A hybrid model combining a denoising method and a support vector machine is proposed to predict traffic volume.
- The prediction performances of five denoising methods, including EMD, are compared.
- Traffic flow collected by three loop detectors in the city of Minneapolis is used in model validation.
- The results show the effectiveness of denoising algorithms in improving prediction.
- The improvement of EEMD on prediction is higher than that of the other algorithms.
Traffic flow prediction based on combination of support vector machine and data denoising schemes Jinjun Tang1, Xinqiang Chen2, Zheng Hu3, Fang Zong4,*, Chunyang Han1, Leixiao Li1 1. School of Traffic and Transportation Engineering, Smart Transport Key Laboratory of Hunan Province, Central South University, Changsha, 410075, China. 2. Institute of Logistics Science and Engineering, Shanghai Maritime University, Shanghai, 201306, China. 3. School of Intelligent transportation, Hunan Communication Engineering Polytechnic, Changsha, 410132, China. 4. College of Transportation, Jilin University, Jilin, 130012, China. *Corresponding Author E-Mail:
[email protected]
Abstract Traffic flow prediction with high accuracy is considered one of the most important components of Intelligent Transportation Systems. Owing to interference from external factors, the raw traffic flow data contain noise that may degrade prediction performance. This study proposes a prediction method that combines denoising schemes with a support vector machine model to improve prediction accuracy. The multi-step prediction performance of models with different denoising algorithms is comprehensively evaluated using traffic volume data collected from three loop detectors located on highways in the city of Minneapolis. In the comparison, five denoising methods, EMD (Empirical Mode Decomposition), EEMD (Ensemble Empirical Mode Decomposition), MA (Moving Average), BW (Butterworth) filter and WL (Wavelet), are considered as candidates; in particular, four wavelet types, coif (coiflet), db (Daubechies), haar (Haar) and sym (symlet), are further compared based on accuracy evaluation indicators. The results show that the models combined with a denoising algorithm outperform the model without a denoising strategy. Furthermore, the improvement of EEMD on prediction performance is higher than that of the other denoising algorithms, and the WL method with the db type achieves higher accuracy than the other three types. By comparing the prediction accuracy of different denoising models, this study provides valuable suggestions for selecting an appropriate denoising approach for traffic flow prediction.
Key Words: traffic flow prediction; denoising algorithm; support vector machine; ensemble empirical mode decomposition

1 Introduction

With the sharp increase of traffic demand, traffic problems such as pollution, congestion and accidents have seriously affected the quality of life in urban areas. Traffic flow prediction in road networks is a key step toward active transportation management and control. By extracting traffic variation patterns from large amounts of historical data, we can perceive future traffic states and subsequently design effective traffic organization strategies to relieve congestion. With the fast development of information and electronic technology, traffic flow data collection has changed from single-source to multi-source approaches, e.g. inductive loops, remote microwave sensors, Bluetooth, video, and floating cars with GPS navigation. However, as the external environment of the transportation system is complex, the raw traffic data collected from detectors may be affected by unobservable factors. These interferences, which we refer to as noise, reduce the reliability and accuracy of traffic flow prediction. Presently, a large number of prediction models have been proposed to enhance prediction performance by focusing on different model structures and calculation procedures, e.g. statistical methods [1-5], artificial neural networks [6-10], fuzzy-neural networks [11-14], support vector regression [15-20], Kalman filter theory [21-25] and hybrid approaches combining several models [26-30]. These algorithms achieve high performance in some specific applications. Due to the uncertainty and complexity of spatio-temporal variation in traffic systems, short-term traffic flow data are strongly contaminated by noise during collection, which significantly affects prediction performance. Accordingly, some researchers have focused on designing a data denoising process before the application to improve prediction performance [31-37]. Wavelet decomposition is a widely used denoising method. Xie et al. [31] combined two types of wavelet model with a Kalman filter to forecast short-term traffic volumes, and the testing results showed that the wavelet Kalman filter model achieved higher performance by relieving the influence of noise. Tan et al. [32] first applied the wavelet transform to reduce the noise in raw traffic data, and then proposed a hybrid model with ARIMA and SVM to predict traffic flow based on the denoised dataset. Jiang and Adeli [33] introduced an improved discrete wavelet packet transform to deal with the noise in the original data source; furthermore, the statistical autocorrelation function was used to select the decomposition level in the wavelet method. Lu and Huang [34] also used the wavelet transform to decompose traffic flow data into multi-scale components.
Moreover, several modified prediction models based on self-organizing neural networks, neuro-wavelets, and fuzzy-neural networks were proposed after applying wavelet denoising in the works of [35], [36] and [37]. Overall, the denoising process is an effective means of improving traffic flow prediction accuracy. In this study, a comprehensive comparison is implemented to show the effectiveness of different denoising methods and their ability to enhance prediction accuracy. Firstly, several types of denoising methods are involved in the model comparison, mainly including EMD (Empirical Mode Decomposition), EEMD (Ensemble Empirical Mode Decomposition), MA (Moving Average), BW (Butterworth) filter and WL (Wavelet); in the WL model, we further compare several commonly used wavelet types, including coif, db, haar and sym. In the next section, we provide a detailed introduction to each model. Secondly, based on the denoised traffic flow dataset, an SVM (Support Vector Machine) model is trained to predict the variation patterns of traffic flow. Finally, through an experiment on a real traffic system, the prediction performance is estimated under different time scales of detector data and different forecasting-ahead steps. The remainder of the paper is organized as follows. In the following section, different denoising methods are briefly introduced. Section 3 summarizes the data collection and denoising process. Section 4 introduces the prediction results and discussion. The paper ends with brief concluding remarks and future research directions in Section 5.

2 Methodology

2.1 Denoising algorithms

In this subsection, several denoising algorithms widely used in data processing and analysis are briefly introduced.
2.1.1 EMD and EEMD

EMD has been applied in many fields such as image analysis, traffic data denoising, and gearbox fault diagnosis, and has achieved great success due to its data-adaptive feature [38-40]. The EMD model extracts the intrinsic modes based on the data's characteristic frequencies: through the EMD sifting procedure, the data are decomposed into a set of intrinsic mode functions (IMFs) and a residual. The relationship between the initial detector data I(d), the decomposed IMF set and the residual R(d) is given in Eq. (1):

I(d) = \sum_{i=1}^{n} IMF_i(d) + R(d)    (1)

EMD may fail in the denoising procedure when the data to be denoised cannot meet certain prerequisites [41, 42]. EEMD is an improved version of EMD which can alleviate these disadvantages. Specifically, the EEMD model first adds white noise w_j(d) to the initial detector data I(d) in the j-th ensemble round (see Eq. (2)). Then EEMD decomposes the noise-aided detector data to obtain the j-th IMF set with a sifting process similar to that of EMD; in the j-th round, the noise-aided data are decomposed into noise-aided IMFs and a residue (see Eq. (3)). When EEMD finishes all N_e rounds of the sifting process, the final IMF set is obtained as the average of the N_e decomposed IMF collections, as shown in Eq. (4):

I_j(d) = I(d) + w_j(d)    (2)

I_j(d) = \sum_{i=1}^{n} IMF_{i,j}(d) + R_j(d)    (3)

IMF_i(d) = \frac{1}{N_e} \sum_{j=1}^{N_e} IMF_{i,j}(d)    (4)

where j is the index of the ensemble round and satisfies 1 \le j \le N_e (N_e is the maximum ensemble number). EEMD has two parameters to be determined, namely the amplitude a of the added white noise and the ensemble number N_e; in fact, EEMD functions the same as EMD when the added white noise is set to zero. Thus, we only describe the EEMD parameter-setting criterion for simplicity, which is also applicable to EMD. The relationship between a and N_e is demonstrated in Eq. (5); a larger N_e implies a better denoising result but requires a higher computation cost:

\varepsilon = \frac{a}{\sqrt{N_e}}    (5)

where \varepsilon shows the EEMD model's denoising level.
2.1.2 Moving Average Method

The Moving Average (MA) [43] is a statistical method that creates a new data series by averaging subsets of the full data. Through this method, a new series is formed whose values are calculated as weighted averages of the corresponding data points. The moving average is a denoising method commonly used to smooth the fluctuations of short-term time series and highlight the variation trends of longer-term time series. The parameters of MA are adjusted according to the time scale of data collection. A simple moving average is adopted in this study to smooth the traffic flow data, in which an unweighted calculation is applied: each new value averages an equal number of data points on either side of a central value. This strategy ensures that the variations of the new dataset are aligned with the trends of the raw dataset instead of being shifted in time. If the traffic volumes detected in successive time periods are denoted v_1, v_2, ..., and the window contains n points, the simple centred moving average at period t is calculated as:

\bar{v}_t = \frac{1}{n} \sum_{k=-(n-1)/2}^{(n-1)/2} v_{t+k}    (6)

where n is the number of volume data points averaged, i.e. the window size of the moving average. The value of the window size significantly affects the smoothing (denoising) performance. If it is set too small, it is difficult for the MA algorithm to eliminate the small variations in the traffic volume data, especially for data collected at short time scales; if it is set too large, the denoised data will be over-smoothed and some actual patterns implied in the raw traffic flow data will be removed. The window sizes used for traffic volumes detected at different time scales are introduced and discussed in Section 4.

2.1.3 Butterworth Filter

For a given sampling frequency of the detector data, the Butterworth (BW) filter implements noise removal by constraining the data fluctuation margin: detector data points whose fluctuation relative to neighboring points exceeds the allowed threshold are replaced by new values with smaller fluctuations. The BW filter has been applied in many fields including electromyogram denoising and traffic data outlier suppression [44, 45]. The BW transfer function can be written in the Z-domain as Eq. (7):

H(z) = \frac{\sum_{k=0}^{n} b_k z^{-k}}{\sum_{k=0}^{n} a_k z^{-k}}    (7)

where the coefficient sets b_k and a_k determine the filter response and n is the filter order. Eq. (7) is the Z-domain form of the transfer function, in which zeros and poles can be found easily: the zeros construct the numerator of the transfer function while the poles construct the denominator. The BW filter has two parameters to be determined, the cutoff frequency and the low-pass filter order n. The cutoff frequency determines the fluctuation margin of the denoised detector data: a stronger smoothing setting (lower cutoff) leads to flatter fluctuations in the denoised data, but some true details of the detector data may also be discarded, whereas a weaker setting (higher cutoff) may fail to remove outliers successfully.
A larger filter order n results in a sharper transition between smoothed data points and their neighbors, whereas a smaller n results in a flatter transition.

2.1.4 Wavelet

To minimize the interference of the measurement environment, a Wavelet (WL) denoising method [46, 47] is adopted to eliminate the noise residing in the traffic volume data. Through the wavelet transform, local detailed trends can be revealed. Generally, the denoising procedure of the WL method contains the following three steps:

Step 1. Decompose. Select a wavelet type and a number of levels N, then compute the wavelet decomposition of the data down to level N.

Step 2. Threshold the detail coefficients. For each level from 1 to N, select a threshold and apply soft thresholding to the high-frequency (detail) coefficients.

Step 3. Reconstruct. Compute the wavelet reconstruction using the approximation (low-frequency) coefficients of level N and the modified detail coefficients of levels 1 to N.
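As a concrete illustration of the three filter-style schemes above, the following Python sketch applies a centred moving average, a Butterworth low-pass filter and a one-level Haar wavelet soft-threshold to a noisy synthetic volume series. The window size, cutoff, filter order and threshold here are illustrative assumptions, not the calibrated values reported in Section 4; numpy and scipy are assumed available.

```python
import numpy as np
from scipy.signal import butter, filtfilt

rng = np.random.default_rng(0)
t = np.arange(288)                          # e.g. one day of 5-min counts
clean = 100 + 60 * np.sin(2 * np.pi * t / 288)
noisy = clean + rng.normal(0, 15, t.size)

# 1) Centred simple moving average (Eq. (6)): unweighted mean over a
#    symmetric window, so the smoothed series is not shifted in time.
def moving_average(x, window=9):
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="same")

# 2) Butterworth low-pass filter (Eq. (7)); filtfilt runs it forward and
#    backward so the denoised series has no phase shift.
b, a = butter(N=4, Wn=0.1)                  # order 4, normalised cutoff 0.1
bw_denoised = filtfilt(b, a, noisy)

# 3) One-level Haar wavelet soft-thresholding: split into approximation and
#    detail coefficients, shrink the details, reconstruct (Steps 1-3 above).
def haar_denoise(x, threshold=20.0):
    x = x[: len(x) // 2 * 2]                # require even length
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)
    detail = np.sign(detail) * np.maximum(np.abs(detail) - threshold, 0)
    out = np.empty_like(x)
    out[0::2] = (approx + detail) / np.sqrt(2)
    out[1::2] = (approx - detail) / np.sqrt(2)
    return out

ma_denoised = moving_average(noisy)
wl_denoised = haar_denoise(noisy)
for name, d in [("MA", ma_denoised), ("BW", bw_denoised), ("WL", wl_denoised)]:
    resid = np.std(noisy[4:-4] - d[4:-4])   # ignore boundary effects
    print(f"{name}: residual std {resid:.1f}")
```

Each method trades detail retention against noise suppression in the way described above: widening the MA window, lowering the BW cutoff, or raising the wavelet threshold all flatten the output further.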
In this study, we test and compare the prediction accuracy of traffic volume based on the WL denoising method with different wavelet types, including coif, db, haar and sym. Furthermore, the optimal parameters selected in the experiment are provided in Section 4.

2.2 Support Vector Machine

The main idea of the Support Vector Machine (SVM) is to map data into a high-dimensional feature space; this method has been widely used in data analysis, pattern recognition, classification and regression. Consider a data series (x_1, y_1), (x_2, y_2), ..., (x_N, y_N) for regression, where N is the total number of data samples. The SVM regression function is expressed as follows [48]:

f(x) = w^T \phi(x) + b    (8)

where w is a vector in a feature space F and \phi(x) represents the features, mapping the input data x into a vector in F. The \varepsilon-insensitive loss function is defined as:

L_\varepsilon(x, y, f) = \max(0, |y - f(x)| - \varepsilon)    (9)

The parameters w and b can be calibrated by minimizing:

\frac{1}{2} \|w\|^2 + C \sum_{i=1}^{N} L_\varepsilon(x_i, y_i, f)    (10)

where \varepsilon indicates the maximum allowed deviation and C is the penalty in the training process, which quantifies the balance between empirical risk and model smoothness. By adding two positive slack variables \beta_i and \beta_i^*, Eq. (10) can be rewritten as the following constrained formulation:

\min J(\beta_i, \beta_i^*) = \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{N} (\beta_i + \beta_i^*)    (11)

subject to:

y_i - w^T \phi(x_i) - b \le \varepsilon + \beta_i
w^T \phi(x_i) + b - y_i \le \varepsilon + \beta_i^*
\beta_i, \beta_i^* \ge 0

where \frac{1}{2}\|w\|^2 represents the regularization term and \sum_{i=1}^{N} (\beta_i + \beta_i^*) denotes the empirical error measured by the \varepsilon-insensitive loss function. Applying the Karush-Kuhn-Tucker (KKT) conditions to Eq. (11), the following dual form of the optimization problem can be established:

\max Q(\alpha_i, \alpha_i^*) = \sum_{i=1}^{N} y_i (\alpha_i - \alpha_i^*) - \varepsilon \sum_{i=1}^{N} (\alpha_i + \alpha_i^*) - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j)    (12)

subject to:

\sum_{i=1}^{N} (\alpha_i - \alpha_i^*) = 0,  0 \le \alpha_i \le C,  0 \le \alpha_i^* \le C

Finally, the SVM-based traffic flow prediction function is expressed as:

f(x) = \sum_{i=1}^{N} (\alpha_i - \alpha_i^*) K(x, x_i) + b    (13)

where K(x, x_i) represents the kernel function and \alpha_i, \alpha_i^* denote the solution of the dual problem. Generally, there are four traditional kernel functions: linear, radial basis function (RBF), polynomial and sigmoid. In this study, the widely used RBF kernel is adopted in the SVM model:

K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)    (14)
where γ is the parameter of the kernel function.

2.3 Traffic Flow Prediction

A comparison of volume prediction results under different forecasting-ahead steps is presented in the discussion section. Single-step prediction forecasts the traffic volume at the next time step; multi-step prediction aims to accurately estimate the traffic volume multiple steps into the future, as illustrated in Fig. 1. In the figure, v indicates traffic volume, and the input dataset is constructed from the volume data according to the sampling time sequence, where v_i represents the traffic volume collected at the i-th time step. The accuracy of multi-step prediction is generally lower than that of single-step prediction, because the variation patterns of future traffic volume multiple steps ahead are more difficult to track. The specific experimental analysis and discussion are presented in Section 4.
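The windowing scheme of Fig. 1 can be sketched as follows, assuming scikit-learn's SVR with the RBF kernel (Eq. (14)): past volumes [v_{t-m+1}, ..., v_t] form the input and v_{t+h} is the target for an h-step-ahead forecast. The window length m, horizon h and hyper-parameters C and ε below are illustrative assumptions, not the values calibrated in this paper.

```python
import numpy as np
from sklearn.svm import SVR

def make_windows(v, m=6, h=1):
    """Build (X, y) pairs for h-step-ahead prediction from a volume series."""
    n = len(v) - m - h + 1
    X = np.array([v[i:i + m] for i in range(n)])
    y = np.array([v[i + m + h - 1] for i in range(n)])
    return X, y

# Synthetic stand-in for a detector's volume series (illustrative only).
rng = np.random.default_rng(1)
t = np.arange(600)
volume = 100 + 60 * np.sin(2 * np.pi * t / 144) + rng.normal(0, 5, t.size)

X, y = make_windows(volume, m=6, h=3)        # 3-step-ahead prediction
split = int(0.7 * len(X))                    # 70% train / 30% test, as in Sec. 4
model = SVR(kernel="rbf", C=100.0, epsilon=1.0)  # gamma defaults to "scale"
model.fit(X[:split], y[:split])
pred = model.predict(X[split:])
print("test RMSE:", np.sqrt(np.mean((pred - y[split:]) ** 2)))
```

Larger horizons h reuse the same construction with a more distant target, which is the direct multi-step strategy implied by Fig. 1.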
Fig. 1 Traffic flow prediction with one and multiple ahead steps

3 Data Collection and Denoising Process

3.1 Data Source

The data analyzed in the prediction were collected from the Minnesota Department of Transportation (Mn/DOT) and the Transportation Data Research Laboratory (TDRL) at the University of Minnesota Duluth [49]. Traffic flow data including volume, speed and occupancy were detected by loop detectors in the freeway network of Minnesota. The data collection period extends from January 1st to December 31st, 2015. The base detecting time scale for the data samples in this study is 30 seconds, so 2880 data samples are accumulated each day. Three detection stations were selected in the city of Minneapolis, see Fig. 2. Stations A and C are located on the southwest corridor of the urban area, while Station B collects traffic flow in the city center; the traffic flow data from these three locations can partially reflect spatial distribution patterns. It should be noted that the freeway contains general purpose lanes (GPL) and high-occupancy vehicle lanes (HOV), and we mainly focus on traffic flow prediction in the GPL. In more detail, Table 1 shows the information of the stations, including freeway name, loop detector number, traffic direction, and number of lanes. As each station detects several lanes in its direction, the traffic volume used in this study is the sum of the volumes over all lanes.
Fig. 2 Three data collection stations in the city of Minneapolis

Tab.1 Description of the three detection locations

Location   Freeway   Direction   Loop detector No.   Number of main lanes
A          I-35W     North       326                 4
B          I-94      East        2131                2
C          I-35W     West        2226                3
3.2 Traffic Flow Data Denoising

Before forecasting the traffic flow, in order to eliminate the error or noise generated by external interference in the raw data, we apply a variety of denoising algorithms in the pre-processing stage, including EMD, EEMD, BW, MA, and WL with different wavelet types (coif, db, haar and sym), to improve the stability and accuracy of the prediction. The distributions of the denoised traffic flow data under 2 min, 10 min and 60 min detecting intervals are shown in Figs. 3, 4 and 5. It can be observed that the data variation becomes smoother and the fluctuation is relieved as the sampling interval increases. Furthermore, although the denoising effects of the five methods differ, the sharp variations are generally removed, especially for the data at the 2 min sampling interval. In these three figures, the black thick lines represent the denoised data and the red lines denote the raw data.
Fig. 3 Comparison between actual data and denoised data collected at 2 min intervals using different algorithms (panels: (a) EEMD, (b) EMD, (c) BW, (d) MA, (e) WL-coif, (f) WL-db, (g) WL-haar, (h) WL-sym)
Fig. 4 Comparison between actual data and denoised data collected at 10 min intervals using different algorithms (panels: (a) EEMD, (b) EMD, (c) BW, (d) MA, (e) WL-coif, (f) WL-db, (g) WL-haar, (h) WL-sym)
Fig. 5 Comparison between actual data and denoised data collected at 60 min intervals using different algorithms (panels: (a) EEMD, (b) EMD, (c) BW, (d) MA, (e) WL-coif, (f) WL-db, (g) WL-haar, (h) WL-sym)

4 Prediction Results Comparison and Discussion

In order to evaluate the single-step and multi-step prediction performance of the different denoising models, three general accuracy indicators, the mean absolute error (MAE), the mean absolute percentage error (MAPE) and the root mean square error (RMSE), are used in the evaluation. The formulas of MAE, MAPE and RMSE are expressed as follows:
MAE = \frac{1}{N} \sum_{i=1}^{N} |v_i - \hat{v}_i|    (15)

MAPE = \frac{1}{N} \sum_{i=1}^{N} \frac{|v_i - \hat{v}_i|}{v_i} \times 100\%    (16)

RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (v_i - \hat{v}_i)^2}    (17)
where N is the number of samples used in the evaluation, v_i is the observed traffic volume at time step i at the different stations, and \hat{v}_i represents the predicted volume. Furthermore, it should be noted that the unit of MAE and RMSE for data collected at different time intervals is the number of vehicles per time scale (e.g. 2 min, 10 min or 60 min). In the experiments, in order to further evaluate the long-term prediction performance of all the models, both one-step and multi-step-ahead predictions (3-step, 6-step and 10-step, i.e. 3, 6 and 10 times the time scale) are estimated. Taking the data collected at the 10 min time scale as an example, the 3-step-ahead prediction forecasts the traffic volume 30 min after the current time point. In the division of the training and validation datasets, in order to guarantee a fair comparison between the models using denoised data and the raw data, we randomly select 70% of the total samples as the training dataset for model calibration and use the remaining 30% as the testing dataset for model validation and comparison. It should be noted that the testing datasets are the same for the original SVM model and the hybrid models combining different denoising algorithms; for the training dataset, the hybrid models use the denoised data samples instead of the raw data.
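Eqs. (15)-(17) translate directly into code; the following numpy sketch is a straightforward transcription of the three definitions (v: observed volumes, v_hat: predicted volumes).

```python
import numpy as np

def mae(v, v_hat):
    # Eq. (15): mean absolute error
    v, v_hat = np.asarray(v, float), np.asarray(v_hat, float)
    return np.mean(np.abs(v - v_hat))

def mape(v, v_hat):
    # Eq. (16): mean absolute percentage error, in percent
    v, v_hat = np.asarray(v, float), np.asarray(v_hat, float)
    return 100.0 * np.mean(np.abs(v - v_hat) / v)

def rmse(v, v_hat):
    # Eq. (17): root mean square error
    v, v_hat = np.asarray(v, float), np.asarray(v_hat, float)
    return np.sqrt(np.mean((v - v_hat) ** 2))

v = [100.0, 120.0, 80.0]          # illustrative observed volumes
v_hat = [110.0, 115.0, 85.0]      # illustrative predictions
print(mae(v, v_hat), mape(v, v_hat), rmse(v, v_hat))
```

Note that MAPE is undefined when an observed volume is zero, which is why it is reported as a percentage of non-zero observed counts.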
As mentioned above, two types of prediction model are compared in the case study: the SVM model and hybrid models combining SVM with denoising methods. In the hybrid models, we validate and compare several widely used data denoising methods: EMD, EEMD, MA, BW filter and WL with different wavelet types (coif, db, haar and sym). A comprehensive comparison is implemented based on the three evaluation indices under different prediction-ahead steps using the data collected from the three stations. Tab.2, Tab.3 and Tab.4 show the prediction performance of the different models using data collected at the 2 min time scale. Tab.5, Tab.6 and Tab.7 provide the prediction performance using the 10 min time scale data, and Tab.8, Tab.9 and Tab.10 show the comparison results of the prediction models under the 60 min time interval. Specifically, Fig.6, Fig.7 and Fig.8 show the RMSE distributions of the different prediction models under different time scales for the three detectors.

Tab.2 Prediction accuracy of models with different forecasting steps ahead at detector 326 using data collected at the 2 min scale

                    Number of forecasting steps ahead
MAE                 1         3         6         10
SVM                 4.1948    4.4724    4.8788    5.3640
SVM+EEMD            3.3800    3.3871    3.6479    3.6717
SVM+EMD             3.6840    4.0702    4.6661    5.1362
SVM+BW              3.4437    3.6830    3.8998    4.6559
SVM+MA              4.1538    4.4042    4.7562    5.1231
SVM+WL (coif)       3.7353    4.0974    4.4459    4.8222
SVM+WL (db)         3.4047    4.0135    4.3245    4.7058
SVM+WL (haar)       3.7918    4.2371    4.5968    5.1280
SVM+WL (sym)        3.4628    4.0799    4.3850    4.7416

RMSE                1         3         6         10
SVM                 5.5983    5.9949    6.5886    7.2567
SVM+EEMD            4.4910    4.5342    4.8787    4.9144
SVM+EMD             4.9338    5.4606    6.2460    6.9956
SVM+BW              4.5849    4.9328    5.2088    6.2711
SVM+MA              5.5894    5.9166    6.4290    6.9961
SVM+WL (coif)       5.0675    5.5026    6.0388    6.6120
SVM+WL (db)         4.3418    5.4013    5.8730    6.4376
SVM+WL (haar)       5.1237    5.6581    6.1131    6.9163
SVM+WL (sym)        4.6468    5.4800    5.9440    6.4818

MAPE (%)            1         3         6         10
SVM                 32.5349   30.6299   33.1748   36.4049
SVM+EEMD            21.7143   22.0872   23.6407   23.8426
SVM+EMD             23.7124   27.8817   32.2672   33.5789
SVM+BW              21.9961   23.7021   26.5534   30.7666
SVM+MA              27.0692   28.6666   31.7201   34.2922
SVM+WL (coif)       26.1321   28.3637   28.7690   31.8792
SVM+WL (db)         23.3832   27.2415   30.0264   30.4339
SVM+WL (haar)       27.0788   32.1842   31.0888   38.0954
SVM+WL (sym)        25.6056   28.0578   28.3753   30.9080
Tab.3 Prediction accuracy of models with different forecasting steps ahead at detector 2131 using data collected at the 2 min scale

                    Number of forecasting steps ahead
MAE                 1         3         6         10
SVM                 4.9388    5.1627    5.4831    5.9240
SVM+EEMD            3.3429    4.1658    4.9797    5.5643
SVM+EMD             4.0247    4.8876    5.4052    5.7174
SVM+BW              3.9657    4.8399    5.2313    5.6709
SVM+MA              4.9363    5.1727    5.4812    5.9226
SVM+WL (coif)       4.7305    5.0366    5.3095    5.7238
SVM+WL (db)         4.4954    4.9185    5.2339    5.6738
SVM+WL (haar)       4.7879    5.0809    5.3981    5.7897
SVM+WL (sym)        4.6448    5.0191    5.2920    5.6932

RMSE                1         3         6         10
SVM                 6.5372    6.8145    7.2414    7.8075
SVM+EEMD            4.2626    5.4280    6.5982    7.4308
SVM+EMD             5.2353    6.4429    7.1392    7.5125
SVM+BW              4.9060    6.3786    7.0458    7.5132
SVM+MA              6.5326    6.8527    7.2329    7.8126
SVM+WL (coif)       6.2993    6.6768    7.0474    7.5750
SVM+WL (db)         6.1011    6.5325    6.9461    7.5079
SVM+WL (haar)       6.3640    6.7306    7.1366    7.6622
SVM+WL (sym)        6.2467    6.6732    7.0141    7.5248

MAPE (%)            1         3         6         10
SVM                 20.5165   21.1137   22.3082   24.9135
SVM+EEMD            13.2832   16.9061   19.8853   23.1071
SVM+EMD             16.4647   19.9511   21.6001   23.6861
SVM+BW              15.9215   19.7618   21.5628   23.6443
SVM+MA              20.1528   20.7521   22.3398   24.0627
SVM+WL (coif)       19.1933   20.0659   21.8115   23.7074
SVM+WL (db)         17.8548   19.4178   21.4558   23.3324
SVM+WL (haar)       19.7106   20.6853   22.2143   23.9624
SVM+WL (sym)        18.4650   19.8714   21.5298   23.5792
Tab.4 Prediction accuracy of models with different forecasting steps ahead at detector 2226 using data collected at the 2 min scale

                    Number of forecasting steps ahead
MAE                 1         3         6         10
SVM                 3.7880    3.9582    4.2660    4.4737
SVM+EEMD            2.5906    3.7137    3.9362    4.2083
SVM+EMD             2.8693    3.7729    4.1184    4.3628
SVM+BW              2.7845    3.6950    4.0875    4.2865
SVM+MA              3.7996    3.9122    4.1218    4.3940
SVM+WL (coif)       3.6087    3.8112    4.1046    4.3293
SVM+WL (db)         3.4042    3.7814    4.0456    4.2393
SVM+WL (haar)       3.6814    3.8415    4.1713    4.3382
SVM+WL (sym)        3.5547    3.7954    4.0997    4.2752

RMSE                1         3         6         10
SVM                 5.0165    5.2432    5.5346    5.9614
SVM+EEMD            3.3451    4.9071    5.2632    5.6414
SVM+EMD             3.6227    4.9884    5.4673    5.8115
SVM+BW              3.5443    4.9014    5.4438    5.6996
SVM+MA              5.0728    5.1654    5.4827    5.8711
SVM+WL (coif)       4.8035    5.0554    5.3478    5.7306
SVM+WL (db)         4.6018    4.9737    5.3196    5.6692
SVM+WL (haar)       4.8737    5.0925    5.4185    5.7791
SVM+WL (sym)        4.7477    5.0407    5.3362    5.6856

MAPE (%)            1         3         6         10
SVM                 24.1492   24.4084   26.5304   30.9011
SVM+EEMD            15.2824   22.7334   23.5716   25.3652
SVM+EMD             16.9260   23.3057   25.2007   26.9255
SVM+BW              16.6631   22.9653   24.9725   26.3595
SVM+MA              23.9832   24.4685   25.2513   28.2187
SVM+WL (coif)       22.9112   23.5801   25.0638   28.4742
SVM+WL (db)         21.6708   23.3266   24.4267   26.7533
SVM+WL (haar)       23.3805   23.6040   25.1685   29.4742
SVM+WL (sym)        21.7190   23.3805   24.8266   27.9453
Tab.5 Prediction accuracy of models with different forecasting steps ahead at detector 326 using data collected at the 10 min scale

                    Number of forecasting steps ahead
MAE                 1         3         6         10
SVM                 13.8049   19.0618   24.4825   31.7322
SVM+EEMD            7.3140    13.8680   21.8568   28.3185
SVM+EMD             9.1723    16.2461   22.6851   29.8341
SVM+BW              8.5782    16.8107   22.1748   29.4345
SVM+MA              14.9569   20.9268   25.9012   31.9110
SVM+WL (coif)       9.7651    17.5981   22.4877   30.1653
SVM+WL (db)         11.9003   16.9395   22.4052   28.7777
SVM+WL (haar)       12.0373   17.6051   23.3159   30.3061
SVM+WL (sym)        9.3383    17.4340   22.4731   29.2683

RMSE                1         3         6         10
SVM                 19.0754   26.6462   33.4137   43.8467
SVM+EEMD            9.8688    19.1051   30.5834   39.4199
SVM+EMD             12.0515   23.5895   31.2964   41.1328
SVM+BW              10.5862   22.6279   30.8842   40.4119
SVM+MA              26.0268   30.0585   36.0689   44.0430
SVM+WL (coif)       16.6833   24.6313   31.6229   41.1202
SVM+WL (db)         12.6008   23.9002   31.3438   39.4578
SVM+WL (haar)       16.9606   24.7332   32.8396   41.6928
SVM+WL (sym)        13.5897   24.3326   31.3569   40.5222

MAPE (%)            1         3         6         10
SVM                 15.4393   18.8050   26.1560   32.1230
SVM+EEMD            8.2598    14.6691   20.9698   27.2771
SVM+EMD             10.9372   16.6407   21.5394   28.4094
SVM+BW              10.0379   16.2357   20.9256   28.0582
SVM+MA              18.1896   20.2386   24.6672   30.6644
SVM+WL (coif)       13.3566   18.1973   21.9714   29.9814
SVM+WL (db)         11.3250   17.3727   21.5360   27.3569
SVM+WL (haar)       16.4802   19.4774   24.0651   30.1323
SVM+WL (sym)        11.3977   18.1455   21.9020   28.7864
Tab.6 Prediction accuracy of models with different forecasting steps ahead at detector 2131 using data collected at the 10 min scale

                    Number of forecasting steps ahead
MAE                 1         3         6         10
SVM                 14.667    18.5921   23.5039   31.5573
SVM+EEMD            8.0301    14.8621   21.3967   28.6443
SVM+EMD             9.8383    16.7417   22.2502   29.6263
SVM+BW              9.3094    16.9624   21.5152   29.5234
SVM+MA              17.3415   20.3785   25.0103   32.5926
SVM+WL (coif)       11.7971   17.6691   22.0628   29.6713
SVM+WL (db)         10.3276   17.4317   21.4455   29.0194
SVM+WL (haar)       12.2949   17.2363   22.2335   29.7842
SVM+WL (sym)        10.8399   17.4601   21.6248   29.3040

RMSE                1         3         6         10
SVM                 20.0182   25.6457   32.208    42.5377
SVM+EEMD            10.6307   20.1839   29.9837   39.1415
SVM+EMD             12.7362   23.3944   30.0881   40.1533
SVM+BW              12.4016   22.7495   29.3605   40.1960
SVM+MA              23.2528   27.5552   34.6196   44.8840
SVM+WL (coif)       16.5823   24.5212   30.4244   40.5440
SVM+WL (db)         15.2325   23.7432   29.7694   39.4520
SVM+WL (haar)       16.7455   25.3329   30.7687   40.8019
SVM+WL (sym)        15.8924   24.1056   29.8211   40.3005

MAPE (%)            1         3         6         10
SVM                 11.7035   15.1186   20.0246   31.3728
SVM+EEMD            6.2677    12.1679   18.0188   28.1627
SVM+EMD             7.6273    13.8038   19.4666   29.0246
SVM+BW              7.5419    13.5409   18.2579   27.9858
SVM+MA              14.1539   17.3363   20.7205   29.4489
SVM+WL (coif)       10.8227   13.9192   18.0945   27.6268
SVM+WL (db)         9.9751    13.2275   16.4317   23.8869
SVM+WL (haar)       11.4255   14.7542   20.9315   33.3593
SVM+WL (sym)        10.0775   13.5175   17.1515   27.3245
Tab.7 Prediction accuracy of models with different forecasting steps ahead at detector 2226 using data collected at the 10 min scale

                    Number of forecasting steps ahead
MAE                 1         3         6         10
SVM                 11.4436   13.8090   16.8410   21.4207
SVM+EEMD            6.4544    11.2399   14.1525   18.1762
SVM+EMD             7.8211    12.6079   15.7807   20.2774
SVM+BW              7.1130    12.4880   15.5501   19.9117
SVM+MA              12.8802   14.6947   17.7845   21.3940
SVM+WL (coif)       9.6550    12.8786   15.8662   19.8646
SVM+WL (db)         7.2366    12.3107   15.1262   19.2908
SVM+WL (haar)       9.8633    12.8857   16.1252   20.1218
SVM+WL (sym)        7.5132    12.5293   15.4599   19.4267

RMSE                1         3         6         10
SVM                 15.4581   18.5172   22.7006   29.2811
SVM+EEMD            8.5914    15.1759   19.5919   25.1199
SVM+EMD             10.1275   16.8942   21.5641   27.6837
SVM+BW              9.8381    16.8552   21.0507   27.0458
SVM+MA              17.7083   19.9951   23.9570   29.0425
SVM+WL (coif)       13.2314   17.4396   21.1547   26.7495
SVM+WL (db)         9.8728    16.8670   20.2398   26.1553
SVM+WL (haar)       13.4148   17.4783   21.9426   27.6361
SVM+WL (sym)        10.3502   17.0497   20.6890   26.6716

MAPE (%)            1         3         6         10
SVM                 11.3011   13.3548   15.6977   20.4478
SVM+EEMD            6.2929    10.9363   13.0618   16.8502
SVM+EMD             7.6256    11.9898   14.7070   18.9567
SVM+BW              7.2058    11.7496   14.5267   18.6041
SVM+MA              12.4019   13.9584   16.6586   20.0099
SVM+WL (coif)       9.3481    12.2567   14.1818   17.7480
SVM+WL (db)         7.5714    11.7359   13.8240   17.4376
SVM+WL (haar)       10.8337   13.4584   15.9753   19.3532
SVM+WL (sym)        7.8030    11.7669   13.8735   17.6561
Tab.8 Prediction accuracy of models with different forecasting steps ahead at detector 326 using data collected at the 60 min scale

                    Number of forecasting steps ahead
MAE                 1          3          6          10
SVM                 78.6184    126.7055   153.6419   166.8656
SVM+EEMD            72.9229    108.6226   140.7653   159.7964
SVM+EMD             77.1749    120.4484   143.8010   163.9694
SVM+BW              76.8419    118.0662   143.2469   161.7125
SVM+MA              170.3653   200.8525   206.3931   222.9255
SVM+WL (coif)       79.0644    124.1349   146.5063   163.6195
SVM+WL (db)         77.1356    118.2544   143.8349   158.6241
SVM+WL (haar)       77.8786    125.5884   148.3324   164.1981
SVM+WL (sym)        78.0745    119.9770   146.0151   161.1752

RMSE                1          3          6          10
SVM                 126.8205   214.0380   251.3380   270.1275
SVM+EEMD            104.7451   171.5261   232.2968   264.5973
SVM+EMD             114.2827   192.6858   235.1041   265.6956
SVM+BW              113.7903   190.1690   233.2064   264.9461
SVM+MA              246.0265   304.5561   323.1152   360.1739
SVM+WL (coif)       117.0353   211.7351   236.5077   267.1956
SVM+WL (db)         115.1633   192.9008   233.7141   265.0664
SVM+WL (haar)       118.4998   213.1091   243.9160   268.4374
SVM+WL (sym)        116.0642   197.9115   236.3454   265.1587

MAPE (%)            1          3          6          10
SVM                 8.2650     12.3762    21.3348    28.1639
SVM+EEMD            7.6743     10.9067    17.9592    25.5950
SVM+EMD             7.9279     11.8809    18.9645    26.9536
SVM+BW              7.8616     11.2284    18.2069    25.7128
SVM+MA              24.3482    27.9925    29.3265    31.8381
SVM+WL (coif)       8.2260     12.1640    20.2791    27.4004
SVM+WL (db)         7.9977     11.4742    19.4495    25.8558
SVM+WL (haar)       8.2525     12.3443    20.4150    26.5563
SVM+WL (sym)        8.2136     11.5411    19.7614    25.9281
Tab.9 Prediction accuracy of models with different forecasting steps ahead in detector 2131 using data collected in 60min scales

                Number of forecasting steps ahead
MAE             1          3          6          10
SVM             79.9396    132.6942   157.7462   172.0310
SVM+EEMD        66.6692    105.1102   143.4680   162.2888
SVM+EMD         75.6785    123.0495   153.3265   170.0273
SVM+BW          72.7606    119.9454   150.8764   167.9641
SVM+MA          172.7458   196.8530   203.8526   227.2240
SVM+WL (coif)   77.1804    130.4270   153.1754   174.8070
SVM+WL (db)     71.3078    125.1753   150.0255   170.7763
SVM+WL (haar)   78.4349    132.2874   156.8144   176.0907
SVM+WL (sym)    76.4314    126.3863   150.2089   171.7480

RMSE            1          3          6          10
SVM             110.7603   190.5324   224.5463   275.3817
SVM+EEMD        90.7533    140.9264   206.6524   259.6080
SVM+EMD         95.2214    173.7182   226.4282   271.0878
SVM+BW          90.2792    172.7329   224.0067   270.6175
SVM+MA          240.3183   298.8602   319.2523   358.9325
SVM+WL (coif)   99.7694    186.5433   223.0684   274.4408
SVM+WL (db)     97.0749    176.6157   221.1879   272.2702
SVM+WL (haar)   106.3307   187.5393   224.3421   274.2345
SVM+WL (sym)    97.0919    177.6909   222.6046   273.8866

MAPE (%)        1          3          6          10
SVM             8.6670     15.2863    19.0172    22.5894
SVM+EEMD        7.9097     12.8051    16.9591    20.1515
SVM+EMD         8.3395     14.3084    17.7458    21.0432
SVM+BW          8.2705     13.7613    17.4396    20.9485
SVM+MA          22.6390    26.8180    27.3167    25.4170
SVM+WL (coif)   8.5185     14.6091    18.2727    21.2877
SVM+WL (db)     8.2046     14.1037    18.2211    20.8494
SVM+WL (haar)   8.5881     15.0304    18.7407    21.7117
SVM+WL (sym)    8.3961     14.4639    18.2361    20.8950
Tab.10 Prediction accuracy of models with different forecasting steps ahead in detector 2226 using data collected in 60min scales

                Number of forecasting steps ahead
MAE             1          3          6          10
SVM             59.5466    93.0577    110.8999   150.1794
SVM+EEMD        55.5644    77.3006    105.5152   133.2997
SVM+EMD         56.0353    85.1986    107.5163   137.4292
SVM+BW          55.9370    82.9727    106.2635   136.8844
SVM+MA          132.7416   153.5822   156.7594   185.3192
SVM+WL (coif)   57.4060    90.8282    108.7263   142.8797
SVM+WL (db)     56.0253    86.6392    106.6023   138.7928
SVM+WL (haar)   57.9446    91.6266    109.3760   144.9353
SVM+WL (sym)    56.3740    88.5662    108.6930   141.3182

RMSE            1          3          6          10
SVM             92.8391    133.7947   164.7112   240.5580
SVM+EEMD        82.7762    103.5603   150.7304   210.9247
SVM+EMD         87.7782    126.6633   159.8880   217.7249
SVM+BW          86.4544    124.0480   158.5663   217.1945
SVM+MA          188.6888   225.8578   237.1474   293.2463
SVM+WL (coif)   87.7722    126.7732   158.7390   224.1979
SVM+WL (db)     85.6195    124.5355   157.2264   217.9930
SVM+WL (haar)   90.1714    129.1956   160.4514   226.9692
SVM+WL (sym)    86.7943    125.8182   158.2081   222.4638

MAPE (%)        1          3          6          10
SVM             11.1851    16.2808    20.6377    26.2752
SVM+EEMD        9.2796     13.3504    17.1922    20.8291
SVM+EMD         10.8166    15.4622    18.7059    23.9024
SVM+BW          10.1936    15.4335    18.5131    22.5709
SVM+MA          22.5083    28.6377    27.6355    28.8242
SVM+WL (coif)   10.3856    15.3729    18.4928    24.3782
SVM+WL (db)     9.5477     14.7880    17.2585    22.4186
SVM+WL (haar)   10.6542    15.8407    19.6127    25.2043
SVM+WL (sym)    9.9095     14.9823    18.0713    22.4315

a Prediction errors of EEMD, EMD, BW and MA under 2min time scale
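For reference, the three accuracy measures reported in Tabs.7-10 (MAE, RMSE and MAPE) follow their standard definitions; a minimal sketch, where the toy volume values are illustrative and not detector data:

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# Hypothetical observed and predicted traffic volumes
obs = [100.0, 120.0, 80.0]
pred = [110.0, 114.0, 84.0]
print(mae(obs, pred))   # mean of |errors| 10, 6, 4 -> 6.666...
print(rmse(obs, pred))
print(mape(obs, pred))
```

Because MAPE divides by the observed value, it can be compared across the 2min, 10min and 60min scales even though the absolute volumes (and hence MAE and RMSE) grow with the sampling interval.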
b Prediction errors of WL with different wavelet types under 2min time scale
c Prediction errors of EEMD, EMD, BW and MA under 10min time scale
d Prediction errors of WL with different wavelet types under 10min time scale
e Prediction errors of EEMD, EMD, BW and MA under 60min time scale
f Prediction errors of WL with different wavelet types under 60min time scale
Fig.6 RMSE distribution of different prediction models under different time scales in detector 326
a Prediction errors of EEMD, EMD, BW and MA under 2min time scale
b Prediction errors of WL with different wavelet types under 2min time scale
c Prediction errors of EEMD, EMD, BW and MA under 10min time scale
d Prediction errors of WL with different wavelet types under 10min time scale
e Prediction errors of EEMD, EMD, BW and MA under 60min time scale
f Prediction errors of WL with different wavelet types under 60min time scale
Fig.7 RMSE distribution of different prediction models under different time scales in detector 2131
a Prediction errors of EEMD, EMD, BW and MA under 2min time scale
b Prediction errors of WL with different wavelet types under 2min time scale
c Prediction errors of EEMD, EMD, BW and MA under 10min time scale
d Prediction errors of WL with different wavelet types under 10min time scale
e Prediction errors of EEMD, EMD, BW and MA under 60min time scale
f Prediction errors of WL with different wavelet types under 60min time scale
Fig.8 RMSE distribution of different prediction models under different time scales in detector 2226
From the prediction results in the tables and figures, several interesting findings can be summarized as follows: (1) The performance of the denoising strategy. The results show that the models combined with a denoising algorithm consistently outperform the model without denoising. The main reason is that sharp fluctuations in the raw data are smoothed by the denoising process, so the underlying variation trend of the data is expressed more clearly. Accordingly, the model becomes more stable during training, and higher prediction accuracy is obtained. (2) The improvement from EEMD is the largest among all the denoising
algorithms. The prediction accuracies of EMD and WL are similar, and both are slightly lower than that of EEMD. The prediction performance of BW is lower than that of these three methods, and MA produces the lowest prediction accuracy. The reason for the high accuracy of EEMD is that it decomposes the original detector data into multiple IMFs, with each IMF capturing a characteristic feature of the original data. The noise-related IMFs show different features from the noise-free IMFs: for instance, they exhibit obviously larger fluctuations, so the gaps between neighboring points are much larger, and their maxima and minima are also significantly larger than those of the noise-free IMFs. Thus, EEMD can easily discriminate noise from the raw data. Compared with EEMD, the EMD smoothing results show that it is inclined to remove some true details of the raw data, which is consistent with previous research [50]. (3) The performance of the WL denoising strategy. Among the four wavelet types, db produces the highest prediction performance; the prediction accuracies of coif and sym are similar and lower than that of db, and the performance of haar is inferior to the other three types. The likely reason is that the db wave shape fits the variation patterns of traffic flow data better. The coif and sym wavelets are well suited to data with symmetric features; because of the periodicity implied in traffic volume, the series exhibits approximately symmetric patterns, so coif and sym can also produce relatively high prediction accuracy. (4) The limitation of the MA model. Although this model has a simple structure and is easy to calculate, it denoises the raw data by averaging surrounding data samples. The MA model achieves prediction results similar to the original SVM model for the data collected at the 2min interval.
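The averaging operation behind the MA scheme can be sketched as follows; the window length here is an illustrative assumption, not the setting used in the paper:

```python
import numpy as np

def moving_average_denoise(x, window=3):
    """Replace each sample with the mean of a centered window of
    neighbors; the window shrinks at the series boundaries. The
    window length is an illustrative choice, not the paper's."""
    x = np.asarray(x, dtype=float)
    half = window // 2
    out = np.empty_like(x)
    for i in range(len(x)):
        lo = max(0, i - half)
        hi = min(len(x), i + half + 1)
        out[i] = x[lo:hi].mean()
    return out

# A spike of 30 vehicles is pulled toward its neighbors, illustrating
# both the smoothing effect and the bias between the denoised series
# and the original data discussed in the text.
noisy = [10.0, 12.0, 30.0, 11.0, 9.0, 13.0]
print(moving_average_denoise(noisy, window=3))
```

Because the smoothed value is a plain average, the gap between the denoised and original series grows with the magnitude of the data, which is consistent with MA performing worst at the larger 10min and 60min volumes.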
With the increase of the data sampling interval, such as 10min, the prediction accuracy of the MA model becomes lower, and for the data collected at the 60min interval the prediction results become even worse. The reason is that, as the sampling interval increases, the traffic volume values also become larger, and this simple averaging-based denoising method then produces a large difference between the denoised data and the original data. This bias can be observed in Fig.5d.
5 Conclusions
This study evaluated the multi-step prediction performance of hybrid models based on the combination of SVM and denoising algorithms, using traffic volume data collected from three detectors in Minnesota. The data were collected from January 1st to December 31st, 2015, at intervals of 2min, 10min and 60min. In the model performance comparison, we first chose five denoising models to evaluate their effect on prediction performance: EMD, EEMD, MA, BW and WL with four wavelet types (coif, db, haar and sym). We then optimized the parameters using 70% of the samples as the training dataset and the remaining 30% as the testing dataset. Finally, we compared the prediction accuracy of the different models on the testing dataset. Several conclusions can be drawn from the results. First, the prediction results of the models combined with a denoising algorithm are superior to those of the model without denoising. Second, EEMD produces the highest prediction accuracy among all the denoising algorithms; the accuracies of EMD and WL are both lower than that of EEMD, and the prediction performance of MA is the lowest. Third, within the WL model, the db wavelet type produces higher prediction accuracy than the other three types. In addition, the MA model can achieve prediction results similar to the original
SVM model for the data collected at short sampling intervals, while its prediction accuracy becomes much lower for data collected at long intervals. Through these applications of traffic volume prediction and performance comparisons, this study provides useful guidance for selecting a proper denoising model in traffic flow prediction. However, the current study did not consider other factors that may influence prediction accuracy, such as weather, road ranks, and spatial correlation between different detectors. Furthermore, fusing multi-source traffic data, such as GPS trajectories, weather information or video data [51-55], is expected to further enhance the prediction accuracy.
Acknowledgements
This research was funded in part by the National Natural Science Foundation of China (No. 71701215), Foundation of Central South University (No. 502045002), Science and Innovation Foundation of the Transportation Department in Hunan Province (No. 201725), Postdoctoral Science Foundation of China (No. 140050005), the National Key Research and Development Program of China: key projects of international scientific and technological innovation cooperation between governments (No. 2016YFE0108000), and Scientific Research Project of Ministry of Housing and Urban-Rural Development (No. 2017-R2-032).
References:
[1] M. Cetin, G. Comert. Short-term traffic flow prediction with regime switching models. Transportation Research Record: Journal of the Transportation Research Board, 1965(2006) 23-31. [2] Y. Yan, G. Li, J. Tang, Z. Guo. A novel approach for operating speed continuous predication based on alignment space comprehensive index. Journal of Advanced Transportation. 2017(2017), ID:9862949, 1-14. [3] Y. Zou, X. Zhu, Y. Zhang, X. Zeng. A space-time diurnal method for short-term freeway travel time prediction. Transportation Research Part C, 43(2014) 33-49. [4] S. Chandra, H. Al-Deek.
Cross-correlation analysis and multivariate prediction of spatial time series of freeway traffic speeds. Transportation Research Record, 2061(2008) 64-76. [5] J. Tang, J. Liang, S. Zhang, H. Huang, F. Liu. Inferring driving trajectories based on probabilistic model from large scale taxi GPS data. Physica A, 506(2018) 566-577. [6] Q. Ye, W. Szeto, S. Wong. Short-term traffic speed forecasting based on data recorded at irregular intervals. Intelligent Transportation Systems, IEEE Transactions on, 13(2012) 1727-1737. [7] K. Chan, T. Dillon, J. Singh, E. Chang. Neural-network-based models for short-term traffic flow forecasting using a hybrid exponential smoothing and Levenberg-Marquardt algorithm. Intelligent Transportation Systems, IEEE Transactions on, 13(2012) 644-654. [8] J. Tang, F. Liu, W. Zhang, R. Ke, Y. Zou. Lane-changes prediction based on adaptive fuzzy neural network. Expert Systems with Applications. 91(2018) 452-463. [9] C. Ma, W. Hao, F. Pan, X. Wang. Road screening and distribution route multi-objective robust optimization for hazardous materials based on neural network and genetic algorithm, PLOS ONE, 13(2018), e0198931, 1-22. [10] L. Rilett, D. Park. Direct forecasting of freeway corridor travel times using spectral basis neural networks. Transportation Research Record, 1752(2001) 140-147. [11] H. Yin, S.C. Wong, J. Xu, C.K. Wong, Urban traffic flow prediction using a fuzzy-neural approach. Transportation Research Part C, 10(2002) 85-98.
[12] Q. Chai, M. Pasquier, B. Lim, A novel fuzzy neural approach to road traffic analysis and prediction. Intelligent Transportation Systems, IEEE Transactions on, 7(2006) 133-146. [13] L. Dimitriou, T. Tsekeris, A. Stathopoulos. Adaptive hybrid fuzzy rule-Based system approach for modeling and predicting urban traffic flow. Transportation Research Part C, 16(2008) 554-573. [14] J. Tang, F. Liu, Y. Zou, W. Zhang, Y. Wang. An improved fuzzy neural network for traffic speed prediction considering periodic characteristic. IEEE Transaction on Intelligent Transportation Systems, 18(2017) 2340-2350. [15] A. Cheng, X. Jiang, Y. Li, C. Zhang, H. Zhu. Multiple sources and multiple measures based
traffic flow prediction using the chaos theory and support vector regression method. Physica A, 466(2016)422-434. [16] C. Wu, J. Ho, D.T. Lee. Travel-time prediction with support vector machine regression. Intelligent Transportation Systems, IEEE Transactions on, 125(2004) 515-523. [17] Y. Zhang, Y. Liu. Traffic forecasting using least squares support vector machines. Transportmetrica. 5(2009) 193-213. [18] Y. Zhang, Y. Xie. Forecasting of short-term freeway volume with v-support vector machines. Transportation Research Record, 2024(2007) 92-99. [19] M. Asif, J. Dauwels, C. Goh, A. Oran, E. Fathi, M. Xu, M. Dhanya. Spatiotemporal patterns in large-scale traffic speed prediction. Intelligent Transportation Systems, IEEE Transactions on. 15(2014) 794-804. [20] H. Jiang, Y. Zou, S. Zhang, J. Tang, Y. Wang. Short-term speed prediction using remote microwave sensor data: machine learning versus statistical model. Mathematical Problems in Engineering. 2016(2016), ID 9236156. [21] Y. Zhao, Y. Liu, L. Shan, B. Zhou. Dynamic Analysis of Kalman Filter for Traffic Flow Forecasting in Sensor nets. Information Technology Journal, 11(2012) 1508-1512. [22] H. Chen, S. Grant-Muller. Use of sequential learning for short-term traffic flow forecasting. Transportation Research Part C: Emerging Technologies, 9(2001) 319-336. [23] S. Chien, C. Kuchipudi. Dynamic travel time prediction with real-time and historic data. Journal of Transportation Engineering, 129(2003) 608-616. [24] J.W.C. Van Lint. Online learning solutions for freeway travel time prediction. Intelligent Transportation Systems, IEEE Transactions on. 9(2008) 38-47. [25] Y. Wang, M. Papageorgiou, A. Messmer. RENASSANCE: a unified macroscopic model-based approach to real-time freeway network traffic surveillance. Transportation Research Part C: Emerging Technologies. 14(2006) 190-212. [26] L. Dimitriou, T. Tsekeris, A. Stathopoulos. 
Adaptive hybrid fuzzy rule-based system approach for modeling and predicting urban traffic flow. Transportation Research Part C, 16(2008) 554-573. [27] J. Tang, G. Zhang, Y. Wang, H. Wang, F. Liu. A hybrid approach to integrate fuzzy c-means based imputation method with genetic algorithm for missing traffic volume data estimation. Transportation Research Part C: Emerging Technologies, 51(2015) 29-40. [28] K. Hamad, M.T. Shourijeh, E. Lee, A. Faghri. Near-term travel speed prediction utilizing Hilbert–Huang transform. Computer-Aided Civil and Infrastructure Engineering, 24(2009) 551-576. [29] J. Tang, Y. Yang, Y. Qi. A hybrid algorithm for urban transit schedule optimization. Physica A, 512(2018), 745-755. [30] Y. Yan, S. Zhang, J. Tang, X. Wang. Understanding characteristics in multivariate traffic flow time series from complex network structure. Physica A, 477(2017) 149-160.
[31] Y. Xie, Y. Zhang, Z. Ye. Short-term traffic volume forecasting using Kalman filter with discrete wavelet decomposition. Computer-Aided Civil & Infrastructure Engineering, 22(2007) 326-334. [32] M. Tan, Y. Li, J. Xu. A hybrid ARIMA and SVM model for traffic flow prediction based on wavelet denoising. Journal of Highway & Transportation Research & Development, 26(2009) 127-131. [33] X. Jiang, H. Adeli. Wavelet packet-autocorrelation function method for traffic flow pattern analysis. Computer-Aided Civil and Infrastructure Engineering, 19(2010) 324-337. [34] B. Lu, M. Huang. Traffic Flow Prediction Based on Wavelet Analysis, Genetic Algorithm and Artificial Neural Network. International Conference on Information Engineering and Computer Science (2009) 1-4. [35] D. Boto-Giralda, F. Díaz-Pernas, D. González-Ortega, J. Díez-Higuera, M. Antón-Rodríguez. Wavelet-based denoising for traffic volume time series forecasting with self-organizing neural networks. Computer-Aided Civil and Infrastructure Engineering, 25(2010) 530-545. [36] S. Dunne, B. Ghosh. Weather adaptive traffic prediction using neurowavelet models. IEEE Transactions on Intelligent Transportation Systems, 14(2013) 370-379. [37] H. Xiao, H. Sun, B. Ran. Fuzzy-neural network traffic prediction framework with wavelet decomposition. Transportation Research Record, 1836(2003). [38] J. Nunes, Y. Bouaoune, E. Delechelle, O. Niang. Image analysis by bidimensional empirical mode decomposition. Image & Vision Computing, 21(2003) 1019-1026. [39] Y. Lin, L. Peng. Combined model based on EMD-SVM for short-term wind power prediction. Proceedings of the CSEE, 31(2011) 102-108. [40] B. Liu, S. Riemenschneider, Y. Xu. Gearbox fault diagnosis using empirical mode decomposition and Hilbert spectrum. Mechanical Systems and Signal Processing, 20(2006) 718-734. [41] W. Tong, M. Zhang, Q. Yu, H. Zhang. Comparing the applications of EMD and EEMD on time-frequency analysis of seismic signal.
Journal of Applied Geophysics, 83(2012) 29-34. [42] Y. Kopsinis, S. McLaughlin. Improved EMD using doubly-iterative sifting and high order spline interpolation. Journal on Advances in Signal Processing, 2008(2008) 120. [43] R. Baxley. Exponentially weighted moving average control schemes: properties and enhancements discussion. Technometrics, 32(1990) 13-16. [44] X. Chen, Z. Li, Y. Wang. Highway elevation data smoothing using local enhancement mechanism and Butterworth filter. International Journal of Innovative Computing Information and Control, 13(2017) 1887-1901. [45] R.G. Mello, L.F. Oliveira, J. Nadal. Digital Butterworth filter for subtracting noise from low magnitude surface electromyogram. Computer Methods and Programs in Biomedicine, 87(2007) 28-35. [46] G. He, S. Ma, Y. Li. Study on the short-term forecasting for traffic flow based on wavelet analysis. Systems Engineering Theory and Practice, 12(2002) 101-106. [47] D. Huang. Wavelet analysis in a traffic model. Physica A, 329(2003) 298-308. [48] C. Cortes, V. Vapnik. Support vector networks. Machine Learning, 20(1995) 273-297. [49] Transportation Data Research Laboratory (TDRL) at the University of Minnesota Duluth: http://www.d.umn.edu/tdrl/ [50] Z. Wu, N. Huang. Ensemble empirical mode decomposition: a noise-assisted data analysis method. Advances in Adaptive Data Analysis, 1(2009) 1-41.
[51] J. Tang, S. Zhang, X. Chen, F. Liu, Y. Zou. Taxi trips distribution modeling based on entropy-maximizing theory: a case study in Harbin city - China. Physica A, 493(2018) 430-443. [52] C. Ma, R. He. Green wave traffic control system optimization based on adaptive genetic-artificial
fish swarm algorithm, Neural Computing and Applications, 26(2015), 1-11. [53] S. Zhang, X. Liu, J. Tang, S. Cheng, Y. Qi, Y. Wang. Spatio-temporal modeling of destination choice behavior through the Bayesian hierarchical approach. Physica A, 512(2018), 537-551. [54] H. Liu, W. Ma. Virtual vehicle probe model for time-dependent travel time estimation on signalized arterials. Transportation Research Part C Emerging Technologies, 17(2009), 11-26. [55] A. Bhaskar, T. Tsubota, M. Le. Urban traffic state estimation: Fusing point and zone based data. Transportation Research Part C, 48(2014), 120-142.