Adaptive tracking control for nonlinear heterogeneous multi-agent systems with unknown dynamics



This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS


Adaptive Multi-Kernel SVM With Spatial–Temporal Correlation for Short-Term Traffic Flow Prediction

Xinxin Feng, Member, IEEE, Xianyao Ling, Haifeng Zheng, Member, IEEE, Zhonghui Chen, and Yiwen Xu

Abstract— Accurate estimation of the traffic state can help to address urban traffic congestion by providing guidance for travelers and for traffic regulation. In this paper, we propose a novel short-term traffic flow prediction algorithm based on an adaptive multi-kernel support vector machine (AMSVM) with spatial–temporal correlation, named AMSVM-STC. First, we explore both the nonlinearity and the randomness of the traffic flow, and hybridize a Gaussian kernel and a polynomial kernel to constitute the AMSVM. Second, we optimize the parameters of the AMSVM with an adaptive particle swarm optimization (APSO) algorithm, and propose a novel method that makes the hybrid kernel's weight adjust adaptively according to the change tendency of the real-time traffic flow. Third, we incorporate spatial–temporal correlation information into the AMSVM to predict the short-term traffic flow. We evaluate our algorithm through thorough experiments on real data sets. The results demonstrate that our algorithm can make timely and adaptive predictions even in rush hours, when the traffic conditions change rapidly. In addition, the proposed AMSVM-STC outperforms the existing methods.

Index Terms— Short-term traffic flow prediction, adaptive multi-kernel support vector machine, adaptive particle swarm optimization, spatial–temporal correlation.

I. INTRODUCTION

Vehicles bring convenience to citizens. At the same time, their continual growth inevitably leads to environmental pollution, waste of resources, and traffic congestion. How to effectively relieve urban traffic congestion bottlenecks has become a major issue faced by most big cities [2]. Accurate estimation of the traffic state can provide guidance for citizens' travel and for traffic regulation. For example, short-term traffic flow predictions can be provided to drivers in real time to give them a realistic estimate of travel conditions, expected delays, and alternative routes to their destinations [3]. It is believed that providing drivers with this information can help alleviate

Manuscript received May 1, 2017; revised November 1, 2017, January 29, 2018, and April 22, 2018; accepted July 2, 2018. This work was supported in part by NSF China under Grant 61601126, Grant 61571129, and Grant U1405251, and in part by the Foundation of Fujian Province under Grant 2016J01299. The work of X. Feng was supported by the China Scholarship Council. This paper was presented in part at the IEEE CEC 2017, Spain, June 2017 [1]. The Associate Editor for this paper was H. Van Lint. (Corresponding author: Haifeng Zheng.) The authors are with the College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, China (e-mail: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TITS.2018.2854913

traffic congestion and enhance the performance of the entire driver-vehicle-road network.

So far, several traffic flow forecasting methods have been proposed, such as time series [4], [5], Kalman filters [6], [7], neural networks [8], [9], chaos theory [10], and support vector machines (SVM) [11], [12]. Among these methods, SVM has advantages in forecasting short-term traffic flow and traffic state in real time, because it has good self-learning and nonlinear prediction ability. Besides, it can achieve good prediction accuracy with limited training samples, which suits a real-time prediction system.

In this paper, we concentrate on improving the structure of the SVM to make it more reliable in short-term traffic flow prediction. To this end, we propose the AMSVM, which has an adaptive multi-kernel function. It not only inherits the features of the traditional SVM, but also synthesizes the advantages of different kernel functions, which makes it more suitable for complex and time-varying traffic flow. In addition, we incorporate spatial-temporal information into the AMSVM to further improve the prediction performance. The main contributions of this paper are as follows.

• We explore both the nonlinearity and the randomness of traffic flow, and hybridize a Gaussian kernel and a polynomial kernel with different weights to constitute the AMSVM. Then we propose the APSO algorithm to optimize the parameters of the AMSVM, and in particular propose a novel method that makes the hybrid kernel's weight adjust adaptively according to the real-time traffic flow.
• We incorporate spatial-temporal correlation information into the AMSVM, which we name AMSVM-STC, to predict the short-term traffic flow. AMSVM-STC fuses the spatial-temporal correlation predicted values with different weights.
• We conduct thorough experiments on large real datasets with different characteristics, including highway and urban datasets. The results demonstrate the advantages of our method compared with existing methods.

The rest of this paper is organized as follows. In Section II, we review the related work. In Section III, we provide the system model and introduce the fundamental concept of our work. In Section IV, we study the formulation and optimization of the AMSVM in the traffic system, and describe the complete prediction algorithm. In Section V, we analyze the temporal correlation and spatial correlation in traffic flow prediction, and describe the fusion mechanism of AMSVM-STC.

1524-9050 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


In Section VI, we discuss the experimental results and analyze the prediction performance of our algorithm. In Section VII, we draw the conclusion of our work.

II. RELATED WORK

Several existing studies address traffic flow prediction. Hu et al. [13] took account of historical and real-time data when forecasting the short-term traffic flow with a PSO-SVR model. Yang and Lu [14] used a combined wavelet-SVM model to predict the short-term traffic flow of an actual expressway. Li et al. [15] applied an SVR model with a Gauss loss function (Gauss-SVR) to forecast urban traffic flow and proposed a Chaotic Cloud Particle Swarm Optimization algorithm to optimize the parameters of the Gauss-SVR model. Wang and Shi [16] proposed a traffic speed forecasting model using chaos-wavelet analysis and SVM to choose the appropriate kernel function. These methods used a single SVM kernel function when predicting the traffic flow.

Some approaches have studied SVM with multiple kernel functions. Ouyang et al. [17] proposed a traffic state prediction method based on a multi-kernel support vector machine. Their main idea was to use a linear kernel to map the linear portion of the historical traffic flow data and a nonlinear kernel to map the residual portion, and to add the two results as the final predicted result. Kong et al. [18] proposed a method that trains on the same sample data of historical traffic flow with a linear kernel, a polynomial kernel, and a Gaussian kernel separately, predicts the traffic state on a future day, and selects the best of the three predicted results as the final predicted result. These approaches, however, still essentially used one kind of kernel function at a time, rather than multiple kernel functions simultaneously, to forecast the traffic flow.

The above methods mainly considered the temporal variation, which is insufficient for accurate forecasting in scenarios with multiple locations. To mine more information, numerous approaches incorporated the spatial characteristics of traffic flow. Xu et al. [19] presented a spatio-temporal variable selection method based on an SVR model to predict traffic flow, in which the spatial and temporal information of all available road segments was utilized. Pan et al. [20] considered the spatial-temporal correlation of traffic flow in a stochastic cell transmission model (SCTM) framework, which can support short-term traffic state prediction. Tan et al. [21] proposed a short-term traffic flow prediction approach based on dynamic tensor completion (DTC), which was able to capture more information about traffic flow, such as temporal variabilities, spatial characteristics, and multimode periodicity. Lv et al. [22] proposed a deep-learning-based traffic flow prediction method with big data, in which a stacked autoencoder (SAE) model was used to represent the spatial-temporal features of traffic flow for prediction. Wu and Tan [23] employed a hybrid deep learning framework combining a CNN and LSTMs to forecast future traffic flow, in which the CNN was exploited to capture spatial features, and two LSTMs were utilized to mine the short-term variability and periodicities of traffic flow. Zhang et al. [24] proposed deep spatio-temporal residual networks (ST-ResNet) to predict citywide crowd flows. They employed the residual neural network framework to model the temporal closeness, period

Fig. 1. System model.

and trend properties of crowd traffic, and designed a branch of residual convolutional units for each property to model the spatial properties of crowd traffic, then dynamically aggregated the outputs of the three residual neural networks with different weights. Compared with prediction methods that only incorporated the temporal variation, these spatial-temporal approaches usually achieved better performance. On the other hand, the successes of these deep learning approaches came at the cost of huge amounts of data, long training times, and large computing resources.

Different from the above work, we explore different features of large real datasets, such as the change tendency of the real-time traffic flow and the traffic flow periodicity at different time scales, such as day by day and week by week. We also take account of the spatial correlation and select the correlative spatial points by analyzing the relation between the past and current traffic flow of the spatial points and the point of interest (POI). Compared with the related work, the proposed AMSVM-STC can effectively improve the accuracy and efficiency of traffic flow prediction, and predict short-term traffic flow and traffic state in a timely, stable, and adaptive manner.

III. SYSTEM MODEL AND FUNDAMENTAL CONCEPT

In this section, we first provide the system model, and then introduce the fundamental concept of our work.

A. System Model

Our purpose is to provide citizens with near-future traffic states in time. We achieve this by proposing an accurate short-term traffic flow prediction algorithm that fully mines the traffic information from time, day, week, and location.¹ The system model is shown in Fig. 1. The general process is as follows. Suppose that we predict the traffic flow of the POI shown in Fig. 1. We collect the traffic volume of the POI through roadside units (RSU), which transmit the data to the traffic information center.
Then we consider both the change tendency of the real-time traffic flow (defined as the real-time data) and the day-by-day periodicity of the traffic flow within a week (defined as the near historical data) to obtain one part of the predicted result. In addition, we explore the traffic flow at the same location on the same day of several previous weeks (defined as the distant historical data) and at the correlative spatial points on the same day (defined as the spatial data) to get the other parts of the predicted result. Finally, we incorporate all the parts to output the final predicted result of the short-term traffic flow.

¹ Although the traffic volume does not uniquely define a traffic state, it is a main reference index, which provides intuitive information about future traffic conditions. For a more detailed traffic state evaluation, see [18], [25].

B. Fundamental Concept

The key module of our algorithm is the SVM. SVM is a statistical learning method for classification and regression proposed by Vapnik in 1995 [26]. When used for regression forecasting, SVM has the advantage of avoiding local optima compared with other nonlinear prediction models. Table I shows the key notations used in this paper.

TABLE I. KEY NOTATIONS

We suppose that there are $L$ samples $\{(x_i, y_i) \mid x_i \in R^v, y_i \in R\}_{i=1}^{L}$, where $x_i$ is a $v$-dimensional real input vector and $y_i$ is the real output value corresponding to $x_i$. The basic idea of SVM is to map the sample vectors to an $N$-dimensional feature space by a kernel function, and then to construct the optimal decision-making function in the feature space, which can be transformed into the following programming problem.

$$\begin{cases} \min \ \dfrac{1}{2}\|\omega\|^2 + C \sum_{i=1}^{L} (\xi_i + \xi_i^*) \\ \text{s.t.} \ \begin{cases} y_i - f(x_i) \le \varepsilon + \xi_i \\ f(x_i) - y_i \le \varepsilon + \xi_i^* \\ \xi_i, \xi_i^* \ge 0, \quad i = 1, 2, \cdots, L, \end{cases} \end{cases} \tag{1}$$

where $\omega = (\omega_1, \omega_2, \omega_3, \cdots, \omega_N)^T$ is a linear weight vector, $C$ is the penalty coefficient, which determines the robustness of the regression model, $\varepsilon$ is the insensitive loss coefficient, which determines the number of support vectors, and $\xi_i$ and $\xi_i^*$ are two non-negative slack variables, which allow the sample vectors to deviate slightly from the hyperplane so that the programming problem is solved over a larger feasible margin. Problem (1) can be solved through its dual problem by introducing Lagrange coefficients, which yields the optimal decision-making function

$$f(x) = \sum_{i=1}^{L} (\alpha_i - \alpha_i^*) K(x, x_i) + b, \tag{2}$$

where $\alpha_i$ and $\alpha_i^*$ are Lagrange coefficients, which help solve for the extremum of the objective function by integrating the constraints in (1). In addition, $b \in R$ is the bias, and $K(x, x_i)$ is the kernel function, which satisfies

$$K(x, x_i) = \langle \Phi(x), \Phi(x_i) \rangle, \tag{3}$$

in which $\Phi(\cdot)$ is the mapping function into the feature space. The kernel function is essentially used to replace the inner product operation between two sample points in the high-dimensional space.
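To make the decision function (2) concrete, the following minimal Python sketch evaluates $f(x)$ for given coefficients and an arbitrary kernel, together with the ε-insensitive deviation implied by the constraints in (1). The function names, the linear stand-in kernel, and the toy coefficients are illustrative assumptions and do not come from the paper; in practice the coefficients $\alpha_i$, $\alpha_i^*$ and $b$ would be obtained by solving the dual problem.

```python
def linear_kernel(x, z):
    """Plain inner product, used here only as a stand-in for K(x, x_i)."""
    return sum(a * b for a, b in zip(x, z))


def decision_function(x, support_vectors, alpha, alpha_star, b, kernel):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i^*) K(x, x_i) + b, as in (2)."""
    return sum(
        (a - a_s) * kernel(x, sv)
        for sv, a, a_s in zip(support_vectors, alpha, alpha_star)
    ) + b


def eps_insensitive_loss(y, f_x, eps):
    """Deviation beyond the eps-tube; zero whenever |y - f(x)| <= eps."""
    return max(0.0, abs(y - f_x) - eps)


# Hand-picked toy coefficients (hypothetical; a real model gets them from the dual problem).
support_vectors = [(1.0, 0.0), (0.0, 1.0)]
alpha = [0.5, 0.0]
alpha_star = [0.0, 0.25]
b = 0.1

f_val = decision_function((2.0, 4.0), support_vectors, alpha, alpha_star, b, linear_kernel)
loss_inside = eps_insensitive_loss(0.2, f_val, eps=0.2)   # target inside the tube
loss_outside = eps_insensitive_loss(1.0, f_val, eps=0.2)  # target outside the tube
```

Passing the kernel as an argument keeps the sketch independent of the particular kernel choice, so any kernel that satisfies (3) can be substituted.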

IV. AMSVM IN TRAFFIC FLOW PREDICTION

In this section, we study how to formulate an AMSVM to predict the traffic flow at a POI based on the near historical traffic data and the real-time data.

A. AMSVM Formulation

The choice of kernel function depends on the distribution of the sample data and the relationship between the sample data and the predicted variables [27]. Since different feature spaces have different data distributions, the performance of an SVM depends largely on the choice of kernel function. Kernel functions can be divided into local and global types: a local kernel function has good learning ability but weak generalization ability, while a global kernel function has good generalization ability but weak learning ability [28]. The Gaussian radial basis kernel function (RBF) and the polynomial kernel function (Poly) are typical local and global kernel functions, respectively, and are therefore often used for traffic flow prediction [17], [18]. Their formulas are

$$K(x, x_i) = [\gamma (x \cdot x_i) + 1]^q, \tag{4}$$

$$K(x, x_i) = \exp\left(-\gamma \|x - x_i\|^2\right), \tag{5}$$

in which (4) is Poly and (5) is RBF, $\gamma$ is the kernel parameter, which implicitly determines the distribution of the data after being mapped to the new feature space, and $q$ is the power parameter.

The traffic flow varies in complex modes. For example, the mean traffic volume is usually a slowly varying nonlinear process over a whole day. Meanwhile, random oscillations exist due to unexpected factors, such as traffic jams or traffic accidents. To adapt to the nonlinearity and randomness of traffic flow, and to improve the accuracy of our algorithm, we set a hybrid kernel function that combines (4) and (5) with adaptive weights:

$$K(x, x_i) = \beta \cdot \exp\left(-\gamma \|x - x_i\|^2\right) + (1 - \beta) \cdot [\gamma (x \cdot x_i) + 1]^q, \tag{6}$$


where $\beta \in [0, 1]$ is the weight coefficient of the hybrid kernel function. To make full use of the advantages of both kernels when predicting the traffic flow, we make the weight adjust adaptively according to the change tendency, or slope, of the real-time traffic data:

$$\beta = 1 - \frac{1}{e^{|k|}}, \tag{7}$$

where $k = \frac{y_{i-1} - y_{i-2}}{x_{i-1} - x_{i-2}}$ is the slope between the previous two values of traffic volume. When $|k|$ decreases, the traffic flow curve becomes smoother, and we should increase the global generalization ability of the kernel function, that is, increase the weight of the polynomial kernel function by decreasing $\beta$. When $|k|$ increases, the curve becomes sharper, and we should increase the local learning ability of the kernel function, that is, increase the weight of the Gaussian kernel function by increasing $\beta$.

B. AMSVM Optimization

The parameters of the AMSVM have a significant influence on the predicted traffic flow, so we propose the APSO algorithm to optimize them. The undetermined parameters are the penalty coefficient $C$, the insensitive loss coefficient $\varepsilon$, and the kernel parameter $\gamma$. The parameter $q$ is generally set to 2, which means combining the quadratic polynomial kernel with the Gaussian kernel function.

PSO is a swarm intelligence optimization algorithm, proposed by Eberhart and Kennedy [29], that simulates the predatory behavior of birds. The algorithm regards each potential solution to the problem as a particle in the search space. When training the AMSVM, we set the particle's current position vector to the current values of the undetermined parameters, $\chi_l = (C, \varepsilon, \gamma)$, so that the global optimal position found by the optimization process gives the optimal parameters of the AMSVM. The current speed vector $v_l = (v^c, v^\varepsilon, v^\gamma)$ determines the direction and distance of the particle's motion. The PSO algorithm searches for the global optimal solution through collaboration and competition among the particles [30]. Its location updating process is shown in Fig. 2.

Fig. 2. Location update of particle.

The updating rule of a particle is as follows.

$$v_l(t+1) = \theta v_l(t) + c_1 r_1 (p_l - \chi_l(t)) + c_2 r_2 (p_g - \chi_l(t)), \tag{8}$$

$$\chi_l(t+1) = \chi_l(t) + v_l(t+1), \tag{9}$$

where $p_l = (p_l^c, p_l^\varepsilon, p_l^\gamma)$ is the local optimal value of the current particle, $p_g = (p_g^c, p_g^\varepsilon, p_g^\gamma)$ is the global optimal value of the entire particle swarm, $\theta$ is the inertia weight, $c_1$ and $c_2$ are two positive constants that represent the local learning factor and the global learning factor, and $r_1$ and $r_2$ are two random numbers uniformly distributed in the interval [0, 1]. As shown in Fig. 2, the particle's component flight paths are 1, 2, and 3, which represent the influence of the current speed, self-learning, and social learning on the particle, respectively. The combined effect of these flight paths is path 4.

To follow the changes of traffic flow and to ensure that the PSO algorithm converges rapidly and accurately, we propose a novel APSO algorithm in this paper. The main idea of APSO is to adopt an adaptive inertia weight, and to let the learning factors and the flight time factor adjust dynamically with the inertia weight. We use the mean square error as the fitness function to evaluate the evolution degree of the particles, which is

$$E = \frac{1}{L} \sum_{i=1}^{L} (Y_i - Y_i^*)^2, \tag{10}$$

where $L$ is the number of samples, $Y_i$ is the actual value of the sample data, and $Y_i^*$ is the predicted value for the sample.

The inertia weight $\theta$ in (8) determines the performance of the PSO algorithm. When $\theta$ is large, the algorithm converges rapidly by improving the global search ability, as the red path shows, and it also helps avoid premature convergence. When $\theta$ is small, it benefits the local search ability, as the blue path shows, and improves the convergence accuracy. Wang and Li [31] used a linearly decreasing inertia weight, which balances global and local search ability to some extent. However, the inertia weight then decreases continually as the evolution proceeds, and the particle may even lose the effect of inertia, which degrades the local search ability as well. To avoid this degradation, we formulate the adaptive inertia weight as follows.

$$\theta_l = \begin{cases} \theta_{min} + \dfrac{(\theta_{max} - \theta_{min})(E_l - E_{min})}{E_{avg} - E_{min}}, & E_l \le E_{avg} \\ \theta_{max} - \dfrac{(\theta_{max} - \theta_{min})(E_l - E_{avg})}{E_l - E_{min}}, & E_l > E_{avg}, \end{cases} \tag{11}$$

where the inertia weight $\theta_l$ lies within $[\theta_{min}, \theta_{max}]$, $E_l$ is the fitness value of the current particle, and $E_{avg}$ and $E_{min}$ are the average and minimum fitness values of all current particles, respectively. For each particle, the inertia weight $\theta_l$ changes adaptively according to the fitness value during the evolutionary process. In the early stages of evolution, when $E_l > E_{avg}$, we decrease $\theta_l$ to a small extent to prevent the particle from separating from the swarm, while it still retains a strong global search ability. In the later stages of evolution, when $E_l \le E_{avg}$, we increase $\theta_l$ to a small extent


so as to keep the effect of inertia and ensure the particle's final local search ability.

To make the particle swarm find the global optimal solution with higher probability, and to further improve its efficiency, we propose a strategy in which the learning factors are adjusted dynamically according to the inertia weight. In the particle updating process of the traditional PSO algorithm, the learning factors $c_1$ and $c_2$ are usually set within the interval (0, 4); for example, they are set to 2 in [32]. In our work, we instead allow the learning factors to adjust to the different evolutionary stages of the particles, which strengthens the effect of the adaptive inertia weight. The formula is as follows.

$$\begin{cases} c_1 = c_{1max} - (c_{1max} - c_{1min})(1 - \theta_l^2) \\ c_2 = c_{2min} + (c_{2max} - c_{2min})(1 - \theta_l^2), \end{cases} \tag{12}$$

where $c_1$ is a decreasing function of $\theta_l$ with values within $[c_{1min}, c_{1max}]$, and $c_2$ is an increasing function of $\theta_l$ with values within $[c_{2min}, c_{2max}]$. In the early stages of evolution, $c_1$ is relatively large, so the particles have strong self-learning ability (local search ability). We ensure their global search ability and speed up the whole search process by decreasing $c_1$. In the later stages of evolution, $c_2$ is relatively large, so the particles have strong social-learning ability (global search ability), which assists $\theta_l$ in strengthening the particle's final local search ability through information sharing between groups. The particles can thus converge to the global optimal solution with higher accuracy and efficiency.

In addition, to further improve the evolutionary efficiency of the particle swarm, we introduce the concept of a flight time factor [33], which improves the position updating process of the particles. The position update (9) becomes

$$\chi_l(t+1) = \chi_l(t) + T v_l(t+1), \tag{13}$$

where $T$ is the flight time factor, a linearly decreasing function of $\theta_l$ with values in the range [0.5, 1], so that it changes with the inertia weight in real time. Consistent with the inertia weight and the learning factors, $T$ is large in the early stages of evolution, which accelerates the global search process. $T$ decreases as the evolution proceeds, which benefits the local search and helps the particles converge to the global optimal solution as accurately as possible.

C. Traffic Flow Prediction Algorithm of AMSVM

We assume that a whole day can be uniformly divided into several periods. The traffic flow of the POI in a certain period is mainly related to the near historical average value of the corresponding period and the real-time values of the previous periods. We now describe the training module and the prediction module in Fig. 3 in detail. To accurately reflect the impact of traffic flow variation on the predicted value, we take the near historical average value and the real-time value of the road's traffic volume as the input variables of the AMSVM when building the model. Moreover, we define the backtracking factor $N$, which represents using the data of the

Fig. 3. Prediction algorithm of AMSVM.

previous $N$ periods to predict the traffic flow of the next period. We input the average traffic flow data to the training module and use the APSO algorithm to optimize the parameters of the AMSVM, then input the real-time traffic flow data of the previous $N$ periods to the prediction module and predict the next period's traffic flow under the optimal parameters. We call the result the AMSVM's autoregressive predicted value, and regard the POI as the detection point in this paper. Fig. 3 shows the traffic flow prediction algorithm of AMSVM. The specific steps are as follows.

step1 Determine the detection point to be forecasted and mark it as $I$.
step2 Initialize the environmental parameters of APSO.
step3 Regard the undetermined parameters as the position vector of a particle, and initialize the particle's speed and position.
step4 Input the sample data of the near historical average traffic flow of point $I$, and complete the mapping and regression by AMSVM.
step5 Calculate the fitness value of the particle according to (10).
step6 Update the speed and position of the particle according to (8), (11), (12) and (13).
step7 For each particle, repeat step2 to step5.
step8 Iterate step3 to step6 until the error or fitness value of the training sample is within the accuracy limit, then output the optimal parameters $C$, $\varepsilon$ and $\gamma$ to the prediction module.
step9 Take the traffic flow at the current moment and in the previous $N$ periods as the input variables, $X_I = \{P_I(t), P_I(t-1), \cdots, P_I(t-N)\}$, calculate $\beta$ according to (7), train the AMSVM under the optimal parameters, and predict the traffic flow in the next period.
step10 Output the predicted value $P_I^A(t+1)$.

V. SPATIAL-TEMPORAL CORRELATION IN TRAFFIC FLOW PREDICTION

In fact, the traffic flow of the POI in the next period is related not only to the near historical data and real-time data, but also to the distant historical data and spatial data. By exploring them, the information from time, day, week and location can be fully used. In this section, we analyze the temporal correlation and spatial correlation in traffic flow prediction, then describe


how to fuse the different predicted values, and then provide the complete prediction algorithm of AMSVM-STC.

A. Temporal Correlation in Traffic Flow Prediction

We first obtain the traffic data of the current point $I$, and then calculate the correlation of traffic flow between the current point and its distant history on the same day of the previous $h$ weeks according to the Pearson correlation coefficient, which is

$$R_{m^T} = \frac{\mathrm{Cov}(X_I, X_{m^T})}{\sigma_I \sigma_{m^T}}, \quad m^T = 1, 2, \cdots, h, \tag{14}$$

where $\sigma_I$ and $\sigma_{m^T}$ are the standard deviations of $X_I$ and $X_{m^T}$, respectively, which can be calculated from the corresponding sample data. Then we get the historical traffic flow in the next period of the $m^T$-th previous week, and define it as $P_{m^T}(t+1)$. Finally, the predicted value of the current point in the next period is calculated by statistically averaging these distant historical data of the same period, weighted by their correlation coefficients, which is

$$P_I^T(t+1) = \frac{1}{h} \sum_{m^T=1}^{h} R_{m^T} \cdot P_{m^T}(t+1). \tag{15}$$
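As an illustration of (14) and (15), the following Python sketch computes the Pearson coefficient between the current day's series and each previous week's series, and then forms the correlation-weighted average of the next-period values. The function names and the toy traffic volumes are hypothetical and are not taken from the paper; the sketch also assumes that no series is constant, since the coefficient is undefined for zero variance.

```python
import math


def pearson(xs, ys):
    """Pearson correlation Cov(X, Y) / (sigma_X * sigma_Y), as in (14)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # The 1/n factors of covariance and variances cancel in the ratio.
    return cov / math.sqrt(var_x * var_y)


def temporal_prediction(current_day, previous_weeks, next_period_values):
    """Correlation-weighted average of the next-period flows, as in (15)."""
    h = len(previous_weeks)
    coefficients = [pearson(current_day, week) for week in previous_weeks]
    return sum(r * p for r, p in zip(coefficients, next_period_values)) / h


# Hypothetical toy volumes (vehicles per period), not data from the paper.
current_day = [10.0, 20.0, 30.0, 40.0]
previous_weeks = [
    [11.0, 21.0, 31.0, 41.0],  # same shape as current_day, shifted by 1
    [12.0, 22.0, 32.0, 42.0],  # same shape as current_day, shifted by 2
]
next_period_values = [50.0, 54.0]  # P_{m^T}(t+1) for each previous week

p_temporal = temporal_prediction(current_day, previous_weeks, next_period_values)
```

Because both toy weeks are perfectly correlated with the current day, the result reduces to the plain mean of the two next-period values.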

B. Spatial Correlation in Traffic Flow Prediction

We obtain the spatial correlation by analyzing the traffic flow at the current point and its adjacent areas on the same day of the previous week. Without loss of generality, we consider the current point and its upstream points [34]. We apply the hierarchical clustering algorithm [35], under the condition that there is an overpass or intersection between every two clusters, and select one correlative spatial point from each cluster. The effect of each selected correlative point on the current point is evaluated by calculating the Pearson correlation at certain times, which is

$$R_{n^S} = \frac{\mathrm{Cov}(X_I, X_{n^S})}{\sigma_I \sigma_{n^S}}, \quad n^S = 1, 2, \cdots, r, \tag{16}$$

where $\sigma_{n^S}$ is the standard deviation of $X_{n^S}$ and $r$ is the number of correlative points. In addition, to further reduce the uncertainty caused by random factors, we adopt the "time-interval" correlation, which is described in detail in Section VI, Subsection C. Then we obtain the traffic flow of point $n^S$ at time $t + \tau_{n^S}$, denoted $P_{n^S}(t + \tau_{n^S})$, where $\tau_{n^S}$ is the delay between point $n^S$ and point $I$, which means using the traffic flow of point $n^S$ from $\tau_{n^S}$ minutes ago to estimate its effect on point $I$ in the next period. Note that each correlative point has a certain delay that depends on its distance from the current point, and $\tau_{n^S}$ can be determined accordingly. Finally, the predicted value of the current point in the next period is obtained by statistically averaging the spatial correlation traffic data, which is

$$P_I^S(t+1) = \frac{1}{r} \sum_{n^S=1}^{r} R_{n^S} \cdot P_{n^S}(t + \tau_{n^S}). \tag{17}$$
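The averaging in (17) can be sketched in Python as follows, taking the correlation coefficients from (16) as given inputs. How the delay is indexed is an assumption of this sketch: each delay is expressed as a whole number of periods `lag`, and the upstream value read for predicting period `t + 1` is `series[t + 1 - lag]`. The paper's own time indexing $t + \tau_{n^S}$ may differ, and all names and numbers below are hypothetical.

```python
def spatial_prediction(t, upstream_series, correlations, lags):
    """Average of correlation-weighted, delay-shifted upstream flows, as in (17).

    Assumption: lags[n] is the delay of upstream point n in whole periods,
    so the flow that reaches the current point in period t + 1 left the
    upstream point in period t + 1 - lags[n].
    """
    r = len(upstream_series)
    total = 0.0
    for series, corr, lag in zip(upstream_series, correlations, lags):
        index = t + 1 - lag
        if not 0 <= index < len(series):
            raise IndexError("delay points outside the recorded series")
        total += corr * series[index]
    return total / r


# Hypothetical toy data: two upstream points with delays of 1 and 2 periods.
upstream_series = [
    [100.0, 110.0, 120.0, 130.0],
    [80.0, 90.0, 100.0, 110.0],
]
correlations = [0.9, 0.5]  # R_{n^S}, assumed already computed with (16)
lags = [1, 2]

p_spatial = spatial_prediction(2, upstream_series, correlations, lags)
```

With `t = 2`, the sketch reads the value 120.0 from the first point and 90.0 from the second, so a farther upstream point contributes an older observation.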

C. Weighted Fusion of Different Predicted Values

We combine these predicted values linearly,² and then output the final predicted short-term traffic flow of the current point in the next period according to the following equation:

$$P_I(t + 1) = \lambda_1 P_{I_A}(t + 1) + \lambda_2 P_{I_T}(t + 1) + \lambda_3 P_{I_S}(t + 1), \tag{18}$$

in which the fusion weights $\lambda_1, \lambda_2, \lambda_3$ satisfy $\lambda_1 + \lambda_2 + \lambda_3 = 1$ and can be obtained by the "Entropy method" [39], [40]. The values $P_{I_A}(t)$, $P_{I_T}(t)$ and $P_{I_S}(t)$ are closely related to $P_I(t)$, and we define their prediction errors as $e_{I_A}(t)$, $e_{I_T}(t)$ and $e_{I_S}(t)$, respectively. We calculate the fusion weights of the different predicted values according to the following steps [40].

1) We define the relative error ratio of a single prediction method as follows, which reflects the impact of the error of the $j$th prediction method at the $t$th period:

$$p_{jt} = \frac{e_{jt}}{\sum_{t=1}^{N} e_{jt}}, \quad t = 1, 2, \ldots, N, \quad j = 1, 2, 3, \tag{19}$$

where $\sum_{t=1}^{N} p_{jt} = 1$.

2) The entropy value can be calculated according to the relative error ratio, which is

$$h_j = -k \sum_{t=1}^{N} p_{jt} \ln p_{jt}, \quad j = 1, 2, 3, \tag{20}$$

where $k = \frac{1}{\ln N}$ and $h_j \in [0, 1]$.

3) Based on the principle that the magnitude of entropy is opposite to the variation, we then define the variation degree of a prediction method as

$$d_j = 1 - h_j, \quad j = 1, 2, 3. \tag{21}$$

4) Finally, the fusion weights of the different predicted values in the next period are obtained as follows:

$$\lambda_j = \frac{1}{\mu - 1} \left( 1 - \frac{d_j}{\sum_{i=1}^{\mu} d_i} \right) = \frac{1}{\mu - 1} \left[ 1 - \frac{1 + \frac{1}{\ln N} \sum_{t=1}^{N} p_{jt} \ln p_{jt}}{\sum_{i=1}^{\mu} \left( 1 + \frac{1}{\ln N} \sum_{t=1}^{N} p_{it} \ln p_{it} \right)} \right], \quad j = 1, 2, 3, \quad \mu = 3, \tag{22}$$

where μ is the number of types of predicted values. The fusion weight ensures that the larger the variation degree of a prediction method is, the smaller its weight will be. We integrate the above subsections and provide the complete prediction algorithm of AMSVM-STC, which is shown in Algorithm 1.

² The ways of combination are diverse, such as by neural network in [36], by Bayesian methods in [37], and by dynamic tensor completion in [21]. Compared with these existing methods, linear combination has the advantages of not requiring huge amounts of data and of easy implementation. In addition, it has been demonstrated to be useful for performance improvement [38].
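The four steps of the entropy method, (19) to (22), together with the fusion in (18), can be sketched as follows. This is a hedged illustration rather than the authors' code: the names `entropy_fusion_weights` and `fuse` are hypothetical, the errors are assumed to be non-negative (for example absolute errors), the convention 0 ln 0 = 0 is assumed, and the degenerate case in which every method has a perfectly uniform error profile (all $d_j = 0$) is not handled.

```python
import math


def entropy_fusion_weights(errors):
    """Fusion weights lambda_j from per-period errors, following (19)-(22).

    errors[j][t] is the non-negative error of method j at period t
    (assumption: absolute errors are used).
    """
    mu = len(errors)            # number of prediction methods
    n = len(errors[0])          # number of periods N
    k = 1.0 / math.log(n)       # k = 1 / ln N
    variation = []
    for method_errors in errors:
        total = sum(method_errors)
        ratios = [e / total for e in method_errors]               # (19)
        entropy = -k * sum(p * math.log(p) for p in ratios if p > 0)  # (20)
        variation.append(1.0 - entropy)                           # (21)
    variation_sum = sum(variation)
    return [(1.0 - d / variation_sum) / (mu - 1) for d in variation]  # (22)


def fuse(predictions, weights):
    """Linear combination of the predicted values, as in (18)."""
    return sum(w * p for w, p in zip(weights, predictions))


# Toy error histories (made up): method 0 has evenly spread errors,
# method 2 concentrates all of its error in a single period.
errors = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 2.0, 3.0, 4.0],
    [0.0, 0.0, 0.0, 4.0],
]
weights = entropy_fusion_weights(errors)
fused = fuse([100.0, 110.0, 90.0], weights)
```

In this toy case the weights sum to one, and the method with the most uneven error profile (the largest variation degree) receives the smallest weight, which matches the property stated after (22).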


Algorithm 1 AMSVM-STC
Input: D: dataset, $x_I$, $X_I$, $X_{n_S}$, $X_{m_T}$.
Output: $P_I(t + 1)$: traffic flow in the next period.
for $x_I$, $X_I$ ∈ D do
    AMSVM ← $x_I$;
    APSO ← χ = (C, ε, γ)^T;
    if optimization finished then
        obtain optimal parameters: C, ε and γ;
    end
    AMSVM ← $X_I$, C, ε, γ;
    calculate β according to (7);
    if $X_I$ is trained by AMSVM then
        output autoregressive predicted value $P_{I_A}(t + 1)$;
    end
end
for $X_{m_T}$ ∈ D do
    calculate temporal correlation according to (14);
    if traffic flow of the corresponding period $m_T$ weeks ago is $P_{m_T}(t + 1)$ then
        output temporal correlation predicted value $P_{I_T}(t + 1)$ according to (15);
    end
end
for $X_{n_S}$ ∈ D do
    calculate spatial correlation according to (16);
    if traffic flow in the previous period of upstream point $n_S$ is $P_{n_S}(t + \tau_{n_S})$ then
        output spatial correlation predicted value $P_{I_S}(t + 1)$ according to (17);
    end
end
for different predicted values $P_{I_A}$, $P_{I_T}$ and $P_{I_S}$ do
    weighted fusion and obtain the final predicted result $P_I(t + 1)$ according to (18) and (19);
end
return $P_I(t + 1)$.

Fig. 4. Spatial correlative points.

VI. EXPERIMENT AND ANALYSIS

To verify the feasibility of our prediction algorithm, we analyze the traffic flow data of the PeMS system [41]. We first evaluate our algorithm in a freeway scenario by setting a POI at (37.97425N, 121.247481W), SR99, District 10, San Joaquin County (where many intersections and overpasses lead to complex traffic conditions), on Apr. 11, 2011. We name this POI "Freeway-1". We show the experimental area in Fig. 5, in which Location 1 is the POI and the other red markers represent the selected spatial correlative points. Then we consider traffic flow prediction under different scenarios.

Fig. 5. The experimental area of Freeway-1.

A. The Performance of APSO

We predict the traffic flow of Location 1 in Fig. 5 on Monday of the following week. First, the performance of APSO is analyzed. We select the recent historical traffic data of Location 1 from Monday to Friday, and calculate their average value as the sample data. Before training the model, we normalize the sample data into [0,1]. Before the APSO optimization, the initial parameters are set as follows: 200 iterations, a population of 30, the inertia weight θ within [0.4,0.9], the learning factors c1 and c2 within [0.8,2], the parameters C, γ and ε within [0.001,0.1], the hybrid kernel's weight β set to 0.5,³ and an accuracy limit of 10^{-4}. We input the normalized sample data to the training module and train it. We also compare our APSO with existing PSO algorithms using a fixed weight [17], a linearly decreasing weight [31] and an adaptive weight [42], and obtain the fitness graph of the different PSO algorithms shown in Fig. 6. In the iterative process of these algorithms, we make some of the particles in the swarm mutate and escape from the initial range, in order to avoid falling into a local optimum. The performance is shown in TABLE II. The results indicate that, compared with the other PSO algorithms, the APSO algorithm proposed in this paper keeps almost the same best fitness (the minimum fitness value of all particles at the last iteration), and reduces the training time (T_r) by over 9.677% while running many more iterations. The training time can generally be regarded as the convergence time of the particles, and the APSO algorithm spends less time in each iteration. Thus it can effectively improve the convergence rate of the particles. At the same time, we can find in Fig. 6 that the fitness curves of the compared algorithms tend to converge within fewer iterations (fewer than 128 iterations), while that of APSO still has significant fluctuations even in the later iterations (around 185 iterations), which demonstrates its superiority in making the particle swarm break away from local optimal positions and in avoiding premature convergence. Therefore, the APSO algorithm in this paper is more suitable for the nonlinear and stochastic process of traffic flow.

³ Actually, the initial setting of β in the training module barely affects the performance of either APSO or AMSVM. We demonstrate this by simulation in the next subsection.

Fig. 6. Fitness graph of different PSO.

TABLE II. Optimization Comparison Among Different PSO

TABLE III. Performance Indexes for the Prediction Algorithm

B. Analysis of AMSVM Prediction Results

Now we analyze the AMSVM's autoregressive predicted value of the short-term traffic flow of Location 1. We count the traffic flow volume in periods of 5 minutes, and divide 24 hours into 288 periods. While predicting in real time, we utilize the traffic data of 15 minutes ago to predict the traffic flow of the next 5 minutes. The hybrid kernel's weight β is adjusted adaptively according to the real-time variation tendency, or slope, of the traffic flow, and its value is determined by (7). For convenience of analysis, we first study the traffic flow in the rush hour (15:00-18:00) and the usual hour (18:00-21:00) on Monday.⁴ After training the model, we obtain the optimal parameters. In the rush hour, the values of C, γ and ε are 5, 0.01 and 0.002, respectively. The predicted result is shown in Fig. 7 (a), in which the green line represents the slope variation of the real traffic flow, and its variation tendency indirectly reflects the varying process of the hybrid kernel function. At the same time, we fix the values of C, γ and ε while setting different values of β in the interval [0.0,1.0], and obtain the performance comparison of the predicted results shown in Fig. 7 (b). The definitions of these accuracy indexes are shown in TABLE III. Similarly, in the usual hour, the values of C, γ and ε are 3, 0.01 and 0.001, respectively. The predicted result and performance comparison are shown in Fig. 8.

⁴ The choices of rush and usual hours depend on the observation of traffic volume in the PeMS system, as shown in Fig. 13 and Fig. 15.
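The period scheme used in this subsection (5-minute counts, 288 periods per day, recent history as the input of the autoregressive predictor) can be sketched as follows. This is only an illustration under assumptions: "the traffic data of 15 minutes ago" is read here as the three most recent 5-minute counts, and the helper names `make_lagged_samples` and `period_index` are hypothetical.

```python
PERIOD_MINUTES = 5
PERIODS_PER_DAY = 24 * 60 // PERIOD_MINUTES  # 288 periods in 24 hours


def period_index(hour, minute):
    """Index of the 5-minute period that contains the given clock time."""
    return (hour * 60 + minute) // PERIOD_MINUTES


def make_lagged_samples(flows, history_minutes=15):
    """Pair each count with the counts of the preceding history window.

    Returns a list of (inputs, target) tuples, where inputs holds the
    history_minutes / 5 most recent counts and target is the next count.
    """
    lags = history_minutes // PERIOD_MINUTES
    samples = []
    for t in range(lags, len(flows)):
        samples.append((flows[t - lags:t], flows[t]))
    return samples


# Rush hour 15:00-18:00 and usual hour 18:00-21:00 as period ranges.
rush_hour = (period_index(15, 0), period_index(18, 0))
usual_hour = (period_index(18, 0), period_index(21, 0))

# Toy counts (made up) for five consecutive periods.
toy_samples = make_lagged_samples([1, 2, 3, 4, 5])
```

Each three-hour window covers 36 periods, and a full day of 288 counts yields 285 training pairs under the three-lag assumption.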

Fig. 7. Traffic flow prediction results and performance comparison during 15:00-18:00. (a) Predicted result. (b) Performance comparison with different β.

Fig. 7 (a) and Fig. 8 (a) show that the slope of the traffic volume varies with time, which reflects the different and adaptive roles of the two kinds of kernels (polynomial and RBF). That is to say, the AMSVM, which has been optimized by APSO and has an adaptive hybrid kernel function, can flexibly adapt to the nonlinear and random characteristics of traffic flow. Besides, Fig. 7 (b) and Fig. 8 (b) show that, compared with a fixed hybrid kernel weight, the AMSVM with adaptive weight has lower MAPE and RMSE, as well as a higher R value. This validates the feasibility and superiority of AMSVM in autoregressive prediction. In addition, the performance improves significantly, especially in the rush hour. This is because the slope of the traffic volume curve varies more in the rush hour (between 0 and 70) than in the usual hour (between 0 and 45). That is to say, the AMSVM is more adaptive to complicated traffic conditions.

Fig. 8. Traffic flow prediction results and performance comparison during 18:00-21:00. (a) Predicted result. (b) Performance comparison with different β.

To show the effects of different training β on AMSVM performance, we provide TABLE IV, where the MAPE, RMSE and R of AMSVM under adaptive β in the prediction module and different fixed β in the training module are compared. From this table, it is found that the value of β in the training module barely affects the performance of AMSVM. For example, in the rush hour, the variation of MAPE introduced by different training β is on the order of 1%, and those of RMSE and R are on the order of 0.1% and 0.01, respectively. In addition, there is no obvious regularity between the training β and the performance. Therefore, the training β is fixed as 0.5 hereinafter unless otherwise specified.⁵

TABLE IV. The Effects of Different Training β on Performance

⁵ Note that the value of β is only fixed in the training module; it should still be adaptive in the prediction module.

Fig. 9. Traffic flow prediction results during the rush hour by different methods.

TABLE V. Traffic Flow Prediction Performance of Different Methods

TABLE VI. Optimized Parameters and Prediction Efficiency of APSO-AMSVM Algorithm

To further illustrate the superiority of APSO-AMSVM, we select the methods PSO-SVM, APSO-SVM-R and APSO-SVM-P to compare with our method, where the first two methods use the RBF kernel and the last one uses the polynomial kernel. In addition, PSO-SVM does not employ any adaptive PSO method. TABLE V shows the general accuracy indexes of our algorithm and the compared ones, which reflect the complete prediction performance over 24 hours on Monday. Without loss of generality, we present the predicted results during the rush hour, which are shown in Fig. 9. We can find in TABLE V that the MAPE and RMSE of APSO-AMSVM are 10.2608% and 12.3632%, both of which are lower than those of the other three methods. Besides, its R is 0.9654, which is the highest among these methods. Accordingly, our algorithm has less error and higher accuracy in short-term traffic flow prediction. The optimized parameter values and the prediction efficiency of the APSO-AMSVM algorithm are shown in TABLE VI. Its average training time is 112 seconds, while the average prediction time (T_p) is 0.036 seconds, showing high efficiency compared to the existing method [17]. In other words, our algorithm does not substantially increase the complexity of the traditional algorithms. Thus it can meet the real-time requirement of short-term traffic flow forecasting.
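For readers who want to reproduce accuracy indexes of this kind, the following sketch computes MAPE, RMSE and R. It relies on the common textbook definitions, which is an assumption: the definitions of TABLE III are not reproduced in this text, and because the RMSE above is reported as a percentage, the paper's RMSE may be a normalized variant that this sketch does not attempt to guess. The function names are hypothetical.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values nonzero)."""
    n = len(actual)
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / n


def rmse(actual, predicted):
    """Root mean square error, in the same unit as the data."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def correlation(actual, predicted):
    """Pearson correlation coefficient R between actual and predicted."""
    n = len(actual)
    mean_a, mean_p = sum(actual) / n, sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Toy values (made up), not data from the experiments.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
scores = (mape(actual, predicted), rmse(actual, predicted),
          correlation(actual, predicted))
```

Lower MAPE and RMSE and an R closer to 1 indicate a better fit, which is the direction of comparison used in TABLE V.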

Fig. 10. Time sequence and correlation of traffic flow of Location 1 in previous five Mondays.

Fig. 11. Hierarchical clustering result of different detection points.

C. Analysis of Spatial-Temporal Correlation

1) Temporal Correlation Analysis: To consider the influence of distant historical temporal correlation on the traffic flow of Location 1, we analyze the correlation between the time series data of the previous 5 Mondays. The time sequence and the correlation of these distant historical data are shown in Fig. 10. To make use of this existing experience, we regard the correlation shown in Fig. 10 as the correlation between this Monday and the previous 4 Mondays. After statistically averaging these data, we obtain the historical temporal correlation predicted traffic flow of Location 1 in the next moment on this Monday.

2) Spatial Correlation Analysis: To consider the influence of spatial correlation on the traffic flow of Location 1, we utilize the hierarchical clustering algorithm to analyze the correlation from Location 1 to Location 14. We explore last Monday's traffic data of these detection points, and regard the average clustering distance of the traffic data as the assessment of spatial correlation. The dendrogram of the clustering result is shown in Fig. 11, where the clustering distance means the difference between the datasets of different points. We cut the dendrogram along the red dotted line, and divide the traffic data of the different detection points into 6 clusters: Location 1 - Location 3, Location 4 - Location 6, Location 7 - Location 9, Location 10, Location 11 - Location 13, and Location 14. According to the hypothesis of selecting spatial correlative points shown in Fig. 4, we select Location 4, Location 7, Location 10 and Location 11 as the upstream correlative detection points of Location 1. The time sequences of these correlative detection points are shown in Fig. 12. Actually, we can find in Fig. 11 and Fig. 12 that the correlation of Location 14 differs significantly from that of the other detection points, for it has much larger clustering distances from the other clusters.
To reduce the effect on the predicted result of uncertainty caused by random factors, such as environment, weather and traffic accidents, we adopt the idea of "time-interval" to further analyze the spatial correlation of these correlative detection points, as TABLE VII shows. Now that the spatial correlation is obtained based on previous experience, we can statistically average the traffic flow of Location 4, Location 7, Location 10 and Location 11 according to the time-interval correlations shown in TABLE VII, and obtain the spatial correlation predicted traffic flow of Location 1 in the next moment.

Fig. 12. Time sequence of spatial correlative points' traffic flow on last Monday.

TABLE VII. Time-Interval Correlation Between Location 1 and Spatial Correlative Points

D. Traffic Flow Prediction Results Based on AMSVM-STC

We combine the AMSVM's autoregressive predicted value, the historical temporal correlation predicted value and the spatial correlation predicted value with different weights determined by the "Entropy Method" described in Section V, Subsection C, and obtain the final predicted short-term traffic flow of Freeway-1 on this Monday. To illustrate the superiority of AMSVM-STC, we introduce usual time series prediction algorithms such as ARIMA and BPNN to compare with our algorithm. We select the ARIMA(3,1,1) model to autoregressively predict the traffic flow of Freeway-1 on the same day. As for the BPNN, we set 288 input nodes, 5 hidden nodes and 1 output node, train the model with traffic data from Monday to Friday of last week, and then predict this Monday's traffic flow. Fig. 13 (a) shows the predicted results of the different algorithms, and the general accuracy indexes are shown in TABLE VIII.

Fig. 13. Traffic flow prediction results of different algorithms. (a) Freeway-1. (b) Freeway-2.

TABLE VIII. Traffic Flow Prediction Performance on Freeway-1

TABLE IX. Traffic Flow Prediction Performance on Freeway-2

The experimental results demonstrate that, as for the autoregressive prediction, the AMSVM has higher accuracy than the ARIMA, for its MAPE and RMSE are 8.3206% and 4.373% lower than those of ARIMA, respectively. Whereas the BPNN needs a large amount of historical data as training samples, the AMSVM can realize the prediction with only a small amount of real-time data, which makes it better adapted to the changing characteristics of traffic flow. Besides, the AMSVM-STC fuses the spatial-temporal correlation predicted values, which can be regarded as an improvement of the predicted value of AMSVM, thus further raising the prediction reliability and accuracy. Specifically, STC reduces the effect of uncertainty caused by random factors on the training data of AMSVM. In addition, even if an unexpected event (such as a rainy day) happens at a certain time at the POI, its effect can be reflected in the prediction result, especially by introducing the spatial correlation. Therefore, compared to the AMSVM, the AMSVM-STC reduces the MAPE and RMSE by 3.8074% and 2.6268%, respectively, and increases the R by 2.21%.

In addition, to further evaluate the AMSVM-STC's performance, we study another POI on a freeway, which is at (36.630772N, 119.68918W), SR99, District 6, Fresno County, on Apr. 14, 2017. We name it "Freeway-2". The corresponding experimental area is shown in Fig. 14 (a), where Location 1 is the POI. We perform experiments on it similar to those on Freeway-1. The results are shown in Fig. 13 (b) and TABLE IX. Similar superiority of our AMSVM-STC can be observed in this scenario. In short, our algorithm achieves better prediction performance than the existing methods.

E. Traffic Flow Prediction Under Different Scenarios

To illustrate the generalization of our algorithm, traffic flow prediction in different scenarios is considered. We pick two POIs on urban roads, which are at (37.33460N, 121.8598W), Sinclair, District 4, Santa Clara County, on Apr. 11, 2011, and at (37.323066N, 121.896538W), Sinclair, District 6, Santa Clara County, on Nov. 7, 2016. We name them "Urban road-1" and "Urban road-2", respectively. The experimental areas are shown in Fig. 14 (b) and (c). The predicted results are shown in Fig. 15. TABLE X shows the comparison of the prediction performance between the freeway and the urban road. Comparing Fig. 14 (b) and (c) with Fig. 14 (a) and Fig. 5, we can find that there are more intersections and overpasses in the urban road scenario, which lead to more complex and dynamic traffic conditions. When observing the traffic flow in Fig. 13 and Fig. 15, we can find that there are several sharp fluctuations of traffic volume in the urban road scenarios, which means a big change in the number of vehicles (more than 100) during a relatively short time (a couple of periods). For example, these fluctuations can be found at around 6:30, 11:40 and 17:30 in Fig. 15 (a), but only around 15:45 in Fig. 13 (a).

Fig. 14. The experimental areas in different scenarios. (a) Freeway-2. (b) Urban road-1. (c) Urban road-2.

Fig. 15. Traffic flow prediction results of urban road by AMSVM-STC. (a) Urban road-1. (b) Urban road-2.

TABLE X. Traffic Flow Prediction Performance Under Different Scenarios by AMSVM-STC

TABLE X indicates that the MAPE and RMSE of traffic flow prediction on the urban road are lower than those on the freeway, while R is higher. Therefore, our AMSVM-STC has good universality: it can adapt to the extremely complicated urban traffic conditions and achieve a considerable predicted result.

VII. CONCLUSION

Accurate estimation of the traffic state is an effective measure for relieving traffic congestion in cities. Short-term traffic flow forecasting is crucial for estimating the traffic state at future moments. In this paper, we proposed a novel prediction algorithm, namely Adaptive Multi-kernel Support Vector Machine with Spatial-Temporal Correlation, to predict the short-term traffic flow. Besides, we presented the Adaptive Particle Swarm Optimization to optimize the parameters of AMSVM, and introduced how to adaptively adjust the hybrid kernel function's weight according to the change tendency of real-time traffic flow. In particular, we incorporated spatial-temporal information of correlative locations, and combined AMSVM's autoregressive predicted value and the spatial-temporal predicted values into the final predicted result of short-term traffic flow. The experimental results validated its superiority compared with existing methods. In conclusion, our method can better adapt to the dynamic characteristics of traffic flow on urban roads, thus providing a more accurate predicted result.

We can find from the simulation results that the predicted traffic flow has a certain delay compared to the real values. Therefore, we plan to improve the timeliness of the prediction in our future work. Besides, we will extend our method by considering more traffic information, such as speed and lane occupancy, so that we can estimate the traffic state of the next moment more precisely.

REFERENCES

[1] X. Ling, X. Feng, Z. Chen, Y. Xu, and H. Zheng, “Short-term traffic flow prediction with optimized multi-kernel support vector machine,” in Proc. IEEE Congr. Evol. Comput. (CEC), Jun. 2017, pp. 294–300.
[2] M. Zhu et al., “Public vehicles for future urban transportation,” IEEE Trans. Intell. Transp. Syst., vol. 17, no. 12, pp. 3344–3353, Dec. 2016.
[3] C. Guo, D. Li, G. Zhang, and M. Zhai, “Real-time path planning in urban area via vanet-assisted traffic information sharing,” IEEE Trans. Veh. Technol., vol. 67, no. 7, pp. 5635–5649, Jul. 2018.
[4] N. L. Nihan and K. O. Holmesland, “Use of the box and jenkins time series technique in traffic forecasting,” Transportation, vol. 9, no. 2, pp. 125–143, 1980.
[5] B. Ghosh, B. Basu, and M. O’Mahony, “Bayesian time-series model for short-term traffic flow forecasting,” J. Transp. Eng., vol. 133, no. 3, pp. 180–189, 2007.
[6] I. Okutani and Y. J. Stephanedes, “Dynamic prediction of traffic volume through Kalman filtering theory,” Transp. Res. B, Methodol., vol. 18, no. 1, pp. 1–11, 1984.
[7] L. L. Ojeda, A. Y. Kibangou, and C. C. De Wit, “Adaptive Kalman filtering for multi-step ahead traffic flow prediction,” in Proc. IEEE Amer. Control Conf., Jun. 2013, pp. 4724–4729.
[8] W. Huang, G. Song, H. Hong, and K. Xie, “Deep architecture for traffic flow prediction: Deep belief networks with multitask learning,” IEEE Trans. Intell. Transp. Syst., vol. 15, no. 5, pp. 2191–2201, Oct. 2014.
[9] J. Tang, F. Liu, Y. Zou, W. Zhang, and Y. Wang, “An improved fuzzy neural network for traffic speed prediction considering periodic characteristic,” IEEE Trans. Intell. Transp. Syst., vol. 18, no. 9, pp. 2340–2350, Sep. 2017.


[10] C. J. Dong, Z. Y. Liu, and Z. L. Qiu, “Prediction of traffic flow in realtime based on chaos theory,” Inf. Control, vol. 33, no. 5, pp. 518–522, 2004.
[11] C.-H. Wu, J.-M. Ho, and D. T. Lee, “Travel-time prediction with support vector regression,” IEEE Trans. Intell. Transp. Syst., vol. 5, no. 4, pp. 276–281, Dec. 2004.
[12] W.-C. Hong, Y. Dong, F. Zheng, and S. Y. Wei, “Hybrid evolutionary algorithms in a SVR traffic flow forecasting model,” Appl. Math. Comput., vol. 217, no. 15, pp. 6733–6747, 2011.
[13] W. Hu, L. Yan, K. Liu, and H. Wang, “PSO-SVR: A hybrid short-term traffic flow forecasting method,” in Proc. IEEE Int. Conf. Parallel Distrib. Syst., Dec. 2015, pp. 553–561.
[14] Y. Yang and H. Lu, “Short-term traffic flow combined forecasting model based on SVM,” in Proc. Int. Conf. Comput. Inf. Sci., 2010, pp. 262–265.
[15] M. Li, W.-C. Hong, and H.-G. Kang, “Urban traffic flow forecasting using Gauss–SVR with cat mapping, cloud model and PSO hybrid algorithm,” Neurocomputing, vol. 99, pp. 230–240, Jan. 2013.
[16] J. Wang and Q. Shi, “Short-term traffic speed forecasting hybrid model based on Chaos–Wavelet analysis-support vector machine theory,” Transp. Res. C, Emerg. Technol., vol. 27, no. 2, pp. 219–232, Feb. 2013.
[17] J. Ouyang, F. Lu, and X. Liu, “Short-term urban traffic forecasting based on multi-kernel SVM model,” J. Image Graph., vol. 15, no. 11, pp. 1688–1695, 2010.
[18] X. Kong, Z. Xu, G. Shen, J. Wang, Q. Yang, and B. Zhang, “Urban traffic congestion estimation and prediction based on floating car trajectory data,” Future Generat. Comput. Syst., vol. 61, pp. 97–107, Aug. 2016.
[19] Y. Xu, B. Wang, Q. Kong, Y. Liu, and F. Y. Wang, “Spatio-temporal variable selection based support vector regression for urban traffic flow prediction,” in Proc. Transp. Res. Board 93rd Annu. Meeting, 2014, p. 15.
[20] T. L. Pan, A. Sumalee, R. X. Zhong, and N. Indra-Payoong, “Short-term traffic state prediction based on temporal–spatial correlation,” IEEE Trans. Intell. Transp. Syst., vol. 14, no. 3, pp. 1242–1254, Sep. 2013.
[21] H. Tan, Y. Wu, B. Shen, and P. J. Jin, “Short-term traffic prediction based on dynamic tensor completion,” IEEE Trans. Intell. Transp. Syst., vol. 17, no. 8, pp. 2123–2133, Aug. 2016.
[22] Y. Lv, Y. Duan, W. Kang, Z. Li, and F.-Y. Wang, “Traffic flow prediction with big data: A deep learning approach,” IEEE Trans. Intell. Transp. Syst., vol. 16, no. 2, pp. 865–873, Apr. 2015.
[23] Y. Wu and H. Tan. (2016). “Short-term traffic flow forecasting with spatial-temporal correlation in a hybrid deep learning framework.” [Online]. Available: https://arxiv.org/abs/1612.01022
[24] J. Zhang, Y. Zheng, and D. Qi, “Deep spatio-temporal residual networks for citywide crowd flows prediction,” in Proc. AAAI, 2017, pp. 1655–1661.
[25] Y. Wang, M. Papageorgiou, A. Messmer, P. Coppola, A. Tzimitsi, and A. Nuzzolo, “An adaptive freeway traffic state estimator,” Automatica, vol. 45, no. 1, pp. 10–24, 2009.
[26] V. N. Vapnik, “An overview of statistical learning theory,” IEEE Trans. Neural Netw., vol. 10, no. 5, pp. 988–999, Sep. 1999.
[27] K. R. Muller, A. J. Smola, G. Ratsch, B. Scholkopf, J. Kohlmorgen, and V. Vapnik, “Predicting time series with support vector machines,” in Proc. Int. Conf. Artif. Neural Netw., 1997, pp. 999–1004.
[28] G. F. Smits and E. M. Jordaan, “Improved svm regression using mixtures of kernels,” in Proc. Int. Joint Conf. Neural Netw. (IJCNN), May 2002, pp. 2785–2790.
[29] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proc. IEEE ICNN, vol. 4. Nov./Dec. 1995, pp. 1942–1948.
[30] M. R. Bonyadi, X. Li, and Z. Michalewicz, “A hybrid particle swarm with a time-adaptive topology for constrained optimization,” Swarm Evol. Comput., vol. 18, no. 1, pp. 22–37, 2014.
[31] W. Wang and Z. Li, “Traffic prediction method based on particle swarm optimized support vector machine,” J. Shanxi Datong University: Natural Sci., no. 2, pp. 25–28, 2015.
[32] W. Ren and X. Wu, “A modified simple particle swarm optimization using dynamically changing learning factor,” Techn. Automat. Appl., vol. 31, no. 10, pp. 9–11, 2012.
[33] G. Ma, R. Li, and L. Liu, “Particle swarm optimization algorithm of learning factor and time factor adjusting to weights,” Appl. Res. Comput., vol. 31, no. 11, pp. 3291–3294, 2014.
[34] B. Williams, “Multivariate vehicular traffic flow prediction: Evaluation of ARIMAX modeling,” Transp. Res. Rec., J. Transp. Res. Board 1776, pp. 194–200, 2001.
[35] I. Davidson and S. S. Ravi, “Agglomerative hierarchical clustering with constraints: Theoretical and empirical results,” in Proc. Eur. Conf. Princ. Data Mining Knowl. Discovery, 2005, pp. 59–70.


[36] M.-C. Tan, S. C. Wong, J.-M. Xu, Z.-R. Guan, and P. Zhang, “An aggregation approach to short-term traffic flow prediction,” IEEE Trans. Intell. Transp. Syst., vol. 10, no. 1, pp. 60–69, Mar. 2009.
[37] W. Zhang, Y. Qi, K. Henrickson, J. Tang, and Y. Wang, “Vehicle traffic delay prediction in ferry terminal based on Bayesian multiple models combination method,” Transportmetrica A, Transport Sci., vol. 13, no. 5, pp. 467–490, 2017.
[38] C. W. J. Granger, “Invited review combining forecasts—Twenty years later,” J. Forecasting, vol. 8, no. 3, pp. 167–173, 1989.
[39] B. Fassinut-Mombot and J. B. Choquel, “An entropy method for multisource data fusion,” in Proc. Int. Conf. Inf. Fusion, vol. 2, Jul. 2000, pp. THC5/17–THC5/23.
[40] Z. Huang, H. Ouyang, and Y. Tian, “Short-term traffic flow combined forecasting based on nonparametric regression,” in Proc. Int. Conf. Inf. Technol., Comput. Eng. Manage. Sci., Sep. 2011, pp. 316–319.
[41] Caltrans PeMS. Accessed: Oct. 15, 2016. [Online]. Available: http://pems.dot.ca.gov/
[42] F. Liu, Y. Han, and Z. Wang, “Adaptive weight particle swarm optimize particle filter algorithm,” Comput. Simul., vol. 30, no. 11, pp. 330–333, 2013.

Xinxin Feng (M’15) received the Ph.D. degree in information and communication engineering from Shanghai Jiao Tong University, Shanghai, China, in 2015. She is currently an Associate Professor with the College of Physics and Information Engineering, Fuzhou University, Fuzhou, China. Her research interests include data analysis and incentive mechanism design in crowd-sensing networks, game theory, and market theory.

Xianyao Ling received the B.S. degree in electronic information science and technology from Nanjing Agricultural University, Nanjing, China, in 2015. He is currently pursuing the M.S. degree in information and communication engineering with Fuzhou University, Fuzhou, China. His research interests include big data analysis for Internet of Vehicles, machine learning, and deep learning.

Haifeng Zheng (M’14) received the Ph.D. degree in communication and information system from Shanghai Jiao Tong University, Shanghai, China, in 2014. He was a Visiting Scholar with State University of New York at Buffalo from 2015 to 2016. He is currently an Associate Professor with the College of Physics and Information Engineering, Fuzhou University, China. His research interests include wireless sensor networks, crowd-sensing networks, compressive sensing, and machine learning.

Zhonghui Chen received the M.S. degree in communication engineering from Tsinghua University, Beijing, China, in 1987. He is currently a Professor with the College of Physics and Information Engineering, Fuzhou University, Fuzhou, China. His research interests include crowd-sensing networks, Internet of Vehicles, and wireless communication networks.

Yiwen Xu received the Ph.D. degree from Xiamen University, Xiamen, China, in 2012. He is currently an Associate Professor with the College of Physics and Information Engineering, Fuzhou University, Fuzhou, China. His research interests include data mining technology for crowdsensing-based Internet of Vehicles and coding technology in H.265/HEVC.