Physics Letters A 357 (2006) 314–318 www.elsevier.com/locate/pla
Nonlinear analysis of traffic time series at different temporal scales Pengjian Shang a,∗ , Xuewei Li b , Santi Kamae c a Department of Mathematics, School of Science, Beijing Jiaotong University, Beijing 100044, PR China b School of Economics and Management, Beijing Jiaotong University, Beijing 100044, PR China c Department of Physics, Osaka University, Machikaneyama-chou 1-1, Toyonaka, Osaka 560-0043, Japan
Received 27 January 2006; received in revised form 13 April 2006; accepted 18 April 2006 Available online 2 May 2006 Communicated by C.R. Doering
Abstract
This Letter studies the behavior of traffic flow at different temporal scales in order to identify the type of approach most suitable for transforming traffic data from one scale to another. The finite correlation dimensions obtained for the four traffic volume series indicate the possible existence of chaotic behavior in the traffic data observed at the four scales. A possible implication is that the traffic processes at these scales are related through a chaotic (scale-invariant) behavior. A comparison of the correlation dimension and the coefficient of variation of each time series reveals a direct relationship between the two: the higher the coefficient of variation, the higher the dimension, and vice versa.
© 2006 Elsevier B.V. All rights reserved.
PACS: 45.70.Vn; 05.45.-a; 05.45.Tp
1. Introduction

In the last decade there have been successful attempts by many researchers to apply nonlinear analysis to model and study highly irregular signals arising in various fields of natural sciences and engineering [1–8]. Applications of the concept of deterministic chaos to understand traffic flow have received considerable attention [9–17]. The noticeable progress that such applications have produced in the identification and prediction of traffic flow is encouraging news for transportation professionals, and it is believed that the concept of chaos can also be employed to solve other traffic problems, such as data transformation. An attempt is made in the present Letter to employ chaos theory to understand the dynamics of the transformation of traffic data from one scale (or resolution) to another. This problem is of significant importance in traffic engineering, as the lack of high-resolution traffic data is one of the most prominent limiting factors in traffic calculations.
* Corresponding author (P. Shang).
0375-9601/$ – see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.physleta.2006.04.063
Almost all past studies of the existence of chaos in traffic flow analyzed time series of only one particular scale measured at a single location [6,14,16,17]. Hence, none could provide information about whether or not the transformation of the traffic process between different scales is chaotic. As a first step, the present study investigates traffic series of different temporal scales observed at the same location. Traffic data at four different temporal scales, i.e., 2-minute, 4-minute, 8-minute, and 16-minute, observed over a period of about 40 months, from 16 January 2001 to 17 June 2004, on the Yuquanying highway in Beijing, China, are analyzed to investigate the existence of chaos. The underlying assumption is that the individual behavior of the dynamics of traffic flow at these scales provides important information about the dynamics of the overall traffic transformation between these scales. More specifically, if the traffic flow at different scales exhibits chaotic behavior, then the dynamics of the transformation between them may also be chaotic. The presence of chaos in the traffic time series is investigated by employing the correlation dimension method [18]. The correlation dimension is a representation of the variability or irregularity of a process and furnishes information on the number of dominant variables present in the evolution of the corresponding dynamical system; it can indicate not only
the existence of chaos in the traffic flow, if any, but also reveal whether the process is deterministic or stochastic, if not chaotic.

The organization of this Letter is as follows. First, a brief review of nonlinear time series modeling techniques, including phase-space reconstruction and the correlation dimension method, is given in Section 2. Details of the data used, analyses carried out and results obtained are presented in Section 3. Finally, important conclusions drawn from this Letter are provided in Section 4.

2. Reconstruction of the phase space and correlation dimension method

For a scalar time series $x_t$, where $t = 1, 2, \ldots, N$, the phase space can be reconstructed using the method of delays [19]. The basic idea in the method of delays is that the evolution of any single variable of a system is determined by the other variables with which it interacts. Information about the relevant variables is thus implicitly contained in the history of any single variable. On the basis of this an “equivalent” phase space can be reconstructed by assigning an element of the time series $x_t$ and its successive delays as coordinates of a new vector time series

$$Y_t = \{x_t, x_{t+\tau}, x_{t+2\tau}, \ldots, x_{t+(m-1)\tau}\}, \tag{2.1}$$
where $\tau$ is referred to as the delay time and, for a digitized time series, is a multiple of the sampling interval used, while $m$ is termed the embedding dimension. The dimension $m$ of the reconstructed phase space is considered to be the dimension sufficient for recovering the object without distorting any of its topological properties; thus it may differ from the true dimension of the space where this object lies. Both reconstruction parameters, $\tau$ and $m$, must be determined from the data.

Correlation dimension is a measure of the extent to which the presence of a data point affects the position of the other points lying on the attractor. Among the large number of methods available for distinguishing between chaotic and stochastic systems, the correlation dimension method is probably the most fundamental one. The method uses the correlation function (or integral) for distinguishing between chaotic and stochastic behaviors. The concept of the correlation function is that a seemingly irregular phenomenon arising from deterministic dynamics will have a limited number of degrees of freedom, equal to the smallest number of first order differential equations that capture the most important features of the dynamics. Thus, when one constructs phase spaces of increasing dimension for an infinite data set, a point will be reached where the dimension equals the number of degrees of freedom, beyond which increasing the dimension of the representation will not have any significant effect on the correlation dimension.

Phase space is a powerful concept because, with a model and a set of appropriate variables, dynamics can represent a real-world system as the geometry of a single moving point. Therefore, the reconstruction of the phase space of a time series and, hence, its attractor (a geometric object which characterizes the long-term behavior of a system in the phase space) is an important first step in
the correlation dimension method (or any chaos identification technique). Such a reconstruction approach uses the concept of embedding a single-variable series in a multi-dimensional phase space to represent the underlying dynamics. For an $m$-dimensional phase space the correlation function $C(r)$ is given by

$$C(r) = \lim_{N \to \infty} \frac{2}{N(N-1)} \sum_{1 \le i < j \le N} H\bigl(r - |Y_i - Y_j|\bigr), \tag{2.2}$$
where $H$ is the Heaviside step function, with $H(u) = 1$ for $u > 0$ and $H(u) = 0$ for $u \le 0$, where $u = r - |Y_i - Y_j|$, $N$ is the number of data points, and $r$ is the radius of the sphere centered on $Y_i$ or $Y_j$. If the time series is characterized by an attractor, then for positive values of $r$ the correlation function $C(r)$ is related to the radius $r$ by the following relation:

$$C(r) \propto \alpha r^{D_2}, \tag{2.3}$$

where $\alpha$ is a constant and $D_2$ is the correlation exponent, or the slope of the $\log C(r)$ versus $\log r$ plot, given by

$$D_2 = \lim_{r \to 0} \frac{\log C(r)}{\log r}. \tag{2.4}$$
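As an illustration of Eqs. (2.1)–(2.4), the following minimal Python sketch builds the delay vectors, evaluates a finite-sample estimate of the correlation function, and fits a straight line to log C(r) versus log r. It is not the implementation used in this Letter: the function names (`delay_embed`, `correlation_integral`, `correlation_exponent`), the use of NumPy, the Euclidean norm for $|Y_i - Y_j|$ and the brute-force pairwise distance computation are all assumptions made for the example.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Delay vectors Y_t = (x_t, x_{t+tau}, ..., x_{t+(m-1)tau}), cf. Eq. (2.1)."""
    x = np.asarray(x, dtype=float)
    n_vectors = len(x) - (m - 1) * tau
    if n_vectors < 2:
        raise ValueError("series too short for this m and tau")
    # Column k holds the series shifted forward by k*tau samples.
    return np.column_stack([x[k * tau : k * tau + n_vectors] for k in range(m)])

def correlation_integral(Y, radii):
    """Finite-N estimate of C(r), cf. Eq. (2.2): fraction of pairs i<j closer than r.

    Assumption: Euclidean norm. Memory grows as N^2, so this brute-force
    version is only suitable for short series.
    """
    n = len(Y)
    diffs = Y[:, None, :] - Y[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    pair_d = dists[np.triu_indices(n, k=1)]      # each pair i<j counted once
    # H(r - d) = 1 exactly when d < r, so the mean equals 2/(N(N-1)) * sum of H.
    return np.array([(pair_d < r).mean() for r in radii])

def correlation_exponent(radii, C):
    """Least-squares slope of log C(r) versus log r over radii with C(r) > 0."""
    radii = np.asarray(radii, dtype=float)
    C = np.asarray(C, dtype=float)
    keep = C > 0                                  # log C undefined where no pairs fall
    slope, _intercept = np.polyfit(np.log(radii[keep]), np.log(C[keep]), 1)
    return slope
```

A simple sanity check is a sine wave embedded with m = 2: the reconstructed attractor is a closed curve, so the fitted exponent should come out near 1 when the radii are chosen between the depopulated and saturated ranges.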
The slope is generally estimated by a least squares fit of a straight line over a certain range of $r$, called the scaling region. For a finite data set, such as the traffic time series, there is clearly a separation $r$ below which there are no pairs of points; that is, the region is “depopulated”. At the other extreme, when the value of $r$ exceeds the set diameter, the correlation function increases no further; that is, it is “saturated”. Therefore, for a finite data set, the region sandwiched between the depopulation region and the saturation region is considered as the scaling region. A somewhat better way to identify the scaling region is to estimate the local slope given by

$$\frac{d[\log C(r)]}{d[\log r]}. \tag{2.5}$$
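The local slope of Eq. (2.5) can be approximated numerically from a tabulated correlation function. The sketch below is only illustrative: `local_slopes` uses finite differences via `numpy.gradient`, and `plateau` is a hypothetical heuristic (not taken from this Letter) that returns the longest run of radii over which the local slope is nearly constant, as one possible way to locate a scaling region. The tolerance `tol` is an arbitrary assumed value.

```python
import numpy as np

def local_slopes(radii, C):
    """Finite-difference estimate of d[log C(r)] / d[log r], cf. Eq. (2.5).

    Returns the radii that were kept (those with C(r) > 0) and the slope at each.
    """
    radii = np.asarray(radii, dtype=float)
    C = np.asarray(C, dtype=float)
    keep = C > 0                      # the depopulated region has no defined log C
    log_r, log_C = np.log(radii[keep]), np.log(C[keep])
    return radii[keep], np.gradient(log_C, log_r)

def plateau(slopes, tol=0.05):
    """Hypothetical heuristic: longest run of consecutive slopes changing by < tol.

    Returns a half-open index range (lo, hi) into `slopes`; if no two
    neighbouring slopes are that close, the range covers a single point.
    """
    flat = np.abs(np.diff(slopes)) < tol
    best, start = (0, 0), None
    for i, ok in enumerate(np.append(flat, False)):   # trailing False closes a run
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best[0], best[1] + 1
```

In practice one would inspect the slope curve visually as well; an automatic plateau search of this kind is sensitive to the choice of tolerance and to the spacing of the radii.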
To observe whether chaos exists, the correlation exponent (or local slope) values are plotted against the corresponding embedding dimension values. If the correlation exponent saturates at a finite, noninteger value, the system is considered to exhibit chaos. The saturation value of the correlation exponent is defined as the correlation dimension of the attractor. The nearest integer above the saturation value provides the minimum number of phase-space variables necessary to model the dynamics of the attractor. Conversely, if the correlation exponent increases without bound as the embedding dimension increases, the system under investigation is considered stochastic [20].

3. Analysis, results and discussion

3.1. The Beijing Yuquanying time series data

We use the data observed on the Beijing Yuquanying highway over a period of about 40 months, from 16 January 2001
to 17 June 2004. The traffic data set for this period was chosen so as not to be biased by road construction or bad weather conditions. The data were downloaded from the Highway Performance Measurement Project (FPMP) run by Beijing STONG Intelligent Transportation System CO. LTD, Beijing. The raw data for speed, volume and occupancy are collected every 20 seconds for each lane of the instrumented freeway locations. The focus of our research is on the analysis of traffic volume data at different temporal scales. Therefore, the traffic volume data are aggregated into four time series at four temporal scales, i.e., 2-minute, 4-minute, 8-minute, and 16-minute. Table 1 presents some of the important statistics of these four series, observed at Beijing Yuquanying over a period of about 40 months.

Table 1
Statistics of Beijing Yuquanying traffic flow volume data (the number of cars passed per time interval)

Parameter                  2-minute   4-minute   8-minute   16-minute
Number of data             648000     324000     162000     81000
Mean                       41         82         164        328
Standard deviation         25.42      37.72      62.32      88.56
Coefficient of variation   0.62       0.46       0.38       0.27
Maximum value              97         168        315        596
Minimum value              0          0          0          6

3.2. Correlation dimension calculation

In the present study, traffic volume data observed on the Yuquanying highway, in Beijing, China, are analyzed. The correlation functions and the exponents are now computed for the four series. The delay time, τ, for the phase-space reconstruction is computed using the autocorrelation function method and is taken as the lag time at which the autocorrelation function first crosses the zero line [21]. For the four series, the first zero value of the autocorrelation function is attained at lag times 2, 3,
4 and 3 respectively (Table 2); therefore, these values are used as the delay times in the phase-space reconstruction. For the 2-minute traffic volume series, Fig. 1 shows the relationship between the correlation integral, C(r), and the radius, r, for embedding dimensions, m, from 2 to 17. The log C(r) versus log r plots indicate clear scaling regions that allow fairly accurate estimates of the correlation exponents. Fig. 2 presents the relationship between the correlation exponent values and the embedding dimension values for the four time series. For all the series, the correlation exponent value increases with the embedding dimension up to a certain dimension, beyond which it saturates; this is an indication of the existence of deterministic dynamics. The saturation values of the correlation exponent (or correlation dimension) for the four time series are 7.86, 5.56, 4.69, and 3.43, respectively (Table 2). The finite correlation dimensions obtained for the four series indicate that they all exhibit chaotic behavior. The presence of chaos at each of these four scales suggests that the dynamics of transformation of traffic volume between these scales may also exhibit chaotic behavior. This, in turn, may imply the applicability of a chaotic approach for transformation of traffic data from one scale to another.

Table 2
Results of correlation dimension analysis of traffic data

Parameter                  2-minute   4-minute   8-minute   16-minute
Delay time                 2          3          4          3
Correlation dimension      7.86       5.56       4.69       3.43
Number of variables        8          6          5          4
Coefficient of variation   0.62       0.46       0.38       0.27

Fig. 1. log C(r) versus log r plot for 2-minute traffic volume series.

Fig. 2. Relationship between embedding dimension and correlation exponent for the 2-minute, 4-minute, 8-minute, and 16-minute series.

The correlation dimension of a time series represents the variability or irregularity of the values in the series. A series with high variability in its values yields a higher dimension, which, in turn, indicates higher complexity in the dynamics of the process. A low dimension would be the result of low variability, indicating that the dynamics of the process are less complex. The dimension results obtained above indicate that the traffic series at the 2-minute scale exhibits the highest variability. The correlation dimension value of 7.86 indicates that the number of dominant variables involved in the dynamics of the 2-minute series is 8. On the other hand, the 16-minute series, yielding a correlation dimension of 3.43, exhibits the lowest variability, indicating that the number of dominant variables involved in the 16-minute traffic dynamics is 4. For the traffic series at the 4-minute and 8-minute scales, the correlation dimensions obtained are 5.56 and 4.69, respectively, so that the number of dominant variables in the dynamics is 6 and 5, respectively (Table 2).

The correlation dimensions and, thus, the number of variables obtained for the four traffic series indicate that aggregating the volume from higher resolutions, such as 2-minute and 4-minute, to lower resolutions, such as 8-minute and 16-minute, decreases the variability of the traffic dynamics. Such an observation appears to be consistent with what is normally observed in nature; in general, the temporal aggregation of traffic volume often decreases the variability. The values of the coefficient of variation, defined as the ratio of the standard deviation to the mean and presented in Table 1, support this latter point. The highest coefficient of variation, 0.62, is observed for the (highest resolution) 2-minute traffic series, indicating that the 2-minute series exhibits the highest variability. The coefficient of variation values for the 4-minute, 8-minute, and 16-minute series are in decreasing order, with values of 0.46, 0.38, and 0.27 respectively, indicating a decreasing trend in the variability, with the (lowest resolution) 16-minute traffic volume exhibiting the lowest variability.
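The preprocessing steps described in this section (aggregating 20-second counts to coarser scales, computing the coefficient of variation, and choosing the delay time at the first zero crossing of the autocorrelation function) can be sketched as follows. This is a hedged illustration rather than the procedure actually used to produce Tables 1 and 2: the function names are hypothetical, the standard deviation is assumed to be the population form (NumPy's default), and the delay is taken as the first lag at which the sample autocorrelation is zero or negative.

```python
import numpy as np

def aggregate_counts(counts, factor):
    """Sum consecutive blocks of `factor` counts (e.g. six 20-second counts
    give one 2-minute count). An incomplete trailing block is dropped."""
    counts = np.asarray(counts, dtype=float)
    n_blocks = len(counts) // factor
    return counts[: n_blocks * factor].reshape(n_blocks, factor).sum(axis=1)

def coefficient_of_variation(x):
    """Ratio of standard deviation to mean.

    Assumption: population standard deviation (ddof=0); the Letter does not
    state which form was used.
    """
    x = np.asarray(x, dtype=float)
    return x.std() / x.mean()

def first_zero_crossing_lag(x, max_lag=None):
    """Smallest lag whose sample autocorrelation is <= 0, or None if there is
    no such lag up to max_lag (default: half the series length)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    max_lag = max_lag or len(x) // 2
    denom = np.dot(x, x)
    for lag in range(1, max_lag + 1):
        if np.dot(x[:-lag], x[lag:]) / denom <= 0:
            return lag
    return None
```

With 20-second raw counts, aggregation factors of 6, 12, 24 and 48 would correspond to the 2-minute, 4-minute, 8-minute and 16-minute series. For independent counts the coefficient of variation shrinks as the aggregation factor grows, which is the qualitative trend reported in Table 1.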
The outcomes of such studies could be very useful in verifying the present results regarding the correlation dimensions of the traffic series and the existence of chaos in the dynamics of traffic data transformation between the different scales studied.

4. Conclusions

Understanding the dynamical behavior of transformation between the traffic processes observed at different scales is important to identify the suitable type of approach and the possibility
of transformation of data from one scale to another. Using the correlation dimension method, the present study investigated traffic time series at four different temporal scales, 2-minute, 4-minute, 8-minute, and 16-minute, observed on the Yuquanying highway, in Beijing, China. The underlying assumption was that the dynamical behavior of the traffic processes at the different scales could provide important information regarding the behavior of the overall transformation process between these scales. The fact that the correlation dimension method yielded finite dimension values of 7.86, 5.56, 4.69, and 3.43 for the 2-minute, 4-minute, 8-minute, and 16-minute traffic volume series respectively suggests that all of these traffic series exhibited chaotic behavior. Hence, the transformation process between these scales might also be chaotic; however, such an interpretation must be substantiated with further evidence. Investigation of any parameter that connects these traffic series could be useful. The distribution of traffic volume from one scale to another is one such important parameter, studies of which might lead to additional information regarding the behavior of the transformation processes of the traffic data.

The correlation dimension results showed that the higher resolution traffic series yielded higher dimensions than did the lower resolution series, suggesting that the variability of the higher resolution series was greater than that of the lower resolution series. Therefore, aggregation of traffic volume from higher to lower resolutions normally decreases the variability. It should also be noted that traffic systems are generally considered to possess many degrees of freedom, corresponding to the large number of vehicles. Such complex systems may exhibit complicated behavior due to the several different collective motions of vehicles. For example, stable and unstable traveling wave solutions may appear with different wave numbers [9].
The “mixture” of such motions could result in very complicated behavior. Furthermore, the time scale of velocity oscillations decreases with the wave number, which could cause the complexity to increase when the traffic is observed on shorter time scales. Although the present Letter did not settle whether chaotic behavior exists in the dynamics of the traffic transformation process between different scales, it has provided some clues that do not exclude such a possibility, and it justifies continuing the investigation to confirm the existence of a chaotic component in the transformation process.

Acknowledgements

The referees’ comments and suggestions were very useful for improving the Letter. The authors thank Beijing STONG Intelligent Transportation System CO. LTD, Beijing (SITS) for providing the traffic data of the Beijing Yuquanying highway. This research was supported by the funds of the Chinese MED (20040004006) and Beijing Jiaotong University (2005SM065).

References

[1] K. Falconer, Fractal Geometry, Wiley, New York, 1990.
[2] K. Fraedrich, C. Larnder, Tellus A 45 (1993) 289.
[3] H. Kantz, T. Schreiber, Nonlinear Time Series Analysis, Cambridge Univ. Press, Cambridge, 1996.
[4] W.E. Leland, M.S. Taqqu, W. Willinger, D.V. Wilson, IEEE/ACM Trans. Networking 2 (1994) 1.
[5] P. Maragos, Modulation and fractal models for speech analysis and recognition, in: Proceedings of COST-249 Meeting, February 1998.
[6] A.S. Nair, J.-C. Liu, L. Rilett, S. Gupta, Non-linear analysis of traffic flow, 2001, Available from http://translink.tamu.edu/docs/Research/LinearAnalysisTrafficFlow/chaos1.PDF, accessed 27 May 2003.
[7] S. Pengjian, K. Santi, Chaos Solitons Fractals 26 (2005) 997.
[8] R.C. Hilborn, Chaos and Nonlinear Dynamics: An Introduction for Scientists and Engineers, second ed., Oxford Univ. Press, Oxford, 2001.
[9] G. Orosz, B. Krauskopf, R.E. Wilson, Physica D 211 (2005) 277.
[10] G. Orosz, R.E. Wilson, B. Krauskopf, Phys. Rev. E 70 (2004) 026207.
[11] I. Gasser, G. Sirito, B. Werner, Physica D 197 (2004) 222.
[12] K. Engelborghs, T. Luzyanina, D. Roose, ACM Trans. Math. Software 28 (2002) 1.
[13] K. Nagel, M. Paczuski, Phys. Rev. E 51 (1995) 2909.
[14] J.E. Disbro, M. Frame, Transportation Res. Rec. TRB 1225 (1989) 109.
[15] L.A. Safanov, E. Tomer, V.V. Strygin, Y. Ashkenazy, S. Havlin, Europhys. Lett. 57 (2002) 151.
[16] P. Shang, X. Li, S. Kamae, Chaos Solitons Fractals 25 (2005) 121.
[17] H.J. van Zuylen, M.S. van Geenhuizen, P. Nijkamp, Transportation Res. Rec. TRB 1685 (1999) 21.
[18] P. Grassberger, I. Procaccia, Physica D 9 (1983) 189.
[19] F. Takens, Detecting strange attractors in turbulence, in: D.A. Rand, L.S. Young (Eds.), Dynamic Systems and Turbulence, Warwick, 1980, in: Lecture Notes in Mathematics, Springer-Verlag, Berlin, 1981, p. 366.
[20] K. Fraedrich, J. Atmos. Sci. 3 (1986) 419.
[21] J. Holzfuss, G. Mayer-Kress, An approach to error estimation in the application of dimension algorithms, in: G. Mayer-Kress (Ed.), Dimensions and Entropies in Chaotic Systems, Springer, New York, 1986.