Expert Systems with Applications 39 (2012) 8474–8478
Chaotic time series prediction with employment of ant colony optimization

Vasilii A. Gromov*, Artem N. Shulga

Oles Honchar Dnepropetrovsk National University, 13 Naukova str., Dnepropetrovsk 49050, Ukraine
Keywords: Chaotic time series prediction; Ant colony optimization
Abstract

In this study, a novel method to predict chaotic time series is proposed. The method employs the ant colony optimization paradigm to analyze the topological structure of the attractor behind the given time series and to single out typical sequences corresponding to different parts of the attractor. The typical sequences are then used to predict the time series values. The method was applied to time series generated by the Lorenz system and the Mackey–Glass equation, as well as to a weather time series. The method is able to provide a robust prognosis for periods comparable with the horizon of prediction.

© 2012 Elsevier Ltd. All rights reserved.
* Corresponding author. Address: Dnepropetrovsk 49044, Post Office 44, Post Box 2718, Ukraine. Tel.: +380 (067) 3897526, +380 (056) 7440598. E-mail addresses: [email protected] (V.A. Gromov), [email protected] (A.N. Shulga).

1. Introduction

A broad range of chaotic processes encountered in real life and in technical systems inspires great interest in chaotic models and, particularly, in chaotic series prediction. Meanwhile, the predictive models and routines commonly applied to forecast stationary time series (such as ARIMA models and others) fail to cope with the chaotic series prediction problem. This can be explained by the essential non-linearity of the systems that generate chaotic series and by the complex geometrical structure of the systems' attractors. The complex dynamics manifests itself in various ways, such as rapid changes of the time series, positive Lyapunov exponents, fractal dimensions of attractors, and others (Strogatz, 2001).

Recently, plenty of biologically inspired models have been proposed to derive the structure of the strange attractor from a series and to predict the series. They can be divided into three main groups in accordance with the artificial intelligence theories they use to reconstruct the state space (Wang, Chi, Wu, & Lu, 2011) and to analyze the time series. The first group comprises neural networks, which employ their approximation capability to single out the different local trends occurring in a series and to predict them with simple linear models embedded in the network (Gan, Peng, Peng, Chen, & Inoussa, 2010; Wong, Xia, & Chu, 2010). One should also mention here the singular spectrum analysis method, which analyzes the singular values of the covariance matrix of the series and extracts information about the local trends (Elsner & Tsonis, 1996). The second group consists of fuzzy and neuro-fuzzy models used to create robust and transparent predictive systems (Fu, Wu, Jeng, & Ko, 2010; Gu & Wang, 2007). The third group comprises systems based on different distributed artificial intelligence approaches such as genetic algorithms (Mirzaee, 2009), particle swarm optimization (Zhao & Yang, 2009), ant colony optimization (Niu, Wang, & Wu, 2010; Toskari, 2009), and others. These approaches can be used to adjust the parameters of other models (Pan, Jiang, Wang, & Jiang, 2011), but they also possess predictive capabilities of their own. One could mention a number of works in which ant colony optimization was applied to forecast different chaotic series: power supply demand (Ünler, 2008), electric load (Hong, 2010), traffic flow (Hong, Dong, Zheng, & Lai, 2011), and financial series (Weng & Liu, 2006).

This paper employs ACO to extract information about the topological structure of the strange attractor from a given time series and to predict the series using the extracted information. We assume that all transient processes in the system behind the chaotic time series at hand have been completed and that the time series reflects the movement in the neighborhood of the strange attractor, however complex it is. The second assumption is that the series meets the conditions of Takens' theorem, so one can analyze the attractor structure using the series elements. Consequently, the trajectory of the system should frequently move along the same parts of the attractor, and one should encounter analogous (similar) sequences in the time series.

The rest of the paper is organized as follows. Section 2 outlines the general notions of ant colony optimization and describes the state space used to represent the inner structure of the attractor underlying the time series. Section 3 presents a complete description of the prediction routine. Section 4 provides the results of prediction for the series generated by the Lorenz system and the Mackey–Glass equation, and for a weather time series for Kiev city. Section 5 provides the conclusions.

2. Ant colony optimization
2.1. General description of ant colony optimization

Ant colony optimization (ACO) is a metaheuristic used to find paths in a graph that are best with respect to a predefined functional
(Dorigo, 1992; Dorigo & Gambardella, 1997). ACO was inspired by the way real ants search for food and retains some biological terms. For instance, the term ant stands for the simplest search agent moving along the graph; pheromone stands for the weight of a graph edge; pheromone evaporation means that the pheromone of each edge is diminished by a small predefined quantity at every iteration. The basic idea of ACO-type algorithms consists of two interweaved rules. Ants constantly move along the graph, and an outer observer evaluates the path of each ant according to a certain quality functional; after that, the pheromone of each edge belonging to the path is updated with respect to the calculated quality functional value. On the other hand, each ant chooses an edge of the graph taking into account the amount of pheromone corresponding to this edge. This allows ACO to combine the ability of distributed artificial intelligence methods to find the globally best solution with the high convergence speed pertaining to gradient algorithms. Pheromone evaporates after each iteration, so after some iterations all ants are inclined to move in the vicinity of the best solutions. Mathematically, the rules are expressed by the following formulas:
$$P_{ij}(t) = \frac{\tau_{ij}(t)^{\alpha}\left(\frac{1}{d_{ij}}\right)^{\beta}}{\sum_{k \,\in\, \text{allowed vertices}} \tau_{ik}(t)^{\alpha}\left(\frac{1}{d_{ik}}\right)^{\beta}}, \tag{2.1}$$

$$\tau_{ij}(t+1) = (1 - p)\,\tau_{ij}(t) + \sum_{k \,\in\, \text{ants that chose edge }(i,j)} \frac{Q}{L_k}, \tag{2.2}$$
where P_ij(t) is the probability of an ant transition along the edge (i, j), τ_ij(t) is the amount of pheromone corresponding to the edge (i, j), d_ij is the distance between node i and node j, α, β are parameters that control the influence of τ_ij(t) and d_ij, p is the pheromone evaporation coefficient, L_k is the cost of the kth ant's path, and Q is a constant.

2.2. Graph creation

The necessary precondition for applying ACO to any problem is to reformulate the problem in terms of nodes and edges and to provide a reasonable quality functional (Niu et al., 2010). Examples of graphs and functionals for data mining problems can be found in Abraham, Grosan, and Ramos (2006). For the chaotic series prediction problem, the graph should represent the inner structure of the attractor behind the series and should be able to collect information about the typical sequences pertaining to different parts of the attractor.

Firstly, the analyzed series is normalized so that all its elements lie in the range [0, 1]. Secondly, the interval [0, 1] is quantized into subintervals [y_i, y_{i+1}], i = 1, ..., M, where M = 1/ε and ε describes the required exactitude of the prognosis. Another parameter to be introduced is the maximal possible distance D between the positions of two adjacent elements of the series belonging to the same sequence. Thirdly, the search graph of the series is formulated as a complete multigraph G = <V, E>, |V| = M, |E| = D, with the number of vertices equal to the number of subintervals M and with each pair of vertices linked by D edges. A movement from the ith vertex to the jth one through the kth edge corresponds to the fact that two elements of the time series belong to the ith and jth subintervals, respectively, and the difference between their positions in the series equals k.

A set of typical sequences is an auxiliary structure that needs to be introduced in order to launch the algorithm. A typical sequence L(d_1(δ_1), d_2(δ_2), ..., d_{n-1}(δ_{n-1}), d_n) is characterized by the subinterval numbers d_1, d_2, ..., d_n and the differences δ_1, δ_2, ..., δ_{n-1} between the positions of the sequence elements in the series. The graph and the set of typical sequences allow one to represent correctly the typical sequences frequently encountered in the series, i.e. those belonging to different parts of the strange attractor. We assume that the sequence L(d_1(δ_1), d_2(δ_2), ..., d_{n-1}(δ_{n-1}), d_n) describes the part
of the series starting from observation y[n] if it meets the following inequality:

$$\left| y\!\left[ n + \sum_{j=1}^{i} \delta_j \right] - d_{i+1} \right| < \varepsilon, \qquad \forall i = 0 \ldots k. \tag{2.3}$$
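To make the quantization and the membership test (2.3) concrete, the following Python sketch illustrates one possible reading of these definitions. The function names, the use of subinterval centres in the comparison, and the default parameter values are illustrative assumptions of ours rather than part of the original algorithm.

```python
import numpy as np

def normalize(series):
    """Scale the series so that all elements lie in [0, 1]."""
    s = np.asarray(series, dtype=float)
    return (s - s.min()) / (s.max() - s.min())

def quantize(y, eps=0.1):
    """Map each value to the number of its subinterval; M = 1/eps subintervals."""
    M = int(round(1.0 / eps))
    return np.minimum((np.asarray(y) / eps).astype(int), M - 1)

def sequence_matches(y, n, intervals, deltas, eps=0.1):
    """Check inequality (2.3): a typical sequence given by subinterval numbers
    `intervals` (d_1, ..., d_k+1) and position differences `deltas`
    (delta_1, ..., delta_k) describes the series starting at y[n] if every
    element y[n + delta_1 + ... + delta_i] lies within eps of subinterval
    d_{i+1} (compared here to the subinterval centre -- our assumption)."""
    offsets = [0] + list(np.cumsum(deltas))
    for d, off in zip(intervals, offsets):
        idx = n + off
        if idx >= len(y):
            return False
        if abs(y[idx] - (d + 0.5) * eps) >= eps:
            return False
    return True
```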
3. Algorithms for series analysis and prediction

The considerations above allow one to formulate the following algorithms, which encode the information about the structure of the attractor behind the time series and its typical sequences, and predict the time series.

The algorithm for series analysis:

Step 0: Construct the search multigraph for the given values of the algorithm parameters. Initialize all multigraph edges with an equal initial quantity of pheromone. Create the empty set of typical sequences.
Step 1: Place ants at random on elements of the given time series and start a new iteration.
Step 2: Each ant moves to a new element of the time series and, in doing so, adds a new part to its current sequence. The probability of its transition to the new time series element is equal to the pheromone quantity of the respective multigraph edge.
Step 3: Calculate the number of entries of the sequence in the time series, C_L^k.
Step 4: Add the new sequence to the set of typical sequences if C_L^k / C_L^{k-1} ≥ C_min, where C_min is an algorithm parameter. Delete the respective ant. Add pheromone to the edges of the multigraph corresponding to the newly added sequence.
Step 5: Delete the respective ant if C_L^k < C_max, where C_max is an algorithm parameter.
Step 6: Go to Step 2 if there are any ants left in the population; go to Step 7 otherwise.
Step 7: Evaporate pheromone for all edges of the multigraph. Go to Step 1.

The algorithm terminates if the quantity of pheromone has not changed by more than p_ε during the last k iterations.

The algorithm for series prediction. For each time moment to be predicted x_t, t ∈ (t_o, t_o + N), where t_o is the number of the last known observation and N is the maximum possible length of the time series part occupied by a sequence:

Step 1: Find all sequences from the set of typical sequences that correspond to the known or already predicted part of the time series and include the element x_t.
Step 2: Single out the sequences from the set created at Step 1 for which ∀i: 0 < i < L: |x^old_{t+i} − x^new_{t+i}| < Δ_max, where L is the length of the time series part occupied by the sequence, and x^new_{t+i}, x^old_{t+i} are the series elements counted from the starting positions of the current and the previous entries of the sequence in the time series.
Step 3: Clusterize the sequences from the set created at Step 2 with respect to their prognosis for x_t.
Step 4: Single out the cluster for which the standard deviation between the cluster sequences' values and the known or already predicted part of the time series is minimal.
Step 5: Calculate the predicted value of x_t as the average over the sequences of the cluster.

The sequences collect information about the attractor and average, in a sense, sequences of the time series. This makes the prediction routine robust and allows one to use the predicted values as initial ones for further predictions.
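As an illustration, the following Python sketch shows how Steps 1–5 of the prediction routine could be realized for a single time moment, under simplifying assumptions of ours: typical sequences are stored as pairs of subinterval numbers and position differences, a candidate's prognosis for x_t is taken as the centre of its last subinterval, a simple gap-based rule stands in for the clustering of Step 3 (which the paper does not specify in detail), and the tightest cluster is chosen in place of the deviation criterion of Step 4.

```python
import numpy as np

def predict_value(y_known, t, typical_sequences, eps=0.1, gap=0.2):
    """Sketch of one prediction step (Section 3): y_known holds the known or
    already predicted values with indices 0..t-1; typical_sequences is a list
    of (intervals, deltas) pairs produced by the series-analysis algorithm."""
    candidates = []
    for intervals, deltas in typical_sequences:
        start = t - sum(deltas)               # the sequence must end exactly at position t
        if start < 0:
            continue
        offsets = [0] + list(np.cumsum(deltas))
        # Steps 1-2: all elements except the last must agree with the known part
        ok = all(abs(y_known[start + off] - (d + 0.5) * eps) < eps
                 for d, off in zip(intervals[:-1], offsets[:-1]))
        if ok:
            candidates.append((intervals[-1] + 0.5) * eps)   # implied value for x_t
    if not candidates:
        return None
    # Step 3 (simplified): cluster the candidate prognoses by a simple gap rule
    candidates.sort()
    clusters, current = [], [candidates[0]]
    for v in candidates[1:]:
        if v - current[-1] < gap:
            current.append(v)
        else:
            clusters.append(current)
            current = [v]
    clusters.append(current)
    # Steps 4-5 (simplified): take the tightest cluster and average it
    best = min(clusters, key=lambda c: float(np.std(c)))
    return float(np.mean(best))
```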
4. Numerical simulation

The proposed algorithms were applied to predict time series generated by the Lorenz system, the Mackey–Glass equation, and a temperature time series for Kiev city.

4.1. Time series generated by the Lorenz system

The Lorenz system
$$\begin{aligned} \dot{X} &= \sigma (Y - X),\\ \dot{Y} &= rX - Y - XZ,\\ \dot{Z} &= XY - bZ, \end{aligned} \tag{4.1}$$
(Lorenz, 1963), a typical benchmark for all chaotic series prediction models, was considered with the standard "chaotic" values σ = 10, b = 8/3, r = 28. The series was generated from Eq. (4.1) by the Runge–Kutta method and has a data exactitude of δ = 10⁻⁴. If one imposes a desirable exactitude of prognosis of ε = 0.1, one can calculate the horizon of prognosis for the series to be equal to 115 observations. The nearest neighbor algorithm (Sprott, 2003) was used to calculate it. The respective value of the Lyapunov exponent is 0.06. The search multigraph was constructed on the basis of 5000 observations; another 2000 observations were used to test the predictive abilities of the algorithm. The parameters of the algorithm were M = 20, δ_min = 2, δ_max = 15.

Fig. 1 plots the real (red, solid line) and the predicted (blue, dotted line) values. For this case the prediction routine makes a reasonable prognosis for 114 observations ahead, which is comparable with the horizon of prognosis. Fig. 2 presents the standard deviation of the prognosis versus the difference between the predicted and the last available numbers of the time series elements. The red (farthest) vertical line represents the horizon of prediction (115); the green (nearest) line represents the border of robust prognosis (73). Figs. 1 and 2 also reflect another peculiarity of the prediction algorithm. It is possible to run into a part of the series for which the prediction is completely incorrect (outliers in Fig. 2 near 90), but this does not mean that the algorithm is unable to predict correctly further on.
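For reference, the benchmark series can be reproduced with a standard fourth-order Runge–Kutta integration of (4.1); the step size, the initial condition, and the choice of the X coordinate in the sketch below are our assumptions, since the paper states only that a Runge–Kutta method was used.

```python
import numpy as np

def lorenz_series(n=7000, dt=0.01, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Generate n observations of the X coordinate of the Lorenz system (4.1)
    with a fixed-step RK4 integrator (step size and initial state are assumed)."""
    def f(s):
        x, y, z = s
        return np.array([sigma * (y - x), r * x - y - x * z, x * y - b * z])

    s = np.array([1.0, 1.0, 1.0])
    out = np.empty(n)
    for i in range(n):
        k1 = f(s)
        k2 = f(s + 0.5 * dt * k1)
        k3 = f(s + 0.5 * dt * k2)
        k4 = f(s + dt * k3)
        s = s + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        out[i] = s[0]
    return out

series = lorenz_series()
series = (series - series.min()) / (series.max() - series.min())  # normalize to [0, 1]
train, test = series[:5000], series[5000:]  # 5000 observations for the multigraph, 2000 for testing
```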
Fig. 1. Prediction of the chaotic time series generated by the Lorenz system.
Fig. 2. The absolute error of prediction of the chaotic time series generated by the Lorenz system.
Fig. 3. Prediction of the chaotic time series generated by the Mackey–Glass differential equation.
Fig. 4. The absolute error of prediction of the chaotic time series generated by the Mackey–Glass differential equation.
Fig. 5. The length of reasonable prognosis versus number of iterations.
The routine bases its prediction on the predicted values before the "bad" part of the series and completely ignores the values within this part (as the set of typical sequences does not contain information about such behavior).

4.2. Time series generated by the Mackey–Glass equation

Another typical benchmark for chaotic series is provided by the Mackey–Glass equation, given by
$$\frac{dx}{dt} = \frac{b\,x(t-\tau)}{1 + x^{n}(t-\tau)} - c\,x, \qquad c, b, n > 0, \tag{4.2}$$
with parameter values c = 1, b = 2, τ = 17, n = 10. The series was generated from Eq. (4.2) by the Runge–Kutta method and has a data exactitude of 10⁻⁴. If one imposes a desirable exactitude of prognosis of ε = 0.1, one can calculate the horizon of prognosis for the series to be equal to 287 observations. The nearest neighbor algorithm was used to calculate it. The respective value of the Lyapunov exponent is 0.024. The search multigraph was constructed on the basis of 5000 observations; another 2000 observations were used to test the predictive abilities of the algorithm. The parameters of the algorithm were M = 20, δ_min = 3, δ_max = 21.

Fig. 3 represents the prediction of the time series of the Mackey–Glass equation. Fig. 4 represents the absolute error of the prognosis versus the difference between the predicted and the last available numbers of the time series elements. Despite the non-smooth appearance of the error curve, it never exceeds the threshold 0.001, so the routine is able to predict robustly even beyond the horizon of prediction. Figs. 5 and 6 present the dependences of the quality of prognosis and of the length of reasonable prognosis on the number of iterations.
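A delay equation such as (4.2) needs a history buffer for x(t − τ) when it is integrated numerically. The sketch below uses a simple fixed-step Euler scheme with a constant initial history; this is an illustrative choice of ours rather than the integration scheme used in the paper.

```python
import numpy as np

def mackey_glass_series(n=7000, dt=0.1, b=2.0, c=1.0, tau=17.0, power=10):
    """Generate n observations of the Mackey-Glass equation (4.2) with an Euler
    scheme; the step size and the constant initial history are assumed."""
    lag = int(round(tau / dt))        # delay expressed in integration steps
    buf = [0.5] * (lag + 1)           # x(t) = 0.5 for t <= 0 (assumed initial history)
    out = np.empty(n)
    for i in range(n):
        x = buf[-1]                   # current value x(t)
        x_tau = buf[-1 - lag]         # delayed value x(t - tau)
        buf.append(x + dt * (b * x_tau / (1.0 + x_tau ** power) - c * x))
        out[i] = buf[-1]
    return out
```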
Fig. 6. The standard deviation versus number of iterations.
Fig. 7. One-day-ahead prediction of the temperature in Kiev city time series.
Fig. 8. Many-day-ahead prediction of the temperature in Kiev city time series.
The non-monotonicity of the plots reflects the essential non-linearity of the time series.

4.3. Weather time series prediction

The method described above was applied to predict temperature changes for Kiev city. The horizon of prognosis for the series is 19 (we assume that the temperature was measured with an exactitude of 0.1 °C and that the desirable exactitude of prognosis is 3 °C). The respective value of the Lyapunov exponent is 0.18. Fig. 7 plots the result of one-day-ahead prediction for the time series; Fig. 8 presents the many-day-ahead prediction for the series. The green vertical line represents the calculated horizon of prediction.

5. Conclusions

The method of chaotic series prediction proposed in this paper, based on ant colony optimization, is able to provide a robust prognosis for periods comparable with the horizon of prediction (it even breaks this barrier in the simplest cases). The method was tested on series generated by the Lorenz system and the Mackey–Glass equation, as well as on a weather time series. The optimal parameters of the algorithm were found for the abovementioned time series. The algorithm demonstrates computational convergence for the cases considered.

References

Abraham, A., Grosan, C., & Ramos, V. (Eds.). (2006). Swarm intelligence in data mining. Studies in Computational Intelligence (Vol. 34). Berlin: Springer-Verlag.
Dorigo, M. (1992). Optimization, learning, and natural algorithms. Doctoral dissertation, Dip. Elettronica e Informazione, Politecnico di Milano, Italy.
Dorigo, M., & Gambardella, L. M. (1997). Ant colony system: A cooperative learning approach to the traveling salesman problem. IEEE Transactions on Evolutionary Computation, 1, 53–66.
Elsner, J. B., & Tsonis, A. A. (1996). Singular spectrum analysis: A new tool in time series analysis (1st ed.). Springer.
Fu, Y. Y., Wu, C. Y., Jeng, J. T., & Ko, C. N. (2010). ARFNNs with SVR for prediction of chaotic time series with outliers. Expert Systems with Applications, 37, 4441–4451.
Gan, M., Peng, H., Peng, X., Chen, X., & Inoussa, G. (2010). A locally linear RBF network-based state-dependent AR model for nonlinear time series modeling. Information Sciences, 180, 4370–4383.
Gu, H., & Wang, H. (2007). Fuzzy prediction of chaotic time series based on singular value decomposition. Applied Mathematics and Computation, 185, 1171–1185.
Hong, W. C. (2010). Application of chaotic ant swarm optimization in electric load forecasting. Energy Policy, 38, 5830–5839.
Hong, W. C., Dong, Y., Zheng, F., & Lai, C. Y. (2011). Forecasting urban traffic flow by SVR with continuous ACO. Applied Mathematical Modelling, 35, 1282–1291.
Lorenz, E. N. (1963). Deterministic non-periodic flows. Journal of the Atmospheric Sciences, 20, 130–141.
Mirzaee, H. (2009). Linear combination rule in genetic algorithm for optimization of finite impulse response neural network to predict natural chaotic time series. Chaos, Solitons and Fractals, 41, 2681–2689.
Niu, D., Wang, Y., & Wu, D. D. (2010). Power load forecasting using support vector machine and ant colony optimization. Expert Systems with Applications, 37, 2531–2539.
Pan, Y., Jiang, J. C., Wang, R., & Jiang, J. J. (2011). Predicting the net heat of combustion of organic compounds from molecular structures based on ant colony optimization. Journal of Loss Prevention in the Process Industries, 24, 85–89.
Sprott, J. C. (2003). Chaos and time-series analysis (1st ed.). Oxford University Press.
Strogatz, S. H. (2001). Nonlinear dynamics and chaos: With applications to physics, biology, chemistry, and engineering (1st ed.). Westview Press.
Toskari, M. D. (2009). Estimating the net electricity energy generation and demand using the ant colony optimization approach. Energy Policy, 37, 1181–1187.
Ünler, A. (2008). Improvement of energy demand forecasts using swarm intelligence: The case of Turkey with projections to 2025. Energy Policy, 36, 1937–1944.
Wang, J., Chi, D., Wu, J., & Lu, H. (2011). Chaotic time series method combined with particle swarm optimization and trend adjustment for electricity demand forecasting. Expert Systems with Applications, 38, 8419–8429.
Weng, S. S., & Liu, Y. H. (2006). Mining time series data for segmentation by using ant colony optimization. European Journal of Operational Research, 173, 921–937.
Wong, W. K., Xia, M., & Chu, W. C. (2010). Adaptive neural network model for time-series forecasting. European Journal of Operational Research, 207, 807–816.
Zhao, L., & Yang, Y. (2009). PSO-based single multiplicative neuron model for time series prediction. Expert Systems with Applications, 36, 2805–2812.