A parametric, information-theory model for predictions in time series


Physica A 405 (2014) 63–69

Contents lists available at ScienceDirect

Physica A journal homepage: www.elsevier.com/locate/physa

M.T. Martín a, A. Plastino b,∗, V. Vampa c, G. Judge d

a Facultad de Ciencias Exactas and Conicet, Universidad Nacional de La Plata, Argentina
b IFLP-CCT-CONICET, Universidad Nacional de La Plata, C. C. 727, 1900 La Plata, Argentina
c Facultad de Ingeniería, Universidad Nacional de La Plata, Argentina
d Giannini Hall, University of California at Berkeley, CA, United States

Highlights
• Economic, chaotic time-series are analyzed with information theory methods.
• The maximum entropy principle is appealed to.
• Predictions concerning the Dow-Jones series are made.

Article info

Article history: Received 29 October 2013; received in revised form 20 February 2014; available online 12 March 2014.

Abstract

In this work, a method based on information theory is developed to make predictions from a sample of nonlinear time series data. Numerical examples are given to illustrate the effectiveness of the proposed method. © 2014 Elsevier B.V. All rights reserved.

Keywords: Time series Information theory Parametric Inference

∗ Corresponding author. Tel.: +54 22145239995; fax: +54 2214523995. E-mail addresses: [email protected], [email protected] (A. Plastino).

http://dx.doi.org/10.1016/j.physa.2014.02.055
0378-4371/© 2014 Elsevier B.V. All rights reserved.

1. Introduction

One of the most important aspects of nonlinear dynamics concerns time series analysis and how to predict the future behavior of a system. Economic-behavior systems, like physical and biological systems, are stochastic in nature. Thus the predictability of a dynamic system's behavior may best be considered in a random-outcome context, and probability distribution functions may be used to measure the statistical nature of the system. System predictability may then be studied quite naturally with information-theoretic methods and functionals, since the object of study is random in nature and the functionals may be interpreted in terms of uncertainty and as measures of the difference between statistical distributions. Thus, an information-theoretic basis is provided for unlocking the dynamic content of nonlinear time series data and using this information to predict the future behavior of the system.

2. Method

Given a signal $x$ from a dynamical system $D : \mathbb{R}^S \to \mathbb{R}^S$, the corresponding time series consists of a sequence of measurements $\{v(t_n),\ n = 1, \dots, N\}$ on a system considered to be in a state described by $x(t_n) \in \mathbb{R}^S$ at discrete times $t_n$, where $N$ is the length of the time series. It is known (see Refs. [1,2]) that for $T \in \mathbb{R}$, $T > 0$, there exists a functional form of the type,

$$v(t + T) = F(\mathbf{v}(t)), \tag{1}$$

where

$$\mathbf{v}(t) = [v_1(t), v_2(t), \dots, v_d(t)], \tag{2}$$

and $v_i(t) = v(t - (i - 1)\Delta)$, for $i = 1, \dots, d$, where $\Delta$ is the time lag and $d$ is the embedding dimension of the reconstruction. In this paper $d$ is determined from the data itself, by the method of false nearest neighbors. We consider (as in Ref. [2]) a particular representation for the mapping function of Eq. (1), expressing it, using Einstein's summation notation, as an expansion of the form

$$F^*(\mathbf{v}(t)) = a_0 + a_{i_1} v_{i_1} + a_{i_1 i_2} v_{i_1} v_{i_2} + a_{i_1 i_2 i_3} v_{i_1} v_{i_2} v_{i_3} + \cdots + a_{i_1 i_2 \dots i_{n_p}} v_{i_1} v_{i_2} \cdots v_{i_{n_p}}, \tag{3}$$
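The delay reconstruction of Eq. (2) and the pairing of each delay vector with its future value in Eq. (1) can be illustrated with a short sketch. It assumes a uniformly sampled series, with the lag and the horizon T both expressed in whole samples; the function names `delay_embed` and `embedded_pairs` are hypothetical and do not come from the paper.

```python
def delay_embed(series, d, lag):
    """Build the delay vectors of Eq. (2).

    For each index n with a full history, the vector is
    [v(n), v(n - lag), ..., v(n - (d - 1) * lag)].
    Returns (indices, vectors).
    """
    start = (d - 1) * lag  # first index for which all d components exist
    indices = list(range(start, len(series)))
    vectors = [[series[n - i * lag] for i in range(d)] for n in indices]
    return indices, vectors


def embedded_pairs(series, d, lag, horizon):
    """Pair each delay vector with the value `horizon` samples later,
    i.e. (v(t_n), v(t_n + T)) with T = horizon (an integer, by assumption)."""
    indices, vectors = delay_embed(series, d, lag)
    return [
        (vec, series[n + horizon])
        for n, vec in zip(indices, vectors)
        if n + horizon < len(series)  # drop vectors whose target is unavailable
    ]
```

For example, `embedded_pairs(list(range(10)), d=3, lag=2, horizon=1)` starts with the pair `([4, 2, 0], 5)`.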

where $1 \le i_k \le d$ and $n_p$ is the polynomial degree chosen to expand the mapping $F^*$. The number of parameters in Eq. (3) corresponding to the terms of degree $k$ is the number of combinations with repetition,

$$\binom{d}{k}^{*} = \frac{(d + k - 1)!}{k!\,(d - 1)!}. \tag{4}$$

The length of the vector of parameters, $\mathbf{a}$, is

$$N_c = \sum_{k=1}^{n_p} \binom{d}{k}^{*}. \tag{5}$$
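The counting in Eqs. (4) and (5) is easy to check numerically. The sketch below is only an illustration: the function names are hypothetical, and the constant term $a_0$ is left out of the count because the sum in Eq. (5) starts at $k = 1$. Under this reading, d = 3 with n_p = 2 gives 3 + 6 = 9 coefficients, while a count of 34 arises for d = 3 with n_p = 4.

```python
from math import comb


def n_monomials(d, k):
    """Eq. (4): combinations with repetition, (d + k - 1)! / (k! (d - 1)!)."""
    return comb(d + k - 1, k)


def n_parameters(d, n_p):
    """Eq. (5): sum of Eq. (4) over degrees k = 1..n_p (a_0 is not counted)."""
    return sum(n_monomials(d, k) for k in range(1, n_p + 1))


print(n_parameters(3, 2))  # 3 + 6 = 9
print(n_parameters(3, 4))  # 3 + 6 + 10 + 15 = 34
```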

As an information recovery criterion we determine the vector a by using the maximum entropy principle (MEP) [3]. Our objective is a model that attains high predictive ability. Computations are made on the basis of a specific information supply, given by M points of the series

$$\{\mathbf{v}(t_n),\ v(t_n + T)\}, \qquad n = 1, \dots, M. \tag{6}$$

Given the data set in Eq. (6), the parametric mapping in Eq. (3) is determined by requiring that the following condition be satisfied:

$$v(t_n + T) = F^*(\mathbf{v}(t_n)), \qquad n = 1, \dots, M. \tag{7}$$

In this way, a rectangular system of equations is obtained,

$$W \cdot \mathbf{a} = (v(t_n + T))_{n=1,\dots,M}, \tag{8}$$

where $W$ is a matrix of size $M \times N_c$, $M$ is the number of information points in Eq. (6), and $N_c$ is the number of parameters of the model (cf. Eq. (3)). Using the Moore–Penrose pseudo-inverse of the matrix $W$ [4], the solution is

$$\mathbf{a} = P_{MP}(W) \cdot (v(t_n + T))_{n=1,\dots,M}, \tag{9}$$

where $P_{MP}(W) = (W^T W)^{-1} W^T$. Thus, the most probable configuration is the one that is linked to the mean value of the probability distribution associated with the pseudo-inverse matrix of $W$. Once the pertinent parameters are determined, they are used to predict $M_P$ new series values,

$$(\tilde{v}(t_n + T))_{n=1,\dots,M_P} = \widetilde{W} \cdot \mathbf{a}, \tag{10}$$

where $\widetilde{W}$ is a matrix of size $M_P \times N_c$. $M_P$ is such that $M_P - M$ new series values are predicted.

3. Applications

In order to illustrate the performance of the method we now discuss two specific time series predictions, with reference to possible chaotic systems. In particular, we deal with (i) the well-known logistic system and (ii) an actual, empirical time series.
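The estimation and prediction steps of Eqs. (8)–(10) in Section 2 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: each row of W holds the monomials of Eq. (3), a leading 1 is included for the constant term $a_0$ (an assumption, since Eq. (5) does not count it), and the coefficients are obtained from the normal equations $(W^T W)\,\mathbf{a} = W^T y$, which matches $P_{MP}(W) = (W^T W)^{-1} W^T$ only when W has full column rank. All function names are hypothetical.

```python
from itertools import combinations_with_replacement
from math import prod


def design_row(vec, n_p):
    """One row of W: 1 (for a_0, an assumption) followed by every monomial
    of degree 1..n_p in the components of the delay vector `vec`."""
    row = [1.0]
    for k in range(1, n_p + 1):
        for combo in combinations_with_replacement(range(len(vec)), k):
            row.append(prod(vec[i] for i in combo))
    return row


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial
    pivoting. Raises ZeroDivisionError if A is exactly singular."""
    n = len(A)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit(vectors, targets, n_p):
    """Eq. (9) via the normal equations: a = (W^T W)^{-1} W^T y."""
    W = [design_row(v, n_p) for v in vectors]
    cols = range(len(W[0]))
    WtW = [[sum(row[i] * row[j] for row in W) for j in cols] for i in cols]
    Wty = [sum(row[i] * y for row, y in zip(W, targets)) for i in cols]
    return solve(WtW, Wty)


def predict(a, vectors, n_p):
    """Eq. (10): apply the fitted coefficients to new delay vectors."""
    return [sum(c * w for c, w in zip(a, design_row(v, n_p))) for v in vectors]
```

For ill-conditioned or rank-deficient W, a pseudo-inverse computed from a singular value decomposition (for example `numpy.linalg.pinv`) would be the more robust choice; the normal-equation form is used here only because it mirrors the formula quoted in the text.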


Fig. 1. Orbit diagram for the logistic map. (a) Original series values. (b) Predicted series values.

3.1. The logistic map

The logistic map yields what is perhaps the simplest example of chaos, and thus becomes a useful tool to illustrate new concepts in the treatment of dynamical systems. We focus on the quadratic map $F : x_n \to x_{n+1}$ [5–7], described by the ecologically motivated, dissipative system given by the first-order difference equation

$$x_{n+1} = r \cdot x_n \cdot (1 - x_n), \tag{11}$$

where $0 \le x_n \le 1$ and $r$ ($0 \le r \le 4$) represents the growth rate. In Fig. 1(a) the well-known bifurcation diagram is displayed as a function of the parameter $r$ for the range $3.5 \le r \le 4.0$, with $\Delta r = 0.005$. We evaluated numerically 1000 values using the logistic map (Eq. (11)), starting from a random initial condition for each $r$-value. The first $10^4$ iterations are disregarded because they correspond to transient states.

Applying the model described in the previous section, the data generated are subdivided into two parts. One of them is employed to adjust the parameters of the model, while the other is used to check its subsequent predictive power. Considering $\Delta r = 0.005$ and 1000 values for each $r \in [3.5, 3.55]$, 10 000 series values are used to obtain the parameter vector $\mathbf{a}$ in Eq. (9). It suffices to include in the model polynomials of the second order ($n_p = 2$) (see Eq. (3)). Using a $d = 3$ embedding dimension, the number of estimated parameters is $N_c = 34$ (Eq. (5)). Considering $T = 25$ (Eq. (1)), predictions obtained with Eq. (10) for the range $3.5 \le r \le 4$ are shown in Fig. 1(b). The proposed method performs well. A remarkable fact to be emphasized is the rather small number of samples needed for the modeling process.
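The data behind an orbit diagram such as Fig. 1(a) can be generated with a few lines of Python. The sketch below iterates Eq. (11), discards a transient, and keeps the following values for each r. The function names, the `seed` argument, and the default transient length of 10 000 iterations are assumptions made for illustration.

```python
import random


def logistic_orbit(r, n_keep=1000, n_transient=10_000, x0=None, seed=None):
    """Iterate x -> r * x * (1 - x), drop the transient, return n_keep values.

    If x0 is None, a random initial condition in (0, 1) is drawn.
    """
    rng = random.Random(seed)
    x = rng.uniform(0.01, 0.99) if x0 is None else x0
    for _ in range(n_transient):  # transient states, not recorded
        x = r * x * (1.0 - x)
    orbit = []
    for _ in range(n_keep):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit


def orbit_diagram(r_min=3.5, r_max=4.0, dr=0.005, **kwargs):
    """Map each r on the grid r_min, r_min + dr, ..., r_max to its orbit."""
    n_steps = round((r_max - r_min) / dr)
    diagram = {}
    for i in range(n_steps + 1):
        r = min(r_min + i * dr, r_max)  # guard against floating-point overshoot
        diagram[r] = logistic_orbit(r, **kwargs)
    return diagram
```

Plotting every stored value against its r reproduces the familiar period-doubling picture; at r = 3.2, for instance, the retained values alternate between two points.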


Fig. 2. Daily time series with 30 000 (≈100 years) values of the DJIA index.

Table 1. Relative mean square errors of our predictions.

Model time series interval    Predict time series interval    Relative mean square error
7500–9300 (7 years)           9300–14 400 (20 years)          0.036
16 000–17 500 (6 years)       17 500–18 400 (4 years)         0.018
25 500–26 500 (4 years)       26 500–27 500 (4 years)         0.033

Comparing with the original series (Fig. 1(a)), the method adequately predicts both the periodic windows and the period-doubling bifurcations, as can be seen in Fig. 1(b). Note that, in the chaotic region, our vertically spanned ranges for 3.7 ≤ r ≤ 3.95 are smaller than the original ones, while the ranges are the same.

3.2. Dow Jones industrial average

This second application involves time series predictions for a stochastic system, the Dow Jones Industrial Average. It is also called the Industrial Average, the Dow Jones, or simply the Dow, and is a stock market index. It is one of several indices created by Wall Street Journal editor and Dow Jones & Company co-founder Charles Dow, and is based on a self-organized system of behavioral data. The industrial average was first calculated on May 26, 1896. Currently owned by S&P Dow Jones Indices, which is majority owned by McGraw-Hill Financial, it is the most notable of the Dow Averages, of which the first (non-industrial) was first published on February 16, 1885. It is an index that shows how 30 large publicly owned companies based in the United States have traded during a standard trading session in the stock market.

In Fig. 2 we display the complete time series to be used. This index was measured daily between 1896/05/26 and 2013/03/04 and we have the corresponding 29 272 values (117 years). We present three experiments, using the time series values in the interval [7500, 27 500], corresponding to the years from 1926 to 2006. In each case we used one interval to find the model parameters and then another, larger interval to predict, as we did with the logistic map. The results are shown in Figs. 3(a), 4(a), 5(a) and summarized in Table 1. The original time series values and the predicted ones are overlapped (blue and red refer to original and predicted values, respectively). The time interval used for modeling purposes, in each case, is enclosed between vertical lines.

Also, in each interval, we zoomed in on a subinterval containing 75 values to illustrate the predictive accuracy of our model. Accordingly, in Figs. 3(b), 4(b), and 5(b) we plotted the curve of our predicted values together with the associated error curve. As is evident, just a few years are used to obtain very good predictions for long time intervals ahead. In the first case, only 7 years (just 1800 values) were needed to model and predict the next 20 years (5100 values). Our results indicate that the proposed approach is very efficient: it has high predictive accuracy and does not require large amounts of computing time. As in the logistic map, the embedding dimension used is d = 3, and the number of parameters estimated was Nc = 34. In these instances, the relative mean square errors are, respectively, 0.036 (Fig. 3), 0.018 (Fig. 4), and 0.033 (Fig. 5).

Fig. 3. (a) The original time series’ values (1926/08/13–1953/06/10) together with our predictions are overlapped (blue and red refer to original and predicted values, respectively). (b) Zoom choosing a subinterval containing 75 series’ values. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 4. (a) The original time series’ values (1960/08/02–1970/03/26) and our predictions are overlapped (blue and red refer to original and predicted values, respectively). (b) Zoom choosing a subinterval containing 75 series’ values. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 5. (a) The original time series values (1998/04/06–2006/02/15) and our predictions are overlapped (blue and red refer to original and predicted values, respectively). (b) Zoom choosing a subinterval containing 75 series values. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 6. 2013 original time series values and our predictions are overlapped (blue and red refer to original and predicted values, respectively). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

4. 2013 example

In the previous section, daily DJ predictions were made based on known data outcomes. We finish this paper by making predictions outside of the known data for the period 2013/01/02–2013/10/04 (see Fig. 6). At the date of writing the paper, we could only compare the actual and predicted data until 2013/11/04. The actual daily values were not available for the period 2013/10/04–2013/11/04, and only the predicted daily values are given. In the period where predicted and actual daily values were available, the accuracy of the predictions is approximately the same as for the previously reported sample prediction errors.
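The relative mean square errors quoted in Table 1 and referred to above are not defined by an explicit formula in the text. As an illustration only, the sketch below uses one common normalization, the sum of squared prediction errors divided by the sum of squared actual values; this definition and the function name `relative_mse` are assumptions, and the paper may have used a different normalization.

```python
def relative_mse(actual, predicted):
    """Relative mean square error, assumed here to be
    sum((y - y_hat)^2) / sum(y^2) over the prediction interval."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    num = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    den = sum(y * y for y in actual)
    return num / den


# Toy check: errors of +1 and -1 around a constant series of 2.0
print(relative_mse([2.0, 2.0], [1.0, 3.0]))  # (1 + 1) / (4 + 4) = 0.25
```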


5. Conclusions

We have studied one of the most important aspects of nonlinear dynamics, namely time series analysis. Our main interest focused on how to predict the future behavior of a system from an associated stochastic time series. We have advanced an approach that permits successful predictions both for the logistic map and for an important financial time series, the Dow Jones Industrial Average.

References

[1] F. Takens, Dynamical Systems and Turbulence, in: Lecture Notes in Mathematics, vol. 898, Springer, Berlin, 1981.
[2] L. Diambra, A. Plastino, Phys. Lett. A 216 (1996) 278.
[3] E.T. Jaynes, Phys. Rev. 106 (1957) 620; 108 (1957) 171; R.D. Rosenkrantz (Ed.), Papers on Probability, Statistics and Statistical Physics, Reidel, Dordrecht, Boston, 1987; A. Katz, Principles of Statistical Mechanics, The Information Theory Approach, Freeman and Co., San Francisco, 1967.
[4] J. Baker-Jarvis, J. Math. Phys. 30 (1989) 302.
[5] J.C. Sprott, Chaos and Time Series Analysis, Oxford University Press, Oxford, 2004.
[6] H.O. Peitgen, H. Jürgens, D. Saupe, Chaos and Fractals, New Frontiers of Science, Springer-Verlag, New York, 1992.
[7] J.P. Crutchfield, J.D. Farmer, B.A. Huberman, Phys. Rep. 92 (1982) 45.