Physica A 387 (2008) 5511–5517
Relationship between efficiency and predictability in stock price change
Cheoljun Eom a,∗, Gabjin Oh b, Woo-Sung Jung c
a Division of Business Administration, Pusan National University, Busan 609-735, Republic of Korea
b Pohang Mathematics Institute, Pohang University of Science and Technology, Pohang 790-784, Republic of Korea
c Center for Polymer Studies and Department of Physics, Boston University, Boston, MA 02215, USA
Article info
Article history: Received 14 February 2008; received in revised form 10 May 2008; available online 14 June 2008
PACS: 89.65.Gh; 05.45.Tp; 89.65.-s
Keywords: Hurst exponent; Approximate entropy; Nearest neighbor prediction; Efficient market hypothesis
Abstract
In this study, we evaluate the relationship between efficiency and predictability in the stock market. The efficiency, which is the issue addressed by the weak-form efficient market hypothesis, is calculated using the Hurst exponent and the approximate entropy (ApEn). The predictability corresponds to the hit-rate; this is the rate of consistency between the direction of the actual price change and that of the predicted price change, as calculated via the nearest neighbor prediction method. We find that the Hurst exponent and the ApEn value are negatively correlated. However, predictability is positively correlated with the Hurst exponent. © 2008 Elsevier B.V. All rights reserved.
1. Introduction
In the financial market, efficiency refers to the weak-form efficient market hypothesis (EMH), which concerns the information contained in past price changes [1]; lower efficiency corresponds to higher predictability. In this study, we assessed financial data empirically in order to confirm the relationship between the efficiency of the stock market and the predictability of future price changes. We applied the Hurst exponent [2] and the approximate entropy (ApEn) [3] in order to observe the long-term memory in a financial time series and estimate the efficiency of the market. Additionally, we employed the hit-rate calculated by the nearest neighbor prediction method (NN method) in order to estimate the predictability of the market [4,5].
The financial time series contains a variety of stylized facts, including power-law distribution and long-term memory [6–15]. In particular, the volatility of returns evidences a long-term memory property, whereas the temporal correlation of returns is consistent with a random walk [13–15]. Previous studies have demonstrated that the Hurst exponent, which reveals the long-term memory property, can be utilized to quantify the efficiency of the stock market [14,15]. The ApEn, which quantifies the complexity and randomness of the time series, can also be used for estimating the efficiency [16]. Meanwhile, previous studies have shown that the correlation between the Hurst exponent and the ApEn is negative [17–19]. We applied both the Hurst exponent and the ApEn to assess the efficiency.
We employed the NN method in order to estimate the predictability. The NN method analyzes similar price change patterns of the past, such that one can predict future price changes. Previous studies have demonstrated that the NN method
∗ Corresponding author. E-mail address: [email protected] (C. Eom).
0378-4371/$ – see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.physa.2008.05.059
is useful for prediction within a relatively short time frame [20–22]. The predictability in this study corresponds to the hit-rate, which quantifies the consistency between the actual price change and the one predicted via the NN method.
In this work, we investigate data for 27 stock market indices. Using the Hurst exponent, the ApEn, and the NN method, we estimate the efficiency and predictability of the stock markets. We verify the negative relationship between the Hurst exponent and the ApEn. We also demonstrate that the Hurst exponent provides more significant information than the ApEn, although the ApEn can be used as a complementary measurement of the efficiency. Additionally, we conduct an empirical measurement of the relationship between efficiency and predictability. We confirm that mature markets evidence a high level of efficiency, with low Hurst exponents and high ApEn values, whereas the majority of Asian and South American markets evidence a strong long-term memory property and a low level of efficiency.
In the following section, we describe the data and methods utilized in this paper. In Section 3 we present the results, and in Section 4 we summarize the findings and conclusions of the study.
2. Data and methods
2.1. Data
We analyzed the daily indices of 27 stock markets, which were obtained from DataStream (http://www.datastream.net).
The market indices comprise 13 markets in the Asia–Pacific region, 7 in North and South America, and 7 in Europe: China (Shenzhen Composite), Hong Kong (Hangseng), India (Bombay SE200), Indonesia (Jakarta SE Composite), Japan (Nikkei225), Korea (KOSPI200), Malaysia (Kuala Lumpur Composite), Philippines (SE Composite), Singapore (Straits Times), Taiwan (SE Weighted), Thailand (Bangkok S.E.T.), Australia (ASX), New Zealand (NZX), Argentina (Base General), Brazil (Bovespa), Chile (General), Mexico (IPC), Peru (Lima SE General), Canada (TSX Composite Index), USA (S&P 500), Austria (ATX), Denmark (Copenhagen KBX), France (CAC 40), Germany (DAX 30), Italy (Milan Comit General), Netherlands (AEX Index), and the UK (FTSE100). The data period was 15 years, from January 1992 to December 2006, with the exception of Australia (from June 1992), Argentina (from June 2000), and Denmark (from January 1996). The return time series of each market index was calculated as the logarithmic change of the index, R(t) = ln P(t) − ln P(t − 1), in which P(t) represents the index on day t.
2.2. Hurst exponent
In this study, we employ the Hurst exponent to quantify the long-term memory property. Previous studies utilized a variety of methods to observe the long-term memory property quantitatively, including the Hurst exponent [14,15,23–25], ARFIMA (autoregressive fractional integration moving average) [26], and FIGARCH (fractionally integrated generalized autoregressive conditional heteroscedasticity) [27]. However, the long-term memory property does not depend on the method utilized [28]. Additionally, many methods can be employed to calculate the Hurst exponent: the classical re-scaled range analysis [2], the generalized Hurst exponent method [15], the modified R/S analysis [29], the GPH method [30], and the detrended fluctuation analysis (DFA) [31]. Weron determined that the DFA method is the most efficient [32], and thus we utilize the DFA method in this study.
The Hurst exponent calculated by the DFA method can be explained as follows. First, after subtracting the mean \bar{x} from the original time series x(i), one accumulates the series y, defined by
y(k) = \sum_{i=1}^{k} [x(i) - \bar{x}], \quad k = 1, \ldots, N, (1)
in which N is the length of the time series. Next, the accumulated time series is divided into windows of equal length n. We estimate the trend line y_n in each window by ordinary least squares and eliminate the trend by subtracting y_n from the accumulated time series in that window. This process is applied to every window, and the fluctuation magnitude F(n) is defined as
F(n) = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} [y(i) - y_n(i)]^2 }. (2)
The above process is repeated for every scale (n, 2n, 3n, …, dn). We then examine whether a scaling relationship exists across the scales. The scaling relationship is defined by
F(n) \approx c \cdot n^{H}, (3)
in which c is a constant and H is the Hurst exponent. H = 0.5 corresponds to no memory in the time series. If 0 ≤ H < 0.5, the time series has a short-term memory. When H > 0.5, it has a long-term memory. As H approaches 1, the long-term memory property becomes stronger, because similar patterns in past price changes are highly persistent.
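The DFA steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `dfa_hurst`, the choice of scales, and the decision to drop leftover points that do not fill a whole window are all assumptions made for the example. It uses a linear (first-order) trend in each window and reads H off the slope of log F(n) against log n.

```python
import math
import random

def dfa_hurst(x, scales):
    """Estimate H in F(n) ~ c * n**H by DFA with a linear trend per window.

    `scales` should contain window lengths of at least 4 (hypothetical helper).
    """
    mean = sum(x) / len(x)
    # Accumulated (profile) series of the mean-subtracted data, as in Eq. (1).
    profile, running = [], 0.0
    for value in x:
        running += value - mean
        profile.append(running)

    log_n, log_f = [], []
    for n in scales:
        sq_sum, count = 0.0, 0
        t_mean = (n - 1) / 2.0
        sxx = sum((t - t_mean) ** 2 for t in range(n))
        for w in range(len(profile) // n):  # leftover points are dropped
            seg = profile[w * n:(w + 1) * n]
            s_mean = sum(seg) / n
            # Ordinary least-squares trend line inside this window.
            slope = sum((t - t_mean) * (s - s_mean) for t, s in enumerate(seg)) / sxx
            for t, s in enumerate(seg):
                trend = s_mean + slope * (t - t_mean)
                sq_sum += (s - trend) ** 2
                count += 1
        log_n.append(math.log(n))
        log_f.append(0.5 * math.log(sq_sum / count))  # log of the RMS fluctuation

    # Least-squares slope of log F(n) versus log n is the estimate of H.
    a = sum(log_n) / len(log_n)
    b = sum(log_f) / len(log_f)
    num = sum((u - a) * (v - b) for u, v in zip(log_n, log_f))
    den = sum((u - a) ** 2 for u in log_n)
    return num / den

# Usage: uncorrelated noise should give an estimate close to 0.5.
random.seed(0)
white = [random.gauss(0.0, 1.0) for _ in range(4096)]
h_white = dfa_hurst(white, [8, 16, 32, 64, 128])
```

Fitting the slope over several scales, instead of using a single window length, is what makes the estimate a scaling exponent in the sense of Eq. (3).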
Meanwhile, we employ the average Hurst exponent, H_j = \frac{1}{T-1} \sum_{t=1}^{T-1} H_t^j, in which j represents the market index. The exponent is estimated repeatedly until December 2005 (t = T − 1) using a time window with a width of five years that is shifted by one year over the whole period. The data cover the period from 1992 (t = 1) to 2006 (t = T). However, the NN method requires out-of-sample data for the one-year prediction period, and the same sample size must be used for all measurements (the Hurst exponent, the ApEn, and the NN method); we therefore excluded the 2006 data from the average value. Additionally, averaging keeps the estimated Hurst exponent robust to time variation.
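The windowing scheme can be illustrated with a short Python sketch. The helper names `window_years` and `rolling_average`, the per-year dictionary layout, and the exact mapping of windows to calendar years are assumptions made for the example; the text only states a five-year window shifted by one year, with 2006 held out.

```python
def window_years(first_year, last_year, width=5):
    """Return (start, end) year pairs for windows of `width` years shifted by one year."""
    return [(start, start + width - 1)
            for start in range(first_year, last_year - width + 2)]

def rolling_average(values_by_year, estimator, first_year, last_year, width=5):
    """Average an estimator (e.g. a Hurst or ApEn routine) over all rolling windows.

    `values_by_year` maps a year to the list of daily returns in that year
    (a hypothetical data layout chosen for this sketch).
    """
    estimates = []
    for start, end in window_years(first_year, last_year, width):
        sample = []
        for year in range(start, end + 1):
            sample.extend(values_by_year[year])
        estimates.append(estimator(sample))
    return sum(estimates) / len(estimates)

# Usage: windows covering 1992-2005, with 2006 left out of the average.
windows = window_years(1992, 2005)
```

Passing the estimator as a function keeps the same windows for every measure, which matches the requirement that all measurements use the same sample size.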
2.3. Approximate entropy
The ApEn, which measures the randomness in the time series, has also been introduced as a measure of efficiency [16,17,19], and we employ it for that purpose. The ApEn is defined as
ApEn(m, r) = \Phi^{m}(r) - \Phi^{m+1}(r), (4)
in which m is the embedding dimension and r is the tolerance determining the similarity between price change patterns. \Phi^{m}(r) is expressed by
\Phi^{m}(r) = (N - m + 1)^{-1} \sum_{i=1}^{N-m+1} \ln[C_i^{m}(r)], \qquad C_i^{m}(r) = \frac{B_i(r)}{N - m + 1}, (5)
in which B_i(r) is the number of data pairs within the tolerance of similarity r. We calculate the similarity of each price change pattern u(k) (k = 1, 2, …, m) in the time series by the distance d[x(i), x(j)] between two vectors x(i) and x(j), defined as
B_i \equiv d[x(i), x(j)] \le r, \qquad d[x(i), x(j)] = \max_{k=1,2,\ldots,m} |u(i + k - 1) - u(j + k - 1)|. (6)
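Eqs. (4)–(6) translate fairly directly into code. The following Python sketch is an illustration under stated assumptions, not the authors' program: the function names are hypothetical, self-matches are counted in B_i (so the logarithm is always defined), and the tolerance is set to a fraction of the population standard deviation of the input.

```python
import math
import random

def _phi(u, m, r):
    """Phi^m(r) of Eq. (5) for the series u."""
    n_vec = len(u) - m + 1
    vectors = [u[i:i + m] for i in range(n_vec)]
    total = 0.0
    for vi in vectors:
        # B_i(r): vectors whose maximum componentwise distance to vi is <= r
        # (the vector itself is included, so b_i >= 1).
        b_i = sum(1 for vj in vectors
                  if max(abs(a - b) for a, b in zip(vi, vj)) <= r)
        total += math.log(b_i / n_vec)
    return total / n_vec

def approximate_entropy(u, m=2, r_factor=0.2):
    """ApEn(m, r) = Phi^m(r) - Phi^(m+1)(r), with r = r_factor * std(u)."""
    mean = sum(u) / len(u)
    std = math.sqrt(sum((v - mean) ** 2 for v in u) / len(u))
    r = r_factor * std
    return _phi(u, m, r) - _phi(u, m + 1, r)

# Usage: a strictly alternating series is highly regular, a random one is not.
regular = [1.0 if i % 2 == 0 else -1.0 for i in range(200)]
random.seed(0)
irregular = [random.gauss(0.0, 1.0) for _ in range(300)]
apen_regular = approximate_entropy(regular)
apen_irregular = approximate_entropy(irregular)
```

The double loop costs on the order of N² comparisons, which is acceptable for a few hundred points but would need a faster neighbor search for multi-year daily series.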
Accordingly, one can observe the ApEn by comparing the relative frequency of repeated pattern occurrences for the embedding dimensions m and m + 1. The ApEn becomes smaller if similar price change patterns for embedding dimensions m and m + 1 appear more frequently. When the patterns for m and m + 1 occur with the same frequency, the ApEn is zero. When the ApEn is small, similar price change patterns appear frequently; the time series then has a lower level of randomness and the efficiency is low. In this study, we apply the embedding dimension m = 2 and the tolerance of similarity r = 20% of the standard deviation of the time series, as in previous works [16,17,19]. We employ the average ApEn, A_j = \frac{1}{T-1} \sum_{t=1}^{T-1} A_t^j, in the same way as the average Hurst exponent. The data for the year 2006 (t = T) were also excluded from the analysis.
2.4. Nearest neighbor prediction method
In this section, we introduce the NN method. First, we reconstruct the pattern series V_n^{m,\tau}, defined as
V_n^{m,\tau} = [x_n, x_{n-\tau}, \ldots, x_{n-(m-1)\tau}] (7)
(n = (m − 1)τ + 1, …, T), where m is the embedding dimension and τ is the time delay applied to the financial time series (x_1, x_2, …, x_T). Each pattern V_n^{m,\tau} has length m; when the time series has length N (= T), the pattern series contains N − m + 1 patterns. We find price change patterns similar to the target price change pattern V_{target}^{m,\tau} at time t so that we can predict the price change at time t + 1. The distance D = (V_{target}^{m,\tau} − V_n^{m,\tau})^2 between the two patterns V_{target}^{m,\tau} and V_n^{m,\tau} is used to select similar patterns. When two patterns are wholly identical, D is zero. We subsequently select the K patterns (V_{n,k}^{m,\tau}, k = 1, 2, …, K) that exhibit the smallest distance, so that K is the number of similar patterns. We then select V_{n,k(∗)}^{m+1,\tau} (k(∗) = 1, 2, …, K^∗) for the embedding dimension m + 1. Finally, the next price change x_{n+1} can be predicted from V_{n,k(∗)}^{m+1,\tau}.
The hit-rate utilized in this study corresponds to the rate of consistency between the direction of the actual price change and that of the predicted one. We test a given year using a time window of one-year width shifted by one day, and take the hit-rate NN_t^j as the rate of consistency. We note that the average hit-rate, NN_j = \frac{1}{T-1} \sum_{t=1}^{T-1} NN_t^j, is employed in addition to the average Hurst exponent and the average ApEn.
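A simplified version of the nearest neighbor prediction and the hit-rate can be sketched as follows. This is not the procedure used in the paper: it keeps only the K nearest patterns in dimension m and averages the values that followed them, omitting the refinement step in dimension m + 1, and it uses an expanding history instead of a one-year window. The names `nn_predict` and `hit_rate` and the default parameters are assumptions made for the example.

```python
import random

def nn_predict(history, m, k, tau=1):
    """Predict the value following `history` from its k nearest past patterns."""
    def pattern(n):
        # Delay vector [x_n, x_(n-tau), ..., x_(n-(m-1)tau)] as in Eq. (7).
        return [history[n - j * tau] for j in range(m)]

    last = len(history) - 1
    target = pattern(last)
    candidates = []
    for n in range((m - 1) * tau, last):  # only patterns whose successor is known
        dist = sum((a - b) ** 2 for a, b in zip(target, pattern(n)))
        candidates.append((dist, history[n + 1]))
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    # Simplification: average the successors of the nearest patterns.
    return sum(successor for _, successor in nearest) / len(nearest)

def hit_rate(series, start, m=2, k=5):
    """Share of predictions whose sign matches the sign of the actual change."""
    hits, total = 0, 0
    for t in range(start, len(series)):
        predicted = nn_predict(series[:t], m, k)
        if predicted != 0 and series[t] != 0:
            total += 1
            hits += (predicted > 0) == (series[t] > 0)
    return hits / total

# Usage: a perfectly repeating pattern is fully predictable,
# whereas white noise should be predicted at roughly chance level.
alternating = [1.0 if i % 2 == 0 else -1.0 for i in range(60)]
rate_alternating = hit_rate(alternating, start=20, m=2, k=3)
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(500)]
rate_noise = hit_rate(noise, start=200)
```

Predictions of exactly zero and zero actual changes are skipped because they have no direction to compare.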
Fig. 1. (Color online). The relationship between the Hurst exponent and the ApEn for (a) the actual market data and (b) the random time series. The circles (red), triangles (magenta), squares (blue), and diamonds (green) indicate the market indices for the Asia–Pacific region (13 markets), South America (5 markets), North America (2 markets), and Europe (7 markets).
Fig. 2. (a) The ApEn and (b) the Hurst exponent for the past (square) and the present (circle). The x-axis corresponds to the market classified by region (Asia–Pacific region, South and North America, and Europe).
3. Results
Fig. 1 shows the relationship between the average Hurst exponent H_j and the average ApEn A_j. In Fig. 1(a), we find that the relationship between the Hurst exponent and the ApEn is negative (ρ(H_j, A_j) = −53%), which confirms the result of the previous studies [17–19]. Meanwhile, the random time series shows no correlation between the Hurst exponent and the ApEn (Fig. 1(b)). In Fig. 1(a), the majority of Asian (Indonesia, Philippines, Malaysia, and India) and South American stock markets evidence lower ApEn and higher Hurst exponent values. In other words, those markets evidence a stronger long-term memory property and a lower level of randomness. On the other hand, the majority of mature markets (France, the UK, Germany, Canada, the USA, Australia, Japan, New Zealand, and Korea) have higher ApEn and lower Hurst exponent values, and thus their efficiency is higher.
In addition, we assess the efficiency as a function of time (Fig. 2). In Fig. 2, the past represents the period from January 1992 to December 1996, and the present corresponds to the period from January 2002 to December 2006. Australia, Argentina, and Denmark are excluded from this analysis because their data periods are different. Fig. 2(a) shows the ApEn, whereas Fig. 2(b) shows the Hurst exponent. We find that the efficiency of the market increases as time progresses.
Fig. 3. (Color online.) The relationship between efficiency and predictability. (a) and (b) show the relationship between the predictability and the ApEn, and (c) and (d) display the relationship between the predictability and the Hurst exponent. (a) and (c) correspond to the actual market data, whereas (b) and (d) represent the random time series. The circles (red), triangles (magenta), squares (blue), and diamonds (green) indicate the market indices of Asia–Pacific (13 countries), South America (5 countries), North America (2 countries), and Europe (7 countries), respectively.
In particular, the efficiency of emerging markets, including the Asian and South American ones, increases more than that of the mature markets.
In Fig. 3, we display the relationship between efficiency and predictability using the average Hurst exponent H_j, the average ApEn A_j, and the average hit-rate NN_j. The hit-rate and the ApEn have a negative correlation (Fig. 3(a)), whereas the hit-rate and the Hurst exponent evidence a positive correlation (Fig. 3(c)). However, we were unable to find any correlation between the hit-rate and either the ApEn or the Hurst exponent using the random time series (Fig. 3(b) and (d)). Accordingly, we empirically observe the relationship between efficiency and predictability: ρ(NN_j, A_j) = −42% and ρ(NN_j, H_j) = 86%. Moreover, |ρ(NN_j, H_j)| is larger than |ρ(NN_j, A_j)|. We also find that most Asian and South American markets evidence lower efficiency and higher predictability, whereas mature markets exhibit higher efficiency and lower predictability.
We next assess artificial time-correlated noise and white noise using the same methods, namely, the Hurst exponent, the ApEn, and the hit-rate. The Hurst exponent of the time-correlated noise ranges from 0.3 to 1.0. The length of each noise series matches that of the actual market data. Fig. 4 shows the relationships between the hit-rate and the Hurst exponent (Fig. 4(a) and (c)), and between the hit-rate and the ApEn (Fig. 4(b) and (d)). Fig. 4(a) and (b) correspond to the time-correlated noise, whereas (c) and (d) correspond to the white noise. The hit-rate of the time-correlated noise shows a clear dependence on the Hurst exponent (Fig. 4(a)). Interestingly, when the Hurst exponent exceeds 0.5, and especially when it is near 1.0, the predictability (hit-rate) approaches 1.0. A higher Hurst exponent indicates a stronger long-term memory in the time series, such that the predictability builds up as the Hurst exponent increases. The ApEn of the noise is negatively related to the Hurst
Fig. 4. The relationship between the efficiency and the predictability obtained using the noise data. (a) and (b) correspond to the time-correlated noise, whereas (c) and (d) correspond to the white noise. (a) and (c) show the relationship between the predictability and the Hurst exponent, and (b) and (d) display the relationship between predictability and the ApEn.
exponent, as is the case for the actual market data, and the hit-rate of the noise also depends on the ApEn as well as on the Hurst exponent. However, the white noise, which does not exhibit a time-correlated property, evidences no relationship between the hit-rate and either the Hurst exponent or the ApEn (Fig. 4(c) and (d)).
4. Conclusions
We empirically assessed the relationship between efficiency and predictability using the market indices of various countries. In the weak-form EMH, lower efficiency means that the information in past price changes is useful for predicting future price changes. We employed the Hurst exponent and the ApEn to estimate market efficiency, and the NN method was used to assess market predictability. We verified that the relationship between the Hurst exponent and the ApEn is negative. Accordingly, the Hurst exponent and the ApEn can be utilized to measure the efficiency in a complementary fashion. We determined that the Hurst exponent has a stronger correlation with predictability than does the ApEn. In the financial field, our study can provide useful information when international investment funds choose between active and passive strategies for various financial markets, so that they can establish an efficient portfolio.
References
[1] E.F. Fama, J. Finance 25 (1970) 383.
[2] H.E. Hurst, Trans. Am. Soc. Civil Eng. 116 (1951) 770.
[3] S.M. Pincus, Proc. Natl. Acad. Sci. USA 88 (1991) 2297.
[4] D. Farmer, J. Sidorowich, Phys. Rev. Lett. 39 (1987) 226.
[5] T. Sauer, J.A. Yorke, M. Casdagli, J. Stat. Phys. 65 (1991) 579.
[6] R.N. Mantegna, H.E. Stanley, Nature 376 (1995) 46; R.N. Mantegna, H.E. Stanley, Nature 383 (1996) 587.
[7] B.B. Mandelbrot, Quant. Finance 1 (2001) 560.
[8] X. Gabaix, P. Gopikrishnan, V. Plerou, H.E. Stanley, Nature 423 (2003) 267.
[9] B. Jacobsen, J. Empirical Finance 3 (1996) 393.
[10] C. Hiemstra, J.D. Jones, J. Empirical Finance 4 (1997) 373.
[11] W. Willinger, M.S. Taqqu, V. Teverovsky, Finance Stoch. 3 (1999) 1.
[12] P. Grau-Carles, Physica A 287 (2000) 396.
[13] G. Oh, S. Kim, C. Eom, J. Korean Phys. Soc. 48 (2006) S197; S. Chae, W.-S. Jung, J.-S. Yang, H.-T. Moon, J. Korean Phys. Soc. 48 (2006) 313; W.-S. Jung, F.Z. Wang, S. Havlin, T. Kaizoji, H.-T. Moon, H.E. Stanley, Eur. Phys. J. B 62 (2008) 113.
[14] D.O. Cajueiro, B.M. Tabak, Chaos Solitons Fractals 22 (2004) 349; D.O. Cajueiro, B.M. Tabak, Chaos Solitons Fractals 23 (2004) 671.
[15] T.D. Matteo, T. Aste, M.M. Dacorogna, J. Bank. Finance 29 (2005) 827; T.D. Matteo, Quant. Finance 7 (2007) 21.
[16] S.M. Pincus, R.E. Kalman, Proc. Natl. Acad. Sci. USA 101 (2004) 13709.
[17] T. Kim, C. Eom, G. Oh, Korean J. Finance 18 (2005) 239.
[18] C. Eom, G. Oh, Rev. Bus. Econ. 18 (2005) 2859.
[19] G. Oh, S. Kim, C. Eom, Physica A 382 (2007) 209.
[20] O. Bajo-Rubio, F. Fernández-Rodríguez, S. Sosvilla-Rivero, Econ. Lett. 39 (1992) 207.
[21] F. Fernández-Rodríguez, S. Sosvilla-Rivero, M.D. García-Artiles, Jpn. World Econ. 11 (1999) 395.
[22] A.S. Soofi, L. Cao, Econ. Lett. 62 (1999) 175.
[23] C.W.J. Granger, Z. Ding, J. Econometrics 73 (1996) 61.
[24] J.T. Barkoulas, C.R. Baum, N. Travlos, Appl. Financ. Econ. 10 (2000) 177.
[25] R. Kilic, Appl. Financ. Econ. 14 (2004) 915.
[26] C.W.J. Granger, R. Joyeux, J. Time Ser. Anal. 11 (1980) 15.
[27] R.T.T. Baillie, T. Bollerslev, H.O. Mikkelsen, J. Econometrics 74 (1996) 3.
[28] C. Eom, S. Choi, G. Oh, W.-S. Jung, Physica A 387 (2008) 4630.
[29] A.W. Lo, Econometrica 59 (1991) 1279.
[30] J. Geweke, S. Porter-Hudak, J. Time Ser. Anal. 4 (1983) 221.
[31] C.K. Peng, S.V. Buldyrev, S. Havlin, M. Simons, H.E. Stanley, A.L. Goldberger, Phys. Rev. E 49 (1994) 1685.
[32] R. Weron, Physica A 332 (2002) 285.