A study of the interplay between the structure variation and fluctuations of the Shanghai stock market


Physica A 391 (2012) 3198–3205. Contents lists available at SciVerse ScienceDirect. Physica A journal homepage: www.elsevier.com/locate/physa



Yang Chunxia ∗, Xia Bingying, Hu Sen, Wang Rui

Nanjing University of Information Science and Technology, Nanjing 210044, China

Article info

Article history: Received 1 September 2011. Received in revised form 5 January 2012. Available online 21 January 2012.

Keywords: Mutual information; Stock network; Scale-free degree distribution; Volatility clustering

Abstract

The interplay between the variation of the stock network structure and the fluctuations of the stock market is attracting increasing attention. In this work, employing a moving window to scan through every stock price time series over the period from 2 January 2001 to 7 December 2010, we use mutual information to measure the statistical interdependence between stock prices, and we construct a corresponding network for 501 Shanghai stocks in every given window. We then address the time-varying relationship between the structure variation and the fluctuations of the Shanghai stock market. The results indicate that at turning points the growing independence of stocks disrupts the scale-freeness of the degree distribution, and the Shanghai stock index shows little volatility clustering. In contrast, under normal market conditions, the stock networks have a scale-free degree distribution and the degree of volatility clustering is a little higher.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

The interdependence between stock prices in a financial market, itself a complex system, has long attracted considerable attention, and many complex network models have been constructed to study it [1–9]. For example, Tse et al.'s network of US stock prices gives full information about their interdependence and indicates that one stock price is strongly influenced by a relatively small number of other stocks [8]; in a comparative study with static and dynamic thresholds, Qiu Tian et al. investigate the network topology dynamics of the American and Chinese stock markets [9]. Note that in these works the Pearson correlation function of a pair of anomaly time series is used to construct the stock networks [6–9]. As is known, the Pearson correlation function usually reveals only a linear relationship, whereas the highly nonlinear processes of a real financial market call for nonlinear methods to obtain more reliable results. Because mutual information captures nonlinear relationships between time series and can disclose their nonlinear characteristics [10], we use it to measure the statistical interdependence between stock prices.

Besides, network dynamics has been exploited to uncover the way in which the stock market varies, and Liu et al. found the interesting result that the scale-freeness of the degree distribution is disrupted when the market experiences fluctuation [11]. That is to say, network variation and market fluctuation are related. However, more detailed information is needed: one can wonder what the market volatility is when the disruption of the scale-freeness of the degree distribution takes place, and existing studies do not give a satisfying answer.

Hence, we first construct stock networks by the mutual information method, and then address this question by investigating the time-varying relationship between the network structure variation and the degree of volatility clustering. In particular, we base our study



∗ Corresponding author. Tel.: +86 18761806626. E-mail addresses: [email protected], [email protected] (Y. Chunxia).

0378-4371/$ – see front matter © 2012 Elsevier B.V. All rights reserved. doi:10.1016/j.physa.2012.01.015

Y. Chunxia et al. / Physica A 391 (2012) 3198–3205


on Shanghai stocks, so that the networks constructed from these stocks can be consistently compared with the volatility of the Shanghai market index. After detailed analyses, the results obtained differ from previous ones and help disclose the way in which the stock market varies.

The paper is organized as follows. Section 2 briefly introduces the measurement of volatility clustering and stock network construction by the method of mutual information. Section 3 presents detailed analyses of the relationship between structural change and fluctuations of the Shanghai stock market. Finally, Section 4 provides our conclusion.

2. Network construction and volatility clustering

2.1. Quantitative measurement of volatility clustering

For a volatility series $\{R(t)\}$, Tseng and Li introduce a moving window of size $n$ to scan through it, and use 1 to represent the largest $p\%$ of fluctuations ($p$ is a constant) and 0 for all the other, smaller fluctuations [12]. In this way, they obtain a sequence that contains only the values 0 and 1:

$$A(t) = \begin{cases} 1, & R(t) \text{ is a big fluctuation}, \\ 0, & \text{otherwise}. \end{cases} \tag{1}$$

Counting the total number of trading days among the largest $p\%$ of fluctuations in volatility within this window, they get another series. For example, they put the first window on the first 10 days of the whole series and count the number of days among the largest 20% of fluctuations within this 10-day window; then the window slides 1 day along the time series and they again count the number of days among the largest 20% of fluctuations within this next 10-day window. The same procedure is repeated until the whole time series has been scanned. In this way they obtain a large fluctuation frequency series

$$H(t) = \sum_{i=1}^{n} A(t+i). \tag{2}$$
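The two steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names are hypothetical, $p$ is passed as a fraction (0.2 for 20%), the number of "big" days is taken as the rounded value of $p$ times the series length, and windows are indexed from zero, so the first entry of the result covers the first $n$ days.

```python
def large_fluctuation_indicator(volatility, p):
    """Return A(t): 1 for the largest fraction p of values, 0 otherwise.

    Assumption: the number of big days is round(p * length); the source
    does not say how the cut-off is handled for non-integer counts.
    """
    n_large = round(p * len(volatility))
    # Indices ordered by decreasing volatility; the first n_large are "big" days.
    order = sorted(range(len(volatility)), key=lambda t: volatility[t], reverse=True)
    indicator = [0] * len(volatility)
    for t in order[:n_large]:
        indicator[t] = 1
    return indicator


def large_fluctuation_frequency(indicator, n):
    """Return H(t): the number of big days in each n-day window, sliding 1 day."""
    window_sum = sum(indicator[:n])
    counts = [window_sum]
    for t in range(n, len(indicator)):
        # Slide the window: add the newest day, drop the oldest one.
        window_sum += indicator[t] - indicator[t - n]
        counts.append(window_sum)
    return counts
```

For a series of length $T$ this returns $T - n + 1$ window counts, one per position of the sliding window.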

The ratio $Q_n \equiv \sigma_e/\sigma_G$ is defined to measure the degree of volatility clustering, where $\sigma_e$ and $\sigma_G$ are the standard deviations of the number of days of the largest $p\%$ of fluctuations for a given volatility series and for simulated Gaussian data sets, respectively. The larger the ratio is, the larger the degree of clustering will be. When the given time series is completely random, the ratio is equal to 1. Incidentally, $\sqrt{np(1-p)}$ is the theoretical value of the standard deviation of the large fluctuation frequency series for Gaussian random noise (with $p$ written as a fraction rather than a percentage), so

$$Q_n \equiv \frac{\sigma_e}{\sqrt{np(1-p)}}. \tag{3}$$
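As a rough illustration of Eq. (3), the sketch below computes $Q_n$ for a volatility series in plain Python. The function name is hypothetical, $p$ is a fraction, and the use of the population standard deviation for $\sigma_e$ is an assumption, since the text does not say which estimator is meant.

```python
import math
import statistics


def clustering_degree(volatility, n, p):
    """Q_n = sigma_e / sqrt(n p (1 - p)), with p given as a fraction."""
    # Step 1: indicator A(t) of the largest fraction p of fluctuations (Eq. (1)).
    n_large = round(p * len(volatility))
    order = sorted(range(len(volatility)), key=lambda t: volatility[t], reverse=True)
    indicator = [0] * len(volatility)
    for t in order[:n_large]:
        indicator[t] = 1
    # Step 2: large fluctuation frequency H(t) over sliding n-day windows (Eq. (2)).
    counts = [sum(indicator[t:t + n]) for t in range(len(indicator) - n + 1)]
    # Step 3: ratio of the observed spread to the Gaussian-noise value (Eq. (3)).
    # Assumption: population standard deviation; the source does not specify.
    sigma_e = statistics.pstdev(counts)
    return sigma_e / math.sqrt(n * p * (1 - p))
```

For independent random noise the returned ratio should come out close to 1, while a series whose large fluctuations arrive in bursts should give a value clearly above 1.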

2.2. Network construction

2.2.1. Mutual information

The mutual information $M_{ij}$ from information theory can be interpreted as the excess amount of information generated by falsely assuming two time series $a_i$ and $a_j$ to be independent. It offers promising new perspectives for uncovering strongly nonlinear relationships between time series [10,13,14]. By definition, $M_{ij}$ is large if the two time series are highly linearly (anti)correlated; a highly nonlinear relationship between $a_i$ and $a_j$ also yields a large $M_{ij}$. The mutual information $M_{ij}$ between $a_i$ and $a_j$ can be calculated using

$$M_{ij} = \sum_{\mu, v} p_{ij}(\mu, v) \log \frac{p_{ij}(\mu, v)}{p_i(\mu)\, p_j(v)}, \tag{4}$$

where $p_i(\mu)$ is the probability density function (PDF) of the time series $a_i$, and $p_{ij}(\mu, v)$ is the joint PDF of the pair $(a_i, a_j)$. By definition, $M_{ij}$ is symmetric, so $M_{ij} = M_{ji}$ [15]. If logarithms to base 2 are used, the standard unit of measurement of mutual information is the bit.

The principal difficulty in calculating mutual information from experimental data is how to estimate $p_{ij}(\mu, v)$. We use a simple histogram approach with equally sized boxes to estimate the probability densities. Because the estimator (Eq. (4)) is known to depend on the box size and partitioning [16–18], we use an identical partitioning for all pairs $\{i, j\}$ to guarantee optimal comparability of the $M_{ij}$. The number $m$ of boxes is determined using a formula from [19]:

$$m = 1.87 \times (l - 1)^{2/5}, \tag{5}$$

where $l$ is the length of the time series. If a box of size $\Delta i\,\Delta j$ in the $\{i, j\}$ plane contains $N_{ij}$ points, we estimate $p_{ij}$ as $N_{ij}/N_{\text{total}}$, where $N_{\text{total}}$ is the total number of points in the plane. In this way, $p_{ij}(\mu, v)$ and $M_{ij}$ can be calculated. The algorithm is feasible, since the application to network construction requires only the correct estimation of relative differences of $M_{ij}$ between all pairs of time series.
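A minimal sketch of the histogram estimator of Eqs. (4) and (5) is given below in plain Python. The function names are hypothetical, logarithms are taken to base 2 so the result is in bits, and two details are assumptions that the text leaves open: the box count of Eq. (5) is rounded to the nearest integer, and each series is binned over its own minimum-to-maximum range. For a 400-day window, Eq. (5) gives $m \approx 20.5$, so about 20 or 21 boxes per axis depending on the rounding.

```python
import math


def box_count(length):
    """Number of boxes from Eq. (5); rounding to an integer is an assumption."""
    return max(1, round(1.87 * (length - 1) ** 0.4))


def mutual_information(a_i, a_j, m):
    """Histogram estimate of M_ij in bits, with m equal-width boxes per axis."""

    def box_index(series):
        low, high = min(series), max(series)
        width = (high - low) / m or 1.0  # constant series: put everything in box 0
        # The maximum value would land in box m, so clamp it into box m - 1.
        return [min(int((x - low) / width), m - 1) for x in series]

    idx_i, idx_j = box_index(a_i), box_index(a_j)
    total = len(a_i)
    joint, marg_i, marg_j = {}, {}, {}
    for u, v in zip(idx_i, idx_j):
        joint[(u, v)] = joint.get((u, v), 0) + 1
        marg_i[u] = marg_i.get(u, 0) + 1
        marg_j[v] = marg_j.get(v, 0) + 1
    # p_ij = N_ij / N_total; empty boxes contribute nothing to the sum.
    return sum(
        (n_uv / total) * math.log2(n_uv * total / (marg_i[u] * marg_j[v]))
        for (u, v), n_uv in joint.items()
    )
```

Two identical series give the entropy of the binned series, and two series whose binned values are independent give a value of zero, which matches the interpretation of $M_{ij}$ as the information lost by assuming independence.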


Fig. 1. Nonlinear behavior of the closing price for a stock labeled SH600007.

Fig. 2. Nonlinear behavior of the closing price for two stocks labeled SH600007 and SH600008.

2.2.2. Nonlinearity of stock price

Palus introduced a method to detect nonlinearity in a given time series $x(t)$, which is usually considered as a realization of a stochastic process $X(t)$ [20,21]. First, two stochastic variables $X(t)$ and $X(t+\tau)$ are constructed. Then the mutual information $M(\tau)$ and another value $L(\tau)$ are computed,

$$L(\tau) = \frac{1}{2}\left(\log C_{11} + \log C_{22}\right) - \frac{1}{2}\left(\log \sigma_1 + \log \sigma_2\right), \tag{6}$$

where $\sigma_1$ and $\sigma_2$ are the eigenvalues of the covariance matrix of the two-dimensional random variable $(X(t), X(t+\tau))$, and $C_{11}$ and $C_{22}$ are the diagonal elements of that covariance matrix. Finally, the patterns of variation of $M(\tau)$ and $L(\tau)$ over the delay time $\tau$ are compared to deduce the nonlinearity of the given time series. When $M(\tau)$ and $L(\tau)$ change similarly, this reflects a linear correlation between $X(t)$ and $X(t+\tau)$. In contrast, when the patterns of $M(\tau)$ and $L(\tau)$ differ significantly, $X(t)$ and $X(t+\tau)$ are nonlinearly related.

By this method, we detect the nonlinear behavior of the closing price of the stock labeled SH600007 over the period from 2 January 2001 to 7 December 2010. Fig. 1 shows the corresponding result: the patterns of $M(\tau)$ and $L(\tau)$ are obviously different when the delay time $\tau$ is small ($\tau \approx 400$). So $X(t)$ and $X(t+\tau)$ are nonlinearly related and the stock price has nonlinear characteristics. Fig. 2 shows the corresponding result for the two stocks labeled SH600007 and SH600008 over the same period. Similarly, the tendencies of $M(\tau)$ and $L(\tau)$ are obviously different when the delay time $\tau$ is small ($\tau \approx 400$), so $X(t)$ and $Y(t+\tau)$ are nonlinearly related. These two experiments indicate that nonlinear processes do exist in a real stock market, so we use mutual information to investigate the interdependence between stock price time series. Note that the length of the nonlinear correlation is approximately 400 days, so the size of our moving window is 400.

2.2.3. Network construction

Let the graph $G = (V, E)$ represent a stock interdependence network, where $V$ and $E$ are the sets of vertices and edges, respectively. Vertices denote stocks. To define our criterion for connecting a pair of stocks, we need a threshold value $\tau$ for the mutual information (this threshold is distinct from the delay time $\tau$ of Section 2.2.2).
Obviously, different values of $\tau$ define networks with the same set of vertices but


Fig. 3. Part of the stock network of the Shanghai market over the period from 4 August 2004 to 30 March 2006. Node labels are the Shanghai stock abbreviations.

different sets of edges. If the threshold is too small, there will be too many edges between vertices to analyze well. Similarly, too large a threshold will leave too few edges and too little information to analyze. Based on an analysis of the number of edges in the stock network, we find that most networks have a clear structure when $\tau$ lies in the range from 1.2 to 1.9. Suppose that the threshold is $\tau = 1.55$; the connection criterion for stock $i$ and stock $j$ is $M_{ij} \geq \tau$, and the adjacency matrix is as follows:

$$E = \begin{cases} e_{ij} = 1, & i \neq j \text{ and } M_{ij} \geq \tau, \\ e_{ij} = 0, & i = j \text{ or } (i \neq j \text{ and } M_{ij} < \tau). \end{cases} \tag{7}$$
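The connection rule of Eq. (7) translates directly into code. The sketch below is illustrative only: the function name is hypothetical, and the mutual information matrix is assumed to be available as a symmetric list of lists.

```python
def adjacency_from_mutual_information(mi_matrix, threshold):
    """Build the adjacency matrix E of Eq. (7) from a matrix of M_ij values."""
    size = len(mi_matrix)
    return [
        # e_ij = 1 only for distinct stocks whose M_ij reaches the threshold.
        [1 if i != j and mi_matrix[i][j] >= threshold else 0 for j in range(size)]
        for i in range(size)
    ]


# Hypothetical M_ij values for three stocks (not data from the paper).
example_mi = [
    [3.00, 1.60, 0.40],
    [1.60, 3.00, 1.55],
    [0.40, 1.55, 3.00],
]
example_adjacency = adjacency_from_mutual_information(example_mi, 1.55)
```

With these made-up values the second stock is linked to both others, while the first and third stocks are not linked to each other because their mutual information is below the threshold.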

By connecting 501 stocks over the period from 4 August 2004 to 30 March 2006, we construct a stock network of the Shanghai market. Part of it is shown in Fig. 3; here, the mutual information threshold is $\tau = 1.55$.

2.2.4. Topology structure of the network

2.2.4.1. Degree distribution. The degree of vertex $i$ is $k_i = \sum_{j \neq i} e_{ij}$, which denotes the number of vertices connected with $i$. The vertex degree distribution function $P_k$ gives the probability that a randomly selected vertex has $k$ edges [22]. Many empirical studies on actual networks indicate that the vertex degree distribution obeys a power law [22],

$$P_k \propto k^{-\gamma}, \tag{8}$$

where $\gamma$ is a scaling parameter. Formula (8) can be equivalently expressed as

$$\ln P_k \propto -\gamma \ln k, \tag{9}$$

that is, $\ln P_k$ is linear in $\ln k$ with slope $-\gamma$.
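As a small illustration, the degrees $k_i$ and the empirical distribution $P_k$ can be computed from an adjacency matrix as follows. The function names are hypothetical, and the adjacency matrix is assumed to be a list of lists of 0 and 1 entries as in Eq. (7).

```python
from collections import Counter


def vertex_degrees(adjacency):
    """k_i = sum over j != i of e_ij, for every vertex i."""
    return [
        sum(row[j] for j in range(len(row)) if j != i)
        for i, row in enumerate(adjacency)
    ]


def degree_distribution(degrees):
    """Empirical P_k: the fraction of vertices that have exactly k edges."""
    counts = Counter(degrees)
    total = len(degrees)
    return {k: counts[k] / total for k in sorted(counts)}


# A four-vertex star: vertex 0 is linked to every other vertex.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
star_degrees = vertex_degrees(star)
star_distribution = degree_distribution(star_degrees)
```

Plotting $\ln P_k$ against $\ln k$ for such a distribution is the usual visual check of Eq. (9): for a power law the points fall roughly on a straight line.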

2.2.4.2. Cluster coefficient. If the $k_i$ nearest neighbors of vertex $i$ have $m_i$ edges among them, the ratio of $m_i$ to $k_i(k_i - 1)/2$ is the cluster coefficient of vertex $i$. The network cluster coefficient is calculated by averaging the cluster coefficients of all vertices. Many empirical studies on actual networks indicate that they have larger network cluster coefficients than stochastic networks of the same size [22].

2.2.5. Structure variation of stock network

To exploit the structure variation of a stock network, the moving window method is employed. A window of width 400 days is used, and the window slides 1 day at each step. For example, we put the first window on the first 400 days


Fig. 4. Degree distribution of a stock network of the Shanghai market over the period from 4 August 2004 to 30 March 2006 for τ = 1.55; the power-law exponent is 2.54.

of the whole series, compute the mutual information $M_{ij}$ for each pair of stocks, and construct a stock network within this 400-day window; then the window slides 1 day along the daily stock price time series, new mutual information for each pair of stocks is computed, and another stock network is constructed within this next 400-day window. The same procedure is repeated until the whole time series has been scanned. We then have a set of networks, each constructed with the same threshold ($\tau = 1.55$). The statistical properties of their degree distributions are examined with the maximum likelihood method and the Kolmogorov–Smirnov (KS) statistic proposed by Clauset et al. [23]. To quantify how closely a hypothesized power-law distribution resembles the actual distribution of an observed set of samples, we calculate the KS goodness-of-fit statistic with their methods [23].

Fig. 4 shows the degree distribution of a stock network of the Shanghai market over the period from 4 August 2004 to 30 March 2006. The estimates are obtained by minimizing the standard KS statistic,

$$D = \max_{x \geq x_{\min}} \left| S(x) - P(x) \right|. \tag{10}$$

Reliable estimates of $x_{\min}$ and the power-law exponent are 4 and 2.54, respectively, the former minimizing $D$ ($D = 0.0516$). Here, $S(x)$ is the cumulative distribution function (CDF) of the data for observations with value at least $x_{\min}$, $P(x)$ is the CDF of the power-law model that best fits the data in the region $x \geq x_{\min}$, and $D$ is the maximum distance between the CDF of the data and that of the fitted model. That is to say, $D$ is a fitting error that indicates the goodness of fit of the given data to a theoretical power-law distribution. If the degree distribution deviates significantly from a power-law distribution, the fitting error becomes large. In contrast, a small fitting error reflects the scale-freeness of a network. Therefore, the small value $D = 0.0516$ indicates that the degree distribution shown in Fig. 4 is well fitted by a power-law decaying function spanning a range from a few connected stocks to many.

To rule out a non-power-law distribution, we compute the corresponding p-value for the KS test, shown in Fig. 5. The p-value quantifies how plausible it is that our data were drawn from the hypothesized power-law distribution. If the p-value is much less than 1, it is unlikely that the data are drawn from a power-law distribution. If it is close to 1, the data may be drawn from a power-law distribution [23]. As shown in Fig. 5, whether the sample sizes are large or small, the p-values for our distribution remain high, so the power-law model can be regarded as a good fit for our distribution. Over many experiments, we find that most networks have a scale-free degree distribution when $\tau$ is in the range from 1.2 to 1.9, with the power-law exponent varying from 1 to 3. Note that estimating the power-law exponent requires the value of $x_{\min}$ to be determined first.
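The fitting error $D$ of Eq. (10) can be illustrated with the following sketch for a given $x_{\min}$. It is a simplification and not the exact procedure of Ref. [23]: it uses the continuous power-law model, for which the maximum likelihood exponent is $1 + n / \sum_i \ln(x_i / x_{\min})$ and the model CDF is $P(x) = 1 - (x/x_{\min})^{1-\gamma}$, whereas vertex degrees are integers and Ref. [23] also gives a discrete treatment and a bootstrap procedure for the p-value. The function name is hypothetical.

```python
import math


def fit_power_law(samples, x_min):
    """Return (exponent, D) for the tail x >= x_min.

    Assumption: continuous power-law approximation; this is a sketch and
    not the discrete estimator or the p-value computation of Ref. [23].
    """
    tail = sorted(x for x in samples if x >= x_min)
    n = len(tail)
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0.0:
        raise ValueError("all tail values equal x_min; exponent is undefined")
    exponent = 1.0 + n / log_sum  # maximum likelihood estimate
    distance = 0.0
    for rank, x in enumerate(tail):
        model_cdf = 1.0 - (x / x_min) ** (1.0 - exponent)
        # Compare the model with the empirical CDF just before and at x.
        distance = max(
            distance,
            abs((rank + 1) / n - model_cdf),
            abs(rank / n - model_cdf),
        )
    return exponent, distance
```

A small returned distance means the tail is close to the fitted power law, in the sense in which the fitting error is used in the text; selecting $x_{\min}$ by minimizing $D$ would amount to calling this function over a range of candidate values.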
We then uniformly take the average degree as the value of $x_{\min}$ to analyze the statistical properties of each stock network and to compare their changes over time.

3. Analysis of the structural variation and fluctuations for the Shanghai stock market

To uncover the way in which the stock market varies, we investigate the interplay between the structure variation and fluctuations of the Shanghai market. First, we focus on the volatility clustering of the Shanghai market index. Then, we investigate the relationship between the structure variation of stock networks and volatility clustering. For a consistent comparison, we construct a network and calculate the degree of volatility clustering of the Shanghai market index in the same observation window. Here, the size of the window is 400 days. To compute the degree of volatility clustering in this window for the largest 17% of fluctuations, another window of 10 days is used to count the total number of trading days among the largest 17% of fluctuations in the volatility. When the 10-day window slides 1 day at each step along the daily volatility series, the large fluctuation frequency series in the 400 days being


Fig. 5. The p-value relative to the maximum likelihood power-law model for the distribution shown in Fig. 4, as a function of n.

considered can be obtained and the corresponding degree of volatility clustering can also be calculated. When the 400-day window slides 1 day along the daily volatility series, the degree of volatility clustering in the next window can be computed similarly. Repeating the same procedure until the whole time series has been scanned gives the variation of the degree of volatility clustering over time shown in Fig. 6(a). Here, the time marked is the intermediate time of the window being considered. For example, for a 400-day window from 2 January 2001 to 5 September 2002, the time marked on the x axis is 5 November 2001, the intermediate time of the window. From Fig. 6(a), one can see that the degree of volatility clustering is close to 1 around 1 July 2005, 16 October 2007, and 1 December 2008, which corresponds to the market index fluctuating completely randomly.

At the same time, stock networks of the corresponding windows are constructed, and the average degree ($K$), average cluster coefficient ($C$), fitting error, and p-value of every network are calculated. Here, the mutual information threshold is $\tau = 1.55$. Fig. 6(b)–(e) show the variation of the average degree, average cluster coefficient, fitting error, and p-value over time. The time marked is the same as that in Fig. 6(a). Basically, the network constructed from stock price time series in a particular 400-day window reflects the internal structure of the market for the 400-day period concerned. As time goes by, the series of networks offers information about its structural changes, and Fig. 6(b)–(e) indeed capture how the network structure and parameters vary over time.

Fig. 6(d) shows that around 1 July 2005, 16 October 2007, and 1 December 2008 the fitting error is a little higher, whereas it is lower over other periods. This indicates that the networks over the other periods may have a scale-free degree distribution, while the degree distribution of networks around these three dates deviates significantly from the power-law distribution. Fig. 6(e) shows that the p-values for our distributions are above the threshold 0.05 over the other periods, while they fall below the threshold around 1 July 2005, 16 October 2007, and 1 December 2008. This indicates that the power-law model is a good fit for the distributions except over the periods around these three dates.

Fig. 6(b)–(c) indicate that the average degree and the average cluster coefficient have a similar wave behavior. Generally, most networks have a larger average degree and average cluster coefficient. However, around 1 July 2005, 16 October 2007, and 1 December 2008, both the average degree and the average cluster coefficient are a little lower than their neighboring values. For a stock network, although the values of the average degree and average cluster coefficient do not have a specific economic significance, they do reflect differences in network topology. In particular, maximum values of the average degree and average cluster coefficient indicate that most stocks depend on each other at that moment, while minimum values correspond to the case when most stocks are mutually independent. Fig. 6 also makes other aspects clear: the periods around 1 July 2005, 16 October 2007, and 1 December 2008 are turning points of the Shanghai market; at turning points, the independence of stocks increases and the scale-freeness of the degree distribution is disrupted.
That is to say, every stock moves randomly and independently, and there is no consistent trend among stocks. The Shanghai stock index is then irregular and has little volatility clustering. Therefore, we can conclude that at turning points every stock moves randomly and independently, so the network's scale-free structure is disrupted and the degree of volatility clustering is lower. In contrast, under normal market conditions, stock networks have a scale-free degree distribution: a scale-free-like structure is an indicator of normality.

4. Conclusion

The interplay between structure variation and fluctuations of a stock market is currently an active topic. In this work, employing a moving window to scan through every stock price time series over the period from 2 January 2001 to 7 December 2010, we use mutual information to measure the statistical interdependence between stock prices, and we construct a corresponding network for 501 Shanghai stocks in every given window. We then investigate the relationship between the structure variation of stock networks and the volatility clustering of the index for the Shanghai market. At turning points, the increasing independence of stocks disrupts the scale-freeness of the degree distribution, and the Shanghai stock index has little volatility clustering. In contrast, under normal market conditions, the stock networks have a scale-free degree distribution and the degree of volatility clustering is a little higher. Therefore, the scale-free-like structure of a stock network is an indicator of market normality. These findings help reveal the way in which a stock market varies and may guide people investing in the market.


Fig. 6. The degree of volatility clustering, the parameters of the stock networks, the fitting error, and the p-value. K denotes the average degree and C is the average cluster coefficient.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Grant No. 60874111), the Qing Lan Project of Jiangsu Province, the College Science Foundation of Jiangsu Province (07KJD120128), and the Social Science Institute Program of Jiangsu Province (B0808).


References

[1] G. Bonanno, F. Lillo, R.N. Mantegna, High-frequency cross-correlation in a set of stocks, Quantitative Finance 1 (2001) 96–104.
[2] G. Bonanno, G. Caldarelli, F. Lillo, R.N. Mantegna, Topology of correlation-based minimal spanning trees in real and model markets, Physical Review E 68 (2003) 046103.
[3] D.-M. Song, M. Tumminello, W.-X. Zhou, R.N. Mantegna, Evolution of worldwide stock markets, correlation structure, and correlation based graphs, Physical Review E 84 (2) (2011) 026108.
[4] Yiting Zhang, Gladys Hui Ting Lee, Jian Cheng Wong, Jun Liang Kok, Manamohan Prusty, Siew Ann Cheong, Will the US economy recover in 2010? A minimal spanning tree study, Physica A 390 (2011) 2020–2050.
[5] J.-P. Onnela, A. Chakraborti, K. Kaski, Dynamics of market correlations: Taxonomy and portfolio analysis, Physical Review E 68 (2003) 056110.
[6] C.K. Tse, J. Liu, F.C.M. Lau, Winner-take-all correlation-based complex networks for modeling stock market, in: Proc. Int. Symp. Nonl. Theory and Its Appl., Budapest, Hungary, September 2008.
[7] C.K. Tse, J. Liu, F.C.M. Lau, K. He, Observing stock market fluctuation in networks of stocks, Proceedings of Complex (2) (2009) 2099–2108.
[8] Chi K. Tse, Jing Liu, Francis C.M. Lau, A network perspective of the stock market, Journal of Empirical Finance 17 (2010) 659–667.
[9] T. Qiu, B. Zheng, C. Guang, Financial networks with static and dynamic thresholds, New Journal of Physics 12 (2010) 043057.
[10] H. Kantz, T. Schreiber, Nonlinear Time Series Analysis, second ed., Cambridge University Press, 2004.
[11] Jing Liu, Chi K. Tse, Keqing He, Fierce stock market fluctuation disrupts scalefree distribution, Quantitative Finance (2009) 1–7.
[12] Jie Jun Tseng, Sai Ping Li, Asset returns and volatility clustering in financial time series, Physica A 390 (2011) 1300–1314.
[13] K.W. Church, P. Hanks, Word association norms, mutual information, and lexicography, Computational Linguistics 16 (1) (1990) 22–29.
[14] Andrew M. Fraser, Harry L. Swinney, Independent coordinates for strange attractors from mutual information, Physical Review A 33 (1986) 1134–1140.
[15] J.F. Donges, Y. Zou, N. Marwan, J. Kurths, Complex networks in climate dynamics, European Physical Journal Special Topics 174 (2009) 157–179.
[16] U. Schwarz, A.O. Benz, J. Kurths, A. Witt, Astronomy and Astrophysics 277 (1993) 215–224.
[17] R. Hegger, H. Kantz, T. Schreiber, Practical implementation of nonlinear time series methods: the TISEAN package, Chaos 9 (1999) 413.
[18] A. Papana, D. Kugiumtzis, Evaluation of mutual information estimators on nonlinear dynamic systems, Complex Phenomena in Nonlinear Systems 11 (2) (2008) 225–232.
[19] J.B. Bendat, A.G. Piersol, Measurement and Analysis of Random Data, John Wiley, New York, 1966.
[20] M. Palus, Information theoretic test for nonlinearity in time series, Physics Letters A 175 (1993) 203–209.
[21] M. Palus, Testing for nonlinearity using redundancies: quantitative and qualitative aspects, Physica D 80 (1995) 186–205.
[22] M.E.J. Newman, The structure and function of complex networks, SIAM Review 45 (2003) 167.
[23] A. Clauset, C.R. Shalizi, M.E.J. Newman, Power-law distributions in empirical data, SIAM Review 51 (4) (2009) 661–703.