Chaos, Solitons and Fractals 94 (2017) 44–53
Dynamic asset trees in the US stock market: Structure variation and market phenomena
Wei-Qiang Huang a,∗, Shuang Yao b, Xin-Tian Zhuang a, Ying Yuan a
a School of Business Administration, Northeastern University, Shenyang, Liaoning 110167, P. R. China
b School of Economics and Management, Shenyang University of Chemical Technology, Shenyang, Liaoning 110142, P. R. China
Article info
Article history: Received 29 March 2016; Revised 4 October 2016; Accepted 19 November 2016; Available online 28 November 2016
Keywords: Stock market; Minimal spanning tree; Structure variation; Market phenomena
Abstract
In this work, employing a moving window to scan through every stock price time series over a period from 2 January 1986 to 20 October 2015, we use cross-correlations to measure the interdependence between stock prices, and we construct a corresponding minimal spanning tree for 170 U.S. stocks in every given window. We show how the asset tree evolves over time and describe the dynamics of its normalized length, centrality measures, vertex degree and vertex strength distributions, and single- and multiple-step edge survival ratios. We find that the normalized tree length shows a tendency to decrease over the 30 years. The power-law of vertex degree or vertex strength distribution does not hold for all trees. The survival ratio analysis reveals an increased stability of the dependence structure of the stock market as time elapses. We then examine the relationship between tree structure variation and market phenomena, such as average, volatility and tail risk of stock (market) return. Our main observation is that the normalized tree length has a positive relationship with the level of stock market average return, and it responds negatively to the market return volatility and tail risk. Furthermore, the majority of stocks have their vertex degrees significantly positively correlated to their average return, and significantly negatively correlated to their return volatility and tail risk. © 2016 Elsevier Ltd. All rights reserved.
∗ Corresponding author. E-mail address: [email protected] (W.-Q. Huang).
http://dx.doi.org/10.1016/j.chaos.2016.11.007 0960-0779/© 2016 Elsevier Ltd. All rights reserved.

1. Introduction

A quantitative description of the hierarchical structure is crucial for understanding the dynamics of complex systems [1]. In essence, the stock market is an example of a complex system consisting of many interacting components [2,3]. The correlation matrix of stock return time series, which plays a central role in investment theory and risk management, can be used to extract information about the hierarchical organization of the stock market. By using the correlation between pairs of elements as a similarity measure, several hierarchical clustering procedures have been proposed to select the statistically reliable information in the correlation matrix [1]. The hierarchical trees obtained by applying single linkage cluster analysis (SLCA) and average linkage cluster analysis (ALCA) to the correlation matrix can identify groups of stocks belonging to the same economic sector [4,5,6].

In addition to hierarchical trees, one can also associate correlation based networks with the correlation matrix using a clustering algorithm. In correlation based networks, a subset of links that are highly informative about the hierarchical structure of the system is selected. For example, the minimum spanning tree (MST), which was first introduced by Mantegna (1999), is a correlation based tree associated with the SLCA. Many subsequent studies have constructed MSTs to investigate the economic properties of stock returns [7–15]. Other examples of correlation based networks are the planar maximally filtered graph (PMFG) [16] and the average linkage minimum spanning tree (ALMST) [17]. The PMFG has a richer graph structure than the MST, and has been used to investigate stock return time series in Refs. [6,9,16,18,19]. To evaluate the statistical reliability of nodes in a hierarchical tree and of links in a correlation based network, a bootstrap procedure for the time series has been devised in Refs. [17,20]. To quantify and compare the performance of different filtering procedures, a measure based on the Kullback–Leibler distance has been proposed [21]. It was shown that the Kullback–Leibler distance is well suited to comparing correlation matrices [21]. In addition to modeling the correlation matrix of stock return time series, many studies have constructed similar correlation based networks for industry indices [22], stock market indices [23,24,25], world currencies [26], and government bond market indices [27].

The empirical analysis of correlation based networks can be static or dynamic. Refs. [4,7,8,18] focused on the static network, i.e.
investigating the properties of the network constructed for a long time period, such as network topology, hierarchical structure, or taxonomic studies in terms of economic sector, region, or other characteristics. However, because stocks respond differently to the same external information, such as economic announcements or market news, the correlations among them vary over time. A growing number of studies have constructed dynamic stock correlation networks and investigated structural changes [9,11,12,15,28], topological stability [9,10,14,30], structural differences between crisis and non-crisis periods [11,12,14,29], and the relationship between volatility and network properties [13]. For example, Micciche et al. [10] investigated the time series of the degree of minimum spanning trees obtained by using a correlation based clustering procedure which starts from asset returns. The analysis showed that the degree of stocks has very slow dynamics, with a time scale of several years. Sienkiewicz et al. [12] provided empirical evidence that there is a dynamic structural and topological first order phase transition in the time range dominated by a crash. By investigating the ability of the network to resist structural or topological change, Yan et al. [19] found that the PMFG before the US sub-prime crisis has stronger robustness against intentional topological damage than in the other two non-crisis periods. Kocheturov et al. [29] studied cluster structures of market networks constructed from the correlation matrix of returns of the stocks traded in the USA and Sweden. Their main observation was that in non-crisis periods cluster structures change more chaotically, while during crises they show more stable behavior and fewer changes. These existing studies mainly relate stock correlation network variations to extreme events such as global financial crises.
However, the general relationships between network structure variation and market phenomena can help us understand the interaction between network and stock market dynamics, and can therefore guide the risk management of stock investment. In this paper, we investigate the dynamics of the correlations between pairs of U.S. stocks traded in the U.S. market by studying correlation based networks. We also investigate the general relationship between network structure variation and market dynamics. The study uses stock time series from January 1986 to October 2015, a period spanning nearly 30 years. We begin with the construction of the network from raw stock data in Section 2. In Section 3 we describe the network topology structures and market phenomena. Section 4 presents the empirical study and results. In the last section we present a few conclusions.

2. Network construction

A network is usually defined as a collection of vertices connected by edges. In a stock correlation network, each stock is a network vertex. Each pair of stocks is connected by an edge whose weight equals the Pearson's correlation of the corresponding stock returns in a certain time period. Furthermore, we can characterize the dynamics of the stock correlation network by calculating the cross-correlation between two stocks over moving time periods.

The sample data we collected are the daily returns of N stocks traded in the U.S. stock market. The sample time period length is T days. The network evolution is analyzed by setting a time window of length w days and moving this window along time. One network is obtained from the time series inside each window. The window is displaced by τ days and a new network is obtained after each displacement. This process is repeated until the end of the original time series is reached.
For example, the first network is constructed from the time series starting at day $t_1^1 = 1$ and ending at day $t_2^1 = w$, the second network from the time series starting at day $t_1^2 = 1 + \tau$ and ending at day $t_2^2 = \tau + w$, the third network from the time series starting at day $t_1^3 = 1 + 2\tau$ and ending at day $t_2^3 = 2\tau + w$, and so on. Hence, we obtain a total of $M = 1 + [(T - w)/\tau]$ networks, where $[\cdot]$ denotes the floor function (integer part), so that the last window does not run past the end of the series. Let $R_i^m(t)$ be the log return of stock i at day t in the mth window, where m = 1, 2, ..., M:

$$R_i^m(t) = \ln P_i^m(t) - \ln P_i^m(t-1) \tag{1}$$
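where $P_i^m(t)$ is the closing price of stock i at day t. The correlation between stocks i and j (i = 1, 2, ..., N; j = 1, 2, ..., N) in the mth window can be measured by the Pearson's correlation of the series $R_i^m$ and $R_j^m$:

$$\rho_{i,j}^m = \frac{\langle R_i^m R_j^m \rangle - \langle R_i^m \rangle \langle R_j^m \rangle}{\sqrt{\left(\langle (R_i^m)^2 \rangle - \langle R_i^m \rangle^2\right)\left(\langle (R_j^m)^2 \rangle - \langle R_j^m \rangle^2\right)}} \tag{2}$$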
where $\langle \cdot \rangle$ denotes the expected value, and $\rho_{i,j}^m \in [-1, 1]$. Specifically, if i = j then $\rho_{i,j}^m = 1$. Thus we construct the mth stock correlation network $G^m(V, E^m)$, where V = {1, 2, ..., N} denotes the node set, and the network edge set $E^m$ can be denoted by $\{e_{ij}^m = \rho_{i,j}^m \mid i = 1, 2, \ldots, N,\ j = 1, 2, \ldots, N\}$. So $e_{ij}^m$ reflects the edge weight between nodes i and j in the network, and $G^m(V, E^m)$ is an undirected and weighted network.

The correlation coefficient of a pair of stocks cannot be used as a distance between the two stocks because it does not fulfill the three axioms that define a metric. However, a metric can be defined by using a function of the correlation coefficient as the distance. The correlation coefficient $\rho_{i,j}^m$ is transformed to a distance metric $d_{i,j}^m$ [4]:

$$d_{i,j}^m = \sqrt{2\left(1 - \rho_{i,j}^m\right)} \tag{3}$$
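As an illustration of the construction so far, the following Python sketch computes the window boundaries, the log returns of Eq. (1), the Pearson correlation of Eq. (2) and the distance of Eq. (3). It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the number of windows uses the integer part of (T − w)/τ so that every window lies inside the sample, and the expectation in Eq. (2) is taken as the sample mean over the window.

```python
import math

def window_bounds(T, w, tau):
    """1-based (start_day, end_day) of each rolling window.
    Assumption: integer part of (T - w) / tau, so no window overruns the data."""
    M = 1 + (T - w) // tau
    return [(1 + m * tau, m * tau + w) for m in range(M)]

def log_returns(prices):
    """Eq. (1): R(t) = ln P(t) - ln P(t - 1) for consecutive closing prices."""
    return [math.log(p1) - math.log(p0) for p0, p1 in zip(prices, prices[1:])]

def pearson(x, y):
    """Eq. (2), with <.> taken as the sample mean over the window."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum(a * b for a, b in zip(x, y)) / n - mean_x * mean_y
    var_x = sum(a * a for a in x) / n - mean_x ** 2
    var_y = sum(b * b for b in y) / n - mean_y ** 2
    return cov / math.sqrt(var_x * var_y)

def distance(rho):
    """Eq. (3): d = sqrt(2 (1 - rho)); max() guards against rounding above rho = 1."""
    return math.sqrt(max(0.0, 2.0 * (1.0 - rho)))
```

With T = 7514, w = 200 and τ = 60, `window_bounds` returns 122 windows, the first covering days 1 to 200 and the second days 61 to 260.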
The distance $d_{i,j}^m$ fulfills the three axioms of a metric distance: 1) $d_{i,j}^m = 0$ if and only if i = j; 2) $d_{i,j}^m = d_{j,i}^m$; and 3) $d_{i,j}^m \le d_{i,k}^m + d_{k,j}^m$ [4]. Now the edge weight $e_{ij}^m$ can be measured by $d_{i,j}^m$, and the corresponding edge set $E^m$ can be denoted by $\{e_{ij}^m = d_{i,j}^m \mid i = 1, 2, \ldots, N,\ j = 1, 2, \ldots, N\}$. We have $e_{ij}^m = d_{i,j}^m \in [0, 2]$. The fully connected network $G^m(V, E^m)$ is then used to determine the minimal spanning tree $MST^m$, which is a simply connected graph that links the N vertices with N − 1 edges such that the sum of all edge weights is minimum. The minimal spanning tree provides an easy way to extract the most important correlations and information in the stock market while retaining the simplest structure and making it possible to visualize the relationships across stocks. A general approach to the construction of the $MST^m$ is as follows [9,31].
Step 1: Start with an empty graph. Make an ordered list of the edges in $G^m(V, E^m)$, ranking them by increasing edge weight $d_{i,j}^m$.
Step 2: Take the first element in the list and add the edge to the graph.
Step 3: Take the next element and add the edge if the resulting graph contains no cycle; otherwise discard it.
Step 4: Iterate the process from Step 3 until all pairs have been exhausted.

During the whole T-day period, the network construction procedure is repeated M times, and hence we have M consecutive networks.
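The four steps above correspond to Kruskal's algorithm. A minimal Python sketch follows, assuming the distances are supplied as a dictionary keyed by vertex pairs. The function name, the input layout and the union-find bookkeeping are illustrative choices, not taken from the paper.

```python
def kruskal_mst(n, dist):
    """Build a minimal spanning tree on vertices 0..n-1.

    dist: dict mapping (i, j) with i < j to the distance d_ij (hypothetical layout).
    Returns the list of accepted edges (i, j, d_ij).
    """
    parent = list(range(n))

    def find(x):
        # Follow parent links to the component root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    # Step 1: rank edges by increasing distance.
    for (i, j), d in sorted(dist.items(), key=lambda item: item[1]):
        root_i, root_j = find(i), find(j)
        # Steps 2-3: accept the edge only if it joins two different components,
        # i.e. if it does not close a cycle.
        if root_i != root_j:
            parent[root_i] = root_j
            tree.append((i, j, d))
            if len(tree) == n - 1:  # a spanning tree has exactly n - 1 edges
                break
    return tree
```

Stopping once N − 1 edges have been accepted is equivalent to exhausting the list in Step 4, since every remaining edge would close a cycle.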
3. Network topology structure and market phenomena

3.1. Network topology structure

3.1.1. Normalized tree length

We calculate the normalized tree length for each minimal spanning tree [11].

$$L^m = \frac{1}{N-1} \sum_{d_{i,j}^m \in E^m} d_{i,j}^m \tag{4}$$

where $L^m$ is the normalized tree length for $MST^m$ and N − 1 is the number of edges in the MST. It measures the closeness among the components of the network. The variance of the normalized tree length is

$$v_2^m = \frac{1}{N-1} \sum_{d_{i,j}^m \in E^m} \left(d_{i,j}^m - L^m\right)^2 \tag{5}$$

The skewness is

$$v_3^m = \frac{1}{(N-1)\,(v_2^m)^{3/2}} \sum_{d_{i,j}^m \in E^m} \left(d_{i,j}^m - L^m\right)^3 \tag{6}$$

and the kurtosis is

$$v_4^m = \frac{1}{(N-1)\,(v_2^m)^{2}} \sum_{d_{i,j}^m \in E^m} \left(d_{i,j}^m - L^m\right)^4 \tag{7}$$

3.1.2. Centrality measures

In network theory, the centrality of a node determines the relative importance of that node within a network. There are different quantitative definitions of centrality in the $MST^m$. The definitions are given below.

Vertex degree $k_i^m$ is the number of vertices that are adjacent to vertex i.

Vertex strength $s_i^m$ is the correlation coefficient weighted vertex degree of vertex i.

Betweenness centrality $b_i^m$ measures the importance of vertex i as an intermediate part between other vertices. It is defined as

$$b_i^m = \frac{1}{(N-1)(N-2)} \sum_{j,k} \frac{n_{jk}(i)}{n_{jk}} \tag{8}$$

where $n_{jk}(i)$ is the number of the shortest paths from j to k passing through i, while $n_{jk}$ is the number of the shortest paths from j to k. There is a unique shortest path for any pair of vertices in the MST. Therefore, the value of $n_{jk}(i)/n_{jk}$ is zero (if the path from j to k does not pass through vertex i) or one (if the path passes through vertex i).

Closeness centrality $c_i^m$ is a measure of the average geodesic distance from vertex i to all others. The strongly connected central vertices have a high closeness centrality measure.

$$c_i^m = \frac{1}{\sum_{j=1}^{N} l_{i,j}} \tag{9}$$

where $l_{i,j}$ is the shortest distance from i to j.

3.1.3. Vertex degree and vertex strength distribution

The results of many empirical studies on actual networks demonstrate that the vertex degree distribution obeys a power law [32]. More often the power law applies only for values greater than some minimum $k_{min}$. The cumulative distribution function of a vertex degree which is power law distributed is given by [33]

$$P(k) = \left(\frac{k}{k_{min}}\right)^{-\beta+1} \tag{10}$$

where β is a constant parameter of the distribution known as the exponent or scaling parameter. In cases where the lower bound $k_{min}$ is known, the scaling parameter β can be estimated by the method of maximum likelihood in the limit of large sample size. Given the observed vertex degree data set and the estimated power law distribution, we can use a Kolmogorov–Smirnov (KS) goodness-of-fit test which generates a p-value to quantify the plausibility of the estimated distribution. The power law hypothesis is accepted if the p-value is larger than 0.1 and rejected otherwise. The parameter estimation and distribution test can be implemented by using the toolbox proposed by Ref. [33]. Similarly, we can also investigate the vertex strength distribution.

3.1.4. Survival ratio

The survival ratio is a measure of the robustness of asset tree topology. The single-step survival ratio is defined as [11]

$$\varphi(m) = \frac{1}{N-1} \left| E^m \cap E^{m-1} \right| \tag{11}$$

where $E^m$ refers to the set of edges of $MST^m$, ∩ is the intersection operator and |·| gives the number of elements in the set. The single-step survival ratio is the fraction of edges found common in the two consecutive trees. To test the relative long time stability of the linkages, we calculate the multistep survival ratio.

$$\varphi(m, k) = \frac{1}{N-1} \left| E^m \cap E^{m-1} \cap \cdots \cap E^{m-k+1} \cap E^{m-k} \right| \tag{12}$$

where φ(m, k) is the k-step survival ratio of edges in the MST. Larger values for k and φ(m, k) mean higher robustness of asset tree topology.

3.2. Market phenomena

We consider the market phenomena as the time-varying values of average, volatility and tail risk of individual stock's return in each w-day window. The average return $r_i^m$ of stock i in the mth w-day window is given by

$$r_i^m = \frac{1}{w} \sum_{t} R_i^m(t) \tag{13}$$

The return volatility $\sigma_i^m$ is

$$\sigma_i^m = \sqrt{\frac{\sum_{t} \left(R_i^m(t) - r_i^m\right)^2}{w}} \tag{14}$$
We apply the conditional value-at-risk (CVaR) introduced by Ref. [34] to measure the tail risk of stock return. The CVaR of stock i's return in the mth window with confidence level α is

$$CVaR_\alpha(R_i^m) = E\left[R_i^m \mid R_i^m \le VaR_\alpha(R_i^m)\right] \tag{15}$$

where E[·] denotes the conditional expectation, and VaR (value-at-risk) is a measure of risk commonly used in the finance industry. $VaR_\alpha(R_i^m)$ is implicitly defined as the α quantile, i.e.,

$$\Pr\left(R_i^m \le VaR_\alpha(R_i^m)\right) = \alpha \tag{16}$$
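The three window statistics of Eqs. (13)–(16) can be sketched in Python as follows. This is a hedged illustration, not the authors' code: the function names are hypothetical, the VaR is taken as the empirical order statistic at position ⌈αn⌉ (one of several possible quantile conventions), and the default α = 0.05 is an assumed value, since the confidence level is not fixed in the text above.

```python
import math

def average_return(returns):
    """Eq. (13): mean of the returns in one window."""
    return sum(returns) / len(returns)

def volatility(returns):
    """Eq. (14): square root of the mean squared deviation (divisor w)."""
    mean = average_return(returns)
    return math.sqrt(sum((r - mean) ** 2 for r in returns) / len(returns))

def cvar(returns, alpha=0.05):
    """Eqs. (15)-(16): mean of the returns at or below the empirical alpha quantile.
    Assumption: VaR is the ceil(alpha * n)-th smallest return."""
    ordered = sorted(returns)
    k = max(1, math.ceil(alpha * len(ordered)))
    var = ordered[k - 1]
    tail = [r for r in ordered if r <= var]
    return sum(tail) / len(tail)
```

Because the CVaR averages the worst returns, it is typically negative, and a smaller (more negative) value indicates a higher tail risk.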
To reflect the phenomena of the whole stock market, we also calculate the time-varying values of the average, volatility and tail risk of the market index's return in each window. They are denoted as $r_A^m$, $\sigma_A^m$ and $CVaR_\alpha(R_A^m)$, respectively.

4. Empirical study and results

4.1. Data

The S&P500 is an American stock market index based on the market capitalizations of 500 large companies having common stock listed on the NYSE or NASDAQ. It is considered one of the best representations of the US stock market and a bellwether for the US economy. In our paper, we consider companies which were in the S&P500 index on October 20, 2015. The data used in this study consist of daily closing prices for 170 stocks of the S&P500 index, obtained from Yahoo. The time period of the data extends from 2 January 1986 to 20 October 2015 (approximately 30 years), including a total of 7514 price quotes per stock. We consider these 170 stocks because only their data are complete over the analyzed period. To avoid excessive overlap between consecutive MSTs and to ensure statistical reliability, a window of length w = 200 days is rolled forward at intervals of τ = 60 days. As a result, we get 122 windows, i.e. 122 dynamic MSTs.

4.2. Dynamic evolution of network topology structures

Fig. 1. Mean, variance, skewness and kurtosis of distances as functions of time. Window length w = 200 days and window step length τ = 60 days. Results are plotted according to the start date of the window.
4.2.1. Normalized tree length analysis

Rolling-window graphs of the first four moments of the distances $d_{i,j}^m$ (Eq. (3)) are presented in Fig. 1, where the window length is w = 200 days and the window step length is τ = 60 days. The initial period is followed by a sharply decreasing normalized tree length as the U.S. stock market moves past the 1987 crash. After that, the normalized tree length quickly increases and stays at a relatively stable high level of approximately 1. Meanwhile, the variance of the distances drops and stays at a relatively low level. However, the normalized tree length decreases again, possibly reflecting the early 2000s recession. Recovery is accompanied by an increasing normalized tree length. The normalized tree length decreases once again during the U.S. subprime mortgage crisis and increases after the recovery. According to Eq. (3), the mean correlation is negatively related to the normalized tree length, tending to rise in times of market crisis and decline after recovery. Overall, the normalized tree length shows a tendency to decrease over the 30 years, indicating a "tighter" composition of the MST and rising correlations among stock returns. We also note that the skewness is positive in times of market crisis, indicating that the distance distribution is spread out more to the right. The kurtosis is larger than 3 for the majority of the time. After the 2000s, the skewness and kurtosis fluctuate around 0 and 3 respectively, implying that the distribution of the distances becomes more Gaussian.

4.2.2. Centrality analysis

In general, the larger the centrality measure, the more important the vertex is. Time-varying highest centrality measures are presented in Fig. 2. The highest vertex degree, highest vertex strength, highest betweenness centrality and highest closeness centrality vary from 6 to 32, from 2.7416 to 24.7172, from 0.5764 to 0.9139 and from 7.9745e−004 to 0.0022, respectively.
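To make the quantities tracked in Sections 4.2.1 and 4.2.2 concrete, the sketch below computes the normalized tree length and the four centrality measures for a single tree. It is illustrative only and rests on several assumptions that the text does not settle: the function name is hypothetical, the vertex strength sums the correlations recovered from the distances via ρ = 1 − d²/2 (the inverse of Eq. (3)), the closeness of Eq. (9) uses hop counts as the path length l, and the betweenness sum of Eq. (8) runs over ordered pairs (j, k).

```python
from collections import defaultdict, deque

def tree_measures(n, edges):
    """edges: list of (i, j, d_ij) forming a spanning tree on vertices 0..n-1, n >= 3."""
    adj = defaultdict(list)
    for i, j, d in edges:
        adj[i].append((j, d))
        adj[j].append((i, d))

    length = sum(d for _, _, d in edges) / (n - 1)            # Eq. (4)
    degree = {v: len(adj[v]) for v in range(n)}
    # Assumed strength: sum of correlations on incident edges, rho = 1 - d^2 / 2.
    strength = {v: sum(1 - d * d / 2 for _, d in adj[v]) for v in range(n)}

    def hops_from(src, blocked=None):
        """Breadth-first hop counts from src, never entering the blocked vertex."""
        seen = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v, _ in adj[u]:
                if v not in seen and v != blocked:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        return seen

    closeness, betweenness = {}, {}
    for v in range(n):
        closeness[v] = 1.0 / sum(hops_from(v).values())       # Eq. (9), hop counts
        # Removing v splits the tree into one branch per neighbour; a path j -> k
        # passes through v exactly when j and k lie in different branches.
        sizes = [len(hops_from(u, blocked=v)) for u, _ in adj[v]]
        through = (n - 1) ** 2 - sum(s * s for s in sizes)    # ordered pairs via v
        betweenness[v] = through / ((n - 1) * (n - 2))        # Eq. (8)
    return length, degree, strength, betweenness, closeness
```

For a star-shaped tree the hub attains a betweenness of 1, the maximum allowed by the normalization (N − 1)(N − 2), while every leaf has a betweenness of 0.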
Fig. 2. Time-varying highest centrality measures in the MST.
Table 1. Top ten stocks according to the number of times they appear with the highest centrality measures in the MSTs.

| Rank | Highest vertex degree (Stock, Times) | Highest vertex strength (Stock, Times) | Highest betweenness centrality (Stock, Times) | Highest closeness centrality (Stock, Times) |
|------|------|------|------|------|
| 1  | GE, 22  | GE, 21  | GE, 21  | GE, 20 |
| 2  | PPG, 12 | PPG, 11 | PPG, 10 | PPG, 9 |
| 3  | HON, 8  | HON, 8  | HON, 7  | HON, 8 |
| 4  | PEG, 6  | PEG, 7  | JPM, 7  | JPM, 7 |
| 5  | EMR, 6  | BEN, 6  | EMR, 6  | BK, 7  |
| 6  | DD, 6   | SNA, 5  | BK, 5   | EMR, 6 |
| 7  | BEN, 6  | JPM, 5  | UTX, 4  | UTX, 4 |
| 8  | SNA, 5  | DD, 5   | PH, 4   | DIS, 4 |
| 9  | LNC, 4  | EMR, 4  | MMM, 4  | C, 4   |
| 10 | PH, 3   | SO, 3   | C, 4    | PH, 3  |
Fig. 4. Survival ratio as a function of time under different steps (w = 200 days and τ = 60 days).
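The survival ratios plotted in Fig. 4 follow Eqs. (11) and (12). A minimal Python sketch is given below, assuming each tree is stored as a list of vertex pairs and that the trees are ordered in time. The function names and the data layout are hypothetical.

```python
def edge_set(edges):
    """Undirected edge set: (i, j) and (j, i) denote the same edge."""
    return {frozenset((i, j)) for i, j, *_ in edges}

def survival_ratio(trees, m, k=1):
    """Eq. (12): fraction of the edges of tree m that also appear in each of the
    k preceding trees; k = 1 gives the single-step ratio of Eq. (11).

    trees: time-ordered list of edge lists, each edge a tuple (i, j) or (i, j, d).
    """
    if k < 1 or m - k < 0:
        raise ValueError("need k >= 1 and at least k trees before tree m")
    common = edge_set(trees[m])
    for step in range(1, k + 1):
        common &= edge_set(trees[m - step])
    return len(common) / len(trees[m])  # every tree has N - 1 edges
```

Only the presence of an edge matters here; the edge weights are ignored, so an edge whose distance changes between windows still counts as surviving.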
Fig. 3. Time-varying p-value of (a) vertex degree distribution and (b) vertex strength distribution.
Over time, the stocks attaining these highest centrality measures also change. In total, 40, 40, 41 and 44 different stocks have been the vertices with the highest vertex degree, highest vertex strength, highest betweenness centrality and highest closeness centrality, respectively, throughout the period. Table 1 shows the top ten stocks according to the number of times they appear with the highest centrality measures in the MSTs. The top ten lists differ only slightly across the centrality measures. The top three stocks are GE, PPG and HON for all centrality measures.

4.2.3. Vertex degree and vertex strength distribution analysis

We analyze the vertex degree distribution and vertex strength distribution of the MSTs by using the method proposed by Ref. [33]. We present the p-value of the KS goodness-of-fit test over time in Fig. 3. The p-value fluctuates dramatically over time. The results indicate that not all of the vertex degree and vertex strength distributions of the MSTs obey a power law. There are 98 MSTs (80.33% of 122) that obey a power-law vertex degree distribution (p-value > 0.1). Meanwhile, there are 81 MSTs (66.39% of 122) that obey a power-law vertex strength distribution. The scaling parameters β of the power-law vertex degree distribution vary from 2.2 to 3.5. However, the scaling parameters β of the power-law vertex strength distribution have a wider range, from 2.3623 to 4.9119.

4.2.4. Survival ratio analysis

The survival ratio was calculated in order to examine the stability of the tree topology across consecutive windows. Fig. 4 presents the single-step and multi-step survival ratios for the MST. In all cases, w = 200 days and τ = 60 days. We computed multiple steps of 1, 2, 3, 4, 5 and 6, which represent intervals of 60 days, 120 days, 180 days, 240 days, 300 days and 360 days, respectively. When the step interval is set to 60 days, the average survival ratio reaches 0.4341. This indicates that about 43 percent of the edges between stocks survive from one window to the next. As might be expected, the survival ratio decreases as the number of steps increases. For example, the average survival ratio drops to 0.0545 when the step interval is 360 days. Though the connections disappear quite rapidly, a small proportion of edges remain intact, which may be important for the construction of portfolios. We also find that during the sample period, for each of the 1, 2, 3, 4, 5 and 6 steps, there is a significant increase in the survival ratios, revealing an increased stability of the dependence structure of the U.S. stock market.

4.3. Connecting market phenomena and structure variation

4.3.1. Relationships between market index dynamics and structure variation

The dynamics of the window average, volatility and tail risk of the stock market index's return are indicative of how the whole stock market behaves. Here we take the S&P500 index as the stock market index. To facilitate the analysis, all 122 MSTs are categorized into 10 groups in accordance with the average return of the market index $r_A^m$ during the corresponding time window. Group 1 denotes the lowest average return group and group 10 denotes the highest average return group. Fig. 5 presents the number of MSTs that belong to each group. There are 44 MSTs belonging to group 8 (the average return range is from 0.0003 to 0.0007). The number of MSTs belonging to the high or low average return levels is smaller. Fig. 6 displays the characteristics of these ten groups. The normalized tree length (L) in Fig. 6 shows a clear positive relationship with the level of average return. We find an increasing pattern of L from group 1 (the average L is 0.6733) through group 10 (the average L is 0.9622). This shows that the structure of the MST becomes looser as the market average return rises.
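The grouping step described above can be sketched as follows. The equal-width binning between the smallest and largest observed level is an assumption (it is consistent with the group ranges quoted in the text, but the paper does not spell out the rule), and the function name is hypothetical.

```python
def group_by_level(levels, values, n_groups=10):
    """Split observations into equal-width groups of `levels` and average
    `values` within each group.

    levels: one market statistic per window (e.g. average index return).
    values: one tree statistic per window (e.g. normalized tree length L).
    Returns (counts, means); the mean of an empty group is None.
    """
    lo, hi = min(levels), max(levels)
    width = (hi - lo) / n_groups
    if width == 0:
        width = 1.0  # all levels equal: everything falls into the first group
    groups = [[] for _ in range(n_groups)]
    for level, value in zip(levels, values):
        # The maximum level would land in bin n_groups, so clamp it to the last bin.
        index = min(int((level - lo) / width), n_groups - 1)
        groups[index].append(value)
    counts = [len(g) for g in groups]
    means = [sum(g) / len(g) if g else None for g in groups]
    return counts, means
```

Here `counts` corresponds to the kind of histogram shown in Fig. 5 and `means` to the group averages of L shown in Fig. 6.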
This also indicates that stock prices move in the same direction when the market return is lower. The market return is closely related to the widespread one-factor model. Bonanno et al. [35] compared the topology of the MST obtained from real markets with the one obtained from surrogate data simulated by using a one-factor model. They found that the empirical tree has features of a complex network, such as a hierarchical distribution of the importance of the nodes, which cannot be reproduced by the one-factor model. Our result shows an approximately linear relationship between the normalized tree length and the market return. Therefore, a single index model can well describe the time evolution of the entire system.

Fig. 5. The number of MSTs belonging to each group. The groups are categorized based on the average S&P500 index return in each window.

Fig. 6. Normalized tree length of the ten groups (categorized based on the average S&P500 index return in each window).

To analyze the relationship between structure variation and the level of market volatility, all the MSTs are categorized into ten groups in accordance with the return volatility of the market index $\sigma_A^m$ during the corresponding time window. Group 1 denotes the smallest volatility group and group 10 denotes the largest volatility group. Fig. 7 presents the number of MSTs that belong to each group. The MSTs belonging to low volatility groups outnumber those belonging to high volatility groups. The number of MSTs belonging to group 2 (with volatility ranging from 0.0071 to 0.0096) is 34, which is the largest among all the groups. Groups 8 and 10 (with volatility ranging from 0.0246 to 0.0271, and from 0.0296 to 0.0321, respectively) each contain only one MST, the smallest number. Fig. 8 displays the characteristics of these ten groups. The normalized tree length in Fig. 8 shows a clear negative relationship with the level of volatility. We find a decreasing pattern of L from group 1, the smallest volatility group (the average L is 1.0394), through group 10, the largest volatility group (the average L is 0.6679). This shows that the structure of the MST becomes denser as the market volatility increases. It also indicates that more stocks move in the same direction when the market is more volatile.

Fig. 7. The number of MSTs belonging to each group. The groups are categorized based on the return volatility of the S&P500 index in each window.

Fig. 8. Normalized tree length of the ten groups (categorized based on the return volatility of the S&P500 index in each window).

To analyze the relationship between structure variation and the level of tail risk, all the MSTs are categorized into ten groups in accordance with the tail risk of the market index return $CVaR_\alpha(R_A^m)$ during the corresponding time window. The smaller $CVaR_\alpha(R_A^m)$ is, the higher the tail risk. Group 1 denotes the highest tail risk group and group 10 denotes the lowest tail risk group. Fig. 9 presents the number of MSTs that belong to each group. The MSTs belonging to low tail risk groups outnumber those belonging to high tail risk groups. The number of MSTs belonging to group 9 (with tail risk ranging from −0.0144 to −0.0106) is 43, which is the largest among all the groups. Only one MST belongs to group 2 (with tail risk ranging from −0.0486 to −0.0448), which is the smallest number. Fig. 10 displays the characteristics of these ten groups. The normalized tree length in Fig. 10 shows a clear negative relationship with the level of tail risk. We find an increasing pattern of L from group 1, the highest tail risk group (the average L is 0.6671), through group 10, the lowest tail risk group (the average L is 1.1104). This shows that the structure of the MST becomes denser as the tail risk increases. It also indicates that stocks move in the same direction when the market experiences more severe extreme risk.

Fig. 9. The number of MSTs belonging to each group. The groups are categorized based on the tail risk of the S&P500 index in each window.

Fig. 10. Normalized tree length of the ten groups (categorized based on the tail risk of the S&P500 index in each window).

Overall, the normalized tree length responds clearly to the average, volatility and tail risk of the market return. It should be pointed out that the results are similar when the number of groups in the above analyses is changed.

4.3.2. Relationships between individual stock dynamics and centrality measures

It is of interest to know the relationships between individual stock dynamics, such as the average, volatility and tail risk of stock returns, and the stocks' centrality measures in the MST. First of all, we investigate the correlations among different vertex centrality measures. We calculate the Pearson's correlation coefficients between each pair of centrality measures for each vertex. It is worth noting that we are performing multiple hypothesis testing in the statistical inference of coefficient significance. A multiple hypothesis testing correction is needed.
Bonferroni correction and False Discovery Rate (FDR) are common correction procedures, which have been used, for example, to statistically validate links of a projected network in bipartite complex systems [36]. The classical Bonferroni correction is conservative, and has been used mainly in situations where no other multiple test procedure is available [37]. We will use the FDR correction procedure [38], which provides less stringent control of Type I errors compared to family-wise error rate (FWER) controlling procedures (such as the Bonferroni correction). For example, in the analysis of the correlation between vertex degree k and vertex strength s, the calculation of the sample correlation coefficient $\hat{\rho}(k, s)$ and its corresponding p-value in the statistical test will be repeated N times. The FDR correction is defined as follows [38]. Consider testing $H_1, H_2, \ldots, H_N$ based on the corresponding p-values $p_1, p_2, \ldots, p_N$, where $H_i$ is the null hypothesis that the true correlation coefficient ρ(k, s) is equal to 0. Let $p_{(1)} \le p_{(2)} \le \cdots \le p_{(N)}$ be the ordered p-values, and denote by $H_{(i)}$ the null hypothesis corresponding to $p_{(i)}$. Let K be the largest i for which $p_{(i)} \le i \times 0.05/N$; then reject all $H_{(i)}$ (i = 1, 2, ..., K), where 0.05 is the multiple level of significance.

Table 2 summarizes the Pearson's correlation coefficients among different centrality measures. The value of the correlation coefficient and the strength of correlation are categorized according to Ref. [39]. It is found that 168 vertices (98.82% of 170) have their degrees significantly (strong or very strong) positively correlated to their strength, 144 vertices (84.71% of 170) have their degrees significantly (strong or very strong) positively correlated to their betweenness centrality, and 120 vertices (70.59% of 170) have their strength significantly (strong or very strong) positively correlated to their betweenness centrality. We can conclude that three centrality measures, vertex degree, vertex strength and betweenness centrality, are highly correlated with one another.
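The FDR step-up rule described above (largest K with $p_{(K)} \le K \times 0.05/N$, then reject the K smallest p-values) can be sketched in Python as follows. The function name is hypothetical; the logic follows the rule as stated in the text.

```python
def fdr_reject(p_values, q=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected.

    p_values: one p-value per hypothesis, in any order.
    q: the multiple level of significance (0.05 in the text).
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices by increasing p
    largest_k = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank * q / n:
            largest_k = rank  # keep the largest rank that satisfies the bound
    rejected = [False] * n
    for index in order[:largest_k]:
        rejected[index] = True
    return rejected
```

Note the step-up behaviour: a hypothesis whose own p-value exceeds its threshold is still rejected if a hypothesis with a larger p-value satisfies the bound at its rank.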
The results also show that only 7 vertices (4.12% of 170) have their closeness centrality significantly (strong) positively correlated to their degrees, only 8 vertices (4.71% of 170) have their closeness centrality significantly (strong or very strong) positively correlated to their strength, and only 4 vertices (2.35% of 170) have their closeness centrality significantly (strong) positively correlated to their betweenness centrality. The closeness centrality measure is thus not highly correlated to the other centrality measures. To avoid redundancy, we only investigate the relationships between individual stock dynamics and two centrality measures (vertex degree and closeness centrality). It is worth noting that, in the following analyses, a multiple hypothesis test correction procedure (FDR) is also performed to properly evaluate the statistical significance of the tests.

We calculate the Pearson's correlation coefficient of vertex degree to each of the individual stock's average, volatility and tail risk of return. Fig. 11 shows the Pearson's correlation coefficients between vertex degree $k_i^m$ and each of average return $r_i^m$, volatility $\sigma_i^m$ and tail risk $\mathrm{CVaR}_\alpha(R_i^m)$ for stock i. Table 3 summarizes the results. It is found that 119 stocks (70% of 170) have their vertex degrees significantly positively correlated to their average return, 126 stocks (74.12% of 170) have their vertex
Table 2
Summarization of Pearson's correlation coefficients among different centrality measures.

| Correlation coefficient (Strength of correlation) | ρ(k, s) number (ratio) | ρ(k, b) number (ratio) | ρ(k, c) number (ratio) | ρ(s, b) number (ratio) | ρ(s, c) number (ratio) | ρ(b, c) number (ratio) |
|---|---|---|---|---|---|---|
| 0.80 ∼ 1.00 (Very strong positive) | 157 (92.35%) | 56 (32.94%) | 0 | 36 (21.18%) | 1 (0.59%) | 0 |
| 0.60 ∼ 0.79 (Strong positive) | 11 (6.47%) | 88 (51.76%) | 7 (4.12%) | 84 (49.41%) | 7 (4.12%) | 4 (2.35%) |
| 0.40 ∼ 0.59 (Moderate positive) | 2 (1.18%) | 23 (13.53%) | 25 (14.71%) | 39 (22.94%) | 35 (20.59%) | 24 (14.12%) |
| 0.20 ∼ 0.39 (Weak positive) | 0 | 3 (1.76%) | 46 (27.06%) | 11 (6.47%) | 64 (37.65%) | 51 (30%) |
| 0.00 ∼ 0.19 (Very weak positive) | 0 | 0 | 0 | 0 | 4 (2.35%) | 0 |
| Significant positive | 170 (100%) | 170 (100%) | 78 (45.88%) | 170 (100%) | 111 (65.29%) | 79 (46.47%) |
Fig. 11. Pearson's correlation coefficients between vertex degree k and (a) average return r, (b) volatility σ, (c) tail risk CVaRα(R) for each stock.

Table 3
Summarization of Pearson's correlation coefficients between vertex degree and individual stock dynamics.

| Value of the correlation coefficient | Strength of correlation | ρ(k, r) number (ratio) | ρ(k, σ) number (ratio) | ρ(k, CVaR) number (ratio) |
|---|---|---|---|---|
| 0.40 ∼ 0.59 | Moderate positive | 1 (0.59%) | 1 (0.59%) | 0 |
| 0.20 ∼ 0.39 | Weak positive | 39 (22.94%) | 3 (1.76%) | 38 (22.35%) |
| 0.00 ∼ 0.19 | Very weak positive | 79 (46.47%) | 7 (4.12%) | 68 (40%) |
| −0.19 ∼ 0.00 | Very weak negative | 10 (5.88%) | 83 (48.82%) | 11 (6.47%) |
| −0.39 ∼ −0.20 | Weak negative | 0 | 43 (25.29%) | 5 (2.94%) |
| −0.59 ∼ −0.40 | Moderate negative | 0 | 0 | 2 (1.18%) |
|  | Significant positive | 119 (70%) | 11 (6.47%) | 106 (62.35%) |
|  | Significant negative | 10 (5.88%) | 126 (74.12%) | 18 (10.59%) |
Fig. 12. Pearson's correlation coefficients between vertex closeness centrality c and (a) average return r, (b) volatility σ, (c) tail risk CVaRα(R) for each stock.
degrees significantly negatively correlated to their return volatility, and 106 stocks (62.35% of 170) have their vertex degrees significantly negatively correlated to their return tail risk. Note that a higher value of CVaR corresponds to a lower tail risk, so a positive Pearson correlation coefficient ρ(k, CVaR) indicates a negative correlation between vertex degree and tail risk. Furthermore, most of the significant correlations are weak or very weak, and only a few are moderate.
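The sign convention noted above can be illustrated with a small numerical sketch. It assumes that CVaR is estimated as the average of the worst α fraction of returns (the left tail), which is one common empirical estimator and may differ in detail from the one used in this paper. The function name `cvar` and both return samples are hypothetical.

```python
import math


def cvar(returns, alpha=0.05):
    """Average of the worst ceil(alpha * n) returns (assumed left-tail estimator)."""
    n_tail = max(1, math.ceil(alpha * len(returns)))
    worst = sorted(returns)[:n_tail]  # the smallest (most negative) returns
    return sum(worst) / n_tail


# Two hypothetical return samples that differ only in their two worst outcomes.
calm = [-0.02, -0.01, 0.0, 0.01, 0.02, 0.01, -0.01, 0.0, 0.02, -0.02]
risky = [-0.10, -0.01, 0.0, 0.01, 0.02, 0.01, -0.01, 0.0, 0.02, -0.08]

# The sample with the heavier left tail has the lower (more negative) CVaR.
print(cvar(calm, alpha=0.2), cvar(risky, alpha=0.2))
print(cvar(risky, alpha=0.2) < cvar(calm, alpha=0.2))
```

Under this convention a more negative CVaR means a higher tail risk. A stock whose vertex degree rises together with its CVaR therefore has a degree that falls as its tail risk grows.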
Fig. 12 shows the Pearson's correlation coefficients between vertex closeness centrality $c_i^m$ and each of average return $r_i^m$, volatility $\sigma_i^m$ and tail risk $\mathrm{CVaR}_\alpha(R_i^m)$ for stock i. Table 4 summarizes the results. It is found that 83 stocks (48.82% of 170) have their vertex closeness centrality significantly positively correlated to their average return, 101 stocks (59.41% of 170) have their vertex closeness centrality significantly negatively correlated to their return volatility, and 84 stocks (49.41% of 170) have their vertex closeness centrality significantly negatively correlated to their return tail risk.
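The per-stock calculation behind Tables 3 and 4 can be sketched as follows: a Pearson coefficient is computed between a centrality series and a return statistic over the moving windows, and then mapped to the strength categories of Ref. [39]. This is a minimal sketch, not the authors' code. The function names, the cut points at 0.20, 0.40, 0.60 and 0.80, the assignment of a coefficient of exactly 0 to the positive side, and the five-window example series are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def strength_label(rho):
    """Map a coefficient to a strength category (assumed cut points, see lead-in)."""
    size = abs(rho)
    if size >= 0.80:
        name = "Very strong"
    elif size >= 0.60:
        name = "Strong"
    elif size >= 0.40:
        name = "Moderate"
    elif size >= 0.20:
        name = "Weak"
    else:
        name = "Very weak"
    sign = "positive" if rho >= 0 else "negative"
    return f"{name} {sign}"


# Hypothetical series for one stock over five windows (not data from the paper).
degree_by_window = [1, 2, 3, 4, 5]
return_by_window = [1, 2, 3, 5, 4]

rho = pearson(degree_by_window, return_by_window)
print(round(rho, 2), strength_label(rho))
```

In the paper's setting each coefficient would also carry a p-value, and a stock would be counted as significant only if its null hypothesis is rejected after the FDR correction. That step is omitted here.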
Table 4
Summarization of Pearson's correlation coefficients between vertex closeness centrality and individual stock dynamics.

| Value of the correlation coefficient | Strength of correlation | ρ(c, r) number (ratio) | ρ(c, σ) number (ratio) | ρ(c, CVaR) number (ratio) |
|---|---|---|---|---|
| 0.40 ∼ 0.59 | Moderate positive | 0 | 0 | 1 (0.59%) |
| 0.20 ∼ 0.39 | Weak positive | 24 (14.12%) | 1 (0.59%) | 34 (20%) |
| 0.00 ∼ 0.19 | Very weak positive | 59 (34.71%) | 21 (12.35%) | 49 (28.82%) |
| −0.19 ∼ 0.00 | Very weak negative | 14 (8.24%) | 56 (32.94%) | 25 (14.71%) |
| −0.39 ∼ −0.20 | Weak negative | 2 (1.18%) | 44 (25.88%) | 3 (1.76%) |
| −0.59 ∼ −0.40 | Moderate negative | 0 | 1 (0.59%) | 0 |
|  | Significant positive | 83 (48.82%) | 22 (12.94%) | 84 (49.41%) |
|  | Significant negative | 16 (9.41%) | 101 (59.41%) | 28 (16.47%) |
In contrast to the results shown in Table 3, we cannot find a majority of stocks having their closeness centrality significantly positively or negatively correlated to their individual dynamics. From the above results, we can conclude that the relationships between stock dynamics and their centrality measures vary across different stocks. No definite conclusion can be drawn. However, because of these differences, we can use the correlation coefficients to classify stocks. From the perspective of vertex centrality measures in asset trees, such a classification may have potential applications in portfolio investment or risk management activities in which return and (extreme) risk are the main concerns.

5. Conclusion

In this work, employing a moving window to scan through every stock price time series over a period from 2 January 1986 to 20 October 2015, we construct a corresponding minimal spanning tree for 170 U.S. stocks in every given window. We have studied the dynamic evolution of network topology structures, such as normalized tree length, centrality measures, vertex degree and vertex strength distributions, and survival ratio. We have shown that the normalized tree length tends to decrease over the 30 years, indicating rising correlations among stock returns. Over the most recent 15 years, the distribution of the distances among stocks has become more Gaussian. Over time, the stocks having the highest centrality measures also change, but the top ten stocks sorted by the number of times they appear with the highest centrality measures are relatively stable. We also find that not all of the vertex degree distributions and vertex strength distributions of the MSTs obey a power-law, and the scaling parameters of the power-law vertex strength distributions have a wider range than those of the power-law vertex degree distributions.
The survival ratio decreases as the number of steps increases, and it increases over time for each step, showing an increased stability of the dependence structure of the stock market. We have then examined the relationship between network dynamics and stock (market) dynamics. The results suggest that the normalized tree length has a positive relationship with the level of stock market average return. The normalized tree length responds negatively to the market return volatility and tail risk, so the trend of the normalized tree length is the reverse of that of the market return volatility and tail risk series. With a multiple hypothesis testing correction procedure, the correlation analyses among different centrality measures show that vertex degree, vertex strength and betweenness centrality are highly correlated with one another. The closeness centrality measure is not highly correlated to the other measures. By calculating the Pearson correlation coefficients between individual stock dynamics and centrality measures (vertex degree and closeness centrality), we find that the relationships vary across different stocks, and the majority of stocks have their vertex degrees significantly positively correlated to their average return, and significantly negatively correlated to their return volatility and tail risk. However, most of the significant correlations are weak or very weak. Such differences in the relationships can help to classify stocks, and their further application to investment needs to be investigated in the future.

Acknowledgments

This research was supported by the National Natural Science Foundation of China, project No. 71371044.

References

[1] Tumminello M, Lillo F, Mantegna RN. Correlation, hierarchies, and networks in financial markets. J Econ Behav Organ 2010;75:40–58.
[2] Namaki A, Jafari GR, Raei R. Comparing the structure of an emerging market with a mature one under global perturbation. Physica A 2011;390(17):3020–5.
[3] Mantegna RN, Stanley HE. An introduction to econophysics. Cambridge: Cambridge University Press; 2000.
[4] Mantegna RN. Hierarchical structure in financial markets. Eur Phys J B 1999;11(1):193–7.
[5] Bonanno G, Lillo F, Mantegna RN. High-frequency cross-correlation in a set of stocks. Quant Financ 2001;1:96–104.
[6] Coronnello C, Tumminello M, Lillo F, et al. Sector identification in a set of stock return time series traded at the London stock exchange. Acta Phys Pol B 2005;36:2653–79.
[7] Brida JG, Risso WA. Multidimensional minimal spanning tree: the Dow Jones case. Physica A 2008;387(21):5205–10.
[8] Garas A, Argyrakis P. Correlation study of the Athens stock exchange. Physica A 2007;380(7):399–410.
[9] Aste T, Shaw W, Matteo TD. Correlation structure and dynamics in volatile market. New J Phys 2010;12(8):085009.
[10] Micciche S, Bonanno G, Lillo F, et al. Degree stability of a minimum spanning tree of price return and volatility. Physica A 2003;324(1-2):66–73.
[11] Onnela JP, Chakraborti A, Kaski K, et al. Dynamic asset trees and black monday. Physica A 2003;324(1-2):247–52.
[12] Sienkiewicz A, Gubiec T, Kutner R, et al. Dynamic structural and topological phase transitions on the Warsaw stock exchange: a phenomenological approach. Acta Phys Pol A 2013;123(3):615–20.
[13] Lee J, Youn J, Chang W. Intraday volatility and network topological properties in the Korean stock market. Physica A 2012;391(4):1354–60.
[14] Wang GJ, Xie C, Han F, et al. Similarity measure and topology evolution of foreign exchange markets using dynamic time warping method: Evidence from minimal spanning tree. Physica A 2012;319(16):4136–46.
[15] Teh BK, Goo YW, Lian TW, et al. The Chinese correction of February 2007: how financial hierarchies change in a market crash. Physica A 2015;424(4):225–41.
[16] Tumminello M, Aste T, Di Matteo T, et al. A tool for filtering information in complex systems. In: Proceedings of the national academy of sciences of the United States of America, 102; 2005. p. 10421–6.
[17] Tumminello M, Coronnello C, Lillo F, et al. Spanning trees and bootstrap reliability estimation in correlation-based networks. Int J Bifurcat Chaos 2007;17:2319–29.
[18] Tumminello M, Matteo TD, Aste T, et al. Correlation based networks of equity returns sampled at different time horizons. Eur Phys J B 2007;55:209–17.
[19] Yan XG, Xie C, Wang GJ. Stock market network's topological stability: evidence from planar maximally filtered graph and minimal spanning tree. Int J Mod Phys B 2015;29(22):1550160.
[20] Tumminello M, Lillo F, Mantegna RN. Hierarchically nested factor model from multivariate data. Europhys Lett 2007;78:30006.
[21] Tumminello M, Lillo F, Mantegna RN. Kullback–Leibler distance as a measure of the information filtered from multivariate data. Phys Rev E 2007;76:031123.
[22] Buccheri G, Marmi S, Mantegna RN. Evolution of correlation structure of industrial indices of U.S. equity market. Phys Rev E 2013;88:012806.
[23] Song DM, Tumminello M, Zhou WX, et al. Evolution of worldwide stock markets, correlation structure, and correlation-based graphs. Phys Rev E 2011;84:026108.
[24] Coelho R, Gilmore CG, Lucey B, et al. The evolution of interdependence in world equity markets-evidence from minimum spanning trees. Physica A 2007;376(3):455–66.
[25] Liu XF, Tse CK. A complex network perspective of world stock markets: synchronization and volatility. Int J Bifurcat Chaos 2012;22:1250142.
[26] Fiedor P, Holda A. Time evolution of non-linear currency networks. Int J Mod Phys C 2014;10:1–16.
[27] Gilmore CG, Lucey BM, Boscia MW. Comovements in government bond markets: a minimum spanning tree analysis. Physica A 2010;389(21):4875–86.
[28] Onnela JP, Kaski K, Kertesz J. Clustering and information in correlation based financial networks. Eur Phys J B 2004;38(2):353–62.
[29] Kocheturov A, Batsyn M, Pardalos PM. Dynamics of cluster structures in a financial market network. Physica A 2014;413(11):523–33.
[30] Thomas KDP, Luciano FC, Francisco AR. The structure and resilience of financial market networks. Chaos 2012;22:013117.
[31] Prim RC. Shortest connection networks and some generalizations. Bell Syst Tech J 1957;36(16):1389–401.
[32] Newman MEJ. The structure and function of complex networks. SIAM Rev 2003;45(2):167–256.
[33] Clauset A, Shalizi CR, Newman MEJ. Power-law distributions in empirical data. SIAM Rev 2009;51(4):661–703.
[34] Rockafellar RT, Uryasev S. Optimization of conditional value-at-risk. J Risk 2000;2:21–42.
[35] Bonanno G, Caldarelli G, Lillo F, et al. Topology of correlation-based minimal spanning trees in real and model markets. Phys Rev E 2003;68:046130.
[36] Tumminello M, Micciche S, Lillo F, et al. Statistically validated networks in bipartite complex systems. Plos One 2011;6(3):1–11.
[37] Holm S. A simple sequentially rejective multiple test procedure. Scand J Statist 1979;6:65–70.
[38] Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Statist Soc B 1995;57(1):289–300.
[39] Evans JD. Straightforward statistics for the behavioral sciences. Brooks/Cole Publishing Company; 1996.