Phase synchronization based minimum spanning trees for analysis of financial time series with nonlinear correlations


Physica A xx (xxxx) xxx–xxx

Contents lists available at ScienceDirect

Physica A journal homepage: www.elsevier.com/locate/physa


Srinivasan Radhakrishnan c, Arjun Duvvuru b, Sivarit Sultornsanee a, Sagar Kamarthi c,∗

a School of Business, University of the Thai Chamber of Commerce, Bangkok, Thailand
b Quality Engineer at JDA Software Pvt Ltd, India
c Department of Mechanical and Industrial Engineering, Northeastern University, Boston, USA

Highlights

• We analysed financial time series using a phase synchronization (PS) measure.
• We built a network where time series are nodes and the PS measure is the link strength.
• We analysed characteristics of the minimum spanning tree of the time series network.
• We compared the performance of the PS measure with that of the cross correlation measure.
• We found the PS measure is more suitable for time series with time-lagged correlations.

Article info

Article history:
Received 7 March 2015
Received in revised form 31 August 2015
Available online xxxx

Keywords: Phase synchronization; Time series; Financial systems; Stock correlation network; Exchange rate networks; Complex networks



Abstract

The cross correlation coefficient has been widely applied in financial time series analysis, specifically for understanding chaotic behaviour in terms of stock price and index movements during crisis periods. To better understand time series correlation dynamics, the cross correlation matrices are represented as networks, in which a node stands for an individual time series and a link indicates cross correlation between a pair of nodes. These networks are converted into simpler trees using different schemes. In this context, Minimum Spanning Trees (MST) are the most favoured tree structures because of their ability to preserve all the nodes and thereby retain essential information imbued in the network. Although cross correlations underlying MSTs capture essential information, they do not faithfully capture dynamic behaviour embedded in the time series data of financial systems because cross correlation is a reliable measure only if the relationship between the time series is linear. To address the issue, this work investigates a new measure called phase synchronization (PS) for establishing correlations among different time series which relate to one another, linearly or nonlinearly. In this approach the strength of a link between a pair of time series (nodes) is determined by the level of phase synchronization between them. We compare the performance of the phase synchronization based MST with that of the cross correlation based MST along selected network measures across a time frame that includes economically good and crisis periods. We observe agreement in the directionality of the results across these two methods. They show similar trends, upward or downward, when comparing selected network measures. Though both methods give similar trends, the phase synchronization based MST is a more reliable representation of the dynamic behaviour of financial systems than the cross correlation based MST because of the former's ability to quantify nonlinear relationships among time series or relations among phase shifted time series.

© 2015 Elsevier B.V. All rights reserved.

∗ Corresponding author. Tel.: +1 617 373 3070. E-mail address: [email protected] (S. Kamarthi).

http://dx.doi.org/10.1016/j.physa.2015.09.070
0378-4371/© 2015 Elsevier B.V. All rights reserved.

S. Radhakrishnan et al. / Physica A xx (xxxx) xxx–xxx


1. Cross correlation based networks

Segmentation and segment clustering techniques give a macro view of a dynamical process embedded in time series pertaining to economics and finance [1]. However, applying these techniques to high frequency time series excludes a significant amount of important information. To retain richer information, a cross correlation technique is preferred, in which the cross correlation value between two time series i and j (without lag) is computed using Eq. (1):

$$
C_{i,j} = \frac{\sum_{t=1}^{T} (x_{it} - \bar{x}_i)(x_{jt} - \bar{x}_j)}{\sqrt{\sum_{t=1}^{T} (x_{it} - \bar{x}_i)^2 \, \sum_{t=1}^{T} (x_{jt} - \bar{x}_j)^2}} \tag{1}
$$

Here $x_{it}$ is the value of time series i at time t, $\bar{x}_i$ is its mean, and T is the number of observations.
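As a minimal sketch of Eq. (1), the zero-lag coefficient can be computed directly from its definition. The function names and the toy series below are illustrative assumptions, not part of the original study.

```python
import math

def cross_correlation(x, y):
    """Zero-lag cross correlation coefficient of Eq. (1) for two equal-length series."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of products of the mean-centred values.
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: square root of the product of the two sums of squares.
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x) *
                    sum((b - mean_y) ** 2 for b in y))
    return num / den

def correlation_matrix(series):
    """N x N matrix of pairwise coefficients C[i][j]."""
    n = len(series)
    return [[cross_correlation(series[i], series[j]) for j in range(n)]
            for i in range(n)]

# Toy series (made-up values): the second rises with the first, the third falls.
toy_series = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [5.0, 4.0, 3.0, 2.0, 1.0],
]
C = correlation_matrix(toy_series)
```

Because the coefficient only measures linear co-movement at zero lag, a constant series would make the denominator zero; the sketch does not guard against that case.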


The cross correlation coefficient has been widely applied to analysing stock price and index movement time series, specifically for understanding chaotic behaviour during crisis periods [2–8]. Several studies have focused on understanding non-trivial cross correlation using random matrix theory, in which the cross correlation is computed over either the entire data set or certain pre-set time windows [1,9–16]. The information dispersed across independent cross correlation matrices is difficult for humans to understand and interpret. In an attempt to better understand correlation dynamics, the cross correlation matrices are given graphical representations. Each individual time series is denoted as a node, and nodes are connected to each other in accordance with pre-formed rules. These rules determine the outcome of the representation, which could be either a tree or a graph. In the case of a tree, all the nodes are retained with acyclic connections (cyclic connections are inhibited by the connection criteria). In contrast, graphs, by virtue of their connection criteria, allow the formation of isolated nodes and loops. The resulting graphical representation in the case of a graph may differ significantly from that of a tree. In this work we limit our focus to trees only.


1.1. Graphical representation: tree approach


The minimum spanning trees (MST) are weighted graphical representations, the construction of which can involve one of the two widely used algorithms, namely, Kruskal’s algorithm [17] and Prim’s algorithm [18]. MSTs found few applications in the field of economics until Mantegna [19] indicated robust patterns in the underlying correlations [20,21]. Since then MSTs have been widely used for statistical analysis of financial market data [22–28].


1.2. Cross correlation based MST


The following steps are executed in order to create an MST from the cross correlations computed between pairs of time series.

1. Individual time series are considered as nodes in the network.
2. The cross correlation, denoted by $C_{i,j}$, is computed for each pair of time series, i and j, using Eq. (1).
3. The correlation coefficients, which form an N × N correlation matrix with $-1 \le C_{i,j} \le 1$, are transformed into an N × N distance matrix with elements $d_{ij} = \sqrt{2(1 - C_{i,j})}$, such that $0 \le d_{ij} \le 2$. The symmetric property of the distance formula ensures that $d_{ij} = d_{ji}$. The triangular property reveals the relationship between the distance value and the correlation coefficient (smaller distance values indicate higher correlation values).
4. The distance matrix is essentially an adjacency matrix representing the correlation network. The distance matrix is then used to determine the minimum spanning tree (MST), which is simply a connected graph that connects all the N nodes of the network with (N − 1) edges such that the sum of all distances is minimum.
5. Kruskal's algorithm (see footnote 1) is applied to the distance matrix to form the desired MST [17].

In this work we investigate a new approach that has not been explored earlier for establishing reliable correlations among financial time series (e.g. stocks). The advantage of this new method is that it can characterize the synchronized variations among different time series better than measures, such as cross correlation, that capture only linear relationships.
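Steps 3 to 5 of the procedure in Section 1.2 can be sketched as follows. This is a minimal illustration assuming a small hand-written correlation matrix; the union-find implementation of Kruskal's algorithm and the names `distance_matrix` and `kruskal_mst` are choices made here, not details taken from the paper.

```python
import math

def distance_matrix(corr):
    """Map correlations to distances, d_ij = sqrt(2 * (1 - C_ij)).

    max(0.0, ...) only guards against tiny negative values from round-off.
    """
    return [[math.sqrt(max(0.0, 2.0 * (1.0 - c))) for c in row] for row in corr]

def kruskal_mst(dist):
    """Return the MST as a list of (i, j, d_ij) edges using Kruskal's algorithm."""
    n = len(dist)
    parent = list(range(n))  # union-find forest, one set per node initially

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    # Consider candidate edges in order of increasing distance.
    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    tree = []
    for d, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:          # adding the edge does not create a cycle
            parent[root_i] = root_j
            tree.append((i, j, d))
            if len(tree) == n - 1:    # a spanning tree has N - 1 edges
                break
    return tree

# Hypothetical 4 x 4 correlation matrix (made-up values).
corr = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.0],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.0, 0.8, 1.0],
]
mst = kruskal_mst(distance_matrix(corr))
```

The same two functions apply unchanged to any symmetric similarity matrix with entries in [−1, 1], since only the matrix fed to `distance_matrix` changes.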

1 Kruskal's algorithm and Prim's algorithm are two widely adopted methods to form an MST. In terms of space, Kruskal's algorithm is relatively superior to Prim's algorithm when the number of nodes is less than 100; however, when the number of nodes is greater than 100, Prim's algorithm is superior in terms of time complexity [29]. In this work we adopt Kruskal's algorithm for converting the distance matrix into an MST [17].


2. Phase synchronization


Phase synchronization is a state in which the dynamics or oscillations of two or more components of a system are locked to some degree. When two components are phase synchronized, the phases and frequencies of their signals are locked together. The choreographed behaviour of Southeast Asian fireflies is a good example of phase synchronization (PS). Strogatz [30] was among the first to discover the elegant mathematical structure in this synchronicity. At dusk the fireflies begin to flash periodically with random phases and a Gaussian distribution of native frequencies. As night falls, the flies, sensitive to one another's behaviour, begin to synchronize their flashing. After some time, all the fireflies within a given tree begin to flash simultaneously in a burst. Thinking of the fireflies as biological oscillators, we can define the phase to be 0° during the flash and ±180° exactly halfway until the next flash. Thus, when they begin to flash in unison, they synchronize in phase.

One can determine if two systems are phase synchronized by analysing recurrences of their signals. Recurrence is a key property of dynamic systems and it happens when a system returns to its previous states over a certain time length [31]. A state recurs at time $t_j$ if the trajectory of a complex system returns to the neighbourhood of a previous state $x_i = x(t_i)$ for $t_i < t_j$ (see Fig. 1(a)). Here $x_i \in \mathbb{R}^m$ and $i = 1, 2, \ldots, n$, where n is the number of measured points and m is the dimension of the phase space. The phase space trajectory can be reconstructed from a time series by the method of time delay embedding [32].

Recurrence plots (Fig. 1(b)) are used to visualize recurrence in phase space. A recurrence plot represents all recurrences in the form of a binary matrix R, where $R_{i,j} = 1$ (black marks in Fig. 1(b)) if the state $x_j$ is in the neighbourhood of $x_i$ in phase space, and $R_{i,j} = 0$ otherwise (white spaces in Fig. 1(b)). The centre and size of the neighbourhood are determined by the selected norm. Every $R_{i,j}$ in an n × n recurrence matrix is computed using Eq. (2), where $x_i$ and $x_j$ are points in phase space; $\varepsilon$ is the threshold distance; $\Theta$ is the Heaviside step function; and $\lVert \cdot \rVert$ denotes a suitable norm in the considered phase space. A typical neighbourhood would be centred at the current state $x_i$ in the trajectory, with a size or radius covering an area containing the k nearest neighbours.

$$
R_{i,j}(\varepsilon) = \Theta(\varepsilon - \lVert x_i - x_j \rVert), \quad x_i \in \mathbb{R}^m, \quad i, j = 1, \ldots, n. \tag{2}
$$
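The construction of the recurrence matrix in Eq. (2) can be sketched as below. This is an illustrative sketch under stated assumptions: a Euclidean norm, a fixed threshold `eps` instead of a k-nearest-neighbour radius, the convention that a distance exactly equal to `eps` counts as a recurrence, and a sine wave as a stand-in signal. The embedding dimension and delay are arbitrary example values.

```python
import math

def embed(series, dim, delay):
    """Time-delay embedding: state i is (x_i, x_{i+delay}, ..., x_{i+(dim-1)*delay})."""
    n = len(series) - (dim - 1) * delay
    return [tuple(series[i + k * delay] for k in range(dim)) for i in range(n)]

def recurrence_matrix(states, eps):
    """R[i][j] = 1 if the Euclidean distance between states i and j is <= eps, else 0."""
    n = len(states)
    return [[1 if math.dist(states[i], states[j]) <= eps else 0
             for j in range(n)] for i in range(n)]

# Stand-in signal: a sine wave with a period of 20 samples.
signal = [math.sin(2 * math.pi * t / 20) for t in range(100)]
states = embed(signal, dim=2, delay=5)   # delay of a quarter period
R = recurrence_matrix(states, eps=0.1)
```

With these settings the embedded trajectory is a closed loop, so a state recurs one full period later (`R[0][20]` is 1) but not half a period later (`R[0][10]` is 0).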

If $P(\tau)$ denotes the probability that a system returns to the $\varepsilon$-neighbourhood of a former point $x_i$ on the trajectory after a time interval $\tau$, the recurrence probability with time delay $\tau$ is given by the following:

$$
P(\tau) = \frac{\sum_{i=1}^{n-\tau} R_{i,i+\tau}}{n - \tau}, \quad \text{where } \tau = 1, 2, \ldots, n - 1. \tag{3}
$$
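Eq. (3) amounts to averaging each off-diagonal of the recurrence matrix. The following sketch assumes R is given as a list of lists of 0/1 values; the periodic test matrix is made up for illustration.

```python
def recurrence_probability(R):
    """Return [P(1), P(2), ..., P(n-1)] following Eq. (3).

    P(tau) is the mean of the tau-th diagonal of R, i.e. the fraction of the
    n - tau points that recur after tau steps. Indices are 0-based here, so
    the sum over i = 1..n-tau becomes i = 0..n-tau-1.
    """
    n = len(R)
    return [sum(R[i][i + tau] for i in range(n - tau)) / (n - tau)
            for tau in range(1, n)]

# Made-up recurrence matrix of a perfectly periodic system with period 4.
n = 12
R = [[1 if (i - j) % 4 == 0 else 0 for j in range(n)] for i in range(n)]
P = recurrence_probability(R)   # P[tau - 1] holds P(tau)
```

For this periodic example P(τ) equals one at multiples of the period and zero elsewhere, which is the peak structure that the comparison between two systems relies on.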

Here the numerator on the right hand side of the equation is the sum of recurrences calculated for $n - \tau$ data points on the trajectory. The phase synchronization between a pair of time series can be detected and quantified by comparing the $P(\tau)$ values for the two systems under consideration. The function $P(\tau)$ formally defined by Eq. (3) is also an autocorrelation function which can define higher-order correlations between points on the trajectory within the time interval $\tau$. To determine if two systems, i and j, are in phase synchronization, the cross correlation coefficient between $P_i(\tau)$ for system i and $P_j(\tau)$ for system j is computed; this coefficient is referred to as the correlation probability of recurrence (CPR) [33]:

$$
\mathrm{CPR}_{i,j} = \frac{\langle \bar{P}_i(\tau)\, \bar{P}_j(\tau) \rangle}{\sigma_i \sigma_j}, \quad \text{where } \tau = 1, 2, \ldots, n - 1. \tag{4}
$$
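A minimal sketch of Eq. (4) follows, assuming the two recurrence probability sequences have already been computed over the same range of τ. The averaging operator is taken as the plain mean over τ and the standard deviations as population standard deviations; the three short sequences are made-up values for illustration only.

```python
import math

def cpr(p_i, p_j):
    """Correlation probability of recurrence between two P(tau) sequences, Eq. (4)."""
    if len(p_i) != len(p_j):
        raise ValueError("P(tau) sequences must cover the same delays")
    m = len(p_i)
    mean_i = sum(p_i) / m
    mean_j = sum(p_j) / m
    centred_i = [p - mean_i for p in p_i]   # mean-centred P_i(tau)
    centred_j = [p - mean_j for p in p_j]   # mean-centred P_j(tau)
    sigma_i = math.sqrt(sum(c * c for c in centred_i) / m)
    sigma_j = math.sqrt(sum(c * c for c in centred_j) / m)
    average_product = sum(a * b for a, b in zip(centred_i, centred_j)) / m
    return average_product / (sigma_i * sigma_j)

# Made-up recurrence probabilities for delays tau = 1..8.
p_a = [0.0, 0.1, 0.9, 0.1, 0.0, 0.1, 0.9, 0.1]
p_b = [0.0, 0.2, 0.8, 0.2, 0.0, 0.2, 0.8, 0.2]   # peaks at the same delays as p_a
p_c = [0.9, 0.1, 0.0, 0.1, 0.9, 0.1, 0.0, 0.1]   # peaks at different delays

same_peaks = cpr(p_a, p_b)
shifted_peaks = cpr(p_a, p_c)
```

Sequences whose recurrence peaks coincide give a value near one even when the peak heights differ, while sequences whose peaks fall at different delays give a much lower value.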


Refer to Appendix A for an example of the PS method applied to Rössler signals. In Eq. (4), $\langle \cdot \rangle$ is the average operator; $\bar{P}_i(\tau)$ and $\bar{P}_j(\tau)$ are mean centred (the mean value is subtracted), i.e., $\bar{P}_i(\tau) = P_i(\tau) - \bar{P}_i$ and $\bar{P}_j(\tau) = P_j(\tau) - \bar{P}_j$; and $\sigma_i$ and $\sigma_j$ are the standard deviations of $P_i(\tau)$ and $P_j(\tau)$ respectively. If the two systems are synchronized, $\mathrm{CPR}_{i,j}$ approaches one. If they are not synchronized, $\mathrm{CPR}_{i,j}$ takes a value close to zero. One can expect a drift in the probability of recurrences to result in low values of $\mathrm{CPR}_{i,j}$ (see footnote 2).

The following example demonstrates that the phase synchronization measure (i.e., $\mathrm{CPR}_{i,j}$) characterizes the synchronized variations between two different time series better than the cross correlation measure. We consider two time series (see Fig. 2) with inherent noise and time lag. The correlation between time series 1 and time series 2 is not linear. The cross correlation coefficient for these two signals is −0.1415 and the phase synchronization ($\mathrm{CPR}_{i,j}$) value is 0.715. This example illustrates that phase synchronization is a more reliable measure than cross correlation for characterizing matching trends in different time series. In addition, the suitability of $\mathrm{CPR}_{i,j}$ over correlation based measures has been validated by other studies. Goswami et al. [34] conclude that $\mathrm{CPR}_{i,j}$ is a robust measure, one with the power to reveal patterns from relatively poor data sets in terms of noise, low sampling frequency, and short time series length, thereby extending its capacity as a measure that can estimate connections between time series data effectively.

To summarize, phase synchronization detects linear or non-linear relationships between a pair of time series (with or without time lag), while cross correlation detects only linear relationships between time series (without time lag). Appendix A illustrates how CPR and cross correlation values vary as the two signals differ in their phase. We can see that CPR stays high and stable regardless of the phase shift between the time series. In contrast, the cross correlation takes its maximum values only when the time series are shifted by a phase of $2k\pi$, where k = 0, 1, 2.

2 For example, when applying Eqs. (3) and (4) to a financial market, subscripts i and j represent time series data or dynamic signals corresponding to a pair of stocks. $P_i(\tau)$ and $P_j(\tau)$ represent the probability of recurrence for the time series data of stocks i and j respectively. If the $\mathrm{CPR}_{i,j}$ value for the stocks computed using Eq. (4) is greater than a predetermined cut-off or threshold, the two stocks are correlated or in phase synchronization.

Fig. 1. (a) Segment of the phase space trajectory of a Rössler system. Signals representing a Rössler system are generated using the equations $\dot{x} = -y - z$; $\dot{y} = x + ay$; $\dot{z} = b + z(x - c)$ with a = 0.15, b = 0.2, and c = 10. (b) The corresponding recurrence plot. A phase space vector j which falls into the neighbourhood (grey circle in Fig. 1(a)) of a given vector i is considered as a recurrence point (black point on the trajectory in Fig. 1(a)). This is marked with a black point in the recurrence plot at the position (i, j). A phase space vector outside the neighbourhood (empty circle in Fig. 1(a)) leads to a white point in the recurrence plot.

Fig. 2. Demonstration of efficacy of phase synchronization measure for non-linearly correlated time series.
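The contrast between the two measures under a phase shift can be reproduced with a small experiment. The sketch below does not use the signals of Fig. 2 or Appendix A; it assumes two noise-free sinusoids a quarter period apart, a Euclidean norm, and arbitrary example values for the embedding dimension, delay, and threshold.

```python
import math

def embed(series, dim, delay):
    """Time-delay embedding of a scalar series into dim-dimensional states."""
    n = len(series) - (dim - 1) * delay
    return [tuple(series[i + k * delay] for k in range(dim)) for i in range(n)]

def recurrence_probability(series, dim, delay, eps):
    """P(tau) for tau = 1..n-1, combining Eqs. (2) and (3)."""
    states = embed(series, dim, delay)
    n = len(states)
    probs = []
    for tau in range(1, n):
        hits = sum(1 for i in range(n - tau)
                   if math.dist(states[i], states[i + tau]) <= eps)
        probs.append(hits / (n - tau))
    return probs

def correlation(a, b):
    """Correlation coefficient: Eq. (1) on raw series, Eq. (4) on P(tau) sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

# Two sinusoids with the same period, shifted by a quarter period (pi / 2).
period = 25
x = [math.sin(2 * math.pi * k / period) for k in range(200)]
y = [math.sin(2 * math.pi * k / period + math.pi / 2) for k in range(200)]

cc = correlation(x, y)
cpr_value = correlation(recurrence_probability(x, dim=2, delay=6, eps=0.3),
                        recurrence_probability(y, dim=2, delay=6, eps=0.3))
print(f"cross correlation = {cc:.3f}, CPR = {cpr_value:.3f}")
```

For these idealised signals the cross correlation comes out near zero, because a sine and a cosine are uncorrelated over whole periods, while CPR comes out near one, because both signals recur at the same delays. Real financial series are noisy and non-stationary, so values as clean as these should not be expected in practice.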


3. Phase synchronization based MST


Phase synchronization (PS) between a pair of entities is defined as the degree of phase locking observed between trajectories of the time series corresponding to these entities in phase space. In the context of a financial system, "the time series of an entity" is essentially the fluctuation in value of a stock or a currency over time. Determining the PS between a pair of entities in a system involves the following steps, which map the time series of entities to trajectories in phase space [31,35]. These trajectories are the plots of the variable x in the time series data at time t versus the same variable at time $t - \tau$, where $\tau$ is the embedded time delay.

1. Compute recurrences for each trajectory using Eq. (2).
2. Calculate the recurrence probability at time delay $\tau$ for each trajectory using Eq. (3).
3. Compute the PS between each trajectory pair as a cross correlation coefficient $\mathrm{CPR}_{i,j}$ using Eq. (4).

If two trajectories, i and j (corresponding to the time series of the entities), are phase synchronized, then their probabilities of recurrence are maximal at the same time, resulting in a high $\mathrm{CPR}_{i,j}$. For N entities under consideration, an N × N matrix of $\mathrm{CPR}_{i,j}$ values is computed and then transformed into an N × N distance matrix with elements $d_{ij} = \sqrt{2(1 - \mathrm{CPR}_{i,j})}$. The MST is then constructed from the distance matrix using Kruskal's algorithm.

4. Study of financial networks

A phase synchronization based method is relevant and applicable to the analysis and study of financial markets. A financial market is a dynamic system in which the components (stocks, currency exchange, etc.) are subjected to oscillations due to various external forces. Investors and analysts are interested not only in studying and predicting these oscillations but also in determining correlations (interactions) between relevant components. The main objective of the study is to compare topological characteristics of MSTs constructed using cross correlation (CC) and phase synchronization (PS). For the purpose of investigation, we mainly consider two types of time series: (1) currency exchange rates, and (2) stock prices. The two different components help us understand how the proposed PS based method works compared to the existing CC based method in the case of time series with low fluctuations and negligible or no time lag (currency exchange rates) and time series with high fluctuations and often with time lag (stock prices). The following sections present the details of the currency exchange rate data and stock price data used for studying the performance of the phase synchronization based method in comparison to the cross correlation based method.

4.1. Currency exchange rates data

Thai currency exchange data was obtained from http://oanda.com/ for the year 1997 and for the nine years from 2004 to 2012. The year 1997 is when the Thai currency crisis and the Asian financial crisis happened [36]; it is included to compare the results between periods of crisis and periods of stability. The time series data was divided into ten time windows: one window for the year 1997 and one window for each of the nine years during the 2004–2012 period. Minimum spanning trees were constructed for each of the ten time windows using the PS-MST and CC-MST methods.

4.2. Stock prices data

Daily closing prices of all stocks in the SET index between 1995 and 2012 (inclusive) are used in this analysis. However, the results for only nine years are presented here: 1996, 1997, 2001, 2003, 2005, 2006, 2008, 2011, and 2012. These specific years are selected for the analysis because they were the times during which the SET index experienced crisis, buoyancy or relative calmness. Fig. 3 shows a plot of the 4-period moving average of the monthly closing prices of stocks in the SET index between 1995 and 2012. We observe that the SET index has two periods with dominant bearish behaviour: the period between 1995 and 1999, and 2008. The index also shows two periods of bullish behaviour: 2003, and the period between 2010 and 2012 (both years inclusive). There are also short periods where the index was almost horizontal: 2001, and 2005 to 2006.

Fig. 3. SET index-moving average of monthly closing prices of stocks 1995–2012.
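The two data preparation steps mentioned in Sections 4.1 and 4.2, splitting a series into yearly windows and smoothing monthly prices with a 4-period moving average, can be sketched as follows. The date format, the trailing (rather than centred) average, and all numeric values are assumptions made for illustration; they are not taken from the study's data.

```python
from collections import defaultdict

def moving_average(values, window=4):
    """Trailing moving average; the first window - 1 points produce no output."""
    return [sum(values[k - window + 1:k + 1]) / window
            for k in range(window - 1, len(values))]

def split_by_year(observations):
    """Group (date, value) pairs into one window per calendar year.

    Dates are assumed to be strings of the form 'YYYY-MM-DD'.
    """
    windows = defaultdict(list)
    for date, value in observations:
        windows[int(date[:4])].append(value)
    return dict(windows)

# Made-up monthly closing prices.
monthly_close = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
smoothed = moving_average(monthly_close)

# Made-up daily observations spanning two yearly windows.
observations = [("1997-01-02", 25.6), ("1997-01-03", 25.7), ("2004-01-02", 39.5)]
windows = split_by_year(observations)
```

Each yearly window would then be processed independently, yielding one correlation matrix and one MST per window.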


Table 1. Network measures considered for analysis and their purpose.

| Network measures analysed | Purpose of analysis |
| --- | --- |
| Average distance as a function of endpoint degree | It determines how edge and node properties influence the overall topology of the network |
| Average betweenness centrality as a function of degree | It determines how node properties and position in the network influence topology of the network |
| Network diameter and average shortest path versus time periods | It examines if edge properties change with respect to external events |

4.3. Network performance measures


The MSTs generated by the PS and CC methods are analysed with the objective of gaining insights into the variations in network topology and measures over time. The network measures considered for comparison purposes are listed in Table 1. We selected measures that we expect to be sensitive to crises. There could be other measures that are sensitive to crises, but only these three are selected for illustration purposes.

For the first measure, the two nodes joined by an edge are considered and their degrees are multiplied to give the endpoint degree of that edge. The distance values are then averaged over edges with the same endpoint degree. The endpoint degree brings out the assortative nature of the nodes, i.e., the tendency of high degree node pairs to have high correlation (highlighted by low distance values) and of low degree node pairs to have low correlation (highlighted by high distance values). During a bad financial period the system tends to have small distances irrespective of the degree of the nodes. This implies that highly connected nodes as well as less connected nodes move in the same direction and hence are highly correlated. During good economic periods, nodes move in different directions irrespective of their degree, which yields varying distance values irrespective of the degree of the nodes.

The network diameter represents the largest distance between nodes, and the average shortest path is the mean of all shortest paths between nodes in the system. In the event of a crisis the network diameter and the average shortest path are expected to shrink due to the formation of additional links (more correlations). Since the PS based method is able to capture additional correlations more comprehensively than the CC based method, the PS based approach would produce more links than the CC based method. This would cause the PS based approach to yield marginally lower values of average shortest path and lower values of network diameter than the CC based method.
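The three measures of Table 1 can be sketched for a tree given as a list of weighted edges. Several choices below are assumptions, since the text does not fix them: path lengths are sums of edge distances rather than hop counts, and betweenness is normalised by the number of node pairs that exclude the node in question. The five-node tree is a made-up example.

```python
from collections import defaultdict
from itertools import combinations

def adjacency(edges):
    """Build {node: {neighbour: distance}} from (i, j, d) edges."""
    adj = defaultdict(dict)
    for i, j, d in edges:
        adj[i][j] = d
        adj[j][i] = d
    return adj

def avg_distance_by_endpoint_degree(edges):
    """Average edge distance grouped by endpoint degree k_i * k_j."""
    adj = adjacency(edges)
    buckets = defaultdict(list)
    for i, j, d in edges:
        buckets[len(adj[i]) * len(adj[j])].append(d)
    return {k: sum(v) / len(v) for k, v in sorted(buckets.items())}

def tree_path(adj, src, dst):
    """Unique path between two nodes of a tree, found by depth-first search."""
    stack = [(src, [src])]
    while stack:
        node, path = stack.pop()
        if node == dst:
            return path
        for nxt in adj[node]:
            if len(path) < 2 or nxt != path[-2]:   # never step back to the parent
                stack.append((nxt, path + [nxt]))
    raise ValueError("nodes are not connected")

def path_measures(edges):
    """Return (diameter, average shortest path, betweenness) for a tree with >= 3 nodes."""
    adj = adjacency(edges)
    nodes = sorted(adj)
    lengths = []
    through = {v: 0 for v in nodes}
    for s, t in combinations(nodes, 2):
        path = tree_path(adj, s, t)
        lengths.append(sum(adj[a][b] for a, b in zip(path, path[1:])))
        for v in path[1:-1]:                       # interior nodes of the path
            through[v] += 1
    pairs_excluding_node = (len(nodes) - 1) * (len(nodes) - 2) / 2
    betweenness = {v: through[v] / pairs_excluding_node for v in nodes}
    return max(lengths), sum(lengths) / len(lengths), betweenness

# Made-up tree with five nodes; node 1 is the hub.
edges = [(0, 1, 0.5), (1, 2, 0.4), (1, 3, 0.6), (3, 4, 0.2)]
by_degree = avg_distance_by_endpoint_degree(edges)
diameter, avg_path, betweenness = path_measures(edges)
```

Enumerating all node pairs is quadratic in the number of nodes, which is adequate for trees with at most a few hundred nodes; larger trees would call for a dedicated graph library.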


5. Results and discussions


Here we discuss the results pertaining to currency exchange rate and stock index applications. These results were obtained by analysing properties of MST networks constructed using PS and CC based methods.


5.1. Results for currency exchange rates


First, we compare endpoint degree ki kj as a function of average distance ⟨dij ⟩. The endpoint degree of an edge is the product of the degrees of nodes the edge is incident on. Fig. 4(a) and (b) show plots of average distance versus endpoint degree for networks constructed from PS-MST and CC-MST respectively. A first look at the plots suggests that the relationship between the endpoint degree and average distance is random. A closer observation of the data corresponding to 1997 reveals that the average distance for almost every value of ki kj is below 0.2, and the average distances for other time windows predominantly occupy levels between 0.2 and 0.45. This observation is an indication of change in the structure of the currency exchange network during the time of turbulence. In the case of CC-MST, the average distances for the 1997 time window also predominantly occupy levels below 0.2, but there is also a considerable occupation in this level by data points from other time windows. Second, we compare the variation of betweenness centrality of nodes with respect to their degree. Betweenness centrality of a node is the fraction of shortest paths between node pairs that pass through that node. In simple terms it is the measure of the influence a node has on controlling flow through the network. In the case of both PS-MST (Fig. 5(a)) and CS-MST (Fig. 5(b)), the betweenness centrality is observed to increase with node degree; however the upward trend appears to hold stable up to node degree 3. There is a considerable variation in centrality observed between time windows for nodes above degree 4. For instance, nodes of degree 5 in Fig. 5(b) have an average centrality between 0.25 and 1.0 whereas nodes of degree 3 have an average centrality between 0.22 and 0.4. These observations indicate the presence of two types of high degree nodes or hubs— prominent players in the network and inactive hubs lying outside the main spine of the minimum spanning tree [37]. 
In the case of PS-MST, 1997 and 2008 time windows have an average centrality lower than the other time windows indicating an influence of external events—the Thai currency devaluation and the subprime mortgage crisis, on the network’s structure. A comparison of high degree nodes or hubs across all time windows showed that the position of hubs in the minimum spanning tree changes from time to time. Fig. 6(a) and Fig. 7(a) show how the positions of the high degree nodes changed their positions during the time windows considered. For both PS-MST and CC-MST there are very few nodes that retain the top three degrees over a long period of time. In the case of PS-MST, North Korean Won, UAE Dirham, Chinese Yuan, Singapore Dollar, and Taiwan Dollar are the major hubs in the minimum spanning tree as they fall within the top three degrees in at

S. Radhakrishnan et al. / Physica A xx (xxxx) xxx–xxx

7

Fig. 4. Average distance versus endpoint degree; (a) for PS-MST; (b) for CC-MST.

Fig. 5. Average betweenness centrality; (a) for PS-MST; (b) for CC-MST.

least three of the ten time windows (see Fig. 6(b)). In the case of CC-MST, North Korean Won, Chinese Yuan, Qatari Riyal, Danish Krone, Singapore Dollar, and Macau Pataca are the major hubs (see Fig. 7(b)). A comparison of diameters of currency networks and average shortest paths for the ten time windows including 1997 is shown in Fig. 8. Network diameter is the minimum of all the shortest paths computed in the MST and the average shortest path is the mean of shortest paths between every pair of nodes in a network. Both PS-MST and CC-MST show a low point during periods of market turbulence, i.e. in years 1997 and 2008. For the time windows after 2008, the networks responded with a general increase in diameter and average shortest path with the exception of the 2011 time window corresponding to CC-MST. This observation shows that the exchange networks tend to reduce distances between nodes (or shrink as a whole) during crisis events; and increase distances (or expand) during periods of stability. We observe no significant difference in the characteristics of PS based MST and CC based MST. This is because the time series pertaining to currency exchange rates do not experience considerable time lag. Changes in one currency affect other currency rates instantaneously or with negligible time lag. 5.2. Results for stock prices Fig. 9(a) and (b) shows the variation between average distance and endpoint degree for PS base MST and CC based MST respectively. Years 2011, 2012 and 2003 indicate bull markets and years 2008, 1997, and 1996 indicate bear markets. Both PS based MST and CC based MST show a rapid drop in values of average distance till endpoint degree value of 10. For endpoint degree values greater than 10 the average distance tend to plateau for both PS based MST and CC based MST. In addition both PS based MST and CC based MST yield lower values of average distance for bear markets. 
The characteristically low average distance could serve as a potential indicator of market downturns. However it is important to note two interesting points: (1) PS based MST yields nodes with higher endpoint degree than those of the CC based MST. This indicates nodes in

1 2 3 4 5 6 7 8 9 10 11 12 13

14

15 16 17 18 19 20 21

8

S. Radhakrishnan et al. / Physica A xx (xxxx) xxx–xxx

Fig. 6. (a) Variation of top three degree nodes by year for PS-MST; (b) number of times a node was within the top three degree nodes in all time windows.

Fig. 7. (a) Variation of top three degree nodes by year for CC-MST; (b) number of times a node was within the top three degree nodes in all time windows.

Fig. 8. (a) Network diameter and average shortest path; (a) for PS-MST; (b) for CC-MST.

S. Radhakrishnan et al. / Physica A xx (xxxx) xxx–xxx

9

Fig. 9. Average distance versus endpoint degree; (a) PS based MST; (b) CC based MST; (2011, 2012, and 2003 are bull market years; 1996, 1997, and 2008 are bear market years).

Fig. 10. Average distance versus endpoint degree; (a) PS based MST, (b) CC based MST; (2001, 2005, and 2006 are flat market years).

PS based MST have more links than those in the CC based MST. PS based method tends to generate additional links (nonlinear relations among time series) that have been missed by CC based method. (2) The average distance values generated by PS based method tend to be shorter than those generated by the CC based method. This in turn indicates that PS based method generates stronger correlations between nodes  This is because higher the correlation the shorter than the CC based methods. the distance according to the equation dij = 2(1 − CPRi,j ) or dij = 2(1 − Ci,j ). These two points support the assertion that PS based MST captures the dynamics highly fluctuating time series with a time lag more reliably than CC based MST. Fig. 10(a) and (b) shows the variation between average distance and endpoint degree for PS base MST and CC based MST respectively. Years 2001, 2005, and 2006 indicate flat markets. The observations for flat markets are consistent with observations for bull and bear markets. In the analysis of average betweenness centrality, we observe a general linear increase in average betweenness centrality with degree (see Fig. 11). With respect to the structure of an MST, this observation indicates that high degree nodes or stocks play a central role in the network. In Fig. 11(a) and (b) we see that both PS based MST and CC-MST have nodes with relatively high average centrality for nodes whose degrees are greater than five when the stock index tanked during 1996–1997 time period. But this behaviour is not observed for 2008. We note that the magnitude of the downturn during 1996 and 1997 was greater than that during 2008. Though there is no particular difference observed in the plots for bull and bear markets, bear markets appear to produce nodes of relatively large centrality. For example, in 1996, PS based MST produced nodes of degree 6 with an average centrality of 0.42 compared to 0.13 in 2008. 
In 1996, CC based MST produced nodes of degree 8 with an average centrality of 0.38 compared to 0.18 in 2008. During flat markets both PS based MST and CC based MST show large variation in centrality for degrees greater than 4 (see Fig. 12(a) and (b)). Analysis of network diameter and average shortest path shows that the metrics tend to follow the trends of the SET index value in case of PS-MST (see Fig. 13(a)), i.e., the average distance and diameter of the MST tend to shrink when the market is bearish (years 1996, 1997, and 2008) and, in contrast, they tend to increase when the market is bullish (years 2003, 2011 and 2012). This behaviour is not observed to the same extent in the case of CC-MST (see Fig. 13(b)). For example, we see

Fig. 11. Average centrality versus degree; (a) PS based MST; (b) CC based MST; (2011, 2012, and 2003 bull market years; 1996, 1997, 2008 bear market years).

Fig. 12. Average centrality versus degree; (a) PS based MST; (b) CC based MST; (2001, 2005, and 2006 are flat market years).

Fig. 13. Network diameter and average shortest path; (a) for PS-MST; (b) for CC-MST.

a decrease in diameter and average shortest path in 1996, 1997 and 2008, but we also see a sudden rise in the metrics in 1998, when the market was still bearish, and in 2006, when the market was flat.
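The measures discussed in this section can be illustrated with a minimal sketch. It converts a similarity matrix into distances with $d_{ij} = 2(1 - s_{ij})$ as written above (where $s_{ij}$ stands for either CPR or C), builds the MST with Prim's algorithm [18], and reports the average link distance, node degrees, diameter and average shortest path. The similarity values and the function names (`distance_matrix`, `prim_mst`, `tree_path_stats`) are made up for illustration, and measuring diameter and path length in hops is an assumption, not something the text specifies.

```python
from collections import deque


def distance_matrix(sim):
    """d_ij = 2 * (1 - s_ij), following the equation quoted in the text."""
    n = len(sim)
    return [[0.0 if i == j else 2.0 * (1.0 - sim[i][j]) for j in range(n)]
            for i in range(n)]


def prim_mst(dist):
    """Return MST links (i, j, d_ij) using Prim's algorithm, starting at node 0."""
    n = len(dist)
    in_tree = [False] * n
    best = [float("inf")] * n      # cheapest known link from the tree to each node
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((k for k in range(n) if not in_tree[k]), key=lambda k: best[k])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, dist[parent[u]][u]))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges


def tree_path_stats(n, edges):
    """Diameter and average shortest path of the tree, counted in hops (assumption)."""
    adj = [[] for _ in range(n)]
    for i, j, _ in edges:
        adj[i].append(j)
        adj[j].append(i)
    total, longest = 0, 0
    for source in range(n):
        hops = {source: 0}
        queue = deque([source])
        while queue:                      # breadth-first search from each node
            u = queue.popleft()
            for v in adj[u]:
                if v not in hops:
                    hops[v] = hops[u] + 1
                    queue.append(v)
        total += sum(hops.values())
        longest = max(longest, max(hops.values()))
    return longest, total / (n * (n - 1))


# Made-up similarity values for four hypothetical stocks.
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.8, 0.3],
    [0.2, 0.8, 1.0, 0.7],
    [0.1, 0.3, 0.7, 1.0],
]
dist = distance_matrix(sim)
mst = prim_mst(dist)
avg_distance = sum(d for _, _, d in mst) / len(mst)
degrees = [sum(1 for i, j, _ in mst if k in (i, j)) for k in range(len(sim))]
diameter, avg_path = tree_path_stats(len(sim), mst)
```

In this toy case the tree is the chain 0-1-2-3, so stronger similarities translate directly into a shorter average link distance, which is the relationship the comparison of PS based and CC based MSTs relies on.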

6. Conclusions

This study yields three important observations. (1) Both the PS based MST and the CC based MST are suitable in situations where the time series are linearly correlated with negligible time lag. (2) The PS based MST turns out to be a better measure than the CC based MST when the time series are non-linearly correlated, i.e., when there is a time lag between them. (3) The PS method is suitable for predictive analytics due to its ability to detect time-lagged correlations among time series. The CC based method, on the other hand, detects changes between time series only when the changes occur simultaneously in time. We conclude that the PS based MST performs better than the CC based MST for financial time series that have non-linear and/or time-lagged

Fig. A.1. Rossler signals and their respective phase trajectories. Signal 2 has embedded noise.

relationships. This observation is supported by the presence of more links between nodes in the PS based MST than in the CC based MST. These comparisons are valid only for the network measures selected for analysis in this work; we cannot generalize the same observations to other network measures. In future work we will undertake further comparisons between PS based and CC based MSTs for different network measures and compile a comprehensive set of comparison results for the two methods.

Uncited references

[39].

Acknowledgements

This work is part of a project titled Observing and Predicting Thai Stock Market Dynamics Using Phase synchronization Based Networks, and is partially supported by the University of the Thai Chamber of Commerce.

Appendix A

Example of Phase Synchronization concept: Rossler signals

1. Equations for generating Rossler signals

\[
\begin{aligned}
\dot{x} &= -y - z \\
\dot{y} &= x + a y \\
\dot{z} &= b + z (x - c).
\end{aligned}
\]

2. Construct phase trajectories from the time series using the Takens algorithm [32]. Fig. A.1 shows two Rossler signals (signal 2 has noise added) and their respective phase trajectories.

3. Generate the recurrence plots (see Fig. A.2) for each phase trajectory using the equation given below

\[
R_{i,j}(\varepsilon) = \Theta\left(\varepsilon - \lVert x_i - x_j \rVert\right), \quad x_i \in \mathbb{R}^m, \quad i, j = 1, \ldots, n.
\]

4. Compute the recurrence probability with time delay τ using the following equation

\[
P(\tau) = \frac{\sum_{i=1}^{n-\tau} R_{i,i+\tau}}{n - \tau}, \quad \text{where } \tau = 1, 2, \ldots, n - 1.
\]

5. Compute the CPR value using the following equation

\[
\mathrm{CPR}_{i,j} = \frac{\langle \bar{P}_i(\tau)\, \bar{P}_j(\tau) \rangle}{\sigma_i \sigma_j}, \quad \text{where } \tau = 1, 2, \ldots, n - 1.
\]

For the above-mentioned signal 1 and signal 2, the CPR value is 0.9557.
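Steps 1 to 5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the results above. The Rossler parameters (a = 0.2, b = 0.2, c = 5.7), the initial state, the sampling, the noise level, the embedding dimension and lag, the threshold ε, and the maximum delay `tau_max` are all assumed values, since the text does not give them. The sketch also reads $\bar{P}$ as the mean-subtracted recurrence probability and σ as its standard deviation, and takes Θ(0) = 1. Because of these choices, the resulting CPR will generally differ from the 0.9557 reported above.

```python
import math
import random


def rossler_x(n_samples, dt=0.05, stride=4, a=0.2, b=0.2, c=5.7):
    """Step 1: integrate the Rossler equations with RK4 and return sampled x(t).
    Parameter values and the initial state are assumptions, not from the text."""
    def f(s):
        x, y, z = s
        return (-y - z, x + a * y, b + z * (x - c))

    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    state, xs = (1.0, 1.0, 0.0), []
    for step in range(n_samples * stride):
        k1 = f(state)
        k2 = f(shifted(state, k1, dt / 2))
        k3 = f(shifted(state, k2, dt / 2))
        k4 = f(shifted(state, k3, dt))
        state = tuple(si + dt / 6 * (p + 2 * q + 2 * r + w)
                      for si, p, q, r, w in zip(state, k1, k2, k3, k4))
        if step % stride == 0:
            xs.append(state[0])
    return xs


def embed(series, dim=3, lag=7):
    """Step 2: delay embedding x_i = (s_i, s_{i+lag}, ..., s_{i+(dim-1)*lag})."""
    n = len(series) - (dim - 1) * lag
    return [tuple(series[i + k * lag] for k in range(dim)) for i in range(n)]


def recurrence_probability(points, eps, tau_max):
    """Steps 3-4: P(tau) = sum_i R_{i,i+tau} / (n - tau), where
    R_{i,j} = 1 if ||x_i - x_j|| <= eps and 0 otherwise."""
    n = len(points)
    probs = []
    for tau in range(1, tau_max + 1):
        hits = sum(1 for i in range(n - tau)
                   if math.dist(points[i], points[i + tau]) <= eps)
        probs.append(hits / (n - tau))
    return probs


def cpr(p1, p2):
    """Step 5: correlation of the two mean-subtracted P(tau) curves."""
    m1, m2 = sum(p1) / len(p1), sum(p2) / len(p2)
    cov = sum((u - m1) * (v - m2) for u, v in zip(p1, p2)) / len(p1)
    s1 = math.sqrt(sum((u - m1) ** 2 for u in p1) / len(p1))
    s2 = math.sqrt(sum((v - m2) ** 2 for v in p2) / len(p2))
    return cov / (s1 * s2)


random.seed(0)                                   # reproducible noise
signal_1 = rossler_x(800)
signal_2 = [x + random.gauss(0.0, 0.3) for x in signal_1]   # noisy copy (assumed noise level)
eps = 2.0                                        # assumed recurrence threshold
p1 = recurrence_probability(embed(signal_1), eps, tau_max=200)
p2 = recurrence_probability(embed(signal_2), eps, tau_max=200)
cpr_self = cpr(p1, p1)      # a signal compared with itself
cpr_noisy = cpr(p1, p2)     # signal 1 versus its noisy copy
```

Limiting τ to `tau_max` rather than n − 1 is a practical choice in this sketch: for τ close to n the average in P(τ) is taken over very few points.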

Fig. A.2. Rossler signals and their respective phase trajectories. Signal 2 has embedded noise.

Fig. B.3. Lorentz signals and their respective phase trajectories. Signal 2 has embedded noise.

Appendix B

Comparison of PS based and Correlation based approach

Fig. B.3 shows two signals (Signal 1 and Signal 2). Signal 2 is shifted in steps of Δt. After each shift, the correlation value and the PS value between the two signals are computed. The correlation and PS values are shown in Fig. B.3. The correlation values approach the PS values only at every 2π shift, whereas the PS values remain close to 1 throughout, indicating the non-linear correlation between signal 1 and signal 2. For a detailed study of the effect of noise and lag on PS and correlation values, please refer to Blasco et al. [38].
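The contrast described here can be reproduced in an idealized setting. The sketch below is an assumption-laden illustration: it uses pure sinusoids rather than the signals of Fig. B.3, a two-dimensional delay embedding with a quarter-period lag, and an arbitrary threshold `eps`; the helper names are hypothetical. For each phase shift it computes the ordinary correlation coefficient and a CPR-style value (the correlation of the two recurrence-probability curves).

```python
import math


def pearson(u, v):
    """Ordinary (zero-lag) correlation coefficient of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def recurrence_probability(series, lag, eps, tau_max):
    """P(tau) for the 2-D delay embedding x_i = (s_i, s_{i+lag})."""
    pts = [(series[i], series[i + lag]) for i in range(len(series) - lag)]
    n = len(pts)
    return [
        sum(1 for i in range(n - tau) if math.dist(pts[i], pts[i + tau]) <= eps) / (n - tau)
        for tau in range(1, tau_max + 1)
    ]


SAMPLES_PER_PERIOD = 40
N = 10 * SAMPLES_PER_PERIOD                      # ten full periods
t = [2 * math.pi * k / SAMPLES_PER_PERIOD for k in range(N)]
signal_1 = [math.sin(x) for x in t]
p1 = recurrence_probability(signal_1, lag=10, eps=0.5, tau_max=100)

results = {}                                     # shift label -> (correlation, CPR-style value)
for label, shift in [("0", 0.0), ("pi/2", math.pi / 2),
                     ("pi", math.pi), ("2pi", 2 * math.pi)]:
    signal_2 = [math.sin(x + shift) for x in t]
    p2 = recurrence_probability(signal_2, lag=10, eps=0.5, tau_max=100)
    results[label] = (pearson(signal_1, signal_2), pearson(p1, p2))
```

In this idealized periodic case the correlation equals the cosine of the shift (1 at shifts of 0 and 2π, 0 at π/2, and −1 at π), while the CPR-style value stays at 1 for every shift, because a phase shift does not change when the trajectory recurs.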

References

[1] Y. Zhang, G.H.T. Lee, J.C. Wong, J.L. Kok, M. Prusty, S.A. Cheong, Will the US economy recover in 2010? A minimal spanning tree study, Physica A 390 (11) (2011) 2020–2050.
[2] N. Frank, B. González-Hermosillo, H. Hesse, Transmission of liquidity shocks: Evidence from the 2007 Subprime Crisis, IMF Working Paper WP/08/200, August 2008. Available from http://www.banquefrance.eu/fr/publications/telechar/seminaires/2008/Hermosillo-.
[3] A.W. Lo, Hedge funds, systemic risk, and the financial crisis of 2007–2008, written testimony prepared for the US House of Representatives Committee on Oversight and Government Reform, 13 November 2008. Revised version published as A.W. Lo, Regulatory reform in the wake of the financial crisis of 2007–2008, J. Financ. Econ. Policy 1 (1) (2009) 4–43.
[4] C. Tudor, Understanding the roots of the US Subprime Crisis and its subsequent effects, Rom. Econ. J. 31 (2009) 115–148.
[5] W. Cheung, S. Fung, S.-C. Tsai, Global capital market interdependence and spillover effect of credit risk: evidence from the 2007–2009 global financial crisis, Appl. Financ. Econ. 20 (1) (2010) 85–103.
[6] M.C. Münnix, R. Schäfer, O. Grothe, Estimating correlation and covariance matrices by weighting of market similarity, 30 Jun 2010. arXiv:1006.5847.
[7] D.K.T. Wong, K.-W. Li, Comparing the performance of relative stock return differential and real exchange rate in two financial crises, Appl. Financ. Econ. 20 (1) (2010) 137–150.
[8] V. Boginski, S. Butenko, P.M. Pardalos, Statistical analysis of financial networks, Comput. Statist. Data Anal. 48 (2) (2005) 431–443.
[9] L. Laloux, P. Cizeau, J.-P. Bouchaud, M. Potters, Noise dressing of financial correlation matrices, Phys. Rev. Lett. 83 (7) (1999) 1467–1470.
[10] V. Plerou, P. Gopikrishnan, B. Rosenow, L.A.N. Amaral, T. Guhr, H.E. Stanley, Random matrix approach to cross correlations in financial data, Phys. Rev. E 65 (6) (2002) 066126.
[11] A. Utsugi, K. Ino, M. Oshikawa, Random matrix theory analysis of cross correlations in financial markets, Phys. Rev. E 70 (2) (2004) 026110.
[12] D. Wilcox, T. Gebbie, On the analysis of cross-correlations in South African market data, Physica A 344 (2004) 294–298.
[13] D. Wilcox, T. Gebbie, An analysis of cross-correlations in an emerging market, Physica A 375 (2007) 584–598.
[14] S. Çukur, M. Eryiğit, R. Eryiğit, Cross correlations in an emerging market financial data, Physica A 376 (2007) 555–564.
[15] V. Kulkarni, N. Deo, Correlation and volatility in an Indian stock market: A random matrix approach, Eur. Phys. J. B 60 (2007) 101–109.
[16] J. Shen, B. Zheng, Cross-correlation in financial dynamics, Europhys. Lett. 86 (2009) 48005.
[17] J.B. Kruskal Jr., On the shortest spanning subtree of a graph and the traveling salesman problem, Proc. Amer. Math. Soc. 7 (1956) 48–50.
[18] R.C. Prim, Shortest connection networks and some generalizations, Bell Syst. Tech. J. 36 (1957) 1389–1401.
[19] R.N. Mantegna, Hierarchical structure in financial markets, Eur. Phys. J. B 11 (1999) 193–197.
[20] G. Bonanno, N. Vandewalle, R.N. Mantegna, Taxonomy of stock market indices, Phys. Rev. E 62 (6) (2000) R7615–R7618.
[21] S. Miccichè, G. Bonanno, F. Lillo, R.N. Mantegna, Degree stability of a minimum spanning tree of price return and volatility, Physica A 324 (2003) 66–73.
[22] G. Bonanno, G. Caldarelli, F. Lillo, R.N. Mantegna, Topology of correlation-based minimal spanning trees in real and model markets, Phys. Rev. E 68 (4) (2003) 046130.
[23] W.-S. Jung, S. Chae, J.-S. Yang, H.-T. Moon, Characteristics of the Korean stock market correlations, Physica A 361 (2006) 263–271.
[24] J.G. Brida, W.A. Risso, Dynamics and structure of the main Italian companies, Internat. J. Modern Phys. C 18 (11) (2007) 1783–1793.
[25] C. Borghesi, M. Marsili, S. Miccichè, Emergence of time-horizon invariant correlation structure in financial returns by subtraction of the market mode, Phys. Rev. E 76 (2) (2007) 026104.
[26] J.G. Brida, W.A. Risso, Multidimensional minimal spanning tree: The Dow Jones case, Physica A 387 (2008) 5205–5210.
[27] C. Eom, G. Oh, W.-S. Jung, H. Jeong, S. Kim, Topological properties of stock networks based on minimal spanning tree and random matrix theory in financial time series, Physica A 388 (2009) 900–906.
[28] J.G. Brida, W.A. Risso, Dynamics and structure of the 30 largest North American companies, Comput. Econ. 35 (1) (2010) 85–99.
[29] F. Huang, P. Gao, Y. Wang, Comparison of Prim and Kruskal on Shanghai and Shenzhen 300 Index hierarchical structure tree, in: Web Information Systems and Mining, 2009. WISM 2009. International Conference on, IEEE, 2009, pp. 237–241.
[30] S. Strogatz, SYNC, Hyperion Press, 2002.
[31] R.V. Donner, Y. Zou, J.F. Donges, N. Marwan, J. Kurths, Recurrence networks - A novel paradigm for nonlinear time series analysis, New J. Phys. 12 (3) (2010).
[32] F. Takens, Detecting Strange Attractors in Turbulence, Springer, Berlin, Heidelberg, 1981, pp. 366–381.
[33] M.C. Romano, M. Thiel, J. Kurths, I.Z. Kiss, J.L. Hudson, Detection of synchronization for non-phase-coherent and non-stationary data, Europhys. Lett. EPL 71 (3) (2005) 466–472.
[34] B. Goswami, G. Ambika, N. Marwan, J. Kurths, On interrelations of recurrences and connectivity trends between stock indices, Physica A 391 (18) (2012) 4364–4376.
[35] N. Marwan, M.C. Romano, M. Thiel, J. Kurths, Recurrence plots for the analysis of complex systems, Phys. Rep. 438 (2007) 237–329.
[36] J. Nagayasu, Currency crisis and contagion, J. Asian Econ. 12 (4) (2001) 29–546.
[37] J.-P. Onnela, A. Chakraborti, K. Kaski, J. Kertesz, Dynamic asset trees and portfolio analysis, Eur. Phys. J. B 30 (2003) 285–288.
[38] R. Blasco, M. Carmen, Synchronization analysis by means of recurrences in phase space, (Doctoral dissertation), Universitätsbibliothek, 2004.
[39] R. Hogg, A. Craig, J. McKean, Introduction to Mathematical Statistics, Macmillan, New York, 1959.