Physical approach to complex systems

Physics Reports 515 (2012) 115–226


Jarosław Kwapień a,∗, Stanisław Drożdż a,b

a Complex Systems Theory Department, Institute of Nuclear Physics, Polish Academy of Sciences, PL-31-342 Kraków, Poland

b Institute of Computer Science, Faculty of Physics, Mathematics and Computer Science, Cracow University of Technology, PL-31-155 Kraków, Poland

Article history: Accepted 28 December 2011. Available online 18 January 2012. Editor: I. Procaccia.

Keywords: Complex systems; Complexity measures; Correlations; Asymmetric correlations; Coexistence of collectivity and noise; Random matrix theory; Time series analysis; Fractals; Multifractals; Critical phenomena; Complex networks; Financial markets; Human brain; Quantitative linguistics

Abstract

Typically, complex systems are natural or social systems consisting of a large number of nonlinearly interacting elements. These systems are open: they exchange information or mass with the environment and constantly modify their internal structure and patterns of activity in the process of self-organization. As a result, they are flexible and easily adapt to variable external conditions. However, the most striking property of such systems is the existence of emergent phenomena which cannot be simply derived or predicted solely from knowledge of the systems' structure and the interactions among their individual elements. This property calls for holistic approaches, which require parallel descriptions of the same system on different levels of its organization. There is strong evidence – consolidated also in the present review – that different, even apparently disparate complex systems can have astonishingly similar characteristics both in their structure and in their behaviour. One can thus expect the existence of some common, universal laws that govern their properties. Physics methodology proves helpful in addressing many of the related issues. In this review, we advocate some of the computational methods which in our opinion are especially fruitful in extracting information on selected – but at the same time most representative – complex systems, such as the human brain, financial markets and natural language, from the time series representing the observables associated with these systems. The properties we focus on comprise the collective effects and their coexistence with noise, long-range interactions, the interplay between determinism and flexibility in evolution, scale invariance, criticality, multifractality and hierarchical structure. The methods described either originate in ''hard'' physics – like the random matrix theory – and were then transmitted to other fields of science via the field of complex systems research, or they originated elsewhere but turned out to be very useful also in physics – like, for example, fractal geometry. Further methods discussed borrow from the formalism of complex networks, from the theory of critical phenomena and from nonextensive statistical mechanics. Each of these methods is helpful in the analysis of specific aspects of complexity and all of them are mutually complementary.

∗ Corresponding author. E-mail address: [email protected] (J. Kwapień).

doi:10.1016/j.physrep.2012.01.007

Contents

1. Complex systems
   1.1. Physics and complexity
   1.2. Notion of complexity
   1.3. Properties of complex systems
        1.3.1. Self-organization
        1.3.2. Coexistence of collective effects and noise
        1.3.3. Variability and adaptability
        1.3.4. Hierarchical structure
        1.3.5. Scale invariance
        1.3.6. Self-organized criticality
        1.3.7. Highly optimized tolerance
2. Description of selected complex systems
   2.1. The human brain
        2.1.1. Functional organization of the brain
        2.1.2. Metastability of the brain state
        2.1.3. Functional imaging of the brain
   2.2. Natural language
   2.3. Financial markets
        2.3.1. Financial markets as complex systems
        2.3.2. Structure of the financial markets
3. Coexistence of collectivity and noise
   3.1. Methods of identification of collective effects in empirical data
        3.1.1. Two degrees of freedom
        3.1.2. Multivariate data
        3.1.3. Ensemble of Wishart random matrices
        3.1.4. Ensemble of correlated Wishart matrices
        3.1.5. Mutual information in many dimensions
   3.2. Collective effects in stock markets
        3.2.1. Data
        3.2.2. Structure of empirical correlation matrix
        3.2.3. Information masked by noise
        3.2.4. Temporal scales of coupling formation
   3.3. Collective effects in the currency market
        3.3.1. Complete multi-dimensional market structure
        3.3.2. Market structure for a given base currency
        3.3.3. Stability of market structure
4. Repeatable and variable patterns of activity
   4.1. Financial markets
   4.2. Cerebral cortex
        4.2.1. Magnetoencephalography
        4.2.2. The experiment and preprocessing of data
        4.2.3. Correlation matrix structure
        4.2.4. Identification of the activity patterns
5. Long-range interactions
   5.1. Asymmetric correlation matrix
   5.2. Long-range interactions in the human brain
        5.2.1. Visual cortex
        5.2.2. Auditory cortex
   5.3. Long-distance couplings in financial markets
   5.4.
6. Lack of characteristic scale
   6.1. Power laws in financial data
        6.1.1. Central Limit Theorem and distribution stability
        6.1.2. Stylized facts
        6.1.3. Impact of collective couplings on return distributions
        6.1.4. Acceleration of market evolution
   6.2. Stock market as a nonextensive system
   6.3. The Zipf law and the natural language
        6.3.1. The linguistic Zipf law
        6.3.2. Deviations from the uniform Zipf law
        6.3.3. Parts of speech and lemmatization
7. Fractality and multifractality
   7.1. Fractals, multifractals and criticality
   7.2. Multifractal character of financial data
        7.2.1. Multiscaling in foreign currency and stock markets
        7.2.2. Microscopic price increments and transaction waiting times
        7.2.3. Sources of multifractality
   7.3. Speculative bubbles and critical phenomena
        7.3.1. Discrete scale invariance
        7.3.2. Log-periodic oscillations as a part of financial market dynamics
        7.3.3. Prediction potential of the log-periodic model
8. Network representation of complex systems
   8.1. Network formalism and basic properties of networks
        8.1.1. Measures of network properties
        8.1.2. Scale-free networks
        8.1.3. Minimal spanning trees
   8.2. Financial markets as complex networks
        8.2.1. Network representation of the stock market
        8.2.2. Network representation of the currency market
9. Summary and outlook
Acknowledgements
References

1. Complex systems

1.1. Physics and complexity

Since its early times, science, and especially physics, concentrated – owing to limitations of the available mathematical apparatus – on the simplest possible models of natural phenomena: models able to grasp the most important features (usually of certain practical importance) of those phenomena and, on this ground, to formulate predictions. For instance, the success of statics in designing machines and structures allowed sometimes complicated, multicomponent systems to be described as a simple resultant of the interactions among their elements. Later, the success of the Newtonian theory of gravity and the Maxwell theory of electromagnetism made it possible to explain distinct, apparently distant and unrelated phenomena by means of a few simple equations describing elementary interactions. As a direct consequence of these achievements, the so-understood reductionism became a fundamental scientific imperative in physics and in science in general [1,2]. Proceeding this way step by step, it was possible not only to formulate experimentally verified theories unifying some – and in the future perhaps all – of the fundamental interactions, but also to show that some systems and phenomena which were not previously considered to be in the realm of physics – e.g., chemical compounds and their reactions, the structure and behaviour of living organisms, or the earth crust phenomena – are no more than a strict mathematical consequence of the elementary physical equations. From this point there is a straightforward way to the view that, given a hypothetical system of equations of the Theory of Everything [3], one would be able to understand every system existing in the Universe, provided one had adequate computing capacity and complete knowledge of the initial conditions [4].

Although, in the past, many scientists shared this view, it now seems rather controversial. First, the computing power available in practice is insufficient to fully describe even systems of rather simple structure if they comprise direct and simultaneous nonlinear interactions of many elements. This is evident already in a system as simple as that of Newton's three-body problem [5]. This problem cannot be solved exactly and one has to apply an approximate method (the perturbation approach) instead. A somewhat more sophisticated though still relatively simple system is the atomic nucleus, where involved iterative approximations (mean field, Hartree–Fock–Bogoliubov, higher-order configuration mixing effects, coupling to continuum) cannot be avoided [6]. In this context, solving equations for systems that are really much more complex than these, e.g., a living cell or the earth crust, based on a knowledge of their constituents (atoms), remains a task completely beyond our reach with present or even imaginable future computing capabilities. Second – and this point is related to the first one – knowledge of a system's present state does not imply that we can know its initial conditions. On the quantum mechanical level, this stems from the probabilistic nature of quantum theory and the irreversibility of the wave function collapse, while on the classical level it originates from chaotic solutions of the physical equations governing the dynamics of the system and the resulting sensitivity of these equations to initial conditions.
One may argue that for highly complicated systems with hierarchical organization even the very notion of physical causality has to be redefined [7]. In this way one approaches the severe limitations of the reductionist approach. On the one hand, it leads to a better understanding of fundamental aspects of the structure of the Universe and confines them in the simple form of a few equations (a triumph of the so-called grand reductionism [8]). On the other hand, however, in a broad class of systems consisting of many interacting elements there emerge problems with the shift from details to a global picture – problems that are impossible to overcome (a failure of ''petty'' reductionism [8]). A possible way to work around this difficulty may be a complementary application of a holistic approach, i.e., one in which a description of each structural level is made independently by formulating specific laws whose application domain can (but does not necessarily have to) be restricted to this level only [9,10]. Let us consider the example of a financial market, which consists of individual and institutional investors whose actions are driven by a desire for profit and by an aversion to risk. It is roughly known which ''forces'' govern the investors (in the simplest view: greed and fear) and what the interactions among them are (information exchange and transactions).


In practice, however, this knowledge is far from sufficient to construct a realistic model of the market and, based on it, to precisely predict the future movements of stock prices. Instead, those movements can be described with more success using specific price-centred rules provided, e.g., by technical analysis. Diving deeper into the market structure, the investors' emotions are doubtlessly a product of the neuronal interactions in their brains. These emotions, however, can neither be explained nor predicted in such a ''microscopic'' way. The activity of neurons is in turn a consequence of certain chemical reactions which take place in the neurons themselves and in their surroundings; the chemical reactions in a living cell are a higher-level effect of the quantum properties of atoms, which in turn consist of electrons and nucleons, and the latter are perhaps no more than triples of strongly interacting quarks. Reasoning along this path, it is straightforward to conclude that, theoretically, all the laws of financial economics must be a strict mathematical consequence of the four fundamental interactions among the elementary particles. Obviously, from a practical point of view, this way of thinking is strongly misleading, since it is impossible to derive those laws using such a bottom-up approach. Thus, in order to be able to explain the financial market's behaviour, one has to neglect the deeper levels of organization without any meaningful loss of information.

As is evident from the above example, the formulation of independent but complementary descriptions on different levels of matter organization is indispensable. It stems from the fact that ''the whole is something beside the parts'' [11] or ''more is different'' [12]: the phenomena occurring at higher levels may not be a straightforward product of the lower-level structure and dynamics of the system's constituents. Illustrations of such phenomena are abundant: convection [13], turbulence [14], phase transitions [15], friction [16,17], fractal coastlines [18], landforms and sand dunes [19,20], self-replication of DNA [21], metabolic cycles in living cells [22–24], multicellular organisms [25,26], population dynamics in ecosystems [27,28], brain potentials and cognition [29], natural language syntax [30,31], money [32–34], business cycles in economy [35,36], social structure [37,38], etc. Their common property is emergence, that is, a spontaneous occurrence of macroscopic order from a sea of randomly interacting elements on a microscopic level. This is possible if the elements interact in a strongly nonlinear fashion and the interaction can propagate over long distances. Local fluctuations can in this way transform themselves into collective behaviour, depending on the state of the environment (context).

Based on the phenomenon of emergence, we may now formulate a working definition of a complex system, which will be used throughout the present work. According to it, a complex system is a system built from a large number of nonlinearly interacting constituents, which exhibits collective behaviour and, due to an exchange of energy or information with the environment, can easily modify its internal structure and patterns of activity. This definition is by no means mathematically rigorous or precise enough to serve as a practical criterion for deciding whether any given system is complex or simple.
However, based on it, it is possible to indicate which systems are doubtlessly complex in the above sense and which, for sure, are not complex at all (even if they consist of a large number of interacting elements, like, for example, a gas in thermodynamical equilibrium). This definition has the advantage that it takes into account the most characteristic physical properties of the structure and the dynamics of complex systems, which accounts for its broad use in physics. Systems which fulfil this definition are commonly seen in nature; they can also be products of human technology, like the Internet, communication networks, financial markets, and so on. Their ubiquity means that different complex systems are traditionally the subjects of interest of many different fields of science, each with its own analytical tools and language: physics, chemistry, physiology, genetics, linguistics, economics, sociology, information theory and many others. Within each of these fields there are well-established laws regarding the macro- and microscopic properties of the related systems, but, from the classical perspective, their domain does not exceed the horizon of interests of the corresponding scientific field. This is a rather unfortunate heritage, however, since by adopting such a perspective one unavoidably misses the possibility of a parallel study of distinct complex systems and a derivation of (or at least a conjecture about) more general, i.e., more fundamental laws. Only recently has this paradigm changed, mostly owing to a growing interest in interdisciplinary research [39]. Some have even started to consider complex systems a topic of study in its own right and a new field of science: complex systems research. It has been found that many complex systems, sometimes of very distinct structure, have amazingly common properties (Section 1.3). This may suggest that those properties are universal for a broad class of systems and may be considered a starting point for studies aiming at the formulation of specific laws for such systems. Importantly, among the many scientific disciplines which contribute to the study of complex systems, physics seems to be best equipped. There are two reasons for this. First, physics discovers fundamental laws of Nature; the omnipresence of complexity in the Universe indicates that anything which is universal for complexity is fundamental for Nature. Second, physics has developed tools which can be useful in this research field: nonlinear dynamics [40], the theory of critical phenomena [15], renormalization theory [41], the theory of self-organized criticality [42], synergetics [43], and so on. For this reason physics, which until quite recently was considered a deep-level (reductionist) science by many other disciplines, e.g., by economics and neuroscience, and therefore completely inadequate for describing higher organization levels of matter, like, e.g., financial markets or the human brain, is nowadays coming to be appreciated as a peer discipline.

1.2. Notion of complexity

In order to approach a definition of complexity, one may preliminarily recall intuition and assume that complexity is a non-trivial regularity having its origin deep inside a system's structure which, at first sight, should not reveal any regularity. However, the so-understood complexity is, to a large degree, a subjective notion


(''complexity is in the eye of the beholder'' [44]). Unfortunately, introducing an objective, mathematically rigorous, intuition-compliant, and commonly accepted definition, and providing a way of quantitatively describing this phenomenon, is enormously difficult. This largely comes from the rich diversity of systems which can be called complex and from their placement at the interface of different scientific disciplines. In consequence, despite several tens of proposed measures [45], none is versatile enough to be applicable in all cases. In general, a few main groups of measures, referring to different aspects of the structure or evolution of complex systems, can be distinguished.

A short review of the measures in use may fittingly begin with the deterministic measures introduced in information theory. It is in this discipline that the notion of complexity occurred for the first time and where the first attempts at formulating its formal definition were carried out. The reasoning goes as follows: if one deals with a measure describing complexity, its natural expression with respect to a given problem is how complicated the problem's description is. From a deterministic perspective, by the complexity of a problem one may understand the amount of resources that have to be used in order to solve the problem exactly. Here a classic quantity is the algorithmic complexity of a symbolic sequence, introduced by Kolmogorov [46], which is the binary length of the shortest algorithm that can reproduce the given sequence. This quantity, even though in practice it cannot be calculated exactly, plays an important role in information theory, allowing one to estimate the asymptotic behaviour of the demand for computing resources with growing problem size [47]. It is worth remarking that the algorithmic complexity makes sense as a complexity measure only with respect to data containing some regular sequences; in the case of random data it loses functionality, because its indications then contradict the intuitive understanding of randomness as a trivial phenomenon. A possible course is the introduction of a measure sensitive only to regularities and neglecting random sequences by design (the effective complexity [47,48]), or a measure describing the time needed to reproduce a given sequence of symbols from its compressed representation (the logical depth [49,50]). In the latter, the requirement for the algorithm to be of minimum length (minimum algorithmic complexity) is relaxed, and longer yet temporally more effective algorithms are admissible. In this view, complex (''deep'') problems require large computational effort. It can be shown that, in accordance with this definition, completely random sequences are logically ''shallow'' [49]. Logical depth may also be considered a measure of the structural complexity of physical systems. This is because the processes leading to the formation of systems with a high degree of inner organization need a long time to act; it is rather improbable that a simple system becomes complex in a fast process. This physically reflects the long computation time essential to generate complex information [51].

Staying at the interface between information theory and physics, one may introduce the notion of complexity as computing capacity. It states that a system may be called complex if it is equivalent to a universal Turing machine [52].
As was shown for some cellular automata models and quasi-physical systems (e.g., the hard-sphere gas [53]), with carefully selected initial and boundary conditions such models can be computationally universal and can simulate any physical process. This implies that from an initially trivial structure such a model system can evolve to an arbitrarily complex structure. An advantage of using this definition of complexity is that, in order to assign a given system to a particular category, it suffices to show that the system is capable (or incapable) of universal computation. From the pure physics perspective, however, such an approach has a serious drawback: it does not distinguish between systems which have truly evolved to a complex state (their actual state is an effect of long-acting processes) and systems which are potentially capable of such evolution but nevertheless remain in a state of simple structure [51].

Statistical measures of complexity express it as the amount of excess information on the macroscopic state of a system that can be obtained by analysing its microscopic state. An example of measures from this group is the information entropy [54], defining the amount of information transferred in a message or obtained in a measurement. If p_i is the probability that a system can be found in a state i selected out of N allowed states, then the information entropy H_I is given by:

H_I = -\sum_{i=1}^{N} p_i \ln p_i.    (1)

This formula gives the number of bits of information about the system's state which can be obtained by receiving a message or taking a measurement. Although the information entropy is a useful measure, it assigns the highest values to messages or states that result from a series of random events. Because of this property, the information entropy cannot be considered an appropriate tool for quantifying complexity.

Other available statistical measures that can be considered in the context of complexity are founded on the observation that systems in thermodynamical equilibrium with the environment exhibit rather simple structure, in contrast to the systems commonly referred to as complex – e.g., a living cell, a human brain, an ecosystem – which remain in a state of strong non-equilibrium with the environment. What comes to mind in this context is the notion of free energy stored in a physical system. However, the amount of free energy in a given system is not proportional to its intuitive level of complexity. As an example one may consider a plant which, being a dissipative system, has lower free energy than the mineral and organic compounds from which it takes energy, but at the same time has an incomparably more complex structure. This issue can be worked around by introducing the thermodynamical depth [55], a physical counterpart of the logical depth, quantifying the minimum entropy of the physical processes and events which were potentially able to create the system. (This quantity can be viewed as a formal extension of the earlier concept of negentropy [56].)


The thermodynamical depth of a system in a state X is defined by

D_T(X) = \min_i \{ -k \ln p_i(X) \},    (2)

where p_i(X) is the probability that the system arrived at the state X along route i, and k is the Boltzmann constant. However, taking the minimum over all feasible trajectories i is not possible in practice; one therefore has to choose the most probable trajectory instead. This measure can be understood as expressing the entropy transferred to degrees of freedom other than those which characterize the current state of the system. It is a measure of the thermodynamical effort which has to be made in order to create the given system. Although the thermodynamical depth works well as a qualitative tool for characterizing physical systems and is consistent with intuition, it largely fails as a method of quantitative description of such systems, because it is virtually impossible to provide the complete series of thermodynamical events forming the history of a system. Moreover, there is another problem that is difficult to decide, namely, whether one has to consider only those events which are related to the system on the considered level of hierarchy, or whether one must also take into consideration the deeper levels of structure.

Viewing the system from a more global perspective and taking into account the fact that the system's structure can be described by different degrees of complexity on different levels of organization (different scales) is the characteristic property of multiscale complexity [57,58]. This measure is based on the information entropy and quantitatively defines the system's complexity as a sum of the entropies of subsystems on its different levels of organization. Thanks to its properties, especially the functional dependence on scale, the multiscale complexity permits one to distinguish between systems which exhibit intuitive complexity and systems which do not. Applicability of the multiscale complexity is, however, restricted to systems whose components' states are described by a known probability distribution, which makes applying this measure to natural systems rather difficult.

Another view of complexity refers to statistical dependences among a system's constituents: the more elements show some functional connections, the higher the complexity of the analysed system (the elements have to be distant enough to ensure that their dependences are not a trivial effect of short-range interactions between close neighbours). This can be expressed by the information entropy, which for such correlated elements/subsystems ceases to be additive. The so-interpreted complexity can be quantified, e.g., in terms of mutual information, which describes the amount of information about a subsystem that can be obtained by observing the other subsystems [59]. For two subsystems X_1, X_2, the mutual information is defined by:

I(X_1, X_2) = H_I(X_1) + H_I(X_2) - H_I(X_1, X_2),    (3)

where H_I(X_1, X_2) is the joint entropy of X_1 and X_2.

According to this measure, a system can be considered complex if 0 < I(X_1, X_2) < H_I(X_1) and, in addition, it fulfils the condition that the observed statistical dependences are a consequence of a slow process rather than of sudden global events [60].

Finally, complexity can be viewed as irregularity of a system's structure. An intuitive example of a related quantitative measure is the fractal dimension. It can be applied to describe the spatial structure and temporal evolution of both mathematical objects and physical systems. On the other hand, it cannot be used to describe systems which do not possess any obvious fractal structure, even if they do reveal complex organization and complex activity patterns, like, e.g., a living cell or a financial market.
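As a concrete, hedged illustration of the information-theoretic measures just discussed – not part of the original review – the following Python sketch estimates the information entropy of Eq. (1) and the mutual information of Eq. (3) from histograms of two sampled signals; the signal construction and the bin count are illustrative assumptions only.

```python
import numpy as np

def entropy(p):
    """Information entropy H_I = -sum_i p_i ln p_i, Eq. (1)."""
    p = p[p > 0]                      # 0 * ln(0) -> 0 by convention
    return -np.sum(p * np.log(p))

def mutual_information(x1, x2, bins=16):
    """I(X1,X2) = H_I(X1) + H_I(X2) - H_I(X1,X2), Eq. (3)."""
    joint, _, _ = np.histogram2d(x1, x2, bins=bins)
    p12 = joint / joint.sum()                   # joint probabilities
    p1, p2 = p12.sum(axis=1), p12.sum(axis=0)   # marginal probabilities
    return entropy(p1) + entropy(p2) - entropy(p12.ravel())

# Two subsystems driven by a common component share information,
# while an independent pair does not (synthetic signals).
rng = np.random.default_rng(1)
common = rng.normal(size=50_000)
x1 = common + 0.5 * rng.normal(size=50_000)
x2 = common + 0.5 * rng.normal(size=50_000)
print(mutual_information(x1, x2))                       # clearly positive
print(mutual_information(x1, rng.normal(size=50_000)))  # close to zero
```

For the coupled pair the estimate is distinctly positive, while for independent signals it stays near zero, up to the small positive bias inherent in histogram estimators.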

The above review does not exhaust all the ways in which complexity can be defined or expressed. It does, however, allow one to note that any attempt at quantitatively describing so diverse a class of complex systems and complex signals by means of a single measure can be deemed hopeless. Even if some quantity proves useful for some aspect of the phenomenon of complexity and gives results that agree with intuition, one faces profound difficulties either with applying it in empirical studies (as with the thermodynamical and logical depths) or with its insufficient universality as a decisive measure in each case (as with the fractal dimension or the mutual information). For that reason, instead of such an approach, which may be called holistic, a more local one is necessary. This means that one has to concentrate on looking for the properties that are characteristic of complex systems and describe these properties quantitatively with the help of tools (from mathematics, statistical physics, or information theory) adequate to the particular situation. In this sense, identification of such quantitative characteristics in a given system may be considered a manifestation of its complex structure. In physics, this approach to research on the complexity of natural systems nowadays seems to be the most common.

1.3. Properties of complex systems

In this section we shall review a few of the most characteristic properties of the structure and behaviour of systems which are commonly referred to as complex. Presenting such a review is possible because complex systems differing in morphology, built from distinct elements on a microscopic level, and even occupying different levels of the hierarchical organization of matter (some systems are built from simple elements, i.e., elements not exhibiting complexity, while other systems are built from elements that are complex systems in their own right) can exhibit amazingly similar macroscopic structure and also similar behaviour in specific situations. This may constitute – although this is still essentially an open problem – a manifestation of the universality of the physical phenomena underlying the existence of such systems.

Fig. 1. Bifurcation of the stable solution of Eq. (4), observed as the parameter γ decreases through 0. Stable solutions are denoted by solid lines and the unstable solution by a dashed line.

1.3.1. Self-organization

A property which distinguishes complex systems from other natural systems is self-organization. This is a process of continuous modification of a system's internal structure owing to which order spontaneously emerges and complexity increases. The process is facilitated by the interaction of the system with its environment. The phenomenon of self-organization is particularly striking in biological and social systems, where cooperation and specialization of the elements occur at different levels of organization, but it can also be observed, in a less spectacular way, in other systems and processes as well, e.g., in chemical reactions, in crystal growth, or in the formation of dunes. Due to self-organization, from initial microscopic disorder a strong order emerges in the form of macroscopic structure and global activity patterns.

The mechanism of self-organization has its origin in non-equilibrium processes in which – by changing some external parameter – an initially stable system passes through an unstable fixed point [61]. The systems in which such a phenomenon occurs have to be situated far from the thermodynamical equilibrium point. In an equilibrium state, which is a stable solution of the equations describing the system's dynamics, any fluctuations destabilizing the system are damped and do not cause any persistent macroscopic effects. Even if the boundary conditions imposed on the system do not allow it to return to the equilibrium state exactly, it tends to reside as close to this state as possible, which is expressed by the principle of minimum entropy production [62]. A different picture can be seen far from equilibrium where – due to the changing value of a control parameter as a consequence of permanent interaction with the environment – other stable solutions can appear through bifurcation (Fig. 1). Then, on passing through a bifurcation point, the system becomes unstable and even tiny microscopic fluctuations may be nonlinearly amplified, causing the system to move to a new stable state. A so-called dissipative structure is formed (the system produces entropy). This may be described as an emergence of order due to spontaneous symmetry breaking. Mathematically, it is expressed by a non-zero value of the system-specific order parameter. The appearance of a bifurcation can be discussed using the simplified example of an anharmonic oscillator described by the equation:

\dot{q}(t) = -\gamma q(t) - \alpha q^3(t), \quad \alpha > 0.    (4)

In an equilibrium state \dot{q}(t) = 0, and the solutions of Eq. (4) then read:

\gamma > 0: \quad q = 0,    (5)

\gamma < 0: \quad q_0 = 0 \ \mathrm{and} \ q_{1,2} = \pm\sqrt{|\gamma|/\alpha}.    (6)

In the first case the unique solution is stable, while in the second case only the solutions q_1 and q_2 are stable. The potential V(q) associated with these two cases is shown in Fig. 2.

One of the simplest systems in which the very occurrence of a sufficiently large deviation from equilibrium causes spontaneous self-organization is a liquid layer confined between two horizontal plates in a gravitational field. The lower and upper plates have temperatures T_1 and T_2, respectively, satisfying the condition T_1 > T_2. With a sufficiently large temperature difference, convection sets in and one observes the occurrence of Bénard cells, which reflect an increased degree of inner organization. Other examples of spontaneous symmetry breaking are the occurrence of non-zero magnetization of a ferromagnetic sample below the Curie point under an external variable magnetic field, and the onset of laser action once a sufficiently strong population inversion in the atoms has been achieved. A similar principle underlies the process of cellular differentiation, which is fundamental for living organisms [63].
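The bifurcation described by Eqs. (4)–(6) is straightforward to verify numerically. The sketch below – a minimal illustration with arbitrarily chosen parameter values, not code from the review – integrates Eq. (4) with the Euler method for both signs of γ.

```python
# Euler integration of Eq. (4): dq/dt = -gamma*q - alpha*q**3.
alpha, dt, n_steps = 1.0, 1e-3, 100_000

def relax(gamma, q0=0.1):
    """Relax an initial perturbation q0 under Eq. (4)."""
    q = q0
    for _ in range(n_steps):
        q += (-gamma * q - alpha * q ** 3) * dt
    return q

print(relax(+0.5))           # gamma > 0: decays to the unique stable state q = 0
print(relax(-0.5))           # gamma < 0: settles at +sqrt(|gamma|/alpha) ~ 0.707
print(relax(-0.5, q0=-0.1))  # ...or at the symmetric minimum -sqrt(|gamma|/alpha)
```

The sign of the initial fluctuation decides which of the two equivalent minima of Fig. 2(b) is selected, which is precisely the spontaneous symmetry breaking described above.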

Fig. 2. Anharmonic oscillator potential corresponding to Eq. (4) as a function of the order parameter q. (a) For γ > 0 the unique solution q = 0 is stable. (b) For γ < 0 the solution q = 0 becomes unstable; fluctuations lead to spontaneous symmetry breaking and the system falls into one of the two equivalent potential minima.

1.3.2. Coexistence of collective effects and noise

The phenomenon of self-organization can be viewed as an emergence of order due to a significant reduction of the effective number of degrees of freedom in a system when – after a change of the values of external parameters – it gets close to an instability point.
Let a system with n degrees of freedom, represented by variables q_i(t), be described by equations resembling Eq. (4):

\dot{q}_i(t) = -\gamma_i q_i(t) + f_i(q_i(t)), \quad i = 1, \ldots, n,    (7)

where \gamma_i \geq 0 and the functions f_i are nonlinear. Let two groups of degrees of freedom be distinguishable in this system: the stable (or fast) degrees q_i^{(s)}, in which perturbations are quickly damped, and the unstable (or slow) degrees q_j^{(u)}, in which the damping is weak (\gamma_i^{(s)} \gg \gamma_j^{(u)}). If an unstable degree is perturbed, it can subordinate the other degrees:

q_i^{(s)}(t) \simeq g(q_j^{(u)}(t)),    (8)

where g is in general a nonlinear function. Such an unstable degree q_j^{(u)} can be identified with an order parameter. Eq. (8) expresses the so-called slaving principle [43].

The phenomenon of the decrease of the effective number of degrees of freedom in a system manifests itself in the collective behaviours which characterize the system on a macroscopic level. In natural complex systems it rarely happens that all the degrees of freedom undergo coherent evolution. More commonly, the coupled degrees coexist with degrees displaying individual behaviour. The proportion between the number of collective and the number of individual degrees varies from system to system and also depends on the particular state in which a system is found. There exist states of profound coherence (alpha waves in the brain, panic in financial markets, bird migrations, to list a few) as well as states in which collective behaviour is much weaker (vigilance of the brain, stagnation periods in financial markets, birds' flights around the breeding ground). However, even in an ordered state, a marginal presence of noise is favourable, since it facilitates the flexibility of the system and its ability to penetrate the phase space in search of new stable states.
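The slaving principle of Eq. (8) can be made concrete with Haken's classic two-mode example – our illustrative choice, not a model specified in the review – in which a weakly damped slow mode q_u is coupled to a strongly damped fast mode q_s. After a short transient the fast mode adiabatically follows the slow one, q_s(t) ≈ q_u(t)²/γ_s, so q_u acts as the order parameter.

```python
# Hedged sketch of the slaving principle, Eq. (8), for Haken's
# two-mode system: dq_u/dt = gamma_u*q_u - q_u*q_s (slow, weakly damped),
#                  dq_s/dt = -gamma_s*q_s + q_u**2 (fast, strongly damped).
gamma_u, gamma_s = 0.05, 5.0     # gamma_s >> gamma_u
dt, n_steps = 1e-3, 200_000

q_u, q_s = 0.01, 0.0             # a small initial fluctuation
for _ in range(n_steps):
    dq_u = (gamma_u * q_u - q_u * q_s) * dt
    dq_s = (-gamma_s * q_s + q_u ** 2) * dt
    q_u, q_s = q_u + dq_u, q_s + dq_s

# The fast degree of freedom is 'slaved': q_s(t) ~= g(q_u(t)) = q_u(t)**2/gamma_s.
print(q_s, q_u ** 2 / gamma_s)   # the two values nearly coincide
```

The effective dynamics is thus governed by the single slow variable, in line with the reduction of the number of degrees of freedom discussed above.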
1.3.3. Variability and adaptability

Emergence of order does not have to be permanent. Sooner or later the external parameters change their values once again and the previous state of the system becomes unstable. The system passes through the next bifurcation point and reaches, as before, a new stable state (Fig. 2). Another possible situation arises if, in the close vicinity of the potential minimum in which the system resides, there exists or there appears a new, favourable minimum separated from the former one by a potential barrier. Fluctuations then allow the system to overcome the barrier and jump to the more stable state. This situation is illustrated in Fig. 3.

Fig. 3. An example of a potential with three stable points. Initially, the system resides in a metastable state at q_2, but due to fluctuations it overcomes a barrier and passes to one of the two favourable states at q_1 or q_3.

In this new state the system can have a different structure and reveal different behaviour than in the preceding state. In this way the evolution of the system runs through a series of consecutive metastable states and transitions between them. Reaching a state that is more stable than the preceding one can be viewed as an increase of the system's adaptation to external conditions. Macroscopically, the more flexible the system's internal structure (the better its reorganization capability), the faster it can adapt as a whole. A different picture can be seen at the microscale, where some degrees of freedom can undergo destruction while new ones are created at the same time. This, however, does not have any impact on the very existence of the system at the macroscale. It is well visible in biological systems, in which individual cells (in multicellular organisms) or species (in ecosystems) can be recognized as degrees of freedom. Which particular degrees survive a transition through unstable states depends on their ability to adapt to the new conditions.

Let us consider an exemplary physical system – a laser. A parameter that can determine the fitness of a given degree of freedom ω_i (i.e., an electromagnetic field mode) is the lifetime τ_i of the corresponding photon inside the laser cavity. It can be shown that if different modes are characterized by different values of τ_i, the laser action occurs for the mode with the longest lifetime τ_max, which is also the one closest to the resonance frequency. The vanishing of the other modes while the laser passes through the action threshold (which is a kind of phase transition) may be described as a competition between the modes, ended with the win of the fittest one [43].

The above example connects the physical perspective, expressed in terms of processes near a critical point, with the evolutionary perspective based on the Darwinian theory of natural selection.


According to the latter, a key characteristic of a given degree of freedom (e.g., a species) is its ability to reproduce under the pressure exerted by competition for limited resources with other degrees of freedom. This competition critically increases when the system is destabilized by either a change of external parameters (climate, catastrophes, etc.) or the sudden occurrence of a new degree of freedom due to a random mutation. Which degree then wins is determined by its ability to increase its strength (e.g., the species' population) at the expense of other degrees. Once new relations among the degrees of freedom are established, the system arrives at a new equilibrium state.

Adaptability requires the existence of both fluctuations in a system and relatively low potential barriers between stable states. These two conditions allow the system to transit quickly from one minimum of the potential to another. On the other hand, the fluctuations must not be too strong and the barriers must not be too low if the system is to have any stable states at all. The competition between these two contradictory requirements can be controlled by the system through manipulation of its size: the larger the system, the relatively weaker the fluctuations (they grow as N^{1/2}, where N is the system's size). A related issue is the influence of the environment via the surface exchange of energy and matter, which is strongest for small systems. Since such exchange tends to restore the thermodynamical equilibrium between the system and the environment, only large systems, in which surface interactions can be neglected, can show permanent self-organization. This is one of the reasons why stable complex systems have to consist of a large number of degrees of freedom.

1.3.4. Hierarchical structure

The majority of complex systems display multilevel structural organization, in which individual elements at higher structural levels are themselves complex systems at lower structural levels. Such a multilevel organization has an important advantage: the formation of higher structural levels increases the number of structural configurations available to the system and, in consequence, amplifies the system's optimization potential. A principal mechanism which makes the emergence of the higher-order structural forms possible is the above-described slaving principle: a coupling of a number of (situation-specific) degrees of freedom which leads to an effective decrease of the total number of such degrees in a system, in the extreme case even to a single one. From an external observer's point of view the whole system begins to function as if it had only a single degree of freedom (for example, a single laser mode or the non-zero magnetic moment of a ferromagnetic domain). If there is more than one such system with strongly coupled degrees of freedom and if these systems interact, they can form a higher-level complex structure.

In more complicated systems, especially chemical, biological and social ones, a coupling of the degrees of freedom and the resulting hierarchical structure can be formed by way of transferring the Darwinian evolution from a lower level of organization to a higher one [26]. This happens when, at some moment, the amount of available resources is sufficient so that not all the degrees of freedom have to compete for them.
It may then happen that the appearance of interdependences among some of the degrees provides them with more benefits than does competing against each other. While cooperating, the degrees from such a group become, however, vulnerable to the destruction of any of them. To prevent such a scenario, the group develops a barrier which separates it from the external world. From this moment the group becomes a single entity at a higher level of the hierarchy and it can start competing against similar entities in the Darwinian way. A system with a multilevel hierarchical structure that may have evolved in this manner is a living multicellular organism; constructive cooperation between degrees of freedom is evident here on three levels: the molecular level, the level of cellular constituents, and the cellular level itself.

Another type of hierarchical structure is one in which a hierarchy emerges among the elements forming a system. This can be seen, for example, in complex networks, where nodes with many connections coexist with nodes having few connections or even a single one, in the settlement structure of a given territory, where more populated settlements coexist with less populated ones, or in the structure of seismic faults of different lengths.


1.3.5. Scale invariance

There is another common property of complex systems which is related to the hierarchical structure discussed above: the lack of a characteristic scale. This means that both the structure of, and the processes taking place inside, such systems are the same over a broad range of spatial and temporal scales. The examples are ubiquitous; apart from the ones mentioned in the preceding section, these are: the structure of the circulatory and the nervous system, the coastline [18], crystal growth processes [64], the earthquake frequency dependence on the magnitude [65], the functional dependence of the vortex velocity difference on the distance between the vortices in turbulence [46], the dependence of the animal metabolic rate on the animal mass [66,67], the distribution of wealth in a society [68], of landslide volumes [69], of the number of links pointing to web pages [70], and so on [71]. Various mechanisms have been proposed in an attempt to account for this universality of scale invariance in nature: diffusion-limited aggregation [72], the Yule processes [73] and preferential attachment [70], random walks [74], a superposition of exponentially distributed processes, and others [71]. Only some of them may be associated with the phenomenon of complexity, while the others are rather trivial. Self-organization is closely related to critical phenomena; therefore the scale invariance, which is universal for such phenomena, may in a natural way be among the most important mechanisms leading to the scale invariance in complex systems [75].

A quantity that describes the scale characteristic for a given phenomenon is the correlation length ξ. Let the equal-time spatial correlation function be defined by the following equation:

C(r) = ⟨ρ(x)ρ(r − x)⟩_x,   (9)

where ρ(x) is a physical quantity specific for a given problem (spin of a particle, probability of finding a particle at a point x, etc.). In typical conditions above a critical point, the correlation function vanishes exponentially:

C(r) ∼ e^{−r/ξ}.   (10)

The correlation length defines two symmetry regimes in a system. For r < ξ, the degrees of freedom are mutually dependent (the symmetry is broken), so that any fluctuations are amplified to such a degree that calculating average values of the physical quantities makes no sense. In the opposite case of r ≫ ξ, on the other hand, regions of size r are independent, the symmetry is thus preserved, and the average values can be determined. In systems in which phase transitions occur, the correlation length depends on an external control parameter Θ (e.g., temperature), and especially on the difference between the actual value of the parameter and its critical value Θ_c at which the transition takes place. Because of the interactions, on approaching the critical point ((Θ − Θ_c) → 0+), correlations develop among the initially weakly correlated degrees of freedom (small ξ), whose range grows the closer the critical point is. In the vicinity of this point, the length ξ diverges as a power law:

ξ ∼ |Θ − Θ_c|^{−ν},   ν > 0.   (11)

The exponential dependence of the correlation function (10) also transforms into a power-law dependence:

C(r) ∼ r^{−α},   α > 0.   (12)

In practice, this means that the maximal range of the correlations is determined only by the system’s size. Under such conditions, the collective behaviour of the degrees of freedom can be observed on all available scales and the fluctuations can propagate over arbitrarily long distances. The system thus becomes extremely sensitive to even small perturbations. The lack of a distinguished correlation scale translates directly into a scale-free structure.

Scale invariance of the correlations can be observed in the temporal domain as well. Here the correlation function

C(τ) = ⟨ρ(t)ρ(t − τ)⟩_t   (13)

also behaves as a power law near the critical point; the power spectrum

S(f) = ∫_{−∞}^{+∞} C(τ) e^{−2πifτ} dτ   (14)

then reveals a 1/f^β-type behaviour with β > 0 (‘‘the 1/f noise’’). An especially interesting example of scale invariance is the situation in which power-law relations manifest themselves differently in different parts of the system, in different parts of a signal, or at different scales. This may happen if the system or the signal is multifractal (this issue will be discussed extensively in Part 7). The multifractal character of an observable is a result of nonlinear processes governing the evolution of this observable; nonlinearity, in turn, is among the fundamentals of the structure of complex systems. Obviously, in real systems the scale invariance ceases to be valid at very small and very large scales, due to the system’s finite size and other imposed boundary conditions. Power laws can then be altered by, for example, more strongly convergent exponentials at large scales.
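As an illustration of Eq. (14), the exponent β can be estimated directly from a measured signal. The following minimal sketch (in Python; the random-walk test signal and all parameter values are our illustrative choices, not part of the original analysis) computes a periodogram estimate of S(f) and fits β by least squares:

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.cumsum(rng.standard_normal(2**14))    # toy signal: random walk, expect beta ~ 2

    # periodogram estimate of the power spectrum S(f), Eq. (14)
    spectrum = np.abs(np.fft.rfft(x - x.mean()))**2
    freq = np.fft.rfftfreq(x.size, d=1.0)
    mask = freq > 0                              # discard the zero-frequency bin

    # least-squares fit of log S(f) = -beta log f + const
    slope, intercept = np.polyfit(np.log(freq[mask]), np.log(spectrum[mask]), 1)
    print(f"estimated beta: {-slope:.2f}")

In practice the raw periodogram is noisy, so the spectrum is usually averaged over windows or logarithmically spaced frequency bins before fitting.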


1.3.6. Self-organized criticality

In the classic formulation of the theory of phase transitions, the passing of a system through a critical point results from a change of an external parameter controlled by the environment. The scale invariance observed in many complex systems may suggest that such systems are situated in or near a critical state. This would be unlikely, however, if there were no inner mechanism steering the system towards the critical state, which occupies the borderline between order and disorder. The role of this mechanism may be played by self-organized criticality (SOC) [76], introduced in an attempt to explain the omnipresence of the 1/f noise (seen, e.g., in electrical conductance [77–79], river discharges [80], ocean currents [81], quasar emissions [82], human cognition [83], coordination [84], vision [85], heart rate [86–88], music [89], and so on [90,91]). An archetypal model of the SOC phenomenon is the evolution of a sand pile formed by adding sand grains at a constant rate to a flat open surface from which they can freely and irreversibly fall into the environment. As more sand is added, the pile becomes steeper and its slope angle increases. If the angle is small (the pile is relatively flat), the successively falling grains either do not generate any avalanches or the avalanches have a local character only. As the pile angle increases, the avalanches become larger and more frequent. After crossing some threshold angle value α_c, the pile is so steep that adding a single grain is sufficient to evoke a global catastrophic avalanche, which flattens the pile so that the angle drops below the threshold α_c. If the process of adding grains continues, the cycle repeats. Both theoretical considerations and computer simulations have shown [76,92] that such a system evolves towards a stationary state, in which the angle oscillates in a narrow range around α_c. In a close neighbourhood of α_c a hierarchy spontaneously emerges: both the size of the avalanches and their duration are described by power laws. A necessary condition for the system to reach the stationary state is its openness; boundary dissipation of the sand must compensate the inflow of new grains. Another important condition refers to the system’s relaxation time, which has to be considerably shorter than the average interval between successive grains. This ensures that the avalanches themselves are not perturbed. A simple numerical realization of the SOC model is a cellular automaton (called the Abelian sandpile) with the following rules [76]:

h_i = h_i + 1,   h_i ≤ h_tr,   (15)

h_i = h_i − 2D,   h_i > h_tr,   (16)

h_j = h_j + 1,   (17)

where h_i is the value (‘‘height’’) of the ith cell, h_tr is the critical threshold above which the ‘‘sand pile’’ undergoes a catastrophic slide, and the j’s are the neighbours of the cell i (there are 2D such neighbours, where D is the dimensionality of the space). In each step of the model’s evolution a cell i is randomly selected and the above rules are applied to it (a minimal numerical sketch of these rules is given at the end of this subsection). If the drop of a single ‘‘grain’’ is taken as the unit of the dissipated energy φ(x, t) at a point x, then the total energy dissipated by the system in unit time and its power spectrum are given, respectively, by:

F(t) = ∫ φ(x, t) dx,   (18)

S(f) = ∫ ⟨F(t_0 + t)F(t_0)⟩ e^{−2πift} dt ≃ f^{−β},   β > 0,   (19)

where the average ⟨·⟩ is taken over all the moments t_0. The resulting spectrum S(f) is just the 1/f noise. However, in spite of the convincing results of computer simulations regarding the scale invariance of the model, its real-world realizations offer rather confusing indications, not always supporting the scale-free hypothesis [93,94]. This disagreement between the experiments and the numerical simulations can be explained by the different geometries of the materials used. On the other hand, an advantage of the model is the universality of the scenario: the system self-organizes into a critical state for a broad spectrum of the model’s particular realizations and a broad range of boundary conditions. For that reason, SOC is among the most fundamental models used for describing the evolution of complex systems. Its characteristic property is that it treats sudden changes in the system’s structure, both the local and the global catastrophic ones, as an inherent part of the inner dynamics of the system. This has a profound practical consequence: as the nature and development of each catastrophic event, like, e.g., a large earthquake, does not differ completely from the nature and development of any of the small events, our capability of predicting such catastrophic events seems to be seriously questioned [71]. Owing to its catastrophic character, the SOC model has found many applications in the description of natural and social crises like forest fires, earthquakes, landslides, mass extinctions of species, epidemics, military conflicts, and financial market panics [42]. Within its framework, the observation that many real-world complex systems exhibit discontinuous evolution (i.e., long periods of relative stability and equilibrium are interrupted by short periods of ‘‘turbulence’’ strongly affecting the system’s state) can naturally be accounted for. A classic example of such a system is the biosphere. Throughout its history the essential changes occurred mainly as an effect of natural catastrophes leading to the mass extinctions of some groups of species and to the accelerated development of other groups. The SOC-based model of biosphere evolution [95] may be regarded as a physics-born counterpart of the earlier-formulated phenomenological theory of punctuated equilibria [96]. It is worth noting, however, that the statistics of the most catastrophic events which can happen in various complex systems shows that such really extreme events occur even more frequently than one might conclude from simple approximations of the experimental data by power laws. This over-representation of such events (thus called the outliers) cannot be explained on the grounds of the SOC model. One therefore needs to consider some modifications of this model, which – contrary to the generic one – open some space for successful predictions [97,98].
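The rules (15)–(17) are easy to implement directly. The sketch below (Python; the grid size, threshold and number of grains are arbitrary illustrative choices) simulates the D = 2 Abelian sandpile with open boundaries and records the avalanche sizes, whose distribution should approach a power law in the stationary state:

    import numpy as np

    rng = np.random.default_rng(1)
    L, h_tr = 50, 3                     # grid size and critical threshold
    h = np.zeros((L, L), dtype=int)
    sizes = []                          # avalanche sizes

    for _ in range(100_000):
        i, j = rng.integers(0, L, size=2)
        h[i, j] += 1                                      # rule (15): drop a grain
        size, unstable = 0, [(i, j)]
        while unstable:
            a, b = unstable.pop()
            if h[a, b] <= h_tr:
                continue
            h[a, b] -= 4                                  # rule (16): topple (2D = 4)
            size += 1
            if h[a, b] > h_tr:                            # may have to topple again
                unstable.append((a, b))
            for na, nb in ((a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)):
                if 0 <= na < L and 0 <= nb < L:
                    h[na, nb] += 1                        # rule (17): neighbours gain a grain
                    if h[na, nb] > h_tr:
                        unstable.append((na, nb))
        if size > 0:
            sizes.append(size)

Grains pushed beyond the grid edge are simply lost, which provides the boundary dissipation required for the stationary state.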


1.3.7. Highly optimized tolerance

Complex systems can also be viewed from the perspective of engineering. They appear here as systems whose structure is designed to survive potential threats. This perspective is drastically opposite to the one represented by the SOC model, where changes due to catastrophes are something natural. According to the theory of highly optimized tolerance (HOT), catastrophic events of any size are undesired by the system, which is protected from such risks by its internal organization [99,100]. A complex system, as understood by HOT, is highly robust against both internal perturbations and errors and unfavourable external factors (i.e., it has high tolerance). The tolerance range is broadened by a precise ‘‘tuning’’ of the system’s structure in order to cope successfully with the working conditions (high optimization), which can typically be achieved by increasing its complexity. Such an understanding of complex systems can be valid in the case of natural systems, especially biological and ecological ones, as well as in the case of artificial systems like technical devices and social structures. As the latter two show, HOT allows one to include among complex systems those whose complex structure is produced by design rather than by spontaneous self-organization. HOT seems to be a more realistic model than SOC in the sense that, instead of the oversimplified assumption about the homogeneity of the described systems, which is crucial in SOC, HOT puts stress on the diversity (heterogeneity) of the structural elements and on a multiparametric description. Following Ref. [100], the idea behind HOT can be illustrated by the example of a modern passenger jet equipped with a number of complicated electronic devices and circuits. Such a jet – despite performing roughly the same function as older and much simpler aeroplane models achieving comparable flight parameters – is far more robust against changes of atmospheric conditions, minor faults, or pilots’ indisposition. Mathematically, the HOT model is based on an optimization of the system’s parameters in order to obtain maximum effectiveness [99]. In a simple case, let x be an event whose cost equals K(x), let Z(x) be the resources available at x, and let p(x) be the event’s probability. Now, if the resources are globally limited:

∫_X Z(x) dx = const,   (20)

and the cost is power-law dependent on the local resources:

K(x) = Z^{−α}(x),   α > 0,   (21)

then, by minimizing the expectation value of the cost, δ(E[K(x)]) = 0, under the condition (20), we obtain a function of power-law type:

p(x) ∼ Z^{α+1}(x) = K^{−(1+1/α)}(x).   (22)
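The variational step leading from Eqs. (20)–(21) to Eq. (22) is left implicit here; the following sketch (in our notation, with λ a Lagrange multiplier introduced for this illustration) fills in the missing reasoning:

    % Sketch of the constrained minimization behind Eq. (22):
    % minimize E[K] = \int_X p(x) Z^{-\alpha}(x) dx subject to Eq. (20).
    \begin{align*}
      \frac{\delta}{\delta Z(x)} \int_X \left[ p(x)\, Z^{-\alpha}(x)
          + \lambda\, Z(x) \right] \mathrm{d}x
        &= -\alpha\, p(x)\, Z^{-(\alpha+1)}(x) + \lambda = 0 \\
      \Longrightarrow \quad p(x)
        &= \frac{\lambda}{\alpha}\, Z^{\alpha+1}(x)
        \;\propto\; K^{-(1+1/\alpha)}(x),
    \end{align*}

which reproduces Eq. (22) up to the normalization of p(x).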

So, the minimization of the cost, provided it is power-law dependent on the locally available resources, leads to a rule stating that the optimal way of resource allocation is to locate most of them in regions with high p(x), which allows the system to withstand the typical events. The other side of the coin is, however, that such highly optimized, tolerant systems are significantly vulnerable to very rare, unusual events (e.g., cascading failures), whose occurrence, according to Eq. (22), generates high costs. A similar problem can arise when the system’s structure itself is even marginally altered: for example, when the positions of two nucleotides in a DNA chain are exchanged or two code lines are randomly swapped in a computer program. The system then loses part of its functionality or, in the extreme case, stops functioning at all. This property distinguishes HOT from SOC, in which the sensitivity to changes of boundary conditions is weak.

The foundations of the HOT model touch the problem of complexity limits: the more complex a system is, the higher is the probability that fluctuations (or errors) destabilizing it occur. However, the point of view of HOT is, in a sense, the opposite: the growing complexity of the system leads to an improvement of its function and stability. It is noteworthy that although the models of HOT and SOC were proposed in order to explain the 1/f dependences, an observer who performs a series of measurements of some observable and discovers its power-law scaling is unable to answer, relying on the signal alone, the question of whether the properties of the studied signal stem from the HOT or the SOC scenario, or whether the scaling has its origin merely in the statistical properties of the underlying processes. In such a situation one needs additional knowledge of the structure of the system and the processes taking place there.

It follows from the arguments presented in Section 1.2 that, having collected data from measurements of observables associated with natural systems, it is difficult, or sometimes even impossible, to express by means of a single quantity the diversity of properties that make such systems complex. It seems to us that at present a more appropriate approach is to analyse the data so as to identify there the particular manifestations of complexity. This approach will therefore be applied throughout this article. In the subsequent chapters, starting from Part 3, we will present and discuss results of analyses of the data collected from several systems which a priori are known to be complex: the human brain, natural language, and financial markets. Our studies will be carried out in order to observe in these data the properties that are typical for complex systems: the coexistence of noise and collective phenomena, the hierarchical structure, the power-law dependences, and


so on. It has to be noted, however, that due to the still unsatisfactorily understood nature of the phenomena taking place in complex systems, the problem of whether these properties are indeed a manifestation of complexity, or rather have some simpler interpretation (as in the case of the 1/f noise), must at this stage remain open.

2. Description of selected complex systems

In this section we present a brief description of the complex systems whose data will be analysed in the further parts of the paper: the human brain, natural language, and financial markets. The order in which they are considered here does not correspond to the amount of space devoted to each of them in the paper; rather, it reflects a series which they form, and which itself can illustrate one of the properties of complexity: hierarchical organization. At the base there is the human brain, among whose emergent creations is language, without which, in turn, it would be impossible to develop human civilization and the financial markets that are among this civilization’s products.

2.1. The human brain

The human brain can be viewed as almost an archetypal example of a complex system. The enormous number of neuronal cells (10^10) and their interconnections (10^14), amazing functional and adaptive abilities, and the phenomenon of consciousness make the formulation of any comprehensive, or at least partial, model of the human brain still beyond the reach of contemporary science. The complexity of the brain is observed at different levels of its organization, from the level of molecular interactions, through the structure and action of individual neurons, to the structure of the neuronal network and the network of specialized regions. Its complex structure is, on the one hand, a product of the long evolution of genes and, on the other hand, of the process of self-organization of the brain structure throughout its lifespan.

2.1.1. Functional organization of the brain

The most important degrees of freedom in the brain are neurons. At the level of individual cells, complexity manifests itself both in morphology (the strongly branched structure of the cells, with fractal properties) and in physiology (the strong nonlinearity of processes taking place inside and outside a cell). However, from the macroscopic point of view, if only one observable corresponding to the electric activity of neurons is considered, the neurons can be treated as single degrees of freedom. The electric activity of a neuron can be described as ion currents flowing in dendrites towards the cell’s body and more or less irregular discharges in an axon (action potentials). Paradoxically, even though it is only the fields produced by the ion currents that are, after integrating them over millions of neurons, measurable outside the skull, it is the oscillations and bursts of axonal activity that are considered the basic information carriers in neurons [101]. An individual neuron can be connected with a large number of other cells by synapses, whose number can reach 10^4 [102]. Depending on the type of connection between two cells, the same information can trigger the activity of a neighbouring cell or suppress it. A fundamental property of groups of connected neurons is the synchronization of their activity, owing to which the neurons can switch from states with individual patterns of activity to collective states involving the synchronous activity of many cells. Such transitions are possible because of the complicated structure of excitatory and inhibitory loops which allow even distant neurons to be synchronized [103].
This happens, for instance, in the case of the low-frequency brain waves (the α, β, and θ rhythms) and in some pathological states (like epileptic seizures), when the oscillations involve practically the whole cortex. A source of such global oscillations are generators located in the hippocampus. Although the brain waves are associated with states of relaxation and sleep and, as such, do not play any important role in normal human activity, similar generators of synchronous neuronal activity are also responsible for various other rhythmic actions like breathing, gait, and heartbeat [104]. Synchronization of neuronal activity also plays an important role in information processing and the perception of external stimuli.

The cortex is a hierarchical system, in parallel with the brain as a whole. One can identify the regions that are connected with the sensory organs or with the motor system, as well as the regions without such connections. The former are responsible for perceiving and processing external stimuli, while the latter are responsible for making associations between different pieces of information, decoding the meaning of the stimuli, and making decisions. Fig. 4 shows the approximate locations of a few selected cortical areas in a human brain. These are two sensory-related areas, the auditory cortex and the visual cortex, and two association ones, Broca’s and Wernicke’s areas, responsible for the ability to speak and to understand spoken and written language. Each of these four areas is somehow related to the present work, as will become clear in the following sections.

Each sensory-related cortical region is rich in intrinsic structure and has a specific functional organization. The incoming (afferent) nervous connections from the sensory organs terminate in the primary areas of the relevant cortical region, which becomes increasingly active. In the primary areas certain elementary aspects of the stimuli are analysed and recognized. Information about these aspects then travels to higher-order sensory areas (the secondary, the tertiary, and so on), in which more and more complex properties of the stimuli are identified. Finally, information leaves the sensory cortical region and is transferred to the association areas. This simplified scheme does not reflect the complete organization of the cortex, since


Fig. 4. Lateral view of the left hemisphere with colour-distinguished areas that, for various reasons, are related to this work. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

it comprises a number of additional feedback connections imposed on the basic, hierarchically ascending ones, effectively modifying the information paths and the activation patterns of the associated regions.

The sensory areas are organized according to topographic principles. In accordance with these, each neuron and a group of its neighbouring neurons receive a connection related to a specific receptive field of an associated sensory organ (retina, Corti organ, skin, etc.). Owing to the connections with specific structures of the Corti organ, which receives acoustic waves of precisely defined frequencies and performs a kind of Fourier transform, the receptive fields of neurons located in the primary areas of the auditory cortex are also sensitive to a single frequency. In the visual cortex the situation looks similar. Each group of neurons in the primary area (V1) is connected with retinal cells and activates when a stimulus falls within this group’s receptive field. Some groups of neurons share the same receptive field but are sensitive to different aspects of the stimulus. In this way, for each stimulus with arbitrary characteristics, only those neurons whose receptive fields overlap with it show amplified activity. Information dispersed among a number of neuronal groups in the primary areas is then functionally integrated by other groups of neurons in the secondary and higher-order areas. A mechanism for this integration is synchronization. The groups of neurons sharing the same receptive field display evoked collective oscillations if the stimulus occurs in this receptive field and some aspect of the stimulus precisely matches the aspect to which the group is sensitive. In this way, in a given cortical area, a number of separate groups of synchronously oscillating cells are formed. Synchronization of neurons belonging to groups with different receptive fields is also possible, provided the neurons in both groups are sensitive to the same stimulus aspect. This mechanism allows a functional structure of oscillating neuronal ensembles, interacting with each other via long-range connections, to form. Functional segregation and integration are together considered a fundamental paradigm of the brain’s work. The segregation of complex information received from the environment, through its decomposition into elementary components, enhances the speed and effectiveness of its decoding, while the subsequent integration has as its goal constructing the inner representation of the complete information, attributing meaning to it, and preparing an optimal reaction [105].

2.1.2. Metastability of the brain state

Under normal conditions the cortical areas are neither fully synchronized nor fully desynchronized. At a given instant, some neurons are engaged in collective oscillations, while the others reveal their own idiosyncratic activities. As a consequence of the mechanisms leading to the occurrence and disappearance of synchronization, the brain’s state continuously changes. This means that the brain permanently resides in a metastable state [106]. Such a view is close to the well-known concept that biological systems reside at the edge of chaos [107,108]. A related but less vaguely defined concept is that of chaotic itinerancy, according to which the brain activity can be expressed by a trajectory on an attractor which comprises unstable periodic orbits.
The process of perception corresponds in this context to a temporary stabilization of the dynamics on such an orbit, only to leave it after a moment and penetrate the attractor space again by freely itinerating over its various parts. Then the dynamics is again stabilized on another unstable orbit and the whole process repeats [109].

The concept of the brain working at the interface between order and disorder can also be formulated in the language of the SOC model. Being in constant interaction with the environment, the brain cannot be trapped in a stable state, because this would invalidate its flexibility, i.e., the ability to modify its own structure. Moreover, the state in which the brain resides should to some degree be sensitive to small perturbations in order for the brain to be able to rapidly alter its state if required by the circumstances. For example, this may be necessary in the case of an external stimulus of high importance. On the other hand, the brain’s state cannot be completely unstable, because then each arbitrarily small perturbation, of even marginal significance, would lead to unpredicted behaviour and non-optimal reactions. This justifies viewing the brain as a system residing close to the critical state (in the sense of SOC) [110]. The main assumptions related to the


Fig. 5. Statistically typical neuronal response of the auditory cortex to the delivery of a sound stimulus. A series of evoked potentials is seen, with their characteristic signs (N/P) and delays (expressed in milliseconds). The onset of the stimulus is denoted by a vertical line.

SOC model have been tested on artificial neural networks [111] and confronted with the real macroscopic brain structure represented by networks [112,113].

2.1.3. Functional imaging of the brain

Basic techniques of functional brain imaging can be classified into two main categories: those related to metabolism (positron emission tomography – PET [114], functional magnetic resonance imaging – fMRI [115,116]) and those related to the electric activity of neurons (electroencephalography – EEG, magnetoencephalography – MEG [117–119]). The metabolic techniques are based on detecting the amount of blood flowing through a given region of the brain or measuring the consumption of nutrients by this region. These techniques are characterized by good spatial resolution (below 1 mm) at the expense of temporal resolution, which is rather poor (of the order of 1 s). The latter property implies that, unfortunately, they are not suitable for studying the paths and stages of information processing in the brain, because the events involved in this process happen at frequencies of up to several tens of Hz. This drawback is absent in EEG and MEG, which are designed for imaging the brain’s activity with millisecond temporal resolution. However, both techniques have considerably worse spatial resolution (0.5 cm) than the metabolic ones. An additional advantage of EEG and MEG is their noninvasiveness, in sharp contrast to fMRI and, especially, to PET. EEG and MEG are unable to detect the action potentials in neurons, which are too short and too weak. In fact, what they do measure are the relatively slow oscillations of postsynaptic potentials in dendrites, integrated over the millions of neurons involved in the macroscopic activity. Owing to these characteristics, EEG and MEG are broadly applied to record the cortical activity evoked by external stimuli. A typical cortical response to an auditory stimulus, as obtained by either of these two techniques, is schematically drawn in Fig. 5. It consists of a few oscillations (the evoked potentials) which occur with characteristic delays after the stimulus delivery, encoded in their labels.

2.2. Natural language

Natural language is a system functioning at the interface between biology and social interactions. On the one hand, language is inseparably tied to the principles of human brain function, of which it is a product. On the other hand, language emerged in a process of socialization, when the exchange of information between cooperating members of a primitive human group turned out to be beneficial. Language is an emergent phenomenon born out of complex nonlinear interactions among neurons and groups of neurons in many different brain areas, with a distinguished role played by Broca’s area. As an adaptive system, it is subject to permanent modifications catalyzed by cognitive and social mechanisms [120]. Under the influence of interactions between members of a community sharing the language, the existing grammar rules are modified, some words are pushed out by other words, completely new ones appear whose task is to name new objects and concepts, etc.

The origin of the innate language abilities of humans is still a rather controversial issue. The best-known theory trying to account for this issue – for the ubiquity of language as a means of communication among all the known peoples, as well as for the huge linguistic diversity of the world – is the theory of universal grammar (UG) [30].
It claims that each newborn child possesses a complete, built-in mechanism encoding a general grammatical structure of language. Within this structure there is a set of adjustable parameters whose particular values are associated with specific rules of grammar. As the child grows up, its brain is exposed to the language spoken by the family and other members of the community. This results in an irreversible fixing of the hitherto free parameters of UG and in the acquisition of the grammar specific to this language. (In the language of synergetics, such amplification of certain parameters and suppression of others can serve as an example of the slaving principle in action.) At this moment the child becomes a native speaker of the given language. The theory requires the existence of an organ in the brain which can be a carrier of universal grammar. The key point is that this organ had to be formed before language itself, since language is shaped by the properties of UG, and not vice versa. And this is exactly the point which makes the theory rather weak: the origin of the UG organ is extremely difficult to explain on the grounds of the known principles of evolution theory.


Other, competing theories on the genesis of language state that, in the course of evolution and natural selection, the brain adapted itself to acquire this new ability [121]. Instead of the formation of a special organ at some moment of history, the genetic structure of the brain was modified gradually, in parallel with language, in such a way that humans can learn it faster and faster and use it more and more effectively. A weak point of this approach, however, is that its supporters have not yet proposed any convincing mechanism explaining children’s capability of learning any language, no matter whether it is spoken by members of the community which is genetically closest to the child or by a community genetically distant from it. Furthermore, this theory does not provide any arguments explaining how the brain could genetically adapt to language, which evolves incomparably faster than the genetic code does.

The above-mentioned problems are not present in another theory, according to which language adjusted all its properties to the very way the brain works and learns [31]. This theory, which may well be called ‘‘ecological’’, states that language has to be viewed as an organism that evolved so as to be best fitted to function within the ‘‘ecosystem’’ of the human brain. Other organisms – the protolanguages based on different principles than the modern languages – were not so well adapted and simply became extinct. This ecological view has the fundamental advantage that it does not require any additional assumptions regarding the way the brain acquired the ability of linguistic communication. The rather vague idea of a hypothetical brain which had to adapt its structure to language is here replaced by the idea that language adapted its properties to the existing competences of the brain. In this context, the first, native language of each person may be considered a single individual of a species, while the different existing or historical languages are just different species. As was shown by computer simulations on neural networks, languages with the most typical grammar structures (as regards the word order in sentences, e.g., subject–object–predicate, subject–predicate–object) demand significantly shorter network training than do the uncommon languages with exotic grammar structures (e.g., object–subject–predicate) [122]. It is thus probable that the most common grammar structures are related to specific learning abilities (and limitations) of the brain. Children learn the appropriate language rules by receiving very short and noisy samples, and they succeed within a relatively short time. In order to achieve this, there must exist an innate capacity for learning language in precisely this form, and not another. This capacity does not stem from the existence of a hypothetical organ carrying universal grammar, but from the adjustment of language to the brain’s capacity for sequential learning.

Language has a hierarchical structure. At the most basic level, it consists of phonemes (spoken language) and characters or ideograms (written language). Typically, the number of either of these elements in a given language is small and usually reaches several tens (for example, in British English there are exactly 26 characters and about 45–50 phonemes in active use; the latter number varies depending on the source). The phonemes and characters group themselves into morphemes, which play the important role of fundamental carriers of meaning. The morphemes are not self-reliant, however.
The function of the smallest self-reliant components of language is played by words, consisting of one or more connected morphemes. A higher level of language organization is formed by clauses and sentences, which are the most important units of information transfer. In the case of written language, other levels of the organizational hierarchy can also be distinguished (paragraphs, chapters, texts, and so on). All modern languages have a grammatical structure based on the existence of word groups with precisely defined roles in the structure of information transfer, i.e., parts of speech (classes), and on the existence of precisely defined dependences among the words from different groups. Word classes can be divided into two types: the open classes, in which new words may be freely created (nouns, verbs, adjectives, and adverbs), and the closed classes, which have a limited number of elements (articles, prepositions, pronouns, conjunctions). Generally, words have a twofold function in utterances: a larger, basic group of words forms symbolic references to objects (understood in a broad sense as things, notions, actions, or attributes), while the other, relatively small group of words does not alone carry any information, playing grammatical roles only (e.g., prepositions and conjunctions). Moreover, it is observed that some words may gradually lose their original meaning and become solely carriers of grammatical forms (for example, words that were originally separate entities are reduced to verb suffixes indicating grammatical tense or person).

It is believed that primitive protolanguages, not possessing any grammatical structure, used to map only a small number of the most important objects to the corresponding words. As the social ties in human groups strengthened and the interactions among members of these groups became more complex, the number of objects which had to be named also increased. This implied that more and more words were needed. Computer simulations of a protolanguage model show that if the size of the vocabulary required to be mastered is sufficiently large, the number of communication errors grows so much that the language loses its ability to follow the more and more complex reality [123,124]. In this situation, the occurrence of grammar and higher levels of language organization helps to keep the number of necessary words at a reasonable level and, despite this limitation, allows a human to express a practically unlimited number of ideas. The emergent phenomenon of grammar, formed as a mechanism allowing people to counteract the growing number of communication errors in the protolanguage, as well as to express an incomparably larger number of ideas, is the most striking manifestation of the complexity of natural language. Transforming sequences of words received auditorily or visually into meaningful messages requires the ability of the brain to decode a complex signal, to identify there the possible regular structures associated with grammar, and to read the senses attributed to the particular words and phrases. Therefore, it is supposed that, regardless of the actual mechanism of its acquisition, the appearance of language was a profound factor accelerating the evolution of the human brain and of human social structures.


2.3. Financial markets

2.3.1. Financial markets as complex systems

In the view of many, the financial markets are among the most complex systems existing in the known Universe. This opinion stems from two observations. First, the dynamics of the financial markets is subject to a hard-to-estimate number of factors of both internal and external origin, whose impact strength is also hard to assess. In addition, these factors are strongly interrelated by a largely unknown network of connections and feedbacks (positive and negative). Without much exaggeration, the financial markets may be considered the embodiment of the philosophical concept that ‘‘everything depends on everything’’. As a matter of fact, the markets inherit this property from the whole economic sphere of human activity, being this sphere’s most easily quantifiable part. Second, the financial markets are built up not of simple physical objects like atoms or molecules, nor even of genuinely complex objects like neurons (which are nevertheless, in effective descriptions, no less featureless than atoms), but of investors endowed with intelligence. This fundamental, qualitative difference allows the financial markets to self-organize unusually fast, increasing their already high complexity [125–127]. This is why constructing realistic market models is such a difficult task (and their construction by itself will contribute to the further self-organization of the real markets).

Financial markets are open systems. They derive energy from the environment and return it to the environment, too. In this respect they resemble natural dissipative systems. The openness of the market is expressed by the inflow and outflow of the invested capital (or issued shares), which plays the role of energy. However, as the capital and asset flow through the market can be considered a rather slow process compared to the transaction activity, it can be assumed that at the time scales characteristic for transactions both quantities are approximately conserved. This is one of the prerequisites supporting the belief, widespread especially in classical economics, that the market is an equilibrium system. The assumption about market equilibrium implies further assumptions about the full rationality of market participants’ actions based on complete knowledge of the present and future situation, about perfect asset liquidity (transactions do not influence the price), about a balance between supply and demand, etc. [128]. The equilibrium state is closely related to the efficient market hypothesis (EMH) [129], stating that the price of an asset at a given moment reflects the complete information that can have any meaningful relation to the asset and that is available to investors. In this light, a given asset price P(t) is a stochastic process with the martingale property:

E[P(t > t_0)] = P(t_0),   (23)

where E[·] is the expectation value estimator. According to the EMH, price movements are devoid of memory effects, and any effects contradicting this statement have to be restricted to short time scales corresponding to the characteristic time with which the investors react to arriving information (which nowadays is a fraction of a minute). From the empirical perspective, the assumption about equilibrium and efficiency is useful in the first approximation, allowing one to simplify otherwise more sophisticated models. There exist many phenomena, however, which are in striking conflict with this assumption and which suggest that some corrections to the equilibrium picture have to be taken into account. For example, the investors are not always perfectly rational, and often there is a strong irrational component in their decisions. In extreme cases, this can lead to herding behaviour, which typically results in speculative bubbles and their subsequent breakdowns (although it should be noted that herding may also be a consequence of rational decisions [130]). The same can be caused by a lack of complete knowledge about the market, when the investors are unable to determine whether the prices indeed reflect the embedded information as demanded by the EMH. If they overestimate the information content of the price, they make decisions which can contribute to a positive feedback that further loosens the relation between the price and the information content [131].

Bubbles and crashes are examples of collective effects that drive the market far from an equilibrium point. At the price level, such or smaller events are associated with the phenomenon of volatility clustering, i.e., the existence of successive periods of small and large amplitude of the price fluctuations, which resembles intermittency. Volatility clustering implies a long-term memory of the fluctuation amplitude, expressed by a power-law decreasing autocorrelation function (Section 6.1). This is reminiscent of the power-law dependence of the correlations near the phase transition point in critical phenomena (Eq. (12)). Despite this resemblance, the criticality of financial markets is not as obvious as it may appear in this context. Some empirical indications (the power-law distribution of events with over-the-threshold volatility), however, suggest, among the possible explanations, that the markets may reside near the SOC state [132]. The situation may look different in the final stage of a growth phase, when the market becomes more and more sensitive to perturbations, as does a system in which an external parameter approaches its critical value. In this phase, an arbitrarily small fluctuation can trigger a cascading response of the market similar to a phase transition, when the hitherto increasing prices, under the condition of supply–demand symmetry, at some moment collectively start to fall under the condition of broken symmetry in favour of dominating supply [133]. Apart from volatility, long memory can also be seen in some other types of financial data (trading volume [134], number of transactions per unit time [135], size and type of orders [136,137], spread [138]) and macroeconomic data (inflation [139], GDP [140]).
Power-law relations can additionally be found in income distributions (in the limit of high incomes, the so-called Pareto law) [68], in the distribution of the annual growth rates of companies [141] and of the volume traded in a stock market [142], as well as in the distributions of logarithmic returns of commodities and stocks [143,144], market indices [145], and currency exchange rates [146] (Section 6.1). All these phenomena are among the so-called stylized facts of financial data.
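The long memory of volatility mentioned above can be checked with a few lines of code. The sketch below (Python; the input file name is a placeholder, not a dataset used in this paper) compares the autocorrelation of the returns themselves, which should vanish after a few lags, with that of their absolute values, which should decay slowly:

    import numpy as np

    def autocorr(x, max_lag):
        """Sample autocorrelation of x for lags 1..max_lag."""
        x = (x - x.mean()) / x.std()
        n = x.size
        return np.array([np.mean(x[:n - k] * x[k:]) for k in range(1, max_lag + 1)])

    prices = np.loadtxt("prices.txt")          # hypothetical price series P(t)
    r = np.diff(np.log(prices))                # logarithmic returns

    ac_returns = autocorr(r, 100)              # ~0 beyond the first few lags
    ac_volatility = autocorr(np.abs(r), 100)   # slowly (power-law) decaying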


A physical justification of this ubiquitous power-law behaviour comes from the fact that the financial markets consist of a large number of nonlinearly interacting elements. Such interactions, if lacking a characteristic scale, cause the microscopic details which can distinguish individual markets to cease to be statistically significant at the macroscopic level.

2.3.2. Structure of the financial markets

The most important financial markets are the commodity markets, stock markets, bond markets, derivative markets, and the foreign exchange market. The largest one is the foreign exchange market (forex), which – in contrast to the other markets – is entirely globalized. In the forex, the average daily turnover reached 4.0 trillion US dollars in April 2010 [147], roughly 25 times larger than the daily Gross Domestic Product (GDP) of the world, estimated at ca. 160 billion US dollars in 2010 [148]. Moreover, this market has a tendency to grow by 5%–20% annually. The forex operates without the daily cycle typical for other markets: 6–8 trading hours followed by 16–18 h of pause. Instead, it operates according to a weekly cycle, being open 24 h a day from Sunday evening (22:00 GMT) till Friday evening (22:00 GMT). The currency market is also much more liquid than other markets and has a much larger number of transactions. It consists mainly of large financial institutions and central banks, which makes it less sensitive to collective effects like panic.

Stock and other security markets have a local character. Typically these are the exchange floors situated in the financial centres of particular countries, although in recent years there has been a strong tendency to merge national floors in order to form large international, decentralized markets like NYSE Euronext and NASDAQ OMX. The total capitalization of all the stock exchanges exceeded 57 trillion USD in March 2011 [149], while the daily turnover oscillates around 1% of this value. According to market capitalization, the world’s largest stock exchanges are the American branch of NYSE Euronext with a domestic capitalization of 13.4 trillion USD, the American branch of NASDAQ OMX (3.9 trillion USD), Tokyo Stock Exchange (3.8 trillion USD), London SE (3.6 trillion USD), the European branch of NYSE Euronext (2.9 trillion USD), Shanghai SE (2.7 trillion USD), and Hong Kong Exchanges (2.7 trillion USD). Against these exchanges, the German stock exchange in Frankfurt does not seem to be an equally large market (1.4 trillion USD) [149].

On a free market, the value of an asset can be quantified by means of a price only at the moment of a transaction. The price is determined by buy and sell orders with or without price limits. These orders are matched electronically by automatic systems operating on the majority of contemporary stock exchanges. Between consecutive transactions the price, strictly speaking, remains undefined. A problem of price continuity arises here, whose solution is especially crucial in the case of numerical analysis of time series sampled with a constant frequency. In practice, this problem can be worked around by the standard assumption that between transactions the price remains constant and equal to the price at which the last transaction was made.
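This previous-tick convention is straightforward to implement. A minimal sketch (Python; the tick times and prices are made-up toy values) resamples an irregular transaction record onto a regular grid with sampling interval dt:

    import numpy as np

    def previous_tick(tick_times, tick_prices, dt):
        """Carry the last traded price forward onto a regular time grid."""
        grid = np.arange(tick_times[0], tick_times[-1], dt)
        # index of the last transaction at or before each grid point
        idx = np.searchsorted(tick_times, grid, side="right") - 1
        return grid, tick_prices[idx]

    t = np.array([0.0, 0.7, 3.1, 3.2, 7.9])            # transaction times (s)
    p = np.array([100.0, 100.2, 99.9, 100.1, 100.0])   # transaction prices
    grid, price = previous_tick(t, p, dt=1.0)          # sampled once per second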
From a theoretical point of view, however, this important problem is still open, and it forms a starting point for the construction of quantum models of financial markets, in which the problem of price definition is a financial counterpart of the quantum-mechanical problem of measurement [150]. Another difficulty associated with the numerical analysis of financial data is the discontinuity of trading hours in the stock markets, which makes it unclear whether the market time flows during the trading hours only or perhaps also during the overnight hours. The core of this problem lies in the fact that when one group of markets (e.g., the American markets) is closed, the markets located in other time zones may be active. Therefore, taking into consideration the global character of both the information flow and the investor activity, it is justified to assume that the market time, understood in terms of events and price movements, flows also when the stock exchange is closed. Despite this, it is a common practice that only the trading hours are considered in data analysis.

In the case of a financial market, the degrees of freedom may be viewed in two ways: in the basic picture, they can be either the individual investors or the particular assets: securities, commodities, or currencies. In the former case the interactions between the degrees are direct and comprise the exchange and absorption of information as well as the making of transactions. In the latter case the idea of interactions is more abstract and refers to the statistical dependences among the asset prices. Such dependences are a product of the investors’ coherent decisions related to these assets. Since it is impossible to follow the transactions made by each particular investor, in empirical analyses of market data the degrees of freedom are usually identified with the assets. The behaviour of the investors can be observed only through the collective effects which they produce: the trading volume and the price movements. The similarity of company profiles and the direct trade links between companies are transferred in a natural way to the market floor, fostering the formation of dependences between their stock prices (the same refers to currency rates and commodities). The different strengths of those trade links outside the market floor are also reflected in the different strengths of the inter-stock couplings. This leads to the emergence of a hierarchical organization of the market, consisting of individual companies at the lowest level, small groups of closely cooperating or otherwise interrelated companies at a higher level, the industrial sectors even higher, and finally the whole market at the topmost level of the structure. Moreover, the existence of many national and regional stock markets in the world, together with economic globalization, leads to the appearance of couplings even between stocks traded on different floors [151,152]. Owing to this effect, the following two additional levels of hierarchy can be added: the level of geographical regions, on which the markets are more strongly coupled inside each region than outside of it, and the global level, on which all or almost all stock markets display collective behaviour, as if the global market had only a single degree of freedom [151].

The price of each stock, commodity, or other asset is expressed in a local or a customarily defined currency, which plays the role of an independent reference frame for the price evaluation. The situation looks completely different in the forex


market, where there is no stable reference frame. Here, the value of a currency X may be expressed exclusively in some other currency Y, which makes such an evaluation relative and implies that the proper assets in the forex market are the cross-rates X/Y, not the individual currencies. X and Y represent here the three-letter ISO codes uniquely defining each currency. The relative character of currency values, together with the no-arbitrage condition, imposes two constraints on the currency rates, which significantly reduce the effective number of degrees of freedom in the market:

X/Y = 1/(Y/X),   (24)

X/Y = X/Z · Z/Y.   (25)

The latter equation is known as the triangle rule.
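A simple numerical illustration of the constraints (24)–(25) (Python; the quoted rates are made-up values, not market data); any deviation of a quoted cross-rate from the triangle-rule product would constitute an arbitrage opportunity:

    # X = EUR, Y = JPY, Z = USD in the notation of Eqs. (24)-(25)
    EUR_USD = 1.35
    USD_EUR = 1.0 / EUR_USD                  # Eq. (24)
    USD_JPY = 80.0
    EUR_JPY = EUR_USD * USD_JPY              # Eq. (25): X/Y = X/Z * Z/Y

    assert abs(EUR_USD - 1.0 / USD_EUR) < 1e-12
    print(f"EUR/JPY implied by the triangle rule: {EUR_JPY:.2f}")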

3. Coexistence of collectivity and noise

3.1. Methods of identification of collective effects in empirical data

Collective behaviour emerging spontaneously from a sea of noise and decaying in the same manner, which manifests itself in a non-zero value of the order parameter, is one of the characteristic properties of complex systems (Section 1.3.2). If an observer carries out a series of measurements of an observable associated with the order parameter in a state of thermodynamic equilibrium, the measured signal has the form of random fluctuations around zero with an r.m.s. amplitude decreasing as N^{−1/2} in the limit N → ∞. This situation can change completely if the system has been perturbed: within the time scale of the duration of the nonequilibrium state, the measurement outcomes may significantly deviate from zero.

Let Ω be a system built of N elements α being its degrees of freedom, and let X_α(i), i = 1, …, T be a time series of measurement outcomes of an observable X_α associated with the degree α. The measurements are typically made with a constant frequency, i.e., with a constant sampling interval Δt = t_{i+1} − t_i, but this is not a necessary condition. In addition, we assume that the measurements of all N observables X_α are made simultaneously.

3.1.1. Two degrees of freedom

A coupling of two degrees of freedom α, β can be identified by means of one of the measures describing statistical dependences between time series. The most popular measure is the normalized correlation coefficient (the Pearson coefficient) defined by

C_{αβ}(τ) = (1 / (T √(σ_α² σ_β²))) Σ_{i=1}^{T} (x_α(i) − x̄_α)(x_β(i + τ) − x̄_β),   (26)

where x̄_γ denotes the mean and σ_γ² the variance of the time series γ, while τ is a temporal shift of β with respect to α. For a finite sample, the correlation coefficient (26) is an estimator of the real correlation between the processes underlying the analysed signals. The values of C_{αβ} fulfil the inequality −1 ≤ C_{αβ} ≤ 1. In the limit T → ∞, independent time series give C_{αβ} = 0, while identical signals give C_{αβ} = 1. Negative values of the coefficient correspond to anticorrelations. By its definition, into which both signals x_α and x_β enter in the first power, the correlation coefficient (26) is sensitive to linear dependences only. This limitation can be overcome by introducing higher-order correlation coefficients, into which the signals enter as x_α^s, x_β^t (with exponents s, t ≥ 1). Usually, however, the linear correlations are the most significant ones in empirical data, which justifies using the classic definition (26).

A measure which is free of the limitation characteristic for the correlation coefficients of any order, namely their sensitivity to only one specific type of correlations, and which is therefore sensitive to any statistical dependences, is the mutual information (Eq. (3)). In its two-dimensional form, it can be calculated according to the following equation:

I(X_α, X_β) = Σ_{k∈X_α} Σ_{l∈X_β} p_{αβ}(k, l) log [p_{αβ}(k, l) / (p_α(k) p_β(l))],   (27)

where p_γ(m) ≡ P(X_γ = m) are the marginal probability distributions and p_{αβ}(k, l) ≡ P(X_α = k, X_β = l) is the joint probability distribution.

where pγ (m) ≡ P (Xγ = m) are marginal probability distributions and pαβ (k, l) ≡ P (Xα = k, Xβ = l) is a joint probability distribution. 3.1.2. Multivariate data In a multivariate case (N ≫ 2), the analysis can be simplified by a matrix approach [153]. Temporal evolution of a whole system may be expressed in a form of an N × T data matrix X whose elements are: Xα i =
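As an illustration of Eqs. (26) and (27), the following Python sketch estimates both quantities for a pair of synthetic, linearly coupled Gaussian signals (the function names, the test data and the bin count are our illustrative choices; the histogram estimator of mutual information is known to be sensitive to the binning):

    import numpy as np

    def pearson_lagged(x, y, tau=0):
        # Eq. (26): correlation between x(i) and y(i + tau)
        if tau > 0:
            x, y = x[:-tau], y[tau:]
        x = (x - x.mean()) / x.std()
        y = (y - y.mean()) / y.std()
        return np.mean(x * y)

    def mutual_information(x, y, bins=32):
        # Histogram estimate of Eq. (27); the result depends on the bin choice
        p_xy, _, _ = np.histogram2d(x, y, bins=bins)
        p_xy /= p_xy.sum()
        p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of x
        p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of y
        mask = p_xy > 0
        return np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask]))

    rng = np.random.default_rng(0)
    x = rng.standard_normal(10_000)
    y = 0.6 * x + 0.8 * rng.standard_normal(10_000)   # linearly coupled pair
    print(pearson_lagged(x, y))        # close to 0.6
    print(mutual_information(x, y))    # close to -0.5*ln(1 - 0.6**2) ~ 0.22 nats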

3.1.2. Multivariate data

In a multivariate case (N ≫ 2), the analysis can be simplified by a matrix approach [153]. The temporal evolution of the whole system may be expressed in the form of an N × T data matrix X whose elements are:

X_{\alpha i} = \frac{1}{\sigma_\alpha} \left( x_\alpha(i) - \bar{x}_\alpha \right).    (28)


Based on this matrix, a correlation matrix of size N × N can be constructed:

\mathbf{C} = \frac{1}{T} \mathbf{X} \mathbf{X}^{T},    (29)

whose elements are the Pearson coefficients C_{αβ}. Any statistical dependences among the signals show themselves in non-zero elements of C and, consequently, in its spectral properties. The eigenvalues and eigenvectors of C can be obtained in the ordinary way by solving the characteristic equation:

\mathbf{C} \mathbf{v}^{(k)} = \lambda_k \mathbf{v}^{(k)}, \qquad k = 1, \ldots, N.    (30)

Typically, a convention of ordering the eigenvalues according to their magnitude is applied: λ1 ≥ λ2 ≥ ⋯ ≥ λN. By construction, the correlation matrix is symmetric with real elements, which implies that its eigenvalues λk and eigenvectors v^(k) are also real. Since Cαα = 1, we have:

\sum_{k=1}^{N} \lambda_k = \mathrm{Tr}\, \mathbf{C} = \sum_{\alpha=1}^{N} C_{\alpha\alpha} = N.    (31)
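The construction (28)-(31) is straightforward to verify numerically; the sketch below, using uncorrelated Gaussian surrogates in place of real measurements, builds the normalized data matrix, diagonalizes C, and checks the trace condition (31):

    import numpy as np

    rng = np.random.default_rng(1)
    N, T = 100, 1500
    data = rng.standard_normal((N, T))        # N signals of length T

    # Eq. (28): each signal normalized to zero mean and unit variance
    X = (data - data.mean(axis=1, keepdims=True)) / data.std(axis=1, keepdims=True)

    C = X @ X.T / T                           # Eq. (29)
    lam, v = np.linalg.eigh(C)                # Eq. (30); eigh exploits symmetry
    lam = lam[::-1]                           # ordering lambda_1 >= ... >= lambda_N

    print(np.isclose(lam.sum(), N))           # Eq. (31): Tr C = N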

3.1.3. Ensemble of Wishart random matrices

When looking at the eigenspectrum of a correlation matrix C, an essential problem to overcome is the distinction between the correlations which originate from actual dependences among the degrees of freedom and the random correlations which are a consequence of the finite lengths of the signals. This can be done with the help of random matrix theory (RMT) [154,155], which offers universal predictions for matrix ensembles with specific properties. In the case of the correlation matrix, the relevant ensemble is the Wishart ensemble of matrices [156] defined by:

\mathbf{W} = \frac{1}{T} \mathbf{M} \mathbf{M}^{T},    (32)

where the N × T matrix M is a counterpart of the data matrix X (Eq. (28)) with its elements drawn from a Gaussian distribution N(0, σ). In the limit N, T → ∞, the eigenvalue density of the Wishart matrices depends on Q = T/N. In the non-degenerate case (Q ≥ 1), it is given by the Marčenko–Pastur distribution [157,158]:

\rho_W(\lambda) = \frac{1}{N} \sum_{k=1}^{N} \delta(\lambda - \lambda_k) = \frac{Q}{2\pi\sigma^2} \frac{\sqrt{(\lambda_{\max} - \lambda)(\lambda - \lambda_{\min})}}{\lambda},    (33)

\lambda_{\max/\min} = \sigma^2 \left( 1 + 1/Q \pm 2\sqrt{1/Q} \right),    (34)

where λmin ≤ λ ≤ λmax. In the limit Q → ∞, all the off-diagonal elements Wij = 0 and Eq. (33) reduces to:

\rho_W(\lambda) = \frac{1}{N} \sum_{k=1}^{N} \delta(\lambda - 1) = \delta(\lambda - 1).    (35)
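The prediction (33)-(34) can be checked directly by diagonalizing a single large Wishart matrix; in the following sketch (σ = 1 and Q = 4 are arbitrary illustrative choices) the empirical spectral edges land close to λmin and λmax:

    import numpy as np

    rng = np.random.default_rng(2)
    N, T = 400, 1600
    Q = T / N
    M = rng.standard_normal((N, T))
    W = M @ M.T / T                            # Eq. (32) with sigma = 1
    lam = np.linalg.eigvalsh(W)

    lam_min = (1 - np.sqrt(1 / Q)) ** 2        # Eq. (34), sigma^2 = 1
    lam_max = (1 + np.sqrt(1 / Q)) ** 2
    print(lam.min(), lam_min)                  # empirical vs theoretical lower edge
    print(lam.max(), lam_max)                  # empirical vs theoretical upper edge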

In the opposite case, i.e., for Q < 1, the so-called anti-Wishart matrices W contain excess information and their order is T (T < N). This means that a substantial part (equal to 1 − Q) of the eigenvalue distribution is condensed at zero. The Marčenko–Pastur distribution is a strict consequence of the fact that for small values of Q the empirical correlation matrix C (or its random counterpart W) ceases to be a good estimator of the ''true'' correlation matrix (sometimes called the population matrix). Fig. 6 shows the shapes of the distribution (33) for several values of Q. In practice, we deal with finite values of N and T, which blurs the boundaries of ρW(λ) and, in particular, allows one or more of the largest eigenvalues of W to exceed λmax. In such a case one needs to know the distribution of λ1 which, after rescaling the variable λ1 → (λ1 − µ)/σ, where

\mu = \left( \sqrt{T-1} + \sqrt{N} \right)^2, \qquad \sigma = \left( \frac{\mu^4}{(T-1)N} \right)^{1/6},    (36)

is, for large N, T, convergent to the Tracy–Widom distribution characteristic for the GOE matrices [159,160]:

F_1(s) = \exp\left( -\frac{1}{2} \int_s^{\infty} \left[ q(x) + (x - s)\, q^2(x) \right] dx \right),    (37)

where q(x) is a solution of the equation q''(x) = x q(x) + 2 q^3(x). In the limit of large N, T, the width of the Tracy–Widom distribution is given by:

\sigma_{TW} \to \sqrt{1/Q}\, \left( \lambda_{\max}/N \right)^{2/3}.    (38)
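A corresponding numerical experiment for the largest eigenvalue is equally simple; in the sketch below (sizes and sample count are arbitrary), λ1 of many independent realizations of MM^T is rescaled according to Eq. (36), and the sample mean and standard deviation should approach the known values for the Tracy–Widom distribution with β = 1 (≈ −1.21 and ≈ 1.27):

    import numpy as np

    rng = np.random.default_rng(3)
    N, T, n_samples = 80, 320, 200
    mu = (np.sqrt(T - 1) + np.sqrt(N)) ** 2            # Eq. (36)
    sigma = (mu ** 4 / ((T - 1) * N)) ** (1 / 6)

    s = []
    for _ in range(n_samples):
        M = rng.standard_normal((N, T))
        lam1 = np.linalg.eigvalsh(M @ M.T).max()       # note: MM^T, not MM^T / T
        s.append((lam1 - mu) / sigma)

    print(np.mean(s), np.std(s))                       # ~ -1.21 and ~ 1.27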


Fig. 6. Marčenko–Pastur distribution (Eq. (33)) for a few different values of Q = T/N.

The asymptotic behaviour of the eigenvector components v_α^(k) in the limit N, T → ∞ is described by the Porter–Thomas distribution [154]:

P(u) = \frac{1}{\sqrt{2\pi u}}\, e^{-u/2},    (39)

where u = (v_α^(k))² for α = 1, ..., N and the normalization condition \sum_\alpha (v_\alpha^{(k)})^2 = N holds. Another useful quantity related to the distribution of eigenvector components is the inverse participation ratio:

I_k = \sum_{\alpha=1}^{N} \left( v_\alpha^{(k)} \right)^4.    (40)

It describes how many components of an eigenvector k contribute significantly to its length. For an entirely delocalized eigenvector with equal components v_α^(k) = 1/√N, Eq. (40) gives I_k = 1/N, while for a fully localized eigenvector with a single non-zero component, v_α^(k) = δ_{αα₀}, I_k = 1. For the Wishart matrices, I_k asymptotically converges to the value of 3/N characteristic for the GOE matrices. The inverse participation ratio carries information similar to that of the information entropy given by Eq. (1), but its value can be interpreted in a simpler way.
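These limiting values are easy to reproduce; the sketch below evaluates Eq. (40) for a fully delocalized, a fully localized, and a GOE-like random unit eigenvector (the unit-norm convention is used here, consistent with the quoted values 1/N, 1 and 3/N):

    import numpy as np

    def ipr(v):
        # Eq. (40) for a unit-norm eigenvector
        return np.sum(v ** 4)

    N = 500
    delocalized = np.full(N, 1 / np.sqrt(N))       # equal components -> I_k = 1/N
    localized = np.zeros(N)
    localized[0] = 1.0                             # single component  -> I_k = 1
    print(ipr(delocalized), ipr(localized))

    rng = np.random.default_rng(4)
    v = rng.standard_normal(N)
    v /= np.linalg.norm(v)                         # GOE-like random eigenvector
    print(N * ipr(v))                              # fluctuates around 3, i.e. I_k ~ 3/N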

3.1.4. Ensemble of correlated Wishart matrices

A type of the Wishart matrices (32) which is of great practical importance are the correlated Wishart matrices W_c. In their simplest form, with a single rank-one perturbation, they are defined as [161]:

\mathbf{W}_c = \mathbf{M}' \mathbf{M}'^{T} + b\, \mathbf{y} \mathbf{y}^{T}, \qquad b \in \mathbb{R},    (41)

where M′ of size N × (T − 1) is a counterpart of the matrix M from Eq. (32) in which the first column has been removed, and y is a vector with Gaussian-distributed components. The eigenvalue density of W_c is generally given by the Marčenko–Pastur distribution, except for the largest eigenvalue λ1, which can be strongly repelled to the right of the rest of the spectrum, provided the perturbation b is strong enough. If this is the case, in the limit of T → ∞ and for ω = bN, one obtains [161,162]:

\lambda_1 = \omega + \frac{\omega\, Q^{-1}}{\omega - 1}, \qquad \omega > 1 + 1/\sqrt{Q},    (42)

\lambda_1 = \left( 1 + 1/\sqrt{Q} \right)^2, \qquad \omega \le 1 + 1/\sqrt{Q}.    (43)

From (43) it follows that, indeed, for small perturbations the eigenvalue spectrum of W_c does not differ from the Marčenko–Pastur distribution. The emergence of a separated eigenvalue λ1 is possible only if the perturbation exceeds the threshold value of 1 + 1/√Q. This phenomenon resembles a phase transition [162]. In the more general case of p independent perturbations, aside from λ1, the spectrum comprises additionally p − 1 separated eigenvalues.
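The threshold behaviour of Eqs. (42)-(43) can be observed numerically. The sketch below uses a spiked population covariance (one population eigenvalue ω, all others equal to 1) as a simple stand-in for the rank-one perturbed ensemble (41); for ω above the threshold, the sample λ1 lands close to the prediction of Eq. (42):

    import numpy as np

    rng = np.random.default_rng(5)
    N, T = 200, 1000
    Q = T / N
    omega = 5.0                                  # well above 1 + 1/sqrt(Q) ~ 1.45

    M = rng.standard_normal((N, T))
    M[0] *= np.sqrt(omega)                       # inject a single spiked direction
    lam1 = np.linalg.eigvalsh(M @ M.T / T).max()

    lam1_theory = omega + omega / (Q * (omega - 1))   # Eq. (42)
    print(lam1, lam1_theory)                          # agree for large N, T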


3.1.5. Mutual information in many dimensions

Mutual information, a useful tool for the identification of statistical dependences in two-dimensional empirical data, faces serious practical problems when applied to a multivariate case. First, after a generalization of Eq. (27), any division of the phase space into multi-dimensional cells means that a reasonable approximation of the multi-dimensional probability distribution p_{α1...αN}(k1, ..., kN) requires extremely long signals. Second, no relevant ensemble of random matrices has been defined, which implies that there are no theoretical predictions to refer to if one tried to use the two-dimensional mutual information to build a mutual-information-based matrix analogous to the above-described Wishart matrix ensembles.

3.2. Collective effects in stock markets

3.2.1. Data

We choose the stock market as the first example of a practical application of the matrix method to the identification of collective effects and their extraction out of the sea of noise. We base our study on price quotations of stocks traded on two principal American markets, the New York Stock Exchange (NYSE) and NASDAQ, as well as on the German stock market in Frankfurt (Deutsche Börse). Data from the American markets came from the Trades and Quotes (TAQ) database [163], containing comprehensive recordings of all transactions with their main parameters (price, volume and time) for all the stocks listed on NYSE and NASDAQ during the period from Dec. 1st, 1997 to Dec. 31st, 1999. This period covered 521 trading days with the opening at 9:30 and the closing at 16:00. The TAQ database contains a number of recording errors, like wrong prices or transaction times, whose frequency depends on the particular stock and oscillates roughly within 0%-1% of the total number of transactions. We filtered out all the errors we managed to identify, but despite this some errors remained, especially badly recorded prices too similar to actual quotations to be discerned from them. They do not influence the results presented in this work, however.

For each stock, the tick-by-tick quotations were transformed into a time series of prices sampled with a given constant frequency. We applied the standard assumption, mentioned already in Section 2.3, that between consecutive transactions the price remains constant and equal to the preceding transaction price. Since the temporal resolution of the TAQ recordings is only 1 s, it is common that, for frequently traded stocks, many transactions are assigned to the same second. We therefore considered prices averaged over all the transactions done within that second. For this analysis, we selected a set of N = 1000 stocks corresponding to the largest American companies. The selection criteria were the market capitalization on Dec. 31st, 1999 at 16:00 and uninterrupted trading of a stock during the whole period Dec. 1997–Dec. 1999. The instantaneous capitalization K_α(t) of a company α is defined by:

K_\alpha(t) = n_\alpha(t)\, p_\alpha(t),    (44)

where n_α is the number of outstanding shares, while p_α is the current share price at time t.

Data from the German market [164] came from the Karlsruher Kapitalmarktdatenbank (KKMDB) and comprised transaction-by-transaction recordings for 30 stocks listed in the Deutscher Aktienindex (DAX), the principal index of the Frankfurt Stock Exchange comprising the most highly capitalized and most liquid stocks traded in Frankfurt. The period covered by this data set was equivalent to the one for the American market, but it comprised more trading days (524) and the trading hours were longer (8:30-17:00 before Sept. 20, 1999 and 8:30-17:30 from that date on) than in the previous case. The temporal resolution of these data, equal to 0.01 s, was sufficient, so there was no need to average the prices of neighbouring transactions. For both the American and the German stocks, we adjusted prices for splits and dividends.

3.2.2. Structure of empirical correlation matrix

Due to the strong nonstationarity of stock prices, easily observed in any market chart as trends with various temporal horizons, a standard approach is to consider logarithmic price returns instead of the prices themselves. A logarithmic return is defined as:

r_\alpha^{\Delta t}(t_i) = \ln p_\alpha(t_i) - \ln p_\alpha(t_{i-1}), \qquad t_i - t_{i-1} = \Delta t,    (45)

where Δt is the sampling interval.
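In code, Eq. (45) is a one-liner; the sketch below applies it to a synthetic geometric-random-walk price path standing in for real quotations:

    import numpy as np

    def log_returns(prices):
        # Eq. (45) for a price series sampled at a fixed interval dt
        return np.diff(np.log(prices))

    rng = np.random.default_rng(6)
    prices = 100.0 * np.exp(np.cumsum(0.002 * rng.standard_normal(1000)))
    r = log_returns(prices)
    print(r.mean(), r.std())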

Fig. 7 displays exemplary time series of stock returns sampled with Δt = 15 min. Statistical properties of the stock returns will be discussed in more detail in Section 6.1; here we only note that they are characterized by non-Gaussian, leptokurtic distributions, which is evident in the figure. One can also see that periods of ''laminar'' price behaviour are interwoven with periods of sudden explosions of nervous activity, when prices are subject to strong movements. This phenomenon can be observed in parallel for different stocks (grey vertical zones in Fig. 7), which gives an early indication of couplings between stock price movements. Fig. 8 presents the distribution of matrix elements for the 1000 × 1000 correlation matrix constructed from the time series of stock returns (of length T = 13,546) corresponding to the largest American companies. The large size of the matrix gives a relatively smooth distribution. The sampling frequency, set to Δt = 15 min, was a trade-off between the need for a statistically significant time series length (large T → small Δt) and the need for a good signal-to-noise ratio (which, as will become clear later, requires longer sampling intervals Δt). The empirical distribution is far from the Gaussian one


Fig. 7. Time series of logarithmic price returns for exemplary stocks traded on NYSE (from top to bottom: Coca-Cola, Philip Morris, Texas Instruments). The first 5000 returns for Δt = 15 min are shown for each stock. Periods of amplified fluctuations, if observed simultaneously in all three time series, are distinguished by vertical grey strips.

Fig. 8. Distribution of the correlation matrix elements for the 1000 largest American companies represented by time series of returns with Δt = 15 min. Exponential functions y = a e^{-βx} with β ≈ 22.0 and β ≈ 17.3 were fitted to the right slope of the distribution. The main panel has a logarithmic vertical axis, while the inset has a linear one.

that is assumed for the random matrices from the uncorrelated Wishart ensemble. The maximum of the distribution is shifted towards positive values, with ⟨Cαβ⟩ ≃ 0.07, while the visible asymmetry suggests a positive skewness. Such a shape of the distribution attests that a significant number of degrees of freedom are strongly coupled together. The eigenvalue spectrum of C is shown in Fig. 9. It consists of the largest, distant eigenvalue λ1 = 84.3, several (∼10) subsequent eigenvalues (the most prominent of which are λ2 = 12.16, λ3 = 7.27 and λ4 = 6.02), and an almost continuous spectrum of smaller eigenvalues with their centre of mass at λ ≈ 1. The smallest eigenvalue λ1000 = 0.22, which is also repelled from the bulk, but to the left, completes the picture. Such a structure of the eigenvalue spectrum is universal across stock markets, which has been extensively documented in the literature (e.g., [165–169]). A comparison of this empirical spectrum with the Marčenko–Pastur distribution (Q ≈ 13.5; λmin = 0.53; λmax = 1.62) indicates that 77% of the eigenvalues of C lie within the RMT bounds. Particular eigenvalues can be better characterized by looking at the components of the corresponding eigenvectors (Fig. 10). The largest eigenvalue is associated with a delocalized vector with all its components positive and ⟨v_α^(1)⟩_α = 0.029. This can be interpreted as a strong decrease of the rank of the main component of C, which can therefore be written as the sum [153]:

\mathbf{C} = \mathbf{C}_1 + \mathbf{C}', \qquad \mathbf{C}_1 = b\, \mathbf{1},    (46)

where 1 is a matrix of ones. Such a form of C resembles the correlated Wishart matrices (Eq. (41)). λ1 can be identified with a factor acting on all the degrees of freedom and generating their global coupling. In the language of econometrics, this factor is called the market factor and is treated as a force acting uniformly on all the stocks. This situation recalls the


Fig. 9. Eigenvalue spectrum for the correlation matrix C constructed for the 1000 largest American companies. Grey region corresponds to the range allowed for the Wishart random matrices by Eq. (33).

Fig. 10. Eigenvector component distributions for the correlation matrix calculated from time series of returns (Δt = 15 min) representing the 1000 largest American companies. (a) Eigenvectors associated with the largest eigenvalues λk for 1 ≤ k ≤ 4. (b) Eigenvectors associated with the remaining eigenvalues satisfying the condition λk > λmax (5 ≤ k ≤ 30, top left), with the eigenvalues from the Wishart range (101 ≤ k ≤ 900, top right), with typical eigenvalues less than λmin (970 ≤ k ≤ 995, bottom left), and with one of the smallest eigenvalues (k = 999, bottom right). In each plot, the empirical distribution (histogram) is accompanied by the Porter–Thomas distribution (Eq. (39)) for random matrices (dashed line).


Fig. 11. Participation ratio 1/Ik for all eigenvectors of the correlation matrix C based on 1000 companies. Inset shows the same quantity for the eigenvectors corresponding to the 30 largest eigenvalues plotted in semi-logarithmic scale. Grey horizontal stripe denotes standard deviation of the distribution of 1/Ik around its mean value ⟨1/Ik ⟩ ≈ 334 for a matrix constructed from surrogate signals.

many-body problems well known in many areas of physics, to which one applies the mean-field approximation leading to a Hamiltonian with a matrix structure analogous to Eq. (46). In the case of the stock market and its most important degrees of freedom (i.e., the highly capitalized companies), the above-mentioned force is primarily a mathematical construction, since no actual resultant factor acts on these degrees. The couplings between particular pairs of degrees of freedom form spontaneously as a consequence of the individual decisions of investors who buy or sell particular assets [170]. The ''market factor'' is thus only a visible emergent effect of the self-organization of the market, on which global collectivity forms as a consequence of interactions among the elements. This remark refers particularly to the American and the other largest markets, which are relatively independent. On the other hand, for smaller markets like the Warsaw Stock Exchange, which are prone to follow the largest markets, there exists an external interaction that to some extent can play the role of an actual force governing the behaviour of the market as a whole. However, even such an external force cannot be fully identified with the ''market factor'' understood as the most collective eigenstate of the correlation matrix.

The global character of the correlations expressed by λ1 can be clearly seen in Fig. 11, where the dependence of 1/I_k (the ''participation ratio'') on k is presented. A substantial contribution to the eigenstate corresponding to v^(1) is made by about 60% of the degrees of freedom. This means that the market does not evolve as a rigid body (for which 1/I_1 would be 1000) but rather maintains a certain ''softness''. However, to zeroth order, one can assume that the system is unified as a single degree of freedom. The other eigenvectors associated with eigenvalues satisfying the condition λk > λmax do not reveal such strong delocalization, although in the case of v^(2) the participation ratio is also above average. The majority of these eigenvectors are characterized by a small number of prominent components, which means that relatively small numbers of degrees of freedom contribute to these vectors. From the market perspective, these are typically companies which belong to the same or related industrial sectors [167]. They are more strongly correlated inside their own sector than with companies from other sectors. We then deal with the following structure at the correlation matrix level:

\mathbf{C} \approx \mathbf{W} + \mathbf{B},    (47)

where B is a block-diagonal matrix and W is a Wishart matrix. The eigenvalue spectrum of such a matrix C is described by the Marčenko–Pastur distribution with additional distant eigenvalues whose number l is equal to the number of blocks in B. Therefore, the spectrum of C resembles the spectrum of W with l perturbations. The components of the corresponding eigenvectors thus either have asymmetric distributions or have distributions whose slopes fall more slowly than the Porter–Thomas distribution predicts (Fig. 10). The typical eigenvalues shown in Fig. 9 are situated in the range allowed by RMT, yet it is not possible to fit the Marčenko–Pastur distribution ρW(λ) to the empirical eigenvalue density, as Fig. 12 shows. The discrepancy with the distribution given by Eq. (33) stems from the existence of many large eigenvalues in the region λ ≥ λmax, which effectively pick up a considerable portion of the total variance σ² and suppress the main part of the empirical spectrum ρC(λ). One may see in this effect an analogy to Haken's slaving principle (Section 1.3.2), in which the collective modes associated with such eigenvalues, but chiefly with λ1, subordinate the remaining modes of the matrix. The last group of eigenvalues of C contains those smaller than λmin. Their associated eigenvectors typically possess only a few significant components (small values of 1/I_k in Fig. 11), or even fewer in extreme cases (the smallest λk's). Such eigenvectors describe strong correlations or anticorrelations between pairs or small groups of stocks.


Fig. 12. Probability density of the eigenvalues of the correlation matrix C constructed for the 1000 largest American companies (histogram), together with the fitted Marčenko–Pastur distribution (dashed line). The four largest eigenvalues λ1, ..., λ4 are located beyond the horizontal axis range. Inset: an analogous distribution for the matrix Crand created from surrogates.

A correlation structure that leads to a matrix given by Eq. (47) can be expressed by a simplified model [171], in which for any Δt a return of a stock α is given by a sum of two components:

r_\alpha(t_i) = f_{l_\alpha}(t_i) + \epsilon_\alpha(t_i),    (48)

where f_{l_α} and ϵ_α denote, respectively, a market-sector component (if the company belongs to a sector l_α) and an individual component. Both components are governed by random variables with variances defined by the formulae:

\mathrm{var}(f_{l_\alpha}) = \left( 1 + (\varepsilon_{l_\alpha})^2 \right)^{-1}, \qquad \mathrm{var}(\epsilon_\alpha) = (\varepsilon_{l_\alpha})^2 \left( 1 + (\varepsilon_{l_\alpha})^2 \right)^{-1}.    (49)

In a more subtle, hierarchical version of this model, each degree of freedom is subject to the action of L factors with different strengths [172]:

r_\alpha(t_i) = \sum_{l=1}^{L} \gamma_{\alpha l} f_l(t_i) + \eta_\alpha \epsilon_\alpha(t_i),    (50)

where γ_{αl} describes the strength of a factor l (different for different degrees of freedom α), \eta_\alpha = \left[ 1 - \sum_{l=1}^{L} \gamma_{\alpha l}^2 \right]^{1/2} is the strength of the individual component, while f_l and ϵ_α are i.i.d. random variables. The hierarchical model has a more realistic structure, resembling the structure of the empirical matrix C.
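A direct simulation of the one-factor sector model (48)-(49) reproduces the spectral structure of Eq. (47); in the sketch below (the sector count, sizes and the parameter ε are our arbitrary illustrative choices), one repelled eigenvalue per sector emerges above the Wishart bulk:

    import numpy as np

    rng = np.random.default_rng(7)
    N, T, n_sectors = 120, 2000, 4
    eps = 1.5                                   # same epsilon_l for every sector
    sector = np.repeat(np.arange(n_sectors), N // n_sectors)

    f = rng.standard_normal((n_sectors, T)) * np.sqrt(1 / (1 + eps**2))       # Eq. (49)
    noise = rng.standard_normal((N, T)) * np.sqrt(eps**2 / (1 + eps**2))
    r = f[sector] + noise                       # Eq. (48)

    X = (r - r.mean(axis=1, keepdims=True)) / r.std(axis=1, keepdims=True)
    lam = np.linalg.eigvalsh(X @ X.T / T)[::-1]
    print(lam[:6])    # n_sectors eigenvalues repelled above the Wishart bulk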

3.2.3. Information masked by noise

One of the practical problems occurring in the analysis of empirical data by means of correlation matrices is whether the part of the eigenvalue spectrum of C which is localized in the interval λmin ≤ λ ≤ λmax is universal and has to be treated as pure noise, or whether it contains some extra information about possible non-trivial correlations present in the studied data. This is especially important in the case of financial markets, where each eigenstate of the correlation matrix can be considered a possible realization of an investment portfolio P_k with a return given by:

P_k^{\Delta t}(t_i) = \sum_{\alpha=1}^{N} v_\alpha^{(k)} r_\alpha^{\Delta t}(t_i),    (51)

where each weight corresponds to the fraction of the total capital invested in an asset α. The risk of the portfolio P_k is determined by the associated eigenvalue λ_k:

\mathrm{Risk}(P_k) = \mathrm{var}(P_k) = \left[ \mathbf{v}^{(k)} \right]^{T} \mathbf{C}\, \mathbf{v}^{(k)} = \lambda_k.    (52)
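The identity (52) can be verified numerically for any data set; the sketch below does so for Gaussian surrogate returns, forming the eigenportfolio of Eq. (51) and comparing its sample variance with the associated eigenvalue:

    import numpy as np

    rng = np.random.default_rng(8)
    N, T = 50, 2000
    r = rng.standard_normal((N, T))                     # surrogate returns
    X = (r - r.mean(axis=1, keepdims=True)) / r.std(axis=1, keepdims=True)
    C = X @ X.T / T

    lam, V = np.linalg.eigh(C)
    k = 0                                               # smallest-risk eigenportfolio
    P = V[:, k] @ X                                     # Eq. (51)
    print(P.var(), lam[k])                              # Eq. (52): var(P_k) = lambda_k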

According to optimal portfolio theory [173], the risk-minimizing portfolios correspond to small eigenvalues which, from the RMT point of view, do not offer any genuine information. In our example, the eigenvector component distributions, which for 101 ≤ k ≤ 900 are well modelled by the Porter–Thomas distribution (Fig. 10b), argue for the pure-noise paradigm. It seems, however, that if the spectrum of C cannot be expressed by the Marčenko–Pastur distribution (as in our case here), the Wishart sea may contain something more than pure noise. A related indication is that the width of the spectrum of W, equal to λmax − λmin, depends on the signal length: the longer a signal is, the narrower the spectrum. Owing to this, if some eigenvalues of the ''true'' correlation matrix reside in the Wishart range of random eigenvalues but are masked by noise, they can be discovered by extending the observation time. Indeed, by manipulating the value of the parameter Q,


Fig. 13. Relation between the number of eigenvalues of the double-filtered correlation matrix C″w (see text for details) in the Wishart range [λmin, λmax] (white vertical bar fields) and the number of eigenvalues above or below that range (upper and lower dark fields, respectively) for different values of Q = Tw/N. For several choices of Tw, time series of stock returns with Δt = 5 min for the N = 100 largest American companies were divided into disjoint windows of length Tw and the results were averaged over all the windows. The maximum value of Qw = 406 corresponds to the largest window possible (Tw = T). The eigenvalues were ordered according to their index k.

which can be done by dividing the whole signals of length T into windows of length Tw ≤ T, one can show that the larger Q = Tw/N is, the more eigenvalues emerge from the Wishart range [174]. We illustrate this point with time series of stock returns for the 100 largest American companies. We chose Δt = 5 min, which gave a set of signals of length T = 40,680. For a few different choices of Tw, we divided these signals into windows, and in each window we calculated the correlation matrix Cw. Then the eigenvalue spectrum of Cw was averaged over all the windows. Such an averaged spectrum for a given value of Tw can be compared with the spectrum for the complete signals of length T. By treating Tw as a variable, we could investigate the number of eigenvalues situated outside the Wishart range as a function of Tw. In order to improve the correspondence between the empirical spectra and the theoretical range for the Wishart matrices, [λmin, λmax], it is recommended to apply the standard procedure of removing from each signal the most collective component, corresponding to λ1. This component, which we denote by Z1, can be expressed as a linear superposition of the original time series of returns rα (where the index Δt has been omitted for simplicity) in each window separately:

Z_1(t_j) = \sum_{\alpha=1}^{N} v_\alpha^{(1)} r_\alpha(t_j), \qquad j = 1, \ldots, T_w.    (53)

This collective component can be removed by least-squares fitting it to each signal rα:

r_\alpha(t_j) = a_\alpha + b_\alpha Z_1(t_j) + \varepsilon_\alpha(t_j),    (54)

where aα, bα are the parameters to be found, and εα is a residual signal orthogonal to Z1.
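The removal procedure (53)-(54) amounts to a one-factor regression; the sketch below applies it to surrogate signals sharing a single common mode (coupling strength and sizes are arbitrary) and shows the collective eigenvalue collapsing back towards the bulk:

    import numpy as np

    def standardize(x):
        return (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, keepdims=True)

    def remove_mode(X, v):
        # Z1 of Eq. (53) and the least-squares fit of Eq. (54);
        # for standardized, zero-mean signals the intercepts a_alpha vanish
        Z = v @ X
        b = (X @ Z) / (Z @ Z)              # slopes b_alpha for every signal
        return X - np.outer(b, Z)          # residuals epsilon_alpha

    rng = np.random.default_rng(9)
    N, T = 60, 1500
    market = rng.standard_normal(T)
    X = standardize(0.5 * market + rng.standard_normal((N, T)))

    C = X @ X.T / T
    lam, V = np.linalg.eigh(C)
    res = standardize(remove_mode(X, V[:, -1]))        # v^(1): largest eigenvalue
    lam_res = np.linalg.eigvalsh(res @ res.T / T)
    print(lam[-1], lam_res[-1])                        # collective mode removed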

From these residual signals, we constructed the residual correlation matrices C′w of rank N − 1, whose spectrum does not contain the most collective eigenvalue. However, since this spectrum, averaged over all the windows, still deviates significantly from the spectrum of a Wishart matrix and still contains a distant eigenvalue (now corresponding to λ2 of Cw), the above procedure (54) can be applied once again in order to remove the component Z2. In effect, we obtain the matrix C″w of rank N − 2, whose spectrum does not reveal any distant eigenvalues separated by a large gap. Fig. 13 shows the dependence of the number of eigenvalues outside the Wishart range (i.e., those satisfying the condition λk ∉ [λmin, λmax]) on the parameter Qw. This number grows with increasing Tw, as more and more eigenvalues leave the Wishart range. This outcome supports our hypothesis that the Wishart sea can mask some nonuniversal eigenvalues of the correlation matrix C. From the theoretical perspective, it is possible to calculate the eigenvalue spectrum of an estimator of the ''true'' correlation matrix if one makes assumptions about this spectrum. This procedure may be described as ''noise-dressing'' the spectrum of the ''true'' correlation matrix in order to obtain a spectrum that resembles the empirical one [175,176].

3.2.4. Temporal scales of coupling formation

The above discussion of the correlation matrix properties referred to the situation in which the time series of returns were created for a specific time scale (Δt = 15 min). That analysis can be naturally extended if we consider Δt a variable. It is then possible to investigate the stability of the results across different time scales. The sampling frequency plays an important role because it remains in a steady relation to the time scales characteristic for the market (provided such time scales do exist). Therefore, one may expect the structure of C to be Δt-dependent.


Fig. 14. Eigenvalue spectra of the correlation matrix C as functions of the time scale Δt for the 100 largest American companies (top), and for the 30 German stocks listed in DAX (bottom). The special scales of Δt = 390 min (New York) and Δt = 510 min (Frankfurt) are equivalent to one trading day on the respective markets. For longer scales (Δt ≥ 30 min) both spectra were averaged over several realizations of the signals, phase-shifted by τΔt, where τ = 1, 2, .... Saturation of λ1(Δt) can be seen in both spectra for sufficiently long time scales. The dark zone at the bottom of each plot marks the Wishart range.

Price movements of a given asset, triggered by news releases and other information arriving at the market, as well as by price movements of other assets, take place at random moments determined by the arriving buy and sell orders. The orders themselves also differ in price and volume. This is why noise has to dominate at the shortest time scales, of the order of a fraction of a second. However, as time passes, this picture changes: the investors manage to assimilate the information, to notice the price movements, and to make the relevant decisions about buying or selling. Different investors have different reaction times. Typically, the largest institutional investors are able to react fastest, as they have access to data of better quality and use automatic strategies and computer-controlled trading systems allowing them to make decisions within milliseconds. By contrast, individual investors react with a larger delay of seconds to minutes, or even hours or days if they do not follow the situation continuously. Because the largest orders, which determine the direction of price movements, are placed by the institutional investors, one may expect that the most fundamental couplings among the degrees of freedom are built up at relatively short time scales, closer to minutes than hours [177,178].

Fig. 14 shows the eigenvalue spectra of the correlation matrices as functions of Δt for two groups of stocks: the 100 American stocks with the largest capitalization (Group A-100, top panel) and the 30 German stocks listed in DAX (Group G-30, bottom panel). For both groups, we first of all see a strong dependence of λ1 on the time scale. Two regimes can be distinguished: for Δt < 15 min, a relatively strong increase of λ1 with increasing Δt can be observed, while for Δt > 15 min the rate of this increase gradually slows, so that for Δt ≥ 60 min (Group G-30) or Δt ≥ 240 min (Group A-100) it reaches a saturation level. A further increase of Δt does not produce any significant increase of λ1. A similar analysis (not shown here) based on daily data from the same markets indicated that the saturation level remains stable up to the longest considered scale of 16 trading days. However, a more recent study based on the wavelet transform showed that, for time scales longer than 2 days, the strength of the coupling starts to decrease after reaching an interim saturation level [179].


Fig. 15. Participation ratio 1/I1 calculated for the components of the eigenvector associated with the largest eigenvalue λ1 as a function of the time scale Δt for the 100 American companies with the largest capitalization. The same data set as in Figs. 13-14 was used.

The eigenspectra presented in Fig. 14 differ from each other in the magnitudes of the second and subsequent eigenvalues, which in the case of A-100 grow systematically with increasing Δt up to the longest time scale of 780 min. This suggests that after the saturation of the market-wide coupling represented by λ1, less collective couplings among subgroups of stocks can still develop. On the other hand, in the case of G-30, such an effect, if it exists at all, is much weaker and restricted only to λ2, with a magnitude comparable to λmax. This means that the stocks of the largest German companies are intensively coupled together as a whole, but they do not reveal any clear sector-like structure. A separate interesting observation is that at the shortest, sub-minute time scales, the G-30 stocks are more strongly correlated with each other than the A-100 stocks are. This manifests itself in a wider eigenvalue gap between λ1 and the rest of the spectrum in the case of G-30, even though this group has over three times fewer degrees of freedom than A-100. This means that correlations between the stocks of G-30 build up faster, which is consistent with this group reaching the saturation level of λ1 at shorter Δt. It seems that the origin of this effect should be sought in the larger sensitivity of the German market to the influence of the American market. Accordingly, the investors operating on the German market consider predominantly one factor, i.e., the American indices, without looking into the details of the actual situation on the German market, while their colleagues operating in the United States pay attention to more diversified sources. The fact that German stocks are more strongly correlated than American ones was also noticed earlier in an analysis of the temporal stability of correlations [152].

The growth of λ1(Δt) can proceed in two different ways: a gradual increase of the global coupling strength, or the quick formation of a collective core consisting of a small number of stocks which is then joined by other stocks. To decide which of these options is more probable here, Fig. 15 shows the dependence of 1/I1 (the participation ratio for the eigenvector associated with λ1) on Δt for the A-100 group of stocks. This quantity slowly oscillates around its mean ⟨1/I1(Δt)⟩ ≈ 83, with a pronounced minimum at Δt = 1 min, a growth for 1 < Δt < 15 min, and a mild monotonic decrease for Δt > 15 min. This behaviour of 1/I1(Δt) suggests that the stocks are globally coupled already at very short time scales of the order of seconds, while the weak dependence on Δt may indicate that, for A-100, there is no central core of a few stocks which would gradually aggregate the remaining degrees of freedom. The slight variation of 1/I1(Δt) probably stems from some aggregation of stocks for Δt ≤ 15 min, which is obscured at the shortest scales by the existence of strong correlations among the stocks in the overnight returns (connecting the closing price on a given trading day with the opening price on the next trading day). Although the absolute amplitude of these returns is the same at different time scales, their relative amplitude is much larger at short scales, implying that the overnight returns contribute more to the correlation coefficients (i.e., to the matrix elements) than the regular returns do.
It is known that on each stock market there is a hierarchy of companies, at the top of which are those with the largest market value, whose stocks are responsible for the largest traded volume. One may ask at this point whether capitalization has any influence on the speed of coupling formation and saturation. An answer to this question is given by Fig. 16, which displays the dependence of the global coupling strength, expressed by λ_1^{K_l}(Δt), on the mean capitalization K_l of the companies whose stocks belong to a group l. To be able to make comparisons, we selected four equivalent sets of N = 30 American stocks. Each set contained stocks with comparable capitalization, but between successive sets the capitalization increased by one order of magnitude. The associated plots in Fig. 16 clearly show that the couplings form faster the higher the mean capitalization K_l is. Except for the largest companies (K_α ∼ 10^{11} USD, Group K1), for which λ1(Δt) saturates at a level of ≈ 11 for Δt ≥ 60 min, the remaining sets of stocks, with systematically smaller λ1(Δt), do not show any clear saturation at scales up to Δt = 780 min. A hint that saturation is nevertheless possible even for the less-capitalized companies, though at time scales longer than those considered here, can be seen in the case of K_α ∼ 10^{10} USD (Group K2). However, the plots for the last two sets of stocks, with capitalizations K_α ∼ 10^9 USD (Group K3) and K_α ∼ 10^8 USD (Group K4), show an unperturbed power-law increase of λ1(Δt). In addition, although


Fig. 16. The largest eigenvalue of C as a function of the time scale Δt for a few groups of stocks, each consisting of 30 stocks representing companies with comparable market capitalization (solid lines). The dashed line denotes the upper eigenvalue bound λmax for the Wishart matrix ensemble given by the Marčenko–Pastur distribution.

Fig. 17. Correlation coefficient calculated for time series of stock returns with Δt = 1 min representing Ford (F) and General Motors (GM) in two different periods of time: 1971 and 1998–1999. Source: Data from 1971 are used after [180].

for Δt < 1 min small collective couplings can be observed for K1-K3, the stocks from K4 show statistically insignificant correlations. By comparing the values of λ1 for the two extreme sets, one can notice that the correlation strength of K4 at the time scale Δt = 780 min (2 trading days) is roughly the same as the correlation strength of K1 at Δt = 100 s. Taking into consideration the fact that on NYSE there are companies with capitalization even 100 times smaller than K4, and assuming that the same regularity holds for such companies, we obtain a huge dispersion of the speed of coupling formation between stock groups of different capitalization. The origin of the above-discussed increase of the overall couplings with increasing Δt, known as the Epps effect [180], lies predominantly in the lagged correlations among different stocks. Recent studies have shown that the rationale behind this effect is that investors have different reaction times to events taking place in the market or beyond it [181]. Although the dependence of the largest eigenvalue on capitalization might suggest that the reason for the different growth rates of λ1 for different groups of stocks lies in their transaction frequencies (the highly capitalized stocks are at the same time among the most liquid ones, while the small companies are only occasionally in the focus of the investors' attention) [178,182], recent detailed analyses indicated that liquidity has a rather insignificant influence [181]. In this context, it now seems that the variation of λ1 for large and small companies cannot be a consequence of the variation of the effective market time, whose stock-specific lapse might be paced by the transactions. The Epps effect is by no means a stable phenomenon. It has its own dynamics, which can be revealed by comparing the time scales at which the saturation of the market correlations took place in different years. Results of such an analysis are presented in Fig. 17, where values of the correlation coefficient C_{F,GM}(Δt) for time series of the stock returns of Ford (F) and General Motors (GM) are plotted as a function of Δt for the year 1971 and for the years 1998-1999. It turns out that, at a given time scale, the correlation strength in the data recorded many years ago is significantly smaller than the same quantity in the data recorded more recently. For example, the value which was reached by C_{F,GM} in 1971


Fig. 18. The rescaled (or normalized) eigenvalue spectra λ/N (ladders) and the rescaled participation ratio 1/(I1 N) (squares) for correlation matrices of different size N constructed from time series of stock returns with fixed Δt = 15 min. N denotes the number of companies taken from the American market, starting from the one with the highest capitalization. The narrow dark staircase-like line of variable width seen at the bottom of the figure denotes the (appropriately rescaled) Wishart range.

for Δt = 10 min, in 1998-1999 was reached already for Δt = 1 min. The same effect can be observed for other pairs of stocks for which historical results are available. This outcome proves that over the years the investors' reaction time underwent a considerable acceleration, which seems natural in the light of the technological advancements that, in the context of trading, were indeed significant.

The larger the group of stocks used to create the correlation matrix, the more likely it is that companies of incomparable sizes enter this group. As a result, the stocks of the largest companies are more strongly correlated among themselves than with the stocks of smaller companies, and than the latter are among themselves. This may lead to a decrease of the mean global correlation level with respect to the situation in which all the companies are large. That this actually happens can be seen in Fig. 18, in which the normalized eigenvalue spectra λ/N and the normalized participation ratio 1/(I1 N) are plotted against the number of companies used to construct the matrix C. The time scale of the returns was fixed at 15 min. The companies were first ranked by their capitalization and then taken one by one starting from the top, so that the larger N is, the more diverse the resulting group. For N = 10, only the largest companies, with Kα > 200 billion USD, were considered, while for N = 1000 the smallest of the companies were worth no more than ca. 700 million USD (i.e., they are situated between the K3 and K4 groups). The outcomes show that for N ≤ 100 about 90% of the stocks enter the global collective state associated with λ1, and 1/(I1 N) depends weakly on the group size N. This picture changes quickly for N > 100, and at N = 1000 the normalized participation ratio falls below 60%. From this perspective, the previously considered (and rejected) possibility of the existence of a strongly coupled market core and weakly linked peripheries is now able to account for the obtained results. These two outcomes do not contradict each other, however, because the set considered earlier consisted of 100 stocks, i.e., it resided in the region of the almost fully correlated market. On the level of the normalized eigenvalues, in Fig. 18 we observe a monotonic decline of λ1/N from 0.43 (N = 10) to 0.08 (N = 1000). This illustrates the fact that a market consisting of a large number of companies of different sizes is much ''softer'' than the relatively ''rigid'' core built of a few large companies. Such ''softness'' allows for more diverse behaviour of the stocks and for greater freedom in grouping the stocks into clusters (sectors) or, on the contrary, in remaining outside the mainstream market dynamics. The speed of information spreading is also related to this: among the large companies constituting the core, information is distributed fast, which results in instant and strong couplings, while the peripheries remain unaffected by this information for a much longer time. Even after the information finally spreads over the small companies, which takes a long time, it is already polluted with noise and cannot produce strong correlations. The stock market is thus a system in which, under typical conditions, neither collectivity nor noise dominates; both phenomena remain in dynamical balance with each other. As has already been pointed out in Section 1.3, this is one of the manifestations of complexity.

3.3. Collective effects in the currency market

Another example of a system with many degrees of freedom expressed by time series which can be analysed by the matrix methods is the currency market. The analysis presented in this section is based on daily quotations of the exchange rates of 59 currencies and 3 precious metals (gold, silver and platinum) covering the period 1999–2008. The set of currencies comprises all the major floating currencies and some floating or pegged currencies of lesser importance, as well as some


partially convertible currencies. Taking the precious metals into account in the context of the currency market stems from two reasons. First, the precious metals have strong historical links with the monetary system. Over centuries, the circulating coins comprised strictly defined amounts of gold or silver. Later on, such coins were replaced by paper banknotes convertible to gold in a predefined ratio warranted by national central banks, which was commonplace during the 19th century and the first half of the 20th century, and which partially survived until the 1970s in the Bretton Woods system. Despite the fact that currently all countries rely on a system of fiat money, no longer related to gold or any other precious metal, the value of gold is still viewed as stable and not subject to inflation, due to the roughly constant reserves of gold in the world. Therefore, in uncertain periods like wars or economic crises, gold can play, and frequently does play, the role of an alternative currency. Second, unlike stocks and commodities, whose value can be priced in local money, the proper currencies do not have such an independent reference frame, and their value can be expressed only in other currencies. In this context, gold and the other precious metals can offer a kind of such reference.

3.3.1. Complete multi-dimensional market structure

Like stock prices, the exchange rates can exhibit strong trends, which makes the application of statistical methods of data analysis unreliable. For this reason, it is convenient to analyse the exchange rate fluctuations instead of the exchange rates themselves. If the exchange rate of two currencies B and X is defined as the number of units of X that are equivalent in value to one unit of B (known as the base currency), then this rate may be denoted by B/X. Its logarithmic return at a time scale Δt is given by:

r_{B/X}^{\Delta t}(t_i) = \ln[B/X(t_i)] - \ln[B/X(t_{i-1})], \qquad t_i - t_{i-1} = \Delta t, \qquad i = 1, \ldots, T,    (55)

in full analogy to Eq. (45). Given a set of N = 62 currencies, one can construct N(N − 1) = 3782 combinations of the exchange rates. With the trivial relation:

X/Y(t_i) = \frac{1}{Y/X(t_i)} \;\longrightarrow\; r_{X/Y}^{\Delta t}(t_i) + r_{Y/X}^{\Delta t}(t_i) = 0,    (56)

where transaction costs have been neglected, this gives N(N − 1)/2 = 1891 exchange rates of unique currency pairs. The time series length T = 2519 is the same in each case. Based on these time series, treated as the market degrees of freedom, the correlation matrix C of size N(N − 1)/2 × N(N − 1)/2 can be derived, whose elements are the correlation coefficients between pairs of the exchange rates. It should be noted at once that the matrix rank is at most N − 1 due to the triangle rule (25) being in force, which for the returns takes the following form:

r_{X/Y}^{\Delta t}(t_i) + r_{Y/Z}^{\Delta t}(t_i) + r_{Z/X}^{\Delta t}(t_i) = 0.    (57)
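The constraints (56)-(57) are exact identities for log-returns, as the following sketch demonstrates on synthetic data (the currency names and the random-walk log-prices are purely illustrative):

    import numpy as np

    rng = np.random.default_rng(10)
    currencies = ["EUR", "JPY", "GBP"]
    T = 500

    # log-value of each currency against an arbitrary common numeraire
    log_p = {c: np.cumsum(0.01 * rng.standard_normal(T)) for c in currencies}

    def rate_returns(x, y):
        # returns of the cross-rate X/Y, Eq. (55)
        return np.diff(log_p[x] - log_p[y])

    r_xy = rate_returns("EUR", "JPY")
    r_yx = rate_returns("JPY", "EUR")
    r_yz = rate_returns("JPY", "GBP")
    r_zx = rate_returns("GBP", "EUR")
    print(np.allclose(r_xy + r_yx, 0))          # Eq. (56)
    print(np.allclose(r_xy + r_yz + r_zx, 0))   # Eq. (57), the triangle rule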

The distribution of the matrix elements of C is presented in Fig. 19 (darker symbols). Clearly, it deviates from its counterpart for the Wishart matrices by having leptokurtic tails which span the whole allowed range −1 ≤ Cαβ ≤ 1. This means that some degrees of freedom evolve in full coherence among themselves, effectively forming a single degree, whereas other degrees show negative correlations, some of them even in full antiphase with respect to each other. In both cases, one observes a strong reduction of the matrix rank. In order to eliminate such degeneracies, all the currencies which are pegged to some other currency by decision of the respective central banks, and which therefore do not have their own dynamics, were removed from our set. As a result of this selection, only the independent currencies and gold survived, forming a subset I with 38 degrees of freedom. Not only some marginal currencies were removed in this way, but also a few floating ones, like the Danish krone (DKK), linked to the euro (EUR) via the ERM2 mechanism, the Hong Kong dollar (HKD), dependent on USD through the so-called linked exchange rate system, the Singaporean dollar (SGD), whose rate is defined against a basket of major currencies, and the Hungarian forint (HUF), with a fixed parity with respect to EUR. The correlation matrix Cind calculated for this subset of independent currencies has size 703 × 703, and the distribution of its elements is presented in Fig. 19 (lighter symbols). In its essential part, this distribution overlaps with the distribution for the complete set of currencies, but it possesses considerably fewer extreme elements ≈ ±1. These extreme elements are a manifestation of strong couplings that reduce the matrix rank. Nevertheless, a number of almost-coherent degrees of freedom can still be found in the market, which means that many null eigenvalues should be expected in the eigenspectrum of Cind.

3.3.2. Market structure for a given base currency

Because of the strong degeneracy of the matrix Cind, studying the currency market in its complete representation by the matrix method is not recommended. It is far more convenient to analyse the structure of the market from the perspective of a fixed base currency. Fixing the base entails a significant reduction of dimensionality, since in this case the total number of degrees of freedom B/X that are simultaneously considered equals N − 1 = 37. This is also exactly the dimension of the relevant correlation matrix. By changing the base currency B, a whole family of N = 38 matrices CB can be created. This allows one to investigate how the structure of the global currency market looks from the point of view of each base currency taken from the set I, as well as to identify any collective components of this market [183,184]. In such a case, it is


Fig. 19. Distribution of the correlation matrix elements (darker symbols) created from time series of the exchange rate returns representing N = 62 currencies and N(N − 1)/2 = 1891 exchange rates determining the size of C, together with the distribution of the elements of the matrix Cind (lighter symbols) of size N′(N′ − 1)/2 = 703 created for the subset of independent currencies (N′ = 38). All the time series have length T = 2519.

Fig. 20. Distributions of the matrix elements for CB for several base currencies and an artificial currency ‘‘rnd’’ whose exchange rate USD/rnd is represented by a random walk. In each case, vertical line denotes the mean matrix element of the corresponding CB .

also possible to change the notation to a more intuitive one: instead of speaking about correlations between the exchange rates B/X and B/Y, one may speak about correlations between the currencies X and Y from the perspective of B.

Distributions of the correlation matrix elements for several characteristic choices of B are presented in Fig. 20, ordered according to the increasing mean element value ⟨C_{XY}^B⟩. The variation of this value between the base currencies is substantial and fits into the interval from 0.16 for the US dollar (B = USD) to 0.85 for the Ghanaian cedi (B = GHS). Collating these mean values with the values of the participation ratio 1/I_1^B corresponding to the most collective eigenvector v_B^(1) proves that this eigenvector is highly collective in the case of base currencies characterized by a large value of ⟨C_{XY}^B⟩ (for example, 1/I_1^{GHS} = 36.5) and less collective in the opposite case (e.g., 1/I_1^{USD} = 17.3). This is by no means a surprising result, because selecting a base currency is equivalent to attaching the reference frame to this currency. Thus, choosing a currency which is weakly coupled with the rest of the market (GHS, for instance) leads to a situation in which the whole market, expressed in this base currency, behaves as a strongly coupled, ''almost-rigid'' system with high values of both the mean matrix element and the participation ratio 1/I_1^B. The picture changes completely if one chooses the most important currency, USD, as the base. Many countries (for example, those in the Far East and Latin America) have strong economic ties with the United States, and this can be reflected in a strong coupling of their currencies with the US dollar. From the perspective of a non-USD base currency, such currencies are satellites of USD. Now, if the reference frame is attached to USD, its satellites become apparently independent, implying that the mean element of the matrix C^{USD} becomes small. Besides the currencies from the set I, Fig. 20 shows the distribution of matrix elements for the artificial currency ''rnd''. We assumed that the returns of its exchange rate with respect to USD are given by a stochastic process with a typical


Fig. 21. Eigenvalue spectra of CB for exemplary choices of the base currency B, including gold (XAU) and the artificial currency ''rnd''. (Vertical axis: eigenvalues λk; horizontal axis: base currency.)

Fig. 22. The largest eigenvalue λ_1^B for all the 38 currencies from the set I. The currencies have been partitioned into four baskets: A∗, A, B, C (see text for details). The smaller the value of λ_1^B, the more central the role played by the base currency B. The artificial currency ''rnd'' was included to serve as a reference.

distribution of fluctuations (actually, a randomized time series of the USD/PLN rate returns). By construction, this currency is completely independent and forms another reference, aside from gold, for the proper currencies. It differs from gold in that, although its fluctuations are independent, its evolution with respect to other currencies is unlikely to reveal significant trends (because of its zero autocorrelation).

The eigenvalue spectra of the matrices CB plotted in Fig. 21 show a single repelled eigenvalue λ_1^B and the corresponding eigenvalue gap. In some cases, one can find another collective state associated with λ_2^B and, rarely, a few states associated with further eigenvalues. In accordance with the model properties of the correlated Wishart matrices and with empirical observations (e.g., [152,167]), the magnitude of λ_1^B is related to ⟨C_{XY}^B⟩, as can be seen by comparing Figs. 20 and 21. In parallel with the mean matrix element, the largest eigenvalue describes the decorrelation (or degree of independence) of the base currency with respect to the rest of the market. In other words, assuming that, from the perspective of the market structure, the central currencies which have their satellites are more important than the peripheral ones, the magnitude of λ_1^B may be considered a measure of the significance of B. The rationale behind this assumption is that currencies with independent evolution, even if associated with a big national economy (like the Japanese yen, JPY), do not play a key role in the global market, and the market could function equally well without them.

Fig. 22 contains the complete information on the largest eigenvalue of CB for all possible choices of B from the set I. In order to make the plot more transparent, the currencies were assigned to baskets according to the liquidity criterion widely used by practitioners. The basket A∗ contains the major, most liquid currencies; the basket A, other liquid currencies; the basket B, the currencies with poorer liquidity but nevertheless traded without problems; and the basket C, the partially convertible illiquid ones, which are traded in some indirect way (for instance, via non-deliverable forward contracts). It follows from Fig. 22 that, apart from the liquidity criterion, the following three main currency groups can be distinguished. The first group is formed by the currencies which do not show a gap between λ1 and the rest of the spectrum; in this group λ_1^B ≤ 10, and USD together with a few less important currencies can be assigned here. The second group, the most numerous one, is a mixture of currencies from all the baskets, and its characteristic property is that it comprises


Fig. 23. The largest eigenvalue λB1 for the dependent currencies, pegged to one of the major currencies. The rectangular areas denote the regions in which currencies depend on USD or EUR.

all the basket-A∗ currencies except for USD. In this case, the largest eigenvalue is moderate: 12 ≤ λB1 ≤ 22. The third group contains currencies from the baskets A–C and the largest eigenvalue for the currencies from this group is especially prominent: λB1 ≥ 25. When analysing positions of particular currencies in the plot, it is worthwhile to notice first that the low position of the US dollar (λUSD ≈ 9.2) is completely justified by its fundamental role in the world’s currency system. A similar position of the 1 Taiwanese dollar (TWD) can be explained by the strong economic and political links between Taiwan and the United States. On the other hand, positions of the Moroccan dirham (MAD) and the Tunisian dinar (TND) is more difficult to account for. In the former case, the weak collectivity of λMAD may originate from the fact that Moroccan economy is strongly connected 1 with both the United States and the European Union. This implies that MAD is related to both USD and EUR, the principal poles of the global market. Since, in addition, MAD does not reveal any significant individual dynamics which might have decoupled it from the rest of the market, the market viewed from the perspective of MAD may not be strongly collective but rather fragmented into the parts related to EUR and USD (see also Section 8.2.2). As regards TND, it is strongly coupled to MAD and inherits in this way some of its properties. At the opposite end of the variability range of λB1 reside the currencies which are largely decoupled from the market. Among them there are GHS, the Zambian kwacha (ZMK), the Algerian dinar (DZD), the Brazilian real (BRL), and the Turkish lira (TRY). Their common property is that their largest eigenvalue is larger then the one of gold (XAU). At first sight, this may seem to contradict intuition, but actually it is easy to explain this peculiarity if one takes into consideration the fact that during the analysed interval of time all these currencies passed through a period of hyperinflation. Large values of λB1 characterize also the Russian rouble (RUB) and the Indonesian rupiah (IDR) which during this interval were unstable, and the Icelandic krone (ISK), whose value experienced a breakdown in 2008. The last currency that reveals substantially independent evolution is the South African rand (ZAR), which has strong connections with the commodity market, including the precious metals market. Exactly as expected, in the same zone of λB1 , there is also the artificial currency ‘‘rnd’’. The remaining currencies, which do not reveal pathological or eccentric dynamics and which are less important than USD, are situated in the zone of moderate values of λB1 . This zone may thus be considered a zone of typical, healthy currencies. Among them, there is EUR which has relatively low value of λEUR = 13.0, the British pound (GBP), the Swiss franc (CHF), 1 and, slightly higher placed, the Japanese yen (JPY). To complete the picture, in the next figure (Fig. 23), the currencies which do not belong to the set I are considered. Each of these currencies is artificially linked to one of the major currencies by the monetary policy of the relevant central bank, so their dynamics is only a reflection of the dynamics of their superior currency. The calculations were done for the complete set of N = 62 currencies and precious metals, so in order to achieve an approximate correspondence of the scale with Fig. 
all the values of λ_1^B were multiplied by the factor f = 37/61, equal to the ratio between the number of elements in I and in the complete set (minus the base currency in both cases). As can be seen in the plot, such currencies can roughly be partitioned into two groups: the currencies linked to USD, inheriting its central position and low λ_1^B, and the currencies linked to EUR, also inheriting its properties.

Apart from the collective mode associated with the position of a given currency in the global market, in some cases the eigenvalue spectra indicate that other, weaker couplings among the currencies may also exist. In order to determine the properties of these more subtle couplings, it is recommended to remove the already-identified collective components from the original data. This can be done according to Eqs. (53)–(54), by fitting the extracted collective component to the original signals and subsequently subtracting it. However, instead of following this method exactly and removing the component Z_1^B corresponding to the largest eigenvalue λ_1^B, here we remove the two components associated with the a priori known collective sources: USD and EUR. More specifically, from a signal corresponding to a given exchange rate B/X, we consecutively subtract the fitted components being the exchange rates B/Y, where Y = USD or Y = EUR:

r_X^B(t_j) = a_X + b_X r_Y^B(t_j) + ε_X^B(t_j).   (58)
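The consecutive subtraction of the USD and EUR components according to Eq. (58) is an ordinary least-squares regression performed rate by rate. A minimal numerical sketch of this step is given below; it assumes the data sit in plain NumPy arrays, and the names rates, r_usd and r_eur are illustrative, not taken from the original study.

    import numpy as np

    def subtract_component(signals, reference):
        """Regress a reference return series out of each column of
        `signals` (ordinary least squares, cf. Eq. (58)) and return
        the residuals eps_X."""
        T, n = signals.shape
        A = np.column_stack([np.ones(T), reference])  # intercept a_X + slope b_X
        residuals = np.empty_like(signals)
        for k in range(n):
            coef, *_ = np.linalg.lstsq(A, signals[:, k], rcond=None)
            residuals[:, k] = signals[:, k] - A @ coef
        return residuals

    # rates: T x N array of B/X returns; r_usd, r_eur: returns of B/USD, B/EUR
    # eps = subtract_component(subtract_component(rates, r_usd), r_eur)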
Fig. 24. Eigenvalue spectra for the same base currencies as in Fig. 21 after filtering out the contributions from the exchange rates B/USD and B/EUR. The corresponding Wishart range is represented by the shaded horizontal zone in the bottom part of the plot.

If the base currency is USD or EUR, for obvious reasons, we do not subtract the corresponding components USD/USD or EUR/EUR; in such a case, only one exchange rate may be subtracted, i.e., USD/EUR or EUR/USD. The resulting eigenvalue spectra for several selected base currencies, after applying the above procedure and diagonalizing the correlation matrix formed from the residual signals ε_X^B, are displayed in Fig. 24. This figure should be compared with Fig. 21. As can be seen, the global collective eigenstate represented by λ_1^B in Fig. 21 has disappeared here for each currency. We may thus conclude that this eigenstate was related to either USD or EUR, or both. All the residual eigenspectra now show only a single non-random eigenvalue with little collectivity.

Filtering out the most collective component, regardless of whether the filtering was complete or only partial, allows one to look for more subtle features of the forex structure that lead to its clustered form. We apply in this context the method proposed in [185], consisting in filtering the residual correlation matrix C_res^B in such a way that only the elements which exceed a predefined threshold value p survive, while all the smaller elements are replaced by zeros. In the next step, the parameter p is varied, decreasing from 1 down to 0. For each of its considered values, the number L of clusters in C_res,filt^B is counted, so that we obtain a functional dependence of L on p. A cluster is defined here as a group of the exchange rates B/X, with B a common base currency, for which C_XY > p. For p = 1 there is no cluster, while for a sufficiently small value of p there is exactly one cluster comprising all the currencies. Thus, for some value p = p_c between these extremes, a maximum number of clusters can be identified. If the clusters are stable in some small vicinity of p_c, we may see the most subtle coupling structure of the market (a minimal numerical sketch of this thresholding procedure is given below). An earlier application of this method to a stock market permitted us to study its sector structure in spite of a poor signal-to-noise ratio at the short time scale of Δt = 1 min [186]. By applying this method to the set of 38 independent currencies, we obtained a clear sector-like structure of the market. For each of the 8 analysed base currencies (Fig. 24), 6 small groups of coupled currencies can be identified: (1) AUD–CAD–NZD, (2) BRL–CLP–MXN, (3) CHF–JPY, (4) CZK–PLN, (5) NOK–SEK, and (6) KRW–TWD. Four of these groups can be assigned to specific geographical regions and, thus, their existence reflects the economic similarity of the respective countries. Next, one group (AUD–CAD–NZD) is linked to the commodity trade (sometimes ZAR can also be included here), while the last one (CHF–JPY) seems rather counter-intuitive and perhaps originates from some specific trading strategies applied during the studied period to the exchange rates related to these currencies. If a parallel analysis for the complete set of 62 currencies is carried out, the resulting cluster structure looks quite similar (Table 1). These outcomes remain in agreement with the ones available in the literature [187–189]. It should be stressed, however, that the structure discussed above is only secondary with respect to the primary cluster structure, in which there are only two centres: USD and EUR.
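The sketch below illustrates the threshold scan under the cluster definition used above (a cluster is a connected group of at least two exchange rates); the function and variable names are hypothetical and the implementation is one of many possible.

    import numpy as np

    def count_clusters(C, p):
        """Number of clusters after zeroing all off-diagonal elements of the
        correlation matrix C with C_XY <= p; isolated series do not count."""
        n = C.shape[0]
        adj = (C > p) & ~np.eye(n, dtype=bool)   # links that survive the threshold
        seen = np.zeros(n, dtype=bool)
        clusters = 0
        for start in range(n):
            if seen[start] or not adj[start].any():
                continue
            stack, size = [start], 0             # walk over one connected component
            while stack:
                v = stack.pop()
                if seen[v]:
                    continue
                seen[v] = True
                size += 1
                stack.extend(np.flatnonzero(adj[v]))
            clusters += size >= 2
        return clusters

    # scan p downwards and locate the maximum of L(p):
    # L = [count_clusters(C_res, p) for p in np.arange(1.0, 0.0, -0.01)]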
3.3.3. Stability of market structure

The empirical correlation matrix C_ind^B is unstable: its eigenstates depend on the time interval for which it was created. This stems from the strongly nonstationary character of the financial data, which implies a flexible structure of the corresponding financial markets [152,167]. In the case of the Forex market, at each moment different currency pairs may be favoured by the investors, the economic situation may be different as well, the central banks may change interest rates, etc. All this may have a strong impact on the correlation strength between pairs of exchange rates, leading to its fluctuations. In effect, the eigenvalue spectrum of the correlation matrix may change completely from one moment to another, and so may the width of the gap between λ_1^B and the rest of the spectrum, the eigenvector structure and, consequently, the sector structure of the market. For this reason, the shape of the eigenvalue spectrum for any period of time is of a statistical nature only (see, for instance, Figs. 9 and 21). Such nonstationary behaviour of the correlation matrix is illustrated in Fig. 25.


Table 1
Secondary cluster structure of the Forex market consisting of 62 currencies, from the perspectives of 6 different base currencies: USD, CHF, GBP, JPY, XAU, and GHS, after filtering out the contributions from the exchange rates B/EUR and B/USD, where B stands for each base currency (details in text). In the original layout, each column represents a base currency and horizontal lines separate different clusters. The clusters identified across the six base-currency columns comprise:
- the commodity group AUD–NZD–CAD, extended by ZAR for some bases or reduced to AUD–NZD;
- the North-African group DZD–MAD–TND (or MAD–TND);
- the Scandinavian group NOK–SEK;
- the Central-European group CZK–HUF–PLN–SKK (with DKK replacing CZK for one base);
- the Latin-American group BRL–CLP–MXN (once with CNY in place of CLP) or CLP–MXN;
- the East-Asian group KRW–JPY–SGD–THB–TWD (joined by CHF for one base, or reduced to KRW–TWD and JPY–SGD);
- the group of currencies pegged to USD: AED–BHD–JOD–KWD–SAR;
- the precious metals XAG–XAU–XPT (or XAG–XAU);
- one broad European cluster CHF–XAF–MAD–TND–GBP–NOK–SEK–CYP–CZK–HUF–PLN–SKK–DKK, appearing for the weakly collective bases.

In Fig. 25, the temporal evolution of λ_1^B(t) is plotted for several base currencies. In this case, the matrix C_ind^B was calculated independently for different positions of a moving window of length 3 months (i.e., T = 60 trading days). The evolution of the cross-currency correlations consists of two main components: a high-frequency one, which may be considered noise, and a slowly varying one, which is responsible for trends (Fig. 25). This division is particularly evident for the market representations based on USD and EUR. During the two years following the introduction of EUR, a clear increase of λ_1^EUR(t) was observed, which elevated its value from ≈ 15 to above 20. This increasing trend of λ_1^EUR(t) was accompanied by a horizontal trend of λ_1^USD(t). The situation changed in the first months of 2001, when a decreasing trend of λ_1^EUR(t) started, associated with a weaker, opposite trend of λ_1^USD(t) during 2002–2005. The decreasing trend of λ_1^EUR(t) ended in 2007. These trends represent a growth of the market collectivity viewed from the perspective of USD and an almost-parallel decline of the collectivity observed from the perspective of EUR. Such opposite trends may be interpreted as a decreasing number of currencies which are strongly coupled to USD and an increasing number of currencies which are strongly coupled to EUR. This can be seen in Fig. 26, where the number of currencies that are more strongly correlated with USD (EUR) than with EUR (USD) was plotted:

M_USD^B = #{X : C_{USD,X}^B > C_{EUR,X}^B},   (59)

M_EUR^B = #{X : C_{USD,X}^B < C_{EUR,X}^B},   (60)

for an exemplary base currency (B = JPY).
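Eqs. (59)–(60) amount to a simple count over two rows of the correlation matrix; a possible one-function implementation (with hypothetical index arguments) could read:

    import numpy as np

    def usd_eur_counts(C, i_usd, i_eur):
        """M_USD and M_EUR of Eqs. (59)-(60): how many currencies X are more
        strongly correlated with USD than with EUR, and vice versa."""
        others = [x for x in range(C.shape[0]) if x not in (i_usd, i_eur)]
        m_usd = sum(C[i_usd, x] > C[i_eur, x] for x in others)
        m_eur = sum(C[i_usd, x] < C[i_eur, x] for x in others)
        return m_usd, m_eur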

This result indicates that, over the years, the euro secured itself as an attractor for third-party currencies, while the position of the US dollar weakened in this respect. At the end of the analysed period (the years 2007–2008), both λ_1^EUR and λ_1^USD had comparable magnitudes. It is difficult to assess whether this behaviour will continue in the future. As regards the other base currencies, Fig. 25b shows a lack of long-term trends in the case of CHF and GBP, and a prominent trend in the case of JPY during the years 2006–2007, which elevated λ_1^JPY up to values characteristic of the currencies with an individual dynamics (like GHS).


Fig. 25. The largest eigenvalue λ_1^B of the correlation matrix C_ind^B as a function of time for a few selected base currencies. The correlation matrix was calculated for each position of a moving window of length 3 months (T = 60 trading days).

Fig. 26. Temporal dependence of the number of currencies that were more strongly correlated with USD than with EUR (M_USD^JPY) and vice versa (M_EUR^JPY) for JPY as the base currency. Results are qualitatively the same for other choices of the base currency.

The analysis whose results were discussed above shows that the currency market has a rather complicated structure demonstrating traits of hierarchic organization both at the level of correlations and at the level of significance of the particular currencies. The former is expressed by the existence of the collective mode that typically comprises the whole market and the existence of more subtle correlations comprising smaller groups of currencies, while the latter is expressed by the number of other currencies that are linked to a given currency. Two kinds of links can be observed here: (1) the artificial pegs introduced unilaterally by the central bank of one of the interested countries because of some economic
reasons (such trivial links lead to obvious couplings between the market degrees of freedom) and (2) the links that stem from the self-organization of the global currency market, which also lead to couplings, but ‘‘softer’’ and non-trivial ones (like in the presented example of MAD) that are subject to continuous modifications (this issue will be discussed in more detail in Section 8.2.2).

4. Repeatable and variable patterns of activity

If one considers any type of regular (i.e., non-chaotic and nonrandom) dynamics, one quickly arrives at the conclusion that this type of dynamics is rather undesirable for a system with an adaptation ability. Regular dynamics is inevitably accompanied by deterministic responses to a given type of external perturbations, which excludes any adaptation. On the other hand, pure randomness is no less undesirable, because in such a case the system cannot elaborate an optimal response to a given perturbation. The most favourable solution, which allows the system to avoid both pitfalls, is to choose some intermediate state situated at the interface of the regular and chaotic dynamics [107].

The application of the matrix method to the identification of couplings among different degrees of freedom, which was the topic of Section 3, does not exhaust all possible applications offered by this method in data analysis. Having collected some empirical data in the form of time series covering a number of equivalent intervals of time, one may consider each time series an individual degree of freedom, even if the analysed series represent an observable associated with a single physical degree of freedom. Attention is paid in this case to the correlations between events happening at equivalent moments of different intervals [190–192]. In this way, it is possible to identify repeatable events, structures or patterns of activity that remain in a stable relation with respect to the onset of each time series. This approach is useful, for example, in studying the system's response to repeatable external perturbations, or in the situation when the system is periodically paced by factors related to its internal organization.

4.1. Financial markets

In the stock market, the basic pacing factor is the strict opening and closing hours that determine the length of a trading day. Even if not all days are trading days and a week may consist of a variable number of trading days, the length of each trading day is almost always the same. The only exceptions from this rule are related to sudden market breakdowns and security threats, when the regulations allow closing the market at arbitrary moments, but such problems are rare. This is why a trading day constitutes the fundamental interval of time in the stock market, implying that some market characteristics can be periodic with a period of 1 day. The observables which well describe the global collective behaviour of a stock market are market indices. For this reason, for the purpose of the present analysis we chose two basic indices: S&P500, representing the American market, and DAX, representing the German one. Both of them are capitalization-weighted sums of stock prices for, respectively, 500 and 30 large companies traded on these markets. We considered time series of index returns r_d^{Δt}(t_j) calculated at the time scale of Δt = 1 min. The subscript d = 1, . . . , N numbers the consecutive trading days and j = 1, . . . , T numbers the consecutive minutes of each day.
In the case of S&P500, we considered time series representing N_A = 390 trading days, while in the case of DAX, time series representing N_G = 510 trading days (starting on May 3rd, 2004 in both cases). These numbers were adjusted so that the number of trading days in each case was equal to the length of a trading day expressed in minutes (T_A = 390 min, T_G = 510 min). Owing to this, for both indices, Q = T_A/N_A = T_G/N_G = 1 and the Marčenko–Pastur formula (33) may be applied here. Based on these data sets, the correlation matrices C^SP500 and C^DAX were constructed, whose elements are distributed as in Fig. 27. Both distributions are leptokurtic, but the distribution for C^DAX is characterized by significantly thicker tails than the one for C^SP500. This means that the correlations and anticorrelations among different trading days are much stronger in the German market than in the American one. The respective eigenvalue spectra are shown in Fig. 28. The earlier observation (Section 3.2) that the temporal evolution of different German stocks is more correlated than the temporal evolution of different American stocks is supplemented here by the result (λ_1^DAX/N_G ≈ 0.23 vs. λ_1^SP500/N_A ≈ 0.09) that different days in the German market resemble each other more than in the case of the American market. In other words, the German market has a more prominent daily pattern of evolution, with repeatable events occurring at precisely defined minutes. Even the next few eigenvalues reveal the different character of both markets: in the case of the American market only λ_2^SP500 is clearly separated from the Wishart-like bulk, while in the case of the German market there are at least 5 such eigenvalues. It is also noteworthy that for both cases there is a good agreement between the bulk and the Wishart range: 0 ≤ λ ≤ 4. Each eigenvalue is associated with an independent component of the intraday market evolution that can be termed a ‘‘principal component’’ or an ‘‘eigensignal’’ of the matrix, defined in parallel to Eq. (53):

Z_k^idx(t_j) = Σ_{d=1}^{N} v_d^{(k)} r_d^{Δt}(t_j),   (61)

where ‘‘idx’’ refers to a particular index. A significant gap between the largest and the second largest eigenvalue indicates that the daily pattern associated with the component Z_1^idx strongly dominates the intraday evolution of both markets. It is shown in Fig. 29. In fact, it comprises the exceptionally high-amplitude index fluctuations taking place within a few minutes immediately after the opening.
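The whole construction — one degree of freedom per trading day, a correlation matrix over days, and the projection (61) — can be sketched compactly as follows (the array name returns is illustrative; rows are days, columns the minutes of a day):

    import numpy as np

    def eigensignals(returns):
        """Diagonalize the day-day correlation matrix of an N x T array of
        intraday returns and project the data onto the eigenvectors,
        cf. Eq. (61).  Eigenvalues are returned in descending order."""
        n, T = returns.shape
        z = returns - returns.mean(axis=1, keepdims=True)
        z /= z.std(axis=1, keepdims=True)
        C = z @ z.T / T                      # N x N correlation matrix
        lam, V = np.linalg.eigh(C)           # eigh returns ascending order
        order = np.argsort(lam)[::-1]
        lam, V = lam[order], V[:, order]
        Z = V.T @ returns                    # Z_k(t) = sum_d v_d^(k) r_d(t)
        return lam, Z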


Fig. 27. Matrix element distributions for the correlation matrices C^SP500 (light symbols) and C^DAX (dark symbols), constructed for the S&P500 and DAX indices. Each matrix element describes the correlation coefficient between the time series of 1-min returns representing two different trading days.

Fig. 28. Eigenspectra of the empirical correlation matrices C^SP500 and C^DAX (vertical bars) together with the Wishart range of random eigenvalues (shaded region). Due to the logarithmic scale, the plots do not show all the small eigenvalues and the whole Wishart range [0, 4].

This is a known behaviour, common to all markets, and it illustrates both the uncertainty as regards the direction in which the market will go during the next few hours and the short-time realization of a large number of diverse orders. The normalized participation ratio 1/(I_1^SP500 N_A) = 0.41 has a noticeably lower value than 1/(I_1^DAX N_G) = 0.51, which suggests that the pattern represented by Z_1^DAX is more frequently repeated than its American counterpart. It should be noted that the returns in particular days may have different, positive and negative signs, but they nevertheless enter the sum (61) with the same fixed signs, because their original signs may be compensated by the signs of the corresponding eigenvector components. Because the eigensignal associated with λ_1 is by definition characterized by the largest variance (see Eq. (52)), a trace of the strong fluctuations related to this eigensignal may also be found in the eigensignals for k ≥ 2. In order to overcome this problem, by means of the method described in Section 3.2 (Eq. (54)), we filtered out the component Z_1^idx(t_j) from the original time series and calculated the new residual correlation matrices C_*^SP500 and C_*^DAX of rank N − 1. The eigensignals corresponding to the several largest eigenvalues λ_{k*} > λ_max of these matrices are presented in Fig. 30. The eigensignals corresponding to S&P500 (Fig. 30a) show an amplified activity between 10:00 and 10:40, with the peak strength at 10:00 or immediately after this moment (as in the eigensignal for k* = 1). The triggering factor is the macroeconomic news released exactly at that time by the US government. From the market point of view, such news may be treated as an external perturbation drawing the market out of an equilibrium state, which is followed by nervous price fluctuations until the market reaches a new state of approximate equilibrium. None of the eigensignals shows events that occur at other moments, which suggests that there do not exist any other factors destabilizing the American market in a periodically repeatable way. The amplitude of the eigensignal fluctuations in Fig. 30a also increases at the end of each trading day, but no sharp structures, comparable to those seen just after the opening, can be identified in the eigensignals. This amplitude increase at the end of trading is related to the closing of positions by the investors who do not want to wait until the next trading day. For comparison, an exemplary random eigensignal, corresponding to an eigenvalue taken from the Wishart range, is presented in the bottom plot of Fig. 30a. As expected, it does not display any structure and can be considered pure noise.
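The normalized participation ratio quoted above follows directly from an eigenvector; a short helper, assuming the standard inverse participation ratio I_k = Σ_d (v_d^{(k)})^4 used earlier in the review:

    import numpy as np

    def normalized_participation(v):
        """1/(I_k N) for an eigenvector v; values close to 1 indicate that
        (almost) all trading days contribute to the pattern."""
        v = np.asarray(v)
        return 1.0 / (np.sum(v ** 4) * len(v))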


Fig. 29. The strongest intraday components of the American (Z_1^SP500) and the German (Z_1^DAX) markets, corresponding to the largest eigenvalue of the respective correlation matrices.

Fig. 30. Repeatable components of intraday market activity represented by the principal components of the correlation matrices: (a) C_*^SP500 and (b) C_*^DAX, for a few largest non-random eigenvalues. These matrices were constructed from the residual signals after removing the principal components Z_1^SP500 and Z_1^DAX according to (54). The component k* = 115 seen in the bottom plot of (a) represents nothing more than noise and resembles an ordinary average taken over all the trading days.

The eigensignals representing the components of the intraday evolution of DAX have a richer structure than their counterparts for S&P500 (Fig. 30b). This proves that more regularities can be found in the German market than in the American one. Among the most characteristic repeatable events are those occurring at 14:30 and 16:00 CET (triggered by the economic news releases in the United States at 8:30 and 10:00 EST, as well as the news from the European Central Bank) and at 16:30 CET (other news releases in America at 10:30 EST). Structures of lesser importance occur at 9:15, 9:30 and 10:00 CET and are related to information releases in Germany and Europe. A source of the last strong fluctuation, at 17:00 CET, is difficult to identify.


Fig. 31. Components of the eigenvector v^(1) of the correlation matrix C^DAX and components of the selected eigenvectors of the matrix C_*^DAX. Each component v_d^{(k*)} corresponds to a single trading day d.

Moreover, one can see the effect of the end of the trading day – the increased amplitude of fluctuations during the last trading hour – similar to the one observed in the American market. In general, the German market represented by DAX is sensitive to the situation in the American market. This transatlantic dependence is easy to find in the intraday pattern of activity.

When identifying repeatable structures in the intraday evolution of the returns, a question arises whether their occurrences are periodic or random. This can be approximately determined by considering the components of the eigenvector v^(k) corresponding to a given eigensignal Z_k. Fig. 31 shows the components of v^(1) for the original matrix C^DAX and for a few selected eigenvectors of C_*^DAX. The eigenvector for k = 1 has components whose absolute values are comparable for the majority of days, confirming the indications of the participation ratio 1/I_1 discussed earlier. On the other hand, the events observed in the eigensignals for k* = 1 and k* = 2 in Fig. 30 are well visible only on some days, associated with large values of |v_d^{(k*)}|. The case of the eigenvector for k* = 3 lies somewhere in between these extremes. The components of the eigenvectors for 1 ≤ k* ≤ 3 do not reveal any trace of periodicity, which was also confirmed by a Fourier analysis of these eigenvectors. The results for S&P500 are qualitatively similar to the results for DAX. It is worth noting that the structures observed in the eigensignals in Fig. 30 are hardly visible in the signals representing individual trading days, because they are masked by the overwhelming noise. The same refers to the signal averaged over all the trading days, due to the different signs of the summed fluctuations. Although some of these structures are also visible in the intraday pattern of volatility, obtained by averaging the intraday time series of absolute returns over all the trading days (or by calculating the variance of the returns representing particular moments of each trading day), it is impossible to distinguish in this way whether the events from different hours are related to each other and whether they are periodic.

The above analysis of the intraday high-frequency time series of returns shows that the temporal evolution of the stock market represented by an index may be viewed as a superposition of a number of components corresponding to noise and a few more regular components associated with repeatable patterns of activity, characteristic for a given market, that occur either almost every day or on some days only. These components are related principally to the amplification of the fluctuation amplitude of the index at some specific moments or specific periods of a trading day, while the signs of the fluctuations may be arbitrary and situation-dependent. The factors triggering the emergence of such components are both the external perturbations of the market, related to the economic news releases, and the internal properties of the dynamics, which are associated with the market self-organization (e.g., the large amplitude of fluctuations after the opening and before the closing). This coexistence of noise and more regular dynamics is the time-domain counterpart of the coexistence of noise and collective modes representing the couplings among the market degrees of freedom.
From this perspective, both effects may be considered different manifestations of the complex structure and dynamics of this system.

4.2. Cerebral cortex

The method of identification of events which remain in a constant temporal relation with respect to a given reference moment can as well be applied to data representing measurements of an observable during a repetitive perturbation of a system under the same conditions. As an example, we consider neurobiological data from a cognitive experiment in which the cortical response to the presentation of complex object images was recorded by means of magnetoencephalography (MEG) and studied [193,194].


4.2.1. Magnetoencephalography

Magnetoencephalography is a noninvasive technique that allows measuring the high-frequency activity of selected regions of the cortex by detecting the magnetic field produced outside the skull by neuronal currents [117–119]. Due to the directions of the neuronal currents and the properties of the magnetic field, only small cortical regions situated in or near fissures are appropriate for this technique. Fortunately, some important parts of the sensory areas are localized in these regions, which makes MEG a very useful tool in experiments aiming at measurements of the activity related to the processing of information about external stimuli. The temporal resolution of MEG is about 1 ms, which is comparable to the resolution of EEG. However, owing to the relatively small distortion of the magnetic field by the skull tissues, MEG is characterized by a much better spatial resolution (up to 5 mm). The magnetic field whose source is the cortex activity has a very low magnetic induction outside the cortex (of the order of 50–500 fT). This implies that the field can be measured exclusively by superconducting detectors (SQUIDs). A typical MEG apparatus comprises from tens to over a hundred such detectors. For this reason, the MEG measurements have to be carried out in magnetically shielded rooms, while all the endogenous fields that are not related to the brain (e.g., those originating from the heart, eyes, etc.) have to be removed by special technical arrangements. In the MEG measurements, the key problem is the identification of the sources of the observed distribution of the magnetic field. This is the so-called inverse problem, which has no unique solution, because a number of different source configurations can produce exactly the same field distribution. Therefore, one needs to use different a priori assumptions and approximations, allowing one to reject unrealistic solutions and to consider only the solution which seems most probable [117]. One of the methods permitting to identify the likely sources is the so-called magnetic field tomography (MFT) [195], in which one looks independently into a ‘‘snapshot’’ distribution of the field at each moment a measurement is done. It is a time-consuming method, but it allows for a reliable estimation of the spatial distribution of sources. The output of MFT are signals representing the total activity of a given cortical region in which a source is localized (Eq. (62)).

4.2.2. The experiment and preprocessing of data

The study which we describe here was a part of a larger cognitive experiment, in which images of complex objects and faces were shown to a subject whose task was to correctly recognize them [194]. Here we restrict our attention only to the part of this experiment related to object recognition. The subject sat in front of a screen on which a series of black-and-white photographs of different objects was displayed in a random order, followed by a list of names from which the subject had to choose the correct one. The objects were divided into 5 categories: birds, flowers, horses, chairs and lorries. Each category consisted of 30 images. The projection time of a single image was 500 ms and the names were displayed 1000 ms later. The magnetic field was sampled at a frequency of 510 Hz by means of a 74-channel (2 × 37) MEG apparatus covering both hemispheres with the detector array. The signals were recorded starting 200 ms before the image projection onset and ending 500 ms after the projection end, which gives 1.2 s in total.
In the preliminary part of the data analysis, the signals which corresponded to erroneously named objects were rejected. After this step, there were 140 mixed trials left. The raw signals were filtered by a band-pass filter in the 1–45 Hz range, which does not suppress any frequency band used by the working brain. From the complete signals, fragments of length 700 ms, beginning 100 ms before the image projection, were selected for further analysis. The sources were identified with the help of the MFT method. First, the spatial distribution of the magnetic field was produced as if it were measured by an array of detectors placed every 1 cm, and then the localization of the sources was determined; 28 significant sources were identified as regions of interest (ROIs) in each hemisphere (each a ball of radius 12 mm).

A complex visual stimulus evokes the activity of many cortex regions, beginning with the primary visual regions (V1), which are responsible for the early identification of the most primitive aspects of the stimulus (like, e.g., location within the receptive field, spatial orientation, colour), through the higher-order regions (V2–V5), which process more complex aspects, among which is the functional integration of the information from the lower-order regions, and ending with the areas responsible for the conscious recognition of the stimulus. In general, the visual stimuli, typically more complex than the auditory and olfactory ones, activate a large volume of the cortex consisting of many specialized areas. The particular areas have different characteristic activation times, which are determined by the neuronal paths along which information is processed. While the V1 areas are activated about 40 ms after the stimulus onset independently of its type, the activation of the higher-order areas and the related time scales may depend on the properties of the stimulus, reaching hundreds of ms [196]. A similar behaviour of the visual cortex is evoked also by the end of the stimulus presentation. The early activity is related to the processing of basic stimulus aspects, while the later activity is related to the associative processes and to a number of feedback loops among areas of different order.

We selected 2 regions of interest out of all 28 regions identified by MFT. The first one – the fusiform gyrus (FG) – is known to be associated with the processing of complex objects, while the second one – the posterior calcarine sulcus (PCS) – comprises areas of lower order (V1–V3) and receives afferent connections from the retina. Our analysis is aimed at the identification of the temporal scales of the evoked activity of the selected ROIs with respect to the moment of stimulus delivery. The spatial localization of FG and PCS in both hemispheres is shown in Fig. 32. The activity of each region in each trial of the image presentation is associated with a signal expressing the strength of this activity, given by:

M(t) = ∫_vol J(r, t) · J(r, t) d³r,   (62)


Fig. 32. Location of the studied regions of the visual cortex: posterior calcarine sulcus (PCS) and fusiform gyrus (FG). The points around the skull denote the MEG detector positions. Source: [194].

Fig. 33. Signals representing single-trial neuronal activity (thin solid, dashed and dotted lines) of fusiform gyrus (FG) and posterior calcarine sulcus (PCS) localized both in the left (LH) and in the right (RH) hemisphere. In each trial an image was displayed for 500 ms starting from the latency of 0 ms. The evoked activity can hardly be observed in single-trial signals. This is not the case for the signals averaged over all the 140 trials (heavy solid lines), in which the evoked activity is visible as the maxima within the first 300 ms after the stimulus onset.

where J(r, t) is the primary current density and the integration goes over the space volume. Each category of the displayed objects consists of 26–29 time series. In order to improve statistics, all these categories were considered a joint category of complex objects. This is justified by the fact that the studied ROIs are known to activate in a similar way when processing information about different complex objects [193]. Owing to this, we had N = 140 signals representing each region in both hemispheres. These signals can be considered single degrees of freedom p of a given ROI activity (p = 1, . . . , N). Unlike the financial data, the signals expressing the neuronal activity of the brain are oscillatory and do not reveal any trends. Thus, they do not require any preliminary detrending, and their values, not increments, can be directly used in the matrix analysis.

4.2.3. Correlation matrix structure

We associated a time scale with each signal in such a way that the onset of the image projection was assigned a latency of 0 ms. This choice implies that each individual signal starts at t_1 = −100 ms and ends at t_T = 600 ms. All the signals are temporally equivalent and have length T = 357. Fig. 33 presents exemplary single-trial signals corresponding to the activity of each of the 4 ROIs (LH – left hemisphere, RH – right hemisphere).


Fig. 34. Distributions of the matrix elements for both regions (FG, PCS) and both hemispheres (LH, RH), together with the Gaussian distribution least-square fitted to each empirical distribution.

A clear increase of activity within the first several hundred milliseconds is seen only in some trials, while in the other ones any evoked activity mingles with the dominating spontaneous oscillations not related to the processing of the stimulus and is thus hardly identifiable. The most common practice in this context is to assume that the weak evoked activity is, unlike the much stronger spontaneous activity, time-locked to a stimulus, and it can thus be made visible by averaging the single-trial signals. The particular shape of the signal in each trial is therefore ignored, and attention is paid only to the reproducible part of the activity, regardless of whether such a part actually exists or not. However, a careful analysis of single-trial signals suggests that there are indeed some differences in the exact activation moments of a given region from trial to trial, even if the structure of such activations is similar in many trials [197]. Averaging the signals over many trials may thus lead to a blurring of the response shape and a suppression of the response amplitude. In fact, in the average signals (heavy dark lines in Fig. 33), within the first 300 ms after the stimulus onset one can see only a slight increase of the amplitude with respect to the background, with more prominent maxima near the latencies of 150–160 ms and 240–250 ms. The evoked activity is also considerably ‘‘delocalized’’, and it is impossible to decide whether the 2–4 consecutive maxima (being the evoked potentials) seen in the mean signals occur one after another in the same sequence in every single trial, or whether some of them are just manifestations of the same potential, which in different trials may occur at slightly different latencies due to different states of the brain. Answering this question is impossible at the level of mean signals and requires a statistical analysis of the single trials.

Based on the collected data, we can construct the correlation matrices C of size N, one for each ROI. Unlike the leptokurtic distributions for the financial correlation matrices, the present distributions of the matrix elements have the shape of a slightly deformed Gaussian with a longer right tail (the skewness is in the range 0.2 ≤ γ_1 ≤ 0.4) — see Fig. 34. This asymmetry implies a non-zero value of the mean matrix element, which allows us to expect – in analogy to the distributions for the correlated Wishart matrix ensemble – that one or more eigenvalues exceed the random eigenvalue range for Q = 2.55 (leading to λ_max = 2.65 with σ_TW = 0.04). Indeed, as the eigenvalue spectra in Fig. 35 show, the hypothesis that the matrices C represent the uncorrelated Wishart ensemble may be safely rejected. Three out of the four matrices have eigenvalue spectra in which the largest eigenvalue λ_1 is separated from λ_2 by a manifest gap. However, this gap is significantly narrower than it was in the case of the financial markets. The fourth matrix, which represents the PCS situated in the left hemisphere, has two separated eigenvalues with λ_2 − λ_3 > λ_1 − λ_2. In both PCS and FG, there is a small asymmetry between the hemispheres: the right-hemisphere regions are characterized by larger values of λ_1. The point which seems unclear is the relatively large number of eigenvalues satisfying the condition λ_k ≫ λ_max + σ_TW in Fig. 35. For example, the matrix for the left-hemisphere PCS has as many as 20 such eigenvalues. In order to find an explanation for this effect, it is instructive to look at Fig. 36,
which presents a rescaled eigenvalue density distribution ρ*_C(λ) = aρ_C(λ) for a = 2.5. This rescaling is equivalent to the simultaneous substitution N* = 0.4 N and T* = N*, so that Q* = 1. The so-modified empirical distribution can roughly be approximated by the Marčenko–Pastur distribution for Q = 1, which was not possible in the case of the nominal values of N and T. A justification for this operation is as follows. The empirical distribution does not conform to the theoretical distribution if Q = 2.55, because it has many eigenvalues in the first histogram cell as well as in the cells above λ_max, which is not predicted by the random matrix theory. Reducing Q to the limiting value for the Marčenko–Pastur distribution (Q = 1) only slightly improves the situation. Even if the large eigenvalues λ_k > 4 may result from genuine correlations, the large number of eigenvalues near 0 suggests a significant quasi-singularity of C. Its effective rank must be much smaller than in the case of N = T, when exactly one zero eigenvalue λ_N exists. A further reduction of N can be achieved only by a parallel reduction of T. Therefore, instead of manipulating the length of the signals, which would lead to a completely different ρ_C(λ), it is recommended to rescale the height of the empirical distribution by a constant factor a, which yields the same result.
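For reference, the Marčenko–Pastur density used in such comparisons can be coded directly; the sketch below assumes the standard unit-variance form of the law (cf. Eq. (33)):

    import numpy as np

    def marcenko_pastur(lam, Q):
        """Marcenko-Pastur eigenvalue density for unit-variance data, Q = T/N >= 1."""
        lam = np.asarray(lam, dtype=float)
        lo, hi = (1 - np.sqrt(1 / Q)) ** 2, (1 + np.sqrt(1 / Q)) ** 2
        rho = np.zeros_like(lam)
        ok = (lam > lo) & (lam < hi)
        rho[ok] = Q * np.sqrt((hi - lam[ok]) * (lam[ok] - lo)) / (2 * np.pi * lam[ok])
        return rho

    # rescaling used in the text: multiply the empirical histogram by a = 2.5
    # and compare it with marcenko_pastur(lam, Q=1.0)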



Fig. 35. Eigenvalue spectra for all 4 analysed ROIs located in the visual cortex. The eigenvalue range for the uncorrelated Wishart random matrices is denoted by the shaded zone.

Fig. 36. Exemplary distribution of the eigenvalue density for the left-hemisphere PCS together with the theoretical Marčenko–Pastur distribution for the Wishart matrices with Q = 2.55 (adequate for the studied data) and with Q = 1. The empirical distribution was multiplied by a factor of 1/0.4 = 2.5 (see details in text). The better agreement of the empirical distribution with the theoretical one for the smaller value of Q (corresponding to effectively shorter time series with T_eff < T) attests that some excess information is stored in the data due to significant autocorrelation.

This result indicates that the analysed signals have a smaller information content than their nominal length would suggest. This effect has its origin in the autocorrelation of the signals, reaching tens of milliseconds, which is a consequence of the dominant low-frequency oscillations. The effect is seen in the eigenvalue density distributions for all the analysed ROIs. Thus, it is rather difficult to determine precisely the number of eigenstates of C which carry significant information. We may only safely assume that among such states are the ones associated with the eigenvalues that develop a considerable gap separating them from the subsequent eigenvalues. This reduces the number to 1 or 2.

4.2.4. Identification of the activity patterns

The evoked potentials which are reproduced in the consecutive trials, and which cause the strong correlations among the time series visible in the eigenvalue spectra in Fig. 35, can be identified in the standard way by the principal components of C:

Z_k(t_i) = Σ_{p=1}^{N} v_p^{(k)} M_p(t_i).   (63)

It should be noted here that the analysed signals describe the total activity of a given region, which implies that their values do not possess a sign. The matrix analysis of such unsigned signals faces a significant difficulty related to the procedure of calculating the correlation coefficient (Eq. (26)), which requires a normalization of the signals.


Fig. 37. Distribution of the eigenvector components corresponding to v(1) (dark histograms) and v(2) (light histograms) for all the 4 regions of interest located in the visual cortex. In each panel, the theoretical Porter–Thomas distribution for random matrices was also presented, but it cannot be considered an optimal explanation of the empirical results.

Due to this requirement, the correlation analysis is carried out based on effectively signed signals, whose signs are created by subtracting the mean. This is not a problem as long as one looks only at the matrix elements and the eigenvalues. However, any procedure which requires using the eigenvector components (like the method of creating the eigensignals) may bring non-intuitive results, because the components of a typical eigenvector may be either positive or negative (Fig. 37). The eigensignals (63) may thus also have both positive and negative signs, which contradicts the unsigned nature of the original signals. Even more, due to the symmetry of the eigenvectors with respect to a change of their sign, the intervals in which Z_k(t) is negative cannot simply be neglected as non-physical. A remedy for this problem might be considering the positive and the negative components v_p^{(k)} in Eq. (63) separately, together with separate analyses of the so-calculated signals Z_k^±. Below we apply a different approach, though, in which the complete sums (63) will be considered, but only those structures which reveal a large amplitude in Z_k, regardless of their sign, will be treated as important. It is likely that such significant oscillations of either sign stem from the existence of large oscillations in the original signals M_p(t), even if they entered the sum with a negative sign.

Fig. 38 displays the eigensignals corresponding to the several largest eigenvalues for all the studied regions of interest. Even a short glimpse at the plots allows one to notice that the resulting structure of activations is in each case clearer than that observed in the mean signals in Fig. 33. The most interesting potentials have been indicated by the arrows. Z_1 represents the most typical pattern of the evoked activity, which in some fragments resembles the mean signal seen in Fig. 33 (although it is actually far from an average, as the eigenvector component distributions in Fig. 37 show). The peak of the stimulus-related activity is observed at t ≈ 150 ms in both left-hemisphere regions and about 10 ms later in the right-hemisphere ones. For FG, a secondary peak is also visible at 230 ms on both sides, without any significant interhemispheric delay. For PCS, the secondary peak is not as strong as for FG, although there were maxima in the average signals. The eigensignal Z_2 comprises essentially the activations with their peaks located around 230 ms in the right hemisphere and around 250 ms in the left one, but less prominent potentials are also seen around 90–110 ms, 180 ms, and within the first tens of ms after the stimulus onset. Apart from these potentials observed in Z_1 and Z_2, there are other peaks in the eigensignals corresponding to the subsequent eigenvalues, which occur at different latencies in different regions. Those in PCS, especially in the right hemisphere, are typically stronger than those in FG, and to some extent they have a periodic character. However, the shape of the eigensignals cannot be considered without referring to the eigenvalue spectra displayed in Fig. 35, since the eigenvalues define the fractions of the total variance which are associated with the particular eigensignals. After this comparison, it is safe to assume that in each case at most two eigensignals Z_1, Z_2 represent doubtlessly significant events, with Z_1 always being significant, while the significance of Z_2 depends on whether the corresponding eigenvalue develops a clear gap or the activity grasped by Z_2 is restricted to the first 300 ms after the stimulus onset.
Under this condition, we consider two eigensignals as likely representing the evoked activity in the case of both FGs and the left PCS, and only one such eigensignal in the case of the right PCS. The occurrence of potentials in Z_1 which are hardly observed in Z_2 (and vice versa) indicates a certain degree of independence of these potentials. That is, the activity represented by one eigensignal can happen in some but not all the trials, or it can happen in different trials at different latencies. One may argue that some of the eigensignals for k > 2, as representing the eigenvalues which exceed the Wishart bound λ_max, may also contain statistically significant information, and that by neglecting the structures that occur in these eigensignals (lower panels of Fig. 38a and b) the related information is lost. This argument, however, can be invalidated by observing that the original single-trial signals show dominant low-frequency oscillations, which are largely related to the spontaneous activity. Such low-frequency oscillations from one trial easily overlap with similar oscillations in different trials, even if they are slightly shifted in latency.


Fig. 38. Eigensignals Z_k(t) associated with the few largest eigenvalues λ_k and with a typical eigenvalue λ_53 from the random part of the spectrum, serving as a signal of reference. (a) Fusiform gyrus (FG), (b) posterior calcarine sulcus (PCS). Vertical dotted lines denote the onset and the end of a stimulus.

This is why for each randomly selected latency it is easy to find a subset of trials such that in each trial from this subset there is a maximum or minimum at or near this latency. On the other hand, each eigenvector selects those trials which reveal similar features at similar latencies, which is especially significant for the eigenvectors associated with the eigenvalues larger than λ_max. Therefore, it is likely that the eigensignals for such eigenvalues show just the spontaneous oscillations which by chance were in phase across several trials. In this context, the existence of a number of eigenvalues above λ_max is fully justified: they represent the spontaneous oscillations for different choices of their phase. Going from the largest eigenvalue downwards to the Marčenko–Pastur upper bound, we first find one or sometimes a few eigenvalues which are substantially separated from the subsequent eigenvalues by an evident gap. We consider the eigenvalues above this gap to be associated with the extra correlations originating from the correlated evoked activity patterns. Then, below the gap, we find several or more eigenvalues to which the spontaneous activity is allowed to contribute significantly. The lower the eigenvalue magnitude, the larger the contribution from the coincidentally correlated spontaneous activity.


This example shows that the interpretation of some eigenvalue spectra of correlation matrices is not as straightforward as it might seem according to the random matrix theory criteria.

The above outcomes indicate that more than one pattern of the evoked cortical activity may be identified if we employ the matrix approach instead of averaging the single-trial signals. In this way, we allow the cortical response to vary from trial to trial, which seems to be a realistic assumption taking into consideration the different cortex states during different trials, in contrast to the deterministic assumption that the response is always the same. The origin of such a variable pattern of activity can be at least twofold. First, the activation of a particular region may occur at different latencies in different trials. Second, some activations may be secondary, triggered by the activity of other regions, including the higher-order ones, via the neuronal feedback loops. Such latency shifts of the peak activity and the secondary activations can depend on context, i.e., on the global state of the brain at the moment of stimulation. The global state of the brain is constantly changing, with each consecutive stimulus contributing to these changes, so there is no reason to expect identical cortical responses. The variable activity patterns are also in perfect agreement with one of the paradigms of complexity, in which the relative stability of a system (e.g., the deterministic responses) is associated with its flexibility expressed by fluctuations.

Every interpretation of the measurement outcomes corresponding to the cortex activity touches one of the fundamental problems, i.e., whether the background activity has to be treated as pure noise and consequently neglected, or whether it can comprise some significant information inseparably related to the activity characteristic for the region under study. From the theoretical perspective, the spontaneous activity, understood as the whole activity not evoked by a stimulus, involves the neurons from the same area that the evoked activity does, which means that both types of activity may interfere with each other in some way. If this is the case, then there is neither pure background nor pure stimulus-related activity, but both types are melted together. For instance, if the stimulus-triggered activation of some region alters the ongoing spontaneous oscillations (in any way, say by changing their phase or slightly modifying their amplitude), then is such altered activity still spontaneous, or does it rather become a part of the evoked response? From the practical perspective, this ambiguity is usually resolved in such a manner that everything which is not ideally time-locked to the moment of stimulus delivery is considered irrelevant. The single-trial signals are averaged and the mean signals are studied. In this section, we have argued that this is not an optimal way to proceed, since much dynamical information about the cortex function can be lost. Instead, we advocate using methods which allow one to identify different patterns of the cortex response, even though some problems with the interpretation of the results can be faced in some cases.

5. Long-range interactions

Direct or, more often, indirect long-distance interactions that manifest themselves in the existence of correlations between the states of mutually remote elements are a typical property of systems residing near a critical state.
However, also in complex systems whose criticality is not obvious, but which possess a hierarchical organization, one can find multi-element substructures forming separate sets of degrees of freedom. Their separateness consists in the fact that the interactions among the elements inside the sets are considerably stronger than the interactions between the elements from different sets. As regards the systems discussed in Sections 3 and 4, examples of such sets of degrees of freedom are the industrial sectors in the stock markets or the stock markets in different countries, as well as functionally specialized large groups of neurons with a similar localization within the cortex, processing together the same pieces of information. The long-range interactions that take place between such structures are much weaker than the short-range interactions in their insides, but they nevertheless have an equally high significance for the system as a whole, enabling both the information transfer between its distant parts and the functional integration of the information dispersed across the system. Such long-range interactions can be identified either at the level of entire substructures or at the level of individual elements. In the latter case especially, the dependences among the degrees of freedom belonging to distinct subsystems can be studied by means of the matrix approach [198]. In this part of the review we discuss the functional dependences between selected substructures of the cerebral cortex and the couplings between two geographically distant stock markets.

5.1. Asymmetric correlation matrix

The method of matrix-based correlation analysis in a system of N degrees of freedom that was presented in the previous Section 4 can be generalized to the case when there are two systems of interest Ω1, Ω2, having N degrees of freedom each. Let the observables X_α be associated with each degree α of the system Ω1, and the observables Y_β with each degree β of the system Ω2. Analogously, let {x_α(t_i)} and {y_β(t_i)} be the time series (i = 1, . . . , T) of measurements of the respective observables. Additionally, we allow the time series associated with Ω2 to be advanced (sign ‘‘+’’) or retarded (sign ‘‘−’’) by an interval τ = mΔt in relation to their counterparts associated with Ω1. Then the following two data matrices are considered:

X_{α,i} = (1/σ_α) (x_α(t_i) − x̄_α),     Y_{β,i}(τ) = (1/σ_β) (y_β(t_i + τ) − ȳ_β),   (64)

from which a real asymmetric correlation matrix of size N × N is created:

C(τ) = (1/T) X [Y(τ)]^T.   (65)
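A sketch of the construction (64)–(65) for two systems stored as N × T arrays follows; truncating the non-overlapping ends of the series is one simple convention for handling the shift, not necessarily the one adopted in the original analysis.

    import numpy as np

    def lagged_correlation_matrix(x, y, m):
        """Asymmetric correlation matrix C(tau) of Eqs. (64)-(65).  Rows of x
        hold the N series of Omega_1, rows of y those of Omega_2; y is advanced
        by m sampling intervals (negative m retards it)."""
        if m >= 0:
            xs, ys = x[:, :x.shape[1] - m], y[:, m:]
        else:
            xs, ys = x[:, -m:], y[:, :y.shape[1] + m]
        xs = (xs - xs.mean(axis=1, keepdims=True)) / xs.std(axis=1, keepdims=True)
        ys = (ys - ys.mean(axis=1, keepdims=True)) / ys.std(axis=1, keepdims=True)
        return xs @ ys.T / xs.shape[1]

    # complex eigenvalue spectrum of the asymmetric matrix:
    # lam = np.linalg.eigvals(lagged_correlation_matrix(x, y, m=5))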


Its elements (as in Eq. (29)) are the correlation coefficients defined by Eq. (26). The diagonal elements are no longer expected to be unities and Tr C ≤ N. The calculation of the eigenvalues and the eigenvectors demands solving the τ-dependent characteristic equation:

C(τ) v^{(k)}(τ) = λ_k(τ) v^{(k)}(τ).   (66)

The asymmetric character of C(τ) leads to complex values of λ_k(τ) and of the eigenvector components v_γ^{(k)}(τ), while its reality implies that the eigenvalues merge into conjugate pairs, and likewise the eigenvector components. The real part of the eigenvalue spectrum corresponds to the symmetric part of C(τ) and the imaginary part corresponds to its antisymmetric part. Basically, the eigenvalues are ordered according to their magnitude: |λ_k| ≥ |λ_{k+1}|, but in the case of two conjugate eigenvalues the sequence is defined by the auxiliary condition: Im λ_k > Im λ_{k+1}. A random matrix ensemble that is appropriate for referencing the asymmetric correlation matrices is the ensemble grouping the square N × N matrices being a product of two rectangular ones of size N × T (the random counterparts of the matrices X and Y from Eq. (64)), whose elements are drawn from the Gaussian distribution. Unfortunately, a formula for the exact eigenvalue distribution for this matrix ensemble as a function of the parameter Q = T/N has not been derived yet, except for the special case of Q = 1 [199] (despite the fact that the respective general formula has recently been derived for the complex correlation matrices [200]). However, for sufficiently large N and T, the matrices of this type have properties close to the properties of the asymmetric Ginibre orthogonal matrix ensemble (GinOE) [201], for which a closed analytic formula describing the eigenvalue distribution is known. This justifies using the GinOE ensemble as a reference for the empirical correlation matrices given by Eq. (65). The GinOE matrices G, being a generalization of the Gaussian orthogonal matrices (GOE), are defined by the Gaussian distribution of elements:

p(G) = (2π)^{−N²/2} exp[−Tr(GG^T)/2],   (67)

where the size of G is N × N and the variance of the distribution is σ² = 1. The eigenvalue spectrum of such a matrix has a rather intricate structure composed of N − L complex eigenvalues and L real ones, whereas the expectation value of L has the asymptotic behaviour given by [202]:

 lim E (L) =

N →∞

2N

π

,

(68)

which for finite N can be approximated by the formula:



2N

E (L) = 1/2 +

π

 1−

3



8N

3 128N 2

 + O ( N −3 ) .

(69)

Distribution of the eigenvalues λ = λ_x + iλ_y on the complex plane is described by the following expression [203,204]:

ρ_G(λ) = ρ_G^c(λ) + δ(λ_y) ρ_G^r(λ),   (70)

where

ρ_G^c(λ) = (2|λ_y|/√(2π)) e^{2λ_y²} [1 − erf(√2 |λ_y|)] ∫_{|λ|²}^{∞} du e^{−u} u^{N−2}/Γ(N−1),   (71)

ρ_G^r(λ) = (1/√(2π)) ∫_{|λ_x|²}^{∞} du e^{−u} u^{N−2}/Γ(N−1) + (1/√(2π)) |λ_x|^{N−1} e^{−λ_x²/2} ∫_{0}^{|λ_x|} du e^{−u²/2} u^{N−2}/Γ(N−1).   (72)

The function erf(x) in Eq. (71) denotes the Gaussian error function. In the limit N → ∞ the above expressions are significantly simplified and the eigenvalues λ form a structure consisting of a uniform disc on the complex plane and a uniform interval on the real axis [205,206]:

ρ_G^c(λ) = (1/π) Θ(√N − |λ|),   (73)

ρ_G^r(λ) = (1/√(2π)) Θ(√N − |λ_x|),   (74)

where Θ(·) denotes Heaviside's function. An example of the eigenvalue distribution for a GinOE matrix G is plotted in Fig. 39. Random matrices from the ensemble of the asymmetric correlation matrices have eigenvalue spectra which, for small values of Q, are additionally described by a monotonically decreasing radial component ρ_G(r) [199]. For large values of Q, this radial component becomes asymptotically uniform. The properties of the empirical correlation matrices can be directly compared with the theoretical results for the GinOE matrices after the replacement λ → λ/σ, where σ is the standard deviation of the distribution of the elements of C (provided this distribution does not deviate much from the Gaussian one).
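These GinOE predictions are easy to probe numerically. Below is a minimal check (our own sketch, assuming the unit-variance convention of Eq. (67); the sample sizes are illustrative): the mean number of real eigenvalues should follow Eq. (68) and the spectral edge should scale as √N, as in Eqs. (73)–(74).

```python
import numpy as np

rng = np.random.default_rng(0)
N, n_samples = 140, 200

n_real, radii = [], []
for _ in range(n_samples):
    lam = np.linalg.eigvals(rng.standard_normal((N, N)))
    n_real.append(np.sum(np.abs(lam.imag) < 1e-9))   # real eigenvalues, L
    radii.append(np.abs(lam).max())

print("mean L      :", np.mean(n_real))              # ~ sqrt(2N/pi), Eq. (68)
print("sqrt(2N/pi) :", np.sqrt(2 * N / np.pi))
print("edge/sqrt(N):", np.mean(radii) / np.sqrt(N))  # ~ 1, Eqs. (73)-(74)
```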


Fig. 39. Theoretical distribution of the eigenvalues (z ≡ λ) of the GinOE matrices for N = 10 plotted on the complex plane (see Eq. (70)). The darkest regions correspond to the highest eigenvalue density ρ_G(z).

Fig. 40. Location of the symmetric amygdalae (AM) in the human brain. The points around the skull denote the positions of the MEG detectors. Source: [194].

5.2. Long-range interactions in the human brain

5.2.1. Visual cortex

The visual cortex consists of many functionally separate parts that are independent current sources. These parts are mutually linked by neuronal connections, which implies that their activations are not independent. The activations can be either simultaneous, which takes place if two regions have similar afferent connections from, e.g., lower-order cortex regions or directly from the sensory organs and are independently activated at the same time, or delayed, related to, for instance, a directional information flow between these regions. As was discussed in Section 4.2, such activations are not deterministic in relation to the moment of stimulus delivery, but they are statistically more probable at certain moments than at others. In this context an interesting question arises: whether the activation moments of given regions are time-locked to the stimuli or rather related to the activations of other regions. Here we study the same data as before in Section 4.2. We consider the cross-correlations between the activity of selected pairs of regions. Now, besides the regions that have already been studied, namely the posterior calcarine sulcus (PCS) and the fusiform gyrus (FG), located in the visual cortex and strongly active during visual exposition to objects, in the present analysis we include yet another region of interest, the amygdala (AM), which is located outside the visual cortex. This structure is known to play an important role in the processing of information related to emotions and, specifically, in the recognition of facial expressions of emotions. On the other hand, it does not take part in the direct analysis of visual stimuli that are emotionally neutral like, for instance, object images [194,207]. Activations of this region should thus not be linked with the activity of PCS and FG. The exact location of the amygdalae in the brain is shown in Fig. 40. Since the stimuli are repeated many times, the optimal approach is to associate the degrees of freedom with the signals describing the activity of a given region in a single trial of the stimulus delivery, similarly to what was done in Section 4.2. For each of the two disjoint sets of N = 140 signals from all the trials p corresponding to regions X and Y, the data matrices


Fig. 41. Real part of the largest eigenvalue λ1 as a function of the time lag τ between pairs of regions in the left (LH) and the right (RH) hemisphere (a time-shifted signal always comes from the second region of a pair). In each panel, the shaded zone denotes the asymptotic (N → ∞) range allowed for the GinOE matrices (|λ_x| ≤ √N), while horizontal lines denote the value at which ρ_G^r(λ) = 0.01 ρ_G^r(0) for N = 140. All the significant regions of positive values of λ1 above the dashed lines correspond to purely real λ1.

X and Y are created. A possibility of a relative lag in the activation of X and Y is taken into account by introducing a time lag τ and associating it with the signals from Y. This means that τ = 0 corresponds to signals that are parallel in time, τ > 0 corresponds to retarded signals from Y, and τ < 0 corresponds to retarded signals from X. For a given pair (X, Y), a correlation matrix C(τ) has to be calculated and then its eigenvalue spectrum. For a given τ, correlated activations of two regions recurrent in many consecutive trials should contribute similarly to all or to the majority of the matrix elements. These correlations thus form the symmetric component of C(τ). If this is the dominant component, then λ1(τ) ∈ R. This is why in this analysis only the situations in which the largest eigenvalue is purely real and distant from the upper bound of the eigenvalue spectrum for the GinOE matrices are interesting. Complex values of λ1(τ), especially those with a large imaginary part, are less significant from our perspective, because they are related to the antisymmetric component of C(τ), linking positive correlations in some group of trials with negative correlations in another group of trials. Identification of the τ values for which the activity of a pair of regions is most strongly correlated is possible after deriving the functional dependence λ1(τ). This dependence is shown in Fig. 41 for both hemispheres and three unilateral pairs of ROIs: PCS-FG, PCS-AM and FG-AM, as well as for two cross-hemisphere connections between the homologous ROIs: PCS(LH)-PCS(RH) and FG(LH)-FG(RH). The most interesting outcome is the pronounced maximum of λ1(τ) for the pairs PCS-FG in both hemispheres, which is located at τ = 2 ms (LH) and τ = −2 ms (RH). The eigenvalue is purely real here, indicating almost simultaneous strong activations of both ROIs taking place in the majority of trials. However, precise identification of the specific activations responsible for these correlations is impossible due to the fact that the analysis is based on full-length signals, while a satisfactory improvement of the temporal resolution would require taking sufficiently short time windows. This, however, would lead to Q ≤ 1 and hamper interpretation of the eigenspectrum structure. A comparison with Fig. 38 suggests that the strongest contributions to the observed correlations are made by the activations of PCS and FG taking place


Fig. 42. Examples of complex eigenvalue spectra of the lagged asymmetric correlation matrices calculated for the MEG signals representing selected cortex regions (PCS, FG, AM) in the two hemispheres (LH, RH). The time lags τ associated with the second region of each pair are shown in each panel. Shaded discs denote the random correlation zones predicted by the random matrix theory for the GinOE matrices in the limit N → ∞, while dashed circles denote the points for which ρ_G^c(λ) = 0.01 ρ_G^c(0). The high concentration of points in the centres results from the properties of the asymmetric matrices, whose eigenvalue spectra for small Q = T/N deviate radially from the perfectly uniform GinOE spectra.

149–153 ms (LH) and 161–165 ms (RH) after the stimulus onsets. The time shifts of both activations are approximately consistent with the signs and values of the small lags seen in Fig. 41. The widths of the maxima may also be an effect of the widths of the peaks in Fig. 38 (Section 4.2.4). Apart from these dominant central maxima seen in Fig. 41, there do not exist any other maxima in the considered range −200 ms < τ < 200 ms. The negative values of Re λ1(τ) (often with Im λ1(τ) = 0) which fall outside the GinOE domain are associated with the anticorrelations prevailing in certain ranges of τ. In the unsigned signals M_p(t) (Eq. (62)), anticorrelations correspond to increased activity of one region and suppressed activity of the other at the same time. For certain values of τ such anticorrelations are inevitable due to the oscillatory nature of the cortex activity. This means that the anticorrelations for τ ≠ 0 may not provide any substantial information. A completely different behaviour of λ1(τ) is observed for the four pairs comprising AM. In this case the largest eigenvalue does not exceed the GinOE range predicted by the random matrix theory by much, which means a lack of statistically significant correlations between the excitations of AM and the other ROIs in both hemispheres. Slightly elevated values of λ1(τ) can be found only within the intervals 5 ms < τ < 45 ms for FG-AM in the left hemisphere and −95 ms < τ < −75 ms for PCS-AM in the right hemisphere, but neither of them is sufficiently significant. The strongest inter-hemisphere correlations correspond to the activity of the homologous PCSs and FGs that are relatively shifted by lags ranging from τ ≃ −70 ms to τ ≃ 10 ms. In the case of FGs, the two main maxima of λ1(τ) occur at τ ≃ −50 ms (the right-hemisphere FG is activated first) and at τ = 5 ms (the left-hemisphere FG is activated a little earlier). In the case of PCSs, on the other hand, no distinguished lag can be pointed out. By comparing these results with Fig. 38, it becomes evident that for FGs the observed activation delay of 5 ms is probably a result of the correlation between the main activations at the latencies of 151 ms (LH) and 161 ms (RH) as well as 228 ms (LH) and 232 ms (RH). Further conclusions are difficult to draw because of the significant blurring of the peaks in Fig. 41. Several examples of the complete eigenvalue spectra of C(τ) for selected values of τ are displayed in Fig. 42. The top plots correspond to situations in which the real part of λ1 dominates. The top left plot corresponds to a strongly collective λ1 (the maximum coupling between PCS and FG in the left hemisphere for τ = 2 ms, see Fig. 41), while the top right one corresponds to a λ1 that is only slightly larger than the upper bound predicted for the GinOE matrices in the N → ∞ limit (the maximum coupling between FG and AM in the left hemisphere for τ = 37 ms). The left-hand-side plot also shows a significant asymmetry with respect to the point (0; 0). The bottom plots represent situations in which Re λ1 ≃ 0 and the antisymmetric component is large (the bottom left plot) or in which λ1 resides inside the GinOE zone and no statistically significant information can be extracted out of noise (the bottom right plot). In all the examples in Fig. 42 many eigenvalues are concentrated around the point (0; 0), which may partially originate from the fact that the value of Q = 2.55 is small enough for the finite-sample effects to be visible.
This remains in agreement with the results of numerical simulations for the ensemble of asymmetric correlation matrices [199]. In addition, an effect analogous to the one discussed in Section 4.2.3


Fig. 43. Distributions of the diagonal (dark histograms) and off-diagonal (light histograms) elements of the correlation matrix C(τ) for a few characteristic situations: a matrix with an overwhelming symmetric component (top left), a matrix with a weaker but still evident symmetric component (top right), a matrix with mixed symmetric and antisymmetric components (bottom left), and a matrix with a dominant antisymmetric component (bottom right). In each case, the distributions were derived from N diagonal and N(N − 1) off-diagonal elements, and this fact strongly influences the smoothness of the respective histograms.

can also play a role here. That is, the strong autocorrelation of the signals can decrease their effective length by decreasing their actual information content and, in consequence, it can shift a large fraction of the eigenvalues towards zero. Having obtained results which unequivocally point to the existence of temporal dependences between the activity of different cortex regions, we can address the question about the repeatability of these results, or, in other words, about their ‘‘determinism’’. If the repeatable dynamical components were strong enough, then after each consecutive stimulus delivery a given region would always be active at the same characteristic latency. This in turn would imply that the maximum correlations between the regions of a given pair would be observed for the same lag in all the trials. Therefore, irrespective of whether we calculate the correlation coefficients for signals that are parallel in time (the diagonal elements C_pp(τ), p = 1, . . . , N) or for signals that come from different trials (the off-diagonal elements C_pq(τ), p ≠ q), the expectation value of these correlation coefficients should be the same. This is so because in the consecutive trials only the evoked activity may be time-locked to a stimulus, which each time arrives at a random moment with respect to the spontaneous activity. In order to look for the existence of fully repeatable patterns of the evoked activity, we consider separate distributions of the diagonal and the off-diagonal correlation matrix elements. It turns out that for the time lags corresponding to the maximally correlated activity of the ROI pairs, there is a considerable difference between these distributions. This can be seen in Fig. 43, where the distribution of the diagonal elements for the maximum correlation between PCS and FG (τ = 2 ms, the top left panel) is sizably moved to the right, towards higher positive correlations, with respect to the distribution of the off-diagonal elements, which is centred around C_pq = 0 (although it reveals some positive skewness). A much smaller difference is seen for the ROIs situated in different hemispheres (the top right panel), which show correlated activity in Fig. 41. On the other hand, in the remaining situations all the distributions have comparable values of the first moment, regardless of whether one looks at the diagonal or the off-diagonal elements. Exactly such a structure of the distributions is expected in the situations in which the symmetric component of the corresponding matrices does not fall outside the GinOE zone (the bottom panels). Two chief conclusions can be drawn from the results presented above. First of all, the results showed that the two considered ROIs, PCS and FG, containing, respectively, the lower-order and the higher-order visual areas and taking part in the processing of information about complex objects, show correlated activity evoked by visual stimuli. The correlations are slightly stronger in the right hemisphere. The lags between the maximum activity of both regions are only of the order of a few milliseconds, which indicates that a key role is played here by direct, monosynaptic information transfer from the neurons of PCS to the neurons of FG within the same hemisphere.
This result is consistent with the results of our earlier analysis of the same data set by means of mutual information [194], although the temporal resolution of the present method is lower and it is difficult to identify equally precisely the particular activations that are correlated across the ROIs. In concordance with general knowledge about the function of the amygdala, which does not take part in the processing of emotionally neutral object images, here too it does not reveal any significant correlation of its own activity with the activity of the other two regions.
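The repeatability test described above reduces to simple bookkeeping on the matrix elements. A minimal sketch (the function name is ours; c stands for any lagged correlation matrix computed as in this section):

```python
import numpy as np

def diag_offdiag_split(c):
    """Split C(tau) into the N same-trial (diagonal) coefficients and
    the N(N-1) cross-trial (off-diagonal) ones, as compared in Fig. 43."""
    diag = np.diag(c)
    offdiag = c[~np.eye(c.shape[0], dtype=bool)]
    return diag, offdiag

# If the evoked activations were fully time-locked to the stimulus, the
# two means should agree; a diagonal excess signals trial-locked lags:
# d, o = diag_offdiag_split(c); print(d.mean(), o.mean())
```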


Fig. 44. Location of magnetic field sources activated in the auditory cortex in both hemispheres by a sound stimulus. Source: [211].

The second conclusion that stems from this analysis is that the evoked activity of a given region is more strongly correlated with the evoked activity of other regions involved in processing of the same stimulus than with the moment at which this stimulus arrives. In other words, despite the fact that a given ROI is activated at different latencies in different trials, the lags between the activations of different regions are preserved across the trials. From the whole-brain perspective, the topographic locations of PCS and FG within the same macroscopic cortex area, the visual cortex, imply that the interactions between these ROIs have to be considered medium-distance ones.

5.2.2. Auditory cortex

Some indications about the existence of inter-hemisphere interactions can be obtained by analysing data from another experiment, in which the auditory cortex activity was recorded during the delivery of a series of sound stimuli to the ears [190]. In analogy to the case of the visual cortex discussed in the previous section, each delivery of a sound activates neurons in the primary and secondary areas of the auditory cortex (located as in Fig. 4, Section 2.1). This activation is bilateral, that is, even a one-sided stimulation evokes activity in both hemispheres. The details of the experiment are as follows. Simple tones of 1 kHz frequency, 50 dB intensity and 50 ms duration were delivered to one ear in a series of 120 equivalent trials of 1 s length. Five healthy subjects participated in the study, the main part of which consisted of two such series of stimuli, the first one conveyed to the left (L), and the second to the right (R) ear of each subject. After completing this part, all the subjects took part in a different task involving the contingent negative variation condition [208], which from the present perspective served as a way of resetting the brain state. Finally, both series of the auditory stimuli were repeated for each subject under exactly the same conditions as before. Each stimulus evoked activity which was recorded by a 74-channel MEG set consisting of two parts with 37 SQUID detectors each, covering the skull over the temporal lobes in both hemispheres. The magnetic field generated by the neuronal currents was sampled with the frequency of 1042 Hz. The proper experiment was preceded by a series of test measurements, whose objective was to identify the locations of the minimum and maximum of the magnetic field during the strongest evoked activity. The information gathered in this test run served to optimize the detector positions before the proper measurements began. Fig. 44 shows the most active areas of the auditory cortex during the stimulus processing. The strongest activity is restricted to a relatively small region that can be approximated by a single magnetic dipole. Owing to this, we can assume that the activity of the auditory region in each hemisphere is associated with a unique source of magnetic field that can be observed by a single detector placed directly above it. However, this optimal detector is only an abstract concept and the signal associated with it has to be derived from the real signals (recorded by the physical detectors) by their weighted superposition [197]. In effect, for each repetition of the stimulus only two such signals (one per hemisphere) are subject to further analysis. All the signals were then filtered with a band-pass filter in the 1–200 Hz range together with a notch filter at the power-line frequency of 50 Hz and its harmonics.
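For illustration, such a preprocessing chain can be sketched with SciPy as follows; the sampling rate is the 1042 Hz quoted above, while the filter order and the notch quality factor are our illustrative choices, not parameters taken from the original study.

```python
import numpy as np
from scipy import signal

FS = 1042.0                                    # sampling frequency [Hz]

def preprocess(x):
    """Band-pass 1-200 Hz plus 50 Hz notch filters (and harmonics);
    zero-phase filtering preserves the evoked-response latencies."""
    b, a = signal.butter(4, [1.0, 200.0], btype="bandpass", fs=FS)
    x = signal.filtfilt(b, a, x)
    for f0 in (50.0, 100.0, 150.0, 200.0):     # power line and harmonics
        bn, an = signal.iirnotch(f0, Q=30.0, fs=FS)
        x = signal.filtfilt(bn, an, x)
    return x
```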
Since both parts of the auditory experiment were carried out under the same conditions, we may consider the trials from these two parts together. Thus, for either ear, we had N = 240 trials. We limit our discussion to only two subjects (Subject A and Subject B) representing the extreme cases; the signals for the other three subjects look intermediate. An individual trial p of the stimulus delivery was represented by two parallel time series of length T = 1042, one associated with each hemisphere. The time scales of these signals were defined in the same manner as for the cognitive experiment above, i.e., t = 0 corresponded to the stimulus onset, and the time series covered the latency interval from t1 = −220 ms to


Fig. 45. Exemplary virtual-detector signals M_p(t) corresponding to both hemispheres (LH, RH) of Subjects A and B. Several single-trial signals (light-dashed or light-dotted lines) are shown together with the signals averaged over all 240 trials (heavy solid lines). The signals correspond to the unilateral, right-ear stimulation. The average signals show the repeatable components of activity, since any non-repeatable components that might be observed in single trials are averaged out.

tT = 780 ms. Typical signals corresponding to single trials, together with the signals averaged over all the trials, are shown in Fig. 45. In the average signals, a characteristic series of oscillations with the potentials called P40m, N100m and P200m can be found. These potentials are hard to discern in the single-trial signals. Both the average and the single-trial signals differ between the subjects. Our objective was to investigate the temporal relations between the evoked activity of both auditory regions and to observe how they depend on the stimulation side. Our methodology was similar to the one we used in the earlier part of this section. For each subject and each experimental condition (L, R) we constructed a family of inter-hemisphere correlation matrices C(τ), where the time lags are taken from the range −70 ms ≤ τ ≤ 70 ms. The lag direction was defined such that τ > 0 and τ < 0 denote, respectively, an advanced and a retarded activation of the left hemisphere. In each single-trial signal, two qualitatively different latency intervals were selected for the analysis: the interval 0–300 ms comprising the evoked potentials (E) and the interval 480–780 ms in which the evoked potentials are no longer observed and only the spontaneous activity is present (S). Both intervals have the same length. Fig. 46 exhibits the corresponding functional dependence Re λ1(τ) for both subjects, both stimulation sides (L, R), and both intervals. A comparison of the top and the bottom panels allows one to notice that, regardless of the subject and the condition, the inter-hemisphere correlations are seen mainly in E (large values of Re λ1(τ)), while in S they are absent or almost insignificant. Exactly this result was expected, since in a state of wakefulness the spontaneous activity in distant regions of the cortex is independent. Indeed, in S, in the case of Subject B (both conditions) and Subject A (left ear), the real part of λ1(τ) resides within the GinOE range or in its close vicinity for all the considered values of τ, which is consistent with our expectation that the spontaneous activity is independent in each hemisphere. The upper panels of Fig. 46 show a significant difference in the strength of the correlations, which manifests itself in the different heights of the maxima of Re λ1(τ) for Subjects A and B. This effect has its likely origin in a favourable ratio of the low-frequency oscillation amplitude to its high-frequency analogue in Subject A (Fig. 45). The low-frequency correlated activity is also responsible for the large widths of the Re λ1(τ) peaks, which agrees with the low-frequency nature of the evoked oscillations. The time lags between the activations of the opposite auditory regions depend on the stimulation side. This is most evident in the shifts of the corresponding maxima of Re λ1(τ) in opposite directions with respect to τ = 0. For Subject A, the real part of the largest eigenvalue reaches its maximum for τ = −13 ms if a stimulus is left-sided and for τ = 6 ms if a stimulus is right-sided. For Subject B, these are τ = −8 ms (L) and τ = 10 ms (R). The signs of these shifts express the known fact that the auditory region located in the contralateral (opposite) hemisphere is activated statistically earlier, by 5 to 15 ms, than its counterpart in the ipsilateral (stimulation-side) hemisphere [209–211]. This is interesting since the distances between the stimulated ear and the hemispheres might suggest the opposite relation.
The distributions of the diagonal and the off-diagonal matrix elements, which are presented in Fig. 47, differ between the subjects and are consistent with the differences between the maximum values of Re λ1(τ) for these subjects. This means that the matrices for Subject A have a stronger correlated component and therefore resemble the symmetric correlated Wishart matrices (Eq. (41)). In the case of Subject A, both distribution types have asymmetric shapes, but the distributions of the diagonal elements have a larger first moment than the distributions of the off-diagonal elements. In


Fig. 46. Real part of the largest eigenvalue of C(τ ) as a function of time lag τ for both subjects and both experimental conditions: left-ear stimulation (L) or right-ear stimulation (R). Top panels correspond to the matrices calculated for the evoked (E) activity interval (latency 0–300 ms), while bottom panels correspond to their counterparts calculated for the spontaneous (S) activity interval (latency 480–780 ms). In each panel, the grey zone denotes the eigenvalue range allowed for the GinOE random matrices with the variance equal to the (subject-specific) variance of the entries of the empirical correlation matrix C in the S interval.

Fig. 47. Matrix element distributions for C(τ) calculated in the evoked activity interval (0–300 ms), where the values of τ are such that Re λ1(τ) assumes its maximum. Top-row panels show the distributions for Subject A and bottom-row panels for Subject B, while the left–right location of the panels corresponds to the side of stimulation (L or R). Dark histograms are associated with the diagonal elements, while light histograms are associated with the off-diagonal ones. Vertical lines denote the mean values of the histograms.

the case of Subject B, a small difference between the moments is seen only for the right-ear condition, while for the left-ear one it is within the statistical error. The interpretation of the above differences between the distributions of the diagonal and the off-diagonal elements is similar to that presented in the context of the visual cortex: for Subject A, the activity of the auditory regions is more strongly correlated for simultaneous signals than for signals taken from different trials. This means that besides the repeatable component of the evoked activity present in almost all the trials, the cortex response shows some features that vary from trial to trial. These unrepeatable features, if present simultaneously on both sides in the same trial, but differing across different trials,


can be responsible for the observed suppression of the correlations expressed by the off-diagonal matrix elements. It should be stressed once again here that the spontaneous activity in both regions is uncorrelated or at most weakly correlated (the bottom panels of Fig. 46) and, thus, it does not contribute to the correlations expressed by the diagonal elements. The case of Subject B is different. Here the first moments of the diagonal and the off-diagonal distributions do not differ much, which is concordant with the shape of the average signals in Fig. 45, where the evoked potentials are less prominent as compared to the ones for Subject A. Based on the results presented above, we may say that the activity evoked by a sound stimulus in Subject A shows some traits of an inter-hemispheric coupling. It is hard to decide whether this coupling is a result of direct interactions through the neuronal pathways in the corpus callosum, or rather a result of the activity of both regions being mediated by the actual state of the brain as a whole, which may be different in each trial. If this second alternative were true, then the interactions would be indirect (though they would also be long-range). Whichever alternative is closer to reality, the properties of the activations evoked by external auditory stimuli suggest that these activations have a more complex character than it might seem at first glance, combining repeatability with trial-to-trial variability and pointing to the existence of effective functional links between the hemispheres. In Subject B, the dynamics of the evoked activity does not allow us to identify all these elements, perhaps due to a much worse signal-to-noise ratio.

5.3. Long-distance couplings in financial markets

Nowadays, in the era of electronic transaction systems and the Internet, the notion of distance may be considered obsolete, especially in the world of financial markets. However, the stock markets still have a more local character than, for example, the currency market. The stocks of a given company are typically traded only on one stock exchange, and a large part of the economic links and of the income of a typical company is related to the geographical region or the country where this stock exchange is situated. This means that the individual stock exchanges form, to some extent, separate systems. Therefore, any actual dynamical dependences among the stock exchanges localized in different geographical regions may be considered long-range interactions. Some earlier results [151,152], obtained for daily quotations, point to the existence of such dependences, showing that actually all the local stock markets are coupled together to form a single global stock market. A drawback of those analyses was, however, their poor temporal resolution, which prohibited identification of the temporal scales associated with the inter-market couplings. Any improvement of the temporal resolution requires working on high-frequency data, since the expected temporal delays between price movements on different markets are of the order of seconds. Here we analyse tick-by-tick data from the German and the American stock markets consisting of two sets of N = 30 stocks [212]. The American market is represented by the companies which were listed in the Dow Jones Industrial Average (Dow Jones, DJIA) during the period 1998–1999, while the German one is represented by the companies listed in DAX30 during the same period. Based on the original data, time series of logarithmic returns (Eq.
(45)) were formed for different time scales from the interval 3 s ≤ Δt ≤ 2 min. (We had to limit the shortest Δt to 3 s because the data collected in the TAQ database are recorded with 1 s accuracy.) Because of the difference in time zones, the opening hours of both markets overlap only in a relatively short interval from 15:30 to 17:00 CET (before Sept. 20th, 1999) or from 15:30 to 17:30 CET (since Sept. 20th, 1999). This implied that the studied time series also had to be restricted to these intervals. Moreover, only the days on which both markets were open were taken into consideration. The length of the resulting time series thus varied from T = 954,000 for Δt = 3 s to T = 23,850 for Δt = 2 min. For each time scale, a family of the inter-market correlation matrices was derived:

C(τ) = (1/T) X_DJ [Y_DAX(τ)]^T.   (75)
By construction, these matrices contained only information on the inter-market correlations, without information on the correlations inside each market. The time lag τ was associated with the time series of the German stocks. For each value of τ, the eigenvalue spectrum of the corresponding C(τ) was determined, and so was the functional dependence of λ1 on the lag τ. Fig. 48 shows the modulus of this function for three selected values of Δt. All the plots display a strong coupling between the degrees of freedom of both markets, expressed by the large real values of λ1(τ) that substantially exceed the RMT bound for a given Δt. Interestingly, the cross-market coupling is almost instant: a shift of the |λ1(τ)| maximum can be noticed only at resolutions of Δt = 15 s or finer. The time series with the sampling interval of Δt = 3 s show that the statistically significant correlations occur within the range −2 min ≤ τ ≤ 3 min. They are strongest for −30 s < τ < 0 s, with two maxima at τ = −9 s and τ = −15 s. The shift of the maximum correlation towards τ < 0 indicates that the same piece of information is statistically faster distributed among the German stocks than among the American ones. On the other hand, the memory of this piece of information decays more slowly among the German stocks, for which λ1 reaches the noise level only after 3 min, as compared to 2 min in the case of the American stocks. These delays at the same time define the efficiency horizons of these markets with respect to the information flow between them. Inside these markets, memory effects also last for 2–3 min, as their autocorrelation properties indicate. Fig. 49 shows |λ1(τ)| calculated for the following single-market lagged correlation matrices defined for both markets:

C_DJ(τ) = (1/T) X_DJ [X_DJ(τ)]^T,   C_DAX(τ) = (1/T) Y_DAX [Y_DAX(τ)]^T.   (76)


Fig. 48. Absolute values of the largest (histograms) and the second largest (solid lines) eigenvalue of the τ-lagged asymmetric inter-market correlation matrix C(τ) constructed from time series of the stock returns corresponding to companies listed in DJIA and DAX. Each entry of C(τ) is the correlation coefficient for a pair of stocks in which one stock is listed in DJIA and the other in DAX30. The τ-dependence is shown for three different time scales Δt. τ > 0 denotes delaying the German market, τ < 0 denotes delaying the American one. The case of |λ1(τ)| > |λ2(τ)| corresponds to purely real λ1(τ).

Fig. 49. Within-market correlations expressed by the modulus of the largest eigenvalue of the single-market lagged correlation matrices (Eq. (76)) for the stocks listed in DJIA and in DAX. In the first approximation, the function λ1(τ) may be treated as a generalization of the correlation coefficient for multivariate data. The time lag was restricted here to τ > 0 due to the symmetry condition C_αβ(τ) = C_βα(−τ). Values of λ1 for τ = 15 are not shown due to an inherent artefact of the numerical procedure. For τ ≤ 3 min, λ1(τ) has a vanishing imaginary part. The horizontal grey zone denotes the GinOE region.

The plots in Fig. 48 show that |λ2(τ)| is essentially a constant function of τ for all the considered time scales Δt. In addition, its value does not exceed the GinOE upper bound for random eigenvalues. This signifies that the eigenvalue spectrum of C(τ) consists of at least N − 1 random eigenvalues and at most a single non-random one. Two different cases of this spectrum are plotted in Fig. 50. Our results show that during the analysed period 1998–1999 the coupling between the stocks listed in DJIA and DAX had a one-factor character and – from the perspective of the other market – both groups of stocks behaved as single degrees of freedom. That is, the cross-market correlations were not sensitive to company profiles, market sectors, etc. This conclusion is qualitatively different from the ones drawn from the analyses of the lagged cross-correlations within the same market, where a clear multi-factor structure was seen [182,199]. Although at first glance strongly counter-intuitive, the faster information spreading among the stocks traded in the German market is consistent with the observation from Section 3.2.4 (Fig. 14) that the stocks comprised by DAX are relatively


Fig. 50. Eigenvalue spectra of the inter-market lagged correlation matrix C(τ) calculated from time series of high-frequency returns (Δt = 3 s) of 30 stocks listed in DJIA and 30 stocks listed in DAX during the period 1998–1999. Two characteristic situations for different time lags τ are presented: one large, purely real eigenvalue λ1 together with N − 1 random eigenvalues (left) and a completely random spectrum (right). Shaded regions denote the range allowed for the GinOE matrices with the same variance of their elements as in the case of a typical matrix C(τ).

strongly coupled even at short time scales, while the American stocks reveal more individual behaviour at such scales. Before they become noticeably coupled due to information spreading, some time has to pass, typically from a few seconds to a few tens of seconds. Importantly, this effect does not concern only the Dow Jones stocks traded on the NYSE, where the delays may be caused by the physical trading. An analogous study of the correlations between the DAX stocks and 30 highly capitalized stocks traded in NASDAQ showed the same phenomenon of the maximum correlations shifted towards τ < 0. The only difference between these two cases was the considerably smaller values of λ1(τ) in the NASDAQ-DAX case. In general, it seems that this phenomenon of relatively slow information spreading is characteristic of the American stock markets.

5.4.

Apart from the above examples of the application of the asymmetric real (GinOE) random matrix ensembles to study empirical data, the non-Hermitian matrices have so far been found useful in various areas of physics, like, e.g., quantum chromodynamics [213–215], quantum chaos [216,217], quantum scattering phenomena [218], and also in random networks [219]. All these and other applications show that there is a strong and growing demand for further development of the random matrix theory, especially in the direction of the asymmetric correlation matrices. It is conceivable that many new and important applications of such matrices are still to be found. This is because any really complex system is in principle at least partly governed by time-lagged correlations. As was mentioned in Section 5.1, in this case the use of the Ginibre orthogonal matrices as the reference ensemble is limited, since it requires a Gaussian distribution of the matrix elements, a condition which is not strictly fulfilled by the empirical correlation matrices, especially if the signals are too short. In this context, the most desired future theoretical development from the practical perspective is the derivation of the general form of the exact formula for the eigenvalue distribution of the random asymmetric real correlation matrices as a function of the parameter Q = T/N. However, this complex-plane counterpart of the Marčenko–Pastur formula (33) is still missing, and its derivation seems to be a significant intellectual challenge.

6. Lack of characteristic scale

Power-law relations associated with scaling f(ax) = a^β f(x) are characteristic of systems passing through a critical point or residing in its neighbourhood (Section 1.3.5). This is especially evident in the case of the power-law divergent correlation length ξ (Eq. (11)), which leads to a power-law decrease of the correlation and autocorrelation functions (Eq. (12)). An important aspect of criticality is also the power-law dependence of the number of events on their size, a theoretical model of which is self-organized criticality. In addition, scaling means self-similarity, which is fundamental to fractal structures and is often considered one of the indicators of complexity (Part 7). From this perspective, the existence of power-law relations in empirical data is always interesting, because it can be a prerequisite suggesting a possible complex nature of the processes underlying the measured observables. A prerequisite, not a proof. It cannot be a proof, since power-law dependences can also be a consequence of relatively simple processes. In this section, we present the results associated with the power laws observed in financial and linguistic data.
Such power laws are among the most important characteristics of these two data types. Apart from the scaling itself, the possible deviations from it are equally or even more significant, especially if the scaling can carry some traits of universality. It should be noted, however, that we do not present here all the results related to the lack of characteristic scale; some additional ones, related to scaling in complex networks, will be presented in Part 8.


6.1. Power laws in financial data

6.1.1. Central Limit Theorem and distribution stability

The Central Limit Theorem (CLT) in its well-known classic form states that if X_i, i = 1, . . . , n are independent identically distributed random variables with finite mean µ and variance σ², then the variable X = X_1 + · · · + X_n, after a proper rescaling: (X − µn)/(σ√n), has a distribution which in the limit n → ∞ is equivalent in the measure sense to the normal distribution N(0, 1):

lim_{n→∞} P( u_1 ≤ (X − µn)/(σ√n) ≤ u_2 ) = ∫_{u_1}^{u_2} du (1/√(2π)) exp(−u²/2),   (77)
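A quick numerical illustration of Eq. (77) (our sketch; the choice of exponential variables and the sample sizes are arbitrary): standardized sums of iid exponential variables approach N(0, 1), with the excess kurtosis, equal to 6/n for this distribution, decaying as the number of summands grows (the estimates for large n are dominated by sampling noise).

```python
import numpy as np

rng = np.random.default_rng(1)
for n in (1, 10, 100, 1000):
    s = rng.exponential(1.0, size=(10_000, n)).sum(axis=1)
    z = (s - n) / np.sqrt(n)               # (X - mu*n) / (sigma*sqrt(n))
    kurt = np.mean((z - z.mean())**4) / np.var(z)**2 - 3.0
    print(f"n = {n:5d}   excess kurtosis = {kurt:6.3f}")
```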

for all finite u_1, u_2 [220]. If the condition about the existence of the first two finite moments is not fulfilled, then the limiting distribution for the variable X is not the Gaussian distribution but one of the family of the Lévy stable distributions. These distributions are given by their characteristic function L̂_{α,β} [221]:

L̂_{α,β}(t) = exp{ iγt − c|t|^α [1 + iβ (t/|t|) ω(t, α)] },   (78)

ω(t, α) = tan(πα/2) for α ≠ 1,   ω(t, α) = (2/π) ln|t| for α = 1,

where γ ∈ R, c ≥ 0. The constants 0 ≤ α ≤ 2 and −1 ≤ β ≤ 1 define the asymptotic behaviour in the limit x → ±∞ and the asymmetry, respectively. The Lévy stable distributions do not have a closed analytic form in x, but their important property is a power-law decrease in the limit of large x:

L_{α,β}(x) ∼ 1/|x|^{1+α},   x → ±∞.   (79)
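The tail behaviour of Eq. (79) can be checked on simulated data, e.g., with scipy.stats.levy_stable; the rank-based slope below is only a crude indicator (and sampling from this law can be slow for large samples), but it should return a value close to the assumed α.

```python
import numpy as np
from scipy.stats import levy_stable

alpha, beta = 1.5, 0.0                          # a symmetric stable law
x = np.abs(levy_stable.rvs(alpha, beta, size=100_000, random_state=2))

tail = np.sort(x)[-2000:]                       # far-tail order statistics
surv = np.arange(len(tail), 0, -1) / len(x)     # empirical P(|X| > x)
slope = np.polyfit(np.log(tail), np.log(surv), 1)[0]
print("fitted tail exponent:", -slope)          # should be close to alpha
```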

A special case of the Lévy stable distributions, obtained for β = 0 and α = 2, is the Gaussian distribution. The Lévy stable distributions, although they play an important role in mathematics, are basically non-physical, because in the real world there exist no processes (‘‘Lévy flights’’) which would produce empirical data with infinite moments. However, these distributions can well approximate data in some range of x. For this reason, a family of more realistic distributions called ‘‘truncated Lévy distributions’’ was introduced [222], in which the tails of the standard Lévy stable distributions are exponentially damped in the limit of large x [223]:

L^tr_{α,β}(x) ∼ (c± / |x|^{1+α}) e^{−γ|x|},   (80)

where γ > 0 and c± are constants dependent on the sign of x. Due to the finite variance of the distributions given by Eq. (80), a sum of the associated random variables has a distribution that in the limit n → ∞ converges to the Gaussian distribution. The smaller the value of the damping parameter γ, the slower this convergence is. The power-law decrease of the distribution slopes that is commonly observed in empirical data may not necessarily be related to actual power-law distributions or the Lévy stable distributions. An asymptotically power-law decrease, or a decrease that resembles a power law in some specific range of x, can also be observed in other distributions, among which the most common are the log-normal distribution given by:

p_LN(x) = (1 / (x√(2πσ²))) exp(−[ln(x/x_0)]² / (2σ²)),   (81)

the stretched exponential distributions [224]:

p_SE(x) = c e^{−(x/τ)^β},   (82)

where 0 < β < 1, and the q-Gaussian distributions, which will be presented in more detail in a later part of this section.

6.1.2. Stylized facts

For the financial markets, a characteristic of fundamental importance is the shape of the distribution of price fluctuations. From a theoretical perspective, this shape allows for some insight into the nature of the processes governing the market's evolution and for constructing realistic models of these processes, while from a more practical perspective it allows one to estimate investment risk. In the literature, the earliest studies based on small data samples contributed to the formulation of a hypothesis about the Gaussian distribution of financial returns. This hypothesis was further elaborated into the first model of price evolution in which the price was subject to a random walk and the returns at a given temporal scale Δt were described by IID random variables with a normal distribution [225]. The proposed shape of the return distribution was a natural consequence of a large number of independent ‘‘shocks’’ paced by the transactions, to which the price is exposed during a


Fig. 51. Time series of returns for 30 companies listed in Dow Jones Industrial Average (Δt = 5 min). The most extreme price changes are attributed to the respective companies. Horizontal dotted lines indicate the specific levels of price change: ±5%, ±10%, and ±20%.

sufficiently long time interval Δt, and of the Central Limit Theorem. The Gaussian model used to be the ‘‘model in force’’ for the processes taking place in financial markets and, owing to its mathematical simplicity, it is in some aspects used till now (e.g., in the option pricing theory of Black and Scholes [226]). However, as can be seen in Fig. 51, it typically happens that in time series of returns there occur values whose probability in the Gaussian case is so small that their existence cannot be accounted for based on this model. In fact, the returns are more satisfactorily modelled by leptokurtic distributions. It also turned out that the distributions corresponding to different time scales Δt have similar shapes [227]. Since for the returns the following relation holds:

r_{NΔt}(t_i) = Σ_{k=0}^{N−1} r_{Δt}(t_i + kΔt),   (83)

such a result might have suggested that the processes behind the observed price fluctuations resembled the Lévy processes with stable distributions. With the availability of data recorded electronically at high sampling frequency, there emerged a possibility of gaining insight into more distant regions of the distribution tails, corresponding to less frequent events, and a possibility of verifying the hypothesis of the distribution's stability. The results obtained [228] for the S&P500 index sampled with one-minute frequency showed that the return distributions have features similar to the Lévy stable distributions predominantly in their central part, while their peripheral parts decline noticeably faster than the homologous regions of the stable distributions. This outcome was an impulse for the introduction of the truncated Lévy distributions (Eq. (80)) and the associated stochastic processes [222,223]. The instability of the return distributions was confirmed by the widely cited extensive studies of the American stock market [144,145], which demonstrated that the data are characterized by distributions with power-law tails decreasing as 1/x^{1+α} with the exponent α ≃ 3 shared across many time scales, up to Δt = 16 trading days for the stocks and Δt = 4 trading days for the S&P500 index. These results were qualitatively supported, with only minor quantitative differences, for several other stock markets [145,229–231], the forex market [146,232], and commodity markets [143], which led to an attempt at universalization of this property over all the markets as the ‘‘inverse-cubic power law’’. On the other hand, several studies based on data from small (like the Norwegian) or emerging (like the Brazilian) markets indicated that on these markets the existence of the inverse-cubic power-law scaling is less obvious [233–235]. Interestingly, no systematic relation between the distribution's shape in the limit of large r_{Δt} and the degree of market maturity can be identified. Fig. 52 shows distributions of returns calculated at short time scales Δt for three sets of stocks traded on three distinct markets and for two currency exchange rates. Because the distribution shapes for negative and positive returns reveal only minor quantitative differences, the distributions of the absolute returns are plotted. A power-law shape of the tails in all the distributions is clearly seen, with the exponents falling in the range from α ≃ 2.8 (the American companies) to α ≃ 3.8 (the JPY/GBP exchange rate). Owing to the presentation of the results in cumulative form, the slope index α can be directly compared with the parameter describing the asymptotic behaviour of the Lévy stable distributions given by Eq. (79). It turns out that the empirical distributions in Fig. 52 are not stable in the Lévy sense, because in all the cases α > 2, but they are nevertheless strongly leptokurtic. The origin of such peripherally power-law, unstable distributions of the financial returns that are observed on many markets has not been decisively identified yet. Some attempts carried out in this direction suggested that this phenomenon can either be an effect of nonlinear interactions among the investors trading according to different strategies and investing horizons [236], a consequence of the power-law distribution of traded volume and of the arrivals of massive orders in the


Fig. 52. Cumulative distributions of absolute values of the normalized returns for exemplary high-frequency data sets: 100 American companies with the highest capitalization (US 100, years 1998–1999), 30 German companies listed in DAX (DE 30, years 1998–1999), 45 large Polish companies (PL 45, years 2000–2005), and the USD/EUR and JPY/GBP exchange rates (years 2004–2008). The US 100 data set corresponds to Δt = 5 min, while the other ones correspond to Δt = 1 min. A cumulated Gaussian distribution N(0; 1) together with the schematically drawn cumulative power-law distributions P(X > x) = 1/x^α serve as reference frames.
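A standard way of quantifying such tails is the Hill estimator of α, sketched below; the number k of tail order statistics kept is a judgment call and should be scanned (a ‘‘Hill plot’’) before any value is trusted.

```python
import numpy as np

def hill_exponent(abs_returns, k=500):
    """Hill estimator of the tail exponent alpha from the k largest
    values of |r|; for inverse-cubic data it should hover around 3."""
    xs = np.sort(abs_returns)
    threshold = xs[-k - 1]                     # the (k+1)-th largest value
    return k / np.sum(np.log(xs[-k:] / threshold))
```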

Fig. 53. Probability distribution of the absolute returns for the 100 American companies with the highest capitalization (Δt = 5 min), approximated by different model distributions: a log-normal distribution p_LN(x), a stretched exponential distribution p_SE(x), a power-law (Pareto) distribution l_α(x), and a truncated power-law distribution l^tr_α(x). The range of x in which the empirical distribution is approximated has been distinguished by the lack of shading.

context of the square-root functional dependence of the returns on the volume size [237], or the power-law distribution of liquidity fluctuations expressed by the presence of significant price gaps in the limit-order books [238]. (In this case, even a small transaction can remove the barrier formed by the orders that screen this gap from the best price, open the gap, and engender a large jump of the best price in the direction of the imbalance in the order volume.) Recently, a new idea has been proposed that the power-law tails are associated with the individual hedging strategies of large institutional investors using financial leverage [239]. It must be admitted, however, that the observed power-law decrease of the return distribution tails may only be apparent and it can in fact be an approximation of the actual behaviour of the form S(x)/x^α ∼ 1/x^α, where S(x) is a slowly varying function for large x. Such a situation is illustrated in Fig. 53, in which the empirical probability distribution of the absolute returns (Δt = 5 min) for the 100 largest American companies was fitted by a few different model distributions: the log-normal (Eq. (81)), the stretched exponential (Eq. (82)), a power law and an exponentially truncated power law (the counterpart


Fig. 54. Autocorrelation function C_aut(τ) of the DAX returns with Δt = 3 s (data from the years 1998–1999) and of the S&P500 returns with Δt = 1 min (the years 2004–2006). The exponential function a exp(−γτ) has been fitted to the first of these two relations. The standard deviation of C_aut(τ) in the case of random correlations determines the noise level.
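The measurement behind Fig. 54 can be sketched as follows: the sample autocorrelation of a return series and a least-squares fit of a exp(−γτ) on the logarithmic scale (the variable names are placeholders; the fit range should mimic the 30 s – 5 min window mentioned below).

```python
import numpy as np

def autocorrelation(r, max_lag):
    """Sample autocorrelation C_aut(tau) of a return series r."""
    r = (r - r.mean()) / r.std()
    return np.array([np.mean(r[:-k] * r[k:]) for k in range(1, max_lag + 1)])

# Hypothetical usage on a return series `returns`:
# c = autocorrelation(returns, 100)
# lags = np.arange(1, 101)
# sel = c > 0                                 # a log-fit needs positive values
# slope, ln_a = np.polyfit(lags[sel], np.log(c[sel]), 1)
# print("gamma =", -slope, "  a =", np.exp(ln_a))
```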

of the truncated Lévy distribution for α beyond the stability range). As can be seen, except for the stretched-exponential distribution, which is the worst at modelling the data, the theoretical distributions do in fact approximate the empirical distribution and, at least visually, none of them can be rejected as a null hypothesis in the chosen range of x. The same refers to the q-Gaussian distributions, which will be introduced later in this section (see Fig. 61). A lack of characteristic scale might suggest that there are long-range correlations in the consecutive returns, but the empirical results do not support this suggestion. The autocorrelation function C_aut(τ) decreases very fast in this case, reaching the noise level after a few minutes. This effect is exhibited in Fig. 54 for S&P500 and DAX. In the latter case, the index was reconstructed from time series of stock returns with an excellent temporal resolution of Δt = 3 s, which allowed us to obtain a sample sufficient to fit the results with the exponential function a exp(−γτ), where γ = 0.45. This function well reflects the behaviour of C_aut(τ) for time lags between about 30 s and 4–5 min. For S&P500, the decay time of the autocorrelation is similar, but due to the longer time scale of the returns (Δt = 1 min), at which noise effects are smaller, the strength of the autocorrelation is initially larger than for DAX (at the same one-minute scale these dependences are similar to each other). This outcome can be compared with the multivariate autocorrelation calculated for the companies listed in DJIA or DAX shown in Fig. 49 (Section 5.3). In the light of these properties of the stock returns, the rather slow convergence of their distributions to the Gaussian distribution at long time scales Δt is a phenomenon difficult to account for. These distributions are not stable even at very short time scales of the order of 1 min and less, while – according to the results published in [144,145] – their shape seems to be invariant under aggregation up to daily scales or longer, at which the returns are sums of even a few thousand returns corresponding to Δt = 1 min. In view of the lack of long-range linear autocorrelations in time series of the returns, the responsibility for the weak convergence of their distributions should be attributed to nonlinear correlations and microscopic effects. Long memory, which is absent in time series of returns, is, in contrast, the most striking property of volatility, defined as the return modulus v_Δt(t_i) = |r_Δt(t_i)| (instant volatility) or as the standard deviation of returns v_Δt(j) = σ_j(r_Δt) (mean volatility) calculated in temporal windows of length T_w ≫ 1 (j = 1, . . . , T/T_w), indexed by j. The autocorrelation function of the instant volatility derived for exemplary time series of USD/EUR and S&P500 is presented in Fig. 55. C_aut(τ) decays in a power-law manner τ^{−β} with the exponent β ≃ 0.3, whose value is typical for volatility and only marginally depends on the market from which the data were taken [145,240,241]. Both time series were partially detrended by removing the intraday trend (whose trace is nevertheless still visible), which forms strong oscillations in C_aut(τ) with a period of 1 trading day that can mask any power-law characteristics. For S&P500, the scaling behaviour in C_aut(τ) is broken after about 2500 min (7 trading days).
For USD/EUR, after as little as 1440 min (1 trading day) the initial relationship changes into a slightly slower decline and finally breaks down an order of magnitude later than its counterpart for S&P500. At large lags τ, the autocorrelation function changes sign and becomes negative, after which it starts to slowly oscillate between negative and positive values (inset in Fig. 55). Long memory is directly linked with the effect of volatility clustering (already mentioned in Section 2.3). Prices are inclined to make sudden jumps if in the period immediately preceding the present moment their movements have already been large and abrupt, and to make small jumps if the previous movements have been small as well. If one considers a signal being a record of price fluctuations (see Fig. 7), this signal is strongly nonstationary: intervals of tame behaviour with small price fluctuations are interlaced with intervals of wild behaviour with large fluctuations. Such clustering can have its basis in the fact that the asset prices do not reflect only the information that arrives at the market at a given


Fig. 55. Autocorrelation function C_aut(τ) for the instant volatility of the S&P500 index and the USD/EUR exchange rate (Δt = 1 min) after removing the intraday trend. Dashed lines denote the fitted power-law function τ^−δ. Inset: the same for the S&P500 index, but in linear scale. Long periods of negative autocorrelation can be seen here.

moment, but also the full diversity of nonlinear processes associated with making decisions and using different strategies by the investors (e.g., [239]). This is because different pieces of information arrive at the market relatively rarely as compared with the average transaction frequency and the frequency of price changes. Information might thus be freely absorbed by the market in a cascade-like way, by descending from long time scales, typical for the frequency of news releases, to short time scales, typical for trading [146]. One may notice here a slight parallel to the phenomenon of turbulence, in which energy is transported from large spatial scales to smaller ones, where it can be dissipated — but this analogy cannot be taken too literally [242,243]. It is also worth noting that financial data do not have time reversal symmetry. This can be seen, among others, in the influence of historical volatility at long time scales on contemporary volatility at short time scales [244]. In addition to the volatility case, a power-law behaviour of the autocorrelation function was found in many other financial observables (Section 2.3). Among them, long memory in the quantities related to the market microstructure – the size and the type of arriving orders [136,245] – has a special importance for the understanding of the nature of the processes ruling the evolution of markets. This memory is supposed to stem from the existence of large orders which must be divided into smaller ones and which are executed gradually over many hours or even days [131,137]. It might seem that in this way, for the long period of time during which such an order is being executed, the market could not maintain a balance between supply and demand and should evolve in a direction set by this order. Thus, a piece of information arriving at the market might have an impact on price movements as long as the process of absorbing the orders evoked by this information is not completed. This behaviour might create a considerable margin of market inefficiency conflicting with the EMH. Detailed studies reveal, however, that the market is somehow able to withstand this unbalancing force drawing it out of the equilibrium state, by counteracting it with liquidity adjustments [136]. So, by a process of adaptation, the market autonomously asserts its own efficiency (provided, of course, that we neglect the at-most-a-few-minutes-long autocorrelation in the returns).

6.1.3. Impact of collective couplings on return distributions

One of the interesting effects seen in the fluctuation distributions of financial data is the similarity of the distributions corresponding to the returns of individual stocks and those of market indices, which are weighted sums of many stock prices. Since the return distributions are not stable, the Central Limit Theorem should force the distributions corresponding to the index returns to be closer to a Gaussian. This is not the case, though. Even for the strongly peripheral parts of the distributions, in a broad range of time scales Δt, the slopes have the same functional form with the same power-law exponent in both types of data [145,246]. The lack of convergence means in this context that the assumption about the independence of the summed returns of different stocks must be violated. Indeed, as the results presented in Section 3.2.2 indicate, the price movements of different stocks are typically correlated, forming a whole hierarchy of couplings within the market.
Clearing these couplings by independent randomizations of the time series corresponding to different stocks leads to a significant improvement of the convergence [145]. This issue can also be approached from another direction, by asking how the shape of the index return distribution changes with varying coupling strength between the stocks listed in this index [246]. In order to address this question, one can exploit the fact that the inter-stock correlations are unstable. Studies based on the correlation matrix formalism showed that correlations inside a group of companies vary considerably in time. The average correlations in a moving window, expressed by the largest eigenvalue λ_1(t), fluctuate between an almost complete lack of coupling (λ_1 ≈ λ_max, where λ_max is the upper bound of the eigenvalue spectrum for the Wishart random matrices) and an almost ‘‘rigid’’ market (λ_1 ≈ N, where N is the number of companies). It was also found that market falls, which are characterized by strong price


Fig. 56. Cumulative distributions for absolute normalized returns of the DAX index and of the stocks of the related 30 companies for the periods of strong (ζ ≫ 0.5) and weak (ζ ≪ 0.5) couplings among the stocks. The data correspond to the years 1998–1999 and to Δt = 5 min. The slanting dashed line denotes the cumulative power-law distribution with the limiting slope for the Lévy stable distributions.

fluctuations, are more collective than market rallies, when price fluctuations are much smaller [247,248]. The instability of correlations is observed as well in the currency market in Fig. 25, where the largest eigenvalue exhibits especially strong oscillations for GHS and JPY as the base currencies. Time series of the stock returns for the N = 30 companies listed in DAX were divided into disjoint windows w_j of length T_w = 30 (j = 1, . . . , T/T_w). The time series for Δt = 5 min (a scale at which the average correlations are already well developed, see Fig. 14) were selected for the analysis. The length of each series was T = 53,847, which gave the number of windows n_w = T/T_w = 1794. In each window the correlation matrix was calculated together with its eigenvalue spectrum. Since our subject of interest is the index, which describes the average behaviour of the whole group of 30 stocks, the analysis was restricted to the largest eigenvalue λ_1(w_j) only. The correlation strength was parametrized by a value of ζ defined by the following equation:

$$\zeta = \frac{\#\{j : \lambda_1(w_j) < \Lambda\}}{n_w}, \qquad (84)$$

where Λ denotes a threshold discriminating the values of λ_1(w_j). The parameter ζ thus determined the fraction of windows for which the largest eigenvalue was smaller than a given threshold. By choosing the specific values ζ_S = 0.95 and ζ_W = 0.05, defining, respectively, the strong and the weak inter-stock couplings, we could determine the associated threshold values Λ_S and Λ_W and then select the windows which fulfilled the proposed criteria. Finally, we could derive the return distributions for the stocks and the DAX index in the selected window sets. In this way, we obtained separate distributions for the periods of strong and weak market couplings. Fig. 56 shows the results. The distributions for the absolute stock returns display no qualitative difference between the periods of collective market evolution and the periods dominated by noise. The observed quantitative difference is small enough that, in the first approximation, it can be concluded that the distribution corresponding to the stocks is invariant under a change of the coupling strength. The case of the distributions of the absolute index returns looks different, however. For the windows with strong correlations, the tail of the distribution is much thicker than its counterpart for the windows with weak correlations, whose tail decays in an almost Gaussian manner. The samples from which both distributions were created were so small (T′ = 2670 data points) that it was not statistically justified to fit a power-law function to the slope of the distribution corresponding to the strongly coupled windows. Despite this, a comparison with the power-law function with α = 2 depicted in Fig. 56 indicates that, in the case of a larger sample, this distribution could have the form of a truncated Lévy stable distribution. A similar outcome was obtained for the index returns at different time scales Δt. This may suggest that, from the index point of view, the market evolution has a rather compound character and many interwoven phases can be pointed out, in which processes with different statistical properties may play the leading role. Such emergence of different processes from the overall noise and their taking over dominance is perhaps itself a stochastic process (in the same way as the volatility is), but the repeatable components cannot be neglected here, either. The existence of characteristic periods of a trading day in which the market behaves differently than in the other periods has already been discussed in Section 4.1. In the German market, for example, such a period occurs near 14:30 h, when the market reacts to the economic news released in the United States. If the relevant short period of time (say, 14:25–14:35) was extracted from each trading day and the index return distribution was calculated only from these periods and some reference periods (e.g., 10:00–14:00), then these distributions would be significantly different. This is evident in Fig. 57.
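Returning to the coupling-strength analysis of Eq. (84) and Fig. 56, the window-selection procedure can be sketched in a few lines of Python. The code below is only a hedged illustration (the synthetic one-factor returns stand in for the actual DAX data, and all names are our assumptions): it computes λ_1(w_j) in disjoint windows and picks the thresholds Λ_S and Λ_W via the quantiles corresponding to ζ_S = 0.95 and ζ_W = 0.05.

```python
import numpy as np

def window_eigenvalues(returns, Tw=30):
    """Largest eigenvalue lambda_1(w_j) of the correlation matrix in each
    disjoint window of length Tw of a (T x N) array of stock returns."""
    T, N = returns.shape
    lam1 = np.empty(T // Tw)
    for j in range(T // Tw):
        w = returns[j * Tw:(j + 1) * Tw]
        z = (w - w.mean(axis=0)) / w.std(axis=0)
        C = z.T @ z / Tw                       # correlation matrix in window j
        lam1[j] = np.linalg.eigvalsh(C)[-1]    # its largest eigenvalue
    return lam1

# illustrative stand-in: N = 30 series coupled through a common 'market' factor
rng = np.random.default_rng(0)
market = rng.standard_normal((53_847, 1))
R = 0.3 * market + rng.standard_normal((53_847, 30))

lam1 = window_eigenvalues(R)
Lambda_S = np.quantile(lam1, 0.95)    # threshold realizing zeta_S = 0.95, Eq. (84)
Lambda_W = np.quantile(lam1, 0.05)    # threshold realizing zeta_W = 0.05
strong = lam1 >= Lambda_S             # strongly coupled windows
weak = lam1 < Lambda_W                # weakly coupled windows
# separate return distributions are then accumulated over each window subset
```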


Fig. 57. Cumulative distributions for absolute normalized returns of the DAX index in selected periods of trading days. The data correspond to the years 2004–2006 and to Δt = 1 min. Power-law functions y = 1/x^α were least-squares fitted to the empirical distribution tails.

The interval near 14:30 could in principle be narrowed down even more, and then one may expect that the difference between the corresponding distribution and the reference one would be even larger, but this could be done only at the expense of statistical significance due to the then too-small data sample.

6.1.4. Acceleration of market evolution

It was shown in Refs. [144,145], based on data sets comprising the years 1994–1995 (high-frequency quotations) and the years 1962–1996 (daily quotations), that the return distributions are characterized by tails with power-law decay with the exponent α ≈ 3 in a broad range of temporal scales: from 1 min to 16 trading days in the case of individual stocks, and to 4 trading days in the case of the S&P500 index. However, from today's perspective, considering in the same joint analysis both the recent data and the data from rather distant years could be a methodological inconsistency originating from an unsound assumption that the statistical properties of financial data are stable in time. Intuition suggests, though, that the development and self-organization of the markets observed over the decades of their activity, the technological progress, the increase of market volume and of the number of investors, as well as the advancing globalization of the financial world, cannot leave the nature of the data untouched. Let the transaction frequency be an example of these changes. A few decades ago, in the era before electronic trading systems became widespread, a time scale of 1 min was considered very short. The institutional investors responsible for the majority of traded volume were able to carry out a few transactions at most during this time. Nowadays, such large investors have access to systems ensuring transaction times of the order of milliseconds. From such a perspective, the scale of 1 min can be treated as very long, one in which several thousand or several tens of thousands of transactions can be done. This should lead to a shortening of memory effects in all types of financial data. In fact, by comparing the decay time of the return autocorrelation for the same S&P500 index, it is clear that while this time was equal to about 15–20 min in the data from the years 1983–1996 [145,241], it was squeezed to only 4 min in the years 2004–2006 (Fig. 54). A shortening of the lagged cross-correlations between the stocks from the same market was also identified [182]. The phenomenon of acceleration can also be seen in the distributions of stock returns [249,250]. Fig. 58 presents cumulative distributions of the absolute returns for the 100 largest American companies based on data from the years 1998–1999. With increasing Δt of the returns, a gradual decrease of the tail thickness is observed. If the tails are assumed to be power-law decreasing for each Δt, the power exponent gradually increases from α ≈ 2.8 for Δt = 4 min to α ≈ 4.5 for Δt = 780 min (2 trading days). Thus, for this group of stocks, the convergence is seen already at short time scales. However, this group comprises only the stocks with the highest liquidity, for which the number of transactions reached several tens of thousands a day during the studied period. Such stocks can have different dynamical properties than the less liquid but more typical stocks.
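A standard way of quantifying such tail exponents is the Hill estimator. The sketch below is an illustration only (the Pareto test data and the cutoff k = 2000 are our choices, not the method used in Refs. [144,145,249,250]):

```python
import numpy as np

def hill_alpha(x, k):
    """Hill estimator of the tail exponent alpha, using the k largest values
    of |x|; assumes P(X > x) ~ x^(-alpha) in the tail."""
    tail = np.sort(np.abs(np.asarray(x, dtype=float)))[-(k + 1):]
    return k / np.sum(np.log(tail[1:] / tail[0]))

# illustrative test: Pareto-tailed data with the 'inverse cubic' alpha = 3
rng = np.random.default_rng(0)
x = rng.pareto(3.0, size=200_000) + 1.0
print(hill_alpha(x, k=2000))   # close to 3; k must be varied to check stability
```

In practice the estimate has to be inspected as a function of k, since too small a k is noisy and too large a k reaches beyond the power-law regime.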
An extended analysis based on 1000 American companies with capitalization 8 · 10^8 < K_α < 5 · 10^11 USD showed that the power-law dependence of the inverse-cubic type, found earlier on time scales shorter than a few days, was in the studied period already broken on time scales of the order of tens of minutes. A similar result was obtained for the German stocks listed in DAX [249]. In the case of indices the breakdown occurs even faster, and the effect of progressive convergence to a Gaussian is observed as early as for Δt = 16 min [250]. A completely different behaviour is demonstrated in Fig. 59 by the return distributions for 100 stocks corresponding to companies with small capitalization (1.5 · 10^8 < K_α < 3 · 10^8 USD) and small liquidity. For Δt < 780 min the distributions are invariant with respect to Δt and their tails can be approximated by a power-law function with the exponent α = 3.35. This result indicates that the speed of convergence depends on the stocks' liquidity, which in turn is related to the number of


Fig. 58. Cumulative distributions of absolute normalized returns corresponding to different time scales Δt for the 100 American companies with the highest market capitalization. Data from the years 1998–1999. Gradual atrophy of the distribution's tail with increasing Δt is clear. Dashed lines depict the cumulative distributions: normal N(0; 1) and power-law P(X > x) = 1/x^α.

Fig. 59. Cumulative distributions of absolute normalized returns corresponding to different time scales Δt for 100 small American companies with capitalization 1.5 · 10^8 < K_α < 3 · 10^8 USD. Data from the years 1998–1999. The tails do not show equally significant marks of convergence towards the Gaussian distribution with increasing Δt as the tails in Fig. 58 did. Dashed lines denote the cumulative distributions: normal N(0; 1) and power-law P(X > x) = 1/x^α with the slope α = 3.35 equal to the slope of the empirical distribution for Δt = 16 min.

transactions. This receives some further support from other results, according to which the shape of the return distributions for a fixed Δt is the same as the shape of the return distributions corresponding to time intervals of variable length determined by a fixed number of consecutive transactions [238]. This means that if the transaction frequency drops, a distribution of the same shape starts to describe the data at some longer time scale. Since trading in the stocks of small-capitalization companies is less frequent than trading in the stocks of highly capitalized ones, we observe the effect seen in Figs. 58 and 59. These results can be verified independently by considering the dependence of the standard deviation of the returns on time scale, σ_r(Δt). In this case the time series were formed by the returns of artificial ‘‘indices’’ created as sums of stock prices corresponding to the companies from each of the two analysed groups. The results are shown in Fig. 60. The location of the points along the straight lines indicates a diffusion-like relation:

$$\sigma_r(\Delta t) \sim \Delta t^{\beta}, \qquad (85)$$

where β specifies the type of diffusion. For both groups of companies, two scaling regions can be distinguished, each with β > 0.5, which means anomalous diffusion: superdiffusion. At short time scales (Δt ≤ 10 min), the large-company index is more superdiffusive than at longer scales, at which the value β = 0.54 is close to the one for normal diffusion. In the case of the small-company index the situation is opposite: at long time scales the process is more anomalous than at short scales.
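The diffusion exponent of Eq. (85) can be estimated by aggregating the returns over growing time scales Δt and fitting the log-log slope of σ_r(Δt). The following sketch uses an i.i.d. test series as an illustrative stand-in for the empirical ‘‘index’’ returns:

```python
import numpy as np

def sigma_scaling(r, dts):
    """Standard deviation of aggregated returns sigma_r(dt) and the diffusion
    exponent beta from the power-law scaling of Eq. (85)."""
    r = np.asarray(r, dtype=float)
    sigma = np.array([r[:(len(r) // dt) * dt].reshape(-1, dt).sum(axis=1).std()
                      for dt in dts])
    beta = np.polyfit(np.log(dts), np.log(sigma), 1)[0]
    return sigma, beta

# i.i.d. returns give beta = 0.5 (normal diffusion); beta > 0.5 on real data
# signals the superdiffusion seen in Fig. 60
rng = np.random.default_rng(0)
r = rng.standard_normal(500_000)
dts = np.array([1, 2, 5, 10, 30, 60, 130, 390, 780])
sigma, beta = sigma_scaling(r, dts)   # beta close to 0.5 here
```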


Fig. 60. Dependence of the standard deviation σ_r on time scale Δt for time series of the returns calculated for ‘‘indices’’ being simple sums of stock prices for the 100 largest American companies (left) and for 100 small companies from the same market (right). Power-law functions Δt^β were fitted to the empirical data. The two different values β > 0.5 shown in each panel signify two different superdiffusion regimes whose crossover occurs at the points denoted by the × signs.

The transition from anomalous to approximately normal diffusion near Δt = 10 min seen for the large companies, and the lack of such a clear transition up to the longest scale (Δt = 780 min) seen for the small companies, remains in agreement with the behaviour of the return distributions in Figs. 58 and 59.

6.2. Stock market as a nonextensive system

The nonextensive statistical mechanics [251] is considered a generalization of the Boltzmann–Gibbs statistical theory for systems whose Boltzmann–Gibbs entropy:
$$S_{BG} = -k_B \sum_{i=1}^{W} p_i \ln p_i, \qquad (86)$$

where k_B is a positive (Boltzmann) constant and W is the number of different discrete states of a system, is nonextensive, i.e., it does not obey the asymptotic behaviour:
$$S_{BG}(N) \sim N \quad \text{for } N \to \infty. \qquad (87)$$

This is the case, for instance, for systems situated at the interface between order and disorder, which reveal a power-law dependence of physical quantities on order parameters, long-range correlations, and other phenomena observed in systems that undergo a phase transition or that permanently reside at the critical point. The standard statistical mechanics is unable to provide a satisfactory description of such systems; therefore there is space for a generalization of this theory onto systems that are nonextensive in the Boltzmann–Gibbs sense (87). The nonextensive statistical mechanics proves useful in the description of such systems. As discussed in Section 1.3, the critical state is believed to be a natural state of complex systems; therefore the nonextensive statistical mechanics appears naturally in this field of research. The existence of statistical dependences in financial data, especially the long-range correlations, means that the standard statistical mechanics, which is based on the assumption of the absence of such dependences, does not suffice to describe the data; the nonextensive statistical theory has to be applied. The nonextensive statistical mechanics was developed around the concept of nonadditive entropy [252]:
$$S_q = k_B \, \frac{1 - \int [p(x)]^q \, dx}{q - 1}, \qquad (88)$$

where p(x) is a probability distribution. The family of distributions maximizing the entropy (88) for 0 < q < 3 under the following conditions:
$$\frac{\int x \, [p(x)]^q \, dx}{\int [p(x)]^q \, dx} = \mu_q, \qquad \frac{\int (x - \mu_q)^2 \, [p(x)]^q \, dx}{\int [p(x)]^q \, dx} = \sigma_q^2, \qquad (89)$$

is the family of q-Gaussian distributions given (up to a normalization constant) by [253]:
$$G_q(x) \sim \exp_q[-B_q (x - \mu_q)^2], \qquad (90)$$
where
$$\exp_q x = [1 + (1 - q)x]^{\frac{1}{1-q}}, \qquad B_q = [(3 - q)\sigma_q^2]^{-1}. \qquad (91)$$
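For concreteness, Eqs. (90)–(91) translate into the following short Python sketch; the grid-based numerical normalization is our simplification (closed-form normalization constants exist for q < 3):

```python
import numpy as np

def q_exp(x, q):
    """q-exponential of Eq. (91); reduces to exp(x) in the limit q -> 1."""
    if np.isclose(q, 1.0):
        return np.exp(x)
    base = 1.0 + (1.0 - q) * x
    out = np.zeros_like(base)
    ok = base > 0
    out[ok] = base[ok] ** (1.0 / (1.0 - q))   # zero outside the support
    return out

def q_gaussian(x, q, mu=0.0, sigma2=1.0):
    """q-Gaussian of Eq. (90), normalized numerically on the grid x."""
    Bq = 1.0 / ((3.0 - q) * sigma2)           # Eq. (91)
    g = q_exp(-Bq * (x - mu) ** 2, q)
    return g / np.trapz(g, x)

x = np.linspace(-15.0, 15.0, 4001)
for q in (1.0, 1.3, 1.5):                     # tails thicken as q grows
    pdf = q_gaussian(x, q)                    # q = 1 recovers the Gaussian
```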

If the random variables are independent, the attractors for these distributions are the Gaussian distribution (for 1 < q < 5/3) or the Lévy stable distributions (q > 5/3), while if the random variables (or at least a specific class of them) are correlated,


Fig. 61. Cumulative distributions of absolute normalized returns corresponding to different time scales Δt for the 100 American companies with the highest market capitalization, together with the fitted cumulative q-Gaussian distributions. Each q-Gaussian is labelled by the associated value of the Tsallis parameter q. In order to better visualize the results, each q-Gaussian was multiplied by a positive factor c ≠ 1.

the q-Gaussians themselves become the attractors. The relevant generalization of the CLT has already been formulated [254,255]. Despite some small quantitative differences (see, e.g., [230,256]), the negative and positive wings of the return distributions are similar enough to each other that they can be considered together. Therefore, in Fig. 61 the cumulative distributions of the absolute returns are presented. Two important conclusions can be drawn from this figure. First, it seems that among all the statistical distributions considered in this section, the q-Gaussians model the empirical distributions best, in both their central and peripheral regions. Second, as Δt increases, the convergence of the empirical distributions manifests itself in a decreasing q. Although Fig. 61 shows distributions for the stock returns, qualitatively similar results were obtained for the index returns [230,250] and for the currency exchange rate returns as well [257]. This constitutes a major argument in favour of describing financial data in the language of nonextensive thermodynamics.

6.3. The Zipf law and the natural language

Another interesting type of power-law relation is a rank-ordered distribution of values of a random variable. The rank introduces an inverted hierarchy in which lower ranks are attributed to larger values, while higher ranks are attributed to smaller values. If {x} is a T-element set of the values of a random variable X, then the following relation holds: x_{R−1} ≥ x_R ≥ x_{R+1}, where R (1 ≤ R ≤ T) is the rank of the value x_R. Rank-ordering of the values of X is closely related to the cumulative distribution P(X > x) of this variable:

$$\lfloor T \, P(X > x_R) \rfloor \approx R, \qquad (92)$$

where ⌊·⌋ stands for the floor value of a number. The best-known relation of this type is the inverse proportionality of the number of occurrences (commonly called the frequency) F of a given word in a text sample to its rank, where the words must be ordered according to their frequency. The earliest reports of this relation appeared in Refs. [258,259], but it was only G.K. Zipf who approached this issue more systematically and carried out an analysis spanning several different languages [260]. His studies revealed that the inverse proportional relation F(R) has a rather universal character in the case of natural languages, which fully justifies the later granting of the status of ‘‘a law’’ to it. In a broader context, the notion of ‘‘a Zipf law’’ can be extended to all the inverse power-law relations in which one of the involved variables is a rank. Among such Zipf-like relations one can mention the dependences: of city population on a city's rank [261], of the number of earthquakes on the rank of their seismic moment [71], of the number of companies on the rank of their size [262], of the number of scientific publications on the rank of their citations [263], and many more [74].

6.3.1. The linguistic Zipf law

The Zipf law in its classic form is given by the following relation [260,264]:
$$F(R) = \frac{A}{R^{\beta}}, \qquad \beta \approx 1, \qquad (93)$$
where A is an empirical constant with a typical value A ≈ 0.1T. Often this law is also expressed in the form of the so-called inverse Zipf law [265]:
$$I(F) = \frac{A'}{F^{\beta'}}, \qquad \beta' \approx 2, \qquad (94)$$


in which I(F) denotes the number of words with frequency F. The power exponents in both versions of the Zipf law are mutually related: β = 1/(β′ − 1). The origin of the power-law scaling in F(R) and I(F) is a subject of debate. The original idea of Zipf links the scaling to the principle of least effort [261]. In a later expression in terms of information theory, this principle can be formulated as the optimization of information transfer between a speaker (source) and a listener (receiver). Each of these two persons undertakes a specific effort: the speaker encodes information by means of words, while the listener tries to decode it. The speaker aims at expressing the information in the simplest way possible, i.e., by using as few distinct words as possible. At the opposite end, the listener, whose principal task is to find an appropriate one-to-one mapping of the words onto corresponding notions, expects to receive the most precise message built from unambiguous words. Since these expectations of the speaker and the listener are contradictory, efficient transfer of information requires a compromise between these extremes. A formal statement of this problem supposes that the cost of using a word of rank R is given by S(R) ∼ log_k R, where k is the size of the alphabet, so that the cost is proportional to the word's length. (Hereafter, by a word we mean a specific sequence of characters terminated by a space.) If this word's probability is p(R), then the average cost per word and the average informative content of a word are expressed, respectively, by:
$$\bar{S} = \sum_R p(R) S(R), \qquad H = -\sum_R p(R) \ln p(R), \qquad (95)$$

where H is the information entropy defined by Eq. (1). In this case, the optimization consists in minimizing the quantity S̄/H with respect to the probability distribution p(R). In effect, a dependence of the type of Eq. (93) is obtained [264,266], leading to the rule that the words that are used most often should be the shortest ones, while those rarely used should be the longest ones. In fact, in real language such a rule is easily observed. It is worthwhile to notice that strong deviations from the Zipf law are observed in some non-standard conditions like, for example, schizophrenia, early infancy, or a battlefield, in which, instead of minimizing S̄/H, there is an inclination to minimize S̄ regardless of the information content of a message, or to maximize H regardless of the cost. From theoretical deliberations it follows that the variability range of the power exponent should not exceed the interval 0.7 ≤ β ≤ 1.67 [267]. A too-small value of β is connected with a substantial number of infrequent words, which require engaging significant mental resources of the speaker and the listener. In contrast, a too-large β means a poor vocabulary or a fixation of the speaker on a single narrow subject. In both cases this can lead to problems with communication. The above optimization approach assigns the process of language self-organization a fundamental role in the origin of the Zipf law. This hypothesis is a subject of strong criticism, however, because it is possible to explain the power-law scaling of word frequencies without referring to the self-organization of language, but instead only to the properties of simple stochastic processes. The first group of processes that can produce power-law distributions of the Zipf type are the ecologically inspired Yule–Simon processes [73,268], known also as a kind of preferential attachment processes. They act on whole words. It is assumed that at each step a new word is added to a text with probability ψ, or one of the already used words is repeated with probability 1 − ψ. Which of the words is to be used again depends on the probability η(R*) = F(R*)/Σ_{R*} F(R*) of drawing a word with the rank R* (defined in the already written part of the text). Then, for F(R*) ≫ 1, the relation (93) holds. The other group of relevant stochastic processes are the processes known as ‘‘typewriting monkey’’ or ‘‘intermittent silence’’ [269,270]. From the language structure perspective, they are low-level processes acting on individual characters (letters). In their simplest formulation, each character may be typed with the same probability θ = (1 − φ)/n, where φ is the probability of typing a space and n is the number of characters in the alphabet. The probability of writing a word with the rank R = n^k, where k is the word's length, is then:
$$p(R) = \phi \left( \frac{1-\phi}{n} \right)^{k} = \phi \left( \frac{1-\phi}{n} \right)^{\log_n R} = \phi \, R^{\log_n(1-\phi) - 1}, \qquad (96)$$

which gives Eq. (93) after substituting β = 1 − log_n(1 − φ). For a relatively small value of φ, it turns out that β ≳ 1, which is, at least approximately, consistent with the empirical values of β for some languages. In the light of the above results, it may seem doubtful whether the Zipf law comprises any significant information about natural language. However, viewing the process of writing words and texts as a purely stochastic process has a considerable drawback, since in such a case one neglects the fact that randomly generated texts do not possess all the properties of real texts written in natural language. This refers, for instance, to the distributions of word lengths, which in the ‘‘monkey language’’ can be arbitrarily large, while in natural language this length is restricted to at most tens of characters (except for situations when authors create words of unnatural length for stylistic reasons, but this rarely happens). If, in a procedure generating random texts, one places a restriction on the maximum word length with a threshold value that is typical for natural language, the resulting relation F(R) will no longer be consonant with the Zipf law [271]. Texts that are written in natural languages can also be distinguished from random texts by means of the information entropy of the I(F) distributions [265]. Moreover, a model of symbolic references (i.e., mappings of objects onto words) with a built-in Zipf distribution suggests that this distribution can facilitate the emergence of syntax as a method of linking distant objects [272]. In a spate of sometimes contradictory arguments, it then seems that although the Zipf law as a statistical relation may not necessarily have evolved as a consequence of the self-organization of language (which cannot be excluded, either), the existence of this law may have been a catalyst of shaping language as an efficient tool of communication.
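Both the rank-frequency statistics and the ‘‘typewriting monkey’’ benchmark of Eq. (96) are easy to reproduce numerically. In the sketch below, the alphabet size, the space probability, and the fitting range are our illustrative choices; the fitted exponent can be compared with the prediction β = 1 − log_n(1 − φ).

```python
import re
from collections import Counter
import numpy as np

def rank_frequency(text):
    """Rank-frequency list: word frequencies F(R) sorted in decreasing order."""
    words = re.findall(r"[a-z]+", text.lower())
    F = np.array(sorted(Counter(words).values(), reverse=True), dtype=float)
    return np.arange(1, len(F) + 1), F

def zipf_beta(R, F, rmin, rmax):
    """Least-squares slope of log F(R) versus log R in a chosen rank window."""
    sel = (R >= rmin) & (R <= rmax)
    return -np.polyfit(np.log(R[sel]), np.log(F[sel]), 1)[0]

# 'typewriting monkey' text: n equiprobable letters, space typed with probability phi
rng = np.random.default_rng(0)
n, phi = 10, 0.2
symbols = list("abcdefghij") + [" "]
probs = [(1.0 - phi) / n] * n + [phi]
text = "".join(rng.choice(symbols, size=1_000_000, p=probs))

R, F = rank_frequency(text)
beta_fit = zipf_beta(R, F, rmin=10, rmax=2000)
beta_theory = 1.0 - np.log(1.0 - phi) / np.log(n)   # Eq. (96): about 1.097
print(beta_fit, beta_theory)
```

The same `rank_frequency` and `zipf_beta` helpers apply unchanged to a real literary text, as in the ‘‘Ulysses’’ analysis discussed next.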


Fig. 62. Rank-frequency distribution for the original English version of ‘‘Ulysses’’ by J. Joyce. A power-law function (93) with the exponent β = 1.03 was fitted to the empirical distribution (slanting dashed line). Vertical lines denote the range of consistency of the empirical distribution with the Zipf law (11 ≤ R ≤ 4400).

6.3.2. Deviations from the uniform Zipf law

Actually, in spite of its fame, the Zipf law is valid only in a relatively narrow range of R. A classic Zipfian dependence for a literary text is exhibited in Fig. 62, based on the English text of ‘‘Ulysses’’ by J. Joyce. Its length is T = 265,272 words and it consists of V = 30,101 distinct words. A power-law function (93) was fitted to the empirical distribution. It approximates the data well in the range 11 ≤ R ≤ 4400 (6 ≤ F(R) ≤ 2500) if β = 1.03 (a value typical for the English language). Outside this range, and for R ≤ 10 in particular, the distribution clearly deviates from a uniform power-law behaviour. This deviation can be modelled for small R by means of Eq. (93) modified as follows:
$$F(R) = \frac{A}{(R + C)^{\beta}}, \qquad (97)$$

expressing the so-called Zipf–Mandelbrot law [264]. For comparatively large ranks R > 4400, the agreement between the data and the power law with β = 1.03 gradually weakens, because the number of words with F(R) < 6 becomes smaller than the power law would imply. In the case of rare words, a disagreement with the Zipf law is a typical property seen in the majority of long texts. However, for individual texts the statistics of F in this range of high ranks is poor. In addition, the maximum rank is always bounded by the length of a text, which significantly obstructs any analysis. An insight into the behaviour of F(R) in the range of infrequent words is possible only after extending the analysis to corpora composed of many texts. In extreme cases, corpora allow one to reach ranks comparable with the total number of words used in a given language. The analyses performed on such huge samples based on texts written in English show that the power-law relation observed for smaller ranks with the exponent β₁ = 1.05 breaks down for (circa) R > 5000 and gradually transforms into another power-law relation with β₂ = 2.3 [273,274]. From the lexical point of view, the existence of these two different power-law regimes can be explained as a consequence of the division of the vocabulary into a common core known to and used by all authors, which consists of several thousand basic words, and a specialized vocabulary with words that are used occasionally or only in single texts. What is more, in single texts there is a continuity of action, which implies that certain words occur more often than others not because of the general properties of language, but due to specific determinants connected with the content of a book or publication. An example of such a determinant can be the names of characters featuring in a particular novel, which therefore have low ranks there; but if this novel becomes part of a large corpus containing hundreds or even thousands of books, the corresponding words are moved on the Zipf plot towards much larger values of R. This can also be a factor leading to the breakdown of 1/R scaling in a corpus. The function F(R) with two scaling regions can be described by a single equation after generalizing the Zipf–Mandelbrot law at the differential-equation level [273]. Its form given by Eq. (97) is a solution of the following differential equation:
$$\frac{dF(R)}{dR} = -a[F(R)]^{\mu}, \qquad (98)$$
with µ = 1 + 1/β. By adding another term to the r.h.s. of this equation, one obtains:
$$\frac{dF(R)}{dR} = b[F(R)]^{\nu} - (a - b)[F(R)]^{\mu}, \qquad (99)$$
whose solution has two different forms depending on F(R):
$$F(R) \sim \left((\mu - 1)aR\right)^{1/(1-\mu)} \quad \text{for } F(R) \ll (a/b - 1)^{1/(\nu-\mu)}, \qquad F(R) \sim \left((\nu - 1)aR\right)^{1/(1-\nu)} \quad \text{for } F(R) \gg (a/b - 1)^{1/(\nu-\mu)}. \qquad (100)$$


Fig. 63. Frequency-rank plots for the English and Polish versions of ‘‘Ulysses’’ by J. Joyce, for the original Polish text of ‘‘Lalka’’ (‘‘The Doll’’) by B. Prus, and for the original English text of ‘‘Finnegans Wake’’ also by J. Joyce. Power-law functions defined by the exponent β were fitted to the empirical data. In the case of ‘‘Finnegans Wake’’ fitting a single power-law function is impossible, therefore the corresponding distribution must be characterized by two exponents: β1 and β2 .

Polish, as a Slavic language, possesses a considerably different grammar from English. This is visible even at the level of words, which are much more inflectable in Polish than in English (and in other Western European languages as well). For example, a typical Polish noun has 10–12 inflected forms related to different cases and numbers, while a typical verb has over 30 forms related to different tenses, persons, numbers, genders, moods and aspects. By comparison, a typical English noun has 2 forms (singular and plural) and a typical verb has 4–5 forms (e.g., the verb show has 4 derivative forms: shows, showed, shown, showing). In Polish, some inflected forms are used frequently, but most of the forms are used seldom, which suggests that in Polish texts the number of rare words is larger than in English ones. Since rare words are situated in the tail of the Zipf distribution, one may thus expect that for the Polish language the power exponent should be smaller than its counterpart for the English language. In fact, as a study of a Polish corpus of literary texts shows [275], in this case the Zipf law is valid with β_PL = 0.94 (as compared to β_EN = 1.05). A difference between the Zipfian plots for these two languages is clearly seen if one looks at F(R) calculated for the same piece of text. Fig. 63 shows such a comparison for ‘‘Ulysses’’ in the English original version and in a Polish (considered excellent) translation [276], for which β = 0.93. The Polish version is shorter (T = 238,064) but, due to the smaller β, it consists of many more distinct words (V = 53,487). It is worth noting that for the Polish version the scaling range of F(R) is very broad and one can observe even a slight overrepresentation of the rarest words (i.e., those occurring 1 or 2 times in the whole text) as compared to the power-law function. This is consistent with the above considerations. Fig. 63 also presents the F(R) distribution for a Polish 19th-century novel, ‘‘Lalka’’ (‘‘The Doll’’) by B. Prus. Its text is comparable to ‘‘Ulysses’’ in terms of length (T = 252,243) but – being stylistically a more conventional novel – it has a poorer vocabulary (V = 37,520). In this case (as well as in other works of this author) β = 0.99, which is notably high compared to the average Polish exponent β_PL. For high ranks, F(R) behaves here similarly to the English ‘‘Ulysses’’, deflecting downwards from a perfect power-law scaling. We use another of J. Joyce's novels, ‘‘Finnegans Wake’’ (the one that inspired Gell-Mann to invent the name for quarks), as the last example. In contrast to the texts discussed above, the plot of F(R) now does not display a more-or-less uniform scaling behaviour; there are two separate, narrower scaling regions instead. The first one corresponds to the range 20 < R < 200 and is represented by a surprisingly high value of β₁ = 1.25, while the second one corresponds to R > 500 and has a relatively small (as for English) exponent β₂ = 0.90. This result can be explained by the fact that ‘‘Finnegans Wake’’ is a highly experimental work with an exceptionally complicated structure and vocabulary taken from many different languages. A peculiarity of this novel is that its narrative intentionally mimics a dream and a state at the edge of dream and wakefulness, making extensive use of the stream-of-consciousness technique. These untypical stylistics translate into nonstandard statistical properties of the text.
One can observe here how the elevated complexity of a text can manifest itself not in the power-law decrease of F(R) expressed by the Zipf law, but rather in the deviations from this law.

6.3.3. Parts of speech and lemmatization

In a classic Zipfian analysis, words are treated as character sequences that neither have meaning nor play specific roles in a message. In such an analysis natural language is amorphous: after a random reshuffling of words or after attributing them to improper parts of speech, the frequency-rank distribution remains unaffected even though the text may lose its sense completely. However, real language has a structure, the words may not be arbitrarily shuffled, and their grammatical role may not be freely changed. This is because the information is carried not only by individual words themselves, but also by the context those words are put in. The context is given by both the meaning of the adjacent words and the grammatical


Fig. 64. Rank-frequency plots created from words belonging to specific parts of speech for the English text of ‘‘Ulysses’’ by J. Joyce (top) and the Polish text of ‘‘Lalka’’ by B. Prus (bottom). In the case of ‘‘Ulysses’’ only nouns and verbs were tagged. Dashed lines denote schematically power-law functions that are characteristic for the complete texts without division into word classes.

structures comprising these words. An important role is also played by the fact that words belong to particular parts of speech, which determine, or at least influence, their function in a phrase or sentence. A simple Zipfian-like analysis that can allow for the grammatical inhomogeneity of language is the analysis of the rank-frequency distribution of words belonging to the same part of speech. Obviously, this approach makes sense only in the case of the open word classes, in which the number of elements is statistically sufficient to detect any power-law behaviour. Here we study two texts in their original language versions, ‘‘Ulysses’’ and ‘‘Lalka’’, in which all the words were tagged with their relevant parts of speech. Due to the more difficult tagging of English words, which are much more flexible than Polish ones as regards the parts of speech they can belong to, in ‘‘Ulysses’’ only two classes were distinguished, nouns and verbs, while all the other classes were considered together as if they were a single class. As opposed to this, in ‘‘Lalka’’ all the basic classes were considered separately. Rank-frequency plots for different word classes in both texts are shown in Fig. 64. In striking contrast to the full set of words, words grouped into classes do not reveal clear power-law dependences in either of the texts. Among the classes, only verbs have distributions which can resemble power-law ones in a predominant range of R. This is especially evident for ‘‘Ulysses’’, but also, to a lesser degree, for ‘‘Lalka’’ it is the plot for verbs which is the closest to a power law. For other classes such an effect was not found. In the case of ‘‘Ulysses’’, this was also ruled out for adjectives and adverbs, for which roughly approximate rank-frequency distributions were obtained by selecting the relevant words directly from the complete word ranking (thus losing some information related to words which can belong to both these classes) instead of tagging the text. (Because of the approximate character of this result, we do not show the corresponding plots.) The fact that, for either of the books analysed here, the individual word classes do not form the power-law distribution which is actually formed by the complete set of words must be a consequence of grammar. The decrease of the slopes of the distributions with decreasing R hints at a weak representation of the corresponding open classes among the low-ranked words in the complete (mixed) ranking. Indeed, the most frequently used words are those from the closed classes (like prepositions, pronouns and conjunctions, as well as articles in English) and they chiefly occupy the low ranks. On the other hand, the words from these closed classes can hardly be found for R > 200, where the distribution of words


belonging to open classes is more or less uniform. This behaviour can, at least in part, account for the low-rank deviations from uniform scaling and the better agreement with such scaling for the moderate and high ranks. Therefore one may say that the nonuniform structure (as regards the representation of different parts of speech) of the mixed F(R) is a consequence of language organization, in which the most frequent words do not have references to objects (such references are in the domain of open-class words) but merely play a purely grammatical role. In this context the most interesting class is verbs, whose rank-frequency distributions are closest to a power law in both texts. This might be interpreted in two ways. First, verbs somehow play a dual part in a message. The majority of them are associated with actions, but there also exists a narrow group of verbs which can lose their ties with real actions and fulfil only grammatical functions. In English, such verbs are the auxiliary verbs like be, have, will, shall, etc., which typically modify the meaning of the ordinary verbs they refer to and provide additional information about the grammatical categories of tense, voice, aspect, and so on. This duality of certain verbs implies that they are present among the most frequent words, thus preventing the flattening of the corresponding F(R) distribution for small ranks. Second, it cannot be precluded that the frequencies of using verbs – the words that directly refer to actions – also reflect the optimization of cost. However, unlike the Zipf principle of least effort, which operates at the level of information transfer, this optimization may concern the actions themselves. From this point of view, the frequency of performing different actions by a human may be described by a power-law distribution, and this hypothetical property may be inherited by verbs. This is of course a highly speculative proposition, but the idea that the actions performed by humans can be optimized with respect to effort/cost seems intuitively justified (one can find here a distant echo of the physical principle of least action leading to the Lagrange equations). Independently, it also seems worth noticing that the plot for nouns closely resembles the rank-size dependence for family names in the majority of countries [277–279]. This may point to the existence of a unified mechanism that governs the generation of all the nouns, including such a distinguished subset of them as family names. The second manifestation of grammar at the lexical level, after the existence of parts of speech, are the inflected forms of words, which significantly raise the number of different words that are symbolic references to the same object. The resulting lack of unambiguous mapping between objects and their lexical representation can be limited by replacing all the derivative forms by the corresponding lemmas (basic forms of words), i.e., infinitives in the case of verbs and singular nominatives in the case of nouns, adjectives, pronouns, etc. As before, the procedure of lemmatization was applied to ‘‘Ulysses’’ and ‘‘Lalka’’. Next, based on the so-transformed texts, the rank-frequency distributions of lemmas were calculated. A comparison of these distributions and their counterparts for the inflected words is exhibited in Fig. 65.
As one might expect, in the Polish text the distributions for lemmas (L) and words (W) differ from each other over the whole range of ranks, with the most striking difference observed for high ranks. In the interval 10 < R < 1500 the L-distribution scales according to the power exponent β₁^(L) = 1.03 > β, while in the interval R > 2000 another scaling regime can be identified, with β₂^(L) = 1.52. In ‘‘Ulysses’’, the distribution also changes its behaviour for large R, but in the range 7 ≤ R < 1000 it resembles the corresponding distribution for words: β₁^(L) = β. The second range of scaling, for R > 1000, has a smaller exponent than its counterpart for ‘‘Lalka’’, equalling β₂^(L) = 1.17. The emergence of the second, high-rank scaling regime for lemmas is not surprising, however. The number of rarely used lemmas is smaller than the number of rarely used words, because a considerable fraction of the latter are just inflected forms. The much-better-developed Polish inflection can also explain the larger value of β₂^(L) for ‘‘Lalka’’. The Zipfian-like scaling with β ≈ 1 is observed over a broader range of ranks for the words than for the lemmas. Since it can safely be assumed that the F(R) relation for the lemmas is linguistically more primitive (the basic forms of words must have occurred earlier than their inflected forms), it is possible that in the course of evolution the language was modifying its structure so as to gradually approach the optimum of a uniform power law. Regardless of this, another interesting issue related to the power-law character of F(R) for the lemmas is the question whether scaling is a property of the lemmas only, or perhaps also of the objects these lemmas refer to. Addressing this question is impossible on the grounds of a mere statistical analysis of the lemmas, because their mapping to objects is not one-to-one. It requires a contextual study of the texts involving identification of all the lemmas' meanings, which would be an extensive, time-consuming task. It is a task worth carrying out in the future, however, since it would allow for transferring the considerations from the linguistic level to the neurobiological level, that is, to where language is actually formed. A fine but important suggestion that the objects can also be subject to power-law relations is the fact that in the Chinese language the Zipf law is not valid for individual ideograms, which are the counterparts of morphemes, but it is valid for their compounds (the n-grams). Such compounds represent a higher level of language organization than single words in the European languages and they are the actual carriers of information in Chinese [280]. If natural language is viewed from the perspective of its functionality, then it is easy to point out some organizational elements that seem to be in the spirit of the concept of HOT. Complexity of highly optimized systems aims at allowing them to survive and to function without disturbances despite random failures of elements, software errors, or other unwanted and/or unexpected conditions. Similar objectives may have lain behind the complexity of language, which was increasing in parallel with the need for reliable exchange of more and more complex information. In the case of language, ‘‘a failure’’ means an inability to encode or decode a given piece of information in a given way. Such a failure is the more severe, the less information can be included in a message as compared to the broadness and importance of the subject. Tolerance in this context means the ability of language to express as many different messages on different topics as possible, even in the case of some simultaneous failures (e.g., forgotten words or specific grammatical rules). An example of optimization is the principle of


Fig. 65. Rank-frequency distributions created from original words (all the inflected forms are treated as separate words, W) or only from their lemmas (basic forms, L) for ‘‘Ulysses’’ by J. Joyce (top) and ‘‘Lalka’’ by B. Prus (bottom). Dashed lines denote schematically the slopes with the exponents β₁^(L) and β₂^(L) corresponding to power-law functions fitted to the distributions for lemmas.

least effort, mathematically formalized by B. Mandelbrot, which leads to the linguistic Zipf law. On the other hand, this is not the only possible way of optimizing language, and this is why looking at natural language from the HOT perspective can be beneficial in the future.

7. Fractality and multifractality

As forms of non-trivial geometry, fractals are often intuitively considered complex objects. In parallel, fractal dimension is sometimes viewed as a measure of complexity. The very existence of fractal structures in natural systems is usually considered a manifestation of the complex, nonlinear character of the underlying processes. Among the generators of such fractal structures are critical phenomena, self-organized criticality, multiplicative cascade processes, and mutual couplings of two or more different processes, sometimes acting in opposition, as in the phenomenon of diffusion-limited aggregation or in landscape formation, in which diffusive erosion of the rocks is stimulated by certain nonlinear processes of atmospheric and geological character [71]. Moreover, the Lévy processes with asymptotically power-law slopes discussed in Section 6.1 (Eq. (79)) and other scale-invariant processes are naturally associated with hierarchical structure and fractality. From a geometrical point of view, if in a random walk process the probability of a long jump is larger than in the classical Brownian motion with a Gaussian distribution of fluctuations, then the trajectory of a ‘‘particle’’ consists of long periods in which it remains localized in a region of small radius, and of sudden ‘‘flights’’ to distant parts of the allowed phase space. Since the length of the jumps is distributed according to a power law, by magnifying one of the small regions in which the particle stays for a long time, we obtain the same picture as before — in full analogy to rescaling some part of a fractal. It should be noted that the classical Brownian motion can also be viewed as a fractal process, but without the clear hierarchical structure characteristic for the Lévy flights. If instead of geometrical objects one considers signals, it is necessary to generalize the notion of mathematical fractals to statistical (approximate) fractals. In the latter case the perfect invariance of the signal shape under affine transformation is


not required in favour of the invariance of its statistical properties. It is expected that a short excerpt of the signal, after proper rescaling, has the same statistical properties as the whole signal or its longer parts. A special type of fractal objects and fractal signals are multifractals, i.e., fractals with a non-homogeneous distribution of measure. In multifractals, scaling properties are defined only locally [281,282]. A description of such non-homogeneous objects by means of a single fractal dimension is far from complete, and it is necessary to describe them by using an infinite family of fractal dimensions, each of which describes only one fractal subset comprising one specific type of singularities. Intuitively, multifractals are the most complex objects among fractals.

7.1. Fractals, multifractals and criticality

The basic quantitative characteristic of fractal objects is the capacity dimension. Let Φ be a set of points in R^d that supports a measure of density µ, and let L be a d-dimensional grid with cells of side length l which covers Φ. The capacity dimension α is then defined by:
$$\alpha = \limsup_{l \to 0} \frac{\ln n(l)}{-\ln l}, \qquad (101)$$

where n(l) is the number of cells in L with non-zero measure mass. The dimension α is sufficient to describe a fractal object with uniform density µ (a so-called monofractal). If the measure µ(x) is distributed inhomogeneously on Φ (a so-called multifractal), the capacity dimension alone is no longer sufficient to completely describe such an object. In this case, a family of generalized fractal dimensions (the Rényi dimensions) D_q′ has to be considered [283]. Let µ_l(k) be the mass of the kth cell of L and let the partition function Z_q′(l) be defined as a sum of the q′th order moments:
$$Z_{q'}(l) = \sum_{k=1}^{n(l)} [\mu_l(k)]^{q'}. \qquad (102)$$

Then the following relation is valid for fractal sets:
$$Z_{q'}(l) \sim l^{\tau(q')}, \qquad \tau(q') = (q' - 1) D_{q'}. \qquad (103)$$

For a multifractal set Φ, D_q′ is a monotonically decreasing function of q′, while for a monofractal D_q′ = α, where α ≡ D_0 is the capacity dimension of Φ. The number N_α(l) of cells in which the sets of points have dimension α is given by:
$$N_\alpha(l) \sim l^{-f(\alpha)}, \qquad (104)$$

where f(α) and D_q′ are linked together by the Legendre transformation:
$$f(\alpha) = q' \alpha - (q' - 1) D_{q'}. \qquad (105)$$

By taking the derivative of Eq. (105), we get the function α(q′):
$$\alpha(q') = \frac{d}{dq'} \left[ (q' - 1) D_{q'} \right]. \qquad (106)$$
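Equations (101)–(106) admit a direct numerical transcription via box counting. The sketch below (the grid of box sizes and the uniform test set are our illustrative assumptions) estimates D_q′ from the scaling of the partition function of Eq. (102); q′ = 1 is excluded since Eq. (103) degenerates there.

```python
import numpy as np

def renyi_dimensions(points, qs, ls):
    """Generalized (Renyi) dimensions D_q' of a 2-D point set, estimated from
    the partition function Z_q'(l) of Eq. (102) and the scaling of Eq. (103)."""
    pts = np.asarray(points, dtype=float)
    pts = (pts - pts.min(axis=0)) / np.ptp(pts, axis=0)   # map onto the unit square
    D = {}
    for q in qs:
        logZ = []
        for l in ls:
            nbox = int(np.ceil(1.0 / l))
            ij = np.minimum((pts / l).astype(int), nbox - 1)
            counts = np.bincount(ij[:, 0] * nbox + ij[:, 1])
            mu = counts[counts > 0] / len(pts)            # box masses mu_l(k)
            logZ.append(np.log(np.sum(mu ** q)))          # ln Z_q'(l), Eq. (102)
        tau = np.polyfit(np.log(ls), logZ, 1)[0]          # Z_q'(l) ~ l^tau(q')
        D[q] = tau / (q - 1.0)                            # D_q' = tau(q')/(q'-1)
    return D

# a uniformly filled unit square should give D_q' close to 2 for every q' != 1
rng = np.random.default_rng(0)
pts = rng.random((200_000, 2))
ls = np.array([1/8, 1/16, 1/32, 1/64, 1/128])
print(renyi_dimensions(pts, qs=[-2.0, 0.0, 2.0], ls=ls))
```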

Thus, if we know D_q′, we are able to calculate f(α). For monofractal sets f(α) = D_0 = α and f(α) consists of a single point. In contrast, for multifractal sets f(α) is a continuous function with an inverted parabolic shape, assuming its maximum at α(q′)|_{q′=0}. The function f(α) is called the singularity spectrum. In the case of fractal signals, the method of calculating D_q′ based on the partition function (102) cannot be applied due to the nonstationarity of such signals. One possible solution to this problem, offered by detrended fluctuation analysis (DFA), consists in suppressing the nonstationarity of a studied signal by applying a detrending procedure [284]. Let x(t_i) be a time series of length T and Y(t_j) be its profile defined by:
$$Y(t_j) = \sum_{i=1}^{j} \left( x(t_i) - \bar{x} \right), \qquad \bar{x} = \frac{1}{T} \sum_{i=1}^{T} x(t_i). \qquad (107)$$

If s denotes a segment's length (scale), then the whole series can be divided into M_s = [T/s] disjoint segments. By starting this division independently from the beginning and from the end of the series, one obtains 2M_s segments in total. Local detrending of a segment ν is equivalent to fitting a polynomial ξ_ν^{(k)}(t_j) of a given degree k to the profile Y(t_j) and then subtracting this polynomial from Y(t_j). For a given segment ν (ν = 1, . . . , M_s), the variance of the residual series equals:
$$F^2(\nu, s) = \frac{1}{s} \sum_{i=1}^{s} \left[ Y((\nu - 1)s + i) - \xi_\nu^{(k)}(t_i) \right]^2, \qquad (108)$$


and analogously for ν = M_s + 1, . . . , 2M_s. Having calculated the variance (108), one can obtain the fluctuation function:
$$F_{q'}(s) = \left\{ \frac{1}{2M_s} \sum_{\nu=1}^{2M_s} \left[ F^2(\nu, s) \right]^{q'/2} \right\}^{1/q'}, \qquad (109)$$

which is a counterpart of the partition function in Eq. (102). In the standard DFA procedure q′ = 2, while in its multifractal version (MFDFA) q′ varies and can in principle assume any real value [285]. If the time series has a fractal structure, the fluctuation function displays a power-law behaviour:
$$F_{q'}(s) \sim s^{h(q')}. \qquad (110)$$

The exponent h(q′) is then a counterpart of the exponent τ(q′) in Eq. (103), to which it is related by [285]:

τ(q′) = q′h(q′) − 1.    (111)

For q′ = 2, the relation h(2) ≡ H holds, where H is the Hurst exponent, which serves as a tool for identifying autocorrelations with power-law decay. If the analysed signal is monofractal, the equality h(q′) = H holds for each q′. Knowing h(q′), it is possible to derive the singularity spectrum f(α) according to the following equations:

α(q′) = h(q′) + q′ dh(q′)/dq′,    f(α) = q′[α(q′) − h(q′)] + 1.    (112)

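Eqs. (107)–(110) can be condensed into a short numerical routine. The sketch below is a minimal, deliberately simplified Python implementation (not the code of Ref. [285]; uniform time flow is assumed and q′ = 0 is excluded, since Eq. (109) is undefined there); the singularity spectrum then follows from h(q′) via Eq. (112) by numerical differentiation.

```python
import numpy as np

def mfdfa(x, scales, qs, k=2):
    """Minimal MFDFA: profile (107), detrended segment variances (108),
    fluctuation function (109); returns h(q') from the fit F_q'(s) ~ s^h(q')."""
    Y = np.cumsum(x - x.mean())                    # profile, Eq. (107)
    logF = np.empty((len(qs), len(scales)))
    for j, s in enumerate(scales):
        M = len(Y) // s
        # 2*M segments: M counted from the beginning, M from the end
        segs = np.concatenate([Y[:M * s].reshape(M, s),
                               Y[-M * s:].reshape(M, s)])
        t = np.arange(s)
        F2 = np.empty(2 * M)
        for nu, seg in enumerate(segs):            # Eq. (108)
            trend = np.polyval(np.polyfit(t, seg, k), t)
            F2[nu] = np.mean((seg - trend) ** 2)
        for i, q in enumerate(qs):                 # Eq. (109)
            logF[i, j] = np.log(np.mean(F2 ** (q / 2.0)) ** (1.0 / q))
    # h(q') is the slope of ln F_q'(s) versus ln s, Eq. (110)
    return np.array([np.polyfit(np.log(scales), row, 1)[0] for row in logF])

# uncorrelated Gaussian noise is monofractal: h(q') ~ 0.5 for every q'
x = np.random.randn(2 ** 16)
scales = np.unique(np.logspace(1.2, 3.5, 20).astype(int))
qs = np.array([-3.0, -1.0, 1.0, 2.0, 3.0])
print(np.round(mfdfa(x, scales, qs), 2))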
Here α can be identified with the Hölder index, which is a local measure of function irregularity. In this context, a value of f(α) may be interpreted as the fractal dimension of the set of singularities with the Hölder index α. Since MFDFA implicitly assumes continuity of time, the support of a discrete time series in this procedure has dimension equal to 1. In effect, the maximum value of f(α) (for q′ = 0) is also equal to 1. In practice, the MFDFA method is applied in such a way that, for a given value of q′, the function Fq′(s) is calculated within a broad range of scales s and then a region of linear dependence of ln Fq′ on ln s is sought. From Eq. (110) it follows that the slope coefficient is identical to the generalized Hurst exponent h(q′). By repeating this procedure for a number of different q′ values, a family of exponents h(q′) is derived, from which in turn the quantities α and f(α) can be independently obtained. Due to the finite length of time series, signals formed by monofractal processes do not give an ideal pointlike f(α), but rather a relatively narrow parabola. A proper interpretation of an empirical singularity spectrum thus requires some care.

Besides MFDFA, there are other tools that can be applied in the multifractal analysis of time series. Another commonly used method is the wavelet transform modulus maxima (WTMM), which is based on the detection of power-law scaling of the coefficients of the wavelet expansion of a signal [286,287]. Intuitively, the WTMM approach to the analysis of empirical data coming from complex systems seems to be a good idea, since the wavelet transform was proposed in order to deal with strongly nonstationary data for which the Fourier transform fails to work [288]. Moreover, the wavelet transform is especially useful in the context of data with fractal properties because, by construction, it is able to detect self-similar and hierarchical structures. The transform allows one to decompose a time series x(i) in the time-scale plane by convolving it with a discrete wavelet function ψ:

Tψ(n, s) = (1/s) Σ_{i=1}^{N} ψ((i − n)/s) x(i),    (113)

where s denotes the scale of the wavelet, which is shifted by n data points along the time series x(i). In general, the so-called mother wavelet ψ (for n = 0 and s = 1) can be selected out of many available candidates, depending on the type of analysis and the properties of a signal. In the context of fractal analysis, the preferred mother wavelets should be well localized both in the space and in the frequency domain. This requirement is satisfied by the derivatives of a Gaussian, which can be considered as a family of wavelets:

ψ(m)(x) = (d^m/dx^m)(e^{−x²/2}).    (114)

These wavelets are often used for nonstationary data since they are capable of detrending the signals by removing polynomial trends up to the (m − 1)th order [285]. The wavelet spectrum Tψ(n, s) is typically represented by a colour-coded 2-D map on the (n, s)-plane. A hierarchy of singularities present in a signal can easily be revealed in this way as a hierarchical structure of the values of the coefficients Tψ(n, s) on the map. The WTMM method exploits the fact that a singularity of strength α present in the signal at a point n0 leads to a power-law behaviour of these coefficients:

Tψ(n0, s) ∼ s^{α(n0)}.    (115)

In general, this power-law relation can be unstable if the singularities are dense, so it is recommended to identify the local maxima of Tψ and to use their moduli:

Z(q′, s) = Σ_{l∈L(s)} |Tψ(nl(s), s)|^{q′},    (116)


where L(s) denotes the set of all maxima for scale s and nl(s) is the position of a particular maximum. Additionally, a supremum has to be taken in order to preserve the monotonicity of Z(q′, s) in s:

Z(q′, s) = Σ_{l∈L(s)} (sup_{s′≤s} |Tψ(nl(s′), s′)|)^{q′}.    (117)

If the studied signal is fractal, then we expect that the dependence

Z(q′, s) ∼ s^{τ(q′)}    (118)

holds. For a multifractal signal, τ(q′) is nonlinear, while it is linear for a monofractal one. The singularity spectrum f(α) can in this case be calculated by means of the following formulae [281]:

α = τ′(q′)  and  f(α) = q′α − τ(q′).    (119)

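As an illustration of Eqs. (113)–(118), the sketch below (a simplified, purely illustrative Python routine) computes the wavelet coefficients with a second-derivative-of-Gaussian wavelet (Eq. (114) with m = 2), sums the moduli of their local maxima as in Eq. (116), and extracts τ(q′) from the scaling (118). Tracking of the maxima lines across scales, required by the supremum in Eq. (117), is omitted for brevity.

```python
import numpy as np

def mexican_hat(t):
    # second derivative of a Gaussian (up to sign), Eq. (114) with m = 2
    return (t ** 2 - 1.0) * np.exp(-t ** 2 / 2.0)

def wtmm_tau(x, scales, qs):
    """Wavelet coefficients (113), modulus-maxima partition function (116)
    and the scaling exponents tau(q') of Eq. (118)."""
    Z = np.empty((len(qs), len(scales)))
    for j, s in enumerate(scales):
        tt = np.arange(-5 * s, 5 * s + 1) / s      # effective wavelet support
        psi = mexican_hat(tt)
        # the kernel is symmetric, so convolution equals correlation, Eq. (113)
        T = np.convolve(x, psi, mode='same') / s
        mod = np.abs(T)
        # positions of the local maxima of |T_psi(n, s)|
        idx = np.where((mod[1:-1] > mod[:-2]) & (mod[1:-1] > mod[2:]))[0] + 1
        for i, q in enumerate(qs):
            Z[i, j] = np.sum(mod[idx] ** q)        # Eq. (116)
    # tau(q') is the slope of ln Z(q', s) versus ln s, Eq. (118)
    return np.array([np.polyfit(np.log(scales), np.log(row), 1)[0]
                     for row in Z])

x = np.cumsum(np.random.randn(2 ** 14))            # Brownian-like test signal
scales = np.logspace(1.0, 2.5, 15)
qs = np.array([-2.0, 0.0, 2.0])
print(np.round(wtmm_tau(x, scales, qs), 2))
```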
Like MFDFA, the WTMM analysis is broadly used in studying the fractal and correlation properties of empirical data; both methods thus compete with each other. A systematic comparison of WTMM and MFDFA shows, however, that the latter method is more universal as regards the reliability of the results and their stability under different choices of intrinsic parameters. This is true for different signal types and lengths [289]. In Fig. 66 we show a comparison of results obtained by both methods for a few exemplary types of random, computer-generated signals for which the exact f(α) spectra are known: an antipersistent (H = 0.3) and a persistent (H = 0.75) fractional Gaussian noise (top and middle rows of part (a), respectively), Lévy flights with αL = 1.5 (bottom row of (a)), and a log-normal binomial cascade (part (b)). For more examples see [289]. The fractional Gaussian noise has a monofractal spectrum with α = H and f(α)|_{α=H} = 1, which is well reproduced by MFDFA but not by WTMM, which in both cases gives an indication of multifractality. A similar false signal of multifractality is given by WTMM for the Lévy process, which actually is only bifractal: its singularity spectrum consists of two points, (0; 0) and (1/αL; 0), where αL is the stability parameter of the Lévy distribution (black squares in Fig. 66) [290,291]. MFDFA again gives a more reliable result here (an almost nonexistent right wing of the spectrum). Nevertheless, for some types of data, WTMM can produce results which are as close to reality as those from MFDFA (Fig. 66(b)), especially for long signals. This suggests that WTMM may also be used to quantify the multifractal properties of empirical data, provided one is aware of its limitations. All the results presented in this section, however, were obtained by using MFDFA.

The shape of the f(α) spectrum is often considered an indicator of the structural complexity of fractal signals or geometric fractals. The variability range of α within a fixed range of q′, defined by:

Δα = αmax − αmin = α(q′min) − α(q′max),    (120)

serves as a related quantitative measure. The justification for its use stems from the fact that the larger the value of Δα, the stronger the diversity of the component fractals observed in the analysed multifractal.

7.2. Multifractal character of financial data

Since the introduction of the notion of multifractality [281,283], this property has been discovered in empirical data obtained from a number of different systems. Among the most characteristic examples of such quantities are the harmonic measure in the processes of diffusion-limited aggregation [292], the distributions of the potential in liquid flow through porous media [293], the distributions of the velocity field in turbulence [294,295], the distribution of galaxies and their clusters [296], the processes governing climate change [297], human heartbeat [298], and many more. Multifractality was also discovered in data from financial markets and it seems to be their relatively common property. Multiscaling was observed in the returns of currency rates [299,300], in the related triangular residual returns [257], in stock prices and market indices [301,302], as well as in commodity prices [303]. The same multifractal character was identified in the intervals between consecutive transactions [304–306], in volatility [307], and in traded volume [308].

The origin of multifractality in financial data has not been fully explained yet. According to the present, partial knowledge on this subject, financial multifractality has its source at the level of microscopic market activity: the volume and liquidity fluctuations [308]. On a more abstract level, multifractality is associated with multiplicative cascade processes, whose simplest form, in the case of returns at time scale Δt, may be given by [309]:

rΔt(t) = σΔt(t)ε(t),    σuΔt(t) = Wu σΔt(t),    (121)

where Wu depends only on the scaling factor u < 1, σΔt(t) is a time-dependent volatility, and ε(t) is a Gaussian-type noise. Processes of this type are able to reproduce many empirical properties of financial data. An example of a model exploiting the multiplicative cascades is the multifractal model of asset returns (MMAR) [310]. It assumes that uniformly flowing clock time poorly reflects the effective market time, paced by, e.g., the highly variable transaction activity, and, therefore, the clock time has to be replaced by a deformed multifractal time θ(t), which better expresses what takes place in the market. In MMAR, the real-time evolution of the logarithmic price is a process analogous to the fractional Brownian motion BH with the Hurst index H but evolving in the multifractal time: ln p(t) = BH[θ(t)]. The processes based on this model, with some later extensions [311,312], together with the multifractal random walks [313], are among the most promising tools for modelling real market dynamics, allowing one to reproduce the most important stylized facts of empirical data.
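To make the cascade picture behind Eq. (121) concrete, the following sketch (a toy construction with log-normal weights, purely illustrative and not the exact MMAR specification) builds a dyadic multiplicative cascade of volatilities and uses it to modulate Gaussian noise; this already produces volatility clustering and heavy tails.

```python
import numpy as np

def lognormal_cascade(n_levels, sigma_lnW=0.35, rng=None):
    """Dyadic multiplicative cascade: at each level every interval splits in
    two and inherits the parent volatility multiplied by an independent
    log-normal weight W with E[W] = 1, in the spirit of Eq. (121)."""
    rng = np.random.default_rng() if rng is None else rng
    vol = np.ones(1)
    for _ in range(n_levels):
        W = rng.lognormal(mean=-0.5 * sigma_lnW ** 2, sigma=sigma_lnW,
                          size=2 * len(vol))
        vol = np.repeat(vol, 2) * W            # two descendants per cell
    return vol

rng = np.random.default_rng(0)
sigma = lognormal_cascade(14, rng=rng)         # 2**14 volatility values
returns = sigma * rng.standard_normal(len(sigma))   # r(t) = sigma(t) eps(t)
print(round(float(returns.std()), 3), round(float(np.abs(returns).max()), 3))
```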



Fig. 66. Comparison of singularity spectra f (α) obtained for computer-generated, random signals by two methods: MFDFA (left) and WTMM (right). The processes studied were the monofractal fractional Gaussian noise with the Hurst exponent H = 0.3 (top row of (a)) and H = 0.75 (middle row of (a)), the bifractal Lévy flights with αL = 1.5 (bottom row of (a)), and the stochastic log-normal binomial cascade with ln x0 = 1.1 (Eq. (81), part (b)). The theoretical spectra for the Lévy flights are denoted by two black squares, while the theoretical spectra for the log-normal binomial cascade are given by the solid lines (see, e.g., [289] for the respective formula). In each panel, the error bars denote the standard deviation of data points calculated from 10 independent realizations of a corresponding process.

7.2.1. Multiscaling in foreign currency and stock markets

Figs. 67–70 present the outcomes of MFDFA-based analyses of different data types taken from the stock markets and the foreign exchange market. The quantities which perform best in reflecting the fractal properties of data are the fluctuation functions Fq′(s), calculated for different values of the Rényi parameter q′, and the singularity spectra f(α). By looking at the fluctuation function behaviour while changing the scale s, one can infer whether the dependence of Fq′(s) on scale has the power-law form (110) and, if so, in which range of s it is observed. Then the shape of f(α) allows one to find out whether the data under study are mono- or multifractal.

Return fluctuations of the main currency exchange rates have a clearly multifractal character (Fig. 67). Out of the shown currency pairs, the most strongly multifractal one (in the sense of the width of f(α)) is GBP/USD (Δα = 0.17), while the weakest multifractality is seen for USD/EUR (Δα = 0.06). The latter result might be slightly surprising, because USD/EUR is the most actively traded of all pairs and, as such, could be expected to have the most complex fractal properties. On the other hand, however, the popularity of trading the USD/EUR cross leads to a strong tendency to average out all the individual components of the dynamics, which might otherwise form a more complicated structure of the data.

In contrast to forex data, time series of stock price returns display a larger diversity of fractal properties, which are attributes of individual companies. As Fig. 68 shows, a few types of Fq′(s) behaviour can be distinguished. Fractal character is best visible for those companies for which Fq′(s) is a power law over a broad range of scales. Two examples of this type, obtained for Bank One (ONE) and First Union Bank (FTU), are seen in the upper panels. Singularity spectra calculated for 10 companies with a similar scaling type are presented in Fig. 69. The fluctuation functions for the second group of companies do not show such a uniform scaling dependence, but piecewise scaling with different slopes is nevertheless present. This group is represented by Dell (DELL) and Merrill Lynch (MER) in the middle row of Fig. 68. There is also a group of companies for which Fq′(s) does not exhibit any obvious power-law behaviour; among the elements of this group are Automatic Data Processing (AUD) and Exxon Mobil (XON), with the corresponding graphs in the lower part of Fig. 68. In the case of AUD, one can in principle argue that there is approximate monofractality for s > 1000 min. All three groups contain a considerable number of companies and no group is superior to the others, which makes it impossible to present a typical result for the whole market. Among the factors which influence the fractal properties of the data are capitalization, stock trading frequency [305], and nonstationarity [314]. The differences in the Fq′(s) behaviour in distinct intervals of s are related to general properties of the market, which are distinct on short time scales (where the effects linked with the internal market dynamics and its components dominate) and on long time scales (where external factors can be influential) [315].


Fig. 67. Top: Fluctuation functions Fq′(s) calculated for time series of returns of the CHF/JPY and USD/EUR pairs (−3 ≤ q′ ≤ 3). Vertical lines denote the range limits for the scales s for which the singularity spectra f(α) were calculated. Bottom: f(α) for the following pairs: CHF/JPY, GBP/CHF, GBP/USD, USD/EUR (data from the years 2004–2008, Δt = 1 min, T = 1.2 · 10^6).

Fig. 68. Fluctuation functions Fq′(s) of stock returns (Δt = 5 min) calculated for selected American companies with large capitalization (data from 1998–1999, −3 ≤ q′ ≤ 3). Top: companies for which Fq′(s) shows a single interval of power-law dependence (I) extending over 2 decades of s. Middle: companies for which Fq′(s) shows two shorter intervals of scaling (I, II). Bottom: companies for which Fq′(s) does not show any clear power-law dependence. Vertical lines denote the boundaries of the identified scaling regions.

Now we look at another type of financial data, namely the returns of market indices. We study two indices, S&P500 and DAX, representing two large, mature stock markets. The time series of returns with Δt = 1 min have lengths of T = 201,630 (S&P500) and T = 269,280 (DAX) data points and cover the years 2004–2006.


Fig. 69. Singularity spectra f (α) calculated for the stock returns of 10 exemplary American companies with a single, long interval of power-law dependence in Fq′ (s). The Rényi parameter is restricted to −3 ≤ q′ ≤ 3.

Fig. 70. Top: Fluctuation functions Fq′(s) for the returns of the DAX (left) and S&P500 (right) indices (years 2004–2006, −3 ≤ q′ ≤ 3). Vertical lines denote the boundaries of the identified scaling regions. Bottom: The corresponding singularity spectra f(α) calculated in the intervals: I (dashed lines) and II (solid lines).

The results are collected in Fig. 70. For both indices there are two separate intervals of s in which an approximate power-law dependence of Fq′(s) can be pointed out. Interval I is associated with a richer spectrum of h(q′) and, consequently, with a broader spectrum f(α). This spectrum has a comparable width for both indices: Δα ≈ 0.35 (S&P500) and Δα ≈ 0.30 (DAX). The second interval (II) is located in the medium range of scales, up to s = 10,000 min, and is associated with a poorer set of singularities: Δα ≈ 0.15 (S&P500) and Δα ≈ 0.13 (DAX), again comparable in both cases. All the spectra f(α) have their maxima shifted towards α > 0.5, while the Hurst exponents are close to the value characteristic for the classical Gaussian noise: H = 0.50 ± 0.01. This outcome complies with the general observation that mature markets are characterized by H near 0.5, in contrast to emerging markets, whose indices show persistence [301].

Multifractal properties of the returns, in parallel with many other statistical measures, vary in time, thus reflecting the nonstationarity of market evolution. An example of this nonstationarity is the dependence of Fq′(s) on the market phase. Fig. 71 presents the results of MFDFA obtained for the DAX returns in two temporally equivalent periods of a market boom and the subsequent fall in 1998. In these two cases the fluctuation functions behave alike for both the small and the large values of s, while a difference is seen for the middle values (i.e., 150 min < s < 1200 min during the boom and 80 min < s < 1300 min during the fall), where the slope of Fq′(s) is slightly smaller in the up phase.


Fig. 71. Fluctuation functions Fq′ (s) (top) and singularity spectra f (α) (bottom) calculated for the DAX returns in growth and fall phases (−3 ≤ q′ ≤ 3). Vertical lines denote the boundaries of the scale intervals in which Fq′ (s) shows distinct behaviour during each phase. The insets in top panels show the time course of DAX between February and October, 1998 (light lines) and the selected intervals of market growths or falls (dark lines).

This is supported by the values of the Hurst exponent: H = 0.43 ± 0.01 (the boom) and H = 0.47 ± 0.01 (the fall). This outcome is consistent with another analysis showing that the likelihood of persistent behaviour during the down phase exceeds that during the up phase [316]. On account of the fact that the growth and fall phases, understood as local trends, occur on all temporal scales, from intraday to multiyear ones, one may expect that the properties of data also differ between the growths and the falls on all temporal scales. From this perspective, the returns reveal rather complicated dynamics consisting of interwoven components with different fractal properties, which expresses the very essence of multifractality.

7.2.2. Microscopic price increments and transaction waiting times

The examples discussed so far were based on time series of returns rΔt(ti) obtained from a price p(t) which is a function of a continuous or discrete, periodically sampled argument. In the real world, however, the problem is more involved, because the price is known only at the transaction moments and remains otherwise undefined (Section 2.3). Transactions take place at essentially random moments, which implies that market time is effectively a discrete random variable. Thus, the evolution of a given stock price can be presented as a superposition of two stochastic processes: the one governing the price and the one governing the market time. Interestingly, both processes were shown to be multifractal [304,305]. This is illustrated in Fig. 72, which presents exemplary results of a multifractal analysis of the logarithmic price increments defined by:

Δln pν(i) = ln pν(ti) − ln pν(ti−1)    (122)

and of the inter-transaction intervals Δti = ti − ti−1, where i denotes consecutive transactions in a stock ν. In contrast to the returns defined by Eq. (45), where the price is sampled with a fixed time interval Δt, the temporal increments Δti are now variable. The upper and middle rows in Fig. 72 exhibit the fluctuation functions Fq′(s) obtained for the time series of {Δln pν} and {Δti}, respectively, for two German companies: Deutsche Telecom (DTE) and VIAG (VIA). The power-law scaling in both cases is admittedly not ideal, but the plots nevertheless show a dependence of h on q′. The singularity spectra derived from h(q′) are multifractal for both types of signals, although the location and width of the corresponding parabolas are different. The spectra for the prices are placed near α = 0.5, while the Hurst exponents (H ≈ 0.48 for DTE and H ≈ 0.46 for VIA) indicate a lack of long-range correlations and a mere trace of antipersistence. The spectra for the inter-transaction intervals have a considerably different character: their width Δα is larger than in the previous case and they are shifted towards larger values of α. These differences between the prices and the waiting times are related to the different nature of the underlying processes. Price fluctuations are a signed process, while the time flow is always positive.



Fig. 72. Fluctuation functions Fq′ (s) for time series of inter-transaction intervals (middle) and the corresponding logarithmic price increments (top) together with singularity spectra f (α) (bottom) calculated for the stocks of exemplary companies from the German market: Deutsche Telecom (DTE, T = 434,846) and VIAG (VIA, T = 142,634). Vertical lines denote the boundaries of the regions for which the spectra f (α) were calculated.

In addition, the time series {Δti} have strong temporal correlations, which is indicated by the Hurst exponents: H ≈ 0.83 for DTE and H ≈ 0.70 for VIA. According to the known relation

β = 2(1 − H),    (123)

linking H to the exponent of the power-law decay of the autocorrelation function, C(τ) ∼ τ^{−β}, for these and other companies (not shown here) the inter-transaction intervals are characterized by long memory, similar to the long memory in volatility discussed in Section 6.2. The time series {Δln pν} and {Δti} are linearly uncorrelated but reveal nonlinear statistical dependences (for instance, between {Δti} and {|Δln pν|}). This is why the multifractality of the time intervals can influence the multifractality of the price movements, although it is unlikely that this is the unique source of the latter. Thus, it is justified to say that, to some degree, the stock price dynamics has the form of a fractal function defined on a fractal domain. One inevitably faces here a serious problem: the absence of an appropriate method of analysis of such data. The existing methods of multifractal analysis, like WTMM, MFDFA and others, either silently assume that the time flow between the samples is uniform, or pay no attention to this question at all. In both situations, it is impossible to make a reliable quantitative description of real market dynamics.

7.2.3. Sources of multifractality

In order to address the question of which property of financial data is the actual source of its multifractality, a subject of some controversy in the literature [317–321], it is necessary to note that multifractality is a nonlinear phenomenon, which implies that all the linear correlations found in the data are irrelevant in this context. A possible cause of multifractality can be, for example, the nonlinear correlations in the returns (the long memory of volatility, the negative cross-correlations between volatility at one moment and the sign of later returns, the higher-order correlations, etc.), which are strong and ubiquitous in financial data. Processes with Gaussian fluctuation distributions (e.g., the increments of the fractional Brownian motion) have strictly monofractal f(α) spectra. However, if in some process the fluctuation distribution is leptokurtic, then its singularity spectrum might not be monofractal. As we have already mentioned in Section 7.1, this is true, for instance, for the Lévy flights with fluctuations described by stable distributions [290] and for the truncated Lévy flights (Eq. (80)), whose spectra are asymptotically bifractal [291], consisting of two points: (0; 0) and (1/αL; 0), where αL is the stability parameter of the Lévy distribution (Eq. (79)). It should also be noted that neither are the return distributions stable in the sense of the generalized central limit theorem, nor do the distributions of the temporally aggregated returns converge fast towards a normal distribution.


Fig. 73. Fluctuation functions Fq′(s) (top) and singularity spectra f(α) (bottom) for the DAX returns and their surrogates (data cover the years 1998–1999, Δt = 15 s, −3 ≤ q′ ≤ 3). The fluctuation functions for the surrogates were averaged over 5 independent realizations. Vertical lines denote the boundaries of the scale regions for which approximate scaling behaviour of Fq′(s) is observed.

The actual shape of these distributions lies somewhere in between these extreme cases and, therefore, fosters a rather slow convergence (Section 6.2). Since the definition of the fluctuation function Fq′(s) explicitly comprises temporal aggregation of the data (Eqs. (107)–(109)), Fq′(s) can also carry information about the character of this convergence and, in consequence, behave distinctly in different ranges of s [322].

An instructive example is provided by a time series of the DAX returns at the Δt = 15 s time scale, covering the years 1998–1999. The time series has length T = 1,068,960, allowing us to perform a statistically reliable analysis over a broad range of scales s ≤ 2 · 10^5. A family of the functions Fq′(s) calculated for these data is shown in Fig. 73 (top left panel). There are two separate scaling regions: the first region (I) is clearly persistent (H = 0.63) and corresponds to s < 60 min, while the second one (II), corresponding to s > 60 min, is completely void of memory effects (H = 0.50). In both regions f(α) shows multiscaling, seen in the bottom panels of the same figure, although in region I the variety of singularities (Δα ≈ 0.33) is much larger than in region II (Δα ≈ 0.14). The fluctuation functions Fq′(s) calculated for the surrogate signals, obtained from the original time series by reshuffling, are presented in the top right panel of Fig. 73. Now region I does not display any uniform scaling, and only with difficulty can a rather short range of scales 12 min < s < 120 min be distinguished in which Fq′(s) is multifractal (Δα ≈ 0.19). Compared with the original data, region II has completely lost its multifractal character and to a good approximation has become monofractal (Δα ≈ 0.015). As expected, over the whole range of scales the surrogate signals do not reveal any correlations (H = 0.50). Interpreting these results in a standard way, in region I the multifractality of the original data comes in large part from the temporal correlations, although a contribution from the heavy tails is also significant, while in region II the only source of multifractality is the correlations.

In order to better understand the origin of the weak multifractality in region I for the surrogates, it is recommended to consider model data with known statistical properties. The best choice for such data is a time series whose values correspond to independent random variables drawn from the q-Gaussian distribution Gq(x) (Eq. (90)). The assumption of the variables' independence is crucial in this context, since the q-Gaussians are not stable under convolution in this case. Then, by varying the value of the Tsallis parameter q, it is possible to obtain distributions with different speeds of tail convergence and random variables subject to different versions of the central limit theorem (CLT). For q < 5/3 the attractor for Gq(x) is the classic Gaussian distribution (which is a specific case of the q-Gaussian distribution with q = 1), while for q > 5/3 the attractors are the adequate Lévy-stable distributions, where the ‘‘adequateness’’ is defined by the relation:

αL = (3 − q)/(q − 1).    (124)

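For numerical experiments of the kind described below, independent q-Gaussian deviates can be drawn with the generalized Box–Muller transformation of Thistleton et al.; the sketch below is an illustrative rendition of that method (the auxiliary index q̂ = (1 + q)/(3 − q) and the q-logarithm are its standard ingredients; valid for q < 3). Series generated this way can be fed directly into the MFDFA sketch of Section 7.1.

```python
import numpy as np

def q_log(x, q):
    # Tsallis q-logarithm: ln_q(x) = (x**(1-q) - 1) / (1 - q); ln_1 = ln
    if abs(q - 1.0) < 1e-12:
        return np.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def q_gaussian(q, size, rng=None):
    """Generalized Box-Muller sampling of q-Gaussian deviates (q < 3)."""
    rng = np.random.default_rng() if rng is None else rng
    q_hat = (1.0 + q) / (3.0 - q)          # auxiliary index of the method
    u1, u2 = rng.random(size), rng.random(size)
    return np.sqrt(-2.0 * q_log(u1, q_hat)) * np.cos(2.0 * np.pi * u2)

rng = np.random.default_rng(1)
for q in (1.3, 1.5, 1.7):
    x = q_gaussian(q, 10 ** 6, rng)
    # tails become heavier as q grows towards and beyond 5/3
    print(q, round(float(np.max(np.abs(x))), 1))
```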
Fig. 74 presents plots of the fluctuation functions Fq′(s) for exemplary time series of fixed length T = 10^6 and different values of q. As this value increases, the corresponding distributions have less and less steep tails. For q = 1.5, over virtually the whole range of the shown scales s, the plot of Fq′(s) is monofractal; only for the shortest scales (s < 100) can one notice a trace of a richer diversity of the values of the exponents h(q′).


Fig. 74. Fluctuation functions Fq′(s), q′ ∈ [−4; 4], for artificial time series of length T = 10^6 with a q-Gaussian distribution characterized by different values of the Tsallis parameter q. Vertical lines indicate the change of the scaling regime from the asymptotically multifractal one (L-CLT) to the asymptotically monofractal one (CLT).

Fig. 75. Width Δα of the singularity spectra f(α) calculated for artificial time series with a q-Gaussian distribution characterized by different values of the Tsallis parameter q and with different lengths T. The dashed line for q > 5/3 shows the theoretical value Δα = 1/αL for the Lévy-stable distributions; for q < 5/3, the analogous value is Δα = 0. For each point, the error bars show the standard deviation of Δα derived from 5 independent realizations of the time series. The vertical line at q = 5/3 denotes the transition from the asymptotically Gaussian regime (CLT, left) to the asymptotically Lévy regime (L-CLT, right).

At the opposite pole is the plot for q = 1.7, in which the monofractal region is absent and multifractal scaling of h(q′) is observed over the whole range of s instead. This value of q already corresponds to the region in which the generalized version of the CLT is in force (L-CLT); here variables with the q-Gaussian distribution converge to a variable with a Lévy-stable distribution. By passing from q = 1.5 to q = 1.7, the vertical line drawn in each panel of Fig. 74, which separates the two different scaling regimes, the multifractal L-CLT regime (to the left of the line) and the monofractal CLT regime (to the right of the line), gradually moves from the left end of the scale axis to the right one and finally disappears. This effect is even more evident in Fig. 75, where the functional dependence of Δα on q is shown for time series of different lengths T. In the case of the longest series (T = 10^6), this plot illustrates basically the same result which has already been shown in Fig. 74, but in this form it is probably more convincing in depicting the impact of the weak CLT and L-CLT convergence on f(α). In the CLT domain, the theoretical singularity spectrum must have zero width (Δα = 0). For T = 10^6, this requirement is approximately fulfilled if q ≤ 1.5, but for the shortest signals considered here (T = 10^4), only if q ≤ 1.3. By advancing towards q = 5/3, the difference between the actual and the asymptotic value grows systematically,


and the shorter the signal, the larger the difference. For the time series of DAX returns whose f(α) spectrum was plotted in Fig. 73, the empirical data is best approximated by the q-Gaussian distribution with q ≈ 1.6. For this value, the model spectrum gives ⟨Δα⟩ = 0.13 ± 0.08, which remains in agreement with the width Δα ≈ 0.19 of the singularity spectrum calculated for the DAX return surrogates and exhibited in Fig. 73. For q = 1.6 and q = 1.65, the existence of the two scaling regimes of Fq′(s) seen in Fig. 73 makes fitting a single power-law function impossible; instead, there is an ambiguity in the choice of the region in which f(α) can be calculated. In Fig. 75, this ambiguity is expressed by two competing values of Δα defined in the range 1.55 < q < 1.65 (except for the shortest signals, for which the transition between both regimes is smooth and the exponents h(q′) were calculated only approximately). Within the L-CLT regime, the width Δα is also larger than the theoretically predicted one, but this difference tends to decrease as we depart from the point q = 5/3 to the right. A conclusion which can be drawn from these results is that in the neighbourhood of the transition point from the CLT regime to the L-CLT regime, any interpretation of the f(α) spectra must be done with much care, since there is no a priori criterion for selecting a ‘‘good’’ range of s in which Fq′(s) scales. Therefore, by considering surrogates of empirical data that can be approximated by a q-Gaussian distribution with q residing near the crossover value of 5/3, especially if the data is too short to reach the scales associated with the monofractal regime, an observer might draw the erroneous conclusion that the surrogates are multifractal. Depending on the approach, the interpretation of such results may differ. If stress is put on the asymptotic properties of the data, then the short-scale multifractality has to be considered apparent and the true multifractality can be attributed entirely to the correlations. On the other hand, if a distinction between the short-scale and the long-scale properties is acceptable, the multiscaling due to the heavy tails of the distributions can also be regarded as a fact. It is worth noting that this approach is less restrictive than the conclusions presented in some other works, where the multiscaling of Fq′(s) was considered apparent and misleading if the underlying processes were monofractal by construction [323,324].

7.3. Speculative bubbles and critical phenomena

7.3.1. Discrete scale invariance

Just as translational invariance may assume a more general discrete form, the concept of continuous scale invariance standing behind the geometry of fractals can also be considered in discrete terms. In this case a system reproduces itself if rescaled by a factor λn assuming the discrete values λ1, λ2, . . ., related by λn = λ^n. For an observable f(x) characterizing such a system, the following relation then holds:

f(λn x) = λn^α f(x) = λ^{nα} f(x),    (125)

where α is a real number and λ = λ1 is termed the ‘‘preferred scaling factor’’ [325], as it determines the rescaling factors for which the system appears self-similar. The solution of Eq. (125) is not restricted to the power-law function known from conventional critical phenomena. Its more general form [326,327] is given by:

f(x) = cx^β P(ln x / ln λ),    (126)

where P denotes a periodic function of period one. This general solution is thus represented by a power-law function modulated by oscillations that are periodic in ln x. Expanding this expression into a Fourier series, imposing the condition that the solution is real, and restricting it to the dominating first-order component leads to the following function:

f(1)(x) = cx^β [1 + a cos(2π ln x / ln λ)].    (127)

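For illustration, the first-order form (127) can be generated and fitted with a standard nonlinear least-squares routine. In the sketch below (purely synthetic data; an explicit phase φ is added, as is commonly done in practical fits) the critical time is known and x = |t − tc| is computed directly; in real applications tc itself is a fit parameter, which is what makes the procedure delicate.

```python
import numpy as np
from scipy.optimize import curve_fit

def log_periodic(x, c, beta, a, lam, phi):
    """First-order log-periodic form of Eq. (127), with an explicit phase."""
    return c * x ** beta * (1.0 + a * np.cos(
        2.0 * np.pi * np.log(x) / np.log(lam) + phi))

# synthetic 'bubble': oscillations accelerating towards a known t_c
tc = 1000.0
t = np.arange(1.0, 950.0)
x = tc - t                                  # distance from the critical time
y = log_periodic(x, 1.0, -0.3, 0.1, 2.0, 0.5)
y += 0.02 * np.random.default_rng(2).standard_normal(len(y))

p0 = [1.0, -0.25, 0.1, 2.2, 0.0]            # initial guess with lambda near 2
popt, _ = curve_fit(log_periodic, x, y, p0=p0, maxfev=20000)
print(dict(zip(['c', 'beta', 'a', 'lambda', 'phi'], np.round(popt, 3))))
```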
The structure of f(1)(x) implies that a part characterized by continuous scale invariance is decorated by log-periodic oscillations. The phenomenon of discrete scale invariance can be directly related to fractal objects, in which self-similarity also demands rescaling by a specific value of the scaling factor [325]. Let Φ be a fractal obtained by successive iterations of a generating procedure, and let L be its cover by a network of square cells with side length l. If n(l) denotes the number of nonempty cells of the cover L, then a local, l-dependent fractal dimension is given by:

D(l) = −ln n(l) / ln l.    (128)

In the limiting case l → 0 the above equation transforms into the definition of the capacity dimension (101). By continuously decreasing the size of the covering cells, the dimension D(l) oscillates, increasing in a leap each time l becomes comparable to the detail size in the kth iteration of the procedure generating Φ, and decreasing between successive leaps when n(l) remains constant while ln(1/l) increases.


This periodic behaviour of D(l) can be quantified by generalizing the fractal dimension to a complex number:

Dk = D + 2πki / ln l,    (129)

where D is the capacity dimension of Φ. Just as ordinary power-law dependences with real exponents are characteristic of critical phenomena in systems with continuous scale-invariance symmetry, power-law dependences with complex exponents occur in systems with broken symmetry, i.e., systems with discrete scale invariance. In both cases, the variable x denotes the distance of the current value of a control parameter from a critical point. Breaking the symmetry of continuous scale invariance may occur as a consequence of many processes, among which a chief place is naturally occupied by hierarchical processes (e.g., cascades), which produce the fractal character of the system. Observables whose behaviour can be described by solutions with log-periodic oscillations (127) can be found, for example, in turbulence, in diffusion-limited aggregation, and in the creation of fractures and seismic preshocks [325].

7.3.2. Log-periodic oscillations as a part of financial market dynamics

The hypothesis that the phenomenon of discrete scale invariance can be observed in financial markets was formulated after noticing that large financial crashes, which lead to sudden drops of asset prices and market indices, sometimes by several tens of per cent within a few days, resemble a torrential relaxation of strains in seismic faults during earthquakes. Since there is strong evidence that some earthquakes are preceded by log-periodic oscillations of certain observables (e.g., specific ion concentrations in water), it seems quite natural that similar symptoms may be sought in financial data prior to crashes [328,329]. Here the role of the variable x of Eq. (127) is played by the temporal distance from the moment of a crash: x = |t − tc|. This implies that time is subject to discrete scaling with a preferred scaling factor λ. Because during the formation of market bubbles (t < tc) the trend is upward, the condition β < 0 must be fulfilled in Eq. (127). This leads to a singularity at t = tc. By analogy to critical phenomena, the parameter tc is called the critical time (despite the fact that it remains uncertain whether the phenomena taking place in the markets at that time are indeed critical in the statistical-physics sense). However, one can safely admit that at tc a phase transition occurs: from a state characterized by an approximate symmetry between supply and demand (with only a small advantage of the latter, which gradually drives the prices upwards), the market passes to a state in which this approximate symmetry is broken and supply is overwhelmingly predominant.

So far, there is no fully convincing explanation of this type of invariance in financial markets. One of its possible sources can be the hierarchical organization of the global market, both according to geographic or economic factors (world regions, countries, exchanges, industrial sectors, subsectors, etc.) and according to a size hierarchy of market players and the sums of money they invest. This hypothesis is justified by the fact that hierarchical processes are among the sources of discrete scale invariance and log-periodic oscillations in various physical systems [329]. An additional argument which can be used here is the fractal structure of market dynamics: after a proper rescaling of the axes, the short-term evolution of a market index cannot be distinguished by eye from its long-term evolution [330]. This means that events which, from a long-term perspective, are nothing more than noise may be considered small crashes on short time scales.
If crashes are considered moments of destruction of speculative bubbles, then the bubbles continuously emerge and collapse on all time scales. The evolution of such bubbles, their gradual superstructuring on the normal, ‘‘healthy’’ evolution of an observable, has its own dramaturgy expressed by an upward trend decorated with log-periodic oscillations. In this context, the whole evolution of the market can be viewed as a series of such bubbles whose decorating oscillations are themselves bubbles on a lower hierarchical level, with their own log-periodic substructures being, in turn, local bubbles on an even lower level of organization [330,331]. This can be seen in Fig. 76, where the evolution of the S&P500 index during the periods 1996–2003 and 2003–2009 is presented as a hierarchy of log-periodic structures during both the increases (bubbles) and the decreases (anti-bubbles).

A mechanism leading to the emergence of log-periodic oscillations, although still unknown, has to be related to the mechanism of formation of market bubbles (see, for example, Refs. [332,333] for propositions of such mechanisms). These occur as a consequence of the herding behaviour of investors who mimic (not necessarily irrationally [130]) the investment decisions of others. If this mimicking becomes a leading motive of activity for a growing number of investors, then, from a level of individual inclination towards this strategy, it turns into a collective preference. The market as a whole becomes sensitive to perturbations and, consequently, a transition from one phase (growth) to another (fall) becomes more and more probable. If such a mechanism functions in a hierarchically organized market model, then the log-periodic oscillations of the market indices can be obtained [334–336].

If one assumes that the observed agreement between the empirical data and this model is not coincidental, then one concludes that there exist some extremely strong, simply dominating, deterministic components of different duration in the market's dynamics. Their possible existence, implying that the price can no longer be considered a martingale, is in striking contrast to the efficient market hypothesis. The problem becomes even more complicated when one takes into consideration the fact that the market trends which carry the oscillations can persist for many years. What is more, some researchers argue that the long-term economic evolution of the world (or at least of the United States) during the last 200 years, parametrized by the (partially backwards-reconstructed) S&P500 index, has the shape of a single upward trend modulated by oscillations of the log-periodic type, guiding the contemporary economy towards an imminent singularity within the forthcoming decades [337,338].



Fig. 76. Medium-term evolution of the S&P500 index in the years 1996–2003 (a) and 2003–2009 (b). Accelerating (t < tc) and decelerating (t > tc) log-periodic functions (127) with the preferred scaling factor λ = 2 and with the oscillations governed by |cos x| were adjusted to the data, forming two-level and even three-level hierarchical structures.

However, it is difficult to imagine that successive generations of investors, whose decisions inevitably have incomparably shorter temporal horizons than centuries, build up the same bubble. Thus, if the hypothesis of the two-century-long deterministic trend is to make any sense, this phenomenon must have its origin much deeper than in long-term-correlated market operations. It must be associated with the fundamental laws of the economy in its broadest sense. It is worth taking a look at Fig. 77, in which clear log-periodic structures can be observed in the long-term evolution of the US Fed Prime Rate over more than sixty years. This behaviour can be a hint that discrete scale invariance may be rooted more deeply in economic mechanisms than only in the dynamics of stock markets.

7.3.3. Prediction potential of the log-periodic model

The log-periodic model of financial market dynamics, if carefully applied, appears to be a surprisingly successful tool for predicting future movements of market indices and commodity prices [339–341]. Experience teaches us that the model's utility exceeds all widely known methods of technical and fundamental analysis, allowing one to identify the moments of trend reversal with a relatively small temporal error dependent on |t − tc|. However, using this model in a straightforward way, without taking into consideration the complicated nature of the markets, can lead to catastrophic investment errors [342] (see also other criticism of the log-periodic model in [343,344]). Since the ability to identify bubbles in the early stages of their development would be of fundamental importance for the markets and for the world's economy, there are attempts at constructing automatic tools for potential-bubble detection [345].

The model based on Eq. (127), in its complete version adapted for financial data analysis, has 7 parameters [329]. The key parameter is the preferred scaling factor λ, describing how fast the oscillations contract. Fixing its value is the most important step in making the log-periodic model reliable in predicting the future evolution of markets. If λ were treated as a free parameter, fitted each time to specific data [130], the utility of the model would be doubtful. This is because the pattern of minima and maxima in the time course of a price or a market index can be seriously distorted by various extra structures, not originating from its internal dynamics (e.g., the internal dynamics of a speculative bubble), that are nevertheless captured by the model.


Fig. 77. Historical data of the US Fed Bank Prime Loan Rate and the adjusted log-periodic functions with accelerating and decelerating oscillations (127) with λ = 2. In both cases tc ≈ 1981. Note a mismatch between this particular log-periodic trend and the data starting around 2006.

Fig. 78. Brent oil price per barrel (top) and gold price per ounce (bottom) in the London market together with adjusted log-periodic function (127) with λ = 2. For oil, the predicted critical time tc was in perfect agreement with the actual date of the price maximum (July 11th, 2008), while for gold the trend reversal occurred 6 trading days earlier than predicted (grey vertical belt). Vertical arrows indicate the days on which the forecasts were made.

A significant part of the market fluctuations results from sudden external impulses which can (but do not necessarily have to) influence the long-term deterministic components that form the oscillation pattern. In such circumstances, fixing the scaling parameter λ permits us to neglect certain structures in the data which do not comply with the assumed overall scenario.


Fig. 79. Exchange rate of Swiss franc (CHF) to Polish zloty (PLN) between June and September 2011 (segmented line with full symbols) together with the log-periodic function adjusted to the data (smooth line). Note the excellent agreement between the critical time tc predicted by the model and the actual date of the trend reversal. Small vertical arrow indicates the day on which the forecast was made. A similar picture can be drawn for the CHF/EUR rate.

On the other hand, in the case of a free λ, there is no a priori reason to ignore any of the observed structures. Thus, it may happen either that a few contradictory scenarios must be taken into consideration with the same a priori weight, which leads to different predictions, or that no log-periodic scenario fits the data at all. In both cases the predictions can be largely erroneous. Our experience of successful predictions and careful analysis of different historical data sets allows us to postulate λ ≃ 2 [330,331,338,341,346]. This value is also supported by its similarity to the value λ = 2 observed in other physical phenomena [325].

In practice, prediction based on the log-periodic model relies upon identifying the approximate location of two consecutive minima (or maxima) in historical data, linking these minima to two consecutive minima of the log-periodic function (thus fixing the phase φ and the critical time tc in (127)), and extrapolating this function or its alternative variants into the future. The unspecified type of the periodic function imposed on the power-law trend in Eq. (126) allows one to select the oscillation shape arbitrarily, for better agreement between the model and the data [338]. Fig. 78 presents two examples of an effective application of the log-periodic model to the prediction of the moments of trend reversal in the price of an oil barrel [341] and of a gold ounce [340]. It is worth noting that for oil the forecast ideally anticipated the real price behaviour, the sharp trend reversal occurring exactly at the predicted date, while for gold the temporal precision of the forecast was equal to 6 trading days. Some advance of the actual trend reversal relative to the determined tc can be expected, since it is a natural consequence of the increasing market sensitivity to even tiny perturbations as the critical point is approached.

One final example, related to the recent (beginning of August 2011) stock market turmoil and the resulting volcanic appreciation of the Swiss franc (CHF) relative to all the major currencies, is illustrated in Fig. 79, based on the CHF to Polish zloty (PLN) exchange rate. The prediction pointed to August 12, 2011 as the deadline for the termination of this appreciation [347]. The exchange rate reached its maximum during the night of August 10/11 and then started declining sharply, in accordance with the prediction. This example points to the fact that the predictive potential of the above methodology may also apply to currencies.

8. Network representation of complex systems

The most fundamental factor that shapes complex systems is the interactions among their elements. However, the character of these interactions is system-specific: different systems can be based on different interactions or reveal different structures of dependences. This poses a substantial difficulty in the search for regularities and in the formulation of universal laws governing the structure and evolution of complex systems. An advantage of the network formalism that cannot be overestimated in this context is that it allows one to describe distinct, sometimes disparate, systems in the same language of abstract graphs. Thanks to this, researchers studying complex systems were able to spot some previously unnoticed properties that are common to many such systems.
The network formalism is based on the notions of nodes (or vertices), which can be identified with the individual structural or functional elements of a system, and edges, which represent physical interactions or other relations between pairs of nodes. In this way, apart from the systems with an obvious network structure, like, e.g., transportation and communication systems or the World Wide Web, one can also analyse systems whose network representations are more abstract: the metabolism of a living cell, the immune system, the financial markets, and so on. Some concepts originating from network theory have led to a better understanding of the mechanisms governing the development of complex systems, their adaptation to a changing environment, and the self-organization of their structure for the sake of stability and resistance to failures or external attacks.


The network approach to the phenomenon of complexity has turned out to be so fruitful that some researchers consider it the key to understanding the principles of the structure and behaviour of complex systems [348]. In this section, we illustrate how the network formalism can be applied to the analysis of empirical data from two financial markets: the stock market and the foreign exchange market.

8.1. Network formalism and basic properties of networks

8.1.1. Measures of network properties

A binary network consisting of N nodes (the set N) and E edges is fully defined by an adjacency matrix A whose entries aij (i, j = 1, ..., N) are the numbers 0 or 1 describing whether an edge connecting the nodes i, j ∈ N exists. In the case of a weighted network, the role of the adjacency matrix is played by a weight matrix W, whose entries are the weights ωij ∈ R. If in a binary network Lij is the length of the shortest connection between the nodes i and j (expressed by the number of intermediate nodes or edges), then the characteristic path length:

L = (1/(N(N − 1))) Σ_{i,j∈N, i≠j} Lij,    (130)

is a measure of the network's compactness: the larger L is, the more branched the network. For a weighted network, Lij may be replaced by the sum of the edge weights along a path connecting the nodes i and j, or by any other function of these weights; the characteristic path length is then calculated in full analogy to Eq. (130). The significance of a given node depends on its location within the network structure. It can be described in several different ways, but usually it is expressed by the degree ki (the number of edges linking this node with other nodes) or by the betweenness (or centrality) bi given by the formula:

bi = Σ_{j,k∈N, j≠k} ni(j, k) / n(j, k),    (131)

where ni(j, k) is the number of shortest paths between the nodes j and k that pass through the node i, while n(j, k) is the total number of shortest paths connecting j and k. The betweenness is a measure of the node's centrality. Removing a node of high centrality from the network causes a substantial worsening of the network's effectiveness; therefore this quantity seems to be an intuitive measure of the node's significance. For a weighted network, the role of ki is taken by the node strength si, defined as the sum of the weights of the edges that connect the node i with other nodes.

The topology of a binary network in the neighbourhood of a node i can be described by the local clustering coefficient gi, defined as the ratio of the actual number of edges in the subgraph consisting of all the neighbours of the node i to the maximum possible number of such edges. The global topology of the network can be characterized by the mean clustering coefficient Γ, being the average over all the nodes i ∈ N:

Γ = (1/N) Σ_i gi = (1/N) Σ_i [Σ_{j,l} aij ajl ali / (ki(ki − 1))],    (132)

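The quantities defined in Eqs. (130)–(132) are readily available in standard graph libraries. A brief sketch using networkx on a random test graph follows (note that networkx normalizes the betweenness by default, whereas Eq. (131) is unnormalized, and that conventions may differ by constant factors):

```python
import networkx as nx

# a random test graph; restrict to the largest connected component
G = nx.gnp_random_graph(200, 0.05, seed=42)
G = G.subgraph(max(nx.connected_components(G), key=len)).copy()

L = nx.average_shortest_path_length(G)              # Eq. (130)
b = nx.betweenness_centrality(G, normalized=False)  # Eq. (131)
Gamma = nx.average_clustering(G)                    # Eq. (132)

print(f"L = {L:.2f}, max betweenness = {max(b.values()):.0f}, "
      f"Gamma = {Gamma:.3f}")
```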
The mean clustering coefficient for a weighted network may be defined in a few different ways, but here we prefer to use the formula [349]:

Γ̃ = (1/N) Σ_i (1/(ki(ki − 1))) Σ_{j,l} (ω̃ij ω̃jl ω̃li)^{1/3},    ω̃mn = ωmn / max_{m,n}(ωmn),    (133)

which is sensitive not only to the presence or absence of triangle motifs (i.e., structures in which the existence of the edges i−j and i−l implies the existence of the edge j−l), but also to the weights of their constituent edges. A large value of the mean clustering coefficient is characteristic of networks with a high number of triangles.

8.1.2. Scale-free networks

An interest in the network representations of complex systems was triggered by the discovery that the World-Wide Web topology disagrees with the random-network paradigm, which predicted a Poisson-type distribution of the number k of links pointing from a particular document to other documents (and vice versa):

P(k) = e^{−λ} λ^k / k!,    λ = ⟨k⟩,    (134)

where ⟨k⟩ is the mean node degree. Networks with such a distribution of node degrees (called Erdös–Rényi networks) can be constructed by randomly connecting previously unconnected pairs of nodes until a predefined total number of edges in the network is reached.


It was long believed that real-world networks have exactly this random-network topology. However, the actual degree distribution of the World-Wide Web turned out to be a power law [350]:

P(k) ∼ k^{−γ},    γin ≈ 2.1,    γout ≈ 2.45.    (135)

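The contrast between the distributions (134) and (135) is easy to reproduce numerically; the sketch below (an illustrative, networkx-based comparison with arbitrary parameters) sets an Erdös–Rényi graph against a Barabási–Albert graph grown by the preferential-attachment rule discussed at the end of this subsection, whose fat degree tail shows up already in the maximum degree:

```python
import numpy as np
import networkx as nx

n = 20000
er = nx.gnp_random_graph(n, 8.0 / n, seed=7)    # Poissonian degrees, Eq. (134)
ba = nx.barabasi_albert_graph(n, 4, seed=7)     # power-law degrees, Eq. (135)

for name, G in (("Erdos-Renyi", er), ("Barabasi-Albert", ba)):
    deg = np.array([d for _, d in G.degree()])
    print(name, "mean degree:", round(float(deg.mean()), 1),
          "max degree:", int(deg.max()))
```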
A similar scale-free topology was simultaneously discovered in social networks expressing the actor co-appearances in movies (γ ≈ 2.3) [70] and the scientific paper citations (γ ≈ 3) [263]. These outcomes made it possible to formulate the hypothesis that scale-free topology is commonplace in real-world systems [70]. Later studies confirmed the ubiquitous character of such topology in networks representing diverse natural, social and other systems. On top of the already-mentioned ones, scale-free topology was discovered in the physical structure of the Internet [351], air traffic and airport networks [352,353], Internet social networks [354,355], sexual contact networks [356], epidemic networks [357], metabolic networks [358,359], gene-coexpression networks [360], and many other systems [361,362]. Among them there are both directed and undirected networks. From the perspective of this work, it is worth noting that the network formalism can be applied to representing the functional connections in the human brain [112,113], to mapping the word proximities in literary texts [363], and to mapping the inflection of words [364,365]. In all these examples the analysed network structures turned out to be scale-free.

The wide abundance of exactly this type of network architecture is by no means accidental. The scale-free networks are characterized by two properties of fundamental importance as regards the functional efficiency of the associated systems. First, as many computer simulations have shown, scale-free networks are relatively resistant to accidental errors or failures (e.g., dysfunctional nodes). The majority of nodes in such a network have a small degree, implying a lack of strategic significance for the network as a whole. Owing to this property, even the removal of 80% of the nodes may not be sufficient to disintegrate such a network, while in the Erdös–Rényi case the network disintegrates after removing about 30% of its nodes on average [366]. Although the scale-free networks are vulnerable to failures of the nodes with high centrality, the small number of such nodes implies that in typical conditions, if there is no coordinated attack on the network, its damage is unlikely. Second, due to their optimal edge density, the scale-free networks are economical. A small characteristic path length L favours fast information transfer, while a strongly reduced number of edges allows one to limit the use of resources and to eliminate the hypersensitivity to fluctuations typical of dense networks. Optimization of the number of connections is characteristic of a broader class, the small-world networks [367,368], of which the scale-free networks are a subclass. In general, the small-world networks are defined by L increasing slowly with the number of nodes, not faster than L(N) ∼ O(ln N) (see Ref. [369] for analytic results for various types of networks). A subclass of the small-world networks, the so-called Watts–Strogatz networks, is additionally characterized by a large value of the mean clustering coefficient Γ; this subclass is often identified with the whole class of the small-world networks [368]. An important subclass of the small-world networks are hierarchical networks, in which groups of nodes tend to cluster together around local centres, and these centres tend to cluster around higher-order centres, which are themselves the elements of even larger structures. The hierarchical networks can be scale-free. In this case, the mean clustering coefficient is also power-law dependent on the node degree: Γ ∼ k^{−δ}.
As a result, the nodes with small k_i are predominantly linked to other nodes inside a cluster, while only a few nodes with a relatively high degree (the central nodes) are responsible for the inter-cluster connections. The process which is most often credited as a mechanism of scale-free network formation is the preferential attachment [70]. In each step n of this process, a new node with a fixed number of links l is added to a network consisting of N_{n−1} nodes. The probability µ(i) that this node will be linked to an existing node i is proportional to the degree of that node: µ(i) ∼ k_i. In this way, the attachment procedure, which starts in step 1 with an initial network of N_0 nodes and l_0 edges, after n steps leads to a network consisting of N_0 + n nodes and l_0 + ln edges. If n is large enough, the degree distribution has a power-law slope as in Eq. (135). Even though this power-law distribution is a consequence of the specific, linear dependence of the probability µ(i) on k_i, and γ = 3 is thus the only possible result, one can also obtain other values of the power exponent γ, complying with the empirical results, by certain modifications of the attachment procedure [70].
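To make the mechanism concrete, here is a minimal Python sketch (ours; all parameter values are illustrative and not taken from the studies cited above). It grows a network by linear preferential attachment using the standard "stub list" trick: choosing a uniformly random edge endpoint is equivalent to choosing a node with probability proportional to its degree, which implements µ(i) ∼ k_i exactly.

```python
import random
from collections import Counter

def preferential_attachment(n_steps, l=2, seed=0):
    """Grow a network: each new node attaches l edges to existing
    nodes chosen with probability proportional to their degree."""
    rng = random.Random(seed)
    # initial network: a complete graph on l + 1 nodes (each has degree l)
    stubs = [i for i in range(l + 1) for _ in range(l)]
    degree = Counter(stubs)
    for new in range(l + 1, l + 1 + n_steps):
        targets = set()
        while len(targets) < l:          # l distinct, degree-biased targets
            targets.add(rng.choice(stubs))
        for t in targets:
            stubs.extend([new, t])       # record both endpoints of the edge
            degree[new] += 1
            degree[t] += 1
    return degree

# for a large number of steps the degree distribution approaches P(k) ~ k^(-3)
deg = preferential_attachment(20000)
print(max(deg.values()), sum(1 for k in deg.values() if k == 2))
```

Modified attachment kernels, as mentioned above, shift the exponent away from γ = 3.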



2(1 − Cij ),

(136)

which is metric [370]. Due to the properties of the correlation coefficient, values √ of the inter-node distance are limited to 0 ≤ dij ≤ 2, with the following special values: dij = 0 for identical signals, dij = 2 for independent signals, and dij = 2 for


Fig. 80. Binary MST constructed for time series of stock returns (Δt = 15 min, the years 1998–1999) representing the 1000 highly capitalized American companies. Lengths of the edges are arbitrary and the edge crossings are completely artefactual.

The chosen algorithm of MST construction is based on ordering all the edges according to their increasing distance d_ij and successively connecting the closest pairs of nodes, provided at least one node of the pair has not been connected yet. The procedure is continued until no node is left out. A complete MST consists of N nodes and E = N − 1 edges. The MSTs may have various topologies depending on the properties of the node sets on which they are spanned. MST topology can be described with the help of the measures introduced above. In the case of a binary MST, the characteristic path length has the same general formula as in Eq. (130), while the formula for the betweenness of a node i can be considerably simplified:

$$b_i = \frac{1}{(N-1)(N-2)} \sum_{j,l \neq i} \delta_i(j,l), \qquad (137)$$

where δ_i(j,l) = 1 if a node i is situated on the path between nodes j and l, and δ_i(j,l) = 0 otherwise. If a node i is the centre of an MST and is linked to all other nodes, then b_i = 1, while if i is a peripheral node with degree k_i = 1, then b_i = 0.

8.2. Financial markets as complex networks

8.2.1. Network representation of the stock market

The structure of financial markets treated as abstract systems of interacting assets can be represented by a weighted network in which the individual assets are nodes, while the values of some statistical measure expressing the dependences among observables associated with these assets (e.g., prices, returns, volatilities) are the edge weights [370–372]. Typically, the so-constructed empirical network is fully connected (i.e., it possesses all the possible edges, E = N² − N), since it is unlikely that for some pair of nodes C_ij = 0 exactly. One of the benefits of the network approach is the possibility of representing graphically the correlation structure of a market in an incomparably compact form. Let the American market be an example [373]. We selected N = 1000 stocks associated with highly capitalized companies traded on NYSE or NASDAQ. This data set has already been analysed in Section 3.2 by means of the correlation matrix method. Although the stock–stock correlations for the companies of different capitalization saturate only on time scales much longer than the daily one, we chose the time scale of Δt = 15 min because it corresponds to the correlations which are fully identifiable by statistical methods (see Section 3.2 and Refs. [177,178,374]). Due to the fact that for the present number of nodes the number of edges is enormous (E ≈ 10⁶), it is impossible to create a readable picture of the complete network. The MST representation shown in Fig. 80 consists of only 999 edges and is therefore much more convenient to visualize than the complete network. The topology of this MST indicates that the network is strongly centralized, with the most important node of degree k_GE = 228 being General Electric and two nodes of secondary importance being Citigroup (k_C = 75) and Cisco Systems (k_CSCO = 61). As in other highly centralized networks, one can observe here a large number of peripheral nodes of degree 1. In order to present the positions of particular companies, out of the whole set we selected only the 100 companies with the largest capitalization and constructed a smaller MST restricted to these companies. The MST structure in Fig. 81 reflects the essential features of the large tree from Fig. 80 and indicates that the stock returns for GE, C and CSCO behave as if they represented the mean evolution of large groups of stocks or even whole markets (GE for NYSE, CSCO for NASDAQ).
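As an illustration of this construction, the sketch below (ours; a Kruskal-type procedure with a union–find forest standing in for the verbal "connect the closest pairs" prescription) turns a correlation matrix into an MST edge list via the distance of Eq. (136).

```python
import numpy as np

def mst_from_correlations(C):
    """Build an MST from a symmetric N x N correlation matrix C,
    using d_ij = sqrt(2 * (1 - C_ij)) and a Kruskal-type algorithm."""
    N = C.shape[0]
    D = np.sqrt(np.clip(2.0 * (1.0 - C), 0.0, None))
    # all node pairs sorted by increasing distance
    pairs = sorted((D[i, j], i, j) for i in range(N) for j in range(i + 1, N))
    parent = list(range(N))                 # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    edges = []
    for d, i, j in pairs:
        ri, rj = find(i), find(j)
        if ri != rj:                        # the edge creates no cycle
            parent[ri] = rj
            edges.append((i, j, d))
            if len(edges) == N - 1:         # a complete MST has N - 1 edges
                break
    return edges

# toy usage: MST of returns of 4 correlated random walks
rng = np.random.default_rng(0)
walks = np.cumsum(rng.standard_normal((4, 500)), axis=1)
print(mst_from_correlations(np.corrcoef(np.diff(walks, axis=1))))
```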


Fig. 81. Weighted MST for the 100 largest American companies (with their names encoded by tickers). Lengths of the edges are arbitrary, while their widths are proportional to the respective weights ωij ≡ Cij .

Fig. 82. Betweenness of nodes representing the stocks of 1000 highly capitalized American companies.

These stocks owe their central roles to their capitalizations, which rank among the largest for companies of similar industrial profiles. The ranks occupied by the stocks in the network hierarchy can be described by the betweenness b_i of the corresponding nodes, given by Eq. (137). Values of this parameter for each of the 1000 stocks forming the network are shown in Fig. 82. From the perspective of the MST structure, the significance of the main nodes is substantially higher than their degrees might suggest: 90.5% of the paths linking pairs of nodes pass through GE, about 38% through C, about 29% through CSCO, and there are 7 further nodes with over 10% of the paths passing through each of them: Lucent Technologies (LU), Unilever (UN), ING, Bank One (ONE), Morgan Stanley (MWD), Exxon (XON), and Chevron (CHV). Curiously, a Zipf-like rank plot for b_i follows an exponential decay over a wide range of ranks R: b_i(R) ∼ e^{−0.036R}. Due to the significant centralization of the network, the MST presented in Fig. 80 is compact and the path lengths between randomly selected nodes are typically small. In order to inspect the behaviour of L depending on the number of nodes N, we calculated the characteristic path lengths for MSTs consisting of a variable number of nodes. First, the companies were ordered according to their market capitalization and, then, the number of stocks N in a set was increased by successively adding new stocks, starting from the 10 top-capitalized ones, until all the 1000 stocks were considered. The numerical values in Table 2 show that L(N) < ln N, which is characteristic for both the small-world networks and the centralized ones. However, the average correlation coefficient is small (⟨C_ij⟩ = 0.07), and so is the mean weighted clustering coefficient (Γ̄ = 0.06), which indicates that although the studied network counts among the small-world networks, it is not of the Watts–Strogatz type.
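For completeness, a small self-contained sketch (ours) showing how the betweenness of Eq. (137) and the characteristic path length can be evaluated for a tree given as an edge list; since a tree contains exactly one path per node pair, breadth-first search suffices.

```python
from collections import deque

def tree_paths(n_nodes, edges):
    """Return the unique path between every (unordered) pair of tree nodes."""
    adj = {i: [] for i in range(n_nodes)}
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    paths = {}
    for src in range(n_nodes):
        prev = {src: None}
        queue = deque([src])
        while queue:                       # BFS rooted at src
            u = queue.popleft()
            for v in adj[u]:
                if v not in prev:
                    prev[v] = u
                    queue.append(v)
        for dst in range(src + 1, n_nodes):
            path, u = [], dst
            while u is not None:           # walk back from dst to src
                path.append(u)
                u = prev[u]
            paths[(src, dst)] = path
    return paths

def betweenness_and_path_length(n_nodes, edges):
    paths = tree_paths(n_nodes, edges)
    norm = (n_nodes - 1) * (n_nodes - 2)
    b = [0.0] * n_nodes
    for path in paths.values():
        for u in path[1:-1]:               # interior nodes of the path
            b[u] += 2.0 / norm             # ordered pairs (j,l) and (l,j)
    L = sum(len(p) - 1 for p in paths.values()) / len(paths)
    return b, L

# star of 5 nodes: the centre has b = 1, the leaves have b = 0
print(betweenness_and_path_length(5, [(0, i) for i in range(1, 5)]))
```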


Fig. 83. Cumulative distribution of the node degrees for the network representing 1000 highly capitalized American companies (solid line with points). A power-law function with the exponent γ = 1.24 ± 0.05 was fitted to the data (straight line). The discrepancy between the results for the empirical data and the function indicates that the analysed network does not have a scale-free character.

Table 2
Characteristic path length L as a function of the number of nodes N for the minimal spanning trees representing stocks of N American companies. Values of ln N are shown for comparison.

N       10     25     50     75     100    250    500    750    1000
L(N)    0.91   0.98   1.30   1.31   1.34   1.99   2.43   2.88   3.28
ln N    2.30   3.22   3.91   4.31   4.60   5.52   6.21   6.62   6.91

The MST for the considered set of companies does not have a structure typical for the scale-free networks. This can be seen in Fig. 83, where the cumulative distribution of the node degrees is displayed. In the middle part of the plot one can notice that the data points tend to arrange approximately along a power-law function with the power exponent γ = 1.24 ± 0.05. However, outside this middle range of k, the distribution shows the behaviour that is typical for the centralized networks: the overrepresentation of the most central and the most peripheral nodes deflects the distribution towards higher probabilities. This means that the analysed network reveals two superimposed structures: the scale-free one that characterizes the nodes of moderate degree, and the centralized one represented by the other nodes. Thus, this MST is neither purely scale-free nor purely centralized. It seems that the conclusions from another analysis of the topology of the stock market network [375], suggesting that within a broad range of k the network is scale-free, are too far-reaching. Our network of inter-stock correlations can easily be transformed into a decentralized form by filtering out the most collective component associated with the largest eigenvalue of the underlying correlation matrix. This can be done in a standard way by means of Eq. (54), by constructing the residual correlation matrix and then the residual MST. This residual tree is shown in Fig. 84. It has a completely different topology compared with its counterpart for the original data (Fig. 80), being now closer to the random network topology. Although the node with the largest degree is American Electric Power (k_AEP = 65), which has moderate capitalization K_AEP = 6.5 · 10⁹ USD, the highest topological significance is still assigned to GE (b_GE = 0.65). Unlike the original MST, in the case of the residual one, the nodes which follow GE have comparable values of b_i. This change of topology also manifests itself in the characteristic path length, whose value increased and no longer satisfies the small-world condition: L = 16.3 > ln 1000. It should be noted, however, that the topology of this residual tree in fact deviates from the random network topology, because of a number of nodes with a high degree. This result remains in agreement with an observation made in Section 3.2.3, in which a residual correlation matrix constructed for the same filtered data still had non-random eigenvalues exceeding the upper Wishart bound λ_max (Fig. 13). The MST shown in Fig. 84 has a clear cluster structure. The clusters are formed by the stocks which are more strongly correlated with other stocks inside the same cluster than with the stocks outside of it. Identification of clusters in a network can be carried out with the help of many algorithms [376]; here, however, we apply the same procedure which we applied to the currency market in Section 3.3.2, consisting in filtering out the correlation matrix elements with below-threshold values [185]. This is equivalent to removing from the complete network the edges with weights ω_ij = C_ij < p, where p is the threshold. The result of this procedure applied to the set of 1000 American stocks is presented in Fig. 85. The threshold value below which the edges were removed was p = 0.18. For this value, the number of independent clusters consisting of at least 3 nodes had its maximum. The clusters identified in this way comprise the stocks representing the same or similar market sectors or the same geographical regions.
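The threshold-filtering step itself is easy to sketch (ours, on synthetic data; only the threshold value p = 0.18 is taken from the text): edges with C_ij < p are discarded and the surviving connected components of at least 3 nodes are reported.

```python
import numpy as np

def threshold_clusters(C, p=0.18, min_size=3):
    """Keep edges with C_ij >= p; return connected components
    (clusters) containing at least min_size nodes."""
    N = C.shape[0]
    adj = [[j for j in range(N) if j != i and C[i, j] >= p] for i in range(N)]
    seen, clusters = set(), []
    for start in range(N):
        if start in seen:
            continue
        stack, component = [start], []
        while stack:                       # depth-first search
            u = stack.pop()
            if u in seen:
                continue
            seen.add(u)
            component.append(u)
            stack.extend(adj[u])
        if len(component) >= min_size:
            clusters.append(sorted(component))
    return clusters

# toy example: two blocks of 3 correlated signals plus an unrelated one
rng = np.random.default_rng(1)
common = rng.standard_normal((2, 1000))
signals = np.vstack([common[0] + 0.8 * rng.standard_normal((3, 1000)),
                     common[1] + 0.8 * rng.standard_normal((3, 1000)),
                     rng.standard_normal((1, 1000))])
print(threshold_clusters(np.corrcoef(signals)))   # -> [[0, 1, 2], [3, 4, 5]]
```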
Microscopic topological properties of a network (node degrees, betweenness, existence of particular edges) based on the stock–stock correlations are in general unstable, reflecting the significant instability of the correlation coefficients for individual pairs of stocks. In order to assess how this microscopic variability manifests itself in the global properties of the network, the time series of returns corresponding to the N = 100 largest American companies were divided into windows of length T = 100 (which corresponds to 1500 min, i.e., almost 4 trading days).


Fig. 84. Minimal spanning tree for 1000 highly capitalized American stocks after filtering out the most collective eigenstate of the correlation matrix (‘‘the market factor’’).

For each window, a weight matrix was calculated, and then the mean clustering coefficient Γ̄, whose temporal dependence is exhibited in the bottom panel of Fig. 86. For the majority of the time, Γ̄ fluctuates around its mean level of ≈0.2, but there were intervals in which the coefficient value strongly increased. It is instructive to compare the behaviour of Γ̄ with the time course of the Dow Jones Industrial Average during the same period (the upper panel of Fig. 86). There is a clear correlation between these two quantities: Γ̄ rises during sudden market drawdowns, while it remains suppressed when the market grows. The most striking example of such behaviour is August and September of 1998, when the clustering coefficient entered the range 0.3 < Γ̄ < 0.5. This supports the earlier-made observation that market falls are typically more collective than market growth or stagnation periods [247,248]. This obviously translates into the instability of the networks representing the market.

8.2.2. Network representation of the currency market

A network representation of the currency market constructed from the correlation matrix for all possible pairs B/X, where B is the base currency, contains excess information. This stems from the fact that the respective correlation matrix has considerably reduced rank due to the triangle relations among the currencies (Eq. (25)). In effect, the matrix of size N(N − 1) has rank equal to N − 1 and the same number of non-zero eigenvalues. This is one of the reasons why it is more convenient to consider a family of small correlation matrices (and the corresponding network representations), each of which corresponds to a particular base currency [184]. The complete data set consists of time series of daily returns representing all currency pairs formed from 62 currencies and precious metals. Each time series has a length of T = 2519 days (the years 1999–2008). For each choice of B, a network consisting of N − 1 nodes corresponding to the exchange rates B/X was constructed. For the same reason as in Section 3.3, we prefer to restrict our analysis to a subset of N_ind = 38 independent currencies. The MSTs corresponding to both sets of currencies for an exemplary choice of base currency, B = MAD, are shown in Fig. 87. The structure of these trees differs considerably between the sets. In both cases, the node with the highest degree is USD, but the difference is that in the case of the complete currency set, the USD node is the root for a few extensive branches, while in the case of the independent currency set, there are 2 branches and only one of them can be considered extensive. In the latter case, the vast majority of the edges link USD with nodes of degree 1. This difference has its origin in the artificial pegs connecting some currencies with the US dollar. Due to these constraints, such currencies were not included in the independent currency set. The currency pegs in the top panel of Fig. 87 are represented by the edges with the largest widths. Because the USD-pegged currencies are identical or almost identical with USD, they can be connected with other USD satellites and can form extensive branches. This may lead to an incorrect interpretation of the structure of the MSTs, since some nodes associated with currencies of marginal importance have an unreasonably large degree and betweenness. The examples are Bahraini dinar (BHD) and Malaysian ringgit (MYR).


Fig. 85. Stock clusters identified in the minimal spanning tree constructed for the 1000 largest American companies after filtering out the market factor from the original time series of returns. The clusters correspond to stocks which are more strongly correlated among themselves than with the rest of the market. The Figure shows all the clusters consisting of at least 3 nodes. For the sake of readability, the edges connecting the nodes in clusters have been omitted.

Fig. 86. Mean clustering coefficient Γ̄(t) (bottom) calculated for time series of the 100 largest American companies in a moving window of length T = 100 returns (Δt = 15 min) together with the temporal evolution of DJIA (top) during the same period of time.
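The moving-window procedure behind Fig. 86 can be sketched as follows (our illustration on synthetic returns; the weighted clustering coefficient is computed here in one common variant, the Onnela-type geometric mean over triangles, which is an assumption rather than necessarily the exact definition used in the analysis above).

```python
import numpy as np

def weighted_clustering(W):
    """Mean Onnela-type weighted clustering coefficient of a fully
    connected weighted graph W (one common variant of the definition)."""
    N = W.shape[0]
    w = np.abs(W) / np.abs(W).max()       # normalize weights to [0, 1]
    np.fill_diagonal(w, 0.0)
    cw = w ** (1.0 / 3.0)
    # diagonal of cw^3 counts weighted triangles around each node
    tri = np.diagonal(cw @ cw @ cw) / ((N - 1) * (N - 2))
    return tri.mean()

def rolling_gamma(returns, T=100, step=10):
    """Mean clustering coefficient in a moving window of T returns."""
    out = []
    for start in range(0, returns.shape[1] - T + 1, step):
        C = np.corrcoef(returns[:, start:start + T])
        out.append(weighted_clustering(C))
    return np.array(out)

# toy usage: 100 synthetic assets sharing one 'market' component
rng = np.random.default_rng(2)
market = rng.standard_normal(2000)
r = 0.3 * market + rng.standard_normal((100, 2000))
print(rolling_gamma(r)[:5])
```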

Owing to the exclusion of such currencies, an MST becomes more transparent and it better reflects the status of each currency. Depending on the choice of the base currency, the network representations of the currency market can have different topological properties. The most characteristic types of MSTs will be described based on the following exemplary base currencies: MAD, XAU, EUR and USD.


Fig. 87. Minimal spanning trees for MAD as the base currency. Each node represents a time series of daily returns corresponding to an exchange rate MAD/X, where X is one of 61 currencies and precious metals (top) or one of 37 independent currencies, including gold (XAU) (bottom). Edge widths are proportional to the respective weights ω_ij.


Fig. 88. Minimal spanning tree for the XAU/X exchange rates, where X is one of 37 independent currencies. Edge widths are proportional to the respective weights ω_ij. Note the comparable widths of all the edges, which indicates that gold has its own dynamics that is largely independent from the currency market.

The bottom MST in Fig. 87 has an interesting topology, revealing the properties of both the centralized networks (the existence of a significant cluster concentrated around USD, whose centrality expressed by betweenness is b_USD = 0.71) and the distributed networks (a long branch leading to an elevated value of the characteristic path length: L = 2.69; the existence of a few nodes with high betweenness: b_MXN = 0.57, b_CAD = 0.46, b_AUD = 0.38, b_ZAR = 0.30, b_PLN = 0.27, b_CZK = 0.23). The characteristic path length is larger here than in the case of the stock market consisting of 500 stocks (L_stocks = 2.43, Table 2). On the other hand, the number of nodes with b_i > 0.2 is twice as large as in the case of the stock market. Interestingly, from the perspective of MAD, the EUR node is characterized by a surprisingly low degree and low betweenness as compared with its significance in the global market. This effect can be explained by the strong correlations linking EUR with MAD, which plays the role of the base currency in Fig. 87. Thus, the evolution of EUR enters MAD as a component of its dynamics and, in the present case, EUR is partially eliminated from the market as if it were the base currency, too. Another type of MST topology can be seen in Fig. 88 for gold (B = XAU). This type is characteristic for the precious metals and the currencies whose evolution is independent from the evolution of other currencies (due to inflation, economic breakdowns, etc.). This is exactly the same situation as in the matrix analysis, when the largest eigenvalue of a correlation matrix has a large magnitude (Fig. 22). All the edges have high weights, which implies a large value of the mean clustering coefficient: Γ̄ = 0.69. Moreover, the MST for XAU is more compact than the one for MAD, as the characteristic path length confirms: L = 1.49 (as compared to ln(N_ind − 1) = 3.61). The network representation of the market viewed from the perspective of gold is therefore a fully connected network whose MST has the small-world property. The network viewed from the perspective of the euro is strongly centralized, with one dominant node being USD. Its degree is k_USD = 21 and its betweenness is b_USD = 0.88 (upper tree of Fig. 89). The latter value is comparable with the betweenness of GE in the MST for the stocks. The two secondary nodes have considerably smaller betweenness: b_MXN = 0.29 (Mexican peso) and b_CAD = 0.23 (Canadian dollar). There is a clear cluster structure of the MST: besides USD and its satellites, one can identify the cluster with MXN in the centre, the triples CAD–AUD–NZD and KRW–TWD–FJD, and the Scandinavian pair SEK–NOK. Thus, the secondary cluster structure is dominated by the geographical links [187,377]. An uncommon property of this representation is the lack of the European and Mediterranean clusters.


Fig. 89. Minimal spanning trees for EUR (top) and USD (bottom) as the base currencies. The MST was constructed from the exchange rates EUR/X and USD/X, respectively, where X denotes one of 37 independent currencies. Different topologies of both trees are clearly visible.


Fig. 90. (Top) Cumulative distributions of node degrees corresponding to 61 exchange rates B/X, where B is one of the following currencies: USD, EUR, TRY (Turkish new lira) or COP (Colombian peso). (Bottom) Values of the power exponent γ for all 62 possible choices of the base currency except for USD and HKD, for which the distributions P (x > k) do not show any power-law behaviour. Horizontal line denotes the mean value ⟨γ ⟩.

The corresponding nodes are attached by small-weight edges to randomly selected nodes to which they are in fact not related (e.g., PLN–MXN, ISK–PEN, SEK–AUD). A similar topology also describes other networks based on European currencies closely related to EUR. The fourth and last type of topology is assigned to B = USD and a few of its satellite currencies. Its specific properties are the largest value of L = 3.17 and the smallest value of Γ̄ = 0.14. The MST structure does not reveal any dominant, high-degree nodes. Despite this, there are nodes whose topological role is significant, like EUR (k_EUR = 8, b_EUR = 0.65), AUD (k_AUD = 6, b_AUD = 0.61) and SEK (k_SEK = 2, b_SEK = 0.48). The clusters of nodes concentrated around EUR, AUD, MXN and KRW are even better visible here than in the networks based on MAD and XAU. Unlike the XAU-based network, the USD-based network is heterogeneous as regards the edge weights. One may also say that it has the richest structure among all the four topology types. This review of the topological properties of the network representations of the currency market viewed from the perspectives of different base currencies will be concluded with a brief description of the node degree distributions for the MSTs [378]. Such cumulative distributions for exemplary choices of base currencies are shown in the upper part of Fig. 90. For better statistics, the distributions were derived for the complete set of 62 currencies, comprising also the dependent ones. The results show that the minimal spanning trees for the vast majority of base currencies have a power-law distribution of node degrees with the mean power exponent ⟨γ⟩ = 1.55 ± 0.07. Only the MSTs for USD and the USD-pegged HKD have distributions with a faster-than-power-law declining tail. We fitted a power-law function, with the power exponent γ treated as a free parameter, to these distributions and obtained the values collected in the bottom panel of Fig. 90. They oscillate between γ_PLN = 1.41 and γ_LKR = 1.71, with the smallest values being typical for the European currencies and the largest values being typical for the USD-related ones. These extremes express the difference between the topology of the centralized networks based on the currencies related to EUR and the topology of the decentralized networks based on the currencies related to USD. The values of γ which we obtained here are close to the values observed in other complex networks [361]. It should be noted, however, that although the power-law function fits the analysed empirical data well, all the fits are based on a small number of data points, typically fewer than 10, which makes our results only approximate.
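Such a fit can be sketched as follows (ours): a least-squares fit on log–log axes to the empirical cumulative distribution P(x > k), whose slope for P(k) ∼ k^{−γ} equals 1 − γ. As noted above, with fewer than about 10 distinct degrees the estimate is only approximate.

```python
import numpy as np

def degree_exponent(degrees):
    """Estimate gamma from P(x > k) ~ k^(1 - gamma) by a log-log
    least-squares fit to the empirical cumulative distribution."""
    degrees = np.asarray(degrees, dtype=float)
    k = np.unique(degrees)
    ccdf = np.array([(degrees > kk).mean() for kk in k])
    mask = ccdf > 0                        # drop the trailing zero point
    slope, _ = np.polyfit(np.log(k[mask]), np.log(ccdf[mask]), 1)
    return 1.0 - slope

# toy usage: degree sequence of a small, hypothetical tree of 10 nodes
print(degree_exponent([4, 3, 3, 2, 1, 1, 1, 1, 1, 1]))
```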


Fig. 91. Minimal spanning trees for the exchange rates EUR/X, where X stands for each of 37 independent currencies except for EUR itself, calculated in two-year-long windows. The edge widths are proportional to the corresponding weights ω_ij.


Fig. 92. Minimal spanning trees for the exchange rates USD/X, where X stands for each of 37 independent currencies except for USD itself, calculated in two-year-long windows. The edge widths are proportional to the corresponding weights ω_ij.


Fig. 93. Characteristic path length L of the minimal spanning trees as a function of time. Each MST was calculated for USD/X (upper light line) and EUR/X (lower dark line) in a moving window of length T = 60 trading days (∼3 calendar months), where X stands for each of the 38 independent currencies except for the respective base currency. The dashed lines denote two alternative long-term trends: the approximately linear increase of L(t) over the years 2001–2007 and the horizontal trend of L(t) over the years 2004–2008. Both trends refer to B = EUR.

Table 3
Mean clustering coefficient Γ̄ (calculated for the complete network), characteristic path length L, degree of the most connected node k_max and its betweenness b_max (all three calculated for the MST representations) for the signals divided into two-year-long subintervals. All the parameters were derived for time series of returns representing the exchange rates B/X, where X denotes each of the 37 non-base, independent currencies and B denotes either EUR or USD.

                 Γ̄             L              k_max          b_max
B =         USD    EUR     USD    EUR     USD    EUR     USD    EUR
1999–2000   0.09   0.42    3.81   0.89    6      22      0.62   0.88
2001–2002   0.10   0.40    3.68   1.32    9      21      0.64   0.88
2003–2004   0.16   0.33    2.69   2.01    7      12      0.62   0.83
2005–2006   0.22   0.29    3.68   1.72    9      17      0.73   0.81
2007–2008   0.26   0.26    3.59   3.44    7      13      0.69   0.62
Another point which is worth considering is the stability of the network representations discussed here. It is known from many studies that financial correlations are unstable in time [380,381]. We also already know from Section 3.3 that the strength of the collective eigenvalue λ₁ can change between different periods of time (Fig. 25). It is thus justified to expect that the structure of both the complete networks and the minimal spanning trees may be unstable, too. Figs. 91 and 92 show the MSTs calculated in two-year-long intervals for two base currencies: EUR and USD. Changes of their topology illustrate the evolution of the global currency market during the decade 1999–2008. The observed instability of the trees may originate from the existence of noise in the exchange rates, which catalyses the fluctuation of edge weights in the complete network and, consequently, the reshuffling of some node positions in the respective MST, especially regarding the nodes connected by the low-weight edges. Moreover, the instability may also originate from the market reaction to unpredicted or larger-than-expected events (for instance, the currency crises [188]), from the introduction or abandoning of particular trading strategies, or from long-term trends [184,382]. A careful inspection of the consecutive trees in both Figures permits one to notice the existence of such trends. By comparing the years 2001–2002 and 2003–2004 from the point of view of EUR as the base currency, one can note the substantial decrease of the USD node degree which occurred during these periods and which seems to have persisted since. Table 3 allows us to compare values of different parameters in each of the windows for two choices of the base currency: USD and EUR. The globally oriented quantities, the mean clustering coefficient and the characteristic path length, are accompanied there by two local measures: the degree k_max and the betweenness b_max of the maximally linked node. A trend is best visible in the evolution of Γ̄: for B = USD, this coefficient increased monotonically (which means an increase of the mean correlation strength), while for B = EUR, the value of Γ̄ monotonically decreased. The opposite is seen for the characteristic path length: the MST for EUR developed longer and longer branches, while no significant effect in L is observed for the USD-based MST. The betweenness b_max of the maximally connected node evidently decreases in the case of B = EUR (the observed node is USD) and increases in the case of B = USD (the observed node is EUR). This regularity can also be observed if we increase the temporal resolution of our analysis. Fig. 93 exhibits the characteristic path length as a function of time for two base currencies: USD and EUR. For both of them, L(t) was calculated at different positions of a window of length T = 60 trading days (about 3 calendar months) moved along the whole 10-year-long signals. The fine-scale behaviour of this quantity may be considered ambiguous to some extent, but the Figure shows that a substantial growth of L(t) (along a linear trend) was observed for B = EUR at least between the years 2001 and 2007, accompanied by a slight decrease of this quantity for B = USD.


An alternative interpretation of the same plot is that the changes of L(t) took place during the period 2002–2004, and since then this quantity has only fluctuated around its average value (Fig. 93). The increased temporal resolution allowed us to observe the coexistence of processes taking place on different time scales. Basically, there are two such time scales visible in Fig. 93: the fast-variable component, which seems to be of stochastic origin (at least from the perspective of the applied resolution), and the slowly variable component, which manifests itself in the existence of trends and which therefore must be of macroeconomic origin [378,379]. All these outcomes support the earlier results from Section 3.3 obtained there by the correlation matrix approach. The main conclusion that can be drawn from both analyses is that, as time passes, the euro gains more and more significance as a reference point for other currencies at the expense of the US dollar. The origin of this phenomenon is not fully understood yet, but it seems that over the last decade fewer and fewer currencies were strongly correlated with USD. The USD value expressed in other major currencies has fallen significantly since the beginning of the 21st century. Moreover, its value has fallen even when expressed in many non-major (but USD-independent) currencies. This might lead to a decoupling of the USD evolution from the evolution of the remaining part of the currency market. At the level of the network representation of the market, a consequence of such decoupling could be a decrease of the centrality value for the node representing USD. One also cannot exclude other, more subtle effects which may play a role in this context, connected with, for example, the changes of commodity prices (oil, gold, etc.), which may lead to a decorrelation of the US dollar from the currencies of some countries whose economies are commodity-dependent.

9. Summary and outlook

This review, even though already relatively long, by no means exhausts the list of the complexity issues that can be addressed from the physics perspective. Its content and composition are unavoidably shaped by the present authors' own related activity, experience and preferences. Still, the three natural complex systems quantitatively elaborated above – the financial markets, the human brain and the natural language – no doubt constitute the most representative complex systems that exist in the Universe. These are also the systems whose study becomes more and more appreciated by the physics community, and in recent years this kind of study has systematically gained ground in the physics literature, so far roughly in the relative proportions reflected above. There are many good reasons for this. The principal one is that physics methodology proves useful in capturing the operation of natural systems and in identifying and systematizing the properties that are similar or even common to all of them. Among those, the scale-free – fractal or, more generally, multifractal – properties as emergent effects attract particular attention. That such properties are emergent is seen particularly transparently in the linguistic examples presented in Section 6.3.
Here, the linguistic complexity primarily manifests itself through the logic of mutual dressing among words belonging to different parts of speech, such that the overall proportions emerge scale-free even though, in the majority of these parts taken separately, the proportions do not follow such a kind of organization. As the material presented here highlights, complex systems research is a highly interdisciplinary field of science with the potential of integrating such diverse disciplines as physics, information theory, chemistry, biology, linguistics, economics, the social sciences and many others. Equivalently, the notion of ‘‘a complex system'' opens a natural frame comprising a broad class of systems which, traditionally, are subjects of interest of different fields of science and share the property of being hard to describe in a reductionistic manner, but which at the same time happen to display orderly patterns that can even be qualified as simple. This may point to the existence of a certain common underlying rule that governs the evolution of such systems, which in computational terminology corresponds to a ‘‘simple program'' [383]. The related potential of integrating seemingly disconnected scientific disciplines is at the heart of physics, whose primary objective is unification. Somewhat in the same spirit, and at the same time reducing the possibility expressed in ‘‘Stigler's law of eponymy'' [384] that ‘‘no scientific discovery is named after its original discoverer'', in the arena of complexity research the quest towards the identification of the first discoverers of those effects that are most characteristic of complexity finds evidence even in the terminology. At present, it is the Bible that appears to be an original source of inspiration [385]. Such terms as the Joseph Effect and the Noah Effect [80,386], meaning, respectively, the persistence of trends and their sudden, discontinuous changes, are entering into common use. Another is the Matthew Effect [387], the Biblical forerunner of the preferential attachment rule [70] that generates the scale-free patterns in networks. Finally, it should be stressed that applying various physical methods to study complex systems which are not a subject of interest of traditional physics is by no means a unidirectional transfer of knowledge from physics to other fields of science. This is a mutual influence. The physical methods which are now routinely used in complex systems research are further developed there and then – in an improved and more sophisticated form – return to physics, allowing one to perform more sublime analyses of the ‘‘physical systems''. Even more, advances in the science of complex systems open new problems and formulate new questions which create a demand to pursue the development of physical theories in specific directions. This, in turn, can contribute to further advances in physics itself.

Acknowledgements

We would like to express our sincere thanks to all our friends, collaborators and supporters, who through a multitude of various forms of inspiring and productive exchanges over many years influenced and shaped our view on the issue of complexity and thus on the composition of the material presented in the present review.


In the first place, listed in alphabetical order, they are Andrzej Z. Górski, Frank Grümmer, Janusz Hołyst, Andreas A. Ioannides, Stanisław Jadach, Marek Jeżabek, Krzysztof Kułakowski, Ryszard Kutner, Lichan Liu, Czesław Mesjasz, Jacek Okołowicz, Paweł Oświęcimka, Marek Płoszajczak, Rafał Rak, Ingrid Rotter, Franz Ruf, Josef Speth, Tomasz Srokowski, Anthony W. Thomas, and Jochen Wambach.

References

[1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] [32] [33] [34] [35] [36] [37] [38] [39] [40] [41] [42] [43] [44] [45] [46] [47] [48] [49] [50] [51] [52] [53] [54] [55] [56] [57] [58] [59] [60] [61]

E. Nagel, The Structure of Science: Problems in the Logic of Scientific Explanation, Routledge, 1961. E. Kałuszyńska, Found. Sci. 1 (1998) 133–150. J. Ellis, Nature 323 (1986) 595–598. J. Gribbin, The Search for Superstrings, Symmetry, and the Theory of Everything, Little, Brown and Company, 1999. M.C. Gutzwiller, Rev. Mod. Phys. 70 (1998) 589–639. S. Drożdż, S. Nishizaki, J. Speth, J. Wambach, Phys. Rep. 197 (1990) 1–65. G.F.R. Ellis, Nature 435 (2005) 743. S. Weinberg, Reductionism redux, The New York Review of Books 42 (15) (1995) 39–42. Reprinted in: S. Weinberg, Facing Up: Science And Its Cultural Adversaries, Harvard Univ. Press, 2001. F. Dyson, The scientist as rebel, The New York Review of Books 42 (9) (1995) 31–33. Reprinted in: F. Dyson, The Scientist as Rebel, Random House Inc., 2006. R.B. Laughlin, D. Pines, Proc. Nat. Acad. Sci. USA 97 (1999) 28–31. Aristotle, Metaphysics, Book VIII (4th century B.C.). P.W. Anderson, Science 177 (1972) 393–396. A.V. Getling, Rayleigh–Bénard Convection: Structures and Dynamics, World Scientific, 1997. U. Frisch, Turbulence: The Legacy of A.N. Kolmogorov, Cambridge Univ. Press, 1996. H.E. Stanley, Introduction to Phase Transitions and Critical Phenomena, Oxford Univ. Press, 1987. G.P. Shpenkov, Friction Surface Phenomena, Elsevier Science, 1995. J.C. Santamarina, H.S. Shin, Friction in granular media, in: Y.H. Hatzor, J. Sulem, I. Vardoulakis (Eds.), Meso-scale Shear Physics in Earthquake and Landslide Mechanics, CRC Press, 2009, pp. 157–188. B. Mandelbrot, Science 156 (1967) 636–638. B.T. Werner, Science 284 (1999) 102–104. G.F.S. Wiggs, Prog. Phys. Geog. 25 (2001) 53–79. R.R. Sinden, DNA Structure and Function, Academic Press, 1994. J.E. Baldwin, H. Krebs, Nature 291 (1984) 381–382. A.T. Winfree, The Geometry of Biological time, 2nd ed., Springer, 2001. R.R. Klevecz, J. Bolen, G. Forrest, D.B. Murray, Proc. Natl. Acad. Sci. USA 101 (2004) 1200–1205. E. Szathmáry, J.M. Smith, Nature 374 (1995) 227–232. P. Schuster, Complexity 2 (1996) 22–30. J. Hofbauer, K. Sigmund, Evolutionary Games and Population Dynamics, Cambridge Univ. Press, 1998. R.E. Michod, Darwinian Dynamics, Princeton Univ. Press, 2000. J.A.S. Kelso, Dynamic Patterns: The Self-Organization of Brain and Behavior, MIT Press, 1995. N. Chomsky, Rules and Representations, Columbia Univ. Press, 1980. M.H. Christiansen, N. Chatera, Behav. Brain Sci. 31 (2008) 489–509. A. Yasutomi, Physica D 82 (1995) 180–194. J. Duffy, J. Ochs, Amer. Econ. Rev. 89 (1999) 847–877. P. Howitt, R. Clower, J. Econ. Behav. Organ. 41 (2000) 55–84. M.H.I. Dore, The Macrodynamics of Business Cycles: A Comparative Evaluation, WileyBlackwell, 1993. A.C.-L. Chian, Complex Systems Approach to Economic Dynamics, in: Lecture Notes in Economics and Mathematical Systems, vol. 592, Springer, 2007. R. Axelrod, The Complexity of Cooperation: Agent-Based Models of Competition and Collaboration, Princeton Univ. Press, 1997. C. Mesjasz, Acta Phys. Pol. A 117 (2010) 706–715. K. Mainzer, Thinking in Complexity, Springer-Verlag, 1994. S.H. Strogatz, Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering, Westview Press, 2001. M.E. Fisher, Rev. Mod. Phys. 70 (1998) 653–681. D.L. Turcotte, Rep. Prog. Phys. 62 (1999) 1377–1429. H. Haken, Synergetics. An Introduction. Nonequilibrium Phase Transitions and Self-Organization in Physics, Springer-Verlag, 1977. B.M. Ayyub, G.J. Klir, Uncertainty Modeling and Analysis in Engineering and the Sciences, Chapman & Hall/CRC, 2006. S. 
Lloyd, IEEE Control Syst. Mag. 21 (4) (2001) 7–8. A.N. Kolmogorov, Dokl. Akad. Nauk SSSR 30 (1941) 301–305. M. Gell-Mann, Complexity 1 (1995) 16–19. M. Gell-Mann, S. Lloyd, Effective complexity, in: M. Gell-Mann, C. Tsallis (Eds.), Nonextensive Entropy — Interdisciplinary Applications, Oxford Univ. Press, 2003, pp. 386–398. C. Bennett, Logical depth and physical complexity, in: R. Herken (Ed.), The Universal Turing Machine — A Half-Century Survey, Oxford Univ. Press, 1988, pp. 227–257. N. Ay, M. Müller, A. Szkoła, preprint arXiv:0810.5663, 2008. C. Bennett, Dissipation, information, computational complexity and the definition of organization, in: D. Pines (Ed.), Emerging Syntheses in Science, Addison-Wesley, 1987, pp. 215–231. S. Wolfram, Physica D 10 (1984) 1–35. N. Margolus, Physica D 10 (1984) 81–95. C. Shannon, Bell Syst. Tech. J. 27 (1948) 379–423. 623–656. S. Lloyd, H. Pagels, Ann. Phys. 188 (1988) 186–213. E. Schrödinger, What is life? Cambridge Univ. Press, 1944. Y. Bar-Yam, Complexity 7 (2004) 47–63. R. Metzler, Y. Bar-Yam, Phys. Rev. E 71 (2005) 046114. A.M. Fraser, H.L. Swinney, Phys. Rev. A 33 (1986) 1134–1140. C. Bennett, How to define complexity in physics, and why, in: Complexity, Entropy and the Physics of Information, in: W.H. Zurek (Ed.), SFI Studies in the Sciences of Complexity, vol. VIII, Addison-Wesley, 1990, p. 148. I. Prigogine, From Being to Becoming. Time and Complexity in the Physical Sciences, W.H. Freeman & Co., 1980.

[62] [63] [64] [65] [66] [67] [68] [69] [70] [71] [72] [73] [74] [75] [76] [77] [78] [79] [80] [81] [82] [83] [84] [85] [86] [87] [88] [89] [90] [91] [92] [93] [94] [95] [96] [97] [98] [99] [100] [101] [102] [103] [104] [105] [106] [107] [108] [109] [110] [111] [112] [113] [114] [115] [116] [117] [118] [119] [120] [121] [122] [123] [124] [125] [126] [127] [128] [129] [130] [131] [132] [133] [134] [135] [136] [137]

I. Prigogine, Etude thermodynamics des phenomenes irreversibles, l'Université Libre de Bruxelles, 1945. A.M. Turing, Phil. Trans. R. Soc. Lond. B 237 (1952) 37–72. A.-L. Barabási, H.E. Stanley, Fractal Concepts in Surface Growth, Cambridge Univ. Press, 1995. B. Gutenberg, R.F. Richter, Bull. Seism. Soc. Am. 34 (1944) 185–188. G.B. West, J.H. Brown, B.J. Enquist, Science 276 (1997) 122–126. J.H. Brown, J.F. Gillooly, A.P. Allen, V.M. Savage, G.B. West, Ecology 85 (2004) 1771–1789. W. Souma, Physics of personal income, in: H. Takayasu (Ed.), Empirical Science of Financial Fluctuations, Springer-Verlag, 2002, pp. 343–352. B.D. Malamud, D.L. Turcotte, F. Guzzetti, P. Reichenbach, Earth Surf. Process. Landforms 29 (2004) 687–711. A.-L. Barabási, R. Albert, Science 286 (1999) 509–512. D. Sornette, Critical Phenomena In Natural Sciences, Springer-Verlag, 2006. T.A.Jr. Witten, L.M. Sander, Phys. Rev. Lett. 47 (1981) 1400–1403. G.U. Yule, Phil. Trans. R. Soc. Lon. B 213 (1925) 21–87. M.E.J. Newman, Contemp. Phys. 46 (2005) 323–351. H.E. Stanley, Rev. Mod. Phys. 71 (1999) S358–S366. P. Bak, C. Tang, K. Weisenfeld, Phys. Rev. E 38 (1988) 364–374. J.B. Johnson, Phys. Rev. 26 (1925) 71–85. F.N. Hooge, Phys. Lett. A 29 (1969) 139–140. R.F. Voss, J. Clarke, Phys. Rev. Lett. 36 (1976) 42–44. B.B. Mandelbrot, J.R. Wallis, Water Resour. Res. 5 (1968) 909–918. B.A. Taft, Deep Sea Res. 21 (1974) 403–430. W.H. Press, Comments Astrophys. 7 (1978) 103–119. D.L. Gilden, T. Thornton, M.W. Mallon, Science 267 (1995) 1837–1839. Y. Chen, M. Ding, J.A.S. Kelso, Phys. Rev. Lett. 79 (1997) 4501–4504. V.A. Billock, G.C.de. Guzman, J.A.S. Kelso, Physica D 148 (2001) 136–146. M. Kobayashi, T. Musha, IEEE Trans. Biol. Eng. 29 (1982) 456–457. B. Pilgram, D.T. Kaplan, Am. J. Physiol. 276 (1999) R1–R9. B. Kulessa, T. Srokowski, S. Drożdż, Acta Phys. Pol. B 34 (2003) 3–15. R.F. Voss, J. Clarke, Nature 258 (1975) 317–318. P. Dutta, P.M. Horn, Rev. Mod. Phys. 53 (1981) 497–516. M.B. Weissman, Rev. Mod. Phys. 60 (1988) 537–571. L.P. Kadanoff, S.R. Nagel, L. Wu, S.-M. Zhou, Phys. Rev. A 39 (1989) 6524–6537. H.M. Jaeger, C.-H. Liu, S.R. Nagel, Phys. Rev. Lett. 62 (1989) 40–43. G.A. Held, D.H. Solina II, D.T. Keane, W.J. Haag, P.M. Horn, G. Grinstein, Phys. Rev. Lett. 65 (1993) 1120–1123. P. Bak, K. Sneppen, Phys. Rev. Lett. 71 (1993) 4083–4086. S.J. Gould, N. Eldredge, Paleobiology 3 (1977) 115–151. Y. Huang, H. Saleur, C.G. Sammis, D. Sornette, Europhys. Lett. 41 (1999) 43–48. D. Sornette, Proc. Natl. Acad. Sci. USA 99 (2002) 2522–2529. J.M. Carlson, J. Doyle, Phys. Rev. E 60 (1999) 1412–1427. J.M. Carlson, J. Doyle, Proc. Natl. Acad. Sci. USA 99 (2002) 2538–2545. J. Lisman, Trends Neurosci. 20 (1997) 38–43. M. Abeles, Y. Prut, H. Bergman, E. Vaadia, Progress in Brain Research 102 (1994) 395–404. J. Güémez, M.A. Matías, Physica D 96 (1996) 334–343. P. Érdi, Complexity Explained, Springer-Verlag, 2007. G. Tononi, G. Edelman, O. Sporns, Trends Cogn. Sci. 2 (1998) 474–484. K.J. Friston, Neuroimage 5 (1997) 164–171. S.A. Kauffman, The Origins of Order: Self-organization and Selection in Evolution, Oxford Univ. Press, 1993. W. Freeman, R. Kozma, P. Werbos, BioSystems 59 (2001) 109–123. I. Tsuda, G. Barna, How can chaos be a cognitive processor? in: M. Yamaguti (Ed.), Towards the Harnessing of Chaos, Elsevier Science, 1994, pp. 47–61. D.R. Chialvo, Nature Physics 6 (2010) 744–750. D. Stassinopoulos, P. Bak, Phys. Rev. E 51 (1995) 5033–5039. D.R.
Chialvo, Physica A 340 (2004) 756–765. V.M. Eguíluz, D.R. Chialvo, G.A. Cecchi, M. Baliki, A.V. Apkarian, Phys. Rev. Lett. 94 (2005) 018102. K. Kubota, Ann. Nucl. Med. 15 (2001) 471–486. P.M. Matthews, G.D. Honey, E.T. Bullmore, Nat. Rev. Neurosci. 7 (2006) 732–744. R.C. deCharms, Nature Reviews Neuroscience 9 (2008) 720–729. M. Hämäläinen, R. Hari, R.J. Ilmoniemi, J. Knuutila, O.V. Lounasmaa, Rev. Mod. Phys. 65 (1993) 413–497. C. Del Gratta, V. Pizzella, F. Tecchio, G.L. Romani, Rep. Prog. Phys. 64 (2001) 1759–1814. A.A. Ioannides, The Neuroscientist 12 (2006) 524–544. C. Beckner, R. Blythe, J. Bybee, M.H. Christiansen, W. Croft, N.C. Ellis, J. Holland, J. Ke, D. Larsen-Freeman, T. Schoenemann, Language Learning 59 (Supl. 1) (2009). S. Pinker, P. Bloom, Behav. Brain Sci. 13 (1990) 707–727. E. van Everbroeck, Linguistic Typology 7 (2003) 1–50. M.A. Nowak, D.C. Krakauer, Proc. Natl. Acad. Sci. USA 96 (1999) 8028–8033. M.A. Nowak, J.B. Plotkin, V.A.A. Jansen, Nature 404 (2000) 495–498. W.B. Arthur, Science 284 (1999) 107–109. J.D. Farmer, Ind. Corp. Change 11 (2002) 895–953. Z. Burda, J. Jurkiewicz, M.A. Nowak, Acta Phys. Pol. B 34 (2003) 87–131. J.D. Farmer, J. Geanakoplos, Complexity 14 (2009) 11–38. E. Fama, J. Business 38 (1965) 34–105. D. Sornette, Phys. Rep. 378 (2003) 1–98. J.-P. Bouchaud, J.D. Farmer, F. Lillo, How markets slowely digest changes in demand and supply, in: T. Hens, K. Schenk-Hoppe (Eds.), Handbook of Financial Markets: Dynamics and Evolution, Elsevier: Academic Press, 2008. M. Bartolozzi, D.B. Leinweber, A.W. Thomas, Physica A 350 (2005) 451–465. J.-P. Bouchaud, R. Cont, Eur. Phys. J. B 6 (1998) 543–550. P. Gopikrishnan, V. Plerou, X. Gabaix, H.E. Stanley, Phys. Rev. E 62 (2000) R4493–R4496. V. Plerou, P. Gopikrishnan, L.A.N. Amaral, X. Gabaix, H.E. Stanley, Phys. Rev. E 62 (2000) R3023–R3026. F. Lillo, J.D. Farmer, Stud. Nonlinear Dyn. Econom. 8 (2004) 1–33. F. Lillo, J.D. Farmer, Fluct. Noise Lett. 5 (2005) L209–L216.

[138] [139] [140] [141] [142] [143] [144] [145] [146] [147] [148] [149] [150] [151] [152] [153] [154] [155] [156] [157] [158] [159] [160] [161] [162] [163] [164] [165] [166] [167] [168] [169] [170] [171] [172] [173] [174] [175] [176] [177] [178] [179] [180] [181] [182] [183] [184] [185] [186] [187] [188] [189] [190] [191] [192] [193] [194] [195] [196] [197] [198] [199] [200] [201] [202] [203] [204] [205] [206] [207] [208] [209] [210] [211] [212] [213]

B. Tóth, J. Kertész, J.D. Farmer, Eur. Phys. J. B 71 (2009) 499–510. R.T. Baillie, C.-F. Chung, M.A. Tieslau, J. Appl. Econom. 11 (1995) 23–40. F.X. Diebold, G.D. Rudebusch, J. Monetary Econ. 24 (1989) 189–209. M.H.R. Stanley, L.A.N. Amaral, S.V. Buldyrev, S. Havlin, H. Leschhorn, P. Maass, M.A. Salinger, H.E. Stanley, Nature 379 (1996) 804–806. Y. Liu, P. Gopikrishnan, P. Cizeau, M. Meyer, C.-K. Peng, H.E. Stanley, Phys. Rev. E 60 (1999) 1390–1400. K. Matia, L.A.N. Amaral, S.P. Goodwin, H.E. Stanley, Phys.Rev. E 66 (2002) 045103(R). V. Plerou, P. Gopikrishnan, L.A.N. Amaral, M. Meyer, H.E. Stanley, Phys. Rev. E 60 (1999) 6519–6529. P. Gopikrishnan, V. Plerou, L.A.N. Amaral, M. Meyer, H.E. Stanley, Phys. Rev. E 60 (1999) 5305–5316. U.A. Müller, M.M. Dacorogna, R.B. Olsen, O.V. Pictet, M. Schwarz, C. Morgenegg, J. Banking Finance 14 (1990) 1189–1208. Bank For International Settlements ‘‘Triennial Central Bank Survey’’, 2010, http://www.bis.org. International Monetary Fund, http://www.imf.org. World Federation of Exchanges, http://www.world-exchanges.org. M. Schaden, Physica A 316 (2002) 511–538. S. Maslov, Physica A 301 (2001) 397–406. S. Drożdż, F. Grümmer, F. Ruf, J. Speth, Physica A 294 (2001) 226–234. S. Drożdż, J. Kwapień, J. Speth, M. Wójcik, Physica A 314 (2002) 355–361. T.A. Brody, J. Flores, J.B. French, P.A. Mello, A. Pendey, S.S.M. Wong, Rev. Mod. Phys. 53 (1981) 385–479. T. Guhr, A. Müller-Groeling, H.A. Weidenmüller, Phys. Rep. 299 (1998) 189–425. J. Wishart, Biometrica 20 (1928) 32. V.A. Mar˘cenko, L.A. Pastur, Math. USSR Sb. 1 (1967) 457–483. A.M. Sengupta, P.P. Mitra, Phys. Rev. E 60 (1999) 003389. C.A. Tracy, H. Widom, The distribution of the largest eigenvalue in the Gaussian ensembles, in: J.F. van Diejen, L. Vinet (Eds.), Calogero–Moser– Sutherland Models, in: CRM Series in Mathematical Physics, vol. 4, Springer-Verlag, 2000, pp. 461–472. I. Johnstone, Ann. Stat. 29 (2001) 296–327. K.E. Bassler, P.J. Forrester, N.E. Frankel, J. Math. Phys. 50 (2009) 033302. J. Baik, G. Ben arous, S. Péché, Ann. Prob. 33 (2005) 1643–1697. Trades and Quotes Database, http://www.nyxdata.com. Karlsruher Kapitalmarktdatenbank Karlsruhe University, http://fmi.fbv.uni-karlsruhe.de. L. Laloux, P. Cizeau, J.-P. Bouchaud, M. Potters, Phys. Rev. Lett. 83 (1999) 1467–1470. V. Plerou, P. Gopikrishnan, B. Rosenow, L.A.N. Amaral, H.E. Stanley, Phys. Rev. Lett. 83 (1999) 1471–1474. V. Plerou, P. Gopikrishnan, B. Rosenow, L.A.N. Amaral, T. Gühr, H.E. Stanley, Phys. Rev. E 65 (2002) 066126. A. Utsugi, K. Ino, M. Oshikawa, Phys.Rev. E 70 (2004) 026110. J. Kwapień, S. Drożdż, P. Oświe¸cimka, Acta Phys. Pol. B 36 (2005) 2423–2434. Y. Malevergne, D. Sornette, Physica A 331 (2004) 660–668. J.D. Noh, Phys. Rev. E 61 (2000) 5981–5982. C. Tumminello, F. Lillo, R.N. Mantegna, Europhys. Lett. 78 (2007) 30006. H. Markowitz, J. Finance 7 (1952) 77–91. J. Kwapień, S. Drożdż, P. Oświe¸cimka, Physica A 359 (2006) 589–606. Z. Burda, J. Jurkiewicz, Physica A 344 (2004) 67–72. Z. Burda, A. Görlich, J. Jurkiewicz, B. Wacław, Eur. Phys. J. B 49 (2006) 319–323. G. Bonanno, F. Lillo, R.N. Mantegna, Quant. Finance 1 (2001) 96–104. J. Kwapień, S. Drożdż, J. Speth, Physica A 337 (2004) 231–242. T. Conlon, H.J. Ruskin, M. Crane, Adv. Compl. Sys. 12 (2009) 439–454. T.W. Epps, J. Am. Stat. Assoc. 74 (1979) 291–298. B. Tóth, J. Kertész, Quant. Finance 9 (2009) 793–802. B. Tóth, J. Kertész, Physica A 360 (2005) 505–515. S. Drożdż, A.Z. Górski, J. Kwapień, Eur. Phys. J. B 58 (2007) 499–502. J. Kwapień, S. Gworek, S. Drożdż, A.Z. 
Górski, J. Econ. Interact. Coord. 4 (2009) 55–72. D.-H. Kim, H. Jeong, Phys. Rev. E 72 (2005) 046133. M. Kaciczak, Correlations on the world financial markets, Master Thesis, Supervisor: J. Kwapień, Univ. of Science and Technology, Kraków 2006, (in Polish). T. Mizuno, H. Takayasu, M. Takayasu, Physica A 364 (2006) 336–342. M.J. Naylor, L.C. Rose, B.J. Moyle, Physica A 382 (2007) 199–208. A.Z. Górski, S. Drożdż, J. Kwapień, Eur. Phys. J. B 66 (2008) 91–96. J. Kwapień, S. Drożdż, A.A. Ioannides, Phys. Rev. E 62 (2000) 5557–5564. S. Drożdż, J. Kwapień, F. Grümmer, F. Ruf, J. Speth, Physica A 299 (2001) 144–153. J. Kwapień, S. Drożdż, F. Grümmer, F. Ruf, J. Septh, Physica A 309 (2002) 171–182. L.C. Liu, A.A. Ioannides, M. Streit, Brain Topogr. 11 (1999) 291–303. A.A. Ioannides, L.C. Liu, J. Kwapień, S. Drożdż, M. Streit, Hum. Brain Mapp. 11 (2000) 77–92. A.A. Ioannides, M.J. Liu, L.C. Liu, P.D. Bamidis, E. Hellstrand, K.M. Stephan, Int. J. Psychophysiol. 20 (1995) 161–175. J.V. Odom, M. Bach, C. Barber, M. Brigell, M.F. Marmor, A.P. Tormene, G.E. Holder, Vaegan, Doc. Ophthalmol. 108 (2004) 115–123. L.C. Liu, A.A. Ioannides, H.W. Müller-Gärtner, Neuroimage 8 (1998) 149–162. S. Drożdż, J. Kwapień, A.A. Ioannides, Acta Phys. Pol. B 42 (2011) 987–999. C. Biely, S. Thurner, Quant. Finance 8 (2008) 705–722. E. Kanzieper, N. Singh, J. Math. Phys. 51 (2010) 103510. J. Ginibre, J. Math. Phys. 6 (1965) 440–449. A. Edelman, E. Kostlan, M. Shub, J. Amer. Math. Soc. 7 (1994) 247–267. A. Edelman, J. Multivariate Anal. 60 (1997) 203–232. H.-J. Sommers, W. Wieczorek, J. Phys. A: Math. Theor. 41 (2008) 405003. H.-J. Sommers, A. Crisanti, H. Sompolinsky, Y. Stein, Phys. Rev. Lett. 60 (1988) 1895–1898. P.J. Forrester, T. Nagao, Phys. Rev. Lett. 99 (2007) 050603. J.S. Morris, C.D. Frith, D.I. Perrett, D. Rowland, A.W. Young, A.J. Calder, R.J. Dolan, Nature 383 (1996) 812–815. W.G. Walter, R. Cooper, V.J. Aldridge, C. Mccallum, J. Cohen, Electroenceph. Clin. Neurophysiol. 17 (1964) 340–344. C.C. Pantev, B. Lütkenhöner, M.M. Hoke, K. Lehnertz, Audiology 25 (1986) 54–61. N. Nakasato, S. Fujita, K. Seki, T. Kawamura, A. Matani, I. Tamura, S. Fujiwara, T. Yoshimoto, Electroen. Clin. Neurophysiol. 94 (1995) 183–190. J. Kwapień, S. Drożdż, L.C. Liu, A.A. Ioannides, Phys. Rev. E 58 (1998) 6359–6367. J. Kwapień, S. Drożdż, A.Z. Górski, P. Oświe¸cimka, Acta Phys. Pol. B 37 (2006) 3039–3048. M. Stephanov, Phys. Rev. Lett. 76 (1996) 4472–4475.

[214] [215] [216] [217] [218] [219] [220] [221] [222] [223] [224] [225] [226] [227] [228] [229] [230] [231] [232] [233] [234] [235] [236] [237] [238] [239] [240] [241] [242] [243] [244] [245] [246] [247] [248] [249] [250] [251] [252] [253] [254] [255] [256] [257] [258] [259] [260] [261] [262] [263] [264] [265] [266] [267] [268] [269] [270] [271] [272] [273] [274] [275] [276] [277] [278] [279] [280] [281] [282] [283] [284] [285] [286] [287] [288] [289] [290]

R.A. Janik, M.A. Nowak, G. Papp, I. Zahed, Nucl. Phys. B 501 (1997) 603–642. G. Akemann, T. Wettig, Phys. Rev. Lett. 92 (2004) 102002. S. Drożdż, A. Trellakis, J. Wambach, Phys. Rev. Lett. 76 (1996) 4891–4894. Y. Fyodorov, B. Koruzhenko, H.-J. Sommers, Phys. Rev. Lett. 79 (1997) 557–560. S. Drożdż, J. Okołowicz, M. Płoszajczak, I. Rotter, Phys. Rev. C 62 (2000) 024313. M. Timme, F. Wolf, T. Geisel, Phys. Rev. Lett. 92 (2004) 074101. J.-P. Bouchaud, M. Potters, Theory of Financial Risk and Derivative Pricing: From Statistical Physics to Risk Management, Cambridge Univ. Press, 2000. W. Paul, J. Baschnagel, Stochastic Processes. From Physics to Finance, Springer-Verlag, 1999. R.N. Mantegna, H.E. Stanley, Phys. Rev. Lett. 73 (1994) 2946–2949. I. Koponen, Phys. Rev. E 52 (1995) 1197–1199. J. Laherrère, D. Sornette, Eur. Phys. J. B 2 (1998) 525–539. L. Bachelier, Ann. Sci. l'École Norm. Sup. 3 (1900) 21–86. F. Black, M. Scholes, J. Polit. Econ. 81 (1973) 637–654. B. Mandelbrot, J. Business 36 (1963) 394–419. R.N. Mantegna, H.E. Stanley, Nature 376 (1995) 46–49. T. Lux, Appl. Financ. Econom. 6 (1996) 463–475. R. Rak, S. Drożdż, J. Kwapień, Physica A 374 (2007) 315–324. R.K. Pan, S. Sinha, Europhys. Lett. 77 (2007) 58004. F. Schmitt, D. Schertzer, S. Lovejoy, Appl. Stoch. Mod. D. A. 15 (1999) 29–53. J.A. Skjeltorp, Physica A 283 (2000) 486–528. L. Couto miranda, R. Riera, Physica A 297 (2001) 509–520. G. Oh, C.-J. Um, S. Kim, J. Korean Phys. Soc. 48 (2006) S197–S201. T. Lux, M. Marchesi, Nature 397 (1998) 498–500. X. Gabaix, P. Gopikrishnan, V. Plerou, H.E. Stanley, Nature 423 (2003) 267–270. J.D. Farmer, L. Gillemot, F. Lillo, S. Mike, A. Sen, Quant. Finance 4 (2004) 383–397. S. Thurner, J.D. Farmer, J. Geanakoplos, Leverage causes fat tails and clustered volatility, preprint arXiv:0908.1555, 2009. Z. Ding, C.W.J. Granger, R. Engle, J. Empir. Finance 1 (1993) 83–106. J.-P. Bouchaud, Physica A 285 (2000) 18–28. S. Ghashghaie, W. Breymann, J. Peinke, P. Talkner, Y. Dodge, Nature 381 (1996) 767–770. A. Arneodo, J.-P. Bouchaud, R. Cont, J.-F. Muzy, M. Potters, D. Sornette, Comment on ‘‘Turbulent cascades in foreign exchange markets'', preprint arXiv:cond-mat/9607120, 1996. A. Arneodo, J.-F. Muzy, D. Sornette, Eur. Phys. J. B 2 (1998) 277–282. J.-P. Bouchaud, Y. Gefen, M. Potters, M. Wyart, Quant. Finance 4 (2004) 176–190. J. Kwapień, S. Drożdż, J. Speth, Physica A 330 (2003) 605–621. S. Drożdż, F. Grümmer, A.Z. Górski, F. Ruf, J. Speth, Physica A 287 (2000) 440–449. L. Sandoval Jr., I. De Paula Franca, Correlations of financial markets in times of crisis, Physica A 391 (2012) 187–208. S. Drożdż, J. Kwapień, F. Grümmer, J. Speth, Acta Phys. Pol. B 34 (2003) 4293–4306. S. Drożdż, M. Forczek, J. Kwapień, P. Oświe¸cimka, R. Rak, Physica A 383 (2007) 59–64. C. Tsallis, Introduction to Nonextensive Statistical Mechanics, Springer, 2009. C. Tsallis, J. Stat. Phys. 52 (1988) 479–487. C. Tsallis, S.V.F. Levy, A.M.C. Souza, R. Maynard, Phys. Rev. Lett. 75 (1995) 3589–3593. L.G. Moyano, C. Tsallis, M. Gell-Mann, Europhys. Lett. 73 (2006) 813–819. S.M.D. Queiros, C. Tsallis, AIP Conf. Proc. 965 (2007) 21–33. C. Tsallis, C. Anteneodo, L. Borland, R. Osorio, Physica A 324 (2003) 89–100. S. Drożdż, J. Kwapień, P. Oświe¸cimka, R. Rak, New. J. Phys. 12 (2010) 105003. J.-B. Estoup, Gammes sténographiques. Methodes et exercises pour l'acquisition de la vitesse, Institut Sténographique de France, 1916. E.L.
Thorndike, A Teacher’s Word Book of 20,000 Words, Teacher’s College, 1932. G.K. Zipf, Selective Studies and the Principle of Relative Frequency in Language, MIT Press, 1932. G.K. Zipf, Human Behavior and the Principle of Least Effort, Addison-Wesley, 1949. R.L. Axtell, Science 293 (2001) 1818–1820. S. Redner, Eur. Phys. J. B 4 (1998) 131–134. B. Mandelbrot, Trans. I.R.E. 3 (1954) 124–137. A. Cohen, R.N. Mantegna, S. Havlin, Fractals 5 (1997) 95–104. R. Ferrer i Cancho, R.V. Solé, Proc. Natl. Acad. Sci. USA 100 (2003) 788–791. R. Ferrer i Cancho, Eur. Phys. J. B 44 (2005) 249–257. H.A. Simon, Biometrika 42 (1955) 425–440. G.A. Miller, Amer. J. Psychol. 70 (1957) 311–314. W. Li, IEEE Trans. Inf. Theory 38 (1992) 1842–1845. R. Ferrer i Cancho, R.V. Solé, Adv. Complex Syst. 5 (2002) 1–6. R. Ferrer i Cancho, O. Riordan, B. Bollobás, Proc. R. Soc. Lond. B 272 (2005) 561–565. M.A. Montemurro, Physica A 300 (2001) 567–578. R. Ferrer i Cancho, R.V. Solé, J. Quant. Ling. 8 (2001) 165–173. J. Kwapień, S. Drożdż, A. Orczyk, Acta Phys. Pol. A 117 (2010) 716–720. J. Joyce, Ulisses, transl. by M. Słomczyński, Wydawnictwo Pomorze, 1992. S. Miyazima, Y. Lee, T. Nagamine, H. Miyajima, Physica A 278 (2000) 272–288. D.H. Zanette, S.C. Manrubia, Physica A 295 (2001) 1–16. H.S. Yamada, K. Iguchi, Physica A 387 (2008) 1628–1636. L.Q. Ha, P. Hanna, J. Ming, F.J. Smith, Artif. Intell. Rev., doi:10.1007/s10462-009-9135-4. T.C. Halsey, M.H. Jensen, L.P. Kadanoff, I. Procaccia, B.I. Shraiman, Phys. Rev. A 33 (1986) 1141–1151. A.-L. Barabási, T. Vicsek, Phys. Rev. A 44 (1991) 2730–2733. H.G.E. Hentschel, I. Procaccia, Physica D 8 (1983) 435–444. C.-K. Peng, S.V. Buldyrev, S. Havlin, M. Simons, H.E. Stanley, A.L. Goldberger, Phys. Rev. E 49 (1994) 1685–1689. J.W. Kantelhardt, S.A. Zschiegner, E. Koscielny-Bunde, A. Bunde, S. Havlin, H.E. Stanley, Physica A 316 (2002) 87–114. J.F. Muzy, E. Bacry, A. Arneodo, Internat. J. Bifur Chaos Appl. Sci. Engrg. 2 (1994) 245–302. A. Arneodo, E. Bacry, J.F. Muzy, Physica A 213 (1995) 232–275. I. Daubechies, Ten Lectures on Wavelets. CBMS-NSF Series in Appl. Math., SIAM, 1992. P. Oświe¸cimka, J. Kwapień, S. Drożdż, Phys. Rev. E 74 (2006) 016103. S. Jaffard, Probab. Theory Related Fields 114 (1999) 207–227.
