Physica A 444 (2016) 271–275


A comparison of LMC and SDL complexity measures on binomial distributions

José Roberto C. Piqueira*

Escola Politécnica da Universidade de São Paulo, National Institute of Technology for Complex Systems, Avenida Prof. Luciano Gualberto, travessa 3, n. 158, 05508-900, São Paulo, SP, Brazil

Highlights

• SDL and LMC complexity measures are applied to a repeated-trials binomial distribution.
• As the number of trials increases, the informational entropy decreases.
• Maximum SDL and LMC complexity measures occur for unbalanced success probabilities of the trials.
• SDL and LMC complexity measures decrease with the number of trials.
• The maximum values of the SDL and LMC measures do not depend on the number of trials.

Article info

Article history:
Received 15 June 2015
Received in revised form 4 October 2015
Available online 23 October 2015

Keywords: Binomial; Complexity; Information; Measure; Probability

Abstract

The concept of complexity has been widely discussed over the last forty years, with contributions coming from all areas of human knowledge, including philosophy, linguistics, history, biology, physics and chemistry, and with mathematicians trying to give it a rigorous formulation. In this context, thermodynamics meets information theory: using the entropy definition, López-Ruiz, Mancini and Calbet proposed a definition of complexity that is referred to as the LMC measure. Shiner, Davison and Landsberg, by slightly changing the LMC definition, proposed the SDL measure, and both LMC and SDL are satisfactory measures of complexity for many problems. Here, the SDL and LMC measures are applied to the case of a binomial probability distribution, in order to clarify how the length of the data set affects complexity and how the success probability of the repeated trials determines how complex the whole set is. © 2015 Elsevier B.V. All rights reserved.

1. Introduction

Life emerged from complex interactions between pieces of matter, generating dynamical processes that need more than reductionist science to be explained [1,2]. In spite of this, the reductionist approach strongly advanced all the natural sciences over the previous two centuries. However, during the second half of the 20th century a new science unification movement started, when the study of dissipative structures in chemical reactions [3] generated the ideas of "deterministic chaos" [4], "self-organization" [5] and "self-organizing criticality" [6], which, together with the development of nonlinear dynamics [4] supported by computational methods, became popular.



Correspondence to: Escola Politécnica da Universidade de São Paulo, National Institute of Technology for Complex Systems, Brazil. E-mail address: [email protected].

http://dx.doi.org/10.1016/j.physa.2015.10.040 0378-4371/© 2015 Elsevier B.V. All rights reserved.



Combined with Edgar Morin's ideas about "complex thinking" [7], these new ways of thinking pervaded several areas of human knowledge, contributing decisively to the modeling of important problems related to the onset of surprising or catastrophic events such as traffic jams, crowd behavior, tsunamis and earthquakes [8].

In this paper, a simple probability distribution is used to show that the size of a system, or of a data set, is not the origin of complex behavior. Uncertainty and interaction, as exhaustively discussed in the literature, combined with nonlinearities, are the main agents of complexity [9].

In the next section, the concepts of informational entropy, disorder, order, SDL complexity and LMC complexity [10,11] are presented. Their application to binomial distributions is then explained, followed by a section with numerical results related to the number of trials and the success probability of each trial. A conclusion section closes the work.

2. Binomial distribution: entropy, SDL and LMC complexity measures

To start the calculations, N independent binomial trials with success probability p are considered. Consequently, the probability function is discrete and given by:

p(i) = \binom{N}{i} p^i (1 - p)^{N - i},    (1)

for i = 0, 1, \ldots, N [12]. With the probability function in hand, it is possible to calculate the informational entropy, E, in bits, by using:

E = -\sum_{i=0}^{N} p(i) \log_2 p(i),    (2)

having its maximum value E_{max} = \log_2(N + 1), attained when the N + 1 possible outcomes are equiprobable [13]. Consequently, the disorder, \Delta, measuring how close the system is to thermodynamic equilibrium [10,11], is calculated by using:

\Delta = \frac{E}{E_{max}}.    (3)
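Equations (1)–(3) can be computed directly. The sketch below is illustrative (the function names are not from the paper), assuming E_max = log2(N + 1) for the N + 1 possible outcomes:

```python
from math import comb, log2

def binomial_pmf(N, p):
    """p(i) for i = 0..N: probability of i successes in N independent trials (Eq. (1))."""
    return [comb(N, i) * p**i * (1 - p)**(N - i) for i in range(N + 1)]

def entropy(probs):
    """Informational entropy E in bits (Eq. (2)); zero-probability terms contribute nothing."""
    return -sum(q * log2(q) for q in probs if q > 0)

def disorder(N, p):
    """Delta = E / Emax, with Emax = log2(N + 1) (Eq. (3))."""
    return entropy(binomial_pmf(N, p)) / log2(N + 1)
```

For a single fair trial (N = 1, p = 0.5) the two outcomes are equiprobable, so E = 1 bit and the disorder is exactly 1.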

As stated in Ref. [10], combining disorder, \Delta, and order, (1 - \Delta), the SDL complexity measure, C_{SDL}, can be defined as:

C_{SDL} = \Delta (1 - \Delta),    (4)

meaning that maximum complexity corresponds to an equal balance between order and disorder.

The LMC complexity measure contains a term called disequilibrium, D, which in the SDL complexity measure is replaced by the order term (1 - \Delta). The disequilibrium D measures the distance between the probability distribution and the equiprobable one, and is defined by:

D = \sum_{i=0}^{N} \left( p(i) - \frac{1}{N + 1} \right)^2.    (5)

Then, the LMC complexity measure is given by:

C_{LMC} = \Delta D.    (6)
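The two measures (4)–(6) can be sketched in a few lines, under the same assumptions as above (E_max = log2(N + 1), equiprobable value 1/(N + 1); the function name is illustrative):

```python
from math import comb, log2

def sdl_lmc(N, p):
    """Return (C_SDL, C_LMC) for a Binomial(N, p) distribution (Eqs. (4)-(6))."""
    probs = [comb(N, i) * p**i * (1 - p)**(N - i) for i in range(N + 1)]
    E = -sum(q * log2(q) for q in probs if q > 0)
    delta = E / log2(N + 1)                        # disorder, Eq. (3)
    D = sum((q - 1 / (N + 1))**2 for q in probs)   # disequilibrium, Eq. (5)
    return delta * (1 - delta), delta * D
```

At p = 0.5 with N = 1 the distribution is equiprobable, so \Delta = 1 and D = 0: both measures vanish, and complexity can only arise for unbalanced p.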

3. Numerical experiments

Based on the theoretical points presented in the former section, some questions can be discussed. The first is how increasing the number of repeated independent binomial trials changes the informational entropy of the experiment. For several values of p, Fig. 1 shows how the informational entropy, E, depends on the total number of trials, N.

Another point to be discussed is how the success probability of each trial, p, changes the informational entropy, E, and the SDL, C_SDL, and LMC, C_LMC, measures, considering the number of trials, N, as a parameter. Fig. 2 shows these functions for numbers of trials equal to 1, 5 and 10.

Observing Fig. 2, it is interesting to examine how the parametrization by the number of trials changes the SDL and LMC complexity measures. Figs. 3 and 4 show the result of the numerical experiment of varying the number of trials, observing the behavior of the SDL complexity (Fig. 3) and the LMC complexity (Fig. 4).

4. Conclusions

In spite of using only classical works and a very simple example, this work provides several interesting facts about complexity. The first is a consequence of Fig. 1: as the number of independent trials increases, the informational entropy of the whole set of results decreases, indicating that the computational complexity [14] decreases as the length of the data set increases.
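The parameter sweeps of this section can be reproduced with a short script reusing the formulas of Section 2 (a sketch under the assumption E_max = log2(N + 1); the helper name is illustrative):

```python
from math import comb, log2

def measures(N, p):
    """Return (E, C_SDL, C_LMC) for a Binomial(N, p) distribution."""
    probs = [comb(N, i) * p**i * (1 - p)**(N - i) for i in range(N + 1)]
    E = -sum(q * log2(q) for q in probs if q > 0)
    delta = E / log2(N + 1)
    D = sum((q - 1 / (N + 1))**2 for q in probs)
    return E, delta * (1 - delta), delta * D

# Sweep the success probability p for the numbers of trials used in Fig. 2.
grid = [i / 1000 for i in range(1, 1000)]
for N in (1, 5, 10):
    sdl_max = max(measures(N, p)[1] for p in grid)
    lmc_max = max(measures(N, p)[2] for p in grid)
    print(N, round(sdl_max, 3), round(lmc_max, 3))
```

The sweep confirms the observations discussed below: both measures vanish at p = 0.5 for N = 1, and the SDL maximum of 0.25 occurs at unbalanced p.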


Fig. 1. Binomial distribution informational entropy × number of trials.

(a) Informational entropy and SDL and LMC complexities (n = 1).

(b) Informational entropy and SDL and LMC complexities (n = 5).

(c) Informational entropy and SDL and LMC complexities (n = 10). Fig. 2. Entropy and complexity × Probability.

The observation of Fig. 2(a) confirms that the maximum entropy per binary symbol occurs when the symbols are equiprobable, and its value is 1 bit per symbol. Consequently, at this point the SDL and LMC complexities are zero, and their maximum values occur for unbalanced probabilities, being equal to 0.25 for SDL and about 0.15 for LMC.

Fig. 2(b) and (c) show that increasing the number of trials flattens the entropy and complexity curves, maintaining the minimum value of complexity at p = 0.5 and the symmetry of the curves. The maximum entropy continues to be at p = 0.5, but its value is lower than 1, corresponding to minimum, but not zero, SDL complexity and zero LMC complexity. The maximum values of SDL and LMC continue to occur for unbalanced probabilities; the SDL maximum remains near 0.25, while the LMC maximum slightly increases.

Fig. 3. SDL Complexity × Probability (N as a parameter).

Fig. 4. LMC Complexity × Probability (N as a parameter).

Fig. 3 confirms this observation: increasing the number of trials decreases the SDL measure for any value of probability. The minimum value of the SDL measure occurs at p = 0.5 and its maximum is equal to 0.25; this does not depend on the number of trials, i.e., on the length of the data set.

For the LMC measure, the results are qualitatively the same as the SDL results: increasing the number of trials decreases the complexity measure for any value of p, as shown in Fig. 4, confirming the results of Refs. [11,15]. The minimum value of the LMC measure occurs at p = 0.5, and this does not depend on the number of trials, i.e., on the length of the data set.

Finally, one can say that complexity, measured either by LMC or SDL, does not increase with the number of successive experiments if they are independent. To increase complexity, the trials must be related. Besides, unbalanced probabilities are an important source of complexity.

References

[1] E. Schrödinger, My View of the World, Ox Bow Press, Woodbridge, Connecticut, USA, 1983, reprint.
[2] E. Schrödinger, What is Life?, Cambridge University Press, Cambridge, UK, 2013, 14th reprint.
[3] G. Nicolis, I. Prigogine, Self-Organization in Nonequilibrium Systems, John Wiley & Sons, USA, 1977.
[4] S. Wiggins, Introduction to Nonlinear Dynamical Systems, Springer, USA, 2003.
[5] H. Haken, Information and Self-Organization, Springer-Verlag, Berlin, Germany, 2000.

[6] P. Bak, How Nature Works: The Science of Self-Organised Criticality, Copernicus Press, New York, USA, 1996.
[7] E. Morin, Introducción al Pensamiento Complejo, Gedisa, Spain, 2011.
[8] A. Bunde, J. Kropp, H.J. Schellnhuber, The Science of Disasters, Springer-Verlag, Berlin, Germany, 2002.
[9] P. Érdi, Complexity Explained, Springer-Verlag, Berlin, Germany, 2008.
[10] J. Shiner, M. Davison, P. Landsberg, Simple measure for complexity, Phys. Rev. E 59 (2) (1999) 1459–1464.
[11] R. López-Ruiz, H.L. Mancini, X. Calbet, A statistical measure of complexity, Phys. Lett. A 209 (5–6) (1995) 321–326.
[12] R.B. Ash, Basic Probability Theory, Dover, USA, 2008.
[13] C.E. Shannon, W. Weaver, The Mathematical Theory of Communication, Illini Books Edition, Urbana, Chicago, USA, 1963.
[14] A.N. Kolmogorov, Three approaches to the definition of the concept "quantity of information", Problemy Peredachi Informatsii 1 (1965) 3–11.
[15] X. Calbet, R. López-Ruiz, Tendency towards maximum complexity in a nonequilibrium isolated system, Phys. Rev. E 63 (2001) 066116.
