Multifractal detrended cross-correlation analysis of genome sequences using chaos-game representation


Physica A xx (xxxx) xxx–xxx

Contents lists available at ScienceDirect

Physica A journal homepage: www.elsevier.com/locate/physa


Mayukha Pal a,b, V. Satya Kiran c, P. Madhusudana Rao b, P. Manimaran a,∗

a C R Rao Advanced Institute of Mathematics, Statistics and Computer Science, Gachibowli, Hyderabad-500 046, India
b College of Engineering, Jawaharlal Nehru Technological University, Hyderabad-500085, India
c Institute of Science and Technology, Jawaharlal Nehru Technological University, Kakinada - 533003, India

highlights
• We have analyzed the genome sequences of some prokaryotes using 2D MF-X-DFA for characterizing multifractal cross-correlations.
• The existence of strong multifractal behavior is observed between the genome sequences.
• Cluster analysis was performed on the calculated scaling exponents for finding the class affiliation of genome sequences.

article info
Article history: Received 15 October 2015; received in revised form 24 February 2016; available online xxxx
Keywords: Multifractal detrended cross-correlation analysis; Chaos game representation; Genome sequences; Scaling exponent; Cluster analysis

abstract

We characterized the multifractal nature and power-law cross-correlation between any pair of genome sequences through an integrative approach combining 2D multifractal detrended cross-correlation analysis and chaos game representation. In this paper, we have analyzed the genomes of some prokaryotes and calculated the fractal spectra h(q) and f(α). From our analysis, we observed multifractal nature and power-law cross-correlation behavior between any pair of genome sequences. Cluster analysis was performed on the calculated scaling exponents to identify the class affiliation, which is represented as a dendrogram. We suggest this approach may find applications in next generation sequence analysis, big data analytics etc.

© 2016 Elsevier B.V. All rights reserved.

∗ Correspondence to: C R Rao Advanced Institute of Mathematics, Statistics and Computer Science, University of Hyderabad Campus, Prof. C R Rao Road, Gachibowli, Hyderabad-500046, India. Tel.: +91 40 2301 3118. E-mail address: [email protected] (P. Manimaran).
http://dx.doi.org/10.1016/j.physa.2016.03.074
0378-4371/© 2016 Elsevier B.V. All rights reserved.

1. Introduction

A large number of studies have been carried out on natural phenomena that are expressed as time series and images, in order to characterize their correlation behavior and fractal nature [1,2]. In this context, searching for information in genomic sequences has attracted considerable attention in bioinformatics. The concepts of statistical physics, signal processing, nonlinear dynamics etc. play an important role in inferring information from the dynamics and structure of genomic sequences [3,4]. Various methods have been developed to characterize the correlation and fractal nature of 1D and 2D data sets. Starting from R/S analysis, methods such as the structure function, wavelet transform modulus maxima, detrended fluctuation analysis and its variants, detrended moving average fluctuation analysis and its variants, the average wavelet coefficient method and wavelet based fluctuation analysis have been used to analyze correlation behavior and fractal nature in one and higher dimensions [5–16]. These approaches find many applications in finance, medicine, physics, biology etc. [17–30]. In the biological sciences, many studies have applied these methods to DNA sequences and reported power-law correlations and multifractal behavior [24–29].

Recently, detrended cross-correlation analysis has emerged as a method to analyze the cross-correlation behavior and fractal nature of any two non-stationary time series [30,31], and the same approach was later applied to data sets in higher dimensions [32]. Apart from this, the detrended moving average cross-correlation analysis method and its variants, statistical moment based multifractal cross-correlation analysis, multifractal height cross-correlation analysis and detrended partial cross-correlation analysis have emerged as tools for characterizing data sets in different dimensions [33–36]. These approaches find application in earth sciences, econophysics, engineering and natural sciences [37–45]. In biology, the multifractal detrended cross-correlation analysis (MF-X-DFA) method was applied to coding and non-coding DNA sequences to identify the class affiliation of bacteria and archaea [46]. In 1990, Jeffrey developed the chaos game representation (CGR) method to visualize DNA sequences in an image format [47]. This approach finds applications in investigating patterns in sequence segments, analysis of the secondary structure of genome and protein sequences, phylogenetic analysis, multifractal analysis etc. [48–55]. More recently, by combining chaos game representation with the two-dimensional MF-X-DFA method, a new integrative approach was developed to study the cross-correlation behavior of DNA sequences of unequal length [58]. In this paper, we apply this recently developed integrative approach, combining 2D multifractal detrended cross-correlation analysis and chaos game representation, to characterize the cross-correlation and multifractal nature of whole genomes. Apart from this, our main interest is to establish class affiliations among the analyzed genomes using the above approach.
In Section 2 we describe the CGR and MF-X-DFA procedures, Section 3 presents the results and discussion, and Section 4 gives our conclusions.

2. Methods

In this work, the recently developed approach integrating CGR and 2D MF-X-DFA was applied to genome sequence data [55]. CGR is used to convert a genome sequence into an image (i.e. two-dimensional data) through an iterative mapping technique formulated by H.J. Jeffrey [56]. With this procedure, the CGR images of different genomes have the same size even if the genome sequences are of unequal length. The cross-correlation analysis was carried out on the CGR images using the MF-X-DFA method, developed by W.-X. Zhou to characterize the multifractal nature and cross-correlation behavior between any pair of one-dimensional or two-dimensional data sets [32]. Our earlier paper gives the detailed procedure of the integrative approach used in this study [55]. From the analysis of the CGR images, the fluctuations are extracted and the fluctuation function $F_{xy}(q, s)$ is obtained for various scales $s$. The power-law behavior of the data is obtained from the scaling of the fluctuation function:

$$F_{xy}(q, s) \sim s^{h_{xy}(q)}. \qquad (1)$$
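As an illustration of Eq. (1), the following is a minimal sketch of how a fluctuation function and its scaling exponents might be computed for two equally sized matrices, following the general outline of 2D MF-X-DFA [32]. Several choices here are assumptions of this sketch rather than details stated in the paper: disjoint s × s boxes, a two-dimensional cumulative sum inside each box, a least-squares plane as the local trend, and the absolute value of the detrended covariance so that fractional powers stay real. The function names are hypothetical.

```python
import numpy as np

def detrended_residual(box):
    """Cumulate an s x s box in 2D and subtract its least-squares plane."""
    s = box.shape[0]
    profile = np.cumsum(np.cumsum(box, axis=0), axis=1)
    i, j = np.meshgrid(np.arange(s), np.arange(s), indexing="ij")
    design = np.column_stack([i.ravel(), j.ravel(), np.ones(s * s)])
    coef, *_ = np.linalg.lstsq(design, profile.ravel(), rcond=None)
    return profile.ravel() - design @ coef

def fluctuation_xy(x, y, scales, qs):
    """Return F_xy(q, s) as an array of shape (len(qs), len(scales))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = np.empty((len(qs), len(scales)))
    for col, s in enumerate(scales):
        covs = []
        for r in range(0, x.shape[0] - s + 1, s):
            for c in range(0, x.shape[1] - s + 1, s):
                ex = detrended_residual(x[r:r + s, c:c + s])
                ey = detrended_residual(y[r:r + s, c:c + s])
                # Assumption: use |covariance| so that q/2 powers are real.
                covs.append(abs(np.mean(ex * ey)))
        covs = np.array(covs)
        covs = covs[covs > 0]  # drop empty boxes so negative q stays finite
        for row, q in enumerate(qs):
            if q == 0:
                out[row, col] = np.exp(0.5 * np.mean(np.log(covs)))
            else:
                out[row, col] = np.mean(covs ** (q / 2.0)) ** (1.0 / q)
    return out

def scaling_exponents(F, scales):
    """Slope of log F_xy(q, s) against log s for every q, as in Eq. (1)."""
    log_s = np.log(scales)
    return np.array([np.polyfit(log_s, np.log(row), 1)[0] for row in F])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.random((64, 64))
    b = rng.random((64, 64))
    scales = [4, 8, 16]           # small illustrative scales, not the paper's
    qs = [-2.0, 0.0, 2.0]
    print(scaling_exponents(fluctuation_xy(a, b, scales, qs), scales))
```

Passing the same matrix for both arguments gives the 2D MFDFA special case. Because each exponent is a slope in log-log coordinates, multiplying both inputs by a constant leaves the exponents unchanged.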

If the calculated scaling exponents $h_{xy}(q)$ are independent of $q$, the cross-correlated images possess a monofractal nature; if $h_{xy}(q)$ depends on $q$, the cross-correlation is multifractal. The 2D MF-X-DFA method reduces to 2D MFDFA when the two data sets are the same, i.e. $x = y$. For positive $q$, $h_{xy}(q)$ describes the scaling behavior of the segments with large fluctuations; for negative $q$, it describes the scaling behavior of the segments with small fluctuations. The multifractal behavior of the cross-correlated data sets can also be studied by evaluating the singularity spectrum $f_{xy}(\alpha)$, which is given by the Legendre transform of $\tau_{xy}(q)$:

$$f_{xy}(\alpha) \equiv q\,\alpha_{xy} - \tau_{xy}(q). \qquad (2)$$


Here $\tau_{xy}(q) = q\,h_{xy}(q) - D_f$, where for the 2D CGR images in our study we take $D_f = 2$. The values of $\alpha_{xy}$ are obtained from $\alpha_{xy} = d\tau_{xy}(q)/dq$. The strength of the multifractality can be calculated from the width of the $f_{xy}(\alpha)$ spectrum: the broader the spectrum, the stronger the multifractality, while a narrow spectrum indicates weak multifractal behavior.
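The Legendre transform above can be evaluated numerically once $h_{xy}(q)$ is known on a grid of $q$ values. The sketch below is one possible way to do this, assuming a finite-difference derivative (`numpy.gradient`) for $\alpha_{xy} = d\tau_{xy}/dq$; the function names are hypothetical, and the toy $h(q)$ in the usage lines is invented purely for illustration and is not data from this study.

```python
import numpy as np

def singularity_spectrum(q, h, d_f=2.0):
    """Return (tau, alpha, f_alpha) from generalized exponents h(q)."""
    q = np.asarray(q, dtype=float)
    h = np.asarray(h, dtype=float)
    tau = q * h - d_f                 # tau(q) = q h(q) - D_f
    alpha = np.gradient(tau, q)       # alpha = d tau / d q (finite differences)
    f_alpha = q * alpha - tau         # Eq. (2)
    return tau, alpha, f_alpha

def spectrum_width(alpha):
    """Width of the singularity spectrum, used as multifractal strength."""
    return float(np.max(alpha) - np.min(alpha))

if __name__ == "__main__":
    q = np.linspace(-10, 10, 101)     # q from -10 to +10 in steps of 0.2
    h_toy = 1.9 - 0.02 * q            # invented, q-dependent h(q)
    _, alpha, _ = singularity_spectrum(q, h_toy)
    print(spectrum_width(alpha))
```

For a monofractal, where $h(q)$ is constant, $\alpha$ collapses to that constant and $f(\alpha) = D_f$, so the width is zero. Any $q$ dependence of $h$ spreads $\alpha$ over an interval.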


3. Results and discussion


In this work, we have analyzed the genomic sequences of some prokaryotes to characterize their cross-correlation behavior using an integrative approach combining chaos game representation and multifractal detrended cross-correlation analysis. The genomic sequence data of eight prokaryotes were obtained from the National Center for Biotechnology Information (NCBI) Genome database (http://www.ncbi.nlm.nih.gov/). The details of the genomic sequences are given in Table 1. As an initial step, each genomic sequence was converted into an image using chaos game representation, and the frequency CGR (FCGR) matrix was then obtained for every genome. To obtain the FCGR matrix, we divided the CGR image into a $2^k \times 2^k$ grid, where $k$ is the length of the oligonucleotides considered. The total number of points inside each grid cell is the frequency of the corresponding $k$-length oligonucleotide in the sequence. In this work we consider $k = 10$, so the resulting FCGR matrix is of size 1024 × 1024. The FCGR matrix thus obtained is passed as input to the 2D MF-X-DFA method. It is worth mentioning that the minimum size of CGR image considered for analysis should be 8 × 8 (i.e. $k \geq 3$), in which case the minimum scale $s$ is one quarter of the size of the CGR image. The CGR approach can produce FCGR matrices of the same size for genomes of unequal length. This procedure of pattern making using the FCGR matrix could be used to deduce relationships between genomes [38,39].

Table 1. The list of genome sequences of eight prokaryotes considered in our analysis.

Phylum | Species | Strain | Short name
Firmicutes | Mycoplasma genitalium | Mycoplasma genitalium G37 uid57707 | M. genitalium
Tenericutes | Mycoplasma pneumoniae | Mycoplasma pneumonia 309 uid85495 | M. pneumoniae
Actinobacteria | Streptomyces coelicolor | Streptomyces coelicolor A3 2 uid57801 | S. coelicolor
Actinobacteria | Mycobacterium tuberculosis | Mycobacterium tuberculosis Beijing NITR 203 uid197218 | M. tuberculosis
Proteobacteria | Haemophilus influenzae | Haemophilus influenza 10810 uid86647 | H. influenzae
Proteobacteria | Campylobacter jejuni | Campylobacter jejuni 00 2425 uid219359 | C. jejuni
Proteobacteria | Escherichia coli | Escherichia coli 042_uid161985 | E. coli
Proteobacteria | Helicobacter pylori | Helicobacter pylori J99 uid57789 | H. pylori

Fig. 1. The calculated h(q) values depend on q values, which implies the existence of multifractal cross-correlation behavior for all bivariate time series.

To obtain the fractal characteristics and cross-correlation behavior between any pair of genomes, we made use of the recently developed 2D MF-X-DFA method. Since genome length varies between organisms, the FCGR matrix was used as input to the method for cross-correlation analysis. We analyzed eight bacterial genome data sets belonging to four distinct phyla: Firmicutes (M. genitalium); Proteobacteria (H. pylori, H. influenzae, C. jejuni and E. coli); Tenericutes (M. pneumoniae); and Actinobacteria (S. coelicolor and M. tuberculosis). The cross-correlation analysis was performed between all pairs of genomes. From the results, we observe multifractality in all data sets, i.e. the scaling exponents $h_{xy}(q)$ vary with $q$, as clearly shown in Fig. 1. The strength of multifractality was assessed from the width of the calculated singularity spectrum: the broader the spectrum, the stronger the multifractal behavior, while a narrow spectrum reveals weak multifractality. Fig. 2 depicts the singularity spectra of all bivariate time series, which show strong multifractality. The same analysis was performed with the two series x and y identical (i.e. 2D MFDFA); all eight series show strong multifractal behavior, as is also evident from the broad singularity spectra (Fig. 3). In this study, we use the scales s = 32, 64, 128, 256 and 512 and moment orders q from −10 to +10 with a step size of 0.2.
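The construction of the frequency CGR matrix described in this section (a $2^k \times 2^k$ grid whose cells count $k$-length oligonucleotides) can be sketched as follows. This is a sketch under stated assumptions, not the authors' code: the corner assigned to each nucleotide is a common CGR convention that the paper does not specify, ambiguous symbols such as N are simply skipped, and the first $k - 1$ points are not counted because their grid cell does not yet encode a full $k$-mer. The function names are hypothetical. Since the grid cell of a CGR point depends only on the last $k$ bases, `fcgr` tracks the cell index with integer bit operations instead of floating-point coordinates, which avoids rounding problems on long sequences.

```python
import numpy as np

# Assumed corner layout (a common CGR convention); the paper does not state it.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}

def cgr_points(sequence):
    """Chaos game iteration: start at the centre of the unit square and
    move halfway towards the corner of each successive base."""
    x, y = 0.5, 0.5
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # assumption: ambiguous symbols such as N are skipped
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return np.array(points, dtype=float).reshape(-1, 2)

def fcgr(sequence, k):
    """Count CGR points per cell of a 2^k x 2^k grid (rows = y, columns = x)."""
    n = 1 << k
    counts = np.zeros((n, n), dtype=np.int64)
    ix = iy = 0
    seen = 0
    for base in sequence.upper():
        if base not in CORNERS:
            continue
        cx, cy = CORNERS[base]
        # The newest base supplies the most significant bit of each cell index.
        ix = (ix >> 1) | (cx << (k - 1))
        iy = (iy >> 1) | (cy << (k - 1))
        seen += 1
        if seen >= k:  # earlier points do not yet encode a full k-mer
            counts[iy, ix] += 1
    return counts

if __name__ == "__main__":
    toy = "ACGTTGCAAC" * 5            # toy sequence, not a real genome
    print(fcgr(toy, 3).sum())         # number of 3-mers counted
```

With `k = 10` the returned matrix is 1024 × 1024 regardless of sequence length, which is what allows genomes of unequal length to be compared.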


Fig. 2. The broad singularity spectrum f (α) shows strong multifractal nature for all the bivariate time series.

Fig. 3. Results of the cross-correlation analysis when the two time series are the same (i.e. x = y): the calculated h(q) values show multifractal behavior, which is also evident from the broad f(α) spectrum.


In further analysis, we performed cluster analysis on the scaling exponents $h_{xy}(q)$ obtained at $q = 2$ (i.e. $h_{xy}(q = 2) \equiv H_{xy}$) for all the bivariate time series. These $H_{xy}$ values are given in Table 2. Using $H_{xy}$ we constructed a dendrogram for the given set of bacterial genomes. A dendrogram is a tree-like structure that represents the closeness between entities based on their distance. The $H_{xy}$ values are used as the distance that represents how likely it is that two organisms evolved from a common ancestor. We used hierarchical clustering with $H_{xy}$ as the distance that determines the similarity between any pair of genomes. The analysis was performed and the dendrogram constructed using MATLAB.

Table 2. The calculated scaling exponents $H_{xy} = h_{xy}(q = 2)$ between all possible pairs of CGR images of genome sequences.

Organism | M. genitalium | M. pneumoniae | S. coelicolor | M. tuberculosis | H. influenzae | C. jejuni | E. coli | H. pylori
M. genitalium | 1.848 | 1.896 | 2.155 | 2.095 | 1.896 | 1.846 | 1.967 | 1.881
M. pneumoniae | – | 1.915 | 2.080 | 2.043 | 1.929 | 1.896 | 1.978 | 1.918
S. coelicolor | – | – | 1.851 | 1.888 | 2.084 | 2.154 | 1.977 | 2.096
M. tuberculosis | – | – | – | 1.899 | 2.044 | 2.101 | 1.970 | 2.056
H. influenzae | – | – | – | – | 1.905 | 1.877 | 1.965 | 2.044
C. jejuni | – | – | – | – | – | 1.806 | 1.963 | 1.852
E. coli | – | – | – | – | – | – | 1.954 | 1.971
H. pylori | – | – | – | – | – | – | – | 1.861

Fig. 4. The obtained dendrogram after cluster analysis on the calculated scaling exponents Hxy between all bivariate time series.

It is worth mentioning that in bioinformatics, gene sequences that characterize the phylogeny of an organism, for example 16S rRNA, are used for constructing a phylogenetic tree. Methods like ClustalW first perform a multiple sequence alignment of the genes, which is then represented as a phylogenetic tree. The 2D MF-X-DFA approach, by contrast, takes a whole genome as a unit and predicts functional relationships between species from the calculated $H_{xy}$ values. The obtained dendrogram is shown in Fig. 4. Our work sheds light on the phylogenetic closeness between various organisms. This method may be helpful for interpreting evolutionary characteristics of the species in a biological context.
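The clustering step can be illustrated with the off-diagonal $H_{xy}$ values of Table 2 used directly as pairwise distances. The paper states only that hierarchical clustering was done in MATLAB and does not name the linkage rule, so the average linkage used below is an assumption of this sketch, and the merge order it produces should not be read as a reproduction of Fig. 4. The sketch uses only the Python standard library; the function names are hypothetical.

```python
from itertools import combinations

NAMES = ["M. genitalium", "M. pneumoniae", "S. coelicolor", "M. tuberculosis",
         "H. influenzae", "C. jejuni", "E. coli", "H. pylori"]

# Off-diagonal H_xy values from Table 2; row i lists pairs (i, i+1), (i, i+2), ...
UPPER = [
    [1.896, 2.155, 2.095, 1.896, 1.846, 1.967, 1.881],
    [2.080, 2.043, 1.929, 1.896, 1.978, 1.918],
    [1.888, 2.084, 2.154, 1.977, 2.096],
    [2.044, 2.101, 1.970, 2.056],
    [1.877, 1.965, 2.044],
    [1.963, 1.852],
    [1.971],
]
DIST = {}
for i, row in enumerate(UPPER):
    for offset, value in enumerate(row):
        DIST[frozenset((i, i + 1 + offset))] = value

def mean_gap(a, b, dist):
    """Average pairwise distance between two clusters (average linkage)."""
    return sum(dist[frozenset((i, j))] for i in a for j in b) / (len(a) * len(b))

def average_linkage(n, dist):
    """Agglomerative clustering; returns a list of (cluster_a, cluster_b, height)."""
    clusters = [(i,) for i in range(n)]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: mean_gap(pair[0], pair[1], dist))
        merges.append((a, b, mean_gap(a, b, dist)))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
    return merges

if __name__ == "__main__":
    for a, b, height in average_linkage(len(NAMES), DIST):
        left = ", ".join(NAMES[i] for i in a)
        right = ", ".join(NAMES[i] for i in b)
        print(f"{height:.3f}: [{left}] + [{right}]")
```

The list of merges and heights is the information a dendrogram draws. A different linkage rule (for example single or complete linkage) can change the tree, which is why the choice is flagged as an assumption.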


4. Conclusion


In conclusion, we have analyzed eight prokaryote genomes for multifractal characteristics using an integrative approach combining chaos game representation and the 2D MF-X-DFA method. We also estimated phylogeny through cluster analysis of the scaling exponents $H_{xy}$ calculated for all pairs of genomes. This study may shed light on the analysis of genomes that are now being sequenced routinely using Next Generation Sequencing (NGS), and could be useful in bioinformatics, big data analytics etc.


Uncited references


[57].


Acknowledgment


The authors MP and PM would like to thank the Department of Science and Technology, Government of India, for financial support (DST-CMS GoI Project No. SR/S4/MS: 516/07 Dated 21.04.2008).


References

[1] B.B. Mandelbrot, The Fractal Geometry of Nature, W.H. Freeman, New York, 1982.
[2] J. Feder, Fractals, Plenum, New York, 1998.
[3] B.B. James, S.L. Larry, J.W. Bruce, Fractal Physiology, Oxford University Press, New York, 1994.
[4] A. Bunde, S. Havlin, Fractals in Science, Springer Verlag, Heidelberg, 1995.
[5] H.E. Hurst, Long-term storage capacity of reservoirs, Trans. Amer. Soc. Civ. Eng. 116 (1951) 770.
[6] J.F. Muzy, E. Bacry, A. Arneodo, Wavelets and multifractal formalism for singular signals: Application to turbulence data, Phys. Rev. Lett. 67 (1991) 3515.
[7] S. Ingve, H. Alex, M.N. Olav, Determination of the Hurst exponent by use of wavelet transforms, Phys. Rev. E 58 (1998) 2779.
[8] J.W. Kantelhardt, S.A. Zschiegner, E. Koscielny-Bunde, S. Havlin, A. Bunde, H.E. Stanley, Multifractal detrended fluctuation analysis of nonstationary time series, Physica A 316 (2002) 87.
[9] E. Alessio, A. Carbone, G. Castelli, V. Frappietro, Second-order moving average and scaling of stochastic time series, Eur. Phys. J. B 27 (2002) 197.
[10] A.-R. Jose, C.E. Juan, R. Eduardo, Performance of a high-dimensional R/S method for Hurst exponent estimation, Physica A 387 (2008) 6452–6462.
[11] G.F. Gu, W.-X. Zhou, Detrended fluctuation analysis for fractals and multifractals in higher dimensions, Phys. Rev. E 74 (2006) 061104.
[12] Y. Zhou, Y. Leung, Z.-G. Yu, Relationships of exponents in two-dimensional multifractal detrended fluctuation analysis, Phys. Rev. E 87 (2013) 012921.
[13] G.F. Gu, W.-X. Zhou, Detrending moving average algorithm for multifractals, Phys. Rev. E 82 (2010) 011136.
[14] S. Arianos, A. Carbone, C. Türk, Self-similarity of higher-order moving averages, Phys. Rev. E 84 (2011) 046113.
[15] P. Manimaran, P.K. Panigrahi, J.C. Parikh, Wavelet analysis and scaling properties of time series, Phys. Rev. E 72 (2005) 046120.
[16] P. Manimaran, P.K. Panigrahi, J.C. Parikh, Multiresolution analysis of fluctuations in non-stationary time series through discrete wavelets, Physica A 388 (2009) 2306.
[17] T. Patrick, Two-dimensional turbulence: a physicist approach, Phys. Rep. 362 (2002) 1–62.
[18] A.-R. Jose, R. Eduardo, C. Ilse, C.E. Juan, Scaling properties of image textures: A detrending fluctuation analysis approach, Physica A 361 (2006) 677–698.
[19] P. Manimaran, P.K. Panigrahi, J.C. Parikh, Difference in nature of correlation between NASDAQ and BSE indices, Physica A 387 (2008) 5810.
[20] F. Wang, D.-W. Liao, J.-W. Li, G.-P. Liao, Two-dimensional multifractal detrended fluctuation analysis for plant identification, 11, 2015, p. 12.
[21] A.V. Alpatov, S.P. Vikhrov, N.V. Grishankina, Revealing the surface interface correlations in a-Si:H films by 2D detrended fluctuation analysis, Semiconductors 47 (2013) 365.
[22] C. Meneveau, K.R. Sreenivasan, P. Kailasnath, M.S. Fan, Joint multifractal measures: Theory and applications to turbulence, Phys. Rev. A 41 (1990) 894–913.
[23] W.-J. Xie, Z.-Q. Jiang, G.-F. Gu, X. Xiong, W.-X. Zhou, Joint multifractal analysis based on the partition function approach: analytical analysis, numerical simulation and empirical application, New J. Phys. 17 (2015) 103020.
[24] C.-K. Peng, S.V. Buldyrev, A.L. Goldberger, F. Sciortino, M. Simons, H.E. Stanley, Fractal landscape analysis of DNA walks, Physica A 191 (1992) 25–29.
[25] S.V. Buldyrev, A.L. Goldberger, S. Havlin, C.-K. Peng, M. Simons, F. Sciortino, H.E. Stanley, Long-range power-law correlations in DNA, Phys. Rev. Lett. 71 (1992) 1776.
[26] C.-K. Peng, S.V. Buldyrev, A.L. Goldberger, S. Havlin, F. Sciortino, M. Simons, H.E. Stanley, Long-range correlations in nucleotide sequences, Nature 356 (1992) 168–170.
[27] C.K. Peng, S.V. Buldyrev, S. Havlin, M. Simons, H.E. Stanley, A.L. Goldberger, Mosaic organization of DNA nucleotides, Phys. Rev. E 49 (1994) 1685.
[28] P. Bernaola-Galván, R. Román-Roldán, J.L. Oliver, Compositional segmentation and long-range fractal correlations in DNA sequences, Phys. Rev. E 53 (1996) 5181.
[29] Z.G. Yu, V. Anh, K.S. Lau, Multifractal and correlation analyses of protein sequences from complete genomes, Phys. Rev. E 68 (2003) 021913.
[30] B. Podobnik, H.E. Stanley, Detrended cross-correlation analysis: a new method for analyzing two non-stationary time series, Phys. Rev. Lett. 100 (2008) 084102.
[31] B. Podobnik, D. Horvatic, A. Peterson, H.E. Stanley, Cross-correlations between volume change and price change, Proc. Natl. Acad. Sci. USA 106 (2009) 22079.
[32] W.-X. Zhou, Multifractal detrended cross-correlation analysis for two non-stationary signals, Phys. Rev. E 77 (2008) 066211.
[33] Z.Q. Jiang, W.-X. Zhou, Multifractal detrending moving average cross-correlation analysis, Phys. Rev. E 84 (2011) 016106.
[34] J. Wang, P.-J. Shang, W.-J. Ge, Multifractal cross-correlation analysis based on statistical moments, Fractals 20 (2012) 271–279.
[35] L. Kristoufek, Multifractal height cross-correlation analysis: A new method for analyzing long-range cross-correlations, Europhys. Lett. 95 (2011) 68001.
[36] X.-Y. Qian, Y.-M. Liu, Z.-Q. Jiang, B. Podobnik, W.-X. Zhou, H.E. Stanley, Detrended partial cross-correlation analysis of two nonstationary time series influenced by common external forces, Phys. Rev. E 91 (2015) 062816.
[37] F. Ma, Y. Wei, D. Huang, Multifractal detrended cross-correlation analysis between the Chinese stock market and surrounding stock markets, Physica A 392 (2013) 1659.
[38] L.Y. He, S.P. Chen, Multifractal detrended cross-correlation analysis of agricultural futures markets, Chaos Solitons Fractals 44 (2011) 355.
[39] C. Xue, P. Shang, W. Jing, Multifractal detrended cross-correlation analysis of BVP model time series, Nonlinear Dynam. 69 (2012) 263.
[40] F. Wang, G.-P. Liao, X.-Y. Zhao, W. Shi, Multifractal detrended cross-correlation analysis for power markets, Nonlinear Dynam. 72 (2013) 353.
[41] Z. Li, X. Lu, Cross-correlations between agricultural commodity futures markets in the US and China, Physica A 391 (2012) 3930.
[42] G. Cao, L. Xu, J. Cao, Multifractal detrended cross-correlations between the Chinese exchange market and stock market, Physica A 391 (2012) 4855.
[43] B. Podobnik, I. Grosse, D. Horvatic, S. Ilic, P.Ch. Ivanov, H.E. Stanley, Quantifying cross-correlations using local and global detrending approaches, Eur. Phys. J. B 71 (2009) 243.
[44] B. Podobnik, Z.-Q. Jiang, W.-X. Zhou, H.E. Stanley, Statistical tests for power-law cross-correlated processes, Phys. Rev. E 84 (2011) 066118.
[45] M. Pal, P. Madhusudana Rao, P. Manimaran, Multifractal detrended cross-correlation analysis on gold, crude oil and foreign exchange rate time series, Physica A 416 (2014) 452.
[46] P.J. Deschavanne, A. Giron, J. Vilain, G. Fagot, B. Fertil, Genomic signature: characterization and classification of species assessed by chaos game representation of sequences, Mol. Biol. Evol. 16 (1999) 1391.
[47] A. Campbell, J. Mrazek, S. Karlin, Genome signature comparisons among prokaryote, plasmid, and mitochondrial DNA, Proc. Natl. Acad. Sci. USA 96 (1999) 9184.
[48] R. Sandberg, G. Winberg, C. Branden, A. Kaske, I. Ernberg, J. Coster, Capturing whole-genome characteristics in short sequences using a naive Bayesian classifier, Genome Res. 11 (2001) 1404.
[49] J.S. Almeida, J.A. Carrico, A. Maretzek, P.A. Noble, M. Fletcher, Analysis of genomic sequences by Chaos game representation, Bioinformatics 17 (2001) 429.
[50] A.J. Gentles, S. Karlin, Genome-scale compositional comparisons in Eukaryotes, Genome Res. 11 (2001) 540.
[51] J.M. Gutierrez, M.A. Rodriguez, G. Abramson, Multifractal analysis of DNA sequences using a novel chaos-game representation, Physica A 300 (2001) 271–284.
[52] S.V. Edwards, B. Fertil, A. Giron, P.J. Deschavanne, A genomic schism in birds revealed by phylogenetic analysis of DNA strings, Syst. Biodivers. 51 (2002) 599.
[53] C. Stan, C.P. Cristescu, E.I. Scarlat, Similarity analysis for DNA sequences based on chaos game representation, case study: The albumin, J. Theoret. Biol. 267 (2010) 513.
[54] J.J. Han, W.J. Fu, Wavelet-based multifractal analysis of DNA sequences by using chaos-game representation, Chin. Phys. B 19 (2010) 010205.
[55] M. Pal, B. Satish, K. Srinivas, P.M. Rao, P. Manimaran, Multifractal detrended cross-correlation analysis of coding and non-coding DNA sequences through chaos-game representation, Physica A 436 (2015) 596–603.
[56] H.J. Jeffrey, Chaos game representation of gene structure, Nucleic Acids Res. 18 (1990) 2163.
[57] C. Stan, M.T. Cristescu, L.B. Iarinca, C.P. Cristescu, Investigation on series of length of coding and non-coding DNA sequences of bacteria using multifractal detrended cross-correlation analysis, J. Theoret. Biol. 321 (2013) 54.