Physica A 537 (2020) 122582

Complexity analysis of time series based on generalized fractional order cumulative residual distribution entropy

Yu Wang, Pengjian Shang
School of Science, Beijing Jiaotong University, Beijing 100044, PR China

Article history: Received 4 June 2019; Received in revised form 21 July 2019; Available online 12 September 2019

Keywords: Cumulative residual distribution entropy; Generalized fractional order cumulative residual distribution entropy; Complexity; Stock market

Abstract

Based on the theory of cumulative residual entropy and distribution entropy, this paper proposes a new model, the cumulative residual distribution entropy (CRDE), which is better suited to the complexity and risk analysis of time series. The new model makes full use of the known information, including not only the probability information but also the information in the values of the random variables. Additionally, CRDE considers the latent information in vector-to-vector distances in the state space. By combining theoretical analysis with empirical research, this paper verifies that the new model has advantages in measuring extreme events and small-probability events, with higher consistency, stability and practicability. The cumulative residual distribution entropy model is then extended to fractional order. The generalized fractional order cumulative residual distribution entropy (GCRDE) can better capture the tiny evolutions of a time series, which is advantageous for studying the dynamic characteristics of complex systems. The new model makes up for some shortcomings of traditional models and can play a guiding role in the study of complex systems in the real world.

1. Introduction

With the rapid development of the economy, the financial market has become increasingly complex and market risks continue to grow. The precision required in financial and economic research is also rising, so more and more mathematical theories, ideas and methods have been introduced into the study of financial markets. The financial market is a nonlinear dynamic system, which can be regarded as a complex system composed of multiple interacting subsystems. If we use linear models to study financial markets, we easily lose important information. Therefore, we often apply research methods for complex systems from physics, biology and other fields to financial systems.

Entropy is grounded in physics, its definitions are flexible and diverse, and it is simple to calculate, so it is widely used to measure the complexity of complex systems. Furthermore, it can help identify and quantify regular (e.g., periodic), random and chaotic signals. In recent years, many entropy measures have been proposed, such as approximate entropy, sample entropy, permutation entropy (PE) and multi-scale permutation entropy (MPE). Information entropy analysis has gradually penetrated into fields such as statistical physics, biomedicine, the social sciences, information science, hydrology and geography, and its research achievements have been rapidly popularized and widely applied.


Information entropy quantifies the uncertainty of information and is usually used to measure the complexity of time series, while risk analysis measures the uncertainty of loss. To reduce the impact of the random variable's distribution on estimation error, some scholars have proposed using entropy to measure financial market risk. The cumulative residual entropy measure, built on information entropy, incorporates the information in the values of random variables; it is more practical than traditional information entropy measures and performs better in risk measurement.

Approximate entropy (ApEn) [1,2], sample entropy (SampEn) [3,4] and other traditional entropy measures lack stability and consistency: their results vary greatly with the choice of pre-determined parameters and are limited by the data length [5,6]. The distribution entropy was therefore proposed. By using probability density estimation, it fully exploits the latent information in vector-to-vector distances in the state space, which greatly improves the robustness of the algorithm.

The American mathematician Shannon [7] first proposed the concept of information entropy to measure the uncertainty of a system. As an important concept in information theory, entropy has been widely used in fields such as information theory, financial analysis, data compression and statistics. Maasoumi and Racine [8], noting that rapid changes in financial markets are difficult to estimate accurately, pointed out that measuring uncertainty with information entropy is more appropriate than using variance. Reesor [9] and Ou [10] proposed applying relative entropy and incremental entropy, respectively, to risk measurement. Dionisio et al. [11] showed that information entropy is more effective in measuring the actual state of financial risk. Rao et al. [12] defined the cumulative residual entropy, a new measure of uncertainty. Li et al. [13] proposed the distribution entropy algorithm in 2015, motivated by the clinical examination of short-term cardiovascular function.

Leibniz first introduced the concept of fractional calculus (FC) in mathematics. Ubriaco put forward an expression for fractional order information entropy [14]. Machado combined the concept of fractional order with entropy theory [15-19] and proposed the fractional order sample entropy; the extended model can be used to analyze the fractional order information characteristics of complex systems. The results show that extending a model to fractional order improves its sensitivity to the evolution of signals, so the model can better describe the dynamics of complex systems.

The second part of this paper introduces the definitions and principles of the cumulative residual entropy and distribution entropy algorithms, analyzes their properties and advantages, and proposes a new model, the cumulative residual distribution entropy, based on the ideas of these two algorithms. The characteristics and advantages of extending entropy to fractional order are studied, and the new model is then extended to fractional order to better describe the dynamic characteristics of complex systems. The third part of this paper presents empirical research.
Firstly, the influence of the parameters on the algorithm is studied with simulated data of known complexity level, and an appropriate parameter combination is determined. Then we select real stock market data as samples and calculate the CRDE and GCRDE values of the simulated and real data to compare and analyze the characteristics and performance of the new model. Finally, the fourth part concludes and summarizes the paper.

2. Methodology

2.1. Cumulative residual entropy

The information entropy measure is a good way to measure the uncertainty of time series, and it performs better than the variance. However, as can be seen from its defining formula, entropy is a function of the probability density alone; the information in the values of the random variable is neglected, which makes it deficient for some practical problems, especially the risk analysis of extreme events. The cumulative residual entropy [20,21], based on information entropy, takes the loss distribution function into account, combining the probability information with the values of the random variable itself, and is therefore more practical than the original information entropy measure.

Definition. For an N-dimensional random vector X ∈ R^N, the cumulative residual entropy is defined as

ε(X) = −∫_{R^N_+} P(|X| > λ) log P(|X| > λ) dλ    (1)

where X = (X_1, X_2, ..., X_N), λ = (λ_1, λ_2, ..., λ_N), |X| > λ means |X_i| > λ_i for i = 1, ..., N, and R^N_+ = {x ∈ R^N : x_i ≥ 0}. It can also be written as ε(X) = −∫ F̄(x) log F̄(x) dx, where F̄(x) = 1 − F(x) is the decumulative distribution function (ddf) and F(x) is the cumulative distribution function of X.

The key to cumulative residual entropy is to replace the density function with the ddf, which makes up for some deficiencies in the definition of Shannon entropy and makes the measure more practical and general. The cumulative residual entropy model is simple and easy to calculate. In real life, the empirical distribution is easy to obtain, and the empirical cumulative residual entropy converges to that of the population distribution. Cumulative residual entropy can be calculated both when the distribution is known and when distribution information is lacking, so it is widely applicable; in addition, it can be computed for both continuous and discrete random variables.
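As an illustration, the following Python sketch (our own, with assumed names; not code from the paper) estimates ε(X) for a one-dimensional sample by integrating the empirical survival function stepwise between the sorted absolute values:

import numpy as np

def cumulative_residual_entropy(x):
    # Empirical estimate of eps(X) = -int_0^inf P(|X| > t) log P(|X| > t) dt.
    # Between consecutive order statistics of |x| the empirical survival
    # function is constant, so the integral reduces to a weighted sum.
    z = np.sort(np.abs(np.asarray(x, dtype=float)))
    n = z.size
    surv = (n - np.arange(1, n)) / n      # P(|X| > z[k]) on [z[k], z[k+1])
    gaps = np.diff(z)                     # interval lengths
    return -np.sum(gaps * surv * np.log(surv))

Since the smallest survival value used is 1/n, the logarithm is always finite and the 0 log 0 convention never arises.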

2.2. Distribution entropy

Consider a one-dimensional time series {x_n}, n = 1, ..., N, of length N, and obtain N − m embedded vectors of embedding dimension m: X_j^m = {x_j, x_{j+1}, ..., x_{j+m−1}}, j = 1, 2, ..., N − m. According to the definition of the Chebyshev distance, we construct a distance matrix D = {d_{i,j}} between embedded subsequences to extract the spatial structure information of the time series, where d_{i,j} = max{|x_{i+k} − x_{j+k}|, 0 ≤ k ≤ m − 1}, 1 ≤ i, j ≤ N − m.

Due to the lack of prior information, a non-parametric method should be used to estimate the probability density. The histogram reflects the distribution of a set of data and is one of the most primitive models of probability density estimation. Since it is a simple and transparent non-parametric method, the distribution entropy uses the histogram to estimate the empirical probability density function (ePDF). Probability density can be understood intuitively as the number of events occurring in a given interval: we divide the total interval covering the range of all data values into equal small cells, count the number of data points falling into each cell, and obtain the probability of each cell.

Assuming that the range of all possible values of the distances between embedded subsequences in D is divided into M equal cells, the probability (frequency) of each cell, p_t, t = 1, 2, ..., M, is obtained by constructing the histogram. We thus completely quantify the information contained in the distance matrix D and obtain the distribution characteristics of all d_{i,j} (1 ≤ i, j ≤ N − m). In addition, to reduce bias, we exclude the case i = j when estimating the ePDF, that is, the diagonal elements of D are not taken into account. Distribution entropy is defined according to the traditional Shannon entropy formula:

DistEn(m, M) = − Σ_{t=1}^{M} p_t log_2(p_t)    (2)
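A minimal sketch of Eq. (2) under the construction above (embedding, Chebyshev distances, histogram ePDF); the function name and default parameters are our assumptions:

import numpy as np
from scipy.spatial.distance import pdist

def dist_en(x, m=3, M=64):
    # Embed: X_j = (x_j, ..., x_{j+m-1}), j = 1, ..., N - m.
    x = np.asarray(x, dtype=float)
    emb = np.lib.stride_tricks.sliding_window_view(x, m)[: x.size - m]
    # Pairwise Chebyshev distances; pdist yields i < j only, which excludes
    # the diagonal i = j and, by symmetry, preserves the histogram shape.
    d = pdist(emb, metric="chebyshev")
    p, _ = np.histogram(d, bins=M)        # ePDF over M equal cells
    p = p[p > 0] / p.sum()                # drop empty cells (0 log 0 := 0)
    return -np.sum(p * np.log2(p))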

Compared with other traditional entropy measures, distribution entropy better extracts the nonlinear properties of time series; this global quantization of vector-to-vector distances is a good way to reveal the inherent structure of the system. Entropy is often used to measure system complexity, but some entropy measures, such as approximate entropy, sample entropy, permutation entropy and conditional entropy, attain their maximum when all values of the sequence are independent and uniformly distributed (a completely random sequence). In fact, chaotic systems in real life are more complex than such stochastic processes. Li et al. [13] confirmed that distribution entropy resolves this paradox to a certain extent and performs better as a complexity measure.

2.3. Cumulative residual distribution entropy

In this section, from the perspective of information entropy and combining the concepts of cumulative residual entropy and distribution entropy, we propose a new model that is better suited to the complexity and risk analysis of time series and has advantages in measuring extreme events and small-probability events. The new model has higher consistency, stability and practicability, and it is further extended to fractional order below.

Algorithm of cumulative residual distribution entropy (CRDE):
1. Spatial reconstruction: from a one-dimensional time series {x_n}, n = 1, ..., N, of length N, obtain N − m embedded subsequences of embedding dimension m: X_j^m = {x_j, x_{j+1}, ..., x_{j+m−1}}, j = 1, 2, ..., N − m.
2. Construction of the distance matrix: calculate the Chebyshev distance between embedded subsequences to obtain the distance matrix D = {d_{i,j}}, d_{i,j} = max{|x_{i+k} − x_{j+k}|, 0 ≤ k ≤ m − 1}, 1 ≤ i, j ≤ N − m, thereby extracting the spatial structure information of the time series.
3. Empirical probability density estimation by the histogram method: quantify the information contained in D by dividing the range of all possible values of d_{i,j} into M equal cells; the probability (frequency) p_t, t = 1, 2, ..., M (ignoring i = j) of each cell is obtained by constructing a histogram.
4. Calculation of CRDE: having obtained the empirical decumulative distribution function of the inter-subsequence distances, define the cumulative residual distribution entropy based on the traditional Shannon entropy formula:

CRDE(m, M) = − Σ_{k=1}^{M} [1 − Σ_{t=1}^{k} p_t] log[1 − Σ_{t=1}^{k} p_t]    (3)

which can also be written as CRDE(m, M) = − Σ_{k=1}^{M} F̄_k log F̄_k, where F̄_k = 1 − F_k = 1 − Σ_{t=1}^{k} p_t, k = 1, ..., M. When the sample size is large enough, the empirical distribution converges to the population distribution; as


a result, the definition of cumulative residual distribution entropy can also be written in continuous form: CRDE(D) = −∫_{R^N_+} P(D > λ) log P(D > λ) dλ, where D is the Chebyshev distance between embedded subsequences.
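Following the four steps above, a sketch of Eq. (3); we read the paper's log as the natural logarithm, and the defaults m = 5, M = 64 anticipate the choices made in Section 3:

import numpy as np
from scipy.spatial.distance import pdist

def crde(x, m=5, M=64):
    # Steps 1-3: embed, Chebyshev distances (off-diagonal only), histogram.
    x = np.asarray(x, dtype=float)
    emb = np.lib.stride_tricks.sliding_window_view(x, m)[: x.size - m]
    d = pdist(emb, metric="chebyshev")
    p, _ = np.histogram(d, bins=M)
    p = p / p.sum()
    # Step 4: Shannon-type sum over the empirical ddf Fbar_k = 1 - sum p_t.
    surv = 1.0 - np.cumsum(p)
    surv = surv[surv > 0]                 # Fbar_M = 0 contributes 0 log 0 := 0
    return -np.sum(surv * np.log(surv))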

The cumulative residual distribution entropy combines the ideas of cumulative residual entropy and distribution entropy and synthesizes their advantages. It makes full use of the known information from the perspective of the spatial structure of the time series. To some extent, the new model overcomes shortcomings of traditional entropy models and is closer to reality.

2.4. Generalized fractional order cumulative residual distribution entropy

Leibniz first introduced the concept of fractional order calculus (FC) in mathematics, and Ubriaco combined the concept of fractional order with entropy theory to put forward the expression of fractional order information entropy [14]:

S_q = E[(− ln p)^q] = Σ_i (− ln p_i)^q p_i    (4)
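Eq. (4) in a few lines, assuming a discrete distribution p; at q = 1 it reduces to Shannon entropy in nats:

import numpy as np

def s_q(p, q):
    # Ubriaco's fractional entropy: S_q = sum_i p_i * (-ln p_i)^q.
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return np.sum(p * (-np.log(p)) ** q)

p = np.full(8, 1 / 8)
print(s_q(p, 1.0), s_q(p, 0.5))           # ln 8 = 2.079... vs fractional order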

where q is the order of the entropy, 0 ≤ q ≤ 1; when q = 1 it coincides with the traditional Shannon entropy formula, and the corresponding information content is I_q(p_i) = (− ln p_i)^q. Fractional order calculus (FC) obtains intermediate values by using power functions. The Laplace transform of the fractional derivative of order α of a signal x(t) with zero initial value is:

L{D^α x(t)} = s^α L{x(t)}    (5)

where t is time, L{·} is the Laplace transform operator, and s is the transform variable. Owing to this property, the element s^α can be extended by means of the Fourier transform and the Z transform to construct approximations to the fractional derivative. Machado proposed the α-order information content and entropy (α ∈ R) [17,22]:

I_α(p_i) = D^α I(p_i) = − [p_i^{−α} / Γ(α + 1)] [ln p_i + ψ(1) − ψ(1 − α)]    (6)

S_α = Σ_i {− [p_i^{−α} / Γ(α + 1)] [ln p_i + ψ(1) − ψ(1 − α)]} p_i    (7)
where Γ(·) and ψ(·) denote the gamma function and the digamma function, respectively. Accordingly, the cumulative residual distribution entropy proposed in this paper is extended to fractional order:

GCRDE(m, M, α) = − Σ_{k=1}^{M} {[(1 − Σ_{t=1}^{k} p_t)^{−α} / Γ(α + 1)] [ln(1 − Σ_{t=1}^{k} p_t) + ψ(1) − ψ(1 − α)]} (1 − Σ_{t=1}^{k} p_t)    (8)

which can also be written as:

GCRDE(m, M, α) = − Σ_{k=1}^{M} {[F̄_k^{−α} / Γ(α + 1)] [ln F̄_k + ψ(1) − ψ(1 − α)]} F̄_k    (9)

where F̄_k = 1 − F_k = 1 − Σ_{t=1}^{k} p_t, k = 1, ..., M.
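A sketch of Eq. (9), reusing the histogram ddf from the CRDE sketch; scipy supplies Γ(·) and ψ(·). The order must satisfy −1 < α < 1 so that Γ(α + 1) and ψ(1 − α) are finite; names and defaults are our own assumptions:

import numpy as np
from scipy.special import gamma, digamma
from scipy.spatial.distance import pdist

def gcrde(x, m=5, M=64, alpha=0.5):
    # Empirical ddf of inter-vector Chebyshev distances, as in CRDE.
    x = np.asarray(x, dtype=float)
    emb = np.lib.stride_tricks.sliding_window_view(x, m)[: x.size - m]
    d = pdist(emb, metric="chebyshev")
    p, _ = np.histogram(d, bins=M)
    p = p / p.sum()
    surv = 1.0 - np.cumsum(p)
    surv = surv[surv > 0]                 # zero tail contributes nothing
    # Machado-type alpha-order information applied to each Fbar_k.
    term = surv ** (-alpha) / gamma(alpha + 1.0)
    return -np.sum(term * (np.log(surv) + digamma(1.0) - digamma(1.0 - alpha)) * surv)

At α = 0 the bracket reduces to ln F̄_k and Γ(1) = 1, so the sketch recovers the CRDE of Eq. (3).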

Extending entropy to fractional order makes the model more sensitive to the evolution of time series, provides more dynamic information, and is more practical for studying complex dynamic systems.

3. Results and analysis

3.1. Model dependence on parameters

The cumulative residual distribution entropy proposed in this paper is a function of the embedding dimension m and the number of cells M. The influence of these parameters on the algorithm is studied with simulated data of known complexity level. In the process, the values of m and M are varied, and an appropriate parameter combination for the CRDE algorithm is determined to improve its accuracy and efficiency and to facilitate the subsequent empirical analysis.

We first consider the dependence of CRDE on the embedding dimension m. The core idea of distribution entropy is to extract the spatial structure information of the sequence; if m is too large or too small, this information cannot be extracted effectively. Since the embedding dimension m usually takes a small value between 3 and 7, large values of m are not considered in this experiment. We test m at five values from 3 to 7 in steps of 1.


Then we analyze the effect of the parameter M on CRDE. A relatively large M can appropriately unfold the distribution of the distance matrix D, but if M is too large, the distribution structure of D will be over-unfolded, so that the distribution information contained in D cannot be properly quantified. The number of elements of D outside the principal diagonal is (N − m)(N − m − 1), so the maximum admissible value of M is (N − m)(N − m − 1), i.e., M ≤ (N − m)(N − m − 1). In the experiment, we take 16 different values of M from 4 (2^2) to 64 (2^6) and calculate the corresponding CRDE values.

Because the one-dimensional chaotic logistic map has a simple structure and good properties, we use it to generate the simulated sequences. The logistic map in one-dimensional difference form is:

x(n + 1) = r x(n)(1 − x(n))    (10)
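A generator for Eq. (10). One caveat: with r = 4 the initial value x(0) = 0.5 maps to 1 and then to the fixed point 0 in two steps, so the sketch nudges that one case; the perturbation is our adjustment, not the paper's:

import numpy as np

def logistic_series(r, n=1000, x0=0.5):
    # Iterate x(k+1) = r * x(k) * (1 - x(k)).
    if r == 4 and x0 == 0.5:
        x0 += 1e-6                        # 0.5 -> 1 -> 0 is degenerate at r = 4
    x = np.empty(n)
    x[0] = x0
    for k in range(n - 1):
        x[k + 1] = r * x[k] * (1 - x[k])
    return x

series = {r: logistic_series(r) for r in (3.2857, 3.5714, 4.0)}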

where r is the system parameter, r ∈ (0, 4], x ∈ (0, 1). Studies have shown that for 3.5699456 < r ≤ 4 the logistic map generates chaotic sequences with different dynamic characteristics that are very sensitive to the initial value; in this way, good pseudo-random sequences can be obtained. Although the map is sensitive to the initial value, the initial value does not affect the onset of the chaotic state, so x(0) can take any value in the interval (0, 1). We let x(0) = 0.5 and let r take the values 3.2857, 3.5714 and 4, obtaining one periodic sequence and two chaotic sequences.

In this section, we select the sequences generated by the logistic map and a Gaussian noise sequence (all of data length 1000) as data samples. For the four sequences, we compute the CRDE values for different parameter combinations of m and M (m = {3, 4, 5, 6, 7}, M = {4, 8, 12, ..., 60, 64}). The results are shown in Fig. 1(a)-(d).

From Fig. 1 we can see that, except for the chaotic sequence generated by the logistic map at r = 4, the other three groups show entropy curves for different m values that almost coincide. This indicates, to some extent, that the CRDE algorithm has a low dependence on the embedding dimension m, and the choice of m does not greatly affect the results. Since CRDE does not depend strongly on m, its value is fixed at 5 in the subsequent experiments.

In addition, it can be observed that, whatever value m takes, the CRDE of the four sequences increases approximately linearly with M. This is due to the defining formula of the entropy; after normalization, it is not difficult to see that in essence the size of M does not greatly influence the results. Considering that a larger M allows the model to estimate the probability density better, M is fixed at 64 in the subsequent experiments.

Many traditional entropy measures lack stability and consistency: their results vary greatly with the choice of pre-determined parameters and are limited by the data length. For approximate entropy or sample entropy, differences in the threshold r have a large influence on the entropy value; by contrast, the selection of M in CRDE is far less critical. By using probability density estimation, CRDE fully exploits the latent information in vector-to-vector distances in the state space, which greatly improves the robustness of the algorithm. In summary, the CRDE algorithm has low sensitivity to its own parameters m and M, which makes it practical and widely applicable. To ensure the best experimental results, and weighing experimental effect against computational complexity and cost, we set m = 5 and M = 64 in the subsequent experiments.

To verify the performance of CRDE as a complexity measure, we take sequences of length 1000 generated by the logistic map for 15 equidistant values of r in the interval [2, 4] and calculate their CRDE values (m = 5, M = 64). The results are shown in Fig. 2. It can be found that when r is between 2 and 3, the CRDE of the logistic sequence is low, and when r is between 3 and 4, it is relatively high. The results are basically consistent with the theoretical complexity trend, indicating that CRDE reflects the complexity of a sequence to a certain extent.

3.2. Results of stock market time series

In this section, we select the daily closing prices of the CSI 300 Index, the Shanghai Composite Index and the Shenzhen Component Index over 1000 trading days, from August 26, 2014 to September 28, 2018, as data samples, and compute the return series according to r_n = ln P_n − ln P_{n−1} (n = 1, 2, ..., N). For comparison, the sequences of length 1000 generated by the logistic map at r = 3.2857, r = 3.5714 and r = 4, and a Gaussian noise sequence of length 1000, are also selected. The cumulative residual distribution entropy (CRDE) and the generalized fractional order cumulative residual distribution entropy (GCRDE) (m = 5, M = 64) of these seven sequences are calculated and compared below.

First, we calculate the CRDE of the sample sequences; the histograms of the seven sequences obtained when estimating the probability density are shown in Fig. 3. Then we consider the GCRDE: the parameter α in formula (8) takes 100 equidistant values from −1 to 1, and the GCRDE of the seven sequences is calculated for each α. The results are shown in Fig. 4(a).
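The return series and the α sweep of this section can be sketched as follows; prices is a synthetic positive stand-in for the index closing prices (not the actual data), and gcrde refers to the Section 2.4 sketch, assumed to be in scope:

import numpy as np

# Synthetic stand-in for 1001 daily closing prices.
rng = np.random.default_rng(1)
prices = np.abs(100.0 + np.cumsum(rng.normal(0.0, 1.0, 1001))) + 1.0
returns = np.diff(np.log(prices))         # r_n = ln P_n - ln P_{n-1}

# Alpha sweep as in Fig. 4: maximum GCRDE over an equidistant grid;
# the grid is kept inside (-1, 1), where Gamma(alpha + 1) is finite.
alphas = np.linspace(-0.99, 0.6, 70)
gcrde_index = max(gcrde(returns, m=5, M=64, alpha=a) for a in alphas)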


Fig. 1. CRDE of four sets of sequences under different parameter combinations of embedding dimension m and cell number M (m = {3, 4, 5, 6, 7}, M = {4, 8, 12, ..., 60, 64}); data length 1000.

Looking at Fig. 4(a), it can be found that the GCRDE values of both the simulated data and the real data first increase slowly with α and then decrease rapidly. When α is too large, it has an excessive influence on the GCRDE value. To eliminate this parameter effect, we take 70 equidistant values of α ∈ [−1, 0.6] and recompute the seven sequences; the GCRDE values are shown in Fig. 4(b). Selecting the largest GCRDE value of each sequence as its complexity index distinguishes and compares the complexity of the sequences well, so the maximum GCRDE value of each sequence is taken to represent its GCRDE.

Table 1 reports the CRDE values and (maximum) GCRDE values for the simulated and real data, and Fig. 5 shows their relative sizes as a line graph. As can be seen, the CRDE and GCRDE values of the sequences follow roughly the same trend, and the chaotic sequence generated by the logistic map (r = 3.5714) has both the largest CRDE and the largest GCRDE, indicating that its complexity is higher than that of the other sequences. Comparing the CRDE and GCRDE of the real data gives Shenzhen > Shanghai > CSI 300, which indicates that the complexity of the Shenzhen stock market is higher than that of the Shanghai stock market, while the complexity of the combined Shenzhen-Shanghai market is relatively lower.


Fig. 2. CRDE values (m = 5, M = 64) of sequences produced by the logistic map for 15 equidistant values of r in the interval [2, 4].

Fig. 3. Histograms of each sequence obtained in the calculation process.


Fig. 4. GCRDE values for simulated data and real data.

Fig. 5. CRDE values and (maximum) GCRDE values of simulated data and real data.

Table 1
CRDE values and (maximum) GCRDE values of simulated data and real data.

        CSI300   Shanghai  Shenzhen  Logistic (r = 3.2857)  Logistic (r = 3.5714)  Logistic (r = 4)  Gaussian
CRDE    8.4613   9.1984    9.2519    17.7400                18.1209                9.0271            7.5977
GCRDE   8.4701   9.2008    9.4569    18.9632                22.1248                20.5295           10.3738

Besides, it can be found that the CRDE and GCRDE values of the chaotic sequence generated by the logistic map at r = 4 differ considerably. Given the chaotic nature of this sequence, the GCRDE value better reflects its complexity. The line graph also shows that GCRDE distinguishes the complexity differences between sequences better. Taken together, it can be preliminarily judged that extending the model to fractional order achieves better results.

Beyond complexity measurement, the CRDE and GCRDE models can also be discussed from the perspective of risk analysis. At the theoretical level, information entropy quantifies the uncertainty of information and is usually used to measure the complexity of time series, while risk analysis measures the uncertainty of loss; we can therefore also use entropy to measure financial market risk. The entropy of a stock market return series measures the uncertainty of the rise or fall of the closing price: the higher the entropy, the more complex the market and the more factors affecting the capital market, which means higher risk. The cumulative residual distribution entropy measure proposed in this paper on the basis of information entropy, which integrates the information in the values of the random variable and the spatial structure of the sequence, is more practical than traditional information entropy and achieves better results in risk measurement. From the results it can be concluded that the risk of the Shenzhen stock market is higher than that of the Shanghai stock market, while the risk of the combined Shenzhen-Shanghai market is relatively lower.

4. Conclusion

Based on the theory of cumulative residual entropy and distribution entropy, this paper proposes a new model, the cumulative residual distribution entropy, which is better suited to the complexity and risk analysis of time series. The main advantages of this model are:


(1) It contains not only the probability information but also the information in the values of the random variable; by making full use of the known information and taking extreme events and low-probability events into consideration, its measurement of uncertainty is closer to reality and more general. (2) Through probability density estimation, the CRDE algorithm extracts and quantifies the spatial structure information of the sequence and fully exploits the latent information in vector-to-vector distances in the state space. (3) The model has low sensitivity to its own parameters, which greatly improves the robustness, consistency, stability and practicability of the algorithm. (4) The algorithm is simple, fast and easy to operate.

In this paper, the CRDE model is also extended to fractional order. The GCRDE model can better capture small evolutions of signal data and has more advantages for studying the dynamic characteristics of complex systems. Through empirical analysis of simulated and real data, the empirical conclusions are basically consistent with the theoretical analysis. The empirical part studies the influence of the parameters on the model and selects an optimal parameter combination; the CRDE and GCRDE values of the simulated and real data are then calculated and compared. The new model is found to reflect the complexity and risk of a system to a certain extent, and its performance is better when extended to fractional order. The new model makes up for some of the shortcomings of traditional models and can play a guiding role in the study of complex systems in the real world.

This paper has practical significance both for financial investment decision-making and for the risk prediction of financial assets, and it provides a theoretical basis for further quantitative research on financial markets. In addition, the entropy analysis method proposed in this paper may be applied to the study of complex systems in fields such as statistical physics, biomedicine, information science and geography, providing new methods and ideas.

Acknowledgment

The financial support from the Fundamental Research Funds for the Central Universities, China (2018JBZ104) is gratefully acknowledged.

References

[1] S.M. Pincus, Approximate entropy as a measure of system complexity, Proc. Natl. Acad. Sci. USA 88 (1991) 2297-2301.
[2] S. Pincus, B.H. Singer, Randomness and degrees of irregularity, Proc. Natl. Acad. Sci. USA 93 (1996) 2083-2088.
[3] J.S. Richman, J.R. Moorman, Physiological time-series analysis using approximate entropy and sample entropy, Am. J. Physiol. Heart Circ. Physiol. 278 (2000) H2039-H2049.
[4] J.M. Yentes, N. Hunt, K.K. Schmid, J.P. Kaipust, D. McGrath, N. Stergiou, The appropriate use of approximate entropy and sample entropy with short data sets, Ann. Biomed. Eng. 41 (2013) 349-365.
[5] P. Li, C. Liu, X. Wang, L. Li, Testing pattern synchronization in coupled systems through different entropy-based measures, Med. Biol. Eng. Comput. 51 (2013) 837.
[6] C. Liu, C. Liu, P. Shao, L. Li, X. Sun, Comparison of different threshold values r for approximate entropy: application to investigate the heart rate variability between heart failure and healthy control groups, Physiol. Meas. 32 (2011) 167-180.
[7] C.E. Shannon, A mathematical theory of communication, Bell Syst. Tech. J. 27 (1948) 379-423.
[8] E. Maasoumi, J. Racine, Entropy and predictability of stock market returns, J. Econometrics 107 (2002) 291-312.
[9] R.M. Reesor, Relative Entropy, Distortion, the Bootstrap and Risk, National Library of Canada, 2001.
[10] J. Ou, Theory of portfolio and risk based on incremental entropy, J. Risk Financ. 6 (2005) 31-39.
[11] A. Dionisio, R. Menezes, D.A. Mendes, An econophysics approach to analyse uncertainty in financial markets: an application to the Portuguese stock market, Eur. Phys. J. B 50 (1-2) (2006) 161-164.
[12] M. Rao, Y. Chen, B.C. Vemuri, F. Wang, Cumulative residual entropy: a new measure of information, IEEE Trans. Inform. Theory 50 (2004) 1220-1228.
[13] P. Li, C. Liu, K. Li, D. Zheng, C. Liu, Y. Hou, Assessing the complexity of short-term heartbeat interval series by distribution entropy, Med. Biol. Eng. Comput. 53 (2015) 77-87.
[14] M.R. Ubriaco, Entropies based on fractional calculus, Phys. Lett. A 373 (2009) 2516-2519.
[15] J.A.T. Machado, Entropy analysis of integer and fractional dynamical systems, Nonlinear Dynam. 62 (2010) 371-378.
[16] J.A.T. Machado, Fractional dynamics of a system with particles subjected to impacts, Commun. Nonlinear Sci. Numer. Simul. 16 (2011) 4596-4601.
[17] J.A.T. Machado, Entropy analysis of fractional derivatives and their approximation, J. Appl. Nonlinear Dynam. 1 (2012) 109-112.
[18] J.A.T. Machado, A.M.S. Galhano, Approximating fractional derivatives in the perspective of system control, Nonlinear Dynam. 56 (2009) 401-407.
[19] J.A.T. Machado, A.M. Galhano, A.M. Oliveira, J.K. Tar, Approximating fractional derivatives through the generalized mean, Commun. Nonlinear Sci. Numer. Simul. 14 (2009) 3723-3730.
[20] F. Wang, B.C. Vemuri, M. Rao, Cumulative residual entropy, a new measure of information & its application to image alignment, in: IEEE International Conference on Computer Vision, IEEE Computer Society, 2003.
[21] A. Chamany, S. Baratpour, A dynamic discrimination information based on cumulative residual entropy and its properties, Comm. Statist. Theory Methods 43 (2014) 1041-1049.
[22] J.T. Machado, Fractional order generalized information, Entropy 16 (2014) 2350-2361.