Environmental Pollution xxx (2016) 1–14
Contents lists available at ScienceDirect
Environmental Pollution journal homepage: www.elsevier.com/locate/envpol
Multifractal behavior of an air pollutant time series and the relevance to the predictability*
Qingli Dong a, Yong Wang a, b, *, Peizhi Li a
a School of Statistics, Dongbei University of Finance and Economics, Dalian 116025, China
b Postdoctoral Research Station of Dongbei University of Finance and Economics, Dalian 116025, China
Article info
Article history: Received 1 August 2016; Received in revised form 21 November 2016; Accepted 30 November 2016; Available online xxx

Abstract
Compared with the traditional method of detrended fluctuation analysis, which is used to characterize fractal scaling properties and long-range correlations, this research provides new insight into the multifractality and predictability of a nonstationary air pollutant time series using the methods of spectral analysis and multifractal detrended fluctuation analysis. First, the existence of significant power-law behavior and long-range correlations in such series is verified. Then, by employing shuffling and surrogating procedures and estimating the scaling exponents, the major source of multifractality in these pollutant series is found to be the fat-tailed probability density function. Long-range correlations also partly contribute to the multifractal features. The relationship between the predictability of the pollutant time series and their multifractal nature is then investigated with extended analyses from a quantitative perspective, and it is found that the contribution of the multifractal strength of long-range correlations to the overall multifractal strength can affect the predictability of a pollutant series in a specific region to some extent. The findings of this comprehensive study can help to better understand the mechanisms governing the dynamics of air pollutant series and aid in performing better meteorological assessment and management. © 2016 Elsevier Ltd. All rights reserved.
Keywords: Multifractality; Air pollutants; Predictability; Spectrum analysis
1. Introduction

Serious air pollution has been witnessed in China due to increased energy consumption and rapid industrialization in recent decades. Combined with social and economic issues, air pollution can seriously threaten the framework of sustainable development (Thatcher and Hurley, 2010). According to a study by the World Bank, China has 16 of the 20 most polluted cities in the world (López et al., 2011), and pollution has had serious effects on public health, resulting in various diseases, such as cardiopulmonary and respiratory diseases (Chudnovsky et al., 2014). Discharged material, such as SO2, NO2, CO, O3, PM2.5 and PM10, is assessed by many air quality monitoring systems (AQMS) (Zhao et al., 2015) to inform the public about the level of air pollution. It is of critical importance to investigate the time structure of pollutant series not only to better understand the dynamic mechanisms (Bandowe et al., 2014) and
* This paper has been recommended for acceptance by Dr. Hageman Kimberly Jill.
* Corresponding author. School of Statistics, Dongbei University of Finance and Economics, Dalian 116025, China. E-mail address: [email protected] (Y. Wang).
design more efficient early warning systems but also to provide management with more information to characterize the behavior of pollutants, such as diffusion, dilution and coagulation (Xue et al., 2015). However, air pollution systems are complex, being affected by local interactions and correlations between various factors. The diverse sources of air pollutants include industrial processes, vehicular emissions and energy production from power stations, coupled with complicated physical and chemical processes (Feng et al., 2015). Air pollution systems have various components, such as meteorological factors, atmospheric self-purification and solar radiation, all of which influence the evaluation of air pollution concentrations (Shen et al., 2016). In an open and dissipative system, all the factors are correlated and interact with each other on different timescales, making it difficult to analyze the structure and temporal variation of air pollutants. Therefore, more attention is being paid to the scaling formalism to analyze the structural characteristics (Gokhale and Khare, 2007). Based on the concept of a fractal, the scaling behavior of the air pollution process has attracted increasing attention in recent decades. Various methods, including a computational fluid
http://dx.doi.org/10.1016/j.envpol.2016.11.090 0269-7491/© 2016 Elsevier Ltd. All rights reserved.
Please cite this article in press as: Dong, Q., et al., Multifractal behavior of an air pollutant time series and the relevance to the predictability, Environmental Pollution (2016), http://dx.doi.org/10.1016/j.envpol.2016.11.090
dynamics model (Chu et al., 2005) and statistical approaches (Gokhale and Khare, 2007), have been adopted to study the dynamical behavior of air pollutants. Specifically, linear unbiased estimates, correlation analyses, power spectrum analysis and so on are sophisticated approaches that have been widely used to analyze and describe the distribution, periodicity and trends of pollutant concentrations (Xue et al., 2015). By using rescaled range analysis, detrended fluctuation analysis (DFA) and spectral analysis, Kai et al. (2008) analyzed three pollution series and the daily air pollution index (API) of Shanghai. They found that there are two different power laws in these series, which indicated different self-organized critical states. Continuous wavelet transform fractal analysis was introduced by Yuval and Broday (2010) to assess the predictability of pollutant time series. They found that the predictability of air pollutants is consistent with that of meteorological variables at short scales. Based on mono-fractal analysis, Lee et al. (2006) analyzed the scaling behavior of a one-year series of hourly average O3 and found that scaling invariance exists in the studied series and the box dimension is a decreasing function. In all the studies mentioned above, the pollutant series are analyzed and modeled as monofractal series, which are more suited to model homogeneous time series because there is only one scaling component and the properties are constant (Stanley et al., 1996). To advance the investigation of the dynamics of more heterogeneous and complex series, it is necessary to analyze more scaling exponents, including through fractal and multifractal analyses, especially when the original series is composed of many interwoven fractal subsets (Pamuła and Grech, 2014). Multifractality is regarded as an inherent property of complex and composite systems, and it has attracted much attention in recent years.
Multifractality theory is widely used to quantitatively delineate the nonlinear evolution of a complicated system and the multiscale characteristics of physical quantities. It can also aid in understanding the intrinsic regularity and mechanism of physical changes (Windsor and Toumi, 2001). Shen et al. (2016) described and analyzed the multifractal characteristics of the air pollution index (API) in China, which provided a basis for further probing into the complexity of API series. Liu et al. (2015) used multifractal detrended fluctuation analysis (MF-DFA) to characterize the temporal fluctuations of the API record and three common pollution indexes of Shanghai in China. The results showed that the temporal scaling behaviors in the studied pollutant series present different power-law relationships. Applying Chhabra and Jensen's multifractal formalism, Munoz Diosdado et al. (2013) analyzed an atmospheric pollutant concentration series from 1990 to 2005. They confirmed the existence of multifractality in this series, which indicated a new level of complexity distinguished by the wide range of fractal dimensions necessary to characterize the dynamics of the air pollutant time series. Although the fractal and multifractal characteristics of pollutant series in some regions have been investigated, the mechanism of variation in fractality, the sources of multifractality and the relevance to predictability have not been fully clarified. By studying the multifractal nature of the measured data, we intend to investigate their persistence properties and heterogeneity features, which can help us understand the structural complexity of pollutant records and provide theoretical support for pollutant series forecasting.
The novelty of our work, along with the nontrivial findings, can be summarized as follows: The power spectra of different pollutant series are computed as the modulus squared of the discrete Fourier transform, and different frequency regimes are found. The distinct decreasing trends at low and high frequencies testify to the existence of power-law behaviors and long-range correlations at these frequencies. Compared with the traditional method of detrended fluctuation analysis that was widely used in previous studies, this research uses spectral analysis together with multifractal detrended fluctuation analysis, which are based on stochastic processes, chaos theory and time series analysis, to investigate the statistical self-affinity and multifractality of a pollutant series. The time dynamics of pollutant series can be explored at lower timescales with higher frequency datasets. The datasets used in this research are measured hourly observations of pollutant series collected from five important cities in China. It is found that different long-range temporal correlations for small and large fluctuations and a fat-tailed probability distribution of variations are two major sources of multifractality, and the fat-tailed probability distribution contributes more to the multifractality. A rough relationship between the forecasting accuracy of pollutant series and the parameter of multifractality is found in this paper: the contribution of the long-range correlations' multifractal strength to the whole multifractal strength can affect the predictability of pollutant series in a specific region to some extent. This conclusion is in agreement with Yuval and Broday (2010). However, the relationship is quantified here through quantitative analysis and comparison, which, to the best of our knowledge, has not been done before.
The findings of this study can help to better understand the mechanisms governing the dynamics of air pollutant series and aid in performing better meteorological assessment and management in early-warning systems. The remainder of this paper is organized as follows. The data acquisition and analysis are described in section 2. Section 3 presents the spectral and multifractal analysis. In section 4, detailed discussions of the empirical results and extended analyses are presented. Lastly, section 5 presents the conclusions of this paper.
2. Data acquisition and analysis

Original hourly records of particulate matter concentrations are measured and published on the air quality publishing platform of China (http://www.aqistudy.cn/historydata/index.php) and by the Ministry of Environmental Protection of the People's Republic of China (http://english.mep.gov.cn/), which are our main data sources. In addition, parts of the hourly pollutant data were collected by our self-developed Java programs from May 13, 2014 to Nov 12, 2016. Because the quality of the collected records is controlled, the fraction of missing data is small. Specifically, for the records of PM2.5 and PM10 in Shanghai, the numbers of entries/missing values are 20876/146 and 20876/549, respectively. In this paper, we adopt the cubic spline interpolation approach (Carrizosa et al., 2013) to handle the missing data. Contour plots of the original hourly PM2.5 and PM10 concentrations are depicted in Fig. 1. The lognormal distribution is commonly used to identify the features of air pollutant regimes and to estimate PM emission characteristics. It is also an important component of PM forecasting and early warning systems. The probability density function (PDF) and the cumulative distribution function (CDF) of the lognormal distribution are given by the following formulas:
Fig. 1. Hourly records of the PM2.5 concentration and PM10 concentration in Shanghai, China, from May 13, 2014 to Nov 12, 2016.

$$ f(x; \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left[ -\frac{(\ln x - \mu)^2}{2\sigma^2} \right], \quad x > 0 \tag{1} $$

and

$$ F(x; \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}} \int_0^x \frac{1}{t} \exp\left( -\frac{(\ln t - \mu)^2}{2\sigma^2} \right) dt, \tag{2} $$
where $\mu > 0$ and $\sigma > 0$ denote the location parameter and the scale parameter, respectively. The histograms of the PM2.5 and PM10 series in Shanghai and the fitted curves of the lognormal distribution are shown in Fig. 2. The overall level of the PM10 series is slightly higher than that of the PM2.5 series, though the trends of these two series are nearly identical. The descriptive statistics are listed in Table 1. The PM2.5 series has lower volatility because its variance is lower than that of the PM10 series. Both series have positive skewness, indicating a clear right-skewed feature. In addition, the kurtosis values of the PM2.5 and PM10 series are 7.35 and 6.44, respectively, which are larger than the value of 3 for a normal distribution. This indicates that these two series exhibit an evident peaked, fat-tailed distribution, which agrees with the results of the Jarque-Bera test.

3. Spectral and multifractal analysis

In this part, the spectral and multifractal analyses of the hourly pollutant concentration series recorded from 2014 to 2016 are conducted successively to study the temporal fluctuations. Spectral analysis is performed to examine the persistence properties of the pollutant series, based on which its heterogeneity features are studied further through multifractal detrended fluctuation analysis. The framework of the paper is shown in Fig. 3.

3.1. Spectral analysis

The power spectrum of a time series describes the distribution
of power into the frequency components composing that signal. A complex series can be decomposed into simpler parts by spectrum analysis. Herein, based on the periodogram approach, the power spectra of all the studied series are computed to investigate the cyclic oscillations and spectral exponents. The theory of power spectra is briefly reviewed. Specifically, the periodogram is a nonparametric estimate of the power spectral density (PSD) of a wide-sense stationary random process and is a commonly used technique to estimate a spectrum, which is given by the modulus squared of the discrete Fourier transform (DFT). The DFT of a function generates a frequency spectrum containing the entire information of the original series. The periodogram is the DFT of the biased estimate of the autocorrelation sequence. Suppose a signal is sampled at $N$ uniformly spaced times with interval $\Delta t$, giving values $x_n$; the periodogram is defined as follows:
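$$ P(f) = \frac{\Delta t}{N} \left| \sum_{n=0}^{N-1} x_n e^{-i 2\pi f n \Delta t} \right|^2, \quad -\frac{1}{2\Delta t} < f \le \frac{1}{2\Delta t}, \tag{3} $$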
where the values at all frequencies except 0 and the Nyquist frequency $1/(2\Delta t)$ are multiplied by 2 for a one-sided periodogram to conserve the total power. Based on the periodogram method, the power spectra of the PM2.5 and PM10 series in Shanghai are computed and shown in Fig. 4. For a purely random process, such as a white noise series, the power spectral density is constant, which implies that each value of the series is independent of the others; that is, the series has no memory. In contrast, if the power spectrum follows a power law, the original signal has long-range correlation. We can preliminarily deduce the strength of the temporal fluctuations by estimating the spectral exponent (Telesca et al., 2002). From a visual inspection of Fig. 4, we can find different frequency regimes. The spectrum is nearly constant at the middle frequencies, which indicates that the process is uncorrelated at these frequencies.
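The one-sided periodogram of Eq. (3) can be illustrated with a minimal Python sketch. It uses only the standard library and a direct $O(N^2)$ DFT so that it runs anywhere; for series of the length used in this paper an FFT routine would be the practical choice. The function name `periodogram` and the synthetic 24-hour test signal are assumptions made for illustration and are not taken from the paper.

```python
import cmath
import math
import random


def periodogram(x, dt=1.0):
    """One-sided periodogram of a real series, following Eq. (3).

    Returns (freqs, power) at f = k / (N * dt) for k = 0 .. N // 2.
    """
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        f = k / (n * dt)
        # Discrete Fourier transform evaluated at frequency f.
        dft = sum(x[t] * cmath.exp(-2j * math.pi * f * t * dt) for t in range(n))
        p = dt / n * abs(dft) ** 2
        # Double every bin except 0 and Nyquist so that total power is conserved.
        is_nyquist = (n % 2 == 0 and k == n // 2)
        if k != 0 and not is_nyquist:
            p *= 2.0
        freqs.append(f)
        power.append(p)
    return freqs, power


# Hypothetical stand-in for an hourly record: a 24-hour cycle plus noise.
rng = random.Random(0)
series = [math.sin(2 * math.pi * t / 24) + rng.gauss(0, 0.5) for t in range(240)]
freqs, power = periodogram(series, dt=1.0)
peak = max(range(1, len(power)), key=power.__getitem__)
print(f"dominant period: {1 / freqs[peak]:.1f} hours")
```

With this normalization, the sum of the returned powers multiplied by the frequency spacing $1/(N \Delta t)$ equals the mean square of the series, which is a convenient sanity check.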
Fig. 2. Frequency histogram and fitted distributions of the PM values: (a) PM2.5 and (b) PM10 (Shanghai).
Table 1
Descriptive statistics of the two pollutant series in Shanghai, China.

Pollutant  Unit   Mean   Std.   Min   Max     Skewness  Kurtosis  Jarque-Bera  ADF
PM2.5      μg/m³  48.94  36.46  3.00  356.00  1.89      7.98      34044.43*    -8.36*
PM10       μg/m³  69.4   46.72  1.00  540.00  1.85      8.57      38917.61*    -10.31*

Note: * indicates that the Jarque-Bera test and the Augmented Dickey-Fuller (ADF) test reject their null hypotheses at the 1% significance level.
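As a hedged illustration of how the moment-based entries of Table 1 are obtained, the sketch below computes the mean, standard deviation, skewness, kurtosis and the Jarque-Bera statistic $JB = \frac{n}{6}\left[S^2 + \frac{(K-3)^2}{4}\right]$. It assumes population (divide-by-$n$) moments; the function name `describe` and the lognormal sample parameters are hypothetical, and the output is not expected to reproduce the values in Table 1.

```python
import math
import random


def describe(x):
    """Mean, std, skewness, kurtosis and Jarque-Bera statistic (population moments)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2  # equals 3 for a normal distribution
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "skewness": skew,
        "kurtosis": kurt,
        "jarque_bera": jb,
    }


# Hypothetical lognormal sample standing in for a PM concentration record.
rng = random.Random(1)
sample = [rng.lognormvariate(3.6, 0.6) for _ in range(5000)]
for name, value in describe(sample).items():
    print(f"{name}: {value:.2f}")
```

Under normality the Jarque-Bera statistic is asymptotically chi-squared with two degrees of freedom, so values far above the 1% critical value of about 9.21 lead to rejection, as reported for both series.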
However, at lower and higher frequencies, the distinct decreasing trends testify to the existence of significant power-law behavior and long-range correlations at these frequencies. These variations
exhibited in the power laws indicate the variability in the persistence of the series (Yuval and Broday, 2010). Remark: According to Calif and Schmitt (2014), the value of the
Fig. 3. Flowchart and framework of the experiments performed in this paper.
exponent $\alpha$ can be calculated from the power law of the spectrum, which is denoted as $P(f) \propto f^{-\alpha}$. Furthermore, relevant information and features of the temporal fluctuations can be furnished by the power spectrum, which is a good index for measuring a series' persistence, and thus its autocorrelation and predictability. Generally, the original series is nonstationary if $\alpha > 1$, persistent if $0 < \alpha < 1$, uncorrelated if $\alpha = 0$ and anti-persistent if $-1 < \alpha < 0$. From the visual inspection of Fig. 4, the values of $\alpha$ at different frequencies can be computed approximately, and from them we can deduce the specific strength and the type of temporal fluctuations. However, given the importance of the information furnished by the spectral exponent and the poor accuracy of a linear regression performed on the power spectrum, it is necessary to estimate $\alpha$ more accurately. Multifractal detrended fluctuation analysis (MF-DFA) is therefore adopted to reveal the more specific structure of the spectral exponent in the pollutant series.
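The rough regression estimate of the spectral exponent mentioned above can be sketched as follows: $\alpha$ is taken as minus the least-squares slope of $\log P$ against $\log f$, and the result is mapped onto the regimes listed in the remark. The function names and the tolerance `tol` used to decide when $\alpha$ counts as zero are assumptions for illustration; as noted in the text, this regression is not an accurate estimator.

```python
import math


def spectral_exponent(freqs, power):
    """Return alpha in P(f) ~ f**(-alpha) from a log-log least-squares fit."""
    pts = [(math.log(f), math.log(p)) for f, p in zip(freqs, power) if f > 0 and p > 0]
    n = len(pts)
    mx = sum(u for u, _ in pts) / n
    my = sum(v for _, v in pts) / n
    sxx = sum((u - mx) ** 2 for u, _ in pts)
    sxy = sum((u - mx) * (v - my) for u, v in pts)
    return -sxy / sxx  # the slope of the fit is -alpha


def classify(alpha, tol=0.05):
    """Map alpha onto the regimes of the remark; tol is a hypothetical tolerance."""
    if alpha > 1:
        return "nonstationary"
    if abs(alpha) <= tol:
        return "uncorrelated"
    if 0 < alpha < 1:
        return "persistent"
    if -1 < alpha < 0:
        return "anti-persistent"
    return "outside the ranges listed in the text"


# Synthetic spectrum with a known exponent, used only to exercise the functions.
freqs = [k / 200 for k in range(1, 101)]
power = [f ** -0.6 for f in freqs]
alpha = spectral_exponent(freqs, power)
print(f"alpha = {alpha:.3f} ({classify(alpha)})")
```

In practice the fit would be restricted to one frequency regime at a time, since Fig. 4 shows different behavior at low, middle and high frequencies.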
3.2. Multifractal detrended fluctuation analysis

In this section, the multifractal detrended fluctuation analysis (MF-DFA) method will first be introduced. Then, it is applied not only to reveal the more complex structure in the air pollutant series but also to provide a good estimation of the spectral exponent.

3.2.1. Methodology

Based on the DFA approach, multifractal detrended fluctuation analysis (MF-DFA) was proposed by Kantelhardt et al. (2002) to investigate the multifractality in non-stationary time series. A brief description of MF-DFA follows: First, let us suppose $\{x_t \mid t = 1, 2, \ldots, N\}$ is the original time series, where $N$ is the length of the series. The profile can be determined by the following:

$$ y_k = \sum_{t=1}^{k} \left( x_t - \langle x \rangle \right), \quad k = 1, 2, \ldots, N, \tag{4} $$

where $\langle x \rangle$ is the average value of the whole series. Second, the entire profile series $\{y_k \mid k = 1, 2, \ldots, N\}$ is divided into $N_s = \lfloor N/s \rfloor$ non-overlapping segments of equal window size $s$:

$$ y(v) = \left[ y_{(v-1)s+1}, y_{(v-1)s+2}, \ldots, y_{vs} \right], \quad v = 1, 2, \ldots, N_s. \tag{5} $$

Considering that the length $N$ of the profile series is often not a multiple of the given time scale $s$, the same partition method is repeated from the other end:

$$ y(v) = \left[ y_{N-(v-N_s)s+1}, y_{N-(v-N_s)s+2}, \ldots, y_{N-(v-N_s-1)s} \right], \quad v = N_s + 1, N_s + 2, \ldots, 2N_s. \tag{6} $$

Thus, $2N_s$ segments are obtained altogether. Third, the local linear trend for each of the $2N_s$ segments $y(v)$ is calculated using a least-squares fit. Then, the detrended variance of each segment $v$ in the first half is calculated as follows:

$$ F^2(v, s) = \frac{1}{s} \sum_{i=1}^{s} \left[ y_{(v-1)s+i} - \hat{y}(v)_i \right]^2, \quad v = 1, 2, \ldots, N_s. \tag{7} $$

For each segment $v$ in the second half, it is expressed as follows:

$$ F^2(v, s) = \frac{1}{s} \sum_{i=1}^{s} \left[ y_{N-(v-N_s)s+i} - \hat{y}(v)_i \right]^2, \quad v = N_s + 1, N_s + 2, \ldots, 2N_s, \tag{8} $$

where $\hat{y}(v)_i$ denotes the $m$th-order fitting polynomial in segment $v$.
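The first three steps (Eqs. (4)-(8)) can be sketched in Python as below. This is a minimal illustration under stated assumptions rather than the implementation used for the results: the local trend is fixed to a first-order (linear) polynomial, only the standard library is used, and the function names `profile`, `segments` and `detrended_variance` are hypothetical.

```python
import random


def profile(x):
    """Eq. (4): cumulative sum of deviations from the series mean."""
    mean = sum(x) / len(x)
    y, total = [], 0.0
    for v in x:
        total += v - mean
        y.append(total)
    return y


def segments(y, s):
    """Eqs. (5)-(6): 2 * Ns windows of length s, cut from both ends of the profile."""
    n = len(y)
    ns = n // s
    forward = [y[v * s:(v + 1) * s] for v in range(ns)]
    backward = [y[n - (v + 1) * s:n - v * s] for v in range(ns)]
    return forward + backward


def detrended_variance(seg):
    """Eqs. (7)-(8) with a linear local trend fitted by least squares (needs s >= 2)."""
    s = len(seg)
    t_mean = (s - 1) / 2.0
    y_mean = sum(seg) / s
    stt = sum((i - t_mean) ** 2 for i in range(s))
    sty = sum((i - t_mean) * (seg[i] - y_mean) for i in range(s))
    slope = sty / stt
    return sum((seg[i] - (y_mean + slope * (i - t_mean))) ** 2 for i in range(s)) / s


# Synthetic white noise used only to exercise the functions.
rng = random.Random(2)
x = [rng.gauss(0, 1) for _ in range(1000)]
f2 = [detrended_variance(seg) for seg in segments(profile(x), 16)]
print(len(f2), "segment variances at s = 16")
```

Cutting windows from both ends ensures that the tail of the profile, which is dropped when $N$ is not a multiple of $s$, still contributes to the average.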
Fig. 4. Power spectrum of different pollution series: (a) PM2.5 and (b) PM10.
Fourth, a $q$th-order detrended fluctuation function is calculated by averaging over all segments:

$$ F_q(s) = \left\{ \frac{1}{2N_s} \sum_{v=1}^{2N_s} \left[ F^2(v, s) \right]^{q/2} \right\}^{1/q}. \tag{9} $$

For $q = 0$,

$$ \ln F_0(s) = \frac{1}{4N_s} \sum_{v=1}^{2N_s} \ln \left[ F^2(v, s) \right]. \tag{10} $$
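Eqs. (9) and (10) translate directly into a short function. In the sketch below, `f2` is assumed to hold the $2N_s$ detrended variances $F^2(v, s)$ at one scale, all strictly positive; the function name is hypothetical.

```python
import math


def fluctuation_function(f2, q):
    """F_q(s) from the segment variances: Eq. (9) for q != 0, Eq. (10) for q = 0."""
    m = len(f2)  # m = 2 * Ns
    if q == 0:
        # ln F_0 = (1 / (4 Ns)) * sum(ln F^2) = (1 / (2 m)) * sum(ln F^2)
        return math.exp(sum(math.log(v) for v in f2) / (2.0 * m))
    return (sum(v ** (q / 2.0) for v in f2) / m) ** (1.0 / q)


# Two hypothetical segment variances: negative q weights the small one,
# positive q weights the large one.
for q in (-2, 0, 2):
    print(q, round(fluctuation_function([1.0, 4.0], q), 4))
```

The logarithmic average for $q = 0$ is the limit of Eq. (9) as $q \to 0$, which is why the two branches join continuously.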
Typically, if $q = 2$, MF-DFA reduces to the classic DFA algorithm. We repeat the above steps for several time scales $s$. For long-range correlated time series, $F_q(s)$ obeys a power law as $s$ increases:
$$ F_q(s) \propto s^{h(q)}, \tag{11} $$
where the scaling exponent $h(q)$ is called the generalized Hurst exponent, which can be obtained as the slope of the log-log plot of $F_q(s)$ versus $s$ through linear fitting. Specifically, for $q > 0$, the generalized Hurst exponent $h(q)$ depicts the scaling behavior of the segments with large fluctuations (Shen et al., 2016), and for $q < 0$, that of the segments with small fluctuations. In addition, $h(q)$ is constant for all $q$ when the time series is monofractal. On the contrary, if $h(q)$ varies with $q$, the original series is multifractal. To be specific, the fluctuations related to $q$ are anti-persistent when $h(q) < 0.5$, display random-walk behavior when $h(q) = 0.5$ and are persistent when $0.5 < h(q) < 1$. For $h(q) > 1$, correlations exist but cease to be of the power-law form. The relationship between the multifractal scaling exponent $\tau(q)$ and $h(q)$ is expressed as follows:
$$ \tau(q) = q \, h(q) - 1. \tag{12} $$
Via the Legendre transform, we can obtain the singularity strength function $\alpha$ and the singularity spectrum $f(\alpha)$, which are defined by the following:

$$ \alpha = \frac{d\tau(q)}{dq}, \quad f(\alpha) = q \alpha - \tau(q). \tag{13} $$
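The step from $h(q)$ to the singularity spectrum can be sketched numerically. The code below applies Eq. (12), approximates $\alpha = d\tau/dq$ (the standard Legendre-transform relation) by finite differences on the $q$ grid, and then evaluates $f(\alpha) = q\alpha - \tau(q)$. The function name and the linearly decreasing $h(q)$ used as input are hypothetical and are not estimates from the pollutant data.

```python
def singularity_spectrum(qs, hq):
    """tau(q) = q*h(q) - 1, alpha = d tau / d q (finite differences), f = q*alpha - tau."""
    tau = [q * h - 1.0 for q, h in zip(qs, hq)]
    alpha = []
    for i in range(len(qs)):
        # Central difference in the interior, one-sided at the two ends of the grid.
        lo, hi = max(i - 1, 0), min(i + 1, len(qs) - 1)
        alpha.append((tau[hi] - tau[lo]) / (qs[hi] - qs[lo]))
    f_alpha = [q * a - t for q, a, t in zip(qs, alpha, tau)]
    return tau, alpha, f_alpha


# Hypothetical multifractal input: h(q) decreasing with q.
qs = list(range(-5, 6))
hq = [0.9 - 0.03 * q for q in qs]
tau, alpha, f_alpha = singularity_spectrum(qs, hq)
width = max(alpha) - min(alpha)
print(f"spectrum width: {width:.3f}")
```

For a monofractal input with constant $h(q)$, $\tau(q)$ is linear, every $\alpha$ equals that constant and the spectrum collapses to a single point with $f(\alpha) = 1$.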
Here, $\alpha$ is used to characterize the singularity of the original signals and the singularity spectrum $f(\alpha)$ is used to describe the singularity content. The time series has a multifractal nature if $\tau(q)$ is a nonlinear function of $q$.

3.2.2. Multifractal analysis

First, multifractal detrended fluctuation analyses of the two pollutant time series are performed. It is worth noting that the lower bound of the scale is a segment size of 10 samples and the upper bound is one tenth of the length of the series. The fluctuation function $\log(F_q(s))$ versus $\log(s)$ of the original pollutant time series is plotted for statistical moments $q$ ranging from $-5$ to $5$ and shown in Fig. 5. The linear relationship between the dependent and independent variables is obvious because each fluctuation-function curve can be fitted almost by a single straight line, which implies the existence of a power-law relationship, as indicated in Eq. (11). This feature can also be noticed in the power-law relationship of the power spectrum, as shown in Fig. 4. In addition, a phenomenon appears in some of the relationships shown in Fig. 5: the regression lines for different orders of $q$ tend to converge, which indicates the multifractal nature of the original series of the PM2.5 and PM10 concentration records. The regression lines for different orders of $q$ are parallel if the pollutant time series is monofractal (Weerasinghe et al., 2016). Remark: In this paper, we refer to the generalized Hurst exponent as $h(q)$. Values of $h > 0.5$ signify persistence, which arises in the case of positive autocorrelation and long memory. Values of $h = 0.5$ point to complete randomness. Cases of $h < 0.5$ are those with negative autocorrelation. Generally, better predictability follows a larger value of the Hurst parameter. The plot of the generalized Hurst exponent $h(q)$ versus $q$, as shown in Fig. 6, can be considered to investigate whether the series behind these apparently converging lines are multifractal or monofractal.
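To make the estimation of $h(q)$ concrete, the following self-contained sketch runs the whole MF-DFA chain on synthetic white noise, with scales between 10 samples and one tenth of the series length as described above. It is a minimal illustration under stated assumptions (linear detrending, standard library only, hypothetical function names and scale grid), not the code behind Figs. 5 and 6.

```python
import math
import random


def linfit(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def mfdfa_fq(x, s, qs):
    """F_q(s) for every q in qs at a single scale s (linear detrending)."""
    mean = sum(x) / len(x)
    y, total = [], 0.0
    for v in x:  # profile, Eq. (4)
        total += v - mean
        y.append(total)
    n, ns = len(y), len(y) // s
    segs = [y[v * s:(v + 1) * s] for v in range(ns)]  # Eq. (5)
    segs += [y[n - (v + 1) * s:n - v * s] for v in range(ns)]  # Eq. (6)
    t = list(range(s))
    f2 = []
    for seg in segs:  # Eqs. (7)-(8)
        slope, icpt = linfit(t, seg)
        f2.append(sum((seg[i] - (icpt + slope * i)) ** 2 for i in range(s)) / s)
    out = []
    for q in qs:  # Eqs. (9)-(10)
        if q == 0:
            out.append(math.exp(sum(math.log(v) for v in f2) / (2.0 * len(f2))))
        else:
            out.append((sum(v ** (q / 2.0) for v in f2) / len(f2)) ** (1.0 / q))
    return out


def generalized_hurst(x, scales, qs):
    """h(q) as the slope of log F_q(s) against log s, Eq. (11)."""
    log_s = [math.log(s) for s in scales]
    fq = [mfdfa_fq(x, s, qs) for s in scales]
    return [linfit(log_s, [math.log(row[j]) for row in fq])[0] for j in range(len(qs))]


# Synthetic white noise: h(q) should stay close to 0.5 for all q.
rng = random.Random(7)
noise = [rng.gauss(0, 1) for _ in range(2000)]
qs = [-5, -2, 0, 2, 5]
hq = generalized_hurst(noise, [10, 20, 40, 80, 140, 200], qs)
for q, h in zip(qs, hq):
    print(f"h({q}) = {h:.3f}")
```

A series with a clear $q$-dependence of the estimated slopes would be read as multifractal, whereas roughly equal slopes, as expected for this synthetic input, correspond to parallel regression lines.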
For a monofractal system, the generalized Hurst exponent $h(q)$ versus $q$ plot should be a straight line with zero gradient (Ihlen, 2012); the white noise series is a typical instance. By contrast, multifractal behavior of the two pollutant series in Shanghai is clearly shown in Fig. 6 because both curves have obvious negative slopes. To be specific, the $h(2)$ values of the PM2.5
and PM10 series are 0.9112 and 0.9208, respectively. Because both $h(2)$ values are larger than 0.5, we can preliminarily conclude that the original PM2.5 and PM10 series are nonstationary signals with long-range power-law correlations (Shi, 2015), which signifies that the fluctuations of the pollutant time series are positively correlated in a power-law fashion.

3.2.3. Origins of multifractality

It is still necessary to analyze the nature of the multifractal behavior of the pollutant time series. Generally, there are two major types of sources for multifractality in time series: I, different long-range temporal correlations for small and large fluctuations and II, a fat-tailed probability distribution of variations (Rak and Zięba, 2015). The main methods to find the contributions of the two sources of multifractality are the shuffling procedure and the surrogating procedure, respectively (Kwapień et al., 2005). To be specific, the nonlinear temporal correlation can be destroyed by the shuffling procedure, whereas the distribution of the fluctuations remains unchanged. Thus, if no multifractal feature remains after we conduct the shuffling procedure on the original multifractal series, we can conclude that long-range correlation dominates the multifractality in the original series. Therefore, we apply the shuffling procedure to study the contribution of long-range correlations to the multifractality. In contrast, surrogate data can be used to study the contribution of the fat-tailed probability distribution because the surrogating procedure eliminates any sort of nonlinearity in the original series and weakens the non-Gaussianity of the distribution. The shuffling procedure can be described in three steps: 1) pairs of random integers $(m, n)$ are generated with $m, n < L$, where $L$ denotes the length of the original series, 2) entries $m$ and $n$ are interchanged, and 3) the previous steps are repeated $10L$ times to guarantee that the raw series is shuffled completely.
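The three shuffling steps can be written as a short Python sketch. The function name, the zero-based indexing and the optional `rng` argument are assumptions for illustration; the number of swaps follows the $10L$ repetitions stated above.

```python
import random


def shuffle_series(x, repeats=10, rng=None):
    """Shuffle by repeated random pair swaps: repeats * L swaps for a series of length L."""
    rng = rng or random.Random()
    out = list(x)  # work on a copy so the original series is left untouched
    length = len(out)
    for _ in range(repeats * length):
        # Step 1: draw a random pair of positions (zero-based, both below L).
        m = rng.randrange(length)
        n = rng.randrange(length)
        # Step 2: interchange the two entries.
        out[m], out[n] = out[n], out[m]
    return out


# Hypothetical ordered series: the values survive, their temporal order does not.
original = list(range(20))
shuffled = shuffle_series(original, rng=random.Random(3))
print(shuffled)
```

Because only positions are exchanged, the histogram of the shuffled series is identical to that of the original, which is what allows the remaining multifractality to be attributed to the distribution alone.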
The classic method for quantifying the fat-tailed contribution is Fourier phase randomization, which is conducted as follows: 1) a discrete Fourier transform of the original series is conducted, 2) the discrete Fourier transform of the data is multiplied by random phases, and 3) an inverse Fourier transform is performed to generate a phase-randomized surrogate. For convenience, $h(q)$, $h_{shuf}(q)$ and $h_{surr}(q)$ denote the generalized Hurst exponents for the original, shuffled and surrogated pollutant series, respectively. Remark: Specifically, if $h_{shuf}(q) \approx 0.5$ and it does not vary with increasing $q$, long-range correlation alone is the source of multifractality. On the other hand, multifractality is due to the fat-tailed probability distribution of variations if $h_{surr}(q) \approx 0.5$ and it is constant. In addition, the multifractality of the shuffled and surrogated series is weaker than that of the original series when both types of multifractality are present. The spectra of the generalized Hurst exponent are presented for shuffled and surrogated data in Fig. 7. It is obvious that $h(q)$, $h_{shuf}(q)$ and $h_{surr}(q)$ are larger than 0.5 and depend strongly on $q$, indicating that the multifractality of the PM2.5 and PM10 concentration series originates from both sources. To further compare the contributions of multifractality from the two sources, the corresponding multifractal spectrum $f(\alpha)$ is computed and shown in Fig. 8 for the original, shuffled and surrogate series. For multifractal behavior, the shape of $f(\alpha)$ is a humped curve, whereas for monofractal behavior, the curve of $f(\alpha)$ reduces to a point (Shen et al., 2016). Obviously, the multifractal spectra of the PM2.5 and PM10 series depicted in Fig. 8 are asymmetric about their axes.
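The Fourier phase randomization outlined at the start of this passage can be sketched as follows. The sketch assumes a real-valued input and rotates each positive-frequency coefficient by a random phase while applying the conjugate rotation to its mirror coefficient, a common way to keep the inverse transform real; the zero-frequency and Nyquist terms are left unchanged. A direct $O(N^2)$ DFT from the standard library is used for portability, and all function names are hypothetical.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Direct discrete Fourier transform (or its inverse) of a sequence."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * t * k / n)
        out.append(acc / n if inverse else acc)
    return out


def phase_randomized_surrogate(x, rng=None):
    """Surrogate with the same amplitude spectrum as x and randomized phases."""
    rng = rng or random.Random()
    n = len(x)
    spec = dft(x)  # step 1: forward transform
    for k in range(1, (n - 1) // 2 + 1):
        # Step 2: random phase for bin k, conjugate phase for its mirror bin n - k.
        rot = cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
        spec[k] *= rot
        spec[n - k] *= rot.conjugate()
    # Step 3: inverse transform; the imaginary parts are rounding noise.
    return [v.real for v in dft(spec, inverse=True)]


# Hypothetical skewed sample: the surrogate keeps the spectrum, not the distribution.
rng = random.Random(5)
x = [rng.expovariate(1.0) for _ in range(64)]
surrogate = phase_randomized_surrogate(x, rng=random.Random(6))
print(f"min of original: {min(x):.3f}, min of surrogate: {min(surrogate):.3f}")
```

Since the amplitude spectrum is preserved, the linear correlations of the series survive, while the shape of its distribution does not.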
Specifically, the curves of the multifractal spectrum for these three series, including the original PM2.5 series, shuffled PM2.5 series and surrogated PM2.5 series, approximately coincide near the peak at $\alpha = 0.95$, as presented in
Fig. 5. Fluctuation functions for pollutant time series of Shanghai using MF-DFA with different orders: (a) PM2.5 and (b) PM10.
Fig. 6. Variation of the generalized Hurst exponent $h(q)$ with $q$ for different pollutant series in Shanghai, China.
Fig. 8(a). This is mainly because the generalized Hurst exponents $h(q)$ of the three series intersect each other when $q$ is nearly zero, as shown in Fig. 7(a). In addition, the variation of the generalized Hurst exponent $h(q)$ for the surrogated PM10 series tends to be minimal, and the three series of PM10 coincide only slightly. Thus the multifractal spectra of these series tend to be right skewed and the right arms of the curves coincide with each other. In addition, the strength of multifractality can be quantified by the width of the singularity spectrum, which is denoted as $\Delta\alpha = \alpha_{\max} - \alpha_{\min}$. Generally, a large value of $\Delta\alpha$ implies a stochastic dynamic character, long-term persistence and strong fluctuations of the original series (Xue et al., 2015). Specific values of $\Delta\alpha$ for the different series are listed in Table 2. There are obvious differences between the spectra of the original, shuffled and surrogate series. The range of $f(\alpha)$ is reduced significantly after the original series is shuffled and surrogated, indicating a reduction in the degree of multifractality. Taking the PM10 series of Shanghai as an example, the widths of the multifractal spectrum
corresponding to the original series, shuffled series and surrogated series are 0.7113, 0.5582 and 0.2473, respectively. It is found that the values of $\Delta\alpha_{shuf}$ are larger than the values of $\Delta\alpha_{surr}$ for the PM2.5 and PM10 series. Because shuffling removes the long-range correlations while surrogating weakens the fat tails, this means that the strength of the fat-tailed probability distribution multifractality is greater than that of the long-range correlation multifractality. The fact that $\Delta\alpha_{surr} < \Delta\alpha_{shuf} < \Delta\alpha$ demonstrates that the multifractal characteristics in the PM2.5 and PM10 series are due to both the significant long-term memory and the fat-tailed distribution. Thus, we can preliminarily view the multifractality due to a fat-tailed probability distribution as the dominant multifractality in the studied series, including the PM2.5 and PM10 concentration series.
Fig. 7. Generalized Hurst exponent h(q) as a function of q for the original, shuffled and surrogated series: (a) PM2.5 and (b) PM10 in Shanghai, China.

Fig. 8. Singularity spectrum f(α) as a function of the singularity strength α for (a) the PM2.5 series and (b) the PM10 series in Shanghai, China.

Table 2 Degrees of multifractality for the pollutant concentration of the original, shuffled and surrogated series.

Pollutant series   Δα       Δα_shuf  Δα_surr  Δα_surr/Δα
PM2.5              0.8550   0.5837   0.3685   0.431
PM10               0.7113   0.5582   0.2473   0.3477

4. Extended analyses and discussions

Based on the spectral and multifractal analysis of pollutant observations in Shanghai, the scope of the study is extended to the predictability of pollutants. More observations collected from different sites are studied to investigate the relationship between the predictability and multifractality of pollutant series from a quantitative perspective. The predictability of the PM series is described first; several pollutants collected from different cities are then discussed thoroughly; finally, special conditions and explanations are given in the discussion.

4.1. The predictability of the PM series

In the above analyses, it is concluded that the multifractal
characteristics in the PM2.5 and PM10 series of Shanghai are due to significant long-range correlation and a fat-tailed distribution, and that the contribution of the fat-tailed distribution to the multifractality is greater than that of the long-range correlation. This inspires us to consider the difference in predictability between the PM2.5 and PM10 series. Generally, a Hurst exponent greater than 0.5, which reflects long-range correlation, also indicates that prediction is possible (Munoz Diosdado et al., 2013), and predictability is related to the value of the Hurst exponent. Therefore, our objective in this part is to investigate the relationship between the predictability of the pollutant time series and their multifractal characteristics, which can help to further understand the multifractality and its sources. A good understanding of predictive power can benefit decision making and environmental management. To the best of our knowledge, previous research on multifractality has mostly focused on explanation and has paid less attention to the quantitative exploration of predictability. Thus, this paper is motivated to study the relationship between the multifractality of pollutant series and their quantitative predictability.

With regard to the quantitative prediction of pollutant concentration series (such as PM, SO2, CO and NO), several types of approaches are widely applied, and they fall into two main streams: deterministic methods and statistical approaches (Feng et al., 2015). Specifically, deterministic approaches are meteorological methods that depend on real-time information about pollutant sources, emission quantities and temporal physical processes. Eulerian chemistry and transport models (CTMs) and their derivatives are typical deterministic methods that have been applied to issues ranging from the routine forecasting of pollutants to climate change (Pirovano et al., 2015).
On the other hand, statistical methods are commonly used in the literature for forecasting or early warning of pollutant concentrations; multiple linear regression (MLR) (Genc et al., 2010), support vector machines (SVM) (Osowski and Garanty, 2007), extreme learning machines (ELMs) (Matyssek et al., 2012), adaptive neuro-fuzzy inference systems (ANFIS) (Zhao et al., 2016) and artificial neural networks (ANNs) (Wang et al., 2016) are typical examples. The aim of this section is to investigate the relationship between the predictability and multifractality of pollutant series. Recall that multifractality is regarded as an inherent property describing the complex dynamics of a series, and that the multifractal characteristics of the PM series of Shanghai were found to be due to significant long-range correlation and a fat-tailed distribution, with the fat-tailed distribution contributing more. Consequently, it is sensible to choose statistical approaches to quantify the predictability of pollutant series, since their results are obtained from a large quantity of historical observations and measurements. The degree of multifractality can determine the significance of periodic fluctuations and can also unveil long-term correlations at different scales, both of which benefit the forecasting of pollutant series. Hence, several typical statistical approaches are selected to quantify the predictability of pollutant series. Specifically, when these pollutant concentration series are used in forecasting models, a moving window is placed on the time series to identify the training set and the testing set. In this paper, the hourly data from May 13, 2014 to May 13, 2016 are used as the training set, whereas the testing set covers nearly six months, from May 14, 2016 to Nov 12, 2016.
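The split described above can be sketched as follows. The sketch assumes that both periods consist of complete days of hourly records (the exact start and end hours are not stated), fills the series with synthetic values, and uses a naive persistence forecast only as a stand-in for the five statistical models. All names are illustrative.

```python
from datetime import datetime, timedelta
import math

TRAIN_START = datetime(2014, 5, 13)
TEST_START = datetime(2016, 5, 14)   # day after the last training day
TEST_END = datetime(2016, 11, 13)    # exclusive: first hour after Nov 12, 2016


def hourly_stamps(start, end):
    """Hourly timestamps in the half-open interval [start, end)."""
    hours = int((end - start).total_seconds() // 3600)
    return [start + timedelta(hours=k) for k in range(hours)]


stamps = hourly_stamps(TRAIN_START, TEST_END)
# Synthetic concentrations with a daily cycle (placeholder for real observations).
series = [50.0 + 10.0 * math.sin(2 * math.pi * t.hour / 24) for t in stamps]

# Chronological split: no shuffling, so the test period lies strictly after training.
split = sum(1 for t in stamps if t < TEST_START)
train, test = series[:split], series[split:]


def one_step_ahead_persistence(history, future):
    """Forecast each test value from the most recent observation only.

    After every step the true value is appended to the history, which is
    what makes this a one-step-ahead (walk-forward) forecast.
    """
    history = list(history)
    forecasts = []
    for actual in future:
        forecasts.append(history[-1])  # stand-in model: repeat the last value
        history.append(actual)
    return forecasts


forecasts = one_step_ahead_persistence(train, test)
print(len(train), len(test), len(forecasts))
```

Keeping the split chronological matters here: shuffling records before splitting would leak future information into the training set and overstate the apparent predictability.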
In other words, the data collected from May 13, 2014 to May 13, 2016 are used to train each model, and the trained model is then used to make one-step-ahead forecasts of the concentration values over the following six months. It is worth noting that the same training set and testing set are adopted for all methods to ensure a fair comparison of the predictability of the two series. The performance of the one-step-ahead forecasting is evaluated by the mean absolute percentage error (MAPE) and the index of agreement (IOA), which are defined as follows:
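\[ \mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \frac{|P_i - O_i|}{|O_i|} \times 100\% \tag{14} \]

and

\[ \mathrm{IOA} = 1 - \frac{\sum_{i=1}^{N} (P_i - O_i)^2}{\sum_{i=1}^{N} \left( |P_i - \bar{O}| + |O_i - \bar{O}| \right)^2} \tag{15} \]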
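Equations (14) and (15) translate directly into code. The following is a minimal sketch with illustrative function names and made-up numbers (not data from this study); it assumes that all observed values are nonzero, since Eq. (14) divides by |O_i|.

```python
def mape(predicted, observed):
    """Mean absolute percentage error of Eq. (14), returned in percent."""
    n = len(observed)
    total = sum(abs(p - o) / abs(o) for p, o in zip(predicted, observed))
    return 100.0 * total / n


def ioa(predicted, observed):
    """Index of agreement of Eq. (15); 1.0 means perfect agreement."""
    n = len(observed)
    mean_obs = sum(observed) / n
    numerator = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    denominator = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for p, o in zip(predicted, observed)
    )
    return 1.0 - numerator / denominator


# Made-up concentrations for illustration only.
observed = [50.0, 60.0, 80.0, 100.0]
predicted = [55.0, 57.0, 84.0, 90.0]

print("MAPE (%):", round(mape(predicted, observed), 4))
print("IOA:", round(ioa(predicted, observed), 4))
```

For these four made-up points the relative errors are 10%, 5%, 5% and 10%, so the MAPE is 7.5%. A perfect forecast gives a MAPE of 0 and an IOA of exactly 1.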
In Eqs. (14) and (15), N denotes the number of time points, and P_i, O_i and Ō are the forecast values, the observed values and the mean of the observed series, respectively. The MAPE reflects the size of the residual error between the observed and predicted concentrations. The IOA is dimensionless, is bounded between 0.0 and 1.0, and can be viewed as a standardized measure of the mean square error: a value of 0.0 indicates no agreement, and a value of 1.0 indicates perfect agreement.

The forecasting results are listed in Table 3. The prediction accuracy of the PM2.5 series is higher than that of the PM10 series in Shanghai for all five prediction models. In other words, compared with the PM10 series of Shanghai, the PM2.5 series over the studied period, from May 13, 2014 to Nov 12, 2016, contains more distinct correlations or regularities that are easier to capture and learn, making it easier to deduce and predict trends. The multifractality associated with the significant Hurst exponent, originating from the long-range correlation and fat-tailed distribution, is able to explain the superiority of the PM2.5 series in Shanghai: the Hurst exponent of the PM2.5 series is larger than that of the PM10 series, as shown in Fig. 6. In addition, the singularity spectrum f(α) and the multifractality degrees Δα, presented in Fig. 8 and Table 2, demonstrate that the strength of the multifractal characteristics of the PM2.5 series, in terms of both the long-range correlation and the fat-tailed distribution, is greater than that of the PM10 series.

4.2. Extended analyses

To further investigate the differences in the predictability of different pollutant series and their multifractal characteristics, more observations of pollutant concentrations are collected, as discussed below.

4.2.1. The same approach to analyze different pollutants

In this section, the range of observations is expanded to several common pollutants, including the air quality index (AQI), carbon monoxide (CO), nitrogen dioxide (NO2), ozone (O3) and sulfur dioxide (SO2), all of which are collected from the Shanghai observation station. The same rule is implemented on
Table 3 Performance (MAPE, IOA) of hourly average pollutant time series forecasting with different models.

Forecasting model   PM2.5 series        PM10 series
                    MAPE      IOA       MAPE      IOA
MLR                 0.072     0.9953    0.1053    0.9871
SVM                 0.0702    0.9954    0.1024    0.9874
ANNs                0.0722    0.9952    0.1051    0.9873
ELMs                0.0712    0.9953    0.1048    0.9873
ANFIS               0.0713    0.9953    0.1037    0.9872
Average             0.0714    0.9953    0.1043    0.9873
these series so as to keep the conditions identical and investigate the relationships objectively. The results are tabulated in Table 4, from which we can see that the CO concentration series has more significant trends and periodic fluctuations, since it has higher prediction accuracy than the others. The average MAPE of the CO series is 0.0601, which is the smallest among all the observation series. Besides, the average IOA of the CO series is 0.9903, which also indicates the superior predictability of the CO series.

On the other hand, recall that multifractality is regarded as an inherent property describing the complex dynamics of a series, and that the contributions of the fat-tailed distribution and of the long-range correlation to the multifractality can be read roughly by comparing the values of Δα_surr/Δα and Δα_shuf/Δα. Consequently, the results in Table 4 are a sensible basis for quantifying the predictability of pollutant series, since they are obtained from a large quantity of historical observations and measurements. The value of Δα_surr/Δα, which indicates the contribution of the multifractal strength of the long-range correlations to the overall multifractal strength, is listed in Table 4 along with the values of Δα, Δα_shuf and Δα_surr. The CO series has the highest value of Δα_surr/Δα among the five pollutant series, namely 0.4107. Together with the results for the PM series obtained above, this suggests a relationship between the predictability and multifractality of pollutant series, which is in agreement with Yuval and Broday (2010). The difference is that we reach this point through quantitative analysis and comparison.

4.2.2. The analyses of pollutant series in different sites

Hourly time series of air pollutants from five important cities in China, covering May 13, 2014 to Nov 12, 2016, are studied as additional cases. These cities are Tianjin (N 39°13′, E 117°20′), Wuhan (N 30°52′, E 114°31′), Guangzhou (N 23°08′, E 113°16′), Lanzhou (N 36°03′, E 103°40′) and Nanjing (N 32°04′, E 118°78′), which are representative cities of different regions of China. Analogously, the multifractal characteristics of the different pollutant concentration series in these cities are analyzed using multifractal parameters, including the spectrum width and the generalized Hurst exponent, while the predictability of these observations is described by the typical statistical models applied above. With regard to predictability, the observations of each pollutant series at one site are divided into a training set and a testing set with the same rules as above. The forecasting procedure is conducted with the five statistical models, and the predictabilities of
these series are evaluated by the same indices, MAPE and IOA. The results are listed in Table 5 and Table 6. For each site, the best predictive performance, determined by comparing the average values of the error indices, is marked in bold. For instance, the AQI and NO2 series have the highest predictability among the seven pollutant concentration series in the Tianjin (TJ) region; this holds under the specific circumstances of this study. Meanwhile, multifractality is described by several parameters, including the widths of the singularity spectra of the original, shuffled and surrogate series, where a large width implies a stochastic dynamic character, long-term persistence and strong fluctuations. It is worth noting that the ratio of Δα_surr to Δα reflects the contribution of the multifractal strength of the long-range correlations to the overall multifractal strength (Shen et al., 2016), while the ratio of Δα_shuf to Δα indicates the proportion of multifractality due to the fat-tailed probability distribution. Tables 5 and 6 show that, in each studied city, the pollutant series with higher forecasting accuracy mostly have higher values of Δα_surr/Δα, which indicates that the contribution of the long-range correlations to the overall multifractal strength can affect the forecasting performance. Take Tianjin as an example. The average MAPE of both the AQI series and the NO2 series is 0.0758, which is smaller than that of the CO (0.0765), O3 (0.0834), SO2 (0.1104), PM2.5 (0.091) and PM10 (0.1296) series. In addition, the IOA also indicates that the predictability of the AQI and NO2 series is superior to that of the other series from Tianjin.
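The Tianjin comparison can be reproduced from the tabulated values. The sketch below copies the Tianjin rows of Table 5 (average MAPE, Δα and Δα_surr), recomputes Δα_surr/Δα, and checks whether the pollutants with the lowest average MAPE are also those with the largest ratio. The variable names are illustrative, and the recomputed ratios can differ from the printed ones in the last digit because the tabulated widths are rounded.

```python
# Tianjin rows of Table 5: (average MAPE, delta_alpha, delta_alpha_surr)
tianjin = {
    "AQI":   (0.0758, 0.7330, 0.3985),
    "CO":    (0.0765, 0.7616, 0.3783),
    "NO2":   (0.0758, 0.3646, 0.2561),
    "O3":    (0.0834, 0.7970, 0.0996),
    "SO2":   (0.1104, 2.0952, 0.1886),
    "PM2.5": (0.0910, 0.7546, 0.2790),
    "PM10":  (0.1296, 0.4982, 0.1422),
}

# Share of the multifractal strength attributed to long-range correlations.
ratio = {name: surr / width for name, (_, width, surr) in tianjin.items()}

# Pollutants that attain the lowest average MAPE (ties are kept).
best_mape = min(avg for avg, _, _ in tianjin.values())
most_predictable = {name for name, (avg, _, _) in tianjin.items() if avg == best_mape}

# The same number of pollutants, ranked by the ratio instead of by MAPE.
by_ratio = sorted(ratio, key=ratio.get, reverse=True)
largest_ratio = set(by_ratio[:len(most_predictable)])

for name in by_ratio:
    print(f"{name:6s} MAPE={tianjin[name][0]:.4f} ratio={ratio[name]:.4f}")
print("agreement:", most_predictable == largest_ratio)
```

The same check can be repeated for the other cities by substituting the corresponding rows of Table 5.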
Meanwhile, for the AQI and NO2 series, the values of Δα_surr/Δα are 0.5436 and 0.7025, respectively, which are larger than those of the CO, O3, SO2, PM2.5 and PM10 series. An analogous rule also holds in the other studied cities. For Wuhan, Guangzhou, Lanzhou and Nanjing, the pollutants with the best predictability differ: they are NO2, PM2.5, PM10 and SO2, respectively. In spite of these differences, each of them has the highest value of Δα_surr/Δα at its own site. It should be noted that pollutant series from different cities are not comparable, because the mechanisms of pollutants entail complex interactions between chemistry, emission and meteorology, which vary with geographical conditions.

4.3. Discussions

Owing to the local correlations and interactions between various macro factors, however, the pollutant time series are of
Table 4 The multifractal characteristics and predictability of several different pollutant series in Shanghai, including the width of the multifractal spectrum of the original, shuffled and surrogated series. The best predictive performances are marked in boldface.

Predictability (MAPE)
Observation  Δα       Δα_shuf  Δα_surr  Δα_surr/Δα   ANFIS    ANNs     ELMs     MLR      SVM      Average
AQI          0.885    0.6535   0.3621   0.4092       0.0709   0.0718   0.0709   0.0726   0.0698   0.0712
CO           0.7785   0.5285   0.3197   0.4107       0.0681   0.0567   0.0636   0.0599   0.0523   0.0601
NO2          0.5947   0.3108   0.234    0.3935       0.1064   0.1032   0.1067   0.1075   0.1017   0.1051
O3           0.5519   0.4944   0.2101   0.3807       0.1002   0.0970   0.0995   0.1014   0.0952   0.0987
SO2          0.5557   0.6943   0.2248   0.4045       0.0971   0.0903   0.0907   0.0896   0.0888   0.0913

Predictability (IOA)
Observation  Δα       Δα_shuf  Δα_surr  Δα_surr/Δα   ANFIS    ANNs     ELMs     MLR      SVM      Average
AQI          0.885    0.6535   0.3621   0.4092       0.9889   0.9896   0.989    0.9887   0.99     0.9892
CO           0.7785   0.5285   0.3197   0.4107       0.9901   0.9906   0.9902   0.9901   0.9906   0.9903
NO2          0.5947   0.3108   0.234    0.3935       0.984    0.9842   0.9839   0.9839   0.9843   0.9841
O3           0.5519   0.4944   0.2101   0.3807       0.9732   0.9802   0.9752   0.9754   0.979    0.9766
SO2          0.5557   0.6943   0.2248   0.4045       0.9493   0.9615   0.9557   0.9545   0.9588   0.956
Table 5 Comparisons of multifractal characteristics and predictability (MAPE) of pollutant series from additional cases. The best predictive performances are marked in boldface.

Columns ANFIS to Average: predictability (MAPE). Columns Δα to Δα_surr/Δα: multifractal characteristics.

Site  Pollutant ANFIS    ANNs     ELMs     MLR      SVM      Average  Δα       Δα_shuf  Δα_surr  Δα_surr/Δα
TJ    AQI       0.0763   0.0746   0.0767   0.0768   0.0747   0.0758   0.733    0.7693   0.3985   0.5436
TJ    CO        0.0818   0.0732   0.0823   0.0735   0.0718   0.0765   0.7616   0.413    0.3783   0.4967
TJ    NO2       0.0758   0.0762   0.0755   0.0768   0.0747   0.0758   0.3646   0.4759   0.2561   0.7025
TJ    O3        0.0831   0.0824   0.0848   0.0861   0.0806   0.0834   0.797    2.1632   0.0996   0.125
TJ    SO2       0.1111   0.1135   0.1098   0.1102   0.1074   0.1104   2.0952   0.571    0.1886   0.09
TJ    PM2.5     0.0909   0.0893   0.0914   0.0947   0.0887   0.091    0.7546   0.4295   0.279    0.3697
TJ    PM10      0.1438   0.1363   0.1366   0.1453   0.0858   0.1296   0.4982   0.477    0.1422   0.2855
WH    AQI       0.0665   0.0576   0.0648   0.0669   0.0593   0.063    0.4205   0.4615   0.2524   0.6003
WH    CO        0.1054   0.1042   0.1045   0.1059   0.1006   0.1041   0.5355   0.5232   0.0832   0.1554
WH    NO2       0.0558   0.0566   0.0562   0.0569   0.0561   0.0563   0.3598   0.4055   0.2943   0.8179
WH    O3        0.1064   0.1062   0.1067   0.1067   0.1029   0.1058   0.5761   0.5557   0.3403   0.5907
WH    SO2       0.1274   0.1258   0.126    0.1292   0.1206   0.1258   0.4261   0.5611   0.2085   0.4893
WH    PM2.5     0.0992   0.0978   0.0987   0.0991   0.0961   0.0982   0.4205   0.5636   0.3045   0.7241
WH    PM10      0.1555   0.1547   0.1551   0.1605   0.1495   0.1551   0.874    0.8934   0.3392   0.3882
GZ    AQI       0.2116   0.2125   0.2099   0.209    0.2048   0.2096   2.0475   0.478    0.2157   0.1054
GZ    CO        0.1537   0.1449   0.1527   0.1602   0.1396   0.1502   0.4287   0.3834   0.2209   0.5152
GZ    NO2       0.4504   0.2904   0.3536   0.3738   0.1641   0.3265   2.0178   0.6381   0.1698   0.0841
GZ    O3        0.2339   0.2099   0.2285   0.2712   0.1987   0.2284   0.7669   0.7609   0.3429   0.4472
GZ    SO2       0.1375   0.1352   0.1361   0.1382   0.1306   0.1355   0.3041   0.3344   0.1989   0.6539
GZ    PM2.5     0.1336   0.1252   0.1307   0.1339   0.1216   0.129    0.2532   0.4583   0.2237   0.8835
GZ    PM10      0.1645   0.1775   0.1608   0.1832   0.1716   0.1715   0.563    0.5753   0.2594   0.4607
LZ    AQI       1.8558   1.4141   1.5394   2.0222   0.3206   1.4304   0.5934   0.4262   0.2536   0.4273
LZ    CO        0.1318   0.1189   0.1299   0.1355   0.1175   0.1267   0.3503   0.3706   0.2038   0.5819
LZ    NO2       0.2224   0.2055   0.1949   0.2698   0.1666   0.2118   0.55     0.4177   0.2652   0.4822
LZ    O3        0.2082   0.2086   0.2095   0.2107   0.1974   0.2069   0.4287   0.5549   0.1708   0.3984
LZ    SO2       0.0937   0.0954   0.0935   0.0958   0.092    0.0941   0.5653   0.5505   0.2962   0.5241
LZ    PM2.5     0.0877   0.0883   0.0908   0.0955   0.0898   0.0904   0.4362   0.3647   0.2622   0.6011
LZ    PM10      0.0717   0.071    0.0708   0.072    0.0706   0.0712   0.5585   0.7039   0.3376   0.6045
NJ    AQI       0.1159   0.1139   0.1153   0.1195   0.1127   0.1155   0.7763   0.7169   0.2762   0.3558
NJ    CO        0.1166   0.1188   0.1191   0.1177   0.1132   0.1171   0.5304   0.5097   0.2594   0.489
NJ    NO2       0.1293   0.1326   0.1308   0.1299   0.1267   0.1299   0.7559   0.6266   0.3352   0.4435
NJ    O3        0.1152   0.1183   0.1158   0.1221   0.1148   0.1172   0.5377   0.4054   0.2084   0.3875
NJ    SO2       0.0905   0.0896   0.0898   0.091    0.0896   0.0901   0.315    0.4873   0.1931   0.613
NJ    PM2.5     0.1366   0.1473   0.1403   0.1404   0.1301   0.1389   0.7709   0.7122   0.2585   0.3354
NJ    PM10      0.1665   0.1664   0.1668   0.1682   0.1632   0.1662   0.7932   0.7146   0.1783   0.2248

Note: TJ denotes Tianjin, WH denotes Wuhan, GZ denotes Guangzhou, LZ denotes Lanzhou and NJ denotes Nanjing.
high complexity, which makes it difficult to analyze trends and make predictions. The underlying system is complex and contains a mass of components, such as atmospheric pollutant sources, topographical features and atmospheric self-purification. All of these components interact with each other on different timescales, which to some extent determines the characteristics of the pollutant time series. Admittedly, no exact rules or relationships can be used to explain the predictability of different pollutant series, and not all cases exhibit the rough relation mentioned above. However, this approach does provide a novel perspective for analyzing and evaluating the predictability of pollutant series in a specific area. Consequently, it is of some value to investigate the multifractal characteristics of air pollutant time series, because the predictability can be evaluated from the sources of multifractality and the variations in the multifractal strength.

5. Conclusions

In this paper, we investigate the multifractal features and predictability of different pollutant time series and obtain the following nontrivial findings.

First, the power spectra of different pollutant series are computed from the squared modulus of the discrete Fourier transform, and different frequency regimes are found. The distinct decreasing trends at low and high frequencies testify to the existence of power-law behavior and long-range correlations at these frequencies.

Then, based on stochastic processes, chaos theory and time series analysis, the statistical self-affinity and multifractality of the pollutant series are investigated. Through a measure of the long-term memory of a time series, which relates to its autocorrelations and the rate at which they decrease as the lag between pairs of values increases, multifractal characteristics are confirmed. In addition, it is found that different long-range temporal correlations for small and large fluctuations and a fat-tailed probability distribution of variations are the two major sources of multifractality, and that the fat-tailed probability distribution contributes more to the multifractality, as shown by the multifractal spectra and multifractality degrees of the original, shuffled and surrogated pollutant concentration series.

Finally, extended analyses and discussions are conducted to investigate the relationship between the predictability of the pollutant time series and their multifractal characteristics. A relationship between the forecasting accuracy of pollutant series and the value of Δα_surr/Δα is found, which means that the contribution of the multifractal strength of the long-range correlations to the overall multifractal strength can affect the predictability of pollutant series in a specific region to some extent. This conclusion is in agreement with Yuval and Broday (2010). However, here the relationship is quantified through quantitative analysis and comparison, which, to the best of our knowledge, has not been done before. These results, based on real observations, may aid in attaining a fuller understanding of multifractality and scaling behavior, and they provide a basis for future investigations into improving the performance of forecasting models such as the self-organized
Table 6 Comparisons of multifractal characteristics and predictability (IOA) of pollutant series from additional cases. The best predictive performances are marked in boldface.

Columns ANFIS to Average: predictability (IOA). Columns Δα to Δα_surr/Δα: multifractal characteristics.

Site  Pollutant ANFIS    ANNs     ELMs     MLR      SVM      Average  Δα       Δα_shuf  Δα_surr  Δα_surr/Δα
TJ    AQI       0.9869   0.9877   0.9862   0.9868   0.9874   0.987    0.733    0.7693   0.3985   0.5436
TJ    CO        0.9857   0.9873   0.9864   0.9875   0.9877   0.9869   0.7616   0.413    0.3783   0.4967
TJ    NO2       0.9876   0.9875   0.9877   0.9873   0.9879   0.9876   0.3646   0.4759   0.2561   0.7025
TJ    O3        0.9739   0.9754   0.9749   0.9762   0.9772   0.9755   0.797    2.1632   0.0996   0.125
TJ    SO2       0.9631   0.9629   0.9636   0.9632   0.9639   0.9633   2.0952   0.571    0.1886   0.09
TJ    PM2.5     0.9821   0.982    0.9811   0.9821   0.9823   0.9819   0.7546   0.4295   0.279    0.3697
TJ    PM10      0.9202   0.9377   0.9348   0.9174   0.9544   0.9329   0.4982   0.477    0.1422   0.2855
WH    AQI       0.8565   0.8879   0.8738   0.859    0.8631   0.8681   0.4205   0.4615   0.2524   0.6003
WH    CO        0.9576   0.9587   0.9582   0.9575   0.9587   0.9581   0.5355   0.5232   0.0832   0.1554
WH    NO2       0.9785   0.9795   0.9786   0.978    0.9781   0.9785   0.3598   0.4055   0.2943   0.8179
WH    O3        0.9868   0.9871   0.9869   0.9869   0.9871   0.987    0.5761   0.5557   0.3403   0.5907
WH    SO2       0.9778   0.978    0.9782   0.9775   0.979    0.9781   0.4261   0.5611   0.2085   0.4893
WH    PM2.5     0.9814   0.9821   0.9817   0.9816   0.9822   0.9818   0.4205   0.5636   0.3045   0.7241
WH    PM10      0.9716   0.9721   0.972    0.9713   0.9723   0.9719   0.874    0.8934   0.3392   0.3882
GZ    AQI       0.953    0.9529   0.953    0.9529   0.9532   0.953    2.0475   0.478    0.2157   0.1054
GZ    CO        0.9844   0.9848   0.9845   0.9842   0.984    0.9844   0.4287   0.3834   0.2209   0.5152
GZ    NO2       0.9608   0.9861   0.9791   0.9631   0.9877   0.9754   2.0178   0.6381   0.1698   0.0841
GZ    O3        0.9895   0.9903   0.9897   0.9891   0.9905   0.9898   0.7669   0.7609   0.3429   0.4472
GZ    SO2       0.9742   0.973    0.9736   0.9753   0.9739   0.974    0.3041   0.3344   0.1989   0.6539
GZ    PM2.5     0.9931   0.9936   0.9933   0.9929   0.9934   0.9933   0.2532   0.4583   0.2237   0.8835
GZ    PM10      0.9755   0.9757   0.9765   0.9754   0.9762   0.9759   0.563    0.5753   0.2594   0.4607
LZ    AQI       0.4791   0.5985   0.5442   0.4309   0.7881   0.5682   0.5934   0.4262   0.2536   0.4273
LZ    CO        0.874    0.8982   0.8809   0.8669   0.884    0.8808   0.3503   0.3706   0.2038   0.5819
LZ    NO2       0.956    0.9612   0.9587   0.952    0.9632   0.9582   0.55     0.4177   0.2652   0.4822
LZ    O3        0.8997   0.8883   0.8941   0.8979   0.8966   0.8953   0.4287   0.5549   0.1708   0.3984
LZ    SO2       0.9923   0.9926   0.9924   0.9923   0.9927   0.9925   0.5653   0.5505   0.2962   0.5241
LZ    PM2.5     0.9913   0.9914   0.9913   0.9909   0.9913   0.9912   0.4362   0.3647   0.2622   0.6011
LZ    PM10      0.9911   0.9912   0.9912   0.9911   0.9913   0.9912   0.5585   0.7039   0.3376   0.6045
NJ    AQI       0.9682   0.9712   0.9693   0.9677   0.9697   0.9692   0.7763   0.7169   0.2762   0.3558
NJ    CO        0.9792   0.9792   0.9791   0.9792   0.9796   0.9793   0.5304   0.5097   0.2594   0.489
NJ    NO2       0.9782   0.9768   0.9769   0.978    0.9776   0.9775   0.7559   0.6266   0.3352   0.4435
NJ    O3        0.984    0.984    0.984    0.9838   0.9841   0.984    0.5377   0.4054   0.2084   0.3875
NJ    SO2       0.9837   0.9838   0.9838   0.9835   0.9839   0.9837   0.315    0.4873   0.1931   0.613
NJ    PM2.5     0.9747   0.9752   0.9751   0.976    0.9761   0.9754   0.7709   0.7122   0.2585   0.3354
NJ    PM10      0.95     0.9504   0.9505   0.9499   0.9508   0.9503   0.7932   0.7146   0.1783   0.2248

Note: TJ denotes Tianjin, WH denotes Wuhan, GZ denotes Guangzhou, LZ denotes Lanzhou and NJ denotes Nanjing.
criticality model and the atmospheric chemistry-transport models.

Conflict of interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgements

This research was supported by the National Natural Science Foundation of China (Grant No. 71573034), China Postdoctoral Science Foundation (2016M601318) and Liaoning Provincial Department of Education Research Platform Project (LN2016JD020).

References

Bandowe, B.A.M., Meusel, H., Huang, R.-J., Ho, K., Cao, J., Hoffmann, T., Wilcke, W., 2014. PM2.5-bound oxygenated PAHs, nitro-PAHs and parent-PAHs from the atmosphere of a Chinese megacity: seasonal variation, sources and cancer risk assessment. Sci. Total Environ. 473–474, 77–87.
Calif, R., Schmitt, F.G., 2014. Multiscaling and joint multiscaling description of the atmospheric wind speed and the aggregate power output from a wind farm. Nonlinear Process. Geophys. 21, 379–392.
Carrizosa, E., Olivares-Nadal, A.V., Ramírez-Cobo, P., 2013. Time series interpolation via global optimization of moments fitting. Eur. J. Oper. Res. 230, 97–112.
Chu, A.K.M., Kwok, R.C.W., Yu, K.N., 2005. Study of pollution dispersion in urban areas using computational fluid dynamics (CFD) and geographic information system (GIS). Environ. Model. Softw. 20, 273–277.
Chudnovsky, A.A., Koutrakis, P., Kloog, I., Melly, S., Nordio, F., Lyapustin, A., Wang, Y., Schwartz, J., 2014. Fine particulate matter predictions using high resolution Aerosol Optical Depth (AOD) retrievals. Atmos. Environ. 89, 189–198.
Feng, X., Li, Q., Zhu, Y., Hou, J., Jin, L., Wang, J., 2015. Artificial neural networks forecasting of PM2.5 pollution using air mass trajectory based geographic model and wavelet transformation. Atmos. Environ. 107, 118–128.
Genc, D.D., Yesilyurt, C., Tuncel, G., 2010. Air pollution forecasting in Ankara, Turkey using air pollution index and its relation to assimilative capacity of the atmosphere. Environ. Monit. Assess. 166, 11–27.
Gokhale, S., Khare, M., 2007. Statistical behavior of carbon monoxide from vehicular exhausts in urban environments. Environ. Model. Softw. 22, 526–535.
Ihlen, E.A.F., 2012. Introduction to multifractal detrended fluctuation analysis in Matlab. Front. Physiol. 3, 1–18.
Kai, S., Chun-qiong, L., Nan-shan, A., Xiao-hong, Z., 2008. Using three methods to investigate time-scaling properties in air pollution indexes time series. Nonlinear Anal. Real World Appl. 9, 693–707.
Kantelhardt, J.W., Zschiegner, S.A., Stanley, H.E., 2002. Multifractal detrended fluctuation analysis of nonstationary time series. Phys. A 316, 87–114.
Kwapień, J., Oświęcimka, P., Drożdż, S., 2005. Components of multifractality in high-frequency stock returns. Phys. A Stat. Mech. its Appl. 350, 466–474.
Lee, C.K., Juang, L.C., Wang, C.C., Liao, Y.Y., Yu, C.C., Liu, Y.C., Ho, D.S., 2006. Scaling characteristics in ozone concentration time series (OCTS). Chemosphere 62, 934–946.
Liu, Z., Wang, L., Zhu, H., 2015. A time-scaling property of air pollution indices: a case study of Shanghai, China. Atmos. Pollut. Res. 6, 886–892.
López, R., Galinato, G.I., Islam, A., 2011. Fiscal spending and the environment: Theory and empirics. J. Environ. Econ. Manage. 62, 180–198.
Matyssek, R., Wieser, G., Calfapietra, C., De Vries, W., Dizengremel, P., Ernst, D., Jolivet, Y., Mikkelsen, T.N., Mohren, G.M.J., Le Thiec, D., Tuovinen, J.P., Weatherall, A., Paoletti, E., 2012. Forests under climate change and air pollution: gaps in understanding and future directions for research. Environ. Pollut.
Munoz Diosdado, A., Galvez Coyt, G., Balderas Lopez, J.A., del Rio Correa, J.L., 2013. Multifractal analysis of air pollutants time series. Rev. Mex. Fis. 59, 7–13.
Osowski, S., Garanty, K., 2007. Forecasting of the daily meteorological pollution using wavelets and support vector machine. Eng. Appl. Artif. Intell. 20, 745–755.
Pamuła, G., Grech, D., 2014. Influence of the maximal fluctuation moment order q on multifractal records normalized by finite-size effects. EPL Europhys. Lett. 105, 50004.
Pirovano, G., Colombi, C., Balzarini, A., Riva, G.M., Gianelle, V., Lonati, G., 2015. PM2.5 source apportionment in Lombardy (Italy): comparison of receptor and chemistry-transport modelling results. Atmos. Environ. 106, 56–70.
Rak, R., Zięba, P., 2015. Multifractal flexibly detrended fluctuation analysis. Acta Phys. Pol. B 46, 1925.
Shen, C., Huang, Y., Yan, Y., 2016. An analysis of multifractal characteristics of API time series in Nanjing, China. Phys. A Stat. Mech. its Appl. 451, 171–179.
Shi, K., 2015. Multifractal processes and self-organized criticality of PM2.5 during a typical haze period in Chengdu, China. Aerosol Air Qual. Res. 2015, 926–934.
Stanley, H.E., Afanasyev, V., Amaral, L.A.N., Buldyrev, S.V., Goldberger, A.L., Havlin, S., Leschhorn, H., Maass, P., Mantegna, R.N., Peng, C.-K., Prince, P.A., Salinger, M.A., Stanley, M.H.R., Viswanathan, G.M., 1996. Anomalous fluctuations in the dynamics of complex systems: from DNA and physiology to econophysics. Phys. A Stat. Mech. its Appl. 224, 302–321.
Telesca, L., Lapenna, V., Macchiato, M., 2002. Fluctuation analysis of the hourly time variability in observational geoelectrical signals. Fluct. Noise Lett. 2, L235–L242.
Thatcher, M., Hurley, P., 2010. A customisable downscaling approach for local-scale meteorological and air pollution forecasting: performance evaluation for a year of urban meteorological forecasts. Environ. Model. Softw. 25, 82–92.
Wang, J., Liu, F., Song, Y., Zhao, J., 2016. A novel model: dynamic choice artificial neural network (DCANN) for an electricity price forecasting system. Appl. Soft Comput. 48, 281–297.
Weerasinghe, R.M., Pannila, A.S., Jayananda, M.K., Sonnadara, D.U.J., 2016. Multifractal behavior of wind speed and wind direction. Fractals 24, 1650003.
Windsor, H.L., Toumi, R., 2001. Scaling and persistence of UK pollution. Atmos. Environ. 35, 4545–4556.
Xue, Y., Pan, W., Lu, W.-Z., He, H.-D., 2015. Multifractal nature of particulate matters (PMs) in Hong Kong urban air. Sci. Total Environ. 532, 744–751.
Yuval, Broday, D.M., 2010. Studying the time scale dependence of environmental variables predictability using fractal analysis. Environ. Sci. Technol. 44, 4629–4634.
Zhao, J., Guo, Z.H., Su, Z.Y., Zhao, Z.Y., Xiao, X., Liu, F., 2016. An improved multi-step forecasting model based on WRF ensembles and creative fuzzy systems for wind speed. Appl. Energy 162, 808–826.
Zhao, L., Xie, Y., Wang, J., Xu, X., 2015. A performance assessment and adjustment program for air quality monitoring networks in Shanghai. Atmos. Environ. 122, 382–392.