An intelligent hybrid model for air pollutant concentrations forecasting: Case of Beijing in China


Sustainable Cities and Society 47 (2019) 101471

Contents lists available at ScienceDirect

Sustainable Cities and Society journal homepage: www.elsevier.com/locate/scs




Hui Liu a,⁎, Haiping Wu a, Xinwei Lv b, Zhiren Ren b, Min Liu c, Yanfei Li a, Huipeng Shi a

a Institute of Artificial Intelligence & Robotics (IAIR), Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic and Transportation Engineering, Central South University, Changsha 410075, Hunan, China
b Wasion Group Limited, Changsha 410205, Hunan, China
c Institute of Super-microstructure and Ultrafast Process in Advanced Materials, School of Physics and Electronics, Central South University, Changsha 410075, Hunan, China

ARTICLE INFO

Keywords: Air pollutant concentrations forecasting; Empirical wavelet transform; Multi-agent evolutionary genetic algorithm; Nonlinear auto regressive models with exogenous inputs; Time series

ABSTRACT

The forecasting of air pollutant concentrations is of great significance for protecting the environment and safeguarding public health. In this study, a novel hybrid model, EWT-MAEGA-NARX, which combines the EWT, the MAEGA and NARX neural networks, is put forward for multi-step air pollutant concentrations forecasting. Four types of air pollutant in Beijing, China (PM2.5, SO2, NO2 and CO) are selected to verify the accuracy of the proposed model. To inspect its forecasting performance, five other models are chosen for comparison: the VMD-MAEGA-NARX model, EWT-MAEGA-SVM model, MAEGA-NARX model, EWT-NARX model and EWT-ARIMA-NARX model. The experimental results show that: (1) The EWT-MAEGA-NARX model achieves satisfactory predictions in air pollutant concentrations forecasting; its MAE values in 1-step forecasting of the PM2.5, SO2, NO2 and CO series are 0.1314 μg/m3, 0.0213 μg/m3, 0.0722 μg/m3 and 0.0033 mg/m3, respectively. (2) In the EWT-MAEGA-NARX model, the EWT is a good feature extractor, and the parameter optimization of the NARX neural network by the MAEGA clearly enhances the prediction performance of the model.

Abbreviations: EWT, empirical wavelet transform; VMD, variational mode decomposition; NARX network, nonlinear autoregressive network with exogenous inputs; SVM, support vector machine; MAEGA, multi-agent evolutionary genetic algorithm; ARIMA, auto regressive integrated moving average model; ARMA, autoregressive moving average model; EMD, empirical mode decomposition; IMF, intrinsic mode function; ADMM, alternate direction method of multipliers; TDL, tapped delay line; ACF, autocorrelation function; PACF, partial autocorrelation function; MLP, multi-layer perceptron; ANN, artificial neural networks; NBFS, neighbor-based feature scaling; ANFIS, adaptive network-based fuzzy inference system; VARMA, vector autoregressive moving average model; GA, genetic algorithm; SVR, support vector regression; BP, back propagation; AERMOD, American Meteorological Society/environmental protection regulatory model; WRF, weather research and forecasting; MAE, mean absolute error; MAPE, mean absolute percentage error; RMSE, root mean square error
⁎ Corresponding author. E-mail address: [email protected] (H. Liu).
https://doi.org/10.1016/j.scs.2019.101471
Received 2 December 2018; Received in revised form 19 January 2019; Accepted 12 February 2019; Available online 13 February 2019
2210-6707/ © 2019 Elsevier Ltd. All rights reserved.

1. Introduction

With the continuous development of the economy, the acceleration of urbanization and the expansion of industry, air quality has deteriorated rapidly, leading to serious air pollution. Harmful substances in the air, such as PM2.5, SO2, NO2 and CO, seriously endanger the environment and human health. PM2.5 particles in the atmosphere enter the respiratory tract, and even the lungs, through breathing. Chronic exposure to such air significantly increases the morbidity of respiratory diseases, causing serious harm to human health (Coker & Kizito, 2018; Yu & Stuart, 2017). The SO2 and NO2 in the atmosphere can cause irreparable damage to the human respiratory system (Zhong et al., 2018). Besides, the estimation results of Hao et al. indicate that atmospheric pollution has a seriously negative effect on economic progress (Hao et al., 2018). As an important form of environmental pollution, air pollution seriously restricts the sustainable development of the environment (Silva & Mendes, 2012). Because of these problems, prediction methods for air pollutant concentrations have been studied by many scholars. Even field technologies such as GIS have been applied to calculate air pollutant concentrations (Pilla & Broderick, 2015). However, many existing methods suffer from low prediction accuracy and poor timeliness, and cannot meet the needs of early warning of atmospheric pollution. Therefore, reliable and accurate prediction of air pollutant concentrations is very significant and helps to improve public health.

Substantial work has been done on air pollutant concentrations prediction recently. From the angle of methodology, the quantitative prediction methods for atmospheric pollutant concentrations can be classified into statistical prediction, intelligent prediction and numerical prediction.

Statistical prediction models are mainly based on the statistical relationship between various influencing factors (such as conventional meteorology and emission sources) and the air pollutant. At present, the commonly used statistical models for air pollutant concentrations prediction include the grey model (Li, Gong, & Liu, 2012; Zhou, Zhao, & Li, 2010) and regression models. For the grey model, the prediction accuracy largely depends on the data characteristics and the grey parameters. Among regression models, stepwise regression (Zvereva & Kozlov, 2010), principal component regression (Kumar & Goyal, 2011) and multiple linear regression are widely applied in the prediction of air pollutant concentrations. In addition, some researchers use the projection pursuit regression model to overcome the inability of linear regression to reflect the actual nonlinear situation (Hou, 2012; Wang, Zhang, & Liu, 2011). Multiple regression is mainly used to consider the potential interaction between various air pollutants, or the relationship between the target pollutant and the factors related to its concentrations. Some researchers have combined the Weather Research and Forecasting (WRF) model with mathematical methods to predict the concentrations of air pollutants (Afzali, Rashid, Afzali, & Younesi, 2017; Kumar & Goyal, 2011).

In recent years, artificial intelligence methods such as neural networks and deep learning have received considerable attention, have been widely adopted in the field of air pollutant concentrations prediction, and have obtained good results. Practice has shown that a single intelligent model struggles to achieve good prediction performance in forecasting air pollutant concentrations. Therefore, most intelligent prediction models are hybrid models that integrate pre-processing methods (such as decomposition algorithms), optimization algorithms, prediction algorithms and other algorithms. For instance, the integration of PCA and ANN (Voukantsis et al., 2011), of wavelet analysis and neural networks (Osowski & Garanty, 2007), and of feature extraction and fuzzy neural networks (Polat, 2012) have shown that the prediction performance of a hybrid model is better than that of a single neural network model.

Numerical prediction is based on the study of the physical and chemical processes of the atmosphere: a mathematical model of the dilution and diffusion of air pollutants is established to predict the dynamic changes of air pollutant concentrations. Numerical prediction methods have evolved from the first-generation models, dominated by the Gaussian diffusion model and the Lagrange model, through the second-generation models, dominated by the Euler grid model, to the third-generation air quality models, mainly including CMAQ (Wang et al., 2010) and CAMx (Huang, Cheng, Perozzi, & Perozzi, 2012).

In this study, a novel hybrid air pollutant concentrations forecasting model is developed based on the EWT method, NARX neural networks and the MAEGA model. The original air pollutant series are decomposed by the EWT algorithm before being forecast by the optimized NARX neural networks. To test the forecasting performance of the EWT-MAEGA-NARX model, five other models are selected for comparison: the VMD-MAEGA-NARX model, EWT-MAEGA-SVM model, MAEGA-NARX model, EWT-NARX model and EWT-ARIMA-NARX model. By comparing the performance of the involved models in multi-step ahead forecasting, the optimal NARX network models in 1-step, 3-step and 5-step ahead forecasting are determined according to the error evaluation indexes applied in the study. Beijing, China is the study area, where four types of air pollutant (PM2.5, SO2, NO2 and CO) are selected to test the accuracy of the proposed model.

The main contributions of the study are as follows:

(a) The superiority of NARX neural networks in dealing with time series problems is exploited and further advanced in air pollutant concentrations forecasting;
(b) The combination of the MAEGA model with NARX neural networks is proposed for the first time, and shows outstanding performance with extremely low estimated errors;
(c) The widely recognized decomposing method (EWT) is used to process the original air pollutant series, and its integration with NARX neural networks is evaluated to analyze the effect of the decomposition algorithm on the MAEGA-NARX model in different steps-ahead forecasting;
(d) The proposed model supplies a new idea and method for the prediction and early warning of air pollutant concentrations, and improves the prediction accuracy for several common air pollutants to a certain extent.

The rest of this paper is organized as follows: Section 2 shows the framework of the proposed hybrid models and elaborates the two decomposing methods. Section 3 explains the NARX networks, MAEGA model, ARIMA model, SVM and their combination procedures. Section 4 presents the prediction results of air pollutant concentrations, together with the comparison analyses of the relevant models. Section 5 concludes the study.

2. Methodology

The framework of the proposed air pollutant concentrations forecasting model is given in Fig. 1. As shown in Fig. 1, the proposed model consists of the following main steps:

(1) Each air pollutant series is decomposed into sub-series by the EWT and VMD decomposing algorithms.
(2) The MAEGA model is adopted to optimize the two input delays of the NARX neural networks, and the regularization and kernel parameters of the SVM.
(3) For comparison, the ARIMA model is adopted to determine the two input delays of the NARX neural networks.
(4) The NARX neural networks and SVM models are trained separately on the data from each non-stationary air pollutant series.
(5) Each decomposed sub-series is taken as the input of the NARX networks and SVM to realize multi-step forecasting of air pollutant concentrations.

The proposed model adopts the EWT as its decomposition algorithm, with the VMD as the comparison algorithm. Both are commonly used decomposition methods in data processing, and their mathematical procedures are explained below.
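As a rough illustration of the decompose, forecast and aggregate idea behind steps (1)-(5), the sketch below splits a series into two frequency bands, fits one predictor per sub-series and sums the sub-forecasts. It is only a minimal sketch under loud assumptions: the FFT band split stands in for EWT/VMD, the least-squares lagged predictor stands in for the NARX network, and the names `band_split`, `fit_lagged` and `hybrid_one_step`, the cutoff bin, the delay and the synthetic data are all hypothetical choices that do not come from the paper.

```python
import numpy as np

def band_split(series, cutoff_bin):
    """Stand-in decomposition (NOT EWT/VMD): low band below cutoff_bin, high band = remainder."""
    spectrum = np.fft.rfft(series)
    low_mask = np.arange(spectrum.size) < cutoff_bin
    low = np.fft.irfft(spectrum * low_mask, n=len(series))
    return [low, series - low]  # the sub-series sum back to the original

def fit_lagged(sub, delay):
    """Least-squares one-step predictor on `delay` past values (stand-in for NARX)."""
    rows = np.array([sub[i:i + delay] for i in range(len(sub) - delay)])
    design = np.column_stack([rows, np.ones(len(rows))])  # lagged values + intercept
    coef, *_ = np.linalg.lstsq(design, sub[delay:], rcond=None)
    return coef

def hybrid_one_step(series, cutoff_bin=20, delay=2):
    """Forecast the next value as the sum of the per-sub-series forecasts."""
    forecast = 0.0
    for sub in band_split(series, cutoff_bin):
        coef = fit_lagged(sub, delay)
        forecast += float(sub[-delay:] @ coef[:-1] + coef[-1])
    return forecast

# Synthetic two-tone series (hypothetical data, not the Beijing measurements)
t = np.arange(400)
series = (np.sin(2 * np.pi * 5 * t / 400 + 0.3)
          + 0.5 * np.sin(2 * np.pi * 40 * t / 400 + 1.0))
next_true = (np.sin(2 * np.pi * 5 * 400 / 400 + 0.3)
             + 0.5 * np.sin(2 * np.pi * 40 * 400 / 400 + 1.0))
next_pred = hybrid_one_step(series)
```

In this toy setting each band is a pure sinusoid, so a two-lag linear predictor is enough; in the paper the delay is the quantity that the MAEGA or ARIMA step is meant to select.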

2.1. Empirical wavelet transform

The EWT is a self-adaptive signal decomposing method that inherits the merits of both empirical mode decomposition (EMD) and the wavelet transform (WT). It builds a series of band-pass filters in the frequency domain to extract different AM-FM components. An empirical wavelet is composed of an empirical wavelet function and an empirical scaling function. Eqs. (1) and (2) define the empirical scaling function and the empirical wavelets, respectively (Gilles, 2013):

$$\hat{\phi}_n(\omega) = \begin{cases} 1, & |\omega| \le (1-\gamma)\omega_n \\ \cos\left[\frac{\pi}{2}\beta\left(\frac{|\omega| - (1-\gamma)\omega_n}{2\gamma\omega_n}\right)\right], & (1-\gamma)\omega_n \le |\omega| \le (1+\gamma)\omega_n \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

$$\hat{\psi}_n(\omega) = \begin{cases} 1, & (1+\gamma)\omega_n \le |\omega| \le (1-\gamma)\omega_{n+1} \\ \cos\left[\frac{\pi}{2}\beta\left(\frac{|\omega| - (1-\gamma)\omega_{n+1}}{2\gamma\omega_{n+1}}\right)\right], & (1-\gamma)\omega_{n+1} \le |\omega| \le (1+\gamma)\omega_{n+1} \\ \sin\left[\frac{\pi}{2}\beta\left(\frac{|\omega| - (1-\gamma)\omega_n}{2\gamma\omega_n}\right)\right], & (1-\gamma)\omega_n \le |\omega| \le (1+\gamma)\omega_n \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

$$\beta(x) = x^4\left(35 - 84x + 70x^2 - 20x^3\right) \tag{3}$$

where, if $\gamma < \min_n\left(\frac{\omega_{n+1} - \omega_n}{\omega_{n+1} + \omega_n}\right)$, the empirical wavelets $\{\phi_1(t), \{\psi_n(t)\}_{n=1}^{N}\}$ form a set of orthogonal basis of the space $L^2(R)$. The EWT decomposed sub-layers are illustrated in Fig. 2.

Fig. 1. The framework of the air pollutant concentrations forecasting model.

2.2. Variational mode decomposition

The VMD is a self-adaptive signal processing algorithm that converts signal decomposition into a variational problem. Compared to other decomposing methods, the VMD has good noise robustness and the algorithm is simple, since each mode is updated in the frequency domain. The procedure of the VMD algorithm can be explained as follows (Dragomiretskiy & Zosso, 2014):

Step 1: Initialize the modes $\{\hat{u}_k^1\}$, the center frequencies $\{\omega_k^1\}$, the Lagrange multiplier $\{\hat{\lambda}^1\}$, the secondary penalty factor $\alpha$ and the iteration counter $n$.

Step 2: Update $u_k$ and $\omega_k$ according to Eqs. (4) and (5):

$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \ne k}\hat{u}_i(\omega) + \frac{\hat{\lambda}(\omega)}{2}}{1 + 2\alpha(\omega - \omega_k)^2} \tag{4}$$

$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega\,|\hat{u}_k(\omega)|^2\,d\omega}{\int_0^{\infty} |\hat{u}_k(\omega)|^2\,d\omega} \tag{5}$$

Step 3: Update $\lambda$ according to Eq. (6):

$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \tau\left(\hat{f}(\omega) - \sum_{k=1}^{K}\hat{u}_k^{n+1}(\omega)\right) \tag{6}$$

Step 4: For the given discriminant accuracy $e > 0$, if $\sum_k \|\hat{u}_k^{n+1} - \hat{u}_k^{n}\|_2^2 / \|\hat{u}_k^{n}\|_2^2 < e$, the iteration stops; otherwise, return to Step 2.

Finally, the whole procedure ends with several narrow-band IMF components, as illustrated in Fig. 3.
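The VMD iteration (mode update, center-frequency update, multiplier update and stopping test) can be sketched in a few lines of NumPy. This is a simplified illustration and not the reference implementation of Dragomiretskiy and Zosso: it assumes a real signal handled through its one-sided FFT, omits the mirror extension used in practice, initializes the center frequencies uniformly, and defaults to `tau = 0` (no multiplier update). The function name `vmd_sketch`, its parameter values and the synthetic test signal are hypothetical.

```python
import numpy as np

def vmd_sketch(signal, K=2, alpha=2000.0, tau=0.0, tol=1e-7, max_iter=500):
    """Simplified VMD loop on the one-sided spectrum of a real signal."""
    f_hat = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal))          # cycles per sample, in [0, 0.5]
    u_hat = np.zeros((K, f_hat.size), dtype=complex)
    omega = 0.5 * np.arange(K) / K                # initial center frequencies
    lam = np.zeros(f_hat.size, dtype=complex)     # Lagrange multiplier spectrum
    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            others = u_hat.sum(axis=0) - u_hat[k]
            # Mode update: Wiener-like filter centered on omega[k]
            u_hat[k] = (f_hat - others + lam / 2) / (1 + 2 * alpha * (freqs - omega[k]) ** 2)
            # Center-frequency update: power-weighted mean frequency of the mode
            power = np.abs(u_hat[k]) ** 2
            omega[k] = np.sum(freqs * power) / np.sum(power)
        # Multiplier update: dual ascent on the reconstruction residual
        lam = lam + tau * (f_hat - u_hat.sum(axis=0))
        # Stopping test: summed relative change of the modes (small constant avoids 0/0)
        change = sum(
            np.sum(np.abs(u_hat[k] - u_prev[k]) ** 2)
            / (np.sum(np.abs(u_prev[k]) ** 2) + 1e-12)
            for k in range(K)
        )
        if change < tol:
            break
    modes = np.fft.irfft(u_hat, n=len(signal), axis=1)
    return modes, omega

# Two-tone test signal (hypothetical data): 0.05 and 0.2 cycles per sample
t = np.arange(1000)
x = np.cos(2 * np.pi * 0.05 * t) + 0.5 * np.cos(2 * np.pi * 0.2 * t)
modes, omega = vmd_sketch(x)
```

On this clean two-tone signal the two center frequencies settle close to 0.05 and 0.2 cycles per sample, and each row of `modes` holds one narrow-band component.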

Fig. 2. The EWT decomposed results of the original PM2.5 series.

Fig. 3. The VMD decomposed results of the original PM2.5 series.

3. Optimization of NARX neural networks and SVM model

3.1. Standard NARX neural networks

The NARX network is a dynamic neural architecture with a memory function, characterized by two input delays in the hidden layer. The NARX model can be expressed as follows (Jamil & Zeeshan, 2018):

$$y(n+1) = f\left[y(n), \ldots, y(n-d_y+1);\ u(n), u(n-1), \ldots, u(n-d_u+1)\right] \tag{7}$$

where $u(n)$ and $y(n)$ are the input and output of the NARX network at discrete time step $n$, and $d_u$ and $d_y$ are the orders of the input memory and output memory, respectively. It can be written in vector form as follows:

$$y(n+1) = f\left[\mathbf{y}(n);\ \mathbf{u}(n)\right] \tag{8}$$

where $\mathbf{u}(n)$ and $\mathbf{y}(n)$ denote the input and output regressors, respectively. Fig. 4 illustrates the structure of a NARX network with three hidden layers. There are two types of NARX networks. The series-parallel (SP) mode takes the actual values of the output as output regressors, whereas the parallel mode feeds the estimated output back into the output regressor. In this study, the SP mode is chosen for the forecasting NARX network, and the training algorithm is 'trainlm'.

Fig. 4. The structure of a NARX network with three-hidden-layer.

3.2. Support vector machine

The regression function is estimated in the set of functions $f(x, a) = \langle w, \phi(x)\rangle + b$, and regression estimation is posed as the problem of minimizing the risk, under the principle of structural risk minimization, for the $\varepsilon$-insensitive linear loss function ($\varepsilon \ge 0$). The elements of the structure $S_n$ are defined by the inequality $w \cdot w \le c_n$. The support vector estimate for regression is then generated (Zhao & Peng, 2010). The regularization parameter and the kernel parameter are the most important parameters affecting the performance of the support vector machine. To achieve good applicability of the SVM, it is necessary to tune these parameters with an optimization algorithm.

3.3. Multi-agent evolutionary genetic algorithm

The Multi-Agent Evolutionary Genetic Algorithm (MAEGA) designs agents for optimization by defining each agent's purpose, living environment, local environment and behaviors. An agent is defined as a candidate solution of the optimization problem, denoted $p = (x_1, x_2, \ldots, x_Q)$, where $Q$ is the number of variables. Its energy equals the negative of the objective function, $E(p) = -f(p)$. To give the agents local perception ability, the living environment is organized as a grid structure, called the main grid and denoted $G$. Each agent is fixed at a grid point $(i, j)$, cannot move, and is denoted $p_{i,j}$; it therefore interacts only with the agents in its neighborhood, through competition and collaboration. Agents share information through diffusion and update their own information through self-learning and mutation (variation). The four mechanisms of self-learning, competition, cooperation and mutation accomplish the evolution of the agents and thereby the search for the optimum. The key of the multi-agent evolutionary algorithm is the design of these operators (Zhao & Peng, 2010).

(1) Self-learning operator $\Lambda(p)$ (Gagné, Schoenauer, Sebag, & Tomassini, 2006)

Agents possess knowledge related to the problem to be solved, which can be used for self-learning to improve performance. Self-learning is realized by local search:

$$\Lambda(p) = p^{\Lambda}, \text{ where } p^{\Lambda} \text{ satisfies: } \forall\, p_{temp} \in p \pm \Delta p,\ E(p^{\Lambda}) \ge E(p_{temp}) \tag{9}$$

(2) Cooperative operator $X(p_i, p_j)$ (Phienthrakul & Kijsirikul, 2005)

The cooperative operator can be understood as information sharing between agents: a set of agents $P$, $|P| = M$, is generated by neighborhood orthogonal crossover operators, and the highest-energy agent in $P$ is taken as the result of the cooperative operator:

$$X(p_i, p_j) = p^{X}, \text{ where } p^{X} \text{ satisfies: } \forall\, p_{temp} \in P,\ E(p^{X}) \ge E(p_{temp}) \tag{10}$$

(3) Mutation operator $v(p)$ (Cheng-Lung & Chieh-Jen, 2006)

The standard normal random number method is used for mutation:

$$v(p) = \begin{cases} p, & U(0,1) < P_v \\ p + \Delta p, & \text{otherwise} \end{cases} \tag{11}$$

where $U(0,1)$ is a random number uniformly distributed on the interval (0, 1), $P_v$ is the mutation probability, $\Delta p = (e_1, e_2, \ldots, e_Q)$ with $e_i \in \Phi(0, 1/t)$, $i = 1, \ldots, Q$, and $t$ is the evolutionary generation.

(4) Competition operator $r(p)$ (Smeaton, Over, & Kraaij, 2006)

An agent competes only with the agent of maximum energy in its neighborhood, denoted $P^{max}$. The competition rule is:

$$r(p) = \begin{cases} p, & E(p) \ge E(P^{max}) \\ p^{new}, & E(p) < E(P^{max}) \end{cases} \tag{12}$$

where $p^{new}$ is the agent produced by self-learning, orthogonal crossover and mutation.

3.4. Autoregressive integrated moving average model

The ARIMA is a well-known time series forecasting method proposed by Box and Jenkins. Its standard form is ARIMA(p, d, q), where p is the autoregressive order, d is the differencing order and q is the moving average order. In the identification stage, the type of model and the corresponding parameters (p, d, q) are determined according to indicators such as differencing results, correlation coefficients and an information criterion. After that, a suitable mathematical method is applied to estimate the unknown parameters in the model equation.

3.5. NARX and SVM optimized by the MAEGA model

To optimize the parameters of the NARX network and the SVM with the Multi-Agent Evolutionary Genetic Algorithm, the parameters must first be encoded. The two input delays of the NARX model, or the regularization parameter $C$ and kernel parameter $\sigma$ of the SVM, are represented by the agent $P = (d_u, d_y)$ or $P = (C, \sigma)$. $Best^{t}$ and $CBest^{t}$ denote the optimal agent of the previous generations up to $t$ and the optimal agent of generation $t$, respectively. $t_s$ is the number of generations over which the maximum energy stays unchanged, i.e. $t_s = \max\{\tau \mid Best^{t} = Best^{t+\tau}\}$. To enhance the convergence performance of the method, the maximum number of evolution generations $t_{max}$ and the maximum number of energy-invariant generations $t_{s\,max}$ are set as the double termination criteria. The flow of the algorithm is as follows (Zhao & Peng, 2010):

Step 1: Set $t = 0$ and initialize all the agents $p_{i,j}$ in the main grid $G$. Because there is no prior information, random generation is adopted, namely $p_{i,j} = rand(P)$, where $P$ is the encoding space of the solution. Update $Best^{t}$, $Best^{t+\tau}$ and $t_s$.

Step 2: For each agent $p_{i,j}$ of generation $t$, apply self-learning, collaboration, mutation and competition with Eqs. (9)–(12), respectively, to generate a new generation of agents, where $E(p_{i,j})$ is obtained by the standard SVM algorithm.

Step 3: Update $Best^{t}$, $Best^{t+\tau}$ and $t_s$.

Step 4: Judge whether the evolutionary termination criterion is satisfied. If it is, output $Best^{t}$, $Best^{t+\tau}$; otherwise let $t = t + 1$ and return to Step 2.

3.6. NARX neural networks optimized by the ARIMA model

The ARIMA model can reveal the inner correlation between data in a time series. This feature can be used to select the parameters of the NARX neural network: the two input delays of the NARX model are determined according to the autoregressive order p of the ARIMA model. The two judgment criteria used in this study are the ACF and the PACF. Take the PM2.5 series as an example to explain the combination process. The ACF and PACF results of the PM2.5 series are given in Fig. 5. From Fig. 5, it can be observed that the autoregressive order p of the ARIMA model is chosen as 20. The two input delays of the NARX model are determined correspondingly.

4. Forecasting computation

4.1. Original series

Four original series (PM2.5, SO2, NO2 and CO) are used to test the performance of each model, and each series includes 1400 samples. The first 800 samples are used to train the NARX networks and the SVM model, while the remaining 600 samples are used for testing. The four raw air pollutant series are shown in Fig. 6. Their statistical characteristics are given in Table 1.

4.2. Forecasting experiment

In the study, to assess the forecasting performance of the EWT-MAEGA-NARX model, five other prediction models are provided as contrast models: the VMD-MAEGA-NARX model, the EWT-MAEGA-SVM model, the MAEGA-NARX model, the EWT-NARX model and the EWT-ARIMA-NARX model. To evaluate the prediction performance of each model, three error criteria are adopted: MAE (Afzali et al., 2017), MAPE (Liu, Binaykia, Chang, Tiwari, & Tsao, 2017) and RMSE (Kasiviswanathan, He, Sudheer, & Tay, 2016). Additionally, to investigate the multi-step forecasting performance of the developed model, 1-step, 3-step and 5-step predictions are conducted for all the mentioned models.

4.2.1. Forecasting result

To investigate the prediction performance of each model, the prediction results and prediction errors at different steps (1-step, 3-step, 5-step) are recorded. Taking the 3-step forecasting results as an example, Figs. 7–10 show the forecasting trends of the four air pollutant series for the six models involved.
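The three error criteria named in the forecasting experiment (MAE, MAPE and RMSE) are cited in the paper without their formulas. The sketch below assumes the standard textbook definitions, with MAPE expressed in percent and all observed values non-zero; the function names and the sample numbers are hypothetical and are not taken from the Beijing data.

```python
import numpy as np

def mae(actual, predicted):
    """Mean absolute error."""
    return float(np.mean(np.abs(actual - predicted)))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes no zero observations)."""
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)

def rmse(actual, predicted):
    """Root mean square error."""
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Hypothetical observations and forecasts
actual = np.array([10.0, 20.0, 30.0])
predicted = np.array([12.0, 18.0, 33.0])
errors = {
    "MAE": mae(actual, predicted),
    "MAPE": mape(actual, predicted),
    "RMSE": rmse(actual, predicted),
}
```

MAE and RMSE carry the unit of the series, while MAPE is dimensionless, which is why the three criteria are reported in separate columns.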

4.2.2. Error estimated analysis

The forecasting errors of the models above are recorded in Table 2.

(a) The EWT-MAEGA-NARX model can achieve accurate predictions in air pollutant concentrations forecasting. Compared with the five other forecasting models, the EWT-MAEGA-NARX model has better prediction accuracy in 1-step, 3-step and 5-step forecasting. Taking the 1-step prediction results as an instance, the MAE, MAPE and RMSE of the EWT-MAEGA-NARX model for the PM2.5 series are 0.1314 μg/m3, 0.0032% and 0.1793 μg/m3, respectively; for the SO2 series they are 0.0213 μg/m3, 0.0031% and 0.0347 μg/m3, respectively; for the NO2 series they are 0.0722 μg/m3, 0.0013% and 0.0969 μg/m3, respectively; and for the CO series they are 0.0033 mg/m3, 0.0033% and 0.0041 mg/m3, respectively.
(b) The EWT-MAEGA-NARX model and the VMD-MAEGA-NARX model enjoy similar forecasting accuracy. Taking the 1-step prediction results of the PM2.5 series as an instance, the error indexes of the EWT-MAEGA-NARX model are 0.1314 μg/m3, 0.0032% and 0.1793 μg/m3, respectively; the error indexes of the VMD-MAEGA-NARX model are 0.2112 μg/m3, 0.0052% and 0.3361 μg/m3, respectively. It shows that the two decomposition algorithms, EWT and VMD, have a similar impact on the forecasting accuracy of the model.
(c) The EWT-MAEGA-NARX model has better forecasting accuracy than the EWT-ARIMA-NARX model. Taking the 1-step prediction results of the SO2 series as an instance, the error indexes of the EWT-MAEGA-NARX model are 0.0213 μg/m3, 0.0031% and 0.0347 μg/m3, respectively; the error indexes of the EWT-ARIMA-NARX model are 0.0689 μg/m3, 0.0093% and 0.1217 μg/m3, respectively. It shows that the MAEGA model is better than the ARIMA model at optimizing the parameters of the NARX neural network.
(d) The EWT-MAEGA-NARX model has better forecasting accuracy than the EWT-MAEGA-SVM model. Taking the 1-step forecasting results of the NO2 series as an instance, the error indexes of the EWT-MAEGA-NARX model are 0.0722 μg/m3, 0.0013% and 0.0969 μg/m3, respectively; the error indexes of the EWT-MAEGA-SVM model are 0.3737 μg/m3, 0.0072% and 0.9451 μg/m3, respectively. It shows that the NARX network has better prediction performance for air pollutant series than the SVM model.
(e) The error indexes of all the models show an overall increase as the number of prediction steps increases.

Fig. 5. ACF and PACF value of PM2.5 series.

Fig. 6. Original air pollutant series.

Table 1
Descriptive statistics of the air pollutant series.

| Series | Min-Max values | Mean | Standard deviation | Skewness | Kurtosis |
|---|---|---|---|---|---|
| PM2.5 | 5.70-160.80 | 61.9111 | 34.9648 | 0.6105 | 2.6318 |
| SO2 | 2.0-33.30 | 10.1876 | 6.7854 | 1.1250 | 3.5675 |
| NO2 | 25.90-134.5 | 55.9676 | 18.1357 | 1.7829 | 7.9568 |
| CO | 0.40-2.80 | 1.0813 | 0.3973 | 0.9424 | 4.3871 |

Fig. 7. The results of 3-step ahead forecasting of the PM2.5 series.

Fig. 8. The results of 3-step ahead forecasting of the SO2 series.

Fig. 9. The results of 3-step ahead forecasting of the NO2 series.

Fig. 10. The results of 3-step ahead forecasting of the CO series.

Table 2
The forecasting results for the air pollutant series (MAE and RMSE in μg/m3, MAPE in %; each group of three columns lists the 1-step, 3-step and 5-step results).

| Series | Model | MAE 1-step | MAE 3-step | MAE 5-step | MAPE 1-step | MAPE 3-step | MAPE 5-step | RMSE 1-step | RMSE 3-step | RMSE 5-step |
|---|---|---|---|---|---|---|---|---|---|---|
| PM2.5 | EWT-MAEGA-NARX | 0.1314 | 0.5737 | 0.6070 | 0.0032 | 0.0151 | 0.0145 | 0.1793 | 0.7748 | 0.8101 |
| PM2.5 | VMD-MAEGA-NARX | 0.2112 | 0.4799 | 1.0977 | 0.0052 | 0.0098 | 0.0264 | 0.3361 | 0.8175 | 1.6799 |
| PM2.5 | EWT-MAEGA-SVM | 0.6263 | 1.7890 | 4.0171 | 0.0142 | 0.0428 | 0.0880 | 0.9261 | 2.3564 | 7.4307 |
| PM2.5 | MAEGA-NARX | 1.3941 | 1.5468 | 2.3973 | 0.0329 | 0.0383 | 0.0619 | 2.0442 | 2.1785 | 3.4045 |
| PM2.5 | EWT-NARX | 1.4200 | 1.4636 | 1.9689 | 0.0318 | 0.0327 | 0.0444 | 2.1827 | 2.2088 | 2.7697 |
| PM2.5 | EWT-ARIMA-NARX | 0.4783 | 0.7773 | 1.1124 | 0.0127 | 0.0173 | 0.0280 | 0.7738 | 1.1668 | 1.5923 |
| SO2 | EWT-MAEGA-NARX | 0.0213 | 0.0235 | 0.0919 | 0.0031 | 0.0030 | 0.0115 | 0.0347 | 0.0367 | 0.1557 |
| SO2 | VMD-MAEGA-NARX | 0.0441 | 0.0988 | 0.1820 | 0.0057 | 0.0126 | 0.0214 | 0.0875 | 0.1768 | 0.3311 |
| SO2 | EWT-MAEGA-SVM | 0.0203 | 0.1247 | 0.3536 | 0.0029 | 0.0155 | 0.0463 | 0.0402 | 0.1973 | 0.5005 |
| SO2 | MAEGA-NARX | 0.1825 | 0.2532 | 0.4927 | 0.0291 | 0.0381 | 0.0669 | 0.2415 | 0.3539 | 0.6958 |
| SO2 | EWT-NARX | 0.2804 | 0.2902 | 0.3837 | 0.0373 | 0.0371 | 0.0483 | 0.3905 | 0.4186 | 0.5735 |
| SO2 | EWT-ARIMA-NARX | 0.0689 | 0.1534 | 0.2881 | 0.0093 | 0.0193 | 0.0364 | 0.1217 | 0.2502 | 0.4353 |
| NO2 | EWT-MAEGA-NARX | 0.0722 | 0.0744 | 0.2189 | 0.0013 | 0.0013 | 0.0039 | 0.0969 | 0.0943 | 0.2769 |
| NO2 | VMD-MAEGA-NARX | 0.1066 | 0.2057 | 0.4346 | 0.0019 | 0.0037 | 0.0077 | 0.2807 | 0.4163 | 0.6078 |
| NO2 | EWT-MAEGA-SVM | 0.3737 | 0.6508 | 0.9716 | 0.0072 | 0.0118 | 0.0168 | 0.9451 | 1.0753 | 1.2424 |
| NO2 | MAEGA-NARX | 0.4514 | 0.5847 | 1.0182 | 0.0081 | 0.0102 | 0.0184 | 0.6274 | 0.8028 | 1.3377 |
| NO2 | EWT-NARX | 0.9511 | 0.9597 | 1.0186 | 0.0170 | 0.0170 | 0.0177 | 1.2569 | 1.2988 | 1.3269 |
| NO2 | EWT-ARIMA-NARX | 0.1796 | 0.3509 | 0.6101 | 0.0032 | 0.0062 | 0.0107 | 0.3548 | 0.5263 | 0.8398 |
| CO | EWT-MAEGA-NARX | 0.0033 | 0.0041 | 0.0082 | 0.0033 | 0.0036 | 0.0064 | 0.0041 | 0.0072 | 0.0160 |
| CO | VMD-MAEGA-NARX | 0.0088 | 0.0180 | 0.0243 | 0.0082 | 0.0164 | 0.0218 | 0.0135 | 0.0242 | 0.0339 |
| CO | EWT-MAEGA-SVM | 0.0118 | 0.0246 | 0.0553 | 0.0093 | 0.0184 | 0.0481 | 0.0199 | 0.0383 | 0.0832 |
| CO | MAEGA-NARX | 0.0293 | 0.0269 | 0.0592 | 0.0262 | 0.0247 | 0.0458 | 0.0375 | 0.0357 | 0.0988 |
| CO | EWT-NARX | 0.0280 | 0.0371 | 0.0384 | 0.0272 | 0.0306 | 0.0354 | 0.0350 | 0.0741 | 0.0536 |
| CO | EWT-ARIMA-NARX | 0.0123 | 0.0246 | 0.0293 | 0.0119 | 0.0225 | 0.0263 | 0.0158 | 0.0351 | 0.0431 |

In order to compare the forecasting accuracy of the mentioned models, the promoting percentages of the estimated errors are defined as follows:

$$P_{MAE} = (MAE_1 - MAE_2)/MAE_1 \tag{13}$$

$$P_{MAPE} = (MAPE_1 - MAPE_2)/MAPE_1 \tag{14}$$

$$P_{RMSE} = (RMSE_1 - RMSE_2)/RMSE_1 \tag{15}$$

Here the subscript 1 denotes the comparison model and the subscript 2 denotes the EWT-MAEGA-NARX model, as can be checked against the values in Tables 2 and 3; the results are reported as percentages.

In the study, the proposed model is compared with the MAEGA-NARX model and the EWT-NARX model. The MAE, MAPE and RMSE promoting percentages of the EWT-MAEGA-NARX model are presented in Table 3.

In accordance with Table 2 and Figs. 7–10, it can be found that:

(a) The EWT-MAEGA-NARX model significantly outperforms the MAEGA-NARX model in prediction accuracy. Taking the 1-step prediction results of the PM2.5 series as an instance, P_MAE, P_MAPE and P_RMSE are 90.5746%, 90.2736% and 91.2288%, respectively. This indicates that the EWT is a good feature extractor and that it clearly improves the prediction performance of the model.
(b) The EWT-MAEGA-NARX model significantly outperforms the EWT-NARX model in prediction accuracy. Taking the 1-step prediction results of the PM2.5 series as an instance, P_MAE, P_MAPE and P_RMSE are 90.7465%, 89.9371% and 91.7854%, respectively. It shows that the parameter optimization of the NARX network by the MAEGA clearly enhances the prediction performance of the model.

5. Conclusion

In this study, a novel hybrid air pollutant concentrations forecasting model is developed with the NARX neural network, the MAEGA model and the EWT decomposing method. The EWT method is adopted to decompose the raw air pollutant series into a number of sub-series. The NARX network, whose parameters are selected by the MAEGA model, is utilized to realize the multi-step ahead forecasting of air pollutant concentrations. In order to test the forecasting accuracy of the EWT-

Table 3
The promoting percentages of estimated errors.

| Index | Forecasting models | Step | PM2.5 | SO2 | NO2 | CO |
|---|---|---|---|---|---|---|
| PMAE (%) | EWT-MAEGA-NARX VS MAEGA-NARX | 1-step | 90.5746 | 88.3288 | 84.0053 | 88.7372 |
| PMAE (%) | EWT-MAEGA-NARX VS MAEGA-NARX | 3-step | 62.9105 | 90.7188 | 87.2755 | 84.7584 |
| PMAE (%) | EWT-MAEGA-NARX VS MAEGA-NARX | 5-step | 74.6798 | 81.3477 | 78.5013 | 86.1486 |
| PMAE (%) | EWT-MAEGA-NARX VS EWT-NARX | 1-step | 90.7465 | 92.4037 | 92.4088 | 88.2143 |
| PMAE (%) | EWT-MAEGA-NARX VS EWT-NARX | 3-step | 60.8021 | 91.9021 | 92.2476 | 88.9488 |
| PMAE (%) | EWT-MAEGA-NARX VS EWT-NARX | 5-step | 69.1706 | 76.0490 | 78.5097 | 78.6458 |
| PMAPE (%) | EWT-MAEGA-NARX VS MAEGA-NARX | 1-step | 90.2736 | 89.3471 | 83.9506 | 87.4046 |
| PMAPE (%) | EWT-MAEGA-NARX VS MAEGA-NARX | 3-step | 60.5744 | 92.1260 | 87.2549 | 85.4251 |
| PMAPE (%) | EWT-MAEGA-NARX VS MAEGA-NARX | 5-step | 76.5751 | 82.8102 | 78.8043 | 86.0262 |
| PMAPE (%) | EWT-MAEGA-NARX VS EWT-NARX | 1-step | 89.9371 | 91.6890 | 92.3529 | 87.8676 |
| PMAPE (%) | EWT-MAEGA-NARX VS EWT-NARX | 3-step | 53.8226 | 91.9137 | 92.3529 | 88.2353 |
| PMAPE (%) | EWT-MAEGA-NARX VS EWT-NARX | 5-step | 67.3423 | 76.1905 | 77.9661 | 81.9209 |
| PRMSE (%) | EWT-MAEGA-NARX VS MAEGA-NARX | 1-step | 91.2288 | 85.6315 | 84.5553 | 89.0667 |
| PRMSE (%) | EWT-MAEGA-NARX VS MAEGA-NARX | 3-step | 64.4342 | 89.6298 | 88.2536 | 79.8319 |
| PRMSE (%) | EWT-MAEGA-NARX VS MAEGA-NARX | 5-step | 76.2050 | 77.6229 | 79.3003 | 83.8057 |
| PRMSE (%) | EWT-MAEGA-NARX VS EWT-NARX | 1-step | 91.7854 | 91.1140 | 92.2906 | 88.2857 |
| PRMSE (%) | EWT-MAEGA-NARX VS EWT-NARX | 3-step | 64.9221 | 91.2327 | 92.7395 | 90.2834 |
| PRMSE (%) | EWT-MAEGA-NARX VS EWT-NARX | 5-step | 70.7513 | 72.8509 | 79.1318 | 70.1493 |

concentrations from multiple sources using AERMOD coupled with WRF prognostic model. Journal of Cleaner Production, 166, 1216–1225.
Cheng-Lung, H., & Chieh-Jen, W. (2006). A GA-based feature selection and parameters optimization for support vector machines. Expert Systems With Applications, 31, 231–240.
Coker, E., & Kizito, S. (2018). A narrative review on the human health effects of ambient air pollution in Sub-Saharan Africa: An urgent need for health effects studies. International Journal of Environmental Research and Public Health, 15, 427.
Dragomiretskiy, K., & Zosso, D. (2014). Variational mode decomposition. IEEE Transactions on Signal Processing, 62, 531–544.
Gagné, C., Schoenauer, M., Sebag, M., & Tomassini, M. (2006). Genetic Programming for Kernel-Based Learning with Co-evolving Subsets Selection, 4193, 1008–1017.
Gilles, J. (2013). Empirical wavelet transform. IEEE Transactions on Signal Processing, 61, 3999–4010.
Hao, Y., Peng, H., Temulun, T., Liu, L.-Q., Mao, J., Lu, Z.-N., et al. (2018). How harmful is air pollution to economic development? New evidence from PM2.5 concentrations of Chinese cities. Journal of Cleaner Production, 172, 743–757.
Hou, X. (2012). Application of projection pursuit model to analyze soil pollutants of sewage irrigation region. Xinjiang Agricultural Sciences, 49, 730–734.
Huang, Q., Cheng, S., Perozzi, R. E., & Perozzi, E. F. (2012). Use of a MM5–CAMx–PSAT modeling system to study SO2 source apportionment in the Beijing Metropolitan Region. Environmental Modeling & Assessment, 17, 527–538.
Jamil, M., & Zeeshan, M. (2018). A comparative analysis of ANN and chaotic approach-based wind speed prediction in India. Neural Computing & Applications, 1–13.
Kasiviswanathan, K. S., He, J., Sudheer, K. P., & Tay, J. H. (2016). Potential application of wavelet neural network ensemble to forecast streamflow for flood management. Journal of Hydrology, 536, 161–173.
Kumar, A., & Goyal, P. (2011). Forecasting of air quality in Delhi using principal component regression technique. Atmospheric Pollution Research, 2, 436–444.
Li, J., Gong, D., & Liu, X. (2012). Prediction and analysis of air pollutants concentrations in Wuwei City of Gansu Province based on GM(1,1) model. Environmental Science & Management.
Liu, B. C., Binaykia, A., Chang, P. C., Tiwari, M. K., & Tsao, C. C. (2017). Urban air quality forecasting based on multi-dimensional collaborative Support Vector Regression (SVR): A case study of Beijing-Tianjin-Shijiazhuang. PloS One, 12, e0179763.
Osowski, S., & Garanty, K. (2007). Forecasting of the daily meteorological pollution using wavelets and support vector machine. Engineering Applications of Artificial Intelligence, 20, 745–755.
Phienthrakul, T., & Kijsirikul, B. (2005). Evolutionary strategies for multi-scale radial basis function kernels in support vector machines. 905–911.
Pilla, F., & Broderick, B. (2015). A GIS model for personal exposure to PM10 for Dublin commuters. Sustainable Cities and Society, 15, 1–10.
Polat, K. (2012). A novel data preprocessing method to estimate the air pollution (SO2): Neighbor-based feature scaling (NBFS). Neural Computing & Applications, 21, 1987–1994.
Silva, L. T., & Mendes, J. F. G. (2012). City Noise-Air: An environmental quality index for cities. Sustainable Cities and Society, 4, 1–11.
Smeaton, A. F., Over, P., & Kraaij, W. (2006). Evaluation campaigns and TRECVid. 321–330.
Voukantsis, D., Karatzas, K., Kukkonen, J., Räsänen, T., Karppinen, A., & Kolehmainen, M. (2011). Intercomparison of air quality data using principal component analysis, and forecasting of PM10 and PM2.5 concentrations using artificial neural networks, in Thessaloniki and Helsinki. The Science of the Total Environment, 409, 1266–1276.
Wang, J., Zhang, X., & Liu, T. (2011). Projection pursuit regression model for prediction of air permeability of woven fabrics. Journal of Textile Design Research and Practice, 32, 46–48.
Wang, L., Jang, C., Zhang, Y., Wang, K., Zhang, Q., Streets, D., et al. (2010). Assessment of air quality benefits from national air pollution control policies in China. Part II: Evaluation of air quality predictions and air quality benefits assessment. Atmospheric Environment, 44, 3449–3457.
Yu, H., & Stuart, A. L. (2017). Impacts of compact growth and electric vehicles on future air quality and urban exposures may be mixed. The Science of the Total Environment, 576, 148–158.
Zhao, L. H., & Peng, T. (2010). An effective SVM parameter selection optimazation method. Manufacturing Automation, 388, 121–136.
Zhong, P., Huang, S., Zhang, X., Wu, S., Zhu, Y., Li, Y., et al. (2018). Individual-level modifiers of the acute effects of air pollution on mortality in Wuhan, China. Global Health Research and Policy, 3, 27.
Zhou, J.-h., Zhao, J.-g., & Li, P. (2010). Study on Gray Numerical Model of Air Pollution in Wuan City. 2010 International Conference on Challenges in Environmental Science and Computer Engineering, 321–323.
Zvereva, E. L., & Kozlov, M. V. (2010). Responses of terrestrial arthropods to air pollution: A meta-analysis. Environmental Science and Pollution Research - International, 17, 297–311.

According to Table 3, it can be clearly seen that:

MAEGA-NARX model, five other models are selected as contrast models: the VMD-MAEGA-NARX model, EWT-MAEGA-SVM model, MAEGA-NARX model, EWT-NARX model and EWT-ARIMA-NARX model. The results of several experiments demonstrate that: (a) the EWT-MAEGA-NARX model achieves high accuracy, which demonstrates that combining a decomposing algorithm with the NARX network is successful; (b) the EWT-MAEGA-NARX model and the VMD-MAEGA-NARX model present the highest accuracy for one-step-ahead predictions and for the other multi-step-ahead predictions, respectively, and in every forecasting step these two models outperform the other models; (c) among the remaining models, the EWT-ARIMA-NARX model is generally better than the EWT-MAEGA-SVM model, and the gap between these hybrid models narrows as the number of prediction steps increases; (d) the MAEGA-NARX model and the EWT-NARX model have the worst forecasting performance for all steps. These two models are less accurate than the others because each applies only one of the two techniques, either MAEGA optimization of the NARX network or decomposition of the non-stationary air pollutant series into components. In conclusion, the proposed model offers a new idea for the forecasting and early warning of atmospheric pollutant concentrations, and improves the forecasting accuracy of several common air pollutants to a certain extent.

Acknowledgements

This study is fully supported by the National Natural Science Foundation of China (Grant No. 61873283), the Changsha Science & Technology Project and Training Program for Excellent Young Innovators of Changsha (Grant No. KQ1707017), the Shenghua Yu-ying Talents Program of the Central South University and the innovation driven project of the Central South University (Project No. 2019CX005).

References

Afzali, A., Rashid, M., Afzali, M., & Younesi, V. (2017). Prediction of air pollutants
