Daily urban air quality index forecasting based on variational mode decomposition, sample entropy and LSTM neural network


Sustainable Cities and Society 50 (2019) 101657 Contents lists available at ScienceDirect Sustainable Cities and Society journal homepage: www.elsev...


Sustainable Cities and Society 50 (2019) 101657

Contents lists available at ScienceDirect

Sustainable Cities and Society journal homepage: www.elsevier.com/locate/scs

Qunli Wu a,b, Huaxing Lin a

a Department of Economics and Management, North China Electric Power University, 689 Huadian Road, Baoding 071003, China
b Beijing Key Laboratory of New Energy and Low-Carbon Development, North China Electric Power University, Beijing 102206, China

ARTICLE INFO

Keywords: Air quality index (AQI) forecasting; Variational mode decomposition (VMD); Sample entropy (SE); Long short-term memory (LSTM); Neural network

ABSTRACT

Accurate and effective air quality index (AQI) forecasting is a necessary condition for promoting urban public health and for helping society remain sustainable despite the effects of air pollution. This study proposes a hybrid AQI forecasting model to enhance forecasting accuracy. Variational mode decomposition (VMD) was applied to decompose the original AQI series into sub-series with different frequencies. Sample entropy (SE) was then applied to recombine the sub-series, to address over-decomposition and computational burden. Next, a long short-term memory (LSTM) neural network was established to forecast the new sub-series, and the final AQI forecast was obtained by summing the predicted values of all sub-series. The results illustrated that: (1) the proposed VMD-SE-LSTM model displayed superior capacity for daily urban AQI forecasting, as shown using test case data from Beijing and Baoding; (2) compared with other models, the VMD-SE-LSTM model comprehensively captured the characteristics of the original AQI series. In addition, the proposed model had a high rate of correct AQI class forecasting, which the single models could not achieve, while the other hybrid models could only reflect AQI series trends with limited prediction accuracy.

1. Introduction

In recent years, air pollution has become an increasingly important issue in urban sustainable development. The general public is very sensitive to trends in air quality, as air pollution can cause disease, allergies and even death (Brunekreef, 2007; He, Ding, & Prasad, 2019; Liu, Xu, & Yang, 2018). The AQI is a dimensionless index that quantitatively describes air quality status; it tracks six pollutants (PM2.5, PM10, CO, O3, SO2, and NO2) and is estimated with reference to the new ambient air quality standard (GB3095-2012). Currently, AQI is a key air quality indicator, and is closely related to outdoor activity decisions and human health (Cao et al., 2011; Ribeiro, Pinho, Branquinho, Llop, & Pereira, 2016; Zhou, Chang, Chang, Kao, & Wang, 2019). It is also now known that even low levels of air pollution can trigger discomfort in sensitive members of a given population (He, Yang, & Ye, 2014; Li et al., 2018). Table 1 lists the AQI standards and related notes in China, with AQI values assigned to air quality classes ('excellent', 'good', 'light pollution', 'moderate pollution', 'serious pollution' or 'heavy pollution'), and notes on whether outdoor activities should be pursued.



Real-time air quality information is very important for air pollution control and human health protection (Lin & Zhu, 2018; Ni, Huang, & Du, 2017), which highlights the significance of sound air quality forecasting, a primary approach to informing ever-changing air pollution trend and providing health warnings in advance. Moreover, this provides important support for urban environmental management decision-making in avoiding major accidents caused by air pollution (He, 2018; Pisoni et al., 2018). In the literature, air pollutants and AQI forecasting methods can be classified into two types: numerical models and data-driven models. Numerical models are widely used in the atmospheric field, and are based on classical physical and chemical theories (Baklanov et al., 2008; Vijayaraghavan et al., 2016). Data-driven models include traditional statistical models (Chen & Pai, 2015), AI models (Taylan, 2017; Wang, Liu, Qin, & Zhang, 2015; Wei & Liu, 2016), and hybrid models (Wang et al., 2015), which are based on the characteristics of historical data (Zhu, Wu, Chen, Zhou, & Tao, 2018). Numerical models tend to be complicated, due to the uncertainties associated with source inventories, and to complex atmospheric conditions (Niu, Wang, Sun, & Li, 2016). In our study, we mainly considered data-driven models, as they achieve prediction results easily,

Corresponding author. E-mail address: [email protected] (H. Lin).

https://doi.org/10.1016/j.scs.2019.101657 Received 11 March 2019; Received in revised form 11 June 2019; Accepted 11 June 2019 Available online 12 June 2019 2210-6707/ © 2019 Elsevier Ltd. All rights reserved.


Nomenclature

AERMOD: American Meteorological Society/Environmental Protection Agency regulatory model
ARIMA: Autoregressive integrated moving average
AQI: Air quality index
AI: Artificial intelligence
ADMM: Alternating direction method of multipliers
BBODE: Biogeography-based optimization differential evolution
BIMF: Band-limited intrinsic mode functions
BPNN: Back propagation neural network
CEEMD: Complementary ensemble empirical mode decomposition
CFM: Combined forecasting model
DE: Differential evolution
ELM: Extreme learning machine
EMD: Empirical mode decomposition
EEMD: Ensemble empirical mode decomposition
EWT: Empirical wavelet transform
GRNN: Generalized regression neural network
LSTM: Long short-term memory
LSSVM: Least-squares support-vector machines
MAEGA: Multi-agent evolutionary genetic algorithm
MM: Mirror method
MAE: Mean absolute error
MAPE: Mean absolute percentage error
NARX: Nonlinear autoregressive network with exogenous inputs
PSOGSA: Particle swarm optimization and gravitational search algorithm
RMSE: Root mean square error
SE: Sample entropy
SVR: Support vector regression
VMD: Variational mode decomposition
WD: Wavelet decomposition

Table 1. The standard of Air Quality Index and related information, for China.

AQI | Grade | AQI class | Notes
0–50 | I | Excellent | Take part in outdoor activities to breathe fresh air
51–100 | II | Good | Outdoor activities can be carried out normally
101–150 | III | Light pollution | Sensitive people* should reduce outdoor activities that cause physical exertion
151–200 | IV | Moderate pollution | Greater impact on sensitive people
201–300 | V | Serious pollution | Everyone should reduce outdoor activities appropriately
> 300 | VI | Heavy pollution | Try not to stay outdoors

* The term "sensitive people" refers to the elderly, children, those with respiratory diseases, etc.
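The class boundaries of Table 1 can be turned into a small lookup, which is also what a "correct class forecasting rate" needs. The following is a minimal sketch, not the authors' code; the function names `aqi_class` and `class_hit_rate` are hypothetical, and values between two integer bounds (for example 50.5) are assumed to belong to the higher class.

```python
from bisect import bisect_left

# Upper bounds of the first five AQI ranges in Table 1; anything above 300 is grade VI.
_UPPER_BOUNDS = [50, 100, 150, 200, 300]
_GRADES = ["I", "II", "III", "IV", "V", "VI"]
_CLASSES = [
    "Excellent", "Good", "Light pollution",
    "Moderate pollution", "Serious pollution", "Heavy pollution",
]


def aqi_class(aqi):
    """Return (grade, class name) for a non-negative AQI value, following Table 1."""
    if aqi < 0:
        raise ValueError("AQI must be non-negative")
    # bisect_left gives the index of the first upper bound that is >= aqi.
    idx = bisect_left(_UPPER_BOUNDS, aqi)
    return _GRADES[idx], _CLASSES[idx]


def class_hit_rate(actual, forecast):
    """Share of days on which forecast and observation fall in the same class."""
    hits = sum(aqi_class(a) == aqi_class(f) for a, f in zip(actual, forecast))
    return hits / len(actual)
```

For instance, `aqi_class(180)` falls in grade IV ("Moderate pollution"), and `class_hit_rate` compares two equally long sequences day by day.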

given the availability of applicable software. Table 2 summarizes air pollutant and AQI forecasting studies published in the past four years; it can be seen that hybrid forecasting models have received considerable attention. A hybrid forecasting model is formed by adding signal processing techniques to an AI model, with the aim of decomposing a nonlinear AQI time series into more stationary and regular sub-series, so that the final forecast can be obtained by aggregating the forecast values of the sub-series. Evidently, applying signal decomposition to air pollutant concentration or AQI forecasting enhances prediction performance, but the approach still has shortcomings. There has been little comparison among signal processing technologies, and the number of decomposition modes is often set artificially, which can lead to over-decomposition and inaccurate extraction of the information in the original sequence, and thus affects the final prediction accuracy. For example, Zhu, Wu et al. (2018) employed EEMD to decompose the AQI series into a total of nine modes. However, several modes showed similar trends, which can be considered over-decomposition. In addition, running the prediction model nine times imposed a computational burden, and the final MAPE obtained was 10.35%. These issues indicate that further study of reasonable hybrid models is needed if higher levels of precision are to be achieved.

The principal purpose of this study, therefore, was to develop a new, high-accuracy AQI forecasting method. To this end, a new hybrid model, based on VMD, SE and an LSTM neural network, has been developed. First, VMD was adopted to decompose the original AQI series into a discrete number of components with different frequencies. Next, SE was applied to recombine the sub-series obtained by VMD, to address over-decomposition and computational burden. Then, an LSTM neural network, with good learning ability and time series memory, was established to forecast each recombined sub-series. Finally, the AQI forecast was obtained by summing the predicted values of the components. Using this methodology, a new hybrid model, named VMD-SE-LSTM, was established for AQI forecasting.

The rest of this paper is organized as follows: Section 2 describes the modelling approaches; Section 3 describes the construction of the hybrid model designed to forecast AQI; Section 4 examines the proposed model using experimental and comparative analysis; and Section 5 sets out what has been achieved by this research.

Table 2. Summary of studies on forecasting air pollutants and air quality indexes using different models.

Study | Subject | Type | Method
(Afzali, Rashid, Afzali, & Younesi, 2017) | SO2, NO2, PM10 | N | AERMOD/WRF (dispersion model coupled with WRF)
(Zhou et al., 2019) | PM2.5, PM10, NOx | AI | LSTM
(Wang, Niu, & Wang, 2017) | PM2.5, PM10, SO2, NO2, CO and O3 | H | CEEMD/BBODE/LSSVM
(Wang, Wei et al., 2017) | AQI | H | CEEMD-VMD/ELM/DE
(Yun, Yong, Wang, Xie, & Li, 2016) | PM10, SO2, NO2 | H | WD/BPNN
(Zhu et al., 2017) | AQI | H | EMD/SVR/ARIMA
(Zhu, Wu et al., 2018) | AQI | H | EEMD/MM/CFM
(Zhu, Yang et al., 2018) | AQI | H | CEEMD/PSOGSA/SVR/GRNN
(Liu et al., 2019) | PM2.5, PM10, SO2, CO | H | EWT/MAEGA/NARX

Note: N = numerical model; AI = artificial intelligence model; H = hybrid model.

2. Methods

The research methodology used in this paper involves variational

mode decomposition, sample entropy, and an LSTM neural network; brief descriptions of these methods follow.

2.1. Variational mode decomposition (VMD)

VMD was proposed by Dragomiretskiy and Zosso in 2014 (Zosso & Dragomiretskiy, 2014) as a new, multi-resolution approach to non-recursive signal processing. VMD adaptively decomposes a real-valued signal, $f(t)$, into a discrete number of BIMFs, $u_k$, with specific sparsity properties. Each BIMF $u_k$ is compact around a centre pulsation, $\omega_k$, which is determined during decomposition, and its bandwidth is estimated using the $H^1$ Gaussian smoothness of the shifted signal. The decomposition is thus carried out by solving the constrained variational problem shown in Eq. (1), where $*$ denotes convolution.

$$
\begin{cases}
\min\limits_{\{u_k\},\{\omega_k\}} \left\{ \sum\limits_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \dfrac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \\
\text{s.t.} \ \sum\limits_{k=1}^{K} u_k = f(t)
\end{cases}
\tag{1}
$$

Making use of both a quadratic penalty term and a Lagrangian multiplier, $\lambda$, the above constrained problem can be converted to an unconstrained problem, making it easier to address. The augmented Lagrangian is given in Eq. (2), where $\alpha$ denotes the balancing parameter of the data-fidelity constraint.

$$
L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\ f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle
\tag{2}
$$

The ADMM can be used to solve Eq. (1): $u_k$, $\omega_k$ and $\lambda$ are updated alternately, and the updates are calculated as shown in Eqs. (3)–(5).

$$
\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i^{n}(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha(\omega - \omega_k)^2}
\tag{3}
$$

$$
\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}
\tag{4}
$$

$$
\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \tau \left( \hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(\omega) \right)
\tag{5}
$$

In Eqs. (3)–(5), $\hat{f}(\omega)$, $\hat{u}_k^{n}(\omega)$, $\hat{\lambda}(\omega)$ and $\hat{u}_k^{n+1}(\omega)$ represent the Fourier transforms of $f(t)$, $u_k^{n}(t)$, $\lambda(t)$ and $u_k^{n+1}(t)$, respectively, and $n$ denotes the number of iterations. The VMD termination condition is presented as Eq. (6), where $\varepsilon$ is the tolerance of the convergence criterion.

$$
\sum_{k} \frac{\left\| \hat{u}_k^{n+1} - \hat{u}_k^{n} \right\|_2^2}{\left\| \hat{u}_k^{n} \right\|_2^2} < \varepsilon
\tag{6}
$$

The BIMFs $u_k$ are obtained from the VMD decomposition process according to the following steps:

Step 1: Initialize the parameters of the VMD method, including $\{u_k^1\}$, $\{\omega_k^1\}$ and $\lambda^1$, and set the iteration number, $n$, to 1.
Step 2: Calculate $\hat{u}_k^{n+1}(\omega)$ and $\omega_k^{n+1}$ using Eqs. (3) and (4).
Step 3: Update the Lagrangian multiplier, $\lambda$, using Eq. (5).
Step 4: Given the tolerance of the convergence criterion, $\varepsilon > 0$, stop the iteration if the convergence condition of Eq. (6) is satisfied; otherwise increase $n$ to $n + 1$ and return to Step 2. The final BIMFs are then obtained.

2.2. Sample entropy

Sample entropy was proposed by Richman and Moorman (Richman & Moorman, 2000) to measure the complexity of a time series. The more self-similar (regular) the series, the smaller its SE value. The SE value can be obtained by applying the following five steps.

Step 1: For a given time series $\{x_i\} = \{x_1, x_2, \ldots, x_N\}$, construct the $m$-dimensional vectors

$$
X_i = [x_i, x_{i+1}, \ldots, x_{i+m-1}], \quad i = 1, 2, \ldots, N - m + 1
\tag{7}
$$

Step 2: Define and calculate $D_m(X_i, X_j)$ as the distance between $X_i$ and $X_j$:

$$
D_m(X_i, X_j) = \max_{k = 0, \ldots, m-1} \left| x_{i+k} - x_{j+k} \right|
\tag{8}
$$

Step 3: For every $i$, count the number of vectors $X_j$ ($j \neq i$) with $D_m(X_i, X_j) < r$, and divide by $N - m$ to obtain $B_i^m(r)$. Then calculate the mean of $B_i^m(r)$ to obtain $B^m(r)$.

$$
B_i^m(r) = \frac{1}{N - m}\, \mathrm{num}\{ D_m(X_i, X_j) < r \}
\tag{9}
$$

$$
B^m(r) = \frac{1}{N - m + 1} \sum_{i=1}^{N-m+1} B_i^m(r)
\tag{10}
$$

Step 4: Update $m$ to $m + 1$, repeat Steps 1–3, and obtain the mean of $B_i^{m+1}(r)$:

$$
B^{m+1}(r) = \frac{1}{N - m} \sum_{i=1}^{N-m} B_i^{m+1}(r)
\tag{11}
$$

Step 5: Estimate SE, using Eq. (12):

$$
\mathrm{SE}(m, r) = \lim_{N \to \infty} \left\{ -\ln \left( \frac{B^{m+1}(r)}{B^m(r)} \right) \right\}
\tag{12}
$$

For a time series of finite length $N$, the SE value can be obtained as shown in Eq. (13):

$$
\mathrm{SE}(m, r, N) = -\ln \left( \frac{B^{m+1}(r)}{B^m(r)} \right)
\tag{13}
$$

In this study, the embedding dimension $m$ was set to 2, and the similarity tolerance $r$ was set to 0.2 standard deviations of the series.

Fig. 1. LSTM architecture.
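The five sample entropy steps of Section 2.2 (Eqs. (7)–(13)) can be sketched in a few lines of NumPy. This is an illustrative implementation under stated assumptions rather than the authors' code: the names `sample_entropy` and `_match_ratio` are hypothetical, the tolerance is taken as `r_factor` times the standard deviation of the whole series, and the self-match j = i is excluded from the counts.

```python
import numpy as np


def _match_ratio(x, m, r):
    """Mean over i of B_i^m(r): the share of other templates within distance r."""
    n_templates = len(x) - m + 1
    # Step 1: template vectors X_i = [x_i, ..., x_{i+m-1}].
    templates = np.array([x[i:i + m] for i in range(n_templates)])
    # Step 2: maximum absolute coordinate difference between every pair.
    dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
    # Exclude self-matches so that each template is compared with the others only.
    np.fill_diagonal(dist, np.inf)
    # Step 3: per-template match share, then the mean over all templates.
    counts = np.sum(dist < r, axis=1)
    return np.mean(counts / (n_templates - 1))


def sample_entropy(series, m=2, r_factor=0.2):
    """Finite-length sample entropy SE(m, r, N) as in Eq. (13)."""
    x = np.asarray(series, dtype=float)
    r = r_factor * np.std(x)
    b_m = _match_ratio(x, m, r)       # B^m(r)
    b_m1 = _match_ratio(x, m + 1, r)  # B^{m+1}(r), Step 4
    if b_m == 0 or b_m1 == 0:
        return float("inf")           # no matches: entropy is undefined/infinite
    return float(-np.log(b_m1 / b_m))
```

A regular signal such as a sine wave should yield a much smaller value than white noise of the same length, which is the property used when BIMFs with similar SE values are grouped. The pairwise distance matrix needs memory proportional to N squared, which is unproblematic for series of a few hundred daily observations.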

Fig. 2. Flowchart of the proposed model.

Table 3. Descriptive statistics of case study AQI data.

City | Data set | Minimum | Maximum | Mean | Median | Standard deviation
Beijing | Training data | 23 | 500 | 100.98 | 86.5 | 64.62
Beijing | Test data | 33 | 233 | 75.48 | 66 | 44.06
Baoding | Training data | 31 | 500 | 129 | 111 | 72.36
Baoding | Test data | 49 | 283 | 131.6 | 130 | 60.67

2.3. Long short-term memory neural network

LSTM was proposed by (Hochreiter & Schmidhuber, 1997); it improved on the traditional RNN, which is trained by back-propagation through time, in order to overcome the vanishing gradient problem. It is suitable for dealing with, and predicting, important events that occur in time series after relatively long intervals or delays. LSTM networks are built not from neurons but from memory blocks connected through layers. An LSTM layer consists of a set of recurrently connected blocks, known as memory blocks, which can be thought of as a differentiable version of the memory chips in a digital computer. Each one contains one or more recurrently connected memory cells, and three multiplicative gates. There are three types of gates in a unit:

• Forget gate: Conditionally decides what information to throw away from the block.
• Input gate: Conditionally determines the value of updating the memory state from the input.
• Output gate: Determines what to output, conditionally based on the memory of the input and block.

The weights of the gates are learned during training, as shown in the LSTM architecture diagram presented as Fig. 1. Note that the diagram contains the sigmoid layer, $\sigma$, the tanh layer, and the pointwise operations of summation ⨁ and multiplication ⊗. The LSTM architecture estimates an output $y = (y_1, y_2, \ldots, y_{t-1}, y_t)$ by updating the input gate $i_t$, output gate $y_t$ and forget gate $f_t$ on the memory cell $c_t$, from time $t = 1$ to $T$, based on the input time series data $x = (x_1, x_2, \ldots, x_{t-1}, x_t)$. The mathematical formulae of the LSTM for time step $t$ are shown in Eqs. (14)–(19), where $w$, $R$ and $E$ are the weights applied to the input $x_t$, the previous output $h_{t-1}$ and the previous cell state $c_{t-1}$, respectively; $x_t$ is the input vector; $b$ is the bias vector; $c_{t-1}$ and $h_{t-1}$ are the previous cell and its output vector, respectively; and $\sigma$ and tanh are activation functions with ranges [0, 1] and [−1, 1], respectively.

$$
\sigma(x) = \frac{1}{1 + e^{-x}}
\tag{14}
$$

Input gate $i_t$:

$$
i_t = \sigma(w_i \times x_t + R_i \times h_{t-1} + E_i \times c_{t-1} + b_i)
\tag{15}
$$

Forget gate $f_t$:

$$
f_t = \sigma(w_f \times x_t + R_f \times h_{t-1} + E_f \times c_{t-1} + b_f)
\tag{16}
$$

Output gate $y_t$:

$$
y_t = \sigma(w_y \times x_t + R_y \times h_{t-1} + E_y \times c_{t-1} + b_y)
\tag{17}
$$

Cell $c_t$:

$$
c_t = f_t \times c_{t-1} + i_t \times \bar{c}_t, \qquad \bar{c}_t = \sigma(w_c \times x_t + R_c \times h_{t-1} + b_c)
\tag{18}
$$

Output vector $h_t$:

$$
h_t = y_t \times \sigma(c_t)
\tag{19}
$$
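One time step of the gate equations, Eqs. (15)–(19), can be written out directly in NumPy. The sketch below is for illustration only and is not the MATLAB network used in the study. The parameter dictionary `p` and its key names are hypothetical, the weights `E` acting on the previous cell state are assumed to be vectors applied element-wise, and the activation of the candidate cell and output is left as an argument: it defaults to the sigmoid as the equations are printed here, while many LSTM formulations use `np.tanh` at those two places.

```python
import numpy as np


def sigmoid(z):
    """Logistic function of Eq. (14)."""
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p, cell_act=sigmoid):
    """One LSTM time step following Eqs. (15)-(19).

    p holds, for each gate g in {i, f, y}: w_g (hidden x input), R_g (hidden x hidden),
    E_g (hidden,), b_g (hidden,); and for the candidate cell: w_c, R_c, b_c.
    """
    i_t = sigmoid(p["w_i"] @ x_t + p["R_i"] @ h_prev + p["E_i"] * c_prev + p["b_i"])  # Eq. (15)
    f_t = sigmoid(p["w_f"] @ x_t + p["R_f"] @ h_prev + p["E_f"] * c_prev + p["b_f"])  # Eq. (16)
    y_t = sigmoid(p["w_y"] @ x_t + p["R_y"] @ h_prev + p["E_y"] * c_prev + p["b_y"])  # Eq. (17)
    c_bar = cell_act(p["w_c"] @ x_t + p["R_c"] @ h_prev + p["b_c"])                   # candidate cell
    c_t = f_t * c_prev + i_t * c_bar                                                  # Eq. (18)
    h_t = y_t * cell_act(c_t)                                                         # Eq. (19)
    return h_t, c_t
```

Running the step repeatedly over t = 1, …, T, feeding each returned `h_t` and `c_t` into the next call, gives the recurrent pass described in the text.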

3. Air quality index forecasting model

In this section, the proposed VMD-SE-LSTM model is described in detail. The flowchart in Fig. 2 illustrates its operation; the model comprises the following five parts:

Part 1: Information extraction. The VMD technique is employed to decompose an original AQI series into a discrete number of components, each with a different frequency, denoted by BIMF1, BIMF2, …, BIMFN. The VMD parameters were set to α = 2000 and τ = 0.3, to ensure data decomposition fidelity. The function of this technique is to diminish the non-stationary character of a series, in order to promote a high-precision, short-term forecast.

Part 2: Data recombining. SE was employed to calculate the SE value of each BIMF, and then to recombine the BIMFs into new sub-series (SE-BIMF), based on their approximate SE values. At this point, data pre-processing is complete. The purpose of this technique was to solve the problem of over-decomposition and computational burden.

Part 3: Training and validation of the model. In this study, forecasting for each component was conducted with the LSTM model, which has the following basic steps: (I) Network parameter setting: the main LSTM parameters are the number of hidden units, the maximum number of epochs, the mini-batch size and the learning rate. In this application, the loss function was MSE, the learning rate ranged from 0.005 to 0.2 to obtain satisfactory results, the mini-batch size was 120, and the maximum number of epochs was 400; (II) prepare and standardize the input data; (III) establish and then train the LSTM network structure; (IV) assess the LSTM and renormalize the data; (V) show the results. The LSTM network was implemented using the MATLAB 2018b deep learning framework.

Part 4: AQI forecasting. Obtain the final forecasting results for AQI as the sum of the forecasts of each SE-BIMF.

Part 5: Evaluation of prediction results. RMSE, MAE, MAPE, and R were used to evaluate the performance of the proposed model.

Table 4. Recombination of the VMD sub-series according to their SE values.

City | Sub | SE value | Recombination | New sub
Beijing | BIMF1 | 0.2904 | BIMF1 | SE-BIMF1
Beijing | BIMF2 | 0.6406 | BIMF2&BIMF3 | SE-BIMF2
Beijing | BIMF3 | 0.6654 | BIMF2&BIMF3 | SE-BIMF2
Beijing | BIMF4 | 0.9939 | BIMF4 | SE-BIMF3
Beijing | BIMF5 | 0.8397 | BIMF5 | SE-BIMF4
Beijing | BIMF6 | 0.8090 | BIMF6&BIMF7&BIMF8 | SE-BIMF5
Beijing | BIMF7 | 0.7827 | BIMF6&BIMF7&BIMF8 | SE-BIMF5
Beijing | BIMF8 | 0.8037 | BIMF6&BIMF7&BIMF8 | SE-BIMF5
Baoding | BIMF1 | 0.2240 | BIMF1 | SE-BIMF1
Baoding | BIMF2 | 0.6562 | BIMF2&BIMF3 | SE-BIMF2
Baoding | BIMF3 | 0.6548 | BIMF2&BIMF3 | SE-BIMF2
Baoding | BIMF5 | 0.7729 | BIMF5 | SE-BIMF3
Baoding | BIMF4 | 0.7345 | BIMF4&BIMF6 | SE-BIMF4
Baoding | BIMF6 | 0.7482 | BIMF4&BIMF6 | SE-BIMF4
Baoding | BIMF7 | 0.8245 | BIMF7&BIMF8 | SE-BIMF5
Baoding | BIMF8 | 0.8074 | BIMF7&BIMF8 | SE-BIMF5

Fig. 3. Recombination results for the Beijing AQI series decomposed using VMD-SE.

Fig. 4. Recombination results for the Baoding AQI series decomposed using VMD-SE.
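Parts 2 and 4 of the model both reduce to sums over components. A minimal sketch follows, assuming that "recombining" BIMFs means adding the grouped sub-series point by point (the text does not spell this out) and that the Beijing grouping is BIMF1, BIMF2&BIMF3, BIMF4, BIMF5 and BIMF6&BIMF7&BIMF8 as read from Table 4. The names `recombine`, `aggregate_forecasts` and `BEIJING_GROUPS` are hypothetical.

```python
import numpy as np

# Zero-based BIMF indices per recombined sub-series (SE-BIMF1 ... SE-BIMF5), Beijing case.
BEIJING_GROUPS = [[0], [1, 2], [3], [4], [5, 6, 7]]


def recombine(bimfs, groups):
    """Sum the BIMFs of each group; bimfs has shape (K, T), the result (len(groups), T)."""
    bimfs = np.asarray(bimfs, dtype=float)
    return np.array([bimfs[idx].sum(axis=0) for idx in groups])


def aggregate_forecasts(component_forecasts):
    """Part 4: the AQI forecast is the sum of the forecasts of all SE-BIMFs."""
    return np.asarray(component_forecasts, dtype=float).sum(axis=0)
```

Because every BIMF belongs to exactly one group, the recombined sub-series add up to the same signal as the original BIMFs, so the recombination loses no information while reducing the number of LSTM models to train from eight to five.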

4. Case study

4.1. Study area and data set

Two AQI data series, for Beijing and Baoding, China, were used as case studies to validate the proposed hybrid model. Beijing and Baoding have similar climates, being very hot in summer, with little rain, while the winters are frigid and dry. Their respective AQI data series do show significant differences, however, due to their different industrial structures. The sites' AQI data were obtained from the website https://www.aqistudy.cn/historydata/. Daily AQI data were obtained for the period 1 December 2016 to 31 December 2018, with a total of 761 observations. In each city, observations 1–730 and 731–761 (with the latter representing 1 December 2018 to 31 December 2018) were adopted as the training and test data sets, respectively. Descriptive statistics for the AQI series are presented in Table 3, and show that the AQI data series for Baoding exhibited greater volatility than that for Beijing, which confirmed that using these two cases for our study would be a suitable evaluation of our proposed method's effectiveness and stability.
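The 730/31 split described above, together with the "prepare input data and standardize" and "renormalize" steps of the LSTM workflow, might look as follows. This is a sketch under assumptions: the text does not state the standardization formula or the number of lagged inputs, so z-scoring with training-set statistics and the `n_lags` parameter are illustrative choices, and all function names are hypothetical.

```python
import numpy as np


def split_and_standardize(series, n_train=730):
    """Split into training/test parts and z-score both with training statistics."""
    x = np.asarray(series, dtype=float)
    train, test = x[:n_train], x[n_train:]
    mean, std = train.mean(), train.std()
    return (train - mean) / std, (test - mean) / std, mean, std


def make_windows(values, n_lags):
    """Build (inputs, targets) pairs for one-step-ahead forecasting."""
    values = np.asarray(values, dtype=float)
    # Row i holds values[i : i + n_lags]; its target is the next value.
    inputs = np.array([values[i:i + n_lags] for i in range(len(values) - n_lags)])
    targets = values[n_lags:]
    return inputs, targets


def destandardize(z, mean, std):
    """Map standardized values back to the original AQI scale."""
    return np.asarray(z, dtype=float) * std + mean
```

Using only training-set statistics for the scaling keeps information from December 2018 out of the fitted model; each recombined sub-series would be processed in the same way before training.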


4.2. Performance criteria for determining prediction accuracy

In our study, RMSE, MAE, MAPE, and R were used as criteria to quantitatively evaluate the performance of the proposed hybrid model. For RMSE, MAE and MAPE, the smaller the value, the better the performance; for R, values closer to 1 indicate a better fit. The computational equations for these criteria are shown in Eqs. (20)–(23), where $x_i$ is the actual data at time $i$, $\hat{x}_i$ is the corresponding predicted value, and $i = 1, 2, \ldots, n$.

$$
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{x}_i)^2}
\tag{20}
$$

$$
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| x_i - \hat{x}_i \right|
\tag{21}
$$

$$
\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{x_i - \hat{x}_i}{x_i} \right| \times 100\%
\tag{22}
$$

$$
R = \frac{\sum_{i=1}^{n} \hat{x}_i x_i - \sum_{i=1}^{n} \hat{x}_i \sum_{i=1}^{n} x_i / n}{\sqrt{\left( \sum_{i=1}^{n} \hat{x}_i^2 - \left( \sum_{i=1}^{n} \hat{x}_i \right)^2 / n \right) \left( \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2 / n \right)}}
\tag{23}
$$

4.3. Original AQI series decomposition results

In order to handle the non-stationarity of the random and irregular AQI series, the VMD technique, with its strong decomposition ability, was applied to decompose the original AQI series. To decrease the computational time and avoid over-decomposition, the BIMFs were recombined into new sub-series, according to the approximate SE values shown in Table 4. These new, post-recombination sequences are illustrated in Figs. 3 and 4, respectively.

4.4. Forecasting results

After decomposition and recombination using VMD-SE, the LSTM model was developed to predict each SE-BIMF. The final AQI forecast was then obtained by summing the predicted values of all SE-BIMFs. The daily AQIs for Beijing and Baoding were used to verify VMD-SE-LSTM model performance, and in order to demonstrate the advantages of the VMD-SE-LSTM model, VMD-SE-BP, EEMD-LSTM, EEMD-BP, LSTM, and BP models were used for comparison. The forecasting results and comparative analysis for the different cities are considered in the following sub-sections.

4.4.1. Beijing

Table 5 compares the actual AQI values with the forecasts of the new model, for Beijing. The class prediction was incorrect for only four of the 31 days, making the correct class forecasting rate 87.09%. The MAPE, MAE, and RMSE were 7.73%, 5.66, and 9.38, respectively. The MAPE, MAE, and RMSE values obtained using the six forecasting models are shown in Table 6; the proposed VMD-SE-LSTM model has the lowest value on every criterion and thus exhibited the best prediction accuracy in this comparison. The fitting curves and forecasting errors for the six models are shown in Fig. 5, where it can be seen that the forecasting errors of the VMD-SE-LSTM model are evenly distributed around zero, and have a relatively small variation range compared to the other five models. In addition, the AQI forecasting curves for the proposed model showed the best match with the actual curves, with the other hybrid models exhibiting more limited prediction accuracy.

Specifically, the prediction accuracies of the VMD-SE-LSTM, VMD-SE-BP, EEMD-BP and EEMD-LSTM models, which are based on signal decomposition, were better than those of the traditional AI models without signal decomposition. This showed that signal decomposition technology can effectively reduce the non-stationary characteristics of AQI series, thus facilitating improved performance. Between the two LSTM-based models that use different signal decomposition techniques, VMD-SE-LSTM performed better than EEMD-LSTM, and overall, the prediction accuracy criteria for the proposed model were the best in each comparison. AQI forecast scatterplots for the various models versus the actual AQI data are presented in Fig. 6. The R of the proposed model was 0.98958, which is a little lower than that of the EEMD-LSTM model; however, the VMD-SE-LSTM model still displayed a better fit than the remaining models. In general, the VMD-SE-LSTM model outperformed the other hybrid models, with one of the highest R values, and overall it achieved good forecasting performance.

Table 5. Comparison between actual values and forecasts obtained using VMD-SE-LSTM.

Date | Actual AQI | Actual AQI class | Forecast AQI | Forecast AQI class
2018/12/1 | 180 | Moderate pollution | 162 | Moderate pollution
2018/12/2 | 233 | Serious pollution | 195 | Moderate pollution
2018/12/3 | 152 | Moderate pollution | 138 | Light pollution
2018/12/4 | 61 | Good | 62 | Good
2018/12/5 | 76 | Good | 75 | Good
2018/12/6 | 37 | Excellent | 54 | Excellent
2018/12/7 | 38 | Excellent | 38 | Excellent
2018/12/8 | 36 | Excellent | 39 | Excellent
2018/12/9 | 73 | Good | 68 | Good
2018/12/10 | 83 | Good | 90 | Good
2018/12/11 | 52 | Good | 51 | Good
2018/12/12 | 73 | Good | 68 | Good
2018/12/13 | 45 | Excellent | 53 | Good
2018/12/14 | 82 | Good | 83 | Good
2018/12/15 | 119 | Light pollution | 120 | Light pollution
2018/12/16 | 87 | Good | 94 | Good
2018/12/17 | 55 | Good | 61 | Good
2018/12/18 | 65 | Good | 65 | Good
2018/12/19 | 78 | Good | 80 | Good
2018/12/20 | 94 | Good | 93 | Good
2018/12/21 | 92 | Good | 87 | Good
2018/12/22 | 66 | Good | 65 | Good
2018/12/23 | 52 | Good | 58 | Good
2018/12/24 | 78 | Good | 83 | Good
2018/12/25 | 58 | Good | 61 | Good
2018/12/26 | 33 | Excellent | 38 | Excellent
2018/12/27 | 52 | Good | 52 | Good
2018/12/28 | 40 | Excellent | 50 | Excellent
2018/12/29 | 33 | Excellent | 34 | Excellent
2018/12/30 | 47 | Excellent | 44 | Good
2018/12/31 | 70 | Good | 71 | Good

Table 6. Statistical evaluation of different model performances, for Beijing.

Model | MAE | RMSE | MAPE (%)
BP | 25.44 | 35.07 | 40.89
LSTM | 22.64 | 29.36 | 38.12
EEMD-BP | 16.46 | 24.62 | 19.23
EEMD-LSTM | 10.87 | 12.78 | 16.93
VMD-SE-BP | 8.92 | 11.47 | 13.62
VMD-SE-LSTM | 5.66 | 9.38 | 7.73

Fig. 5. AQI forecast results for Beijing, achieved using various models.

Fig. 6. Scatterplots of actual and forecast AQI values achieved using various models (Beijing data).

4.4.2. Baoding

In order to further review the stability of the proposed method, AQI forecasting for Baoding is discussed in this section, with Table 7 presenting the forecasting results. The table shows five days of incorrect class prediction, giving a correct class forecasting rate of 83.87%. Fitting curves and prediction errors are shown in Fig. 7, and the evaluation results for the six models are listed in Table 8. The MAE, MAPE, and RMSE of the proposed model were 11.97, 9.09%, and 15.10, respectively, which were lower than the results achieved by the comparison models. Scatterplots of forecast AQI from the various models versus the actual AQI are shown in Fig. 8; the R for the proposed model was 0.98646, which illustrated its superiority. Based on the forecast results, conclusions similar to those drawn for the Beijing case study were also seen to apply to this case. It was also apparent that signal processing technology was able to significantly improve AQI prediction accuracy. The proposed VMD-SE-LSTM model outperformed the other models that were considered, and overall, the proposed hybrid VMD-SE-LSTM model showed the most satisfactory AQI forecasting performance for Baoding, thus verifying its suitability for AQI forecasting and its usefulness in practical applications.
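The four evaluation criteria of Section 4.2, Eqs. (20)–(23), are straightforward to compute once actual and forecast values are available as arrays. The function below is a minimal sketch; the name `evaluate` and the dictionary return format are illustrative choices, and the MAPE term assumes that no actual value is zero.

```python
import numpy as np


def evaluate(actual, forecast):
    """Return RMSE, MAE, MAPE (in percent) and R as defined in Eqs. (20)-(23)."""
    x = np.asarray(actual, dtype=float)
    x_hat = np.asarray(forecast, dtype=float)
    n = len(x)
    err = x - x_hat
    rmse = np.sqrt(np.mean(err ** 2))               # Eq. (20)
    mae = np.mean(np.abs(err))                      # Eq. (21)
    mape = np.mean(np.abs(err / x)) * 100.0         # Eq. (22)
    # Eq. (23): correlation coefficient written with raw sums.
    num = np.sum(x_hat * x) - np.sum(x_hat) * np.sum(x) / n
    den = np.sqrt((np.sum(x_hat ** 2) - np.sum(x_hat) ** 2 / n)
                  * (np.sum(x ** 2) - np.sum(x) ** 2 / n))
    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "R": num / den}
```

RMSE and MAE carry the units of the AQI itself, whereas MAPE is a percentage and R is dimensionless, which is why only MAPE is reported with a percent sign.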

4.5. Discussion The prediction results from case studies involving two cities have indicated that the proposed hybrid model, VMD-SE-LSTM, is effective for AQI forecasting, and has better performance compared to other existing models, in terms of accuracy and stability. It achieved a mean MAPE of 8.41%, which compared well with those published in the other 7

Sustainable Cities and Society 50 (2019) 101657

Q. Wu and H. Lin

Table 7 Comparison between actual and forecast values, obtained using VMD-SE-LSTM. Date

2018/12/1 2018/12/2 2018/12/3 2018/12/4 2018/12/5 2018/12/6 2018/12/7 2018/12/8 2018/12/9 2018/12/10 2018/12/11 2018/12/12 2018/12/13 2018/12/14 2018/12/15 2018/12/16 2018/12/17 2018/12/18 2018/12/19 2018/12/20 2018/12/21 2018/12/22 2018/12/23 2018/12/24 2018/12/25 2018/12/26 2018/12/27 2018/12/28 2018/12/29 2018/12/30 2018/12/31

Actual

Table 8 Statistical evaluation of different models using Baoding data.

forecast

AQI

AQI Classes

AQI

AQI Classes

233 234 158 87 114 75 49 51 108 138 95 162 168 130 208 283 68 135 149 145 192 213 73 166 115 55 61 59 142 104 109

Serious pollution Serious pollution Moderate pollution Good Light pollution Good Excellent Good Light pollution Light pollution Good Moderate pollution Moderate pollution Light pollution Serious pollution Serious pollution Good Light pollution Light pollution Light pollution Moderate pollution Serious pollution Good Moderate pollution Light pollution Good Good Good Light pollution Light pollution Light pollution

208 217 155 108 106 73 47 61 123 130 106 144 150 121 206 246 70 105 159 131 200 181 80 157 134 59 67 58 148 116 115

Serious pollution Serious pollution Moderate pollution Good Light pollution Good Excellent Excellent Light pollution Light pollution Light pollution Light pollution Moderate pollution Light pollution Serious pollution Serious pollution Good Light pollution Moderate pollution Light pollution Moderate pollution Moderate pollution Good Moderate pollution Light pollution Good Good Good Light pollution Light pollution Light pollution

 

MAE

RMSE

MAPE (%)

BP LSTM EEMD-BP EEMD-LSTM VMD-SE-BP VMD-SE-LSTM

46.18 42.86 24.47 21.93 12.34 11.97

57.17 54.06 29.05 28.94 16.15 15.10

45.19 32.11 21.80 17.40 14.78 9.09

studies, being lower than that of the WD-BPNN model (15.9%) (Bai, Li, Wang, Xie, & Li, 2016), the EEMD-MM-CFM model (10.35%) (Zhu, Wu et al., 2018), and the EMD-IMFs-hybrid and EMD-SVR-hybrid models (15.6%) (Zhu et al., 2017), and higher than that of the CEEMD-GCA-GRNN-PSOGSA-SVR model (6.152%) (Zhu, Yang et al., 2018) and the CEEMD-VMD-ELM-DE model (4.33%) (Wang, Wei, Luo, Yue, & Grunder, 2017). In the CEEMD-GCA-GRNN-PSOGSA-SVR and CEEMD-VMD-ELM-DE models, the number of decomposition modes is set manually, which could be seen as unreasonable. In our study, SE was adopted to recombine the sub-series obtained by VMD, to solve the problems of over-decomposition and computational burden. This supported the decomposition effect and reduced operation time. It can also be confirmed that the proposed model predicted AQI quickly and conveniently, demonstrating its superiority over other models, to some extent.

Fig. 7. AQI forecast results for Baoding, achieved using various models.

Fig. 8. Scatterplots of actual and forecast AQI values achieved using various models (Baoding data).

5. Conclusions

An accurate, effective and stable AQI prediction model is necessary to promote urban public health and the sustainable development of society, notwithstanding the negative impacts of air pollution. To make AQI prediction more efficient and accurate, a hybrid model named VMD-SE-LSTM has been developed. In this model, the VMD technique was first employed to decompose the original AQI series. SE was then applied to recombine the sub-series obtained by VMD, to solve the problems of over-decomposition and computational burden. Next, an LSTM neural network, with good learning ability and time series memory, was established to forecast each recombined sub-series. Based on the evaluation criteria, the hybrid VMD-SE-LSTM model achieved almost optimal values and was more effective at AQI forecasting than the five models used for comparison: BP, LSTM, EEMD-BP, EEMD-LSTM, and VMD-SE-BP. The proposed hybrid model was able to comprehensively capture the characteristics of the original AQI series, and achieved a high rate of correct AQI class forecasts. The single models used for comparison could not produce comparable performance, and the other hybrid models only reflected the AQI series trend, with limited prediction accuracy. Based on these findings, we conclude that the proposed hybrid model is evidently superior to the other single and hybrid models, and is suitable for daily urban AQI forecasting.
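The sample entropy step and the final aggregation summarized in the conclusions can be illustrated with a minimal sketch. It assumes the usual sample entropy definition of Richman and Moorman (2000), SampEn = -ln(A/B), where B and A count template pairs of length m and m + 1 whose Chebyshev distance is within a tolerance r. The defaults m = 2 and r = 0.2 times the standard deviation, the "less than or equal" tolerance test, and the function names are assumptions made for illustration; the paper's own settings and recombination rule are not reproduced here.

```python
import math
import statistics


def sample_entropy(series, m=2, r=None, r_factor=0.2):
    """Sample entropy -ln(A/B) of a sequence (assumed standard definition)."""
    n = len(series)
    if r is None:
        # Assumed default tolerance: a fraction of the standard deviation.
        r = r_factor * statistics.pstdev(series)

    def count_matches(length):
        # Use the same n - m template start points for both lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                distance = max(abs(a - b)
                               for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # no matches: entropy is undefined or unbounded
    return -math.log(a / b)


def aggregate_forecasts(sub_forecasts):
    """Sum the per-day forecasts of all sub-series into one AQI forecast."""
    return [sum(values) for values in zip(*sub_forecasts)]


# Hypothetical inputs, used only to exercise the functions.
print(sample_entropy([0, 1, 0, 1, 0, 0], m=1, r=0.5))
print(aggregate_forecasts([[1.0, 2.0], [10.0, 20.0]]))
```

Sub-series with similar sample entropy values have similar complexity, so merging them reduces the number of series the LSTM network has to be trained on. The pairwise loop is quadratic in the series length, which is acceptable for daily data of a few thousand points.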

Acknowledgements

The authors gratefully acknowledge financial support from the National Social Science Fund of China (Grant No. 17BGL252).

References

Afzali, A., Rashid, M., Afzali, M., & Younesi, V. (2017). Prediction of air pollutants concentrations from multiple sources using AERMOD coupled with WRF prognostic model. Journal of Cleaner Production, 166.
Bai, Y., Li, Y., Wang, X. X., Xie, J. J., & Li, C. (2016). Air pollutants concentrations forecasting using back propagation neural network based on wavelet decomposition with meteorological conditions. Atmospheric Pollution Research, 7(3), 557–566.
Baklanov, A., Mestayer, P. G., Clappier, A., Zilitinkevich, S., Joffre, S., Mahura, A., et al. (2008). Towards improving the simulation of meteorological fields in urban areas through updated/advanced surface fluxes description. Atmospheric Chemistry and Physics, 8(3), 523–543.
Brunekreef, B. (2007). Health effects of air pollution observed in cohort studies in Europe. Journal of Exposure Science & Environmental Epidemiology, 17, S61–S65.
Cao, J., Yang, C. X., Li, J. X., Chen, R. J., Chen, B. H., Gu, D. F., et al. (2011). Association between long-term exposure to outdoor air pollution and mortality in China: A cohort study. Journal of Hazardous Materials, 186(2-3), 1594–1600.
Chen, L., & Pai, T. Y. (2015). Comparisons of GM (1, 1), and BPNN for predicting hourly particulate matter in Dali area of Taichung City, Taiwan. Atmospheric Pollution Research, 6(4), 572–580.
He, B.-J., Ding, L., & Prasad, D. (2019). Enhancing urban ventilation performance through the development of precinct ventilation zones: A case study based on the Greater Sydney, Australia. Sustainable Cities and Society, 47, 101472.
He, B. J. (2018). Potentials of meteorological characteristics and synoptic conditions to mitigate urban heat island effects. Urban Climate, 24, 26–33.
He, B. J., Yang, L., & Ye, M. (2014). Strategies for creating good wind environment around Chinese residences. Sustainable Cities and Society, 10, 174–183.
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.
Li, H., You, S., Zhang, H., Zheng, W., Lee, W. L., Ye, T., et al. (2018). Analyzing the impact of heating emissions on air quality index based on principal component regression. Journal of Cleaner Production, 171, S0959652617323995.
Lin, B., & Zhu, J. (2018). Changes in urban air quality during urbanization in China. Journal of Cleaner Production, S0959652618309740.
Liu, H., Wu, H., Lv, X., Ren, Z., Liu, M., Li, Y., et al. (2019). An intelligent hybrid model for air pollutant concentrations forecasting: Case of Beijing in China. Sustainable Cities and Society, 47, 101471.
Liu, W., Xu, Z., & Yang, T. (2018). Health effects of air pollution in China. International Journal of Environmental Research and Public Health, 15(7), 1471.
Ni, X. Y., Huang, H., & Du, W. P. (2017). Relevance analysis and short-term prediction of PM2.5 concentrations in Beijing based on multi-source data. Atmospheric Environment, 150, 146–161.
Niu, M., Wang, Y., Sun, S., & Li, Y. (2016). A novel hybrid decomposition-and-ensemble model based on CEEMD and GWO for short-term PM2.5 concentration forecasting. Atmospheric Environment, 134, 168–180.
Pisoni, E., Albrecht, D., Mara, T. A., Rosati, R., Tarantola, S., & Thunis, P. (2018). Application of uncertainty and sensitivity analysis to the air quality SHERPA modelling tool. Atmospheric Environment, 183, 84–93.
Ribeiro, M. C., Pinho, P., Branquinho, C., Llop, E., & Pereira, M. J. (2016). Geostatistical uncertainty of assessing air quality using high-spatial-resolution lichen data: A health study in the urban area of Sines, Portugal. The Science of the Total Environment, 562, 740–750.
Richman, J. S., & Moorman, J. R. (2000). Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology Heart and Circulatory Physiology, 278(6), H2039–2049.
Taylan, O. (2017). Modelling and analysis of ozone concentration by artificial intelligent techniques for estimating air quality. Atmospheric Environment, 150, S1352231016309037.
Vijayaraghavan, K., Cho, S., Morris, R., Spink, D., Jung, J., Pauls, R., et al. (2016). Photochemical model evaluation of the ground-level ozone impacts on ambient air quality and vegetation health in the Alberta oil sands region: Using present and future emission scenarios. Atmospheric Environment, 141, 209–218.
Wang, D., Wei, S., Luo, H., Yue, C., & Grunder, O. (2017). A novel hybrid model for air quality index forecasting based on two-phase decomposition technique and modified extreme learning machine. The Science of the Total Environment, 580, 719–733.
Wang, J., Niu, T., & Wang, R. (2017). Research and application of an air quality early warning system based on a modified least squares support vector machine and a cloud model. International Journal of Environmental Research and Public Health, 14(3), 249.
Wang, P., Liu, Y., Qin, Z., & Zhang, G. (2015). A novel hybrid forecasting model for PM₁₀ and SO₂ daily concentrations. The Science of the Total Environment, 505(505C), 1202.
Wei, S., & Liu, M. (2016). Wind speed forecasting using FEEMD echo state networks with RELM in Hebei, China. Energy Conversion and Management, 114, 197–208.
Yun, B., Yong, L., Wang, X., Xie, J., & Li, C. (2016). Air pollutants concentrations forecasting using back propagation neural network based on wavelet decomposition with meteorological conditions. Atmospheric Pollution Research, 7(3), 557–566.
Zhou, Y. L., Chang, F. J., Chang, L. C., Kao, I. F., & Wang, Y. S. (2019). Explore a deep learning multi-output neural network for regional multi-step-ahead air quality forecasts. Journal of Cleaner Production, 209, 134–145.
Zhu, J., Wu, P., Chen, H., Zhou, L., & Tao, Z. (2018). A hybrid forecasting approach to air quality time series based on endpoint condition and combined forecasting model. International Journal of Environmental Research and Public Health, 15(9), 1941.
Zhu, S., Lian, X., Liu, H., Hu, J., Wang, Y., & Che, J. (2017). Daily air quality index forecasting with hybrid models: A case in China. Environmental Pollution, 231(Pt 2), S0269749117316330.
Zhu, S. L., Yang, L., Wang, W. N., Liu, X. R., Lu, M. M., & Shen, X. P. (2018). Optimal-combined model for air quality index forecasting: 5 cities in North China. Environmental Pollution, 243, 842–850.
Zosso, D., & Dragomiretskiy, K. (2014). Variational mode decomposition. IEEE Transactions on Signal Processing, 62(3), 531–544.
