Journal of Cleaner Production 231 (2019) 1005–1015
Contents lists available at ScienceDirect
Journal of Cleaner Production journal homepage: www.elsevier.com/locate/jclepro
Recurrent Neural Network and random forest for analysis and accurate forecast of atmospheric pollutants: A case study in Hangzhou, China
Rui Feng a,*, Hui-jun Zheng b,**, Han Gao c, An-ran Zhang d, Chong Huang e, Jun-xi Zhang a, Kun Luo a,***, Jian-ren Fan a
a State Key Laboratory of Clean Energy Utilization, Zhejiang University, Hangzhou, 310027, PR China
b Department of Intensive Care Unit, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, 310020, PR China
c Zhejiang Construction Investment Environment Engineering Co., Ltd., Hangzhou, 310013, PR China
d Zhejiang Tongji Vocational College of Science and Technology, Hangzhou, 311215, PR China
e Hangzhou Netease Zaigu Technology Co., Ltd., Hangzhou, 310052, PR China
Article info
Article history: Received 12 March 2019; Received in revised form 6 May 2019; Accepted 27 May 2019; Available online 28 May 2019
Abstract
Hangzhou, one of the most prosperous cities in China, has suffered from severe degradation of air quality in recent years. For the good of its nine million citizens and the upcoming Asian Games, a Recurrent Neural Network (RNN) and Random Forest are used to analyze air pollution in Hangzhou. Compared with traditional atmospheric models, machine learning models are faster, more accurate and less costly in simulating all the pollutants, and they do not need a pollution inventory. The Feature Importance (FI) generated by Random Forest reveals the complicated relationships among air pollutants and meteorology. Carbon monoxide (CO) plays an important role in shaping ground-level ozone, nitrogen dioxide (NO2) and particulate matter (PM) in the atmospheric environment. Dew-point deficit plays a more important role than relative humidity in shaping air pollutants. The urban heat island effect is not obvious for air pollutants from non-point sources. Furthermore, a WRF/RNN-based method to forecast air pollutants, including SO2, NO2, CO, PM2.5, PM10 and O3, over the next 24 h is proposed, and an RNN-based method to estimate the regional transport rate of air pollutants and reversely identify air pollution emission sources is introduced. Finally, a policy assessment is made to better regulate air pollution in Hangzhou prior to the 2022 Asian Games. © 2019 Elsevier Ltd. All rights reserved.
Keywords: Recurrent neural network; Random forest; WRF-CMAQ; Feature importance; Atmospheric pollution forecast
* Corresponding author (R. Feng). ** Corresponding author (H.-j. Zheng). *** Corresponding author (K. Luo).
https://doi.org/10.1016/j.jclepro.2019.05.319
0959-6526/© 2019 Elsevier Ltd. All rights reserved.

1. Introduction
Hangzhou, one of the most prosperous cities in China and the host of the 2016 Group of Twenty summit, the 2018 FINA Short Course World Championships and the 2022 Asian Games, has suffered from severe atmospheric degradation in recent years (Liu et al., 2015; Feng et al., 2018). The major atmospheric pollutants are sulfur dioxide (SO2), nitrogen dioxide (NO2), lower tropospheric ozone (O3), carbon monoxide (CO), PM2.5, PM10 and volatile organic compounds (VOCs) (Wan et al., 2018; Li et al., 2018; Zhou et al., 2006; Wagoner et al., 2012). SO2 and PM10 are strongly connected with nonmalignant respiratory disease mortality (Chen et al., 2017). Lower tropospheric ozone causes nearly eighty thousand premature deaths in China annually (Liu et al., 2018a). Exposure to PM2.5 and NO2 is associated with increased mortality (Hvidtfeldt et al., 2019; Liu et al., 2017). VOCs are toxic, carcinogenic and mutagenic to the human respiratory and nervous systems, causing asthma or allergy (Gałęzowska et al., 2016; Guerra et al., 2017; Lei et al., 2018). VOCs are also able to generate secondary organic aerosols, mostly PM2.5 (Derwent et al., 2010). Significant increases in mortality and morbidity from cerebrovascular and cardiovascular diseases have been linked to elevated ambient carbon monoxide (Liu et al., 2018b). However, there is still no method to forecast the major atmospheric pollutants and give early warning of them. Therefore, for the good of local citizens and the upcoming Asian Games, a fast and accurate model that forecasts atmospheric pollution and can also identify the locations of pollution emission sources is urgently needed.
Traditional atmospheric models, such as the Weather Research and Forecasting model coupled with the Community Multi-scale Air Quality modeling system (WRF-CMAQ) and WRF with Chemistry (WRF-Chem), have been widely used (Feng et al., 2018; Xu et al., 2018). In recent years, Random Forest and Artificial Neural Networks have emerged as new methods for predicting atmospheric pollutants or contaminants in water (Kamińska, 2019; Palani et al., 2011; Bhowmik et al., 2016, 2017, 2018; Debnath et al., 2015). The Recurrent Neural Network (RNN), a deep learning model designed for time-dependent data, is a novel method that is used here for the first time to predict atmospheric pollutants. Compared with the traditional atmospheric models (Feng et al., 2018; Xu et al., 2018; Zhang et al., 2016), whose accuracy depends on constantly updated pollution inventories, Random Forest and RNN are more accurate, much faster and less costly because they do not need a pollution inventory. Here, three models, WRF-CMAQ, Random Forest and RNN, are established to compare their ability to predict major air pollutants over the next 24 h. To probe the determining factors for each atmospheric pollutant, we calculate the feature importance (FI) via Random Forest for each input derived from meteorology and pollutants. Then, the inputs with higher FI are used to predict future pollutants, with the meteorology generated by WRF. Furthermore, the results simulated by RNN for multiple environmental monitoring stations are used to reversely identify the locations of air pollutant emission sources. Finally, we discuss why the combination of Random Forest and RNN is able to replace the traditional atmospheric models.

2. Materials and methods
2.1. Observation
Seven air pollutants, including SO2, NO2, O3, PM2.5, PM10, CO and VOCs, were sampled from 1 December 2017 to 31 December 2018 at the Olympic Sports Center (OSC) (30.229° N, 120.229° E) in downtown Hangzhou.
The Olympic Sports Center will host the opening ceremony and the track and field competitions of the 2022 Asian Games. A total of 107 VOC species in the outdoor atmosphere at OSC were detected, measured and recorded. Hourly meteorological data were recorded at OSC, representing downtown weather conditions, and at Mount West-Path (30.262° N, 119.726° E), representing rural weather conditions. Measuring instruments included a Model 42i (NO-NO2-NOx) analyzer, a Model 43i (SO2) analyzer, a Model 48i (CO) analyzer, a Model 49i (O3) analyzer, a TEMO1405 (PM2.5/PM10) analyzer and a gas chromatograph coupled with a mass spectrometer (GC/MS). The GC was an Agilent 7780N and the mass spectrometer was a 5977C.

2.2. Simulation
2.2.1. WRF-CMAQ configuration
WRF version 3.7.1 is used to simulate the meteorology. Initial and boundary conditions are derived from the National Centers for Environmental Prediction (NCEP) (http://dss.ucar.edu/datasets/ds083.2/) every 6 h. The domain is gridded at 1.0° × 1.0°. The main options for WRF are listed in Table 1. The anthropogenic emission inventory is made by Tsinghua University (http://www.meicmodel.org/) and the Shanghai Academy of Environmental Sciences (Wang et al., 2018; Liu et al., 2018c; Li et al., 2019a), and the local inventory is updated by our research group. The biogenic emission inventory is created by the Model of Emissions of Gases and Aerosols from Nature (MEGAN) version 2.1 (Yu et al., 2017; Zhang et al., 2017).

Table 1 Main WRF options.
Parameter | Selected scheme
Microphysics | WSM6 (Xiao et al., 2013)
Long wave radiation | RRTM (Fountoukis et al., 2018)
Short wave radiation | Goddard (Fauchez et al., 2018)
Surface layer | Monin-Obukhov (Breedt et al., 2018)
Land surface | Noah (Kishné et al., 2017)
Planetary boundary layer | YSU (Parra, 2018)
Cumulus | Grell-Dévényi (Grell and Freitas, 2014)

CMAQ version 5.1 (https://www.cmascenter.org/cmaq/) is applied for the simulations over
Hangzhou through three nested domains using a Lambert projection with two true latitudes (25° N and 40° N). Domain 1 covers the whole of China with 180 × 140 grid cells at 36 km resolution. Domain 2 covers East China with 140 × 230 grid cells at 12 km resolution. Domain 3 covers the Yangtze River Delta with 150 × 170 grid cells at 4 km resolution. Vertically, 28 layers extend unequally from the ground to 100 hPa. The sigma coordinate of the first layer is set between 1 and 0.998. The aerosol mechanism AERO6 is chosen for the aerosol module and the carbon bond mechanism CB05 is chosen for gas-phase chemistry.

2.2.2. Recurrent Neural Network and random forest
The inputs of RNN and Random Forest consist of meteorology and atmospheric pollutants. Meteorology includes eight variables: temperature, dew-point deficit, relative humidity, precipitation, wind speed, wind direction, atmospheric pressure and urban heat island (UHI). UHI is defined as the temperature difference between downtown and rural Hangzhou. Pollutants include nine variables: SO2, NO2, PM2.5, PM10, O3, CO, VOCs, ozone formation potential (OFP) and propene-equivalent concentration (PEC). OFP and PEC are two statistical indicators of the ability of VOCs to generate ozone. OFP stands for the total amount of ozone produced by a VOC species and PEC indicates the rate of ozone formation by a VOC species. The output is set as the concentration of one air pollutant. The main parameters for RNN and Random Forest are shown in Table 2. RNN and Random Forest are selected in this paper because they outperform other machine learning models, such as the Extreme Learning Machine (ELM) and the Multi-layer Perceptron (MLP). The activation function for RNN is the Rectified Linear Unit (ReLU) (Rynkiewicz, 2019).

3. Results and discussions
3.1. OFP, PEC and SOAFP
The equations for calculating ozone formation potential (OFP) (Duan et al., 2008; Vo et al., 2018), propene-equivalent concentration (PEC) (Jia et al., 2016; Nie et al., 2019; Wang et al., 2016) and secondary organic aerosol formation potential (SOAFP) (Vo et al., 2018) are listed below:
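$$\mathrm{OFP} = \sum_{i=1}^{n} \rho_i \times \mathrm{MIR}_i \tag{1}$$

$$\mathrm{PEC} = \sum_{i=1}^{n} M_i \times \frac{k_{\mathrm{OH}}(\mathrm{VOC}_i)}{k_{\mathrm{OH}}(\mathrm{Propene})} \tag{2}$$

$$\mathrm{SOAFP} = \sum_{i=1}^{n} \rho_i \times \mathrm{SOAP}_i \tag{3}$$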
where ρi is the concentration of VOC species i in μg/m3, MIRi is the maximum incremental reactivity documented by Carter (2010), Mi is the molarity of VOC species i in mol/cm3, kOH is the rate constant of the reaction between VOC species i and the hydroxyl radical, recorded by Atkinson and Arey (2003), and SOAPi is the secondary organic aerosol formation potential of VOC species i, reported by Derwent et al. (2010). The yearly concentrations of individual VOC species and their shares of total OFP and PEC in downtown Hangzhou are given in Table 3 and Table S1. According to the empirical kinetic modeling approach (Tan et al., 2018), OSC belongs to the VOC-limited regime. The ratio of toluene to benzene is larger than 2 (Hui et al., 2018), indicating that the main source of VOCs is industry. The ratio of xylene to ethylbenzene is larger than 3 (Feng et al., 2018; Yurdakul et al., 2018), implying a low regional transport rate of VOCs and a nearby source. High ratios of propane, n-butane and isobutane to acetylene (Hui et al., 2018) imply liquefied petroleum gas leakage at OSC.

Table 2 Parameters for RNN and random forest.
Random forest parameter | Value | RNN parameter | Value
N estimators | 500 | Hidden units | 200
Min sample split | 2 | Time steps | 5
Max depth | None | Learning rate | 0.001
Max features | Auto | Layer number | 2
Max leaf nodes | None | Activation function | ReLU

3.2. Feature importance
Feature importance (FI) (Razmjoo et al., 2017), created by Random Forest (https://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm), is used to quantify the significance of each input to the output. Because no massive atmospheric pollution control strategy has been carried out in Hangzhou and its surrounding areas since the G20 summit, and the weekend/holiday effects are not obvious in Hangzhou (Feng
et al., 2018), it is reasonable to assume that the pollution emission sources stayed almost the same in 2018. We use the daily average data to generate FI (see Fig. 1). The higher the FI of an input, the more important that input is to the output.

Fig. 1. Feature importance: (a) SO2; (b) NO2; (c) CO; (d) PM2.5; (e) PM10; (f) O3-24 h; (g) VOCs; (h) Visibility.

Table 3 Top 30 VOCs for concentration, OFP and PEC at OSC, 2018
No. | VOCs | Concentration (μg/m3) | VOCs | OFP (%) | VOCs | PEC (%)
1 | Trichlorofluoromethane | 24.72 | M/P-xylene | 10.69 | Styrene | 23.69
2 | 1,2-Dichloroethane | 11.12 | Toluene | 8.96 | Propylene | 10.68
3 | Ethanol | 10.86 | Propylene | 8.21 | Ethanol | 6.5
4 | Acetone | 9.91 | 2-Methylpentane | 7.95 | M/P-xylene | 5.98
5 | Toluene | 6.30 | O-xylene | 7 | Isoprene | 4.99
6 | 2,3-Dimethylbutane | 5.70 | 1-Butene | 3.38 | 1-Butene | 4.72
7 | Ethyl acetate | 5.56 | Ethylbenzene | 3.34 | O-xylene | 3.92
8 | 2-Butanone | 5.45 | Styrene | 3.03 | Toluene | 3.35
9 | Propane | 5.23 | 2-Butanone | 2.87 | 2,3-Dimethylbutane | 3.3
10 | Dichloromethane | 5.19 | Ethylene | 2.75 | 1-Pentene | 3.12
11 | M/P-xylene | 5.15 | Ethanol | 2.59 | 1-Hexene | 2.85
12 | Styrene | 4.93 | N-hexane | 2.42 | Ethylene | 2.25
13 | 2-Methylpentane | 3.99 | Acrolein | 2.38 | 1,3-Butadiene | 2.16
14 | O-xylene | 3.37 | Cumene | 2.21 | 2-Methylpentane | 2.08
15 | Carbon disulfide | 2.86 | 1-Pentene | 2.07 | Cis-2-butene | 1.76
16 | Isopentane | 2.82 | 2,3-Dimethylbutane | 1.97 | Cis-2-pentene | 1.73
17 | Cumene | 2.46 | 4-Methyl-2-pentanone | 1.65 | 4-Methyl-2-pentanone | 1.34
18 | N-butane | 2.38 | Isoprene | 1.47 | Isopentane | 1.22
19 | Benzene | 2.12 | 1-Hexene | 1.47 | Propane | 1.11
20 | Dichlorodifluoromethane | 1.99 | Isopentane | 1.46 | Cumene | 1.11
21 | Propylene | 1.98 | Acetone | 1.27 | Naphthalene | 1.04
22 | Isobutane | 1.98 | Methyl methacrylate | 1.05 | 2-Hexanone | 0.9
23 | Trichloromethane | 1.85 | Cis-2-butene | 1.03 | N-butane | 0.83
24 | Isopropanol | 1.71 | 2-Hexanone | 1.01 | Ethylbenzene | 0.81
25 | 1,2-Dichloropropane | 1.53 | N-butane | 0.97 | 2-Butanone | 0.79
26 | Ethane | 1.53 | M-diethylbenzene | 0.92 | N-hexane | 0.63
27 | Ethylbenzene | 1.43 | Propane | 0.91 | Isobutane | 0.62
28 | N-pentane | 1.35 | 1,3-Butadiene | 0.91 | N-pentane | 0.61
29 | Chloromethane | 1.28 | Trans-2-butene | 0.88 | 2,3,4-Trimethylpentane | 0.57
30 | N-hexane | 1.21 | Isobutane | 0.86 | N-dodecane | 0.44
Sum | | 137.96 μg/m3 (79.37%) | | 87.68% | | 95.10%

Fig. 1 shows that NO2 dominates the formation of PM2.5 and PM10, and that SO2 is essential but less significant than NO2 in shaping PM, since SO2 and NO2 are the major precursors of PM. CO is strongly positively correlated with PM2.5, PM10 and NO2, with Pearson correlation coefficients (R) of 0.73, 0.71 and 0.78, respectively. The equation CO + NO3 ⇌ CO2 + NO2 (Abián et al., 2019) indicates why CO and NO2 are positively correlated (R = 0.78). According to the chemical equilibrium, adding CO increases NO2 and vice versa. CO and O3 are negatively correlated because of the chemical equilibrium of the reaction CO + O3 ⇌ CO2 + O2. Temperature is negatively correlated with SO2 and NO2, with R of −0.47 and −0.48, respectively. O3 and PM2.5 are negatively correlated because an increase in PM2.5 speeds up the aerosol sink of hydroperoxyl (HO2) radicals (Li et al., 2019a), the precursors of O3. NO2 and O3 are negatively correlated due to the NO2-O3 titration effect (Li et al., 2012). OFP has a greater impact than PEC on O3. Poor visibility is attributed to increases in dew-point deficit and PM2.5. More importantly, dew-point deficit plays a more significant role than relative humidity in shaping all the air pollutants. The contribution of the urban heat island (UHI) to pollutants is rather small, except for VOCs, because most VOCs are emitted by the aforementioned industrial source near OSC, which is a point source, and a point source is greatly influenced by the wind speed and wind direction modified by UHI. Precipitation has more impact on PM2.5 and PM10 than on the gaseous air pollutants. SOAFP has little influence on PM, indicating that the components of PM are mainly inorganic. CO and PM2.5 are positively correlated (R = 0.73): PM2.5 is the primary FI for CO while CO is the secondary FI for PM2.5. Since the primary sources of CO and PM2.5 are traffic and industry, respectively, CO and PM2.5 are not emitted simultaneously by the same type of source. Therefore, CO has an impact on PM2.5. Two potential reasons account for this. Firstly, CO elevates PM2.5 by shaping NO2. Secondly, since CO is not a precursor of PM2.5, CO may act as a trigger that stimulates and accelerates the formation of secondary PM2.5. The function of CO in the formation of PM2.5 needs to be better understood.
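The Pearson correlation coefficients (R) quoted in Section 3.2 can be reproduced from paired daily averages with a few lines of code. The sketch below is only an illustration of the standard formula; the function name `pearson_r` and the two short series are hypothetical and are not the measurements used in this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations and the two sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily averages, invented for illustration only (arbitrary units).
co_daily = [0.6, 0.8, 1.1, 0.9, 1.4, 0.7]
pm25_daily = [28.0, 35.0, 52.0, 41.0, 60.0, 33.0]
r = pearson_r(co_daily, pm25_daily)
```

A value of R close to +1 or −1 indicates a strong linear relationship; it does not by itself establish which pollutant drives the other, which is why the text also relies on FI.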
3.3. Model comparison
Our goal is to build a fast and accurate model to predict pollutant levels over the next 24 h. For SO2, NO2, CO, PM2.5 and PM10, the testing data are the daily averages of one day, while the training data are the daily averages from the previous three months. For ground-level O3, the testing data are the O3-8 h values (the maximum daily average of O3 over eight consecutive hours), while the training data are the daily data from the previous three months averaged over the same time period as O3-8 h. Training data are derived from actual observations while testing data are simulated by WRF. Because SO2, NO2 and CO are mainly influenced by meteorology, only meteorological data are used to simulate them. The input includes six covariates: temperature, dew-point deficit, precipitation, atmospheric pressure, wind speed and wind direction. PM2.5 and PM10 are predicted using RNN-simulated SO2,
NO2 and CO coupled with WRF-simulated meteorology. O3-8 h is predicted using RNN-simulated NO2, PM2.5 and CO coupled with WRF-simulated meteorology. Hangzhou has a subtropical climate with four seasons, so we pick January, April, July and October to represent winter, spring, summer and autumn, respectively. Random Forest, RNN and WRF-CMAQ are compared in Fig. 2. Statistical indicators (see Table S2) indicate that RNN is the best model for forecasting air pollutants, followed by Random Forest and WRF-CMAQ. More importantly, the main difference between traditional atmospheric models, such as WRF-CMAQ, and machine learning is the air pollution emission inventory. A traditional atmospheric model needs an accurate inventory for the simulation. Building a national-scale inventory is costly and takes a lot of effort in the
collection of local data. Any neglected pollution source leads to inaccuracy. In contrast, machine learning does not need any inventory, only a few samples for retraining when the pollution sources change. The Multi-resolution Emission Inventory for China (MEIC) (Li et al., 2019a) is updated every two years, which is too slow for a rapidly growing economy like China, whereas the training samples for machine learning cover at most the previous 100 days. Furthermore, the calculation time of machine learning is less than one percent of that of the traditional atmospheric model. In this case, simulating the hourly values of seven air pollutants for four months of 2018 with WRF-CMAQ takes more than 6 days on a 256-core computer cluster, while machine learning takes less than 1 h to simulate the same data on a personal laptop with 4 cores. In addition, machine learning is able to reveal the implicit relationships among
pollutants where CMAQ fails. For instance, FI indicates that CO and PM2.5 are strongly positively connected, but CMAQ has no such function to estimate PM2.5. Thus, machine learning can be used to improve the chemical mechanisms applied in CMAQ. The only shortcoming of machine learning is that it is unable to calculate the sector contributions (Feng et al., 2018) of pollutants.

Fig. 2. Air pollution predicted by WRF-CMAQ, Random Forest and RNN: (a) SO2; (b) NO2; (c) CO; (d) PM2.5; (e) PM10; (f) O3.
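The O3-8 h metric used in Section 3.3 (the maximum daily average of O3 over eight consecutive hours) can be computed from hourly values with a sliding window. The following is a minimal sketch under the assumption that only windows lying entirely within one calendar day are considered; regulatory definitions may treat windows that cross midnight differently. The function name and the sample day are hypothetical.

```python
def max_daily_8h_mean(hourly):
    """Largest mean over any 8 consecutive hours within one day.

    `hourly` holds the hourly O3 concentrations of a single day (normally 24 values).
    """
    window = 8
    if len(hourly) < window:
        raise ValueError("need at least 8 hourly values")
    # Running sum of the current 8-hour window, updated in O(1) per step.
    running = sum(hourly[:window])
    best = running
    for i in range(window, len(hourly)):
        running += hourly[i] - hourly[i - window]
        best = max(best, running)
    return best / window


# Hypothetical hourly O3 values for one day, invented for illustration only.
example_day = [30.0] * 10 + [90.0] * 8 + [40.0] * 6
o3_8h = max_daily_8h_mean(example_day)
```

In this toy day the afternoon plateau of eight equal hours fills one whole window, so the metric equals the plateau value.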
3.4. Estimation of regional transport rate of air pollutants and reverse identification of air pollution source by RNN
Since Random Forest and RNN are able to capture the secondary PM2.5 and PM10 with the input of their precursors, the main reason
for underestimation of PM2.5 and PM10 is long-range transport by regional movement of air masses. Intrusion of cold air from the Siberian plateau in Mongolia (Li et al., 2019b) accounts for the PM10 episodes in mid-January and mid-April. RNN outperforms Random Forest because it uses temporal contextual information (Liu and Sullivan, 2019). The training period and testing period are set to be the same, as RNN is able to filter out extreme fluctuations. Inputs include meteorology and observed pollution (SO2, NO2, CO, O3 and VOCs). When the observations of PM2.5 and PM10 are more than 5% higher than the simulation, the surplus is considered to be transported over long range. We take the PM10 episodes of 16–20 January and 16–18 April as examples (see Table S3). Compared with the
zero-out strategy (Feng et al., 2018), this method is much faster. The results show that about 35% of PM10 in the episodes comes from long-range transport. There are ten environmental monitoring stations in Hangzhou, over one hundred in the Yangtze River Delta (YRD) and over two thousand in China (see Fig. 3). All the stations measure hourly meteorology and atmospheric pollution. RNN can be trained to identify the emission locations of pollutants, and Fig. 3 is an example for national PM2.5. The main sources of PM2.5 are clearly in central China. Furthermore, Fig. 4 shows the temporal and spatial distribution of PM10 in the YRD using MEIC as the inventory. The center of Hangzhou (30.2° N, 120.2° E) is located near the intersection of the dotted lines. In Fig. 4, the hot spots indicate that the sources of PM10 are due north of Hangzhou. The daily average data from the ten local environmental monitoring stations are used to train RNN. Wind speed is set to zero to predict the pollutants. The results indicate that the two stations in the northeast of Hangzhou have relatively high values of PM10. There is no big difference among the other stations in downtown Hangzhou. Therefore, the hot spots of PM10 should mainly lie in northeast Hangzhou rather than due north, suggesting that the inventory is inaccurate. Thus, we improve the inventory by reversely identifying the locations of pollution emission sources. Indeed, the newly built heavy industry zone is located northeast of downtown Hangzhou.

Fig. 3. National distribution of PM2.5 in January 2018.
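The surplus rule of Section 3.4 (observations more than 5% above the RNN simulation are attributed to long-range transport) can be sketched as below. This is an illustration only: it assumes that the whole difference between observation and simulation counts as surplus once the 5% threshold is exceeded, which the text does not spell out, and the function name and sample values are hypothetical.

```python
def transported_fraction(observed, simulated, threshold=0.05):
    """Share of the observed total attributed to long-range transport.

    A time step contributes (observed - simulated) to the surplus only when the
    observation exceeds the simulation by more than `threshold` (5% by default).
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    surplus = 0.0
    for obs, sim in zip(observed, simulated):
        if obs > sim * (1.0 + threshold):
            surplus += obs - sim  # assumption: the full excess is counted
    total = sum(observed)
    return surplus / total if total > 0 else 0.0


# Hypothetical daily PM10 values, invented for illustration only.
observed_pm10 = [100.0, 150.0, 102.0]
simulated_pm10 = [100.0, 100.0, 100.0]
fraction = transported_fraction(observed_pm10, simulated_pm10)
```

Here only the second day exceeds the threshold, so the surplus is 50 out of an observed total of 352.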
Furthermore, RNN can be used to forecast air pollution during the Asian Games or other big events. The accuracy of RNN depends on accurate predictions of meteorology by WRF. Normally, an emission reduction plan for air pollutants is enforced by the government six months before a big event takes place in China. For instance, an emission reduction plan was implemented in January 2008 to ensure good air quality for the 2008 Beijing Olympic Games. Therefore, we can collect enough samples to train RNN to forecast air pollutant levels accurately and evaluate the effectiveness of the emission reduction plan.

Fig. 4. WRF-CMAQ simulated PM10 (μg/m3) over the YRD.

4. Conclusion
In this work, one traditional atmospheric model, WRF-CMAQ, and two machine learning models, Random Forest and Recurrent Neural Network, are used to analyze and forecast air pollutants in Hangzhou, China. According to the Feature Importance (FI) generated by Random Forest, the influence of meteorological conditions and precursors on a given pollutant is quantitatively demonstrated for the first time, showing the following results. NO2 dominates the formation of PM2.5 and PM10. Temperature is negatively correlated with SO2 and NO2. CO is positively correlated with PM2.5, PM10 and NO2 and negatively correlated with O3. The strange connection between CO and PM2.5 needs further investigation. Poor visibility is attributed to increases in dew-point deficit and PM2.5. More
importantly, dew-point deficit plays a more significant role than relative humidity in shaping all the air pollutants, which has not been reported in previous work. The contribution of the urban heat island (UHI) to pollutants is rather small, except for VOCs, because most VOCs are emitted by a point source. Precipitation has more impact on PM2.5 and PM10 than on the gaseous air pollutants. SOAFP has little influence on PM, indicating that the components of PM are mainly ionic and inorganic compounds. Furthermore, a method to forecast air pollutants, including SO2, NO2, CO, PM2.5, PM10 and O3, over the next 24 h is proposed. The procedure is as follows. Firstly, generate the simulated meteorology for the next 24 h with WRF. Secondly, forecast SO2, NO2 and CO for the next 24 h with RNN using only WRF-simulated meteorology. The training period is the previous three months of hourly meteorology, and the training data are derived from the observed meteorology. Thirdly, forecast PM2.5 and PM10 for the next 24 h with RNN using WRF-simulated meteorology and RNN-simulated SO2, NO2 and CO. Fourthly, forecast ground-level O3 for the next 24 h with RNN using WRF-simulated meteorology and RNN-simulated NO2, CO and PM2.5. Then, a method to estimate the regional transport rate of air pollutants and reversely identify air pollution emission sources with RNN is introduced. Finally, a policy assessment is made to better regulate air pollution in Hangzhou prior to the 2022 Asian Games.

Author contribution
Rui Feng designed the experiments and carried them out. Hui-jun Zheng developed the machine learning code and performed the simulations. Rui Feng prepared the manuscript with contributions from all co-authors.

Acknowledgment
This work is financially supported by the National Natural Science Foundation of China (No. 51476144).

Appendix A. Supplementary data
Supplementary data to this article can be found online at https://doi.org/10.1016/j.jclepro.2019.05.319.
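The four-step forecasting procedure summarized in the Conclusion is a chain in which each stage feeds the next. The sketch below shows only that data flow; it is not the authors' code. The function `forecast_next_24h`, the dictionary keys and the toy lambda rules standing in for trained RNNs are all hypothetical.

```python
def forecast_next_24h(met, gas_models, pm_models, o3_model):
    """Chain the forecasts: meteorology -> gases -> particulates -> ozone.

    met        : dict of WRF-simulated meteorological variables (step 1)
    gas_models : dict name -> callable(inputs) for SO2, NO2 and CO (step 2)
    pm_models  : dict name -> callable(inputs) for PM2.5 and PM10 (step 3)
    o3_model   : callable(inputs) for ground-level O3 (step 4)
    """
    # Step 2: the gases are forecast from meteorology only.
    gases = {name: model(dict(met)) for name, model in gas_models.items()}
    # Step 3: particulates use meteorology plus the forecast SO2, NO2 and CO.
    pm_inputs = {**met, **gases}
    pm = {name: model(pm_inputs) for name, model in pm_models.items()}
    # Step 4: ozone uses meteorology plus the forecast NO2, CO and PM2.5.
    o3_inputs = {**met, "NO2": gases["NO2"], "CO": gases["CO"], "PM2.5": pm["PM2.5"]}
    return {**gases, **pm, "O3": o3_model(o3_inputs)}


# Toy stand-ins for trained RNNs: arbitrary linear rules, for illustration only.
met = {"temperature": 20.0, "wind_speed": 2.0}
gas_models = {
    "SO2": lambda x: 10.0 - x["wind_speed"],
    "NO2": lambda x: 40.0 - 0.5 * x["temperature"],
    "CO": lambda x: 1.0 - 0.1 * x["wind_speed"],
}
pm_models = {
    "PM2.5": lambda x: 0.8 * x["NO2"] + 0.5 * x["SO2"],
    "PM10": lambda x: 1.2 * x["NO2"] + 0.5 * x["SO2"],
}
o3_model = lambda x: 100.0 - x["NO2"] - 0.2 * x["PM2.5"]

forecast = forecast_next_24h(met, gas_models, pm_models, o3_model)
```

Passing the models in as callables keeps the chain independent of any particular machine learning library, so the same flow applies whether the stages are RNNs or Random Forests.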
References
Abián, M., Millera, A., Bilbao, R., Alzueta, M., 2019. Effect of CO2 atmosphere and presence of NOx (NO and NO2) on the moist oxidation of CO. Fuel 236, 615–621. https://doi.org/10.1016/j.fuel.2018.09.054.
Atkinson, R., Arey, J., 2003. Atmospheric degradation of volatile organic compounds. Chem. Rev. 103, 4605–4638. https://pubs.acs.org/doi/pdf/10.1021/cr0206420.
Bhowmik, M., Debnath, A., Saha, B., 2018. Mixed phase Fe2O3/Mn3O4 magnetic nanocomposite for enhanced adsorption of methyl orange dye: neural network modeling and response surface methodology optimization. Appl. Organomet. Chem. 32 (3), e4186. https://doi.org/10.1002/aoc.4186.
Bhowmik, K., Debnath, A., Nath, R., Saha, B., 2017. Synthesis of MnFe2O4 and Mn3O4 magnetic nano-composites with enhanced properties for adsorption of Cr(VI): Artificial neural network modeling. Water Sci. Technol. 76 (12), 3368–3378. https://doi.org/10.2166/wst.2017.501.
Bhowmik, K., Debnath, A., Nath, R., Das, S., Chattopadhyay, K., Saha, B., 2016. Synthesis and characterization of mixed phase manganese ferrite and hausmannite magnetic nanoparticle as potential adsorbent for methyl orange from aqueous media: Artificial neural network modeling. J. Mol. Liq. 219, 1010–1022. https://doi.org/10.1016/j.molliq.2016.04.009.
Breedt, H., Craig, K., Jothiprakasam, V., 2018. Monin-Obukhov similarity theory and its application to wind flow modelling over complex terrain. J. Wind Eng. Ind. Aerodyn. 182, 308–321. https://doi.org/10.1016/j.jweia.2018.09.026.
Carter, W., 2010. Development of the SAPRC-07 chemical mechanism. Atmos. Environ. 44 (40), 5324–5335. https://doi.org/10.1016/j.atmosenv.2010.01.026.
Chen, X., Wang, X., Huang, J., Zhang, L., Song, F., Mao, H., Chen, K., Chen, J., Liu, Y., Jiang, G., Dong, G., Bai, Z., Tang, N., 2017. Nonmalignant respiratory mortality and long-term exposure to PM10 and SO2: a 12-year cohort study in northern China. Environ. Pollut. 231 (1), 761–767. https://doi.org/10.1016/j.envpol.2017.08.085.
Debnath, A., Majumder, M., Pal, M., 2015. A cognitive approach in selection of source for water treatment plant based on climatic impact. Water Resour. Manag. 29 (6), 1907–1919. https://doi.org/10.1007/s11269-015-0918-x.
Derwent, R., Jenkin, M., Utembe, S., Shalcross, D., Murrels, T., Passant, N., 2010. Secondary organic aerosol formation from a large number of reactive man-made organic compounds. Sci. Total Environ. 408, 3374–3381. https://doi.org/10.1016/j.scitotenv.2010.04.013.
Duan, J., Tan, J., Yang, L., Wu, S., Hao, J., 2008. Concentration, sources and ozone formation potential of volatile organic compounds (VOCs) during ozone episode in Beijing. Atmos. Res. 88 (1), 25–35. https://doi.org/10.1016/j.atmosres.2007.09.004.
Fauchez, T., Platnick, S., Várnai, T., Meyer, K., Cornet, C., Szczap, F., 2018. Scale dependence of cirrus heterogeneity effects, Part II: MODIS NIR and SWIR channels. Atmos. Chem. Phys. 18, 12105–12121. https://doi.org/10.5194/acp-18-12105-2018.
Feng, R., Wang, Q., Huang, C., Liang, J., Luo, K., Fan, J., Zheng, H., 2018. Ethylene, xylene, toluene and hexane are major contributors of atmospheric ozone in Hangzhou, China, prior to the 2022 Asian Games. Environ. Chem. Lett. (in press) https://doi.org/10.1007/s10311-018-00846-w.
Fountoukis, C., Martín-Pomares, L., Perez-Astudillo, D., Bachour, D., Gladich, I., 2018. Simulating global horizontal irradiance in the Arabian Peninsula: sensitivity to explicit treatment of aerosols. Sol. Energy 163, 347–355. https://doi.org/10.1016/j.solener.2018.02.001.
Gałęzowska, G., Chraniuk, M., Wolska, L., 2016. In vitro assays as a tool for determination of VOCs toxic effect on respiratory system: a critical review. Trac. Trends Anal. Chem. 77, 14–22. https://doi.org/10.1016/j.trac.2015.10.012.
Grell, G.A., Freitas, S.R., 2014. A scale and aerosol aware stochastic convective parameterization for weather and air quality modeling. Atmos. Chem. Phys. 14, 5233–5250. https://doi.org/10.5194/acp-14-5233-2014.
Guerra, L., Teles de Souza, A., Côrtes, J., Lione, V., Castro, H., Alves, G., 2017. Assessment of predictivity of volatile organic compounds carcinogenicity and mutagenicity by freeware in silico models. Regul. Toxicol. Pharmacol. 91, 1–8. https://doi.org/10.1016/j.yrtph.2017.09.030.
Hui, L., Liu, X., Tan, Q., Feng, M., An, J., Qu, Y., Zhang, Y., Jiang, M., 2018. Characteristics, source apportionment and contribution of VOCs to ozone formation in Wuhan, Central China. Atmos. Environ. 192, 55–71. https://doi.org/10.1016/j.atmosenv.2018.08.042.
Hvidtfeldt, U., Sørensen, M., Geels, C., Ketzel, M., Khan, J., Tjønneland, A., Overvad, K., Brandt, J., Raaschou-Nielsen, O., 2019. Long-term residential exposure to PM2.5, PM10, black carbon, NO2, and ozone and mortality in a Danish cohort. Environ. Int. 123, 265–272. https://doi.org/10.1016/j.envint.2018.12.010.
Jia, C., Mao, X., Huang, T., Liang, X., Wang, Y., Shen, Y., Jiang, W., Wang, H., Bai, Z., Ma, M., Yu, Z., Ma, J., Gao, H., 2016. Non-methane hydrocarbons (NMHCs) and their contribution to ozone formation potential in a petrochemical industrialized city, Northwest China. Atmos. Res. 169 (A), 225–236. https://doi.org/10.1016/j.atmosres.2015.10.006.
Kamińska, J., 2019. A random forest partition model for predicting NO2 concentrations from traffic flow and meteorological conditions. Sci. Total Environ. 651 (1), 475–483. https://doi.org/10.1016/j.scitotenv.2018.09.196.
Kishné, A., Yimam, Y., Morgan, C., Dornblaser, B., 2017. Evaluation and improvement of the default soil hydraulic parameters for the Noah Land Surface Model. Geoderma 285, 247–259. https://doi.org/10.1016/j.geoderma.2016.09.022.
Lei, M., Wu, S., Liu, G., Amirkhanian, S., 2018. VOCs characteristics and their relation with rheological properties of base and modified bitumens at different temperatures. Constr. Build. Mater. 160, 794–801. https://doi.org/10.1016/j.conbuildmat.2017.12.158.
Li, L., Chen, C.H., Huang, C., Huang, H.Y., Zhang, G.F., Wang, Y.J., Wang, H.L., Lou, S.R., Qiao, L.P., Zhou, M., Chen, M.H., Chen, Y.R., Streets, D.G., Fu, J.S., Jang, C.J., 2012. Process analysis of regional ozone formation over the Yangtze River Delta, China using the Community Multi-scale Air Quality modeling system. Atmos. Chem. Phys. 12, 10971–10987. https://doi.org/10.5194/acp-12-10971-2012.
Li, Y., Guan, B., Tao, S., Wang, X., He, K., 2018. A review of air pollution impact on subjective well-being: survey versus visual psychophysics. J. Clean. Prod. 184, 959–968. https://doi.org/10.1016/j.jclepro.2018.02.296.
Li, K., Jacob, D., Liao, H., Shen, L., Zhang, Q., Bates, K., 2019a. Anthropogenic drivers of 2013–2017 trends in summer surface ozone in China. Proc. Natl. Acad. Sci. USA 116 (2), 422–427. https://doi.org/10.1073/pnas.1812168116.
Li, J., Liao, H., Hu, J., Li, N., 2019b. Severe particulate pollution days in China during 2013–2018 and the associated typical weather patterns in Beijing-Tianjin-Hebei and the Yangtze River Delta regions. Environ. Pollut. 248, 74–81. https://doi.org/10.1016/j.envpol.2019.01.124.
Liu, Z., Sullivan, C., 2019. Prediction of weather induced background radiation fluctuation with recurrent neural networks. Radiat. Phys. Chem. 155, 275–280. https://doi.org/10.1016/j.radphyschem.2018.03.005.
Liu, G., Li, J., Wu, D., Xu, H., 2015. Chemical composition and source apportionment of the ambient PM2.5 in Hangzhou, China. Particuology 18, 135–143. https://doi.org/10.1016/j.partic.2014.03.011.
Liu, M., Huang, Y., Jin, Z., Ma, Z., Liu, X., Zhang, B., Liu, Y., Yu, Y., Wang, J., Bi, J., Kinney, P., 2017. The nexus between urbanization and PM2.5 related mortality in China. Environ. Pollut. 227, 15–23. https://doi.org/10.1016/j.envpol.2017.04.049.
Liu, H., Liu, S., Xue, B., Lv, Z., Meng, Z., Yang, X., Xue, T., Yu, Q., He, K., 2018a. Ground-level ozone pollution and its health impacts in China. Atmos. Environ. 173,
R. Feng et al. / Journal of Cleaner Production 231 (2019) 1005e1015 223e230. https://doi.org/10.1016/j.atmosenv.2017.11.014. Liu, C., Yin, P., Chen, R., Meng, X., Wang, L., Niu, Y., Lin, Z., Liu, Y., Liu, J., Qi, J., You, J., Kan, H., Zhou, M., 2018b. Ambient carbon monoxide and cardiovascular mortality: a nationwide time-series analysis in 272 cities in China. The Lancet Planetary Health 2 (1), e12ee18. https://doi.org/10.1016/S2542-5196(17)30181X. Liu, Y., Li, L., An, J., Huang, L., Yan, R., Huang, C., Wang, H., Wang, Q., Wang, M., Zhang, W., 2018c. Estimation of biogenic VOC emissions and its impact on ozone formation over the Yangtze River Delta region, China, 186. Atmospheric Environment, pp. 113e128. https://doi.org/10.1016/j.atmosenv.2018.05.027. Nie, E., Zheng, G., Gao, D., Chen, T., Yang, J., Wang, Y., Wang, X., 2019. Emission characteristics of VOCs and potential ozone formation from a full-scale sewage sludge composting plant. Science of The Total Environment, pp. 664e672. https://doi.org/10.1016/j.scitotenv.2018.12.404. Palani, S., Tkalich, P., Balasubramanian, R., Palanichamy, J., 2011. ANN application for prediction of atmospheric nitrogen deposition to aquatic ecosystems. Mar. Pollut. Bull. 62 (6), 1198e1206. https://doi.org/10.1016/j.marpolbul.2011.03.033. Parra, R., 2018. Performance studies of planetary boundary layer schemes in WRFChem for the Andean region of Southern Ecuador. Atmos. Pollut. Res. 9 (3), 411e428. https://doi.org/10.1016/j.apr.2017.11.011. Razmjoo, A., Xanthopoulos, P., Zheng, Q., 2017. Online feature importance ranking based on sensitivity analysis. Expert Syst. Appl. 85, 397e406. https://doi.org/10. 1016/j.eswa.2017.05.016. Rynkiewicz, J., 2019. Asymptotic statistics for multilayer perceptron with ReLU hidden units. Neurocomputing 342, 16e23. https://doi.org/10.1016/j.neucom. 2018.11.097. Tan, Z., Lu, K., Jiang, M., Su, R., Dong, H., Zeng, L., Xie, S., Tan, Q., Zhang, Y., 2018. 
Exploring ozone pollution in Chengdu, southwestern China: a case study from radical chemistry to O3-VOC-NOx sensitivity. Sci. Total Environ. 636, 775e786. https://doi.org/10.1016/j.scitotenv.2018.04.286. Vo, T., Lin, C., Weng, C., Yuan, C., Lee, C., Hung, C., Bui, X., Lo, K., Lin, J., 2018. Vertical stratification of volatile organic compounds and their photochemical product formation potential in an industrial urban area. J. Environ. Manag. 217, 327e336. https://doi.org/10.1016/j.jenvman.2018.03.101. Wagoner, E., Hayes, J., Karty, J., Peters, D., 2012. Direct and nickel(I) salen-catalyzed reduction of 1,1,2-trichloro-1,2,2-trifluoroethane (CFC-113) in dimethylformamide. J. Electroanal. Chem. 676, 6e12. https://doi.org/10.1016/j.jelechem.2012. 04.023.
1015
Wan, D., Song, L., Mao, X., Yang, J., Jin, Z., Yang, H., 2018. One-century sediment records of heavy metal pollution on the southeast Mongolian Plateau: implications for air pollution trends in China. Chemosphere 220, 539e545. https:// doi.org/10.1016/j.chemosphere.2018.12.151. Wang, P., Ying, Q., Zhang, H., Hu, J., Lin, Y., Mao, H., 2018. Source apportionment of secondary organic aerosol in China using a regional source-oriented chemical transport model and two emission inventories. Environ. Pollut. 237, 756e766. https://doi.org/10.1016/j.envpol.2017.10.122. Wang, N., Li, N., Liu, Z., Evans, E., 2016. Investigation of chemical reactivity and active components of ambient VOCs in Jinan, China. Air Quality, Atmosphere & Health 9 (7), 785e793. https://doi.org/10.1007/s11869-015-0380-1. Xiao, H., Sun, J., Bian, X., Dai, Z., 2013. GPU acceleration of the WSM6 cloud microphysics scheme in GRAPES model. Comput. Geosci. 59, 156e162. https:// doi.org/10.1016/j.cageo.2013.06.016. Xu, R., Tie, X., Li, G., Zhao, S., Cao, J., Feng, T., Long, X., 2018. Effect of biomass burning on black carbon (BC) in South Asia and Tibetan Plateau: the analysis of WRFChem modeling. Sci. Total Environ. 645, 901e912. https://doi.org/10.1016/j. scitotenv.2018.07.165. Yu, H., Guenther, A., Gu, D., Warneke, C., Geron, C., Goldstein, A., Graus, M., Karl, T., Kaser, L., Misztal, P., Yuan, B., 2017. Airborne measurements of isoprene and monoterpene emissions from southeastern U.S. forests. Sci. Total Environ. 595, 149e158. https://doi.org/10.1016/j.scitotenv.2017.03.262. an, G., Pekey, H., Tuncel, G., 2018. Temporal Yurdakul, S., Civan, M., Kuntasal, O., Dog variations of VOCS concentrations in Bursa atmosphere. Atmos. Pollut. Res. 9 (2), 189e206. https://doi.org/10.1016/j.apr.2017.09.004. Zhang, Y., Zhang, X., Wang, L., Zhang, Q., Duan, F., He, K., 2016. Application of WRF/ chem over East asia: Part I. Model evaluation and intercomparison with MM5/ CMAQ, 124. Atmospheric Environment, pp. 285e300. 
https://doi.org/10.1016/j. atmosenv.2015.07.022. B. Zhang, R., Cohan, A., Biazar, A., Cohan, D., 2017. Source apportionment of biogenic contributions to ozone formation over the United States. Atmos. Environ. 164, 8e19. https://doi.org/10.1016/j.atmosenv.2017.05.044. Zhou, L., Shan, X., Chen, X., Yin, X., Zhang, X., Xu, C., Wei, Z., Xu, K., 2006. Investigation of electron momentum distributions for outer valence orbitals of trichlorofluoromethane by (e, 2e) electron momentum spectroscopy. J. Electron. Spectrosc. Relat. Phenom. 153 (1e2), 58e64. https://doi.org/10.1016/j.elspec. 2006.05.006.