Cross-comparison and evaluation of air pollution field estimation methods


Accepted Manuscript

Haofei Yu, Armistead Russell, James Mullholland, Talat Odman, Yongtao Hu, Howard H. Chang, Naresh Kumar

PII: S1352-2310(18)30059-1
DOI: 10.1016/j.atmosenv.2018.01.045
Reference: AEA 15804
To appear in: Atmospheric Environment
Received Date: 29 August 2017
Revised Date: 7 January 2018
Accepted Date: 27 January 2018

Please cite this article as: Yu, H., Russell, A., Mullholland, J., Odman, T., Hu, Y., Chang, H.H., Kumar, N., Cross-comparison and evaluation of air pollution field estimation methods, Atmospheric Environment (2018), doi: 10.1016/j.atmosenv.2018.01.045. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Haofei Yu1,2, Armistead Russell1,*, James Mullholland1, Talat Odman1, Yongtao Hu1, Howard H Chang3, Naresh Kumar4

1 School of Civil and Environmental Engineering. Georgia Institute of Technology. Atlanta, GA. USA.
2 Department of Civil, Environmental and Construction Engineering. University of Central Florida. Orlando, FL. USA.
3 Department of Biostatistics and Bioinformatics. Rollins School of Public Health. Emory University. Atlanta, GA. USA.
4 Electric Power Research Institute. Palo Alto, CA. USA

*Corresponding author
Address: 311 Ferst Dr NW. EST 3210, Atlanta, GA 30332.
E-mail: [email protected]
Phone: +1 (404) 894-3079
Fax: +1 (404) 894-8266

Abstract

Accurate estimates of human exposure are critical for air pollution health studies, and a variety of methods are currently used to assign pollutant concentrations to populations. Results from these methods may differ substantially, which can affect the outcomes of health impact assessments. Here, we applied 14 methods for developing spatiotemporal air pollutant concentration fields of eight pollutants to the Atlanta, Georgia region. These include eight methods relying mostly on air quality observations (CM: central monitor; SA: spatial average; IDW: inverse distance weighting; KRIG: kriging; TESS-D: discontinuous tessellation; TESS-NN: natural neighbor tessellation with interpolation; LUR: land use regression; AOD: downscaled satellite-derived aerosol optical depth), one using the RLINE dispersion model, and five methods using a chemical transport model (CMAQ), with and without using observational data to constrain results. The derived fields were evaluated and compared. Overall, all methods generally performed better in urban than in rural areas, and for secondary than for primary pollutants. We found that the CM and SA methods may be appropriate only for small domains and for secondary pollutants, though the SA method led to large negative spatial correlations when using data withholding for PM2.5 (spatial correlation coefficient R = -0.81). The TESS-D method was found to have major limitations. Results of the IDW, KRIG and TESS-NN methods were similar. They were found to be better suited for secondary pollutants because of their satisfactory temporal performance (e.g., average temporal R2 > 0.85 for PM2.5 but less than 0.35 for the primary pollutant NO2). In addition, they are suitable mainly for areas with relatively dense monitoring networks because they cannot capture spatial concentration variability, as indicated by the negative spatial R (lower than -0.2 for PM2.5 when assessed using data withholding). The performance of the LUR and AOD methods was similar to that of kriging. Using RLINE and CMAQ fields without fusing observational data led to substantial errors and biases, though the CMAQ model captured spatial gradients reasonably well (spatial R = 0.45 for PM2.5). Two unique tests conducted here quantified the autocorrelation of method biases (which can be important in time series analyses) and how well the methods capture the observed interspecies correlations (which would be of particular importance in multipollutant health assessments). When air quality models were used without data fusion, autocorrelation of method biases lasted longest and interspecies correlations of primary pollutants were higher than observed. Use of hybrid methods that combine air quality model outputs with observational data overcomes some of these limitations and is better suited for health studies. Results from this study contribute to a better understanding of the strengths and weaknesses of different methods for estimating human exposures.

Keywords

Air pollution; exposure estimation; health impacts; data fusion; hybrid model

1. Introduction


Exposure to ambient air pollution is associated with a range of adverse health outcomes, from minor respiratory tract irritation to increased premature mortality (Bernstein et al. 2004; Kampa and Castanas 2008; Kaufman et al. 2016; Kim 2004; Pope and Dockery 2006; Rohr and Wyzga 2012). Recent findings suggest that globally, air pollution is one of the largest environmental health risks and a leading factor for disease burden (Gakidou et al. 2017). Such studies use estimates of air pollutant concentration fields in reaching their conclusions.


In the past, a variety of approaches have been used to assign pollution concentration levels to both individuals and populations, including simple measurement-based methods (Jerrett et al. 2005), empirical modeling (Zou et al. 2009) and air quality modeling techniques (Özkaynak et al. 2013a). More recently, hybrid methods that combine multiple approaches have been developed (Beckerman et al. 2013; Hu et al. 2014; Ivey et al. 2015; Ivey et al. 2016; Ozkaynak et al. 2013). Simpler methods are typically easier to implement, as they require minimal inputs and little complex mathematical modeling. However, their results may be limited in the type of information provided and/or in their accuracy. Emission-based air quality models can require extensive amounts of input data, computational effort, and expertise in their application, but can account for more factors affecting ambient pollutant concentrations and are designed to better characterize the physical and chemical behaviors of different types of pollutants. Air quality models can also suffer from biases and errors resulting from inaccuracies in emissions or meteorology, or from incomplete physics and chemistry in the models. Satellite-retrieval-based methods can capture spatial pollutant variations, but the information is typically derived from columnar observations, is limited to a few pollutants and can suffer from missing data (e.g., on cloudy days) (Kloog et al. 2011; Li et al. 2015). Overall, each method has its own strengths and weaknesses. Results and outcomes of exposure estimation and epidemiologic studies will also likely vary depending on the method used (Baxter et al. 2013; Goldman et al. 2010; Goldman et al. 2012).


In this study, we applied fourteen exposure estimation methods to the Atlanta, Georgia area in the USA to provide daily pollutant concentration fields for a number of health-relevant pollutants. Comprehensive comparisons and evaluations across these methods were performed to better understand the unique characteristics of the different methods, and to inform users of potential issues that may impact their applications in health studies. Comparisons focused on four aspects: 1) spatial concentration fields; 2) temporal correlations and statistical performance; 3) auto-correlations of estimated concentrations and method errors; and 4) interspecies correlations. Each of these attributes can be important to both spatially- and temporally-focused (e.g., time series) health studies. While many of the methods are primarily oriented towards improving the spatial features of estimated pollutant fields, such information is important in reducing exposure misclassification error in time-series studies. The strengths and weaknesses of each method are then discussed regarding their applications in epidemiology studies, and recommendations are made for the selection and use of different methods based on their performance.


2. Methods


Fourteen methods were applied in this study, including eight observation-based methods: Central Monitor (CM), Site Average (SA), Inverse Distance Weighting (IDW), Kriging (KRIG), discontinuous tessellation (TESS-D), continuous natural neighbor interpolation based on tessellation (TESS-NN), Land Use Regression (LUR) and satellite Aerosol Optical Depth (AOD); two emissions-based air quality modeling methods: RLINE dispersion and CMAQ chemical transport modeling (CTM); and four hybrid modeling methods: CMAQ-kriging adjustment (CMAQ-Krig), CMAQ combined with data fusion (CMAQ-DF), CMAQ-RLINE-DF fusion, and a CTM-receptor method that is capable of source apportionment (CMAQ-hybrid). Each method is briefly described below. Interested readers are referred to the corresponding references for further technical details of each method. These methods were chosen based on their use in past health studies.

2.1. Application Domain and Data

The methods studied here are applied to 20 counties centered around Atlanta, Georgia, USA (Fig 1), which have a total population of 5.5 million. Eight pollutants were selected: carbon monoxide (CO), nitrogen dioxide (NO2), sulfur dioxide (SO2), ozone (O3), fine particulate matter (PM2.5) and three speciated PM2.5 components: elemental carbon (EC), organic carbon (OC) and sulfate (SO4). Among them, CO, NO2, SO2, O3 and PM2.5 are criteria air pollutants regulated under the National Ambient Air Quality Standards in the United States (US); EC, OC and SO4 are major components of PM2.5. Studies have linked exposures to these pollutants with adverse health effects. The study area has a diverse set of emission sources (USEPA, 2016a). CO, NO2 and EC emissions are dominated by transportation-related sources; in 2011 SO2 was emitted mainly from power plants, which also contribute to the formation of SO4; O3 is a secondary pollutant, while OC is both secondary and primary. Both O3 and OC are formed from complex interactions between anthropogenic and biogenic emission sources. Specific metrics of pollutant concentrations considered in this study include daily maximum 1-hour average CO, NO2 and SO2, daily maximum 8-hour average O3 and daily average PM2.5, EC, OC and SO4 concentrations, all of which are commonly used for both regulatory purposes and health studies. In 2011, a total of 16 monitoring sites were located inside the study area, and an additional 36 monitors were within 100 km of the study area (a total of 52 stations, Fig. 1 and Table 1). It is worth noting that not all pollutants were measured at all monitors, and additional monitors were included in some of the methods investigated here. Detailed information on the name of each monitor and the species measured there is provided in Table 1. Overall, PM2.5 was measured at 31 monitors, followed by ozone (27), PM2.5 components (EC, OC and SO4; 12), SO2 (10), NO2 (6) and CO (4). Depending on the nature of each method, measurement data from a varying number of monitors were used.


Two monitors (JST and YRK) are operated as part of the Southeastern Aerosol Research and Characterization (SEARCH) study (Edgerton et al. 2005, 2006; Hansen et al. 2003), which provides extensive information on concentrations of all pollutants at these two locations. These two monitors were selected as method evaluation monitors, and detailed statistical performance metrics were calculated at their locations. Detailed observational data are also available at the South DeKalb (SDK) site, a regulatory monitoring site which is also Georgia’s NCORE advanced monitoring location (USEPA, 2016b). The SDK site is in a field, surrounded by forest, near a major highway (Fig 1), and is operated by the Environmental Protection Division of the Georgia Department of Natural Resources. Concentrations of gaseous pollutants (CO, SO2, NO2 and O3) and PM2.5 were measured with an hourly resolution in 2011, while concentrations of EC, OC and SO4 were measured every three days with a 24-hour resolution. Measurement data for JST and YRK were retrieved from the SEARCH network (http://www.atmospheric-research.com/studies/SEARCH/), while data from other monitors were retrieved from the US Environmental Protection Agency (EPA) Air Quality System (AQS) Data Mart (https://aqs.epa.gov/api).


Figure 1. The Atlanta, Georgia, study area and locations of pollutant concentration monitors used in this study. The name of each monitor and the pollutant species measured there are provided in Table 1.

Table 1. List of monitors shown in Figure 1 and corresponding pollutants measured at each monitor.

Columns: ID, Name, CO, NO2, SO2, O3, PM2.5, PM2.5 Speciation*

ID   Name           ID   Name
     JST            24   Summerville
     YRK            25   ClarkeCty
     SDK            26   Athens
1    ForestPark     27   Dawsonville
2    Kennesaw       28   Rome-CoosaES
3    CobbCo         29   Gainesville
4    Newnan         30   WarnerRobins
5    DHC            31   FtMtn
6    Douglas-WSS    32   Col-MCHD
7    ERS            33   Col-Airpt
8    FS8            34   Col-CusRd
9    ConfAve        35   PikeCty
10   RoswellRd      36   Rossville
11   GwinTech       37   Sandersville
12   HenryCo        38   Gordon
13   Conyers        39   NC-1
14   AL-1           40   NC-2
15   AL-2           41   SC-1
16   AL-3           42   SC-2
17   AL-4           43   TN-1
18   AL-5           44   TN-2
19   AL-6           45   TN-3
20   AL-7           46   TN-4
21   AL-8           47   TN-5
22   Macon          48   TN-6
23   Macon-SE       49   TN-7

*PM speciation monitoring sites that measure EC, OC and SO4

2.2. Observation-Based Methods

The Central Monitor (CM), or central site method, uses measurement data from a single monitor to represent pollutant concentrations over the entire study area (Pope et al. 2009; Zanobetti and Schwartz 2009). In this study, we assigned the South DeKalb site (SDK, Fig 1) as the central monitor because it is the National Core (NCORE) site closest to the city center. NCORE sites have more comprehensive measurements than most other routine monitoring sites, and have been used in health studies in Atlanta and elsewhere (Marmur et al. 2006b).


In the Site Average (SA) method, observational data from more than one site are averaged to represent pollutant concentration levels in the study area (Özkaynak et al. 2013b). Here, we applied a normalization approach to account for site-specific concentration variability and occasional missing observations; further details on the normalization methods used are available elsewhere (Wong et al. 2004; Vaidyanathan et al. 2013).


Various spatial interpolation methods can be used to produce spatially varying pollutant concentration fields. Pollutant concentrations are calculated between monitoring locations as the weighted average of measurements from nearby locations. In this study, we applied four spatial interpolation methods: Inverse Distance Weighting (IDW), Kriging (KRIG), discontinuous tessellation (TESS-D) and tessellation-based continuous natural neighbor interpolation (TESS-NN). The IDW approach here weights concentrations by the inverse of the squared distance between known concentration locations and locations where concentrations are desired (Hoek et al. 2002). In the KRIG method, the weights were derived from a parametric function fitted to binned semivariance values calculated between observations measured at any pair of locations (Bergen et al. 2013; Kim et al. 2014; Sampson et al. 2013). The mGstat geostatistical package (http://mgstat.sourceforge.net/) was used for KRIG. The TESS-D and TESS-NN methods split the entire study area into different polygons (Anselin and Le Gallo 2006). In the TESS-D method, no spatial variations in pollutant concentrations are assumed within each polygon. Here we applied the Thiessen polygons approach (Fig S1), with each polygon containing one monitor which determines concentration levels within the corresponding polygon (a 100% weight). In the case of missing observations, measurement data from the closest monitor were assigned. In the TESS-NN method, continuous concentration fields are achieved using the natural neighbor (or Voronoi Neighbor Averaging) method (Abt 2008; USEPA 2010). For the IDW and TESS-D methods, concentration measurement data from all 52 monitors shown in Fig 1 were used. For the KRIG method, the 100-km buffer area was further expanded to 200 km to include more monitors in order to develop the variogram used for the parametric function. The selected resolution for spatial interpolations here is 4 km (for IDW, KRIG and TESS-NN), though any resolution is possible for spatial interpolation methods.
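The inverse-distance-squared weighting and the Thiessen (nearest-monitor) assignment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the configuration used in the study: the function names `idw` and `thiessen`, the monitor coordinates, and the concentrations are hypothetical, and distances are plain Euclidean distances in km.

```python
import numpy as np

def idw(monitor_xy, monitor_conc, target_xy, power=2.0):
    """Inverse distance weighting with weights 1 / distance**power."""
    monitor_xy = np.asarray(monitor_xy, dtype=float)
    monitor_conc = np.asarray(monitor_conc, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)
    # Pairwise distances, shape (n_targets, n_monitors).
    dist = np.linalg.norm(target_xy[:, None, :] - monitor_xy[None, :, :], axis=2)
    estimates = np.empty(len(target_xy))
    for i, row in enumerate(dist):
        at_monitor = row == 0.0
        if at_monitor.any():
            # Target coincides with a monitor: use its measurement directly.
            estimates[i] = monitor_conc[at_monitor].mean()
        else:
            weights = 1.0 / row**power
            estimates[i] = np.sum(weights * monitor_conc) / np.sum(weights)
    return estimates

def thiessen(monitor_xy, monitor_conc, target_xy):
    """Discontinuous tessellation: each target takes the nearest monitor's value."""
    monitor_xy = np.asarray(monitor_xy, dtype=float)
    monitor_conc = np.asarray(monitor_conc, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)
    dist = np.linalg.norm(target_xy[:, None, :] - monitor_xy[None, :, :], axis=2)
    return monitor_conc[np.argmin(dist, axis=1)]

# Hypothetical example: two monitors 10 km apart and three target points.
monitors = [(0.0, 0.0), (10.0, 0.0)]
conc = [10.0, 20.0]  # made-up concentrations, e.g. PM2.5 in ug/m3
targets = [(0.0, 0.0), (2.0, 0.0), (5.0, 0.0)]
idw_est = idw(monitors, conc, targets)
tess_est = thiessen(monitors, conc, targets)
```

With squared-distance weights, a target 2 km from the first monitor and 8 km from the second is dominated by the nearer monitor, while the Thiessen assignment ignores the farther monitor entirely. That difference is what produces the sharp polygon boundaries of a discontinuous tessellation.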


Land use regression (LUR) is an empirical approach to estimating pollutant concentrations that assumes they can be statistically related to land use characteristics at or near the location of interest (often also including other time-varying variables, e.g., meteorological parameters) (Hoek et al. 2008; Ryan and LeMasters 2007; Yanosky et al. 2014). In this study, the independent variables selected, in addition to land use characteristics (e.g., length of roadway), include 10-m wind speed, 2-m temperature, 2-m relative humidity, and the height of the planetary boundary layer, all of which are extracted from the 12-km resolution meteorological fields that were developed for the CMAQ modeling discussed below. The LUR method was applied for PM2.5 only. Results for other pollutants were not calculated due to the lack of monitoring sites.


Satellite observations, especially aerosol optical depth (AOD), have recently gained popularity in exposure modeling due to their broad spatial coverage (Engel-Cox et al. 2004; Hu et al. 2014a, 2014b; Kloog et al. 2011; Reid et al. 2015; Lv et al. 2016). Similar to LUR models, the AOD approach develops regression equations using available ground-based observational data, with satellite AOD data added as an additional independent variable. In this study, we applied the calibration method developed by Chang et al. (2014), with AOD data (1-km resolution) obtained from the Multiangle Implementation of Atmospheric Correction (MAIAC) approach (Lyapustin et al. 2011) downscaled from MODIS data. The chosen spatial resolution for both LUR and AOD methods is 1 km.

2.3. Air Quality Modeling Methods

Emission-based air quality models have been used for decades to provide information about air pollutants that could not be directly gleaned from the available observations, and are playing an increasing role in providing human exposure information in health studies. In this study, we applied two air quality models: the Research Line Source Model (R-LINE) (Snyder et al. 2013), and the Community Multiscale Air Quality (CMAQ) model (Byun and Schere 2006). R-LINE is a steady-state dispersion model used to support the evaluation of exposures in near-traffic environments (Batterman et al. 2014; Isakov et al. 2014). CMAQ is an extensively used CTM for regional air quality modeling (Koo et al. 2015; Mannshardt et al. 2013; Simon et al. 2012).


Inputs to R-LINE included a detailed on-road mobile source emission inventory containing hourly-resolved primary PM2.5 emissions from 43,712 roadway links in the study area (D'Onofrio 2016). Concentrations of PM2.5 were estimated at 235,297 receptor locations with a 250 m spatial resolution. Further details on the configuration of the R-LINE model are available in Zhai et al. (2016). CMAQ air quality fields for the year 2011 were provided by the U.S. EPA and the Centers for Disease Control and Prevention as part of the Public Health Air Surveillance Evaluation study. The spatial resolution of the CMAQ data is 12 km, covering the continental US. Further details of the configuration of the CMAQ model can be found elsewhere (Appel et al. 2012).


We also applied CMAQ using a 36-km resolution in support of the CMAQ-hybrid method described below (36 km is used due to the computational burden of the method). Meteorology fields for the 36-km resolution CMAQ modeling were generated by the Weather Research and Forecasting (WRF) model (http://wrf-model.org/). Emission data for CMAQ were prepared by the Sparse Matrix Operator Kernel Emissions (SMOKE) model (version 3.5.1) with the 2011 National Emission Inventory (2011ec_v6_11f modeling platform) as input. CMAQ version 5.0.2 was applied with the CB05 reaction mechanism. The Decoupled Direct Method in Three Dimensions (DDM-3D) was enabled to estimate source contributions to PM2.5 from 16 emission categories, including agriculture, aircraft, biogenic, coal combustion, dust, fire, fuel oil combustion, metal processing, natural gas combustion, non-road diesel engine emissions, non-road gasoline engine emissions, on-road diesel engine emissions, on-road gasoline engine emissions, wood combustion and all other emissions.

2.4. Hybrid Modeling Methods

Unadjusted, or “raw” air quality model outputs can have substantial biases and errors (Appel et al. 2012; Appel et al. 2013; Civerolo et al. 2010; Foley et al. 2010; Hogrefe et al. 2014; Hogrefe et al. 2015; Koo et al. 2015; Marmur et al. 2006a, 2006b; Mebust et al. 2003; Napelenok et al. 2011; Porter et al. 2015; Rao et al. 2014; Simon et al. 2012; Swall and Foley 2009; Zhang et al. 2010). The propensity for such biases led to the development of relative reduction factors when such models are used in attainment demonstrations (Jones et al. 2005). In this study, we applied four hybrid methods: CMAQ-Krig, CMAQ-DF, CMAQ-RLINE-DF and the CMAQ-hybrid method.


In the CMAQ-Krig method, CMAQ-modeled/observation ratios of daily pollutant concentrations were first calculated at all monitor locations. The calculated ratios were then spatially interpolated to the center points of all CMAQ grid cells covering the study region using ordinary kriging. CMAQ model outputs were then adjusted proportionally using the interpolated ratios. This procedure was applied using daily maximum 1-hour CO, NO2 and SO2, daily maximum 8-hour O3, and daily average PM2.5, EC, OC and SO4 concentrations collected from monitors located inside the study domain.
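The ratio-adjustment steps above can be sketched as follows. This is only an illustrative sketch under explicit assumptions: the study interpolated the ratios with ordinary kriging, whereas the hypothetical helper `idw_interpolate` below substitutes inverse distance weighting to keep the example short; "adjusted proportionally" is read here as dividing the model field by the interpolated modeled/observed ratio; and all names and numbers are made up.

```python
import numpy as np

def idw_interpolate(known_xy, known_values, target_xy, power=2.0):
    """Stand-in interpolator (IDW); the study used ordinary kriging here."""
    known_xy = np.asarray(known_xy, dtype=float)
    known_values = np.asarray(known_values, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)
    dist = np.linalg.norm(target_xy[:, None, :] - known_xy[None, :, :], axis=2)
    # Clip tiny distances so a target on top of a monitor does not divide by zero.
    weights = 1.0 / np.maximum(dist, 1e-9) ** power
    return weights @ known_values / weights.sum(axis=1)

def ratio_adjust(cmaq_at_monitors, obs_at_monitors, monitor_xy, cmaq_grid, grid_xy):
    """Adjust a model field using interpolated modeled/observed ratios."""
    # Step 1: modeled/observed ratio at each monitor (observations assumed non-zero).
    ratios = (np.asarray(cmaq_at_monitors, dtype=float)
              / np.asarray(obs_at_monitors, dtype=float))
    # Step 2: interpolate the ratios to the grid-cell centers.
    ratio_grid = idw_interpolate(monitor_xy, ratios, grid_xy)
    # Step 3: scale the model output; dividing by the ratio is an assumption.
    return np.asarray(cmaq_grid, dtype=float) / ratio_grid

# Hypothetical one-day example in which the model is twice the observations.
monitor_xy = [(0.0, 0.0), (12.0, 0.0)]
adjusted = ratio_adjust(
    cmaq_at_monitors=[20.0, 30.0],
    obs_at_monitors=[10.0, 15.0],
    monitor_xy=monitor_xy,
    cmaq_grid=[24.0, 26.0],
    grid_xy=[(3.0, 4.0), (9.0, 4.0)],
)
```

When the model overpredicts by the same factor at every monitor, the interpolated ratio is constant and the adjusted field is simply the model field divided by that factor.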


A more sophisticated data fusion approach (CMAQ-DF) was also applied here (Friberg et al. 2016; Hao et al. 2016; Huang et al. 2016). In this approach, two concentration fields were first developed: one field that assimilated temporal variations of observation data and spatial structure from annual mean CMAQ results, and another field that incorporated spatial and temporal variations from CMAQ results scaled using observation data to correct for annual and seasonal biases. The two fields were then fused together based on weighting factors that vary over space according to their ability to predict temporal variations. This novel hybrid approach accounts for the potential for specific locations to be consistently impacted by very local sources, leading to observations higher than the surroundings, and can also accommodate sparse monitors. Additional details of this approach can be found in Friberg et al. (2016). The spatial resolution of the CMAQ-Krig and CMAQ-DF methods, as applied here, is 12 km.


For the purpose of exposure estimation, concentration fields with a high spatial resolution are often desired. We applied a CTM-RLINE-Observation (CMAQ-RLINE-DF) combination method to fuse coarser-grid CMAQ concentration fields (12 km) developed using the CMAQ-DF method with high-resolution RLINE model (250 m) output (Bates et al., 2015). In this way, sub-grid pollutant concentration information was added to the CMAQ results. First, grid-averaged concentrations of all RLINE receptors within the corresponding CMAQ cells were subtracted from CMAQ-DF concentrations. The calculated differences are concentration contributions from regional background and emission sources other than those included in RLINE modeling. The differences were then assumed to be at the center of each CMAQ grid cell, and spatially interpolated to all RLINE receptors, with RLINE modeled concentrations subsequently added back. The method generates highly-resolved (250 m resolution) concentration fields that are suitable for detailed health impact analysis.
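The subtract, interpolate and add-back sequence described above can be sketched as follows. This is a minimal sketch, not the study's implementation: the text does not state which interpolation was used for the differences, so inverse distance weighting is assumed here, and the function name `fuse_rline`, the cell centers, the receptor locations and all concentrations are hypothetical.

```python
import numpy as np

def fuse_rline(cmaq_df, cell_xy, rline, receptor_xy, receptor_cell):
    """Add sub-grid RLINE detail to a coarse CMAQ-DF field.

    cmaq_df:       (n_cells,) CMAQ-DF concentration in each coarse cell
    cell_xy:       (n_cells, 2) coordinates of the cell centers
    rline:         (n_receptors,) RLINE concentration at each receptor
    receptor_xy:   (n_receptors, 2) receptor coordinates
    receptor_cell: (n_receptors,) index of the cell containing each receptor
    """
    cmaq_df = np.asarray(cmaq_df, dtype=float)
    cell_xy = np.asarray(cell_xy, dtype=float)
    rline = np.asarray(rline, dtype=float)
    receptor_xy = np.asarray(receptor_xy, dtype=float)
    receptor_cell = np.asarray(receptor_cell, dtype=int)
    n_cells = len(cmaq_df)
    # Step 1: grid-averaged RLINE concentration within each CMAQ cell.
    sums = np.bincount(receptor_cell, weights=rline, minlength=n_cells)
    counts = np.bincount(receptor_cell, minlength=n_cells)
    rline_cell_mean = sums / np.maximum(counts, 1)  # cells without receptors give 0
    # Step 2: difference = regional background plus sources not modeled by RLINE.
    background = cmaq_df - rline_cell_mean
    # Step 3: treat differences as located at cell centers and interpolate them
    # to the receptors (IDW is an assumption made for this sketch).
    dist = np.linalg.norm(receptor_xy[:, None, :] - cell_xy[None, :, :], axis=2)
    weights = 1.0 / np.maximum(dist, 1e-9) ** 2
    background_at_receptors = weights @ background / weights.sum(axis=1)
    # Step 4: add the RLINE concentrations back at each receptor.
    return background_at_receptors + rline

# Hypothetical example: two 12-km cells with two receptors in each.
fused = fuse_rline(
    cmaq_df=[10.0, 10.0],
    cell_xy=[(6.0, 6.0), (18.0, 6.0)],
    rline=[1.0, 3.0, 2.0, 2.0],
    receptor_xy=[(3.0, 3.0), (9.0, 9.0), (15.0, 3.0), (21.0, 9.0)],
    receptor_cell=[0, 0, 1, 1],
)
```

In this example both cells have the same background (10 minus a mean RLINE value of 2), so the fused field differs between receptors only through the RLINE contribution, which is the sub-grid information the method is meant to add.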


The above methods all produce pollutant species concentration fields. Recently, increased focus has been placed on associating health outcomes directly with sources (Laden et al. 2000; Sarnat et al. 2006). The last method applied in this study, CMAQ-hybrid, combines CMAQ source impact fields with a receptor modeling source apportionment technique to simultaneously optimize modeled source impacts, adjusting for modeling biases and errors in multiple species from multiple sources (Hu et al. 2014; Ivey et al. 2015). CMAQ provides spatiotemporal source impact fields, along with species concentration fields. A receptor modeling technique, similar to the Chemical Mass Balance approach (Coulter 2004; Hopke 2016; Watson et al. 1984), is then used to adjust the source impacts at each monitoring site with species observations, and those adjustments are kriged spatially. The CMAQ-generated data used included 36 species and 16 emission source categories. Speciated PM2.5 measurements are needed for the CMAQ-hybrid method, and were retrieved from the EPA AQS Data Mart for 2011 from 181 monitor locations across the continental US. Additional details are provided in the supplemental materials and in Ivey et al. (2015). In addition to providing source impacts, the method also provides PM species concentration fields that are adjusted by the observations.

2.5. Method Evaluation

All of the above-mentioned 14 methods were evaluated against observational data. Results from these methods were cross-compared focusing on three aspects: 1) Spatial concentration fields. Concentrations of air pollutants are known to vary substantially in space. Appropriately characterizing such variability is critical for health studies. Here the estimated spatial fields from all methods were cross-compared focusing on two aspects: a) spatial distributions of pollutant concentrations; and b) the capability of different methods to capture the measured spatial concentration variability, which was quantified using the spatial correlation coefficient (R). Specifically, the correlations were calculated using simulated and observed annual average PM2.5 concentrations at all 10 monitors located within the study domain. Leave-one-out cross validations were performed for the simulated values whenever possible. Possible R values range from -1 to 1. An R value of 1 indicates that the corresponding method perfectly captured the observed spatial concentration variability, while an R value of -1 indicates that the corresponding method simulated inverse spatial concentration variability; 2) Temporal correlations and statistical performance. A total of 14 statistical metrics were calculated to evaluate the temporal performance of all methods. Due to length limitations, only results for the temporal coefficient of determination (R2) are provided; 3) Auto-correlations. Auto-correlation is the correlation of a series of discrete concentration measurements with a “temporally shifted” copy of itself, as a function of the “shift”. It is useful for finding periodic biases or errors hidden in model outputs. Here auto-correlations of the biases in estimated concentrations were calculated and compared. For all three aspects, leave-one-out (L1O) cross-validations were performed whenever possible. In the L1O approach, observational data from a single monitor were excluded; pollutant concentrations were then estimated at the location of that monitor using the corresponding method and compared with the actual observational data. The L1O approach is particularly useful for evaluating the performance of observational-data-based methods.
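The spatial correlation under data withholding and the auto-correlation of biases can be illustrated with a short sketch. The assumptions are stated plainly: the plain mean of the remaining monitors stands in for an observation-based method (the study's SA method additionally applies a normalization), the lagged Pearson correlation is only one common estimator of auto-correlation, and the function names and all values are hypothetical.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])

def loo_site_average(annual_means):
    """Leave-one-out estimate at each monitor: the mean of all other monitors."""
    obs = np.asarray(annual_means, dtype=float)
    return (obs.sum() - obs) / (len(obs) - 1)

def bias_autocorrelation(estimated, observed, lag):
    """Correlation of the bias series with a copy shifted by `lag` steps (lag >= 1)."""
    bias = np.asarray(estimated, dtype=float) - np.asarray(observed, dtype=float)
    return pearson_r(bias[:-lag], bias[lag:])

# Hypothetical annual-average concentrations at four monitors.
observed_means = [8.0, 10.0, 12.0, 14.0]
withheld_estimates = loo_site_average(observed_means)
spatial_r = pearson_r(withheld_estimates, observed_means)

# Hypothetical daily series whose bias flips sign every day.
obs_series = np.zeros(6)
est_series = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
lag1 = bias_autocorrelation(est_series, obs_series, lag=1)
lag2 = bias_autocorrelation(est_series, obs_series, lag=2)
```

In this simplified case each withheld-monitor estimate is a decreasing linear function of the withheld value, so the spatial R is exactly -1. This shows how averaging the remaining monitors can produce a negative spatial correlation under data withholding, although the normalized SA method used in the study is more involved.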


3. Results

292 293 294 295 296 297 298 299 300 301

3.1.

Spatial fields of the average daily PM2.5 concentrations were estimated by all 14 methods, including both the 12-km and 36-km CMAQ results (Fig 2). Purely data-driven methods including CM, SA, spatial interpolation methods (IDW, TESS-D, TESS-NN and KRIG) and empirical regression methods (LUR and AOD) lead to PM2.5 concentration fields less spatial variation, while results from air quality model-based approaches (RLINE and CMAQ) and hybrid methods (CMAQ-Krig, CMAQ-DF, CMAQ-RLINE-DF and CMAQ-hybrid) show more spatial variation. Similar results can also be observed for other pollutants (Figs S3 through S9), with much more prominent differences observed for primary pollutants including CO, NO2, SO2 and EC. Not all methods had sufficient information for their appropriate application to pollutants besides PM2.5 (e.g. AOD is only applied for PM2.5).

302 303 304 305 306 307 308 309 310 311 312 313

For both CM and SA methods, the concentration fields have no spatial variation. The magnitude of concentration levels is different, especially for EC, for which the CM method estimates 90% higher concentrations than SA (1.25 µg/m3 for CM vs 0.66 µg/m3 for SA). Though CM and SA methods are more likely to be used in temporal time-series studies and the lack of spatial concentration variability is less of concerns in these studies. Among IDW, KRIG and TESS-NN, the spatial fields of concentrations are similar for PM2.5, as well as for other pollutants. The “Bull’s eye” effect is apparent near monitors, especially around YRK (Fig 2). Due to its design, the TESS-D method estimates concentration fields with sharp concentration changes along boundaries of polygons (Fig 2), although still partially resembling the spatial features of IDW, KRIG and TESS-NN results. Pollutant concentrations estimated at monitors differ between the methods, despite the same measurement data being used. The difference is most significant for EC (Fig S7) at the location of SDK, the central monitor (Fig 1). The temporal sparseness of EC measurement data was found to be the reason for such differences.

Spatial Fields

Figure 2. Cross-comparison of average daily PM2.5 concentrations as estimated by different methods. Methods of the same type are outlined in one box. CM: Central Monitor; SA: Site Average; IDW: Inverse Distance Weighting; KRIG: Kriging; TESS-D: discontinuous tessellation; TESS-NN: continuous natural neighbor interpolation based on tessellation; LUR: Land Use Regression; AOD: satellite Aerosol Optical Depth; RLINE: RLINE dispersion model; CMAQ: CMAQ chemical transport model; CMAQ-Krig: CMAQ-kriging adjustment; CMAQ-DF: CMAQ combined with data fusion; CMAQ-RLINE-DF: CMAQ-DF further fused with RLINE data; and CMAQ-hybrid: CTM-receptor hybrid method that is capable of source apportionment.

The estimated spatial fields of PM2.5 concentrations from the LUR method are similar to the IDW and KRIG results, which is expected since none of the land use variables included in the regression were found to be statistically significant in predicting PM2.5 concentrations (p > 0.05). Therefore, PM2.5 spatial fields estimated by the LUR method are essentially generated by ordinary kriging with meteorological parameters incorporated as independent variables. Boundaries of the 12-km resolution grid cells in the meteorological field are vaguely visible on the map. The LUR method, as applied here, was limited by available data, which is not uncharacteristic of applications to other areas without special field studies being conducted. The AOD method estimates a relatively smooth PM2.5 concentration field, sharing some similarities with results from LUR, though with additional spatial features.

Results from air quality models show different spatial concentration patterns than the purely observation-based methods for all pollutants, especially SO2 (Fig S5), whose emissions are dominated by major coal-fired power plants located within the study domain (Fig S10). Further, SO2 concentrations are sparsely monitored in the study domain. Although we included all available SO2 measurement data, the elevated SO2 concentrations near power plants were not captured by the monitoring network.

A major strength of the RLINE method is its high spatial resolution (250 m as applied here, but it can be finer). Substantially elevated PM2.5 concentrations are simulated by RLINE near major roadways, consistent with studies that include near-road observations (Zhu et al. 2002). The other methods do not do so to the same degree, though methods using CTM results do show elevated concentrations in grid cells associated with major roads. The elevated pollutant concentrations near roadways are more apparent in spatial fields of PM2.5 (Fig 2). Spatially, all hybrid methods (CMAQ-Krig, CMAQ-DF, CMAQ-RLINE-DF and CMAQ-hybrid) lead to concentration patterns similar to those of the corresponding raw air quality model outputs, so the spatial concentration patterns seen in the model outputs are preserved.

The 14 methods also show varying capabilities of capturing the observed spatial concentration variability at monitor locations, as quantified by the spatial correlation (R) for PM2.5 (Table 2). As mentioned previously, an R value of 1 indicates favorable performance, suggesting that the corresponding method perfectly captured the observed spatial concentration variability, while an R value of -1 indicates that the corresponding method simulated inverse spatial concentration variability. Results for the SA, IDW, TESS-D, TESS-NN, KRIG, LUR, AOD, CMAQ-Krig and CMAQ-DF methods were obtained using leave-one-out cross validation at each of the monitors. The spatial R value for the CM method is zero and is not shown in Table 2. The estimated spatial R values are negative for the SA, IDW, TESS-D, TESS-NN, KRIG, LUR, AOD and RLINE methods. For the observation-based approaches, this is because removing a high (or low) observation leads to a lower (higher) overall field, which is then compared to the higher (lower) withheld value. The negative value for RLINE suggests that the monitors are impacted by sources other than just on-road motor vehicles. On the other hand, all CMAQ-based methods, including raw CMAQ outputs, show positive spatial R values, suggesting that these methods better capture the spatial patterns of PM2.5 concentration variations.
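The sign flip under leave-one-out cross validation can be reproduced with the site-average method alone. The sketch below is in Python and uses hypothetical monitor means (not study data) with a hand-written Pearson correlation. Withholding a site and averaging the remaining sites gives an estimate that falls as the withheld value rises, so the spatial R is negative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def loo_site_average(obs):
    """Leave-one-out site average: estimate each site from all other sites."""
    total = sum(obs)
    n = len(obs)
    return [(total - value) / (n - 1) for value in obs]

# Hypothetical long-term mean PM2.5 at five monitors (ug/m3); illustrative only.
observed = [12.1, 10.4, 13.8, 9.7, 11.5]
estimated = loo_site_average(observed)
spatial_r = pearson_r(observed, estimated)
```

In this idealized case the correlation is exactly -1, because each leave-one-out estimate is a decreasing linear function of the withheld value.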

Table 2. Spatial correlation (R) of different methods for PM2.5

Method+        Spatial R    Method+          Spatial R
SA*            -0.81        RLINE            -0.16
IDW*           -0.41        CMAQ              0.45
TESS*          -0.25        CMAQ-krig*        0.21
KRIG*          -0.26        CMAQ-DF*          0.49
LUR*           -0.01        CMAQ-RLINE-DF     0.12
AOD*           -0.21        CMAQ-hybrid       0.44

*Results calculated using leave-one-out cross validation

+ CM: Central Monitor; SA: Site Average; IDW: Inverse Distance Weighting; KRIG: Kriging; TESS-D: discontinuous tessellation; TESS-NN: continuous natural neighbor interpolation based on tessellation; LUR: Land Use Regression; AOD: satellite Aerosol Optical Depth; RLINE: RLINE dispersion model; CMAQ: CMAQ chemical transport model; CMAQ-Krig: CMAQ-kriging adjustment; CMAQ-DF: CMAQ combined with data fusion; CMAQ-RLINE-DF: CMAQ-DF further fused with RLINE data; and CMAQ-hybrid: CTM-receptor hybrid method that is capable of source apportionment.

3.2. Temporal performance

The temporal performance of all methods was cross-compared, focusing on the ability of each method to capture: a) measured temporal concentration correlations among different pollutants; and b) measured temporal concentration variability for individual pollutants at different monitor locations.

Some epidemiologic analyses strive to discern impacts between species, while others may include multiple species in the same health model (including possible interaction terms). If the observed concentrations of two pollutants co-vary temporally (high correlation expected), the concentrations simulated by an ideal method should be able to capture such high correlation. Similarly, if the observed concentrations of two pollutants do not co-vary (low correlation expected), the simulated concentrations should also capture such low correlation. In both cases, correctly capturing the correlations between pollutant species is important. Outputs from methods that fail to capture such correlations have the potential to confound subsequent health studies, especially when multiple pollutants are considered simultaneously.

Here both the observed data and the results modeled by the various methods at the JST location are used to calculate the coefficient of determination (R2) between pollutant levels (Table 3). The data from JST were not used in the construction of the model simulation fields, as leave-one-out cross validation was performed whenever observational data were used. The LUR and AOD methods only provide PM2.5 estimates; hence they are not included in the table. In the observational data, daily average PM2.5 concentrations are highly correlated with daily average OC and SO4 concentrations (R2 = 0.81 and 0.63, respectively), moderately correlated with daily average EC and daily maximum 8-hour ozone concentrations (R2 = 0.32 and 0.38, respectively), and weakly correlated with daily maximum 1-hour CO, NO2 and SO2 concentrations. Results from the CM, SA, IDW, TESS-D, TESS-NN and KRIG methods show similar patterns with a certain amount of fluctuation. The raw outputs from the RLINE and CMAQ models, however, are different. For example, R2 between daily PM2.5 and CO concentrations is 0.99 for RLINE and 0.45 for CMAQ, both of which are substantially higher than the result from observational data (R2 = 0.12). The CMAQ model also estimated considerably higher R2 values among all primary pollutants and lower R2 values for O3 and SO4. When hybrid methods were applied, R2 values are closer to observations, with the CMAQ-DF method generally performing better than the CMAQ-Krig method.
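As a concrete reading of the R2 values discussed above, the Python sketch below computes the coefficient of determination between two daily series by regressing one on the other with ordinary least squares. With a single predictor, this equals the squared Pearson correlation. The three series are invented for illustration and are not JST data; the function name is likewise hypothetical.

```python
def r_squared(x, y):
    """R2 of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares around the fitted line and the mean.
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical daily series (six days); values are illustrative only.
pm25 = [8.0, 12.0, 15.0, 9.0, 20.0, 11.0]
oc = [2.1, 3.0, 3.9, 2.4, 5.2, 2.7]   # co-varies closely with PM2.5
co = [0.6, 0.3, 0.5, 0.7, 0.4, 0.5]   # does not track PM2.5 closely

r2_pm_oc = r_squared(pm25, oc)
r2_pm_co = r_squared(pm25, co)
```

Applying the same function to a method's simulated series, and comparing the result with the value from observations, gives the kind of comparison summarized in Table 3.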

Table 3. Correlation coefficients (R2) between daily concentration of PM2.5 and other pollutants, as observed at JST and modeled by 11 methods.

Method(a)            CO(d)   NO2(d)  SO2(d)  O3(e)   EC(f)   OC(f)   SO4(f)
Observed at JST(b)   0.12    0.12    0.01    0.38    0.32    0.81    0.63
CM                   0.15    0.16    0.00    0.29    0.39    0.64    0.39
SA (L1O)(c)          0.21    0.04    0.00    0.35    0.40    0.71    0.56
IDW (L1O)            0.17    0.16    0.00    0.29    0.20    0.62    0.46
KRIG (L1O)           0.18    0.16    0.00    0.21    0.23    0.65    0.47
TESS-D (L1O)         0.16    0.16    0.00    0.26    0.20    0.57    0.44
TESS-NN (L1O)        0.13    0.14    0.00    0.40    0.34    0.65    0.52
RLINE                0.99    -       -       -       -       -       -
CMAQ                 0.45    0.38    0.18    0.03    0.72    0.90    0.24
CMAQ-Krig (L1O)      0.20    0.20    0.00    0.15    0.21    0.37    0.23
CMAQ-DF (L1O)        0.23    0.24    0.01    0.26    0.51    0.72    0.48
CMAQ-RLINE-DF        0.15    -       -       -       -       -       -
CMAQ-hybrid          -       -       -       -       0.62    0.79    0.33

(a) Only PM2.5 concentrations are available from LUR and AOD methods, hence their correlations are not presented. CM: Central Monitor; SA: Site Average; IDW: Inverse Distance Weighting; KRIG: Kriging; TESS-D: discontinuous tessellation; TESS-NN: continuous natural neighbor interpolation based on tessellation; LUR: Land Use Regression; AOD: satellite Aerosol Optical Depth; RLINE: RLINE dispersion model; CMAQ: CMAQ chemical transport model; CMAQ-Krig: CMAQ-kriging adjustment; CMAQ-DF: CMAQ combined with data fusion; CMAQ-RLINE-DF: CMAQ-DF further fused with RLINE data; and CMAQ-hybrid: CTM-receptor hybrid method that is capable of source apportionment.
(b) JST: calculated based on observation data collected at the JST site
(c) L1O (leave-one-out method of evaluation): results obtained using leave-one-out cross validation
(d) daily maximum 1-hour concentrations
(e) daily maximum 8-hour concentrations
(f) daily average concentrations

We also evaluated the capabilities of different methods at capturing the measured temporal concentration variability for individual pollutants at different monitor locations. Table 4 provides the calculated temporal R2 values for three pollutants of different nature: ozone (a secondary pollutant), PM2.5 (a mixture of both primary and secondary contributions), and CO (a primary pollutant). Results are presented for the urban JST site, the rural YRK site, and the average across all available monitoring stations. All methods generally perform better at the urban JST site than at the rural YRK site. Method performance is also considerably better for ozone, followed by PM2.5, and worst for CO. The raw CMAQ results show the worst performance for all three pollutants. However, when observational data were fused (CMAQ-krig and CMAQ-DF), performance improved substantially, and the data fusion method (CMAQ-DF) clearly outperformed the simple CMAQ-krig adjustment.

Evaluation results at all monitoring stations are presented in Fig. 3 for PM2.5 and Figs S11 through S17 for other pollutants. Again, better performance was observed near urban centers than in rural areas, as indicated by the decreased R2 values farther away from urban cores (Fig. 3). Better performance for secondary pollutants (ozone and SO4, average R2 mostly larger than 0.8) is also observed, followed by PM2.5 and OC, which are known to have both primary and secondary contributions (average R2 generally larger than 0.7). Overall, all methods perform worst for primary pollutants, with average R2 less than 0.5 for CO, NO2, SO2 and EC. Similar findings can also be seen for the CMAQ-hybrid method, although the improvements are not as substantial. Among other methods, the performance of the spatial interpolation methods, including IDW, TESS-D, TESS-NN and KRIG, is similar. For the LUR and AOD methods, R2 values at each monitor location are on par with or slightly better than those of the purely observation-based methods. Further details on correlations among other pollutants and the statistical performance of other methods are provided in the supplemental materials.

Table 4. Statistical performance (temporal R2) of different methods at the JST and YRK sites, and averaged R2 across all sites, for daily O3, PM2.5 and CO concentrations. Average (Ave.) R2 values are calculated across all monitoring sites within the study domain.

              O3                    PM2.5                 CO
Method        JST    YRK    Ave.    JST    YRK    Ave.    JST    YRK    Ave.
CM            0.91   0.70   0.83    0.77   0.64   0.76    0.58   0.07   0.55
SA*           0.91   0.80   0.80    0.91   0.81   0.88    0.55   0.09   0.37
IDW*          0.95   0.88   0.90    0.89   0.83   0.88    0.61   0.09   0.46
KRIG*         0.74   0.74   0.70    0.89   0.81   0.87    0.61   0.09   0.46
TESS-D*       0.95   0.87   0.89    0.85   0.65   0.83    0.59   0.05   0.41
TESS-NN*      0.96   0.92   0.91    0.91   0.83   0.88    0.64   0.07   0.46
CMAQ          0.65   0.69   0.63    0.43   0.30   0.42    0.30   0.30   0.32
CMAQ-krig*    0.90   0.57   0.80    0.76   0.61   0.74    0.56   0.23   0.46
CMAQ-DF*      0.95   0.88   0.92    0.87   0.77   0.88    0.60   0.32   0.53

*Results for methods marked with “*” are obtained by using leave-one-out cross validation.

+ CM: Central Monitor; SA: Site Average; IDW: Inverse Distance Weighting; KRIG: Kriging; TESS-D: discontinuous tessellation; TESS-NN: continuous natural neighbor interpolation based on tessellation; LUR: Land Use Regression; AOD: satellite Aerosol Optical Depth; RLINE: RLINE dispersion model; CMAQ: CMAQ chemical transport model; CMAQ-Krig: CMAQ-kriging adjustment; CMAQ-DF: CMAQ combined with data fusion; CMAQ-RLINE-DF: CMAQ-DF further fused with RLINE data; and CMAQ-hybrid: CTM-receptor hybrid method that is capable of source apportionment.

Figure 3. Cross-comparison of method performance (R2) for daily PM2.5 at each monitor location within the study domain. Results for methods marked with “L1O” are obtained by using leave-one-out cross validation. Results using RLINE are not available at YRK because YRK is not within the RLINE modeling domain. CM: Central Monitor; SA: Site Average; IDW: Inverse Distance Weighting; KRIG: Kriging; TESS-D: discontinuous tessellation; TESS-NN: continuous natural neighbor interpolation based on tessellation; LUR: Land Use Regression; AOD: satellite Aerosol Optical Depth; RLINE: RLINE dispersion model; CMAQ: CMAQ chemical transport model; CMAQ-Krig: CMAQ-kriging adjustment; CMAQ-DF: CMAQ combined with data fusion; CMAQ-RLINE-DF: CMAQ-DF further fused with RLINE data; and CMAQ-hybrid: CTM-receptor hybrid method that is capable of source apportionment.

3.3. Auto-correlations of biases

In addition to spatial and temporal performance, we also evaluated the autocorrelations of biases for all methods and all pollutants at the JST evaluation site (Fig 4 for EC). For an ideal method, bias autocorrelations are expected to be random and non-significant. Statistically significant autocorrelations have the potential to confound the outcomes of epidemiologic analyses (e.g., lagged health outcomes).

Figure 4. Autocorrelation of simulation biases of EC at JST. Dashed lines indicate 95% confidence bounds. No EC results are available for the LUR and AOD methods. Results for the TESS-NN method are not provided due to scattered data. L1O: leave-one-out cross validation. CM: Central Monitor; SA: Site Average; IDW: Inverse Distance Weighting; KRIG: Kriging; TESS-D: discontinuous tessellation; CMAQ: CMAQ chemical transport model; CMAQ-Krig: CMAQ-kriging adjustment; CMAQ-DF: CMAQ combined with data fusion.

The CM method shows a long positive autocorrelation, followed by a long negative autocorrelation from about 22 to 43 days. Bias autocorrelations of the SA method are mostly non-significant. Bias autocorrelations of all spatial interpolation methods are almost identical and show consistent peaks at three-day intervals. The three-day EC sampling schedule at the SDK site and the elevated EC concentrations at SDK were found to be the reason. Bias autocorrelations for raw CMAQ values show consistent peaks at seven-day intervals and are linked to the emission inputs to CMAQ, suggesting that the weekly emission variations used in CMAQ may contain biases with a weekly pattern. The CMAQ-Krig and CMAQ-DF methods largely eliminated the significant bias autocorrelations in the results. The results again suggest that data-fusion methods are preferred over raw, unadjusted outputs from air quality models. Autocorrelation plots for other pollutants are provided in the supplemental materials.
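To show how a sampling schedule can leave a regular signature in the bias autocorrelation, the Python sketch below builds a synthetic bias series with a spike every third day and computes its sample autocorrelation. It also computes large-sample 95% bounds of ±1.96/√N. The series, its length and the spike size are assumptions chosen for illustration. The bounds formula is the common white-noise approximation and is not necessarily the one used for Fig 4.

```python
import math

def autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [value - mean for value in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + lag] for t in range(n - lag)) / denom
        for lag in range(max_lag + 1)
    ]

# Synthetic daily bias (ug/m3): elevated every third day, slightly negative otherwise.
n_days = 120
bias = [0.5 if day % 3 == 0 else -0.1 for day in range(n_days)]

acf = autocorrelation(bias, max_lag=9)
bound = 1.96 / math.sqrt(n_days)  # approximate 95% bound under white noise

# Lags at which the autocorrelation is significantly positive.
positive_peaks = [lag for lag in range(1, 10) if acf[lag] > bound]
```

With this synthetic series the significantly positive lags fall at multiples of three days, the same kind of regular peak pattern described above for the spatial interpolation methods.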

4. Discussion

The CM and SA approaches are the least data-intensive among all methods; however, they are unable to capture the spatial heterogeneity of pollutant concentrations (Fig 2), and they also suffer from temporal incompleteness due to inadequate data collection (especially for EC, OC and SO4). This issue may be less of a concern for secondary pollutants in a small domain, where spatial concentration variations are relatively flat. In addition, for certain types of epidemiological studies, such as time-series studies, the spatial completeness of the pollution field may also be less of a concern.

All of the spatial interpolation approaches are able to provide spatially resolved concentration fields, but they inevitably suffer when observational data are sparse and incomplete, which is often the case for speciated PM components (Whitworth et al. 2011; Wong et al. 2004). Without a dense monitoring network, spatial heterogeneities of pollutant concentrations cannot be captured by spatial interpolation approaches, especially for primary pollutants. The TESS-D method leads to unrealistic, irregular fields and hence should not be used for health studies given the availability of preferable approaches; for example, the TESS-NN method typically had the best performance metrics among the observation-based approaches.

The ability of LUR and AOD methods to develop exposure fields is directly dependent on having sufficient observations to develop regression models with reasonable precision for each of the coefficients. The sparseness of the monitoring networks limits their direct application other than in large regional settings. Another weakness of the LUR approach is its lack of generalizability. The developed regression equations are usually restricted to the study region alone and may not be directly applied to another region (Jerrett et al. 2005; Wu et al. 2011).

At present, satellite observation-based methods have been applied primarily to PM2.5 using AOD information supplied by MODIS instruments. Information on the concentrations of other species is also available from other satellite retrievals, but it suffers from limited spatial or temporal resolution or from uncertainties in the retrievals (Chudnovsky et al. 2013; Martin 2008; Sayer et al. 2013). Another drawback of the satellite-based approach is its diagnostic nature, as historical remote sensing data are needed to develop the regression equations. However, the wealth of data from satellites continues to grow, and there are plans to launch a geosynchronous satellite (Zoogman et al. 2017) that will provide observations that can be used to derive more continuous, higher-resolution species concentration information. Overall, though the LUR and AOD methods also rely heavily on measurement data, they are more statistically rigorous and provide more spatial characterization than interpolation methods. These two methods are suitable for larger study areas, particularly as more monitors become available.

For observation-based methods, the number and location of the available monitors can substantially affect the resultant pollution fields. Pollutant concentration variability not captured by the monitoring network cannot be captured by observation-based methods. For statistically based methods, including KRIG, LUR and AOD, the developed regression relationships may not be statistically significant with a sparse network. Hence, for epidemiological studies in which spatial concentration variation is important, a carefully designed saturation sampling campaign may be desirable.

The major strength of both dispersion and chemical transport models is that they use spatially and temporally more complete information on emissions, meteorology and topography to simulate pollutant concentration fields by directly solving equations that describe pollutant transport and chemistry. This provides pollutant fields that are consistent with what is understood about the factors impacting those concentrations and leads to spatially and temporally complete fields, potentially at high resolutions. However, although concentration fields estimated using air quality models have been found to capture the temporal and spatial variations in observed concentrations to some degree, the agreement between simulated and observed concentrations is highly dependent upon the application, season, species and location (Boylan and Russell 2006; Hogrefe et al. 2014; Hu et al. 2015; Koo et al. 2015; Porter et al. 2015; Simon et al. 2012). In this case, outputs from the RLINE and CMAQ models that have not been adjusted using observations consistently show the worst performance spatially (Table 1 and Fig 4). Bias autocorrelation plots for CMAQ results suggest that part of the error may be introduced by issues in the input emission inventories. Such biases may not be proportional to concentrations and are potentially nonlinear (Frieberg et al., 2015), which may significantly impact resulting health analyses. Further, the observed correlations among pollutants (Table 3) were also distorted; hence distinct health effects of different pollutants may not be appropriately distinguished by directly using air quality model outputs. The bias autocorrelation analysis also shows that biases in unadjusted model results can linger for very long periods.

Our results suggest that studies should not directly use dispersion model or CTM fields in health assessments. If they do, an analysis of how biases and errors impact the results should be part of the assessment. Steps should also be taken to minimize biases and errors, especially when they are spatially and temporally dependent (e.g., Koo et al. 2015).

As discussed previously, biases associated with raw air quality model outputs are of significant concern. Procedures minimizing such biases and errors are at the heart of data fusion methods that use air quality model fields in conjunction with pollutant observations. In this study, all three data fusion methods (CMAQ-krig, CMAQ-DF and CMAQ-RLINE-DF) corrected for much of the bias and error while preserving the spatial concentration patterns modeled by the RLINE and/or CMAQ models (Fig 2). The CMAQ-krig method, although straightforward, consistently performs less favorably than the CMAQ-DF method, which has been successfully applied in past health studies (Alhanti et al. 2015). Although the fused concentration fields are still heavily dependent on the accuracy of the output data from air quality models and their inputs, current results, as well as results from similar studies (Chen et al. 2014), show significantly improved performance compared with other approaches, making hybrid approaches promising candidates for future epidemiologic studies related to air quality.
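The general idea of correcting model bias with observations while keeping the modeled spatial pattern can be illustrated with a deliberately simplified Python sketch. It is not the CMAQ-krig, CMAQ-DF or CMAQ-RLINE-DF procedure used in this study. It assumes a one-dimensional transect, hypothetical model and monitor values, and inverse distance weighting of the observation-minus-model residuals as a stand-in for the statistical adjustment.

```python
def idw_1d(x, points, power=2.0):
    """Inverse-distance-weighted value at x from (position, value) pairs."""
    num = 0.0
    den = 0.0
    for px, value in points:
        d = abs(x - px)
        if d == 0.0:
            return value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

# Hypothetical transect (km) with a modeled near-road peak at x = 20.
grid_x = [0.0, 10.0, 20.0, 30.0, 40.0]
model = {0.0: 8.0, 10.0: 9.0, 20.0: 15.0, 30.0: 9.5, 40.0: 8.5}

# Hypothetical monitor observations at the two ends of the transect.
observations = {0.0: 10.0, 40.0: 9.5}

# Residuals (observation minus model) at the monitors.
residuals = [(x, obs - model[x]) for x, obs in observations.items()]

# Adjusted field: model value plus the interpolated residual.
fused = {x: model[x] + idw_1d(x, residuals) for x in grid_x}
```

In this toy case the adjusted field matches the observations at the two monitors, and the modeled peak at 20 km remains the highest point because the interpolated residual varies smoothly between monitors.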

In this study, we applied a newly developed CMAQ-hybrid source apportionment method to simultaneously adjust CMAQ-modeled concentration fields for 36 species and 16 emission categories. The CMAQ-hybrid method performs less well than the data-fusion approach. The method was applied with a coarser grid, and it is not designed to reproduce PM2.5 directly but to best fit individual species concentrations.

Overall, data fusion methods, especially the CMAQ-DF and CMAQ-RLINE-DF methods, are more appropriate for exposure studies since they provide spatially, temporally and chemically complete pollution fields without the significant biases and errors of raw air quality model outputs. However, care should be taken to evaluate the spatial patterns of pollutant concentrations as modeled by air quality models, since data fusion methods preserve such spatial concentration patterns. Despite the above-mentioned uncertainties, the CMAQ-hybrid method is still a powerful CTM-receptor modeling method that is suitable for source apportionment analysis of reactive species. It is appropriate for exposure studies where impacts from different emission source categories are needed, such as health accountability studies.

5. Conclusions

In this study, we extensively evaluated 14 methods for developing spatiotemporal air pollutant concentration fields for use in health studies. Those methods ranged in complexity from using a single central monitor to hybrid modeling techniques. This study allows a better understanding of a variety of methods used to generate air pollutant concentration fields for exposure and epidemiological studies. Results from this study clearly showed the differences among the 14 methods as applied to the pollutants selected, and their corresponding strengths and weaknesses.

We find that the CM and SA methods are more suitable for the investigation of secondary pollutants in a relatively small domain, or for time-series studies where the spatial completeness of the pollution field is less of a concern. Caution should be exercised during the selection of monitors to exclude sites with unique site characteristics. We find that the spatial interpolation methods IDW, KRIG and TESS-NN are better suited to areas with relatively dense and well-distributed monitoring networks, and to secondary pollutants. The TESS-D method leads to unrealistic pollution fields that may not be suitable for health studies. In addition, monitor selection for spatial interpolation methods should carefully consider the sampling schedule at each monitor to avoid zig-zag bias autocorrelations that are detrimental to health analyses.

The AOD method has seen increasing use for developing PM2.5 fields. We find that such methods are relatively accurate and represent one of the state-of-the-science approaches. The LUR method performs similarly to the spatial interpolation methods in this study. It is, however, also worth noting that both the LUR and AOD methods may simulate inverse spatial PM2.5 concentration patterns at monitor locations, as suggested by the negative spatial correlation coefficients (Table 2). Overall, our results suggest that the two methods should be applied in study domains with sufficient monitoring data available. Saturation sampling may be performed in the study area to better develop regression equations for the LUR and AOD approaches.

The RLINE and CMAQ methods are capable of providing spatially, temporally and chemically complete pollution fields, though their unadjusted outputs are subject to significant bias and error that are temporally autocorrelated. The observed correlations among pollutants were also distorted, becoming either larger or smaller (Table 3). Our results argue against using such model outputs directly, without careful evaluation, in related epidemiological or health studies.

On the other hand, using results from air quality models along with observations appears to address many of the limitations found when using only observations or the models themselves, especially with the more statistically rigorous CMAQ-DF method. The CMAQ-hybrid method, with its source apportionment capabilities, performs reasonably well, though less favorably than the CMAQ-DF method, and provides additional source impact information that is important for accountability studies.

Method evaluation should not be limited to simple comparison with observations, and data withholding should be used as appropriate and possible. Further, bias autocorrelation analysis proved to be informative and showed that some methods lead to results with temporally lingering biases and bias patterns, while those that utilize observations tend to have minimal longer-term bias autocorrelations. Methods using observations may be preferred for health studies in which autocorrelations should be avoided, such as time-series studies.

Appropriate exposure estimates are critical for air quality related epidemiological and health impact assessment studies. Results from this study show that concentration fields generated by the various methods differ considerably. These differences can markedly impact the outcomes of subsequent exposure and health analyses. This study highlights the strengths and weaknesses of different methods and helps investigators select methods that suit their specific needs.

Acknowledgement

This publication was made possible by funding from the EPRI and the US EPA under grant R834799. Its contents are solely the responsibility of the grantee and do not necessarily represent the official views of the supporting agencies. Further, the US government does not endorse the purchase of any commercial products or services mentioned in the publication. We would also like to acknowledge the help of Eric Edgerton (Atmospheric Research & Analysis Inc), Cesunica Ivey (University of Nevada Reno), Cong Liu (Southeast University, China), Xinxin Zhai (Georgia Institute of Technology), Francesca Metcalf (Georgia Institute of Technology), Yang Liu and Xuefei Hu (Emory University).

References

Abt Associates Inc. (2008). Environmental Benefits Mapping and Analysis Program (Version 3.0). Bethesda, MD. Prepared for Environmental Protection Agency, Office of Air Quality Planning and Standards, Air Benefits and Costs Group. Research Triangle Park, NC.

TE D

Alhanti BA, Chang HH, Winquist A, Mulholland JA, Darrow LA, Sarnat SE. 2015. Ambient air pollution and emergency department visits for asthma: A multi-city assessment of effect modification by age. Journal of Exposure Science and Environmental Epidemiology 26(2):180-8. Anselin L, Le Gallo J. 2006. Interpolation of air quality measures in hedonic house price models: Spatial aspects. Spatial Economic Analysis 1:31-52.

EP

Appel KW, Roselle S, Pouliot G, Eder B, Pierce T, Mathur R, et al. 2012. Performance summary of the 2006 community multiscale air quality (CMAQ) simulation for the AQMEII project: North american application. In: Air pollution modeling and its application xxi, (Steyn DG, Castelli ST, eds), 505-511.

AC C

Appel KW, Pouliot GA, Simon H, Sarwar G, Pye HOT, Napelenok SL, et al. 2013. Evaluation of dust and trace metal estimates from the community multiscale air quality (CMAQ) model version 5.0. Geoscientific Model Development 6:883-899. Bates JT. 2015. Comparison of two downscaling techniques going from 12-km to 250 m resolution. In: 14th Annual Community Modeling and Analysis (CMAS) Conference. Chapel Hill, NC. Batterman S, Burke J, Isakov V, Lewis T, Mukherjee B, Robins T. 2014. A comparison of exposure metrics for traffic-related air pollutants: Application to epidemiology studies in Detroit, Michigan. International Journal of Environmental Research and Public Health 11:9553. Baxter LK, Dionisio KL, Burke J, Sarnat SE, Sarnat JA, Hodas N, et al. 2013. Exposure prediction approaches used in air pollution epidemiology studies: Key findings and future recommendations. Journal of Exposure Science & Environmental Epidemiology 23:654-659. 21

Beckerman BS, Jerrett M, Serre M, Martin RV, Lee S-J, van Donkelaar A, et al. 2013. A hybrid approach to estimating national scale spatiotemporal variability of PM2.5 in the contiguous United States. Environmental Science & Technology 47:7233-7241.

Bergen S, Sheppard L, Sampson PD, Sun-Young K, Richards M, Vedal S, et al. 2013. A national prediction model for PM2.5 component exposures and measurement error-corrected health effect inference. Environmental Health Perspectives 121:1017.

Bernstein JA, Alexis N, Barnes C, Bernstein IL, Nel A, Peden D, et al. 2004. Health effects of air pollution. Journal of Allergy and Clinical Immunology 114:1116-1123.

Boylan JW, Russell AG. 2006. PM and light extinction model performance metrics, goals, and criteria for three-dimensional air quality models. Atmospheric Environment 40:4946-4959.

Byun D, Schere KL. 2006. Review of the governing equations, computational algorithms, and other components of the Models-3 Community Multiscale Air Quality (CMAQ) modeling system. Applied Mechanics Reviews 59:51-77.

Chang HH, Hu X, Liu Y. 2014. Calibrating MODIS aerosol optical depth for predicting daily PM2.5 concentrations via statistical downscaling. Journal of Exposure Science & Environmental Epidemiology 24:398-404.

Chen G, Li J, Ying Q, Sherman S, Perkins N, Rajeshwari S, et al. 2014. Evaluation of observation-fused regional air quality model results for population air pollution exposure estimation. Science of the Total Environment 485:563-574.

Chudnovsky AA, Kostinski A, Lyapustin A, Koutrakis P. 2013. Spatial scales of pollution from variable resolution satellite imaging. Environmental Pollution 172:131-138.

Civerolo K, Hogrefe C, Zalewsky E, Hao W, Sistla G, Lynn B, et al. 2010. Evaluation of an 18-year CMAQ simulation: Seasonal variations and long-term temporal changes in sulfate and nitrate. Atmospheric Environment 44:3745-3752.

Coulter TC. 2004. EPA-CMB8.2 users manual. Office of Air Quality Planning & Standards, Emissions, Monitoring & Analysis Division, Air Quality Modeling Group.

D'Onofrio D. 2016. Atlanta roadside emissions exposure study: Methodology & project overview. Atlanta, GA. Atlanta Regional Commission.

Edgerton ES, Hartsell BE, Saylor RD, Jansen JJ, Hansen DA, Hidy GM. 2005. The Southeastern Aerosol Research and Characterization Study: Part II. Filter-based measurements of fine and coarse particulate matter mass and composition. Journal of the Air & Waste Management Association 55:1527-1542.

Edgerton ES, Hartsell BE, Saylor RD, Jansen JJ, Hansen DA, Hidy GM. 2006. The Southeastern Aerosol Research and Characterization Study, part 3: Continuous measurements of fine particulate matter mass and composition. Journal of the Air & Waste Management Association 56:1325-1341.

Engel-Cox JA, Holloman CH, Coutant BW, Hoff RM. 2004. Qualitative and quantitative evaluation of MODIS satellite sensor data for regional and urban scale air quality. Atmospheric Environment 38:2495-2509.

Foley KM, Roselle SJ, Appel KW, Bhave PV, Pleim JE, Otte TL, et al. 2010. Incremental testing of the Community Multiscale Air Quality (CMAQ) modeling system version 4.7. Geoscientific Model Development 3:205-226.

Friberg MD, Zhai X, Holmes HA, Chang HH, Strickland MJ, Sarnat SE, et al. 2016. Method for fusing observational data and chemical transport model simulations to estimate spatiotemporally resolved ambient air pollution. Environmental Science & Technology 50:3695-3705.

Gakidou E, Afshin A, Abajobir AA, Abate KH, Abbafati C, Abbas KM, et al. 2017. Global, regional, and national comparative risk assessment of 84 behavioural, environmental and occupational, and metabolic risks or clusters of risks, 1990-2016: a systematic analysis for the Global Burden of Disease Study 2016. The Lancet 390:1345-1422.

Goldman GT, Mulholland JA, Russell AG, Srivastava A, Strickland MJ, Klein M, et al. 2010. Ambient air pollutant measurement error: Characterization and impacts in a time-series epidemiologic study in Atlanta. Environmental Science & Technology 44:7692-7698.

Goldman GT, Mulholland JA, Russell AG, Gass K, Strickland MJ, Tolbert PE. 2012. Characterization of ambient air pollution measurement error in a time-series health study using a geostatistical simulation approach. Atmospheric Environment 57:101-108.

Hansen DA, Edgerton ES, Hartsell BE, Jansen JJ, Kandasamy N, Hidy GM, et al. 2003. The Southeastern Aerosol Research and Characterization Study: Part 1-overview. Journal of the Air & Waste Management Association 53:1460-1471.

Hao H, Chang HH, Holmes HA, Mulholland JA, Klein M, Darrow LA, et al. 2016. Air pollution and preterm birth in the US state of Georgia (2002–2006): Associations with concentrations of 11 ambient air pollutants estimated by combining Community Multiscale Air Quality model (CMAQ) simulations with stationary monitor measurements. Environmental Health Perspectives 124:875.

Hoek G, Brunekreef B, Goldbohm S, Fischer P, van den Brandt PA. 2002. Association between mortality and indicators of traffic-related air pollution in the Netherlands: A cohort study. The Lancet 360:1203-1209.

Hoek G, Beelen R, de Hoogh K, Vienneau D, Gulliver J, Fischer P, et al. 2008. A review of land-use regression models to assess spatial variation of outdoor air pollution. Atmospheric Environment 42:7561-7578.

Hogrefe C, Pouliot G, Wong D, Torian A, Roselle S, Pleim J, et al. 2014. Annual application and evaluation of the online coupled WRF–CMAQ system over North America under AQMEII phase 2. Atmospheric Environment.

Hogrefe C, Roselle S, Mathur R, Rao ST, Galmarini S. 2014. Space-time analysis of the Air Quality Model Evaluation International Initiative (AQMEII) phase 1 air quality simulations. Journal of the Air & Waste Management Association 64:388-405.

Hogrefe C, Pouliot G, Wong D, Torian A, Roselle S, Pleim J, et al. 2015. Annual application and evaluation of the online coupled WRF-CMAQ system over North America under AQMEII phase 2. Atmospheric Environment 115:683-694.

Hopke PK. 2016. Review of receptor modeling methods for source apportionment. Journal of the Air & Waste Management Association 66:237-259.

Hu J, Zhang H, Ying Q, Chen S-H, Vandenberghe F, Kleeman M. 2015. Long-term particulate matter modeling for health effect studies in California–part 1: Model performance on temporal and spatial variations. Atmospheric Chemistry and Physics 15:3445-3461.

Hu X, Waller LA, Lyapustin A, Wang Y, Liu Y. 2014a. 10-year spatial and temporal trends of PM2.5 concentrations in the southeastern US estimated using high-resolution satellite data. Atmospheric Chemistry and Physics 14:6301-6314.

Hu X, Waller LA, Lyapustin A, Wang Y, Liu Y. 2014b. Improving satellite-driven PM2.5 models with Moderate Resolution Imaging Spectroradiometer fire counts in the southeastern US. Journal of Geophysical Research: Atmospheres 119:11,375-11,386.

Hu Y, Balachandran S, Pachon JE, Baek J, Ivey C, Holmes H, et al. 2014. Fine particulate matter source apportionment using a hybrid chemical transport and receptor model approach. Atmospheric Chemistry and Physics 14:5415-5431.

Isakov V, Arunachalam S, Batterman S, Bereznicki S, Burke J, Dionisio K, et al. 2014. Air quality modeling in support of the Near-Road Exposures and Effects of Urban Air Pollutants Study (NEXUS). International Journal of Environmental Research and Public Health 11:8777-8793.

Ivey C, Holmes H, Hu Y, Mulholland JA, Russell A. 2016. A method for quantifying bias in modeled concentrations and source impacts for secondary particulate matter. Frontiers of Environmental Science & Engineering 10:14.

Ivey CE, Holmes HA, Hu YT, Mulholland JA, Russell AG. 2015. Development of PM2.5 source impact spatial fields using a hybrid source apportionment air quality model. Geoscientific Model Development 8:2153-2165.

Jerrett M, Arain A, Kanaroglou P, Beckerman B, Potoglou D, Sahsuvaroglu T, et al. 2005. A review and evaluation of intraurban air pollution exposure models. Journal of Exposure Science and Environmental Epidemiology 15:185-204.

Jones JM, Hogrefe C, Henry RF, Ku JY, Sistla G. 2005. An assessment of the sensitivity and reliability of the relative reduction factor approach in the development of 8-hr ozone attainment plans. Journal of the Air & Waste Management Association 55:13-19.

Kampa M, Castanas E. 2008. Human health effects of air pollution. Environmental Pollution 151:362-367.

Kaufman JD, Adar SD, Barr RG, Budoff M, Burke GL, Curl CL, et al. 2016. Association between air pollution and coronary artery calcification within six metropolitan areas in the USA (the Multi-Ethnic Study of Atherosclerosis and Air Pollution): A longitudinal cohort study. The Lancet 388:696-704.

Kim J. 2004. Ambient air pollution: Health hazards to children. Pediatrics 114:1699-1707.

Kim S-Y, Yi S-J, Eum YS, Choi H-J, Shin H, Ryou HG, et al. 2014. Ordinary kriging approach to predicting long-term particulate matter concentrations in seven major Korean cities. Environmental Health and Toxicology 29.

Kloog I, Koutrakis P, Coull BA, Lee HJ, Schwartz J. 2011. Assessing temporally and spatially resolved PM2.5 exposures for epidemiological studies using satellite aerosol optical depth measurements. Atmospheric Environment 45:6267-6275.

Koo B, Kumar N, Knipping E, Nopmongcol U, Sakulyanontvittaya T, Odman MT, et al. 2015. Chemical transport model consistency in simulating regulatory outcomes and the relationship to model performance. Atmospheric Environment 116:159-171.

Laden F, Neas LM, Dockery DW, Schwartz J. 2000. Association of fine particulate matter from different sources with daily mortality in six US cities. Environmental Health Perspectives 108:941-947.

Li J, Carlson BE, Lacis AA. 2015. How well do satellite AOD observations represent the spatial and temporal variability of PM2.5 concentration for the United States? Atmospheric Environment 102:260-273.

Lv B, Hu Y, Chang HH, Russell AG, Bai Y. 2016. Improving the accuracy of daily PM2.5 distributions derived from the fusion of ground-level measurements with aerosol optical depth observations, a case study in North China. Environmental Science & Technology 50:4752-4759.

Lyapustin A, Wang Y, Laszlo I, Kahn R, Korkin S, Remer L, et al. 2011. Multiangle implementation of atmospheric correction (MAIAC): 2. Aerosol algorithm. Journal of Geophysical Research: Atmospheres 116:D3.

Mannshardt E, Sucic K, Jiao W, Dominici F, Frey HC, Reich B, et al. 2013. Comparing exposure metrics for the effects of fine particulate matter on emergency hospital admissions. Journal of Exposure Science and Environmental Epidemiology 23:627-636.

Marmur A, Mulholland J, Kim E, Hopke P, Sarnat J, Klein M, et al. 2006a. Comparing results from several PM2.5 source-apportionment methods for use in a time-series health study. Epidemiology 17:S200-S200.

Marmur A, Park SK, Mulholland JA, Tolbert PE, Russell AG. 2006b. Source apportionment of PM2.5 in the southeastern United States using receptor and emissions-based models: Conceptual differences and implications for time-series health studies. Atmospheric Environment 40:2533-2551.

Martin RV. 2008. Satellite remote sensing of surface air quality. Atmospheric Environment 42:7823-7843.

Mebust MR, Eder BK, Binkowski FS, Roselle SJ. 2003. Models-3 Community Multiscale Air Quality (CMAQ) model aerosol component - 2. Model evaluation. Journal of Geophysical Research: Atmospheres 108.

Napelenok SL, Foley KM, Kang DW, Mathur R, Pierce T, Rao ST. 2011. Dynamic evaluation of regional air quality model's response to emission reductions in the presence of uncertain emission inventories. Atmospheric Environment 45:4091-4098.

Ozkaynak H, Baxter LK, Dionisio KL, Burke J. 2013. Air pollution exposure prediction approaches used in air pollution epidemiology studies. Journal of Exposure Science & Environmental Epidemiology 23:566-572.

Özkaynak H, Baxter LK, Burke J. 2013a. Evaluation and application of alternative air pollution exposure metrics in air pollution epidemiology studies. Journal of Exposure Science and Environmental Epidemiology 23:565-565.

Özkaynak H, Baxter LK, Dionisio KL, Burke J. 2013b. Air pollution exposure prediction approaches used in air pollution epidemiology studies. Journal of Exposure Science and Environmental Epidemiology 23:566-572.

Pope III CA, Ezzati M, Dockery DW. 2009. Fine-particulate air pollution and life expectancy in the United States. New England Journal of Medicine 360:376-386.

Pope III CA, Dockery DW. 2006. Health effects of fine particulate air pollution: Lines that connect. Journal of the Air & Waste Management Association 56:709-742.

Porter PS, Rao ST, Hogrefe C, Gego E, Mathur R. 2015. Methods for reducing biases and errors in regional photochemical model outputs for use in emission reduction and exposure assessments. Atmospheric Environment 112:178-188.

Huang R, Zhai X, Ivey CE, Friberg MD, Hu X, Liu Y, et al. 2016. Air pollutant exposure modeling using air quality model-data fusion methods for blending models and observations, and comparison with satellite AOD-derived fields: An application over North Carolina, USA. International Technical Meeting on Air Pollution Modelling and its Application. Springer, Cham, 2016.

Rao ST, Mathur R, Hogrefe C, Solazzo E, Galmarini S, Steyn DG. 2014. Air Quality Model Evaluation International Initiative (AQMEII): A two-continent effort for the evaluation of regional air quality models. In: Air pollution modeling and its application XXII, (Steyn DG, Builtjes PJH, Timmermans RMA, eds), 455-462.

Reid CE, Jerrett M, Petersen ML, Pfister GG, Morefield PE, Tager IB, et al. 2015. Spatiotemporal prediction of fine particulate matter during the 2008 northern California wildfires using machine learning. Environmental Science & Technology 49:3887-3896.

Rohr AC, Wyzga RE. 2012. Attributing health effects to individual particulate matter constituents. Atmospheric Environment 62:130-152.

Ryan PH, LeMasters GK. 2007. A review of land-use regression models for characterizing intraurban air pollution exposure. Inhalation Toxicology 19:127-133.

Sampson PD, Richards M, Szpiro AA, Bergen S, Sheppard L, Larson TV, et al. 2013. A regionalized national universal kriging model using partial least squares regression for estimating annual PM2.5 concentrations in epidemiology. Atmospheric Environment 75:383-392.

Sarnat JA, Marmur A, Klein M, Kim E, Russell AG, Mulholland JA, et al. 2006. Associations between source-resolved particulate matter and cardiorespiratory emergency department visits. Epidemiology 17:S267-S268.

Sayer A, Hsu N, Bettenhausen C, Jeong MJ. 2013. Validation and uncertainty estimates for MODIS Collection 6 “Deep Blue” aerosol data. Journal of Geophysical Research: Atmospheres 118:7864-7872.

Simon H, Baker KR, Phillips S. 2012. Compilation and interpretation of photochemical model performance statistics published between 2006 and 2012. Atmospheric Environment 61:124-139.

Snyder MG, Venkatram A, Heist DK, Perry SG, Petersen WB, Isakov V. 2013. RLINE: A line source dispersion model for near-surface releases. Atmospheric Environment 77:748-756.

Swall JL, Foley KM. 2009. The impact of spatial correlation and incommensurability on model evaluation. Atmospheric Environment 43:1204-1217.

US Environmental Protection Agency (USEPA). 2010. Quantitative Health Risk Assessment for Particulate Matter. Second External Review Draft. US Environmental Protection Agency. Research Triangle Park, North Carolina 27711.

US Environmental Protection Agency (USEPA). 2016a. 2014 National Emissions Inventory, version 1 Technical Support Document. US Environmental Protection Agency. Research Triangle Park, North Carolina 27711.

US Environmental Protection Agency (USEPA). 2016b. NCore Network and Sites Information. US Environmental Protection Agency. Accessed January 6, 2018: https://www3.epa.gov/ttnamti1/ncorenetworks.html

Vaidyanathan A, Dimmick WF, Kegler SR, Qualters JR. 2013. Statistical air quality predictions for public health surveillance: Evaluation and generation of county level metrics of PM2.5 for the environmental public health tracking network. International Journal of Health Geographics 12.

Watson JG, Cooper JA, Huntzicker JJ. 1984. The effective variance weighting for least-squares calculations applied to the mass balance receptor model. Atmospheric Environment 18:1347-1355.

Whitworth KW, Symanski E, Lai D, Coker AL. 2011. Kriged and modeled ambient air levels of benzene in an urban environment: An exposure assessment study. Environmental Health 10:21.

Wong DW, Yuan L, Perlin SA. 2004. Comparison of spatial interpolation methods for the estimation of air quality data. Journal of Exposure Analysis and Environmental Epidemiology 14:404-415.

Wu J, Wilhelm M, Chung J, Ritz B. 2011. Comparing exposure assessment methods for traffic-related air pollution in an adverse pregnancy outcome study. Environmental Research 111:685-692.

Yanosky JD, Paciorek CJ, Laden F, Hart JE, Puett RC, Liao D, et al. 2014. Spatio-temporal modeling of particulate air pollution in the conterminous United States using geographic and meteorological predictors. Environmental Health 13.

Zanobetti A, Schwartz J. 2009. The effect of fine and coarse particulate air pollution on mortality: A national analysis. Environmental Health Perspectives 117:898-903.

Zhai X, Russell A, Sampath P, Mulholland J, Kim B-U, Kim Y, et al. 2016. Calibrating R-LINE model results with observational data to develop annual mobile source air pollutant fields at fine spatial resolution: Application in Atlanta. Atmospheric Environment 147:446-457.

Zhang Y, Liu P, Liu X-H, Jacobson MZ, McMurry PH, Yu F, et al. 2010. A comparative study of nucleation parameterizations: 2. Three-dimensional model application and evaluation. Journal of Geophysical Research: Atmospheres 115.

Zhu YF, Hinds WC, Kim S, Sioutas C. 2002. Concentration and size distribution of ultrafine particles near a major highway. Journal of the Air & Waste Management Association 52:1032-1042.

Zoogman P, Liu X, Suleiman RM, Pennington WF, Flittner DE, Al-Saadi JA, et al. 2017. Tropospheric emissions: Monitoring of pollution (TEMPO). Journal of Quantitative Spectroscopy and Radiative Transfer 186:17-39.

Zou B, Wilson JG, Zhan FB, Zeng Y. 2009. Air pollution exposure assessment methods utilized in epidemiological studies. Journal of Environmental Monitoring 11:475-490.
