Development and transferability of a nitrogen dioxide land use regression model within the Veneto region of Italy


Accepted Manuscript

Alessandro Marcon, Kees de Hoogh, John Gulliver, Rob Beelen, Anna L. Hansell

PII: S1352-2310(15)30429-5
DOI: 10.1016/j.atmosenv.2015.10.010
Reference: AEA 14168
To appear in: Atmospheric Environment
Received Date: 23 December 2014
Revised Date: 5 October 2015
Accepted Date: 6 October 2015

Please cite this article as: Marcon, A., de Hoogh, K., Gulliver, J., Beelen, R., Hansell, A.L., Development and transferability of a nitrogen dioxide land use regression model within the Veneto region of Italy, Atmospheric Environment (2015), doi: 10.1016/j.atmosenv.2015.10.010. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


Authors' full names:

Alessandro Marcon1,2 ([email protected]), Kees de Hoogh2,3,4* ([email protected]), John Gulliver2* ([email protected]), Rob Beelen5,6 ([email protected]), Anna L. Hansell2,7 ([email protected])

* These authors contributed equally to this work

Authors' affiliations:

1) Unit of Epidemiology and Medical Statistics, Department of Diagnostics and Public Health, University of Verona, Verona, Italy

2) MRC-PHE Centre for Environment and Health, Department of Epidemiology & Biostatistics, School of Public Health, Imperial College London, London, UK

3) Swiss Tropical and Public Health Institute, Basel, Switzerland

4) University of Basel, Basel, Switzerland

5) Institute for Risk Assessment Sciences, Utrecht University, Utrecht, The Netherlands

6) National Institute for Public Health and the Environment, Bilthoven, The Netherlands

7) Imperial College Healthcare NHS Trust, London, UK

Corresponding author:

Alessandro Marcon, Unit of Epidemiology and Medical Statistics, Department of Diagnostics and Public Health, University of Verona, Strada Le Grazie 8, 37134 Verona, Italy. Telephone: +39 045 8027668. E-mail: [email protected]

Running title: Transferability of a nitrogen dioxide LUR model

Abstract word count: 267 words

Manuscript word count: 4090 words

ABSTRACT

When measurements or other exposure models are unavailable, air pollution concentrations could be estimated by transferring land-use regression (LUR) models from other areas. No studies have looked at transferability of LUR models from regions to cities. We investigated model transferability issues. We developed a LUR model for 2010 using annual average nitrogen dioxide (NO2) concentrations retrieved from 47 regulatory stations of the Veneto region, Northern Italy. We applied this model to 40 independent sites in Verona, a city inside the region, where NO2 had been monitored in the European Study of Cohorts for Air Pollution Effects (ESCAPE) during 2010. We also used this model to estimate average NO2 concentrations at the regulatory network in 2008, 2009 and 2011. Of 33 predictor variables offered, five were retained in the LUR model (R2=0.75). The number of buildings in 5,000 m buffers, industry surface area in 1,000 m buffers and altitude, mainly representing large-scale air pollution dispersion patterns, explained most of the spatial variability in NO2 concentrations (R2=0.68), while two local traffic proxy indicators explained little of the variability (R2=0.07). The performance of this model transferred to urban sites was poor overall (R2=0.18), but it improved when only predicting inner-city background concentrations (R2=0.52). Recalibration of LUR coefficients improved model performance when predicting NO2 concentrations at the regulatory sites in 2008, 2009 and 2011 (R2 between 0.67 and 0.80). Models developed for a region using NO2 regulatory data are unable to capture small-scale variability in NO2 concentrations in urban traffic areas. Our study documents limitations in transferring a regional model to a city, even if it is nested within that region.

Keywords

Air pollution; Ambient air; Environmental exposure; Exposure assessment; Geographic Information Systems; Land Use Regression.

1. INTRODUCTION

Regression mapping, also called Land Use Regression (LUR) modelling, is one commonly used exposure assessment technique (Jerrett et al., 2005). LUR has been recently employed in the European Study of Cohorts for Air Pollution Effects (ESCAPE), where models for several air pollutants, including nitrogen dioxide (NO2), have been developed based on dedicated monitoring campaigns (Beelen et al., 2013) and applied (Schikowski et al., 2014) to estimate residential outdoor air pollutant concentrations. Compared with interpolation methods like ordinary kriging, LUR is generally better at capturing small-scale variability in air pollution, especially in urban areas, where air pollution concentrations vary widely across short distances, and when the monitoring network is relatively sparse (Jerrett et al., 2005; Gulliver et al., 2011; Akita et al., 2014; Hoek et al., 2008). Compared with air dispersion models, LUR models have less demanding data, software and hardware requirements (Beelen et al., 2010; Wang et al., 2015).

Ideally, developing a LUR model would need a dense network of monitors, possibly providing an independent test set for model validation. Monitor locations should be selected to appropriately reflect the variation in air pollution concentrations at the residential addresses of the study population (using e.g. saturation sampling, Shmool et al., 2014). In practice, researchers often rely on available data from regulatory monitoring networks. Although these networks may provide a wide temporal coverage, they usually have a sparse spatial coverage that may be poorly representative of residential areas. The alternative is undertaking expensive ad hoc monitoring, which might have limited temporal coverage (Jerrett et al., 2005).

The Genes Environment Interaction in Respiratory Diseases (GEIRD) population study, carried out in 8 centres in Italy, is aimed at investigating environmental and genetic determinants of respiratory chronic inflammatory diseases (de Marco et al., 2010). Clinical examinations started in 2008 and are ongoing (Marcon et al., 2013). For three centres, including Verona (Northern Italy), LUR models developed in ESCAPE could be used to estimate air pollution exposure of study participants (Beelen et al., 2013). For centres where neither exposure models, nor dedicated monitoring campaigns, nor regulatory networks of adequate density are available, one option could be to develop LUR models using available regulatory data for a larger area or region. However, transferring models from regions to urban areas, even ones located inside the region, requires a preliminary evaluation.

A number of studies have transferred LUR models between cities (Briggs et al., 2000; Jerrett et al., 2005; Poplawski et al., 2009; Allen et al., 2011) or countries (Vienneau et al., 2010; Wang et al., 2014). In some cases, both predictor variables and coefficients of LUR models were transferred (Allen et al., 2011; Jerrett et al., 2005), while in others regression coefficients were recalibrated using air pollution measurements from the target area (Briggs et al., 2000; Poplawski et al., 2009; Vienneau et al., 2010). The performance of transferred models was highly variable, ranging from very poor (Jerrett et al., 2005) to satisfactory (Briggs et al., 2000), but it was generally lower than that of models developed for a specific area (Poplawski et al., 2009). To our knowledge, few studies have attempted to transfer LUR models from a region to a smaller, nested area (Wang et al., 2014), and none of these have examined transferability from regions to cities.

In this study, we developed a LUR model for NO2 using regulatory monitoring data for 2010 from the Veneto region, a large administrative area including the municipality of Verona (Figure 1), and we studied its performance on the set of measurements carried out in Verona during the same year in ESCAPE. We also studied whether this model could be used to estimate NO2 concentrations in other years of the 2008-2011 period.

2. METHODS

2.1 Study area

The Veneto region has a surface area of 18,407 km2 and about 4.8 million inhabitants (data from the 2011 census of the Italian population, available at http://dati-censimentopopolazione.istat.it/, last accessed on 11 June 2015). It is divided into 7 provinces. The Verona municipality, inside the Verona province, covers 198.9 km2 and has 252,520 inhabitants (Figure 1).

2.2 Air pollution data

Daily and annual average NO2 concentrations for years 2008-2011 measured by 53 regulatory stations in Veneto were retrieved from the European air quality database website (http://www.eea.europa.eu/). Of these, 41 were active throughout the whole period. The stations are operated by the regional environmental agency (Agenzia Regionale per la Prevenzione e Protezione Ambientale del Veneto, [ARPAV]) and detect NO2 by chemiluminescence. They are classified into regional background, urban (including suburban) background, industrial, and street stations. Stations with ≤75% of daily average concentrations available were not considered. Since most of the geographic information system (GIS) data were available only within the region and the largest buffer for calculation of predictors was 5 km, stations that were at <5 km from the regional border were not considered to avoid missing data in these buffers. Coordinates were retrieved from ARPAV and checked on satellite maps.

NO2 concentrations measured in ESCAPE are described elsewhere (Cyrys et al., 2012). Briefly, measurements had been conducted by passive sampling (Ogawa badges) in Verona, at 40 sites classified as regional background, urban background and street sites, between January and June 2010, during three 14-day periods. Annual average concentrations had been calculated for all sites correcting for temporal variation, using measurements obtained from a background regulatory station that was operated in Verona year-round (Cyrys et al., 2012). This improves comparability between annual average concentrations measured by passive samplers and regulatory stations.

For each regulatory station, annual average concentrations were also estimated using daily regulatory data for the same 3 ESCAPE sampling periods, applying the ESCAPE temporal adjustment procedure (Cyrys et al., 2012).

2.3 Potential predictor variables

A GIS (ArcMap 10.1 software, ESRI, Redlands, California), set to the Monte Mario Italy 1 projected coordinate system (the regional standard), was used to generate predictor variables at the coordinates of the regulatory and ESCAPE sites. Traffic data were not available. The 67 potential predictor variables extracted were:

• Altitude, which was transformed according to Beelen et al. (2009): √(nalt/max(nalt)), where nalt = altitude - min(altitude); the data source was the regional Digital Terrain Model database (5 x 5 m cells);

• Surface area (m2) of pre-defined land cover classes (high/low density residential land combined; industry; port; urban green; semi-natural and forested areas, alone and combined; water) in 100, 300, 500, 1000 and 5000 m buffers; the data source was the regional land cover database (http://idt.regione.veneto.it/, 1:10,000 resolution), which uses the Coordination of Information on the Environment (CORINE) classification;

• Population (census 2011), number of buildings and households (census 2001) in 100, 300, 500, 1000 and 5000 m buffers (http://www.istat.it/, census tract geometries: 1:5,000 and 1:25,000 resolution in urban and less populated areas, respectively);

• Length of roads/motorways (m) in 25, 50, 100, 300, 500 and 1000 m buffers; inverse distance (1/m) and inverse distance squared (1/m2) to roads/motorways; derived from the regional road network (http://idt.regione.veneto.it/, 1:10,000 resolution). A finer classification of road types was not available (see Limitations, Discussion section).

Buffer sizes and land cover coding were based on the ESCAPE protocol (Beelen et al., 2013).
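For illustration, the altitude transformation quoted in the list of predictors can be sketched in Python. This is a minimal sketch, not the authors' GIS workflow: the function name `transform_altitude` and the altitude values are hypothetical, and the formula is taken exactly as stated in the text, √(nalt/max(nalt)) with nalt = altitude - min(altitude).

```python
import math


def transform_altitude(altitudes):
    """Rescale altitudes to [0, 1] using the square-root transform quoted in the text."""
    lowest = min(altitudes)
    nalt = [a - lowest for a in altitudes]  # nalt = altitude - min(altitude)
    largest = max(nalt)
    if largest == 0:  # every site at the same altitude: no contrast to rescale
        return [0.0 for _ in nalt]
    return [math.sqrt(n / largest) for n in nalt]


# Hypothetical site altitudes in metres (illustrative values, not study data)
site_altitudes = [10.0, 35.0, 110.0, 410.0]
transformed = transform_altitude(site_altitudes)
print(transformed)
```

The square root compresses the upper end of the altitude range, so differences among low-lying sites carry relatively more weight than differences among high-altitude sites.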

2.4 LUR model development

A LUR model was developed using annual average NO2 concentrations from the 47 regulatory stations that were active in 2010 and fulfilled the inclusion criteria. It was not possible to develop a LUR model using regulatory data for the Verona municipality, where there were only 3 stations (Figure 1). For sensitivity analyses, the model was re-developed after excluding the 5 industrial monitoring stations, or the 9 stations in the Verona province. Finally, a LUR model was developed using ESCAPE measurements. The latter differed from the published model for Verona (Beelen et al., 2013) because some predictor variables (e.g. traffic counts) were not available for the region.

A priori, GIS predictors were required to have values >0 for at least 20% of the sites. Redundant predictors were excluded (Henderson et al., 2007): after ranking the predictors in each category (e.g. buildings) according to the absolute value of the Pearson’s r coefficient (|r|) for their correlation with NO2, any variables that were correlated (|r| >0.6) with the highest ranking variable in the same category were excluded.
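The redundancy rule described above can be sketched as follows. This is a hedged illustration under stated assumptions: the helper names `pearson_r` and `drop_redundant`, the predictor names and all values are hypothetical, and the sketch assumes no predictor is constant (a constant predictor would make the correlation undefined).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def drop_redundant(category, no2, threshold=0.6):
    """Keep the predictor best correlated with NO2, plus those with |r| <= threshold with it."""
    ranked = sorted(category, key=lambda name: abs(pearson_r(category[name], no2)), reverse=True)
    top = ranked[0]
    kept = [top]
    for name in ranked[1:]:
        # Exclude variables that are correlated (|r| > threshold) with the top-ranked one
        if abs(pearson_r(category[name], category[top])) <= threshold:
            kept.append(name)
    return kept


# Hypothetical "buildings" category at five sites (illustrative values only)
no2 = [10.0, 20.0, 30.0, 40.0, 50.0]
category = {
    "buildings_100": [3, 1, 4, 1, 5],
    "buildings_1000": [2, 3, 5, 8, 11],
    "buildings_5000": [1, 2, 3, 4, 5],
}
kept = drop_redundant(category, no2)
print(kept)
```

In this toy example the 1,000 m variable is dropped because it is strongly correlated with the top-ranked 5,000 m variable, while the 100 m variable survives because it carries largely independent information.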

A supervised forward stepwise procedure was followed (Beelen et al., 2013). First, univariate linear regression models were run for all predictors, and the model with the highest adjusted coefficient of determination (adjusted R2: aR2) and a regression coefficient with the pre-defined sign was selected as the starting model. Then, the remaining predictors were offered one at a time, and the model with the highest aR2 was selected, provided that the aR2 increase was >0.01, and that all regression coefficients had the pre-defined signs. When no predictor further increased the model’s aR2 by >0.01, variables with p >0.10 were sequentially removed. If variables in the same category but different buffers were included in the final model, the variable in the largest buffer was replaced with a doughnut-buffered variable (Beelen et al., 2013). Diagnostic tests were applied to check for multi-collinearity (predictors with Variance Inflation Factor [VIF]>3 were dropped), influential observations (Cook’s D), heteroskedasticity, non-normality and spatial autocorrelation of residuals (Moran’s I) (Beelen et al., 2013).
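The forward selection step of the procedure above can be sketched with NumPy. This is a simplified illustration, not the analysis code used in the study (which was run in STATA and R): the function names, predictor names and synthetic data are hypothetical, and the sketch deliberately omits the removal of variables with p >0.10, the doughnut-buffer substitution and the diagnostic tests.

```python
import numpy as np


def fit_ols(X, y):
    """Return (coefficients including intercept, adjusted R2) for an ordinary least squares fit."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    n, k = A.shape[0], A.shape[1] - 1
    r2 = 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())
    return beta, 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def forward_select(predictors, signs, y, min_gain=0.01):
    """Add one predictor at a time while adjusted R2 rises by > min_gain and all signs are as expected."""
    selected, best_adj = [], -np.inf
    while True:
        best_name, best_candidate_adj = None, best_adj
        for name in predictors:
            if name in selected:
                continue
            trial = selected + [name]
            beta, adj = fit_ols(np.column_stack([predictors[p] for p in trial]), y)
            signs_ok = all(np.sign(b) == signs[p] for b, p in zip(beta[1:], trial))
            gain_ok = not selected or adj - best_adj > min_gain
            if signs_ok and gain_ok and adj > best_candidate_adj:
                best_name, best_candidate_adj = name, adj
        if best_name is None:
            return selected, best_adj
        selected.append(best_name)
        best_adj = best_candidate_adj


# Synthetic data for 40 hypothetical sites (not study data)
rng = np.random.default_rng(0)
n_sites = 40
predictors = {
    "buildings_5000": rng.uniform(0, 1, n_sites),
    "altitude": rng.uniform(0, 1, n_sites),
    "random_noise": rng.uniform(0, 1, n_sites),
}
signs = {"buildings_5000": 1, "altitude": -1, "random_noise": 1}  # pre-defined directions of effect
y = 10 + 20 * predictors["buildings_5000"] - 8 * predictors["altitude"] + rng.normal(0, 1, n_sites)

selected, adj_r2 = forward_select(predictors, signs, y)
print(selected, round(adj_r2, 3))
```

Requiring both a minimum gain in aR2 and a pre-defined sign keeps the selection "supervised": a predictor that improves the fit only through an implausible direction of effect is never admitted.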

Model performance was evaluated by R2, aR2, and root mean square error (RMSE), and by leave-one-out cross validation (LOOCV). In LOOCV the model was calibrated N (number of measurements) times, using N – 1 measurements each time, and NO2 concentrations were predicted at each of the left-out stations using these models. Then, the measured vs predicted linear fit was calculated and the LOOCV-R2 and RMSE were reported.
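The LOOCV scheme described above can be sketched with NumPy as follows. The function names and the synthetic data are hypothetical. Two choices are assumptions of this sketch rather than statements from the text: the LOOCV-R2 is computed as the squared correlation between measured and predicted values (which equals the R2 of their linear fit), and the RMSE is computed on the raw prediction errors.

```python
import numpy as np


def loocv_predictions(X, y):
    """Refit the model N times, each time predicting the single left-out site."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # intercept plus predictors
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # calibrate on the other N - 1 sites
        beta, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        preds[i] = A[i] @ beta
    return preds


def loocv_r2_rmse(measured, predicted):
    """R2 of the measured vs predicted linear fit, and RMSE of the prediction errors."""
    r = np.corrcoef(measured, predicted)[0, 1]
    rmse = float(np.sqrt(np.mean((measured - predicted) ** 2)))
    return float(r ** 2), rmse


# Synthetic example with 47 hypothetical sites and 3 predictors (not study data)
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(47, 3))
no2 = 12 + 18 * X[:, 0] + 6 * X[:, 1] - 5 * X[:, 2] + rng.normal(0, 3, 47)

cv_r2, cv_rmse = loocv_r2_rmse(no2, loocv_predictions(X, no2))
print(f"LOOCV R2 = {cv_r2:.2f}, LOOCV RMSE = {cv_rmse:.2f}")
```

Because each site is predicted by a model that never saw it, the LOOCV statistics are usually somewhat worse than the in-sample R2 and RMSE.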

The statistical analyses were performed using STATA 13.1 (StataCorp, College Station, TX) and R 3.1.0 (The R Foundation for Statistical Computing).

2.5 Model transferability

To study spatial transferability, model performance was evaluated at ESCAPE sites in Verona: the LUR model was applied to predict NO2 concentrations at these sites, and the correlation between measured (i.e. ESCAPE) and predicted concentrations was analysed. Since one common first-line approach to study air pollution effects is to use rough proxy indicators of exposure based on characteristics of the residential area (e.g. rural/urban or traffic intensity indicators derived from questionnaires), the proportion of variability (R2) in NO2 concentrations explained by ESCAPE site classification (rural background, urban background, street site) was also calculated, for reference, using linear regression.
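The spatial transfer check described above amounts to applying fixed coefficients at new sites and regressing measured on predicted concentrations. The sketch below illustrates this under stated assumptions: the coefficients, predictor names and site values are hypothetical, and the slope of the measured vs predicted fit is assumed to be the kind of "coefficient" reported alongside R2.

```python
def predict_no2(intercept, coefficients, site):
    """Apply fixed (transferred) model coefficients to the predictor values of one site."""
    return intercept + sum(coefficients[name] * site[name] for name in coefficients)


def measured_vs_predicted(measured, predicted):
    """Slope and R2 of the linear fit of measured on predicted concentrations."""
    n = len(measured)
    mp, mm = sum(predicted) / n, sum(measured) / n
    sxy = sum((p - mp) * (m - mm) for p, m in zip(predicted, measured))
    sxx = sum((p - mp) ** 2 for p in predicted)
    syy = sum((m - mm) ** 2 for m in measured)
    return sxy / sxx, sxy ** 2 / (sxx * syy)


# Hypothetical regional model coefficients (illustrative values only)
intercept = 12.0
coefficients = {"buildings_5000": 0.002, "road_length_100": 0.01}

# Hypothetical city sites with measured NO2 in µg/m3 (not study data)
sites = [
    {"buildings_5000": 4000, "road_length_100": 100, "measured": 24.0},
    {"buildings_5000": 6000, "road_length_100": 200, "measured": 31.0},
    {"buildings_5000": 8000, "road_length_100": 400, "measured": 58.0},
    {"buildings_5000": 9000, "road_length_100": 300, "measured": 40.0},
]
predicted = [predict_no2(intercept, coefficients, s) for s in sites]
measured = [s["measured"] for s in sites]
slope, r2 = measured_vs_predicted(measured, predicted)
print(f"slope = {slope:.2f}, R2 = {r2:.2f}")
```

A slope well above 1 indicates that the transferred model compresses the real contrasts between sites, even when the ranking of sites is partly preserved.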

To study transferability over time, the correlation between concentrations predicted for 2010 and concentrations measured for 2008, 2009, 2011, and 2008-11 was analysed at the regulatory sites. Moreover, predictors’ coefficients were recalibrated by forcing variables selected for 2010 into models with NO2 concentrations for the other years as dependent variables, and the measured vs predicted concentrations were analysed. Both these analyses were done using data from all available stations.
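The difference between transferring the 2010 coefficients unchanged and recalibrating them on another year can be sketched with NumPy. Everything in this sketch is illustrative: the function names are hypothetical and the two "years" of data are synthetic, generated from deliberately different coefficients.

```python
import numpy as np


def fit_coefficients(X, y):
    """Ordinary least squares coefficients (intercept first) for fixed predictors X."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def predict(beta, X):
    """Predicted concentrations from intercept-first coefficients."""
    return beta[0] + X @ beta[1:]


def r_squared(measured, predicted):
    """R2 of the measured vs predicted linear fit (squared Pearson correlation)."""
    return float(np.corrcoef(measured, predicted)[0, 1] ** 2)


# Synthetic predictors at 41 hypothetical stations and NO2 for two years (not study data)
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(41, 3))
no2_2010 = 12 + 18 * X[:, 0] + 6 * X[:, 1] - 5 * X[:, 2] + rng.normal(0, 2, 41)
no2_2008 = 14 + 22 * X[:, 0] + 3 * X[:, 1] - 7 * X[:, 2] + rng.normal(0, 2, 41)

# Option 1: transfer the 2010 coefficients unchanged to 2008
beta_2010 = fit_coefficients(X, no2_2010)
r2_transferred = r_squared(no2_2008, predict(beta_2010, X))

# Option 2: keep the same predictors but recalibrate the coefficients on 2008 data
beta_2008 = fit_coefficients(X, no2_2008)
r2_recalibrated = r_squared(no2_2008, predict(beta_2008, X))

print(f"transferred R2 = {r2_transferred:.3f}, recalibrated R2 = {r2_recalibrated:.3f}")
```

With the predictors held fixed, recalibration can never give a lower in-sample R2 than the transferred coefficients, because least squares maximises the correlation between measured values and any linear combination of the predictors.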

Predictor values at ESCAPE sites, as well as at regulatory stations that were inactive in 2010, were truncated to the maximum (or minimum) value measured at the 2010 regulatory stations when they fell outside that range, since the linear relationship between the predictors and NO2 concentrations might not hold outside of this range (Akita et al., 2014). This procedure has been shown to generally improve model performance (Wang et al., 2013).
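The truncation rule can be sketched in a few lines of Python. The function name and the values are hypothetical and only illustrate clipping to the range observed at the model development sites.

```python
def truncate_to_training_range(values, training_values):
    """Clip predictor values to the [min, max] range seen at the model development sites."""
    lo, hi = min(training_values), max(training_values)
    return [min(max(v, lo), hi) for v in values]


# Hypothetical industry surface area (m2) in 1,000 m buffers
development_sites = [0.0, 12_000.0, 250_000.0, 900_000.0]
target_sites = [5_000.0, 1_400_000.0, 300_000.0]

truncated = truncate_to_training_range(target_sites, development_sites)
print(truncated)  # the second value is capped at the development-site maximum
```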


3. RESULTS

3.1 Air pollution data and GIS predictors

Among the 47 regulatory stations used to develop the LUR model, rural background (23.4%) and urban background (46.8%) stations were the most represented (Table 1). On average, percent data capture was 93.9±3.7%. Annual average NO2 concentrations were the lowest at rural background (16.4 µg/m3) and the highest at street stations (39.8 µg/m3). Overall, NO2 concentrations were lower at the regulatory stations (range: 7.3 to 46.8 µg/m3) than at ESCAPE sites (range: 16.3 to 100.1 µg/m3) (Figure 2). Most of the 40 ESCAPE monitoring sites were street sites (n=23, 57.5%), whereas 14 (35%) and 3 (7.5%) were urban and rural background sites, respectively.

The distribution of all the 67 potential predictors extracted at the regulatory stations is described in Table A1 (Appendix). Of these predictor variables, 15 and 19 were dropped because they had too many zeroes or redundant information, respectively.

3.2 LUR model development

Five of the remaining 33 predictors offered were retained in the model (Table 2). The number of buildings in 5,000 m buffers was the first variable to enter, accounting for almost half of the variability in NO2 concentrations (R2 = 0.44). Small-scale traffic indicators (length of roads in 100 m buffers and inverse distance to motorways) entered the model last and they accounted for little of the variability (gain in R2 = 0.07). Predictors had small VIF values (Table 2), and there were no influential observations (maximum Cook’s D = 0.61). There was no spatial autocorrelation of residuals (Moran’s I = -0.002, p = 0.53). Model statistics were R2 = 0.75, aR2 = 0.72 and RMSE = 4.97 µg/m3. The LOOCV R2 and RMSE were 0.64 and 5.69 µg/m3, respectively.

3.4 Spatial transferability

With the exception of 3 outliers for industry in 1,000 m buffers (Figure 3, B), the range of predictor values at the ESCAPE sites was within the distribution at regulatory stations (also see Table A1). In the case of the number of buildings in 5,000 m buffers (A) and transformed altitude (C), the range at ESCAPE sites was particularly narrow. Road length in 100 m buffers (D) showed a fairly similar distribution between the two sets, whereas high values of 1/distance (i.e. points at small distance) to motorways were poorly represented in both.

When transferred to the ESCAPE sites within Verona, the LUR model showed a poor performance (coefficient = 1.61, R2=0.18) (Figure 4, A) compared to the model based on site classification (R2=0.35). When ESCAPE street sites were not considered, the performance improved (coefficient = 1.16, R2=0.52) (Figure 4, B). On average, the LUR model underestimated NO2 concentrations by 10.7±16.2 µg/m3 at street sites, whereas it overestimated concentrations by 4.1±5.2 µg/m3 at background sites. With the exception of inverse distance to motorways, the contributions of the predictor variables to the total concentration predicted increased when evaluated at the background sites (Figure A1 in the Appendix).

The models developed for sensitivity analyses excluding industrial stations and stations in the Verona province, respectively, are described in Table A2 (Appendix). Although they had a better R2 than the main model, their performance at ESCAPE (especially background) sites was worse.

3.5 Transferability over time

Concentrations measured for different years in 2008-11 at the 41 stations that were operated during the whole period (Table 3) were highly correlated (all p<0.001): r coefficients ranged between 0.90 (2008 vs 2011) and 0.98 (2009 vs 2010). The performance (R2) of the model developed for 2010 in the period 2008-2011 is illustrated in Figure 5. Recalibration improved model performance. The R2 values of models transferred to 2008 and 2009 were lower than 0.75 (the R2 of the original model), whereas R2 values were larger than 0.75 when the model was applied to 2011.


4. DISCUSSION

We developed a LUR model for annual average NO2 concentrations for 2010 for a region of Italy using regulatory monitoring data, and we transferred this model to an urban area nested within that region. We found that the transferred model was unable to capture the spatial variability in NO2 concentrations at the urban traffic sites. When tested only at the background sites, the transferred model explained about half of the variability in NO2 concentrations (R2=0.52).

The lesson learned from the previous evaluations of LUR model transferability is that the performance of transferred models strongly depends on the similarity between areas in terms of topographical and meteorological characteristics, urbanisation level, land cover, vehicle fleet mix, etc. (Briggs et al., 2000; Jerrett et al., 2005; Poplawski et al., 2009; Vienneau et al., 2010; Allen et al., 2011; Wang et al., 2014). What the present study adds to previous literature is that transferability issues may arise even when models developed for a region are transferred to cities nested within that region, which prima facie could be thought to be a reasonable option in the common situation that there are neither enough regulatory monitoring stations in a city to construct a model, nor funds to conduct monitoring campaigns.

In our study, the same GIS data were used both for model development and transfer, which eliminated one potential source of bias. We developed models using data from monitoring stations that were working continuously throughout 2010. This is an advantage with respect to modelling annual concentrations estimated on the basis of short campaigns conducted using passive samplers, a less reliable measurement method. The drawback is that many of these stations are generally located to represent average urban concentrations. Using such data may fail to capture hotspots (Poplawski et al., 2009). Hence, studies have mainly used regulatory air pollution data to develop LUR models for large regions or countries (Ross et al., 2007; Vienneau et al., 2010).

4.1 Spatial transferability

Only 19.2% of the regulatory stations were classified as street stations, which implies that the monitoring network mostly represented background concentrations. This is also reflected by the type of predictors selected in the model and their relative contributions to overall performance. The number of buildings in 5,000 m buffers, a large-scale indicator of urbanisation, was the first predictor to enter the model and it explained roughly half of NO2 variability. The 2nd (industry in 1,000 m buffers) and 3rd (transformed altitude) variables were also large-scale predictors. Finally, two small-scale traffic indicators only accounted for 7% of NO2 variability. This pattern of selected predictors and their relative contributions to the explained variability in NO2 concentrations is unusual for LUR models developed in urban areas, where traffic indicators are generally the most important (Hoek et al., 2008; Beelen et al., 2013), while it is more similar to the pattern found by Vienneau et al. (2010) in the models developed for countries using regulatory data. Interestingly, when a LUR model was developed on ESCAPE measurements using the same predictor variables available for the region (Table S2), the only traffic indicator entering the model, length of roads in 50 m buffers, explained just 20% of NO2 variability, while two large-scale predictors (industry and population in 5,000 m buffers) were selected. Thus, the lack of traffic data or the poor quality of street network data, or some regional characteristic (e.g. the low air pollution dispersion in the Italian Po Valley, Beelen et al., 2009) may have contributed to this result.

Model performance based on cross validation was good (R2 = 0.75, LOOCV-R2 = 0.64). However, when applied to ESCAPE sites in Verona, performance dropped (R2 = 0.18). This was mainly due to the presence of several street sites where NO2 concentrations were very high (60 to 100 µg/m3, well outside the range of regional concentrations). After excluding street sites from the ESCAPE test set, model performance improved, although the model was only able to capture about half of the spatial variability in NO2 concentrations (R2 = 0.52).

As previously mentioned, dissimilarities between areas can explain these results. First, the distribution of site types was completely different between the regional and ESCAPE networks (street sites were 19.2% vs 57.5%, respectively). This was reflected in different distributions of NO2 concentrations (Figure 2). This difference is unlikely to be due to the different monitoring durations (an entire year vs short-term monitoring campaigns) or measurement principles (chemiluminescence vs passive monitoring) (Gulliver et al., 2013). In fact, annual average concentrations estimated using daily regulatory data from the three ESCAPE monitoring periods and calculated using the whole 2010 time series were very highly correlated (r=0.98, mean difference -5.4±2.3 µg/m3). Moreover, the ESCAPE measurements were adjusted using data from a background regulatory station that operated year-round (see Methods).

A second explanation for the poor performance of the transferred model is the different distribution of the predictors at the regional and ESCAPE sites. The densities of industries, population and buildings were higher in Verona than in the region (Table A1 in the Appendix). Virtually all values of the predictors entering the model were within the range of values represented at the regional network. The exception was industry in 1,000 m buffers, where, however, truncation to the maximum value only affected 3 (7.5%) sites. Interestingly, some predictors (buildings in 5,000 m buffers and transformed altitude) showed much lower variability at ESCAPE sites. We speculate that differences in NO2 concentrations affected transferability more than differences in predictor values, since most predictors explained a substantial amount of the variability in NO2 concentrations when tested on background sites (Figure A1), despite their low variation. The inclusion of inverse distance to motorways in the model was probably driven by a few street stations that were close to motorways (see Figure 1 and Figure 3, panel E), which is a consequence of the shape of the inverse distance function (Rava et al., 2012). In this view, stations at a certain distance to motorways virtually gave no contribution. This is probably the reason why this variable had no predictive power at either all (R2 = 0.006) or background ESCAPE sites (R2=0.002) (Figure A1, panels E1 and E2).
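The point about the shape of the inverse distance function can be made concrete with a short numerical sketch. The distances below are hypothetical and chosen only to show how quickly 1/distance and 1/distance squared decay.

```python
def inverse_distance(distance_m):
    """Return 1/d for a distance d in metres (d must be positive)."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return 1.0 / distance_m


# Hypothetical distances from a site to the nearest motorway, in metres
for d in (10, 50, 100, 500, 1000, 5000):
    inv = inverse_distance(d)
    print(f"{d:>5} m: 1/d = {inv:.4f}, 1/d^2 = {inv ** 2:.8f}")
```

A site 10 m from a motorway has a predictor value 100 times larger than a site at 1,000 m, so a handful of very close sites can dominate the fitted coefficient while all other sites contribute values near zero.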

Two further reasons for the poor performance of the transferred model could be valid. First, industrial stations may be poorly representative of urban areas. Second, since monitoring networks in different provinces within the region are operated by separate ARPAV departments, there may be differences in the data from different areas. The two sensitivity analyses conducted, excluding either industrial stations or stations in the Verona province, seem to rule these hypotheses out.

4.2 Transferability over time

Our findings suggest that the LUR model can be used to estimate NO2 concentrations at the regulatory stations in the 2008-2011 period. This was expected, since annual NO2 concentrations were highly correlated and stable over time. Changes in urbanisation, vehicle technologies or fleet mix, which may cause spatial contrasts in NO2 to change over time (Caballero et al., 2012), are negligible over short periods (Beelen et al., 2007; Madsen et al., 2011). Better estimates were obtained by recalibrating the model (Briggs et al., 2000; Mölter et al., 2010). Indeed, spatial contrasts in air pollution appear to be relatively stable across long periods (Eeftens et al., 2011; Gulliver et al., 2013).

4.3 Limitations

The availability and quality of input data are crucial for the development of any predictive model (Wang et al., 2014). Traffic data, which are the most important input of LUR models (Hoek et al., 2008), were not available for this study. Road classification in the regional network also appeared to be of questionable quality. Several road attributes (roadway width, administrative classification and technical/functional classification) were tested to identify primary roads other than motorways. However, when mapped, highways showed discontinuous patterns, and they were consequently not offered to the model. Neither building height nor street configuration data were available. All in all, this may have contributed to poor model performance, especially at street sites. Cross-validation is known to overestimate model performance (Wang et al., 2012, 2013). However, an independent set of regulatory stations was not available for external validation. Model selection via deletion/substitution/addition algorithms might have prevented over-fitting, which is an issue when LUR models are developed using air pollution data from all the available monitoring sites, even when model performance is evaluated by hold-out validation (Jerrett et al., 2013). However, this model selection strategy was not deemed very appropriate in our study because of the small number of monitoring sites.
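As a reminder of how a leave-one-out cross-validation (LOOCV) R2 is obtained, the following minimal sketch may help. It rests on several assumptions: a single predictor is used for brevity, R2 is computed as 1 - SSE/SST, and the data and function names are hypothetical, not those of the study.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(observed, predicted):
    """R2 computed as 1 - SSE/SST (an assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def loocv_predictions(x, y):
    """Predict each site from a model fitted to all the other sites."""
    predictions = []
    for i in range(len(x)):
        intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        predictions.append(intercept + slope * x[i])
    return predictions


# Hypothetical predictor values and NO2 concentrations (µg/m3); not study data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [16.0, 22.0, 27.0, 30.0, 38.0, 41.0]

intercept, slope = fit_line(x, y)
r2_model = r_squared(y, [intercept + slope * xi for xi in x])
r2_loocv = r_squared(y, loocv_predictions(x, y))
print(f"model R2 = {r2_model:.3f}, LOOCV R2 = {r2_loocv:.3f}")
```

Because every LOOCV prediction comes from a model that never saw the held-out site, the LOOCV R2 cannot exceed the in-sample R2 for a least-squares fit; it still reuses the same monitoring network, so it is not a substitute for validation on independent stations.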

4.4 Conclusions

Our model, developed for a region of northern Italy using NO2 regulatory data, was unable to capture small-scale variability in NO2 concentrations at traffic sites when transferred to a city nested within that region. This finding supports previous evidence advising against transferring LUR models to areas with different characteristics and ranges of air pollution concentrations. This includes transferring models from regions with a high proportion of background and rural sites to cities (where traffic sources dominate), even if they are nested within those regions. In such situations, additional monitoring needs to be conducted and a new LUR model constructed for use in epidemiological studies. Transferring models developed for a region could still be useful to maximize exposure gradients when selecting sampling sites for ad hoc monitoring campaigns (Allen et al., 2011; Shmool et al., 2014).

In the centres participating in the GEIRD study where neither more appropriate exposure models nor adequate sets of measurements are available, an alternative approach to studying the association between air pollution exposure and health outcomes would be to use road proximity metrics (Allen et al., 2011; Gulliver et al., 2011) in combination with background NO2 concentrations estimated from models developed for the regions, in analogy to what was done in the ESCAPE study (see, for example, Schikowski et al., 2014).

Abbreviations:

aR2, Adjusted R2
ARPAV, Agenzia Regionale per la Prevenzione e Protezione Ambientale del Veneto
ESCAPE, European Study of Cohorts for Air Pollution Effects
GEIRD, Genes Environment Interaction in Respiratory Diseases (study)
GIS, Geographic information system
LOOCV, Leave-one-out cross-validation
LUR, Land-use regression
NO2, Nitrogen dioxide
RMSE, Root mean square error

Sources of financial support:

Dr. Marcon is the recipient of a European Respiratory Society Fellowship (STRTF 2014-4173), which supported carrying out this work at Imperial College London. The European Study of Cohorts for Air Pollution Effects has received funding from the European Community’s Seventh Framework Program (FP7/2007-2011) under grant agreement number 211250.

Acknowledgments:

We thank Ketty Lorenzet and Luca Zagolin, Osservatorio Aria, ARPAV, for double-checking the coordinates of NO2 monitoring stations, and Matteo Bellodi, Unità Agenti Fisici, ARPAV, for providing Digital Terrain Model data.

Contributorship statement:

AM, KdH, JG and ALH conceived the idea for this paper and planned the analyses. AM did the GIS and statistical analyses and drafted the manuscript. AM, KdH, JG, RB and ALH discussed the analysis, contributed to manuscript drafting and critically reviewed the manuscript.

Conflicts of Financial Interest:

None.

REFERENCES

Akita Y, Baldasano JM, Beelen R, Cirach M, de Hoogh K, Hoek G, Nieuwenhuijsen M, Serre ML, de Nazelle A. Large scale air pollution estimation method combining land use regression and chemical transport modeling in a geostatistical framework. Environ Sci Technol. 2014;48:4452–9.

Allen RW, Amram O, Wheeler AJ, Brauer M. The transferability of NO and NO2 land use regression models between cities and pollutants. Atmos Environ. 2011;45:369–78.

Beelen R, Hoek G, Fischer P, van den Brandt PA, Brunekreef B. Estimated long-term outdoor air pollution concentrations in a cohort study. Atmos Environ. 2007;41:1343–58.

Beelen R, Hoek G, Vienneau D, Eeftens M, Dimakopoulou K, Pedeli X, Tsai MY, Künzli N, Schikowski T, Marcon A, Eriksen KT, Raaschou-Nielsen O, Stephanou E, Patelarou E, Lanki T, Yli-Tuomi T, Declercq C, Falq G, Stempfelet M, Birk M, Cyrys J, von Klot S, Nádor G, Varró MJ, Dėdelė A, Gražulevičienė R, Mölter A, Lindley S, Madsen C, Cesaroni G, Ranzi A, Badaloni C, Hoffmann B, Nonnemacher M, Krämer U, Kuhlbusch T, Cirach M, de Nazelle A, Nieuwenhuijsen M, Bellander T, Korek M, Olsson D, Strömgren M, Dons E, Jerrett M, Fischer P, Wang M, Brunekreef B, de Hoogh K. Development of NO2 and NOx land use regression models for estimating air pollution exposure in 36 study areas in Europe – The ESCAPE project. Atmos Environ. 2013;72:10–23.

Beelen R, Hoek G, Pebesma E, Vienneau D, de Hoogh K, Briggs DJ. Mapping of background air pollution at a fine spatial scale across the European Union. Sci Total Environ. 2009;407:1852–67.

Beelen R, Voogt M, Duyzer J, Zandveld P, Hoek G. Comparison of the performances of land use regression modelling and dispersion modelling in estimating small-scale variations in long-term air pollution concentrations in a Dutch urban area. Atmos Environ. 2010;44:4614–21.

Briggs DJ, de Hoogh C, Gulliver J, Wills J, Elliott P, Kingham S, Smallbone K. A regression-based method for mapping traffic-related air pollution: application and testing in four contrasting urban environments. Sci Total Environ. 2000;253:151–67.

Caballero S, Esclapez R, Galindo N, Mantilla E, Crespo J. Use of a passive sampling network for the determination of urban NO2 spatiotemporal variations. Atmos Environ. 2012;63:148–55.

Cyrys J, Eeftens M, Heinrich J, Ampe C, Armengaud A, Beelen R, Bellander T, Beregszaszi T, Birk M, Cesaroni G, Cirach M, de Hoogh K, De Nazelle A, de Vocht F, Declercq C, Dėdelė A, Dimakopoulou K, Eriksen K, Galassi C, Gražulevičienė R, Grivas G, Gruzieva O, Gustafsson AH, Hoffmann B, Iakovides M, Ineichen A, Krämer U, Lanki T, Lozano P, Madsen C, Meliefste K, Modig L, Mölter A, Mosler G, Nieuwenhuijsen M, Nonnemacher M, Oldenwening M, Peters A, Pontet S, Probst-Hensch N, Quass U, Raaschou-Nielsen O, Ranzi A, Sugiri D, Stephanou EG, Taimisto P, Tsai MY, Vaskövi É, Villani S, Wang M, Brunekreef B, Hoek G. Variation of NO2 and NOx concentrations between and within 36 European study areas: results of the ESCAPE project. Atmos Environ. 2012;62:374–90.

de Marco R, Accordini S, Antonicelli L, Bellia V, Bettin MD, Bombieri C, Bonifazi F, Bugiani M, Carosso A, Casali L, Cazzoletti L, Cerveri I, Corsico AG, Ferrari M, Fois AG, Lo Cascio V, Marcon A, Marinoni A, Olivieri M, Perbellini L, Pignatti P, Pirina P, Poli A, Rolla G, Trabetti E, Verlato G, Villani S, Zanolin ME. The Gene-Environment Interactions in Respiratory Diseases (GEIRD) Project. Int Arch Allergy Immunol. 2010;152:255-63.

Eeftens M, Beelen R, Fischer P, Brunekreef B, Meliefste K, Hoek G. Stability of measured and modelled spatial contrasts in NO2 over time. Occup Environ Med. 2011;68:765–70.

Gulliver J, de Hoogh K, Fecht D, Vienneau D, Briggs D. Comparative assessment of GIS-based methods and metrics for estimating long-term exposures to air pollution. Atmos Environ. 2011;45:7072–80.

Gulliver J, de Hoogh K, Hansell A, Vienneau D. Development and back-extrapolation of NO2 land use regression models for historic exposure assessment in Great Britain. Environ Sci Technol. 2013;47:7804–11.

Henderson SB, Beckerman B, Jerrett M, Brauer M. Application of land use regression to estimate long-term concentrations of traffic-related nitrogen oxides and fine particulate matter. Environ Sci Technol. 2007;41:2422–28.

Hoek G, Beelen R, de Hoogh K, Vienneau D, Gulliver J, Fischer P, Briggs D. A review of land-use regression models to assess spatial variation of outdoor air pollution. Atmos Environ. 2008;42:7561–78.

Jerrett M, Arain A, Kanaroglou P, Beckerman B, Potoglou D, Sahsuvaroglu T, Morrison J, Giovis C. A review and evaluation of intraurban air pollution exposure models. J Expo Anal Environ Epidemiol. 2005;15:185–204.

Jerrett M, Burnett RT, Beckerman BS, Turner MC, Krewski D, Thurston G, Martin RV, van Donkelaar A, Hughes E, Shi Y, Gapstur SM, Thun MJ, Pope CA. Spatial analysis of air pollution and mortality in California. Am J Respir Crit Care Med. 2013;188:593–9.

Madsen C, Gehring U, Håberg SE, Nafstad P, Meliefste K, Nystad W, Lødrup Carlsen KC, Brunekreef B. Comparison of land-use regression models for predicting spatial NOx contrasts over a three year period in Oslo, Norway. Atmos Environ. 2011;45:3576–83.

Marcon A, Girardi P, Ferrari M, Olivieri M, Accordini S, Bombieri C, Bortolami O, Braggion M, Cappa V, Cazzoletti L, Locatelli F, Nicolis M, Perbellini L, Sembeni S, Verlato G, Zanolin ME, de Marco R. Mild asthma and chronic bronchitis seem to influence functional exercise capacity: a multi-case control study. Int Arch Allergy Immunol. 2013;161:181-8.

Mölter A, Lindley S, de Vocht F, Simpson A, Agius R. Modelling air pollution for epidemiologic research--part II: predicting temporal variation through land use regression. Sci Total Environ. 2010;409:211–7.

Poplawski K, Gould T, Setton E, Allen R, Su J, Larson T, Henderson S, Brauer M, Hystad P, Lightowlers C, Keller P, Cohen M, Silva C, Buzzelli M. Intercity transferability of land use regression models for estimating ambient concentrations of nitrogen dioxide. J Expo Anal Environ Epidemiol. 2009;19:107-117.

Rava M, Crainicianu C, Marcon A, Cazzoletti L, Pironi V, Silocchi C, Ricci P, de Marco R. Proximity to wood industries and respiratory symptoms in children: a sensitivity analysis. Environ Int. 2012;38:37-44.

Ross Z, Jerrett M, Ito K, Tempalski B, Thurston G. A land use regression for predicting fine particulate matter concentrations in the New York City region. Atmos Environ. 2007;41:2255–69.

Schikowski T, Adam M, Marcon A, Cai Y, Vierkötter A, Carsin AE, Jacquemin B, Al Kanani Z, Beelen R, Birk M, Bridevaux PO, Brunekreef B, Burney P, Cirach M, Cyrys J, de Hoogh K, de Marco R, de Nazelle A, Declercq C, Forsberg B, Hardy R, Heinrich J, Hoek G, Jarvis D, Keidel D, Kuh D, Kuhlbusch T, Migliore E, Mosler G, Nieuwenhuijsen MJ, Phuleria H, Rochat T, Schindler C, Villani S, Tsai MY, Zemp E, Hansell A, Kauffmann F, Sunyer J, Probst-Hensch N, Krämer U, Künzli N. Association of ambient air pollution with the prevalence and incidence of COPD. Eur Respir J. 2014;44:614–26.

Shmool JL, Michanowicz DR, Cambal L, Tunno B, Howell J, Gillooly S, Roper C, Tripathy S, Chubb LG, Eisl HM, Gorczynski JE, Holguin FE, Shields KN, Clougherty JE. Saturation sampling for spatial variation in multiple air pollutants across an inversion-prone metropolitan area of complex terrain. Environ Health. 2014;13:28.

Vienneau D, de Hoogh K, Beelen R, Fischer P, Hoek G, Briggs D. Comparison of land-use regression models between Great Britain and the Netherlands. Atmos Environ. 2010;44:688–96.

Wang M, Beelen R, Basagana X, Becker T, Cesaroni G, de Hoogh K, Dedele A, Declercq C, Dimakopoulou K, Eeftens M, Forastiere F, Galassi C, Gražulevičienė R, Hoffmann B, Heinrich J, Iakovides M, Künzli N, Korek M, Lindley S, Mölter A, Mosler G, Madsen C, Nieuwenhuijsen M, Phuleria H, Pedeli X, Raaschou-Nielsen O, Ranzi A, Stephanou E, Sugiri D, Stempfelet M, Tsai MY, Lanki T, Udvardy O, Varró MJ, Wolf K, Weinmayr G, Yli-Tuomi T, Hoek G, Brunekreef B. Evaluation of land use regression models for NO2 and particulate matter in 20 European study areas: the ESCAPE project. Environ Sci Technol. 2013;47:4357–64.

Wang M, Beelen R, Bellander T, Birk M, Cesaroni G, Cirach M, Cyrys J, de Hoogh K, Declercq C, Dimakopoulou K, Eeftens M, Eriksen KT, Forastiere F, Galassi C, Grivas G, Heinrich J, Hoffmann B, Ineichen A, Korek M, Lanki T, Lindley S, Modig L, Mölter A, Nafstad P, Nieuwenhuijsen MJ, Nystad W, Olsson D, Raaschou-Nielsen O, Ragettli M, Ranzi A, Stempfelet M, Sugiri D, Tsai MY, Udvardy O, Varró MJ, Vienneau D, Weinmayr G, Wolf K, Yli-Tuomi T, Hoek G, Brunekreef B. Performance of Multi-City Land Use Regression Models for Nitrogen Dioxide and Fine Particles. Environ Health Perspect. 2014;122:843-9.

Wang M, Beelen R, Eeftens M, Meliefste K, Hoek G, Brunekreef B. Systematic evaluation of land use regression models for NO₂. Environ Sci Technol. 2012;46:4481–9.

Wang M, Gehring U, Hoek G, Keuken M, Jonkers S, Beelen R, Eeftens M, Postma DS, Brunekreef B. Air Pollution and Lung Function in Dutch Children: A Comparison of Exposure Estimates and Associations Based on Land Use Regression and Dispersion Exposure Modeling Approaches. Environ Health Perspect. 2015 Apr 3. http://dx.doi.org/10.1289/ehp.1408541.

WEB REFERENCES

http://dati-censimentopopolazione.istat.it/

TABLES

Table 1: Distribution of percent data capture and annual average NO2 concentrations at the selected^a regulatory stations in Veneto in 2010.

| Type of station | N (%) | Data capture (%), mean±SD | NO2 concentrations, mean±SD (µg/m3) | NO2 concentrations, median (min, max) (µg/m3) | NO2 concentrations, range/mean (%) |
|---|---|---|---|---|---|
| Rural background | 11 (23.4) | 92.9±2.9 | 16.4±5.3 | 16.9 (7.3, 26.7) | 118 |
| Urban background | 22 (46.8) | 94.2±4.7 | 29.9±5.9 | 29.2 (21.0, 40.0) | 64 |
| Industrial | 5 (10.6) | 94.4±2.2 | 30.0±2.6 | 30.2 (26.4, 33.3) | 23 |
| Street | 9 (19.2) | 94.3±2.2 | 39.8±6.0 | 39.1 (29.2, 46.8) | 44 |
| All sites | 47 | 93.9±3.7 | 28.6±9.5 | 28.9 (7.3, 46.8) | 138 |

a monitoring stations with <75% data capture or located at <5km from the regional border were excluded

Table 2: regression coefficients of the LUR model developed using regulatory NO2 data for 2010.

| Predictor | Estimate | SE | t | p-value | VIF | R2 (aR2)^a |
|---|---|---|---|---|---|---|
| Intercept | 16.91269 | 2.177982 | 7.77 | <0.001 | | |
| Buildings (5,000) | .0007552 | .0001349 | 5.60 | <0.001 | 1.18 | 0.45 (0.44) |
| Industry (1,000) | 9.15e-06 | 2.57e-06 | 3.56 | 0.001 | 1.12 | 0.61 (0.59) |
| Transformed altitude | -11.05696 | 3.519141 | -3.14 | 0.003 | 1.14 | 0.68 (0.66) |
| Length of roads (100) | .0099249 | .0041146 | 2.41 | 0.020 | 1.06 | 0.72 (0.70) |
| 1/distance to motorways | 1611.996 | 731.9124 | 2.20 | 0.033 | 1.16 | 0.75 (0.72) |

a R2 (and aR2) of the model with the predictor plus all the predictors that had previously entered the model
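The regression coefficients of the LUR model (Table 2) define a linear prediction equation: the intercept plus the sum of each coefficient multiplied by its predictor value. A minimal sketch of how such an equation could be applied is given below. The dictionary keys, the function name and all predictor values for the example site are hypothetical, and the predictor units are assumed to match those used when the model was fitted.

```python
# Coefficients of the LUR model for 2010, as reported in Table 2.
# The dictionary keys are hypothetical names chosen for this sketch.
COEFFICIENTS = {
    "intercept": 16.91269,
    "buildings_5000": 0.0007552,
    "industry_1000": 9.15e-06,
    "transformed_altitude": -11.05696,
    "road_length_100": 0.0099249,
    "inv_dist_motorway": 1611.996,
}


def predict_no2(predictors):
    """Return predicted annual NO2 (µg/m3): intercept + sum(coef * value)."""
    total = COEFFICIENTS["intercept"]
    for name, value in predictors.items():
        total += COEFFICIENTS[name] * value
    return total


# Hypothetical site: every predictor value is invented for illustration.
site = {
    "buildings_5000": 10000.0,
    "industry_1000": 200000.0,
    "transformed_altitude": 0.5,
    "road_length_100": 300.0,
    "inv_dist_motorway": 1.0 / 1000.0,  # site 1,000 m from a motorway
}

print(round(predict_no2(site), 2))
```

Predictions obtained in this way are only meaningful within the range of predictor values observed at the stations used to develop the model.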

Table 3: annual average NO2 concentrations (µg/m3) at the 41 regulatory monitoring stations in Veneto that were active throughout the 2008-2011 period.

| Type of station | N. | 2008 | 2009 | 2010 | 2011 | 2008-11 |
|---|---|---|---|---|---|---|
| Rural background | 10 | 19.4 (7.4) | 18.1 (6.4) | 17.3 (4.6) | 16.9 (6.0) | 17.9 (5.8) |
| Urban background | 20 | 32.7 (7.3) | 31.4 (6.1) | 30.1 (5.9) | 31.6 (6.7) | 31.4 (6.2) |
| Industrial | 3 | 32.8 (4.8) | 31.5 (2.8) | 30.7 (2.3) | 30.7 (1.2) | 31.5 (2.8) |
| Street | 8 | 44.3 (6.6) | 42.2 (6.5) | 39.5 (6.4) | 39.1 (9.6) | 41.3 (6.9) |
| All stations | 41 | 31.7 (10.8) | 30.3 (10.0) | 28.9 (9.3) | 29.4 (10.3) | 30.1 (9.9) |

FIGURES

For colour reproduction on the Web:

Figure 1: map of the study area.*

* panel A, Italy, with the Veneto region marked in light blue; panel B, Veneto region, with the Verona province (hatched area) and the Verona municipality (white inner area) marked; panel C, Verona municipality. Brown lines represent motorways, circles and triangles represent regulatory and ESCAPE sites, respectively. Green, yellow, grey, and red symbols represent rural background, urban background, industrial, and street type sites, respectively.

For black-and-white reproduction in print:

Figure 1: map of the study area.*

* panel A, Italy, with the Veneto region marked in grey; panel B, Veneto region, with the Verona province (hatched area) and the Verona municipality (white inner area) marked; panel C, Verona municipality. Thick grey lines represent motorways, circles and triangles represent regulatory and ESCAPE sites, respectively. White, grey, dotted grey, and black symbols represent rural background, urban background, industrial, and street type sites, respectively.

Figure 2: distribution of measured NO2 concentrations at the ESCAPE and regulatory sites.

[Plot not reproduced: y-axis, NO2 concentrations (µg/m3), from 0 to 100; x-axis, site type, shown separately for ESCAPE sites (RB, UB, S) and regulatory sites (RB, UB, I, S).]

RB, rural background; UB, urban background; S, street; I, industrial

Figure 3: cumulative distribution of LUR predictors at the regulatory stations (plus symbols) and at ESCAPE sites (circles). A, buildings in 5,000 m; B, industry in 1,000 m; C, transformed altitude; D, length of roads in 100 m; E, 1/distance to motorways.

[Plot not reproduced: curves A to E; x-axis, predictor values*, with ticks from -2 to 6.]

* to fit all distributions on one single graph for comparative purposes, each predictor was centred and standardized on its distribution at the regulatory stations (ST), as follows: $x^{*} = \dfrac{x - \mathrm{mean}_{ST}(x)}{\mathrm{SD}_{ST}(x)}$. As a consequence, a value of 0 corresponds to the mean value of the predictor at the monitoring stations, and a 1-unit difference corresponds to a 1-SD difference.
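The centring and standardization described in the footnote to Figure 3 can be sketched as follows. This is an illustration under stated assumptions: the predictor values and the function name are hypothetical, and the sample standard deviation is used because the footnote does not specify which SD estimator was applied.

```python
from statistics import mean, stdev


def standardize_on_reference(values, reference):
    """Centre and scale `values` using the mean and SD of `reference`."""
    ref_mean = mean(reference)
    ref_sd = stdev(reference)  # sample SD; an assumption of this sketch
    return [(v - ref_mean) / ref_sd for v in values]


# Hypothetical predictor values (e.g. length of roads in 100 m); not study data.
regulatory = [120.0, 150.0, 180.0, 210.0, 240.0]
escape = [180.0, 300.0, 420.0]

# Both sets of sites are expressed on the scale of the regulatory stations.
z_regulatory = standardize_on_reference(regulatory, regulatory)
z_escape = standardize_on_reference(escape, regulatory)

print([round(z, 2) for z in z_regulatory])
print([round(z, 2) for z in z_escape])
```

Standardizing the ESCAPE values on the regulatory distribution, rather than on their own, is what makes the two sets of sites comparable on one axis: an ESCAPE site at 0 has the mean predictor value of the regulatory stations.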

Figure 4: comparison of NO2 concentrations predicted by the model transferred to ESCAPE sites with measured concentrations. Panel A, all ESCAPE site types (n=40); panel B, only rural and urban background sites (n=17).*

[Plot not reproduced: panels A and B; x-axis, predicted NO2 (µg/m3); y-axis, measured NO2 (µg/m3); annotated R2 values, 0.18 and 0.52.]

* hollow, grey and black circles indicate rural background, urban background and street sites, respectively

Figure 5: comparison of R2 values across the models applied to years 2008, 2009, 2011, and to 2008-11.*

[Plot not reproduced: R2 scale from 0.55 to 0.80; periods 2008 (43), 2009 (46), 2010 (47), 2011 (52) and 2008-11 (41); series, uncalibrated and recalibrated.]

* the numbers in brackets close to the period represent the number of stations available. The performance of the model developed for 2010 is represented by a solid grey line.

Highlights

• Land-use regression (LUR) is often used to estimate urban air pollution exposure
• No studies have looked at transferability of LUR models from regions to cities
• We developed a LUR model using NO2 regulatory data for a region of Italy for 2010
• When transferred to an inner city, the model was unable to capture NO2 variability
• LUR models should not be transferred to nested areas with different characteristics