Accepted Manuscript
Development and transferability of a nitrogen dioxide land use regression model within the Veneto region of Italy
Alessandro Marcon, Kees de Hoogh, John Gulliver, Rob Beelen, Anna L. Hansell
PII: S1352-2310(15)30429-5
DOI: 10.1016/j.atmosenv.2015.10.010
Reference: AEA 14168
To appear in: Atmospheric Environment
Received Date: 23 December 2014
Revised Date: 5 October 2015
Accepted Date: 6 October 2015
Please cite this article as: Marcon, A., de Hoogh, K., Gulliver, J., Beelen, R., Hansell, A.L., Development and transferability of a nitrogen dioxide land use regression model within the Veneto region of Italy, Atmospheric Environment (2015), doi: 10.1016/j.atmosenv.2015.10.010. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Development and transferability of a nitrogen dioxide land use regression model within the Veneto region of Italy
Authors' full names:
Alessandro Marcon1,2, Kees de Hoogh2,3,4*, John Gulliver2*, Rob Beelen5,6, Anna L. Hansell2,7
* These authors contributed equally to this work
Authors' affiliations:
1) Unit of Epidemiology and Medical Statistics, Department of Diagnostics and Public Health, University of Verona, Verona, Italy
2) MRC-PHE Centre for Environment and Health, Department of Epidemiology & Biostatistics, School of Public Health, Imperial College London, London, UK
3) Swiss Tropical and Public Health Institute, Basel, Switzerland
4) University of Basel, Basel, Switzerland
5) Institute for Risk Assessment Sciences, Utrecht University, Utrecht, The Netherlands
6) National Institute for Public Health and the Environment, Bilthoven, The Netherlands
7) Imperial College Healthcare NHS Trust, London, UK
Corresponding author:
Alessandro Marcon, Unit of Epidemiology and Medical Statistics, Department of Diagnostics and Public Health, University of Verona, Strada Le Grazie 8, 37134 Verona, Italy. Telephone: +39 045 8027668.
Running title: Transferability of a nitrogen dioxide LUR model
Abstract word count: 267 words
Manuscript word count: 4090 words
ABSTRACT
When measurements or other exposure models are unavailable, air pollution concentrations could be estimated by transferring land-use regression (LUR) models from other areas. No studies have looked at transferability of LUR models from regions to cities. We investigated model transferability issues. We developed a LUR model for 2010 using annual average nitrogen dioxide (NO2) concentrations retrieved from 47 regulatory stations of the Veneto region, Northern Italy. We applied this model to 40 independent sites in Verona, a city inside the region, where NO2 had been monitored in the European Study of Cohorts for Air Pollution Effects (ESCAPE) during 2010. We also used this model to estimate average NO2 concentrations at the regulatory network in 2008, 2009 and 2011. Of 33 predictor variables offered, five were retained in the LUR model (R2=0.75). The number of buildings in 5,000 m buffers, industry surface area in 1,000 m buffers and altitude, mainly representing large-scale air pollution dispersion patterns, explained most of the spatial variability in NO2 concentrations (R2=0.68), while two local traffic proxy indicators explained little of the variability (R2=0.07). The performance of this model transferred to urban sites was poor overall (R2=0.18), but it improved when only predicting inner-city background concentrations (R2=0.52). Recalibration of LUR coefficients improved model performance when predicting NO2 concentrations at the regulatory sites in 2008, 2009 and 2011 (R2 between 0.67 and 0.80). Models developed for a region using NO2 regulatory data are unable to capture small-scale variability in NO2 concentrations in urban traffic areas. Our study documents limitations in transferring a regional model to a city, even if it is nested within that region.
Keywords
Air pollution; Ambient air; Environmental exposure; Exposure assessment; Geographic Information Systems; Land Use Regression.
1. INTRODUCTION
Regression mapping, also called Land Use Regression (LUR) modelling, is a commonly used exposure assessment technique (Jerrett et al., 2005). LUR has recently been employed in the European Study of Cohorts for Air Pollution Effects (ESCAPE), where models for several air pollutants, including nitrogen dioxide (NO2), have been developed based on dedicated monitoring campaigns (Beelen et al., 2013) and applied (Schikowski et al., 2014) to estimate residential outdoor air pollutant concentrations. Compared with interpolation methods such as ordinary kriging, LUR is generally better at capturing small-scale variability in air pollution, especially in urban areas, where air pollution concentrations vary widely across short distances, and when the monitoring network is relatively sparse (Jerrett et al., 2005; Gulliver et al., 2011; Akita et al., 2014; Hoek et al., 2008). Compared with air dispersion models, LUR models have less demanding data, software and hardware requirements (Beelen et al., 2010; Wang et al., 2015).
Ideally, developing a LUR model would require a dense network of monitors, possibly providing an independent test set for model validation. Monitor locations should be selected so that they appropriately reflect the variation in air pollution concentrations at the residential addresses of the study population (using e.g. saturation sampling, Shmool et al., 2014). In practice, researchers often rely on available data from regulatory monitoring networks. Although these networks may provide wide temporal coverage, they usually have sparse spatial coverage that may be poorly representative of residential areas. The alternative is undertaking expensive ad hoc monitoring, which might have limited temporal coverage (Jerrett et al., 2005).
The Genes Environment Interaction in Respiratory Diseases (GEIRD) population study is carried out in 8 centres in Italy and aims to investigate environmental and genetic determinants of respiratory chronic inflammatory diseases (de Marco et al., 2010). Clinical examinations started in 2008 and are ongoing (Marcon et al., 2013). For three centres, including Verona (Northern Italy), LUR models developed in ESCAPE could be used to estimate air pollution exposure of study participants (Beelen et al., 2013). For centres where neither exposure models, nor dedicated monitoring campaigns, nor regulatory networks of adequate density are available, one option could be to develop LUR models using available regulatory data for a larger area or region. However, transferring models from regions to urban areas, even when these lie inside the region, requires a preliminary evaluation.
A number of studies have transferred LUR models between cities (Briggs et al., 2000; Jerrett et al., 2005; Poplawski et al., 2009; Allen et al., 2011) or countries (Vienneau et al., 2010; Wang et al., 2014). In some cases, both predictor variables and coefficients of LUR models were transferred (Allen et al., 2011; Jerrett et al., 2005), while in others regression coefficients were recalibrated using air pollution measurements from the target area (Briggs et al., 2000; Poplawski et al., 2009; Vienneau et al., 2010). The performance of transferred models was highly variable, ranging from very poor (Jerrett et al., 2005) to satisfactory (Briggs et al., 2000), but it was generally lower than that of models developed for a specific area (Poplawski et al., 2009). To our knowledge, few studies have attempted to transfer LUR models from a region to a smaller, nested area (Wang et al., 2014), and none of these have examined transferability from regions to cities.
In this study, we developed a LUR model for NO2 using regulatory monitoring data for 2010 from the Veneto region, a large administrative area including the municipality of Verona (Figure 1), and we studied its performance on the set of measurements carried out in Verona during the same year in ESCAPE. We also studied whether this model could be used to estimate NO2 concentrations in other years of the 2008-2011 period.
2. METHODS
2.1 Study area
The Veneto region has a surface area of 18,407 km2 and about 4.8 million inhabitants (data from the 2011 census of the Italian population, available at http://dati-censimentopopolazione.istat.it/, last accessed on 11 June 2015). It is divided into 7 provinces. The Verona municipality, inside the Verona province, covers 198.9 km2 and has 252,520 inhabitants (Figure 1).
2.2 Air pollution data
Daily and annual average NO2 concentrations for years 2008-2011 measured by 53 regulatory stations in Veneto were retrieved from the European air quality database website (http://www.eea.europa.eu/). Of these, 41 were active throughout the whole period. The stations are operated by the regional environmental agency (Agenzia Regionale per la Prevenzione e Protezione Ambientale del Veneto [ARPAV]) and detect NO2 by chemiluminescence. They are classified into regional background, urban (including suburban) background, industrial, and street stations. Stations with ≤75% of daily average concentrations available were not considered. Since most of the geographic information system (GIS) data were available only within the region and the largest buffer for calculation of predictors was 5 km, stations located <5 km from the regional border were not considered, to avoid missing data in these buffers. Coordinates were retrieved from ARPAV and checked on satellite maps.
NO2 concentrations measured in ESCAPE are described elsewhere (Cyrys et al., 2012). Briefly, measurements had been conducted by passive sampling (Ogawa badges) in Verona, at 40 sites classified as regional background, urban background and street sites, between January and June 2010, during three 14-day periods. Annual average concentrations had been calculated for all sites correcting for temporal variation, using measurements obtained from a background regulatory station that was operated in Verona year-round (Cyrys et al., 2012). This improves comparability between annual average concentrations measured by passive samplers and regulatory stations.
For each regulatory station, annual average concentrations were also estimated using daily regulatory data for the same 3 ESCAPE sampling periods, applying the ESCAPE temporal adjustment procedure (Cyrys et al., 2012).
2.3 Potential predictor variables
A GIS (ArcMap 10.1 software, ESRI, Redlands, California), set to the Monte Mario Italy 1 projected coordinate system (the regional standard), was used to generate predictor variables at the coordinates of the regulatory and ESCAPE sites. Traffic data were not available. The set of 67 potential predictor variables extracted were:
• Altitude, which was transformed according to Beelen et al. (2009): √(nalt/max(nalt)), where nalt = altitude - min(altitude); the data source was the regional Digital Terrain Model database (5 x 5 m cells);
• Surface area (m2) of pre-defined land cover classes (high/low density residential land combined; industry; port; urban green; semi-natural and forested areas, alone and combined; water) in 100, 300, 500, 1000 and 5000 m buffers; the data source was the regional land cover database (http://idt.regione.veneto.it/, 1:10,000 resolution), which uses the Coordination of Information on the Environment (CORINE) classification;
• Population (census 2011), number of buildings and households (census 2001) in 100, 300, 500, 1000 and 5000 m buffers (http://www.istat.it/, census tract geometries: 1:5,000 and 1:25,000 resolution in urban and less populated areas, respectively);
• Length of roads/motorways (m) in 25, 50, 100, 300, 500 and 1000 m buffers; inverse distance (1/m) and inverse distance squared (1/m2) to roads/motorways; derived from the regional road network (http://idt.regione.veneto.it/, 1:10,000 resolution). A finer classification of road types was not available (see Limitations, Discussion section).
Buffer sizes and land cover coding were based on the ESCAPE protocol (Beelen et al., 2013).
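The altitude transformation listed above can be sketched in a few lines of Python. This is a minimal illustration only: the function name and the example altitudes are hypothetical, and the handling of the degenerate case where all sites share one altitude is an assumption not addressed in the text.

```python
import math


def transform_altitude(altitudes):
    """Return sqrt(nalt / max(nalt)), where nalt = altitude - min(altitude)."""
    lowest = min(altitudes)
    nalt = [a - lowest for a in altitudes]
    highest = max(nalt)
    if highest == 0:
        # Assumption: identical altitudes at every site map to 0.
        return [0.0 for _ in nalt]
    return [math.sqrt(v / highest) for v in nalt]


if __name__ == "__main__":
    # Hypothetical site altitudes in metres.
    print(transform_altitude([10.0, 35.0, 110.0]))
```

The transformed variable is bounded between 0 (lowest site) and 1 (highest site), and the square root compresses differences among the highest sites.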
2.4 LUR model development
A LUR model using annual average NO2 concentrations from the 47 regulatory stations that were active in 2010 and fulfilled the inclusion criteria was developed. It was not possible to develop a LUR model using regulatory data for the Verona municipality, where there were only 3 stations (Figure 1). For sensitivity analyses, the model was re-developed after excluding the 5 industrial monitoring stations, or the 9 stations in the Verona province. Finally, a LUR model was developed using ESCAPE measurements. The latter differed from the published model for Verona (Beelen et al., 2013) because some predictor variables (e.g. traffic counts) were not available for the region.
A priori, GIS predictors were required to have values >0 for at least 20% of the sites. Redundant predictors were excluded (Henderson et al., 2007): after ranking the predictors in each category (e.g. buildings) according to the absolute value of the Pearson’s r coefficient (|r|) for their correlation with NO2, any variables that were correlated (|r| >0.6) with the highest ranking variable in the same category were excluded.
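The redundancy filter described above could be sketched as follows. This is an illustrative reading of the rule, not the code used in the study: the function names, the dictionary layout and the example values are hypothetical, and Pearson's r is computed by hand to keep the block dependency-free.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def drop_redundant(category, no2, threshold=0.6):
    """Filter one predictor category (dict: name -> values at the sites).

    Predictors are ranked by |r| with NO2; any predictor whose |r| with the
    top-ranked predictor exceeds the threshold is excluded.
    """
    ranked = sorted(
        category,
        key=lambda name: abs(pearson_r(category[name], no2)),
        reverse=True,
    )
    top = ranked[0]
    kept = [top]
    for name in ranked[1:]:
        if abs(pearson_r(category[name], category[top])) <= threshold:
            kept.append(name)
    return kept


if __name__ == "__main__":
    # Hypothetical values for three building-count buffers at six sites.
    buildings = {
        "buildings_5000": [1, 2, 3, 4, 5, 6],
        "buildings_1000": [2, 4, 5, 8, 11, 12],
        "buildings_100": [5, 1, 4, 2, 6, 3],
    }
    no2 = [10, 12, 14, 16, 18, 20]
    print(drop_redundant(buildings, no2))
```

Note that each candidate is compared only with the top-ranked predictor of its category, as the text states, not with every predictor already kept.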
A supervised forward stepwise procedure was followed (Beelen et al., 2013). First, univariate linear regression models were run for all predictors, and the model with the highest adjusted coefficient of determination R2 (adjusted R2: aR2) and the regression coefficient with the pre-defined sign was selected as the starting model. Then, the remaining predictors were offered one at a time, and the model with the highest aR2 was selected, provided that the aR2 increase was >0.01, and that all regression coefficients had the pre-defined signs. When no remaining predictor increased the model’s aR2 by >0.01, variables with p >0.10 were sequentially removed. If variables in the same category but different buffers were included in the final model, the variable in the largest buffer was replaced with a doughnut-buffered variable (Beelen et al., 2013). Diagnostic tests were applied to check for multi-collinearity (predictors with Variance Inflation Factor [VIF]>3 were dropped), influential observations (Cook’s D), heteroskedasticity, non-normality and spatial autocorrelation of residuals (Moran’s I) (Beelen et al., 2013).
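A simplified sketch of the forward selection loop is given below. It covers only the entry rule (highest aR2, gain >0.01, all coefficients with the pre-defined sign); the removal of variables with p >0.10, the doughnut-buffer substitution and the diagnostic tests are deliberately left out. All names and the toy data are hypothetical, and ordinary least squares is solved through the normal equations purely to keep the example self-contained.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(columns, y):
    """OLS with intercept; returns (coefficients, adjusted R2)."""
    n, k = len(y), len(columns)
    rows = [[1.0] + [c[i] for c in columns] for i in range(n)]
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k + 1)]
           for a in range(k + 1)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k + 1)]
    beta = solve(xtx, xty)
    pred = [sum(bj * xj for bj, xj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    return beta, 1 - (1 - r2) * (n - 1) / (n - k - 1)


def forward_select(predictors, signs, y, min_gain=0.01):
    """predictors: name -> values; signs: name -> +1 or -1 (pre-defined)."""
    chosen, best_ar2 = [], float("-inf")
    while True:
        best = None
        for name in predictors:
            if name in chosen:
                continue
            trial = chosen + [name]
            beta, ar2 = fit([predictors[p] for p in trial], y)
            signs_ok = all(b * signs[p] > 0 for b, p in zip(beta[1:], trial))
            gain_ok = ar2 - best_ar2 > min_gain
            if signs_ok and gain_ok and (best is None or ar2 > best[1]):
                best = (name, ar2)
        if best is None:
            return chosen, best_ar2
        chosen.append(best[0])
        best_ar2 = best[1]


if __name__ == "__main__":
    # Toy data: y depends exactly on x1 (+) and x2 (-); x3 is unrelated.
    predictors = {
        "x1": [1, 2, 3, 4, 5, 6, 7, 8],
        "x2": [3, 1, 4, 1, 5, 9, 2, 6],
        "x3": [2, 7, 1, 8, 2, 8, 1, 8],
    }
    signs = {"x1": 1, "x2": -1, "x3": 1}
    y = [10 + 2 * a - b for a, b in zip(predictors["x1"], predictors["x2"])]
    print(forward_select(predictors, signs, y))
```

In the toy data the unrelated predictor is never admitted because it cannot raise aR2 by more than 0.01 once the two true predictors are in the model.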
Model performance was evaluated by R2, aR2, and root mean square error (RMSE), and by leave-one-out cross validation (LOOCV). In LOOCV the model was calibrated N (number of measurements) times, using N - 1 measurements each time, and NO2 concentrations were predicted at each of the left-out stations using these models. Then, the measured vs predicted linear fit was calculated and the LOOCV-R2 and RMSE were reported.
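The LOOCV procedure can be illustrated with a deliberately reduced example that uses a single predictor, so that each refit has a closed form. The function names and data are hypothetical. Two choices here are assumptions rather than details taken from the text: the LOOCV-R2 is computed as the squared Pearson correlation between measured and predicted values (which equals the R2 of their linear fit), and the RMSE is computed from the raw prediction errors.

```python
import math


def fit_line(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


def loocv(x, y):
    """Refit N times on N - 1 sites and predict the left-out site each time."""
    predicted = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        predicted.append(intercept + slope * x[i])
    n = len(y)
    mp, my = sum(predicted) / n, sum(y) / n
    spy = sum((p - mp) * (m - my) for p, m in zip(predicted, y))
    spp = sum((p - mp) ** 2 for p in predicted)
    syy = sum((m - my) ** 2 for m in y)
    r2 = spy ** 2 / (spp * syy)  # R2 of the measured vs predicted fit
    rmse = math.sqrt(sum((m - p) ** 2 for m, p in zip(y, predicted)) / n)
    return predicted, r2, rmse


if __name__ == "__main__":
    # Hypothetical predictor values and NO2 concentrations at five sites.
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [6.0, 7.5, 9.0, 12.0, 12.5]
    print(loocv(x, y))
```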
The statistical analyses were performed using STATA 13.1 (StataCorp, College Station, TX) and R 3.1.0 (The R Foundation for Statistical Computing).
2.5 Model transferability
To study spatial transferability, model performance was evaluated at ESCAPE sites in Verona: the LUR model was applied to predict NO2 concentrations at these sites, and the correlation between measured (i.e. ESCAPE) and predicted concentrations was analysed. Since one common first-line approach to study air pollution effects is to use rough proxy indicators of exposure based on characteristics of the residential area (e.g. rural/urban or traffic intensity indicators derived by questionnaires), the proportion of variability (R2) in NO2 concentrations explained by ESCAPE site classification (rural background, urban background, street site) was also calculated, for reference, using linear regression.
To study transferability over time, the correlation between concentrations predicted for 2010 and concentrations measured for 2008, 2009, 2011, and 2008-11 was analysed at the regulatory sites. Moreover, predictors’ coefficients were recalibrated by forcing variables selected for 2010 into models with NO2 concentrations for the other years as dependent variables, and the measured vs predicted concentrations were analysed. Both these analyses were done using data from all available stations.
Predictor values at ESCAPE sites, and at regulatory stations that were inactive in 2010, that fell outside the range of values at the 2010 regulatory stations were truncated to the maximum (or minimum) measured value, since the linear relationship between the predictors and NO2 concentrations might not hold outside of this range (Akita et al., 2014). This procedure has been shown to generally improve model performance (Wang et al., 2013).
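The truncation step amounts to clamping each predictor to the range seen at the model-development stations. A minimal sketch, with a hypothetical function name and made-up values:

```python
def truncate_to_training_range(values, training_values):
    """Clamp predictor values to the min-max range of the training sites."""
    low, high = min(training_values), max(training_values)
    return [min(max(v, low), high) for v in values]


if __name__ == "__main__":
    # Hypothetical industry surface areas (m2) in a 1,000 m buffer.
    training = [10.0, 20.0, 100.0]      # 2010 regulatory stations
    target = [5.0, 50.0, 500.0]         # sites the model is transferred to
    print(truncate_to_training_range(target, training))
```

Values inside the training range pass through unchanged, so only sites that would require extrapolation are affected.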
3. RESULTS
3.1 Air pollution data and GIS predictors
Among the 47 regulatory stations used to develop the LUR model, rural (23.4%) and urban background (46.8%) stations were the most represented (Table 1). On average, percent data capture was 93.9±3.7%. Annual average NO2 concentrations were the lowest at rural background (16.4 µg/m3) and the highest at street stations (39.8 µg/m3). Overall, NO2 concentrations were lower at the regulatory stations (range: 7.3 to 46.8 µg/m3) than at ESCAPE sites (range: 16.3 to 100.1 µg/m3) (Figure 2). Most of the 40 ESCAPE monitoring sites were street sites (n=23, 57.5%), whereas 14 (35%) and 3 (7.5%) were urban and rural background sites, respectively.
The distribution of all the 67 potential predictors extracted at the regulatory stations is described in Table A1 (Appendix). Of these predictor variables, 15 and 19 were dropped because they had too many zeroes or redundant information, respectively.
3.2 LUR model development
Five of the remaining 33 predictors offered were retained in the model (Table 2). The number of buildings in 5,000 m buffers was the first variable to enter, accounting for almost half of the variability in NO2 concentrations (R2 = 0.44). Small-scale traffic indicators (length of roads in 100 m buffers and inverse distance to motorways) entered the model last and they accounted for little of the variability (gain in R2 = 0.07). Predictors had small VIF values (Table 2), and there were no influential observations (maximum Cook’s D = 0.61). There was no spatial autocorrelation of residuals (Moran’s I = -0.002, p = 0.53). Model statistics were R2 = 0.75, aR2 = 0.72 and RMSE = 4.97 µg/m3. The LOOCV R2 and RMSE were 0.64 and 5.69 µg/m3, respectively.
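As a consistency check on these statistics, the adjusted R2 can be recomputed from R2. The calculation below assumes the standard adjustment formula, with n = 47 stations and p = 5 retained predictors; the formula itself is not spelled out in the text, so its use here is an assumption.

```latex
\[
aR^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
     = 1 - (1 - 0.75)\,\frac{47 - 1}{47 - 5 - 1}
     = 1 - 0.25 \times \frac{46}{41}
     \approx 0.72
\]
```

The result agrees with the reported aR2 of 0.72.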
3.4 Spatial transferability
With the exception of 3 outliers for industry in 1,000 m buffers (Figure 3, B), the range of predictor values at the ESCAPE sites was within the distribution at regulatory stations (also see Table A1). In the case of the number of buildings in 5,000 m buffers (A) and transformed altitude (C), the range at ESCAPE sites was particularly narrow. Road length in 100 m buffers (D) showed a fairly similar distribution between the two sets, whereas high values of 1/distance to motorways (i.e. points at a small distance) were poorly represented in both.
When transferred to the ESCAPE sites within Verona, the LUR model showed a poor performance (coefficient = 1.61, R2=0.18) (Figure 4, A) compared to the model based on site classification (R2=0.35). When ESCAPE street sites were not considered, the performance improved (coefficient = 1.16, R2=0.52) (Figure 4, B). On average, the LUR model underestimated NO2 concentrations by 10.7±16.2 µg/m3 at street sites, whereas it overestimated concentrations by 4.1±5.2 µg/m3 at background sites. With the exception of inverse distance to motorways, the contributions of the predictor variables to the total concentration predicted increased when evaluated at the background sites (Figure A1 in the Appendix).
The models developed for sensitivity analyses excluding industrial stations and stations in the Verona province, respectively, are described in Table A2 (Appendix). Although they had a better R2 than the main model, their performance at ESCAPE (especially background) sites was worse.
3.5 Transferability over time
Concentrations measured for different years in 2008-11 at the 41 stations that were operated during the whole period (Table 3) were highly correlated (all p<0.001): r coefficients ranged between 0.90 (2008 vs 2011) and 0.98 (2009 vs 2010). The performance (R2) of the model developed for 2010 in the period 2008-2011 is illustrated in Figure 5. Recalibration improved model performance. The R2 values of models transferred to 2008 and 2009 were lower than 0.75 (the R2 of the original model), whereas R2 values were larger than 0.75 when the model was applied to 2011.
4. DISCUSSION
We developed a LUR model for annual average NO2 concentrations for 2010 for a region of Italy using regulatory monitoring data, and we transferred this model to an urban area nested within that region. We found that the transferred model was unable to capture the spatial variability in NO2 concentrations at the urban traffic sites. When tested only at the background sites, the transferred model explained about half of the variability in NO2 concentrations (R2=0.52).
The lesson learned from previous evaluations of LUR model transferability is that the performance of transferred models strongly depends on the similarity between areas in topographical and meteorological characteristics, urbanisation level, land cover, vehicle fleet mix, etc. (Briggs et al., 2000; Jerrett et al., 2005; Poplawski et al., 2009; Vienneau et al., 2010; Allen et al., 2011; Wang et al., 2014). What the present study adds to the previous literature is that transferability issues may arise even when models developed for a region are transferred to cities nested within that region. Prima facie, this could be thought a reasonable option in the common situation where there are neither enough regulatory monitoring stations in a city to construct a model, nor funds to conduct monitoring campaigns.
In our study, the same GIS data were used both for model development and transfer, which eliminated one potential source of bias. We developed models using data from monitoring stations that were working continuously throughout 2010. This is an advantage with respect to modelling annual concentrations estimated on the basis of short campaigns conducted using passive samplers, a less reliable measurement method. The drawback is that many of these stations are generally located to represent average urban concentrations. Using such data may fail to capture hotspots (Poplawski et al., 2009). Hence, studies have generally used regulatory air pollution data mainly to develop LUR models for large regions or countries (Ross et al., 2007; Vienneau et al., 2010).
4.1 Spatial transferability
Only 19.2% of the regulatory stations were classified as street stations, which implies that the monitoring network mostly represented background concentrations. This is also reflected by the type of predictors selected in the model and their relative contributions to overall performance. The number of buildings in 5,000 m buffers, a large-scale indicator of urbanization, was the first predictor to enter the model and it explained roughly half of NO2 variability. The 2nd (industry in 1,000 m buffers) and 3rd (transformed altitude) variables were also large-scale predictors. Finally, two small-scale traffic indicators only accounted for 7% of NO2 variability. This pattern of selected predictors and their relative contributions to the explained variability in NO2 concentrations is unusual for LUR models developed in urban areas, where traffic indicators are generally the most important (Hoek et al., 2008; Beelen et al., 2013), while it is more similar to the pattern found by Vienneau et al. (2010) in the models developed for countries using regulatory data. Interestingly, when a LUR model was developed on ESCAPE measurements using the same predictor variables available for the region (Table S2), the only traffic indicator entering the model, length of roads in 50 m buffers, explained only 20% of NO2 variability, while two large-scale predictors (industry and population in 5,000 m buffers) were selected. Thus, the lack of traffic data or the poor quality of street network data, or some regional characteristic (e.g. the low air pollution dispersion in the Italian Po Valley, Beelen et al., 2009) may have contributed to this result.
Model performance based on cross validation was good (R2 = 0.75, LOOCV-R2 = 0.64). However, when applied to ESCAPE sites in Verona, performance dropped (R2 = 0.18). This was mainly due to the presence of several street sites where NO2 concentrations were very high (60 to 100 µg/m3, well outside the range of regional concentrations). After excluding street sites from the ESCAPE test set, model performance improved, although the model was only able to capture about half of the spatial variability in NO2 concentrations (R2 = 0.52).
As previously mentioned, dissimilarities between areas can explain these results. First, the distribution of monitoring site types was completely different between the regional and ESCAPE sites (street sites were 19.2% vs 57.5%, respectively). This was reflected in different distributions of NO2 concentrations (Figure 2). This difference is unlikely to be due to the different monitoring durations (an entire year vs short-term monitoring campaigns) or measurement principles (chemiluminescence vs passive monitoring) (Gulliver et al., 2013). In fact, annual average concentrations estimated using daily regulatory data from the three ESCAPE monitoring periods and calculated using the whole 2010 time series were very highly correlated (r=0.98, mean difference -5.4±2.3 µg/m3). Moreover, the ESCAPE measurements were adjusted using data from a background regulatory station that operated year-round (see Methods).
A second explanation for the poor performance of the transferred model is the different distribution of the predictors at the regional and ESCAPE sites. The density of industries, population and buildings was higher in Verona than in the region (Table A1 in the Appendix). Virtually all values of the predictors entering the model were within the range of values represented at the regional network. The exception was industry in 1,000 m buffers, where, however, truncation to the maximum value only affected 3 (7.5%) sites. Interestingly, some predictors (buildings in 5,000 m buffers and transformed altitude) showed much lower variability at ESCAPE sites. We speculate that differences in NO2 concentrations affected transferability more than differences in predictor values, since most predictors explained a substantial amount of the variability in NO2 concentrations when tested on background sites (Figure A1), despite their low variation. The inclusion of inverse distance to motorways in the model was probably driven by a few street stations that were close to motorways (see Figure 1 and Figure 3, panel E), which is a consequence of the shape of the inverse distance function (Rava et al., 2012). In this view, stations beyond a certain distance from motorways gave virtually no contribution. This is probably the reason why this variable had no predictive power either at all ESCAPE sites (R2 = 0.006) or at background ESCAPE sites (R2=0.002) (Figure A1, panels E1 and E2).
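The shape of the inverse distance function referred to above can be made concrete with a short numerical sketch. The distances are hypothetical and chosen only to show how quickly the two terms decay; nothing here reproduces the fitted model.

```python
def inverse_distance_terms(distance_m):
    """Return (1/d, 1/d^2) for a distance d in metres."""
    return 1.0 / distance_m, 1.0 / distance_m ** 2


def relative_weight(distance_m, reference_m):
    """Ratio of the 1/d term at distance_m to the same term at reference_m."""
    return reference_m / distance_m


if __name__ == "__main__":
    # Hypothetical distances from a monitoring site to the nearest motorway.
    for d in (10.0, 50.0, 100.0, 500.0, 1000.0, 5000.0):
        inv, inv_sq = inverse_distance_terms(d)
        print(f"{d:7.0f} m  1/d = {inv:.5f}  1/d^2 = {inv_sq:.8f}")
```

A site 50 m from a motorway carries 100 times the 1/d value of a site 5,000 m away, so a handful of very close sites can dominate the fitted coefficient.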
Two further reasons for the poor performance of the transferred model were considered. First, industrial stations may be poorly representative of urban areas. Second, since monitoring networks in different provinces within the region are operated by separate ARPAV departments, there may be differences in the data from different areas. The two sensitivity analyses conducted, excluding either industrial stations or stations in the Verona province, seem to rule these hypotheses out.
4.2 Transferability over time
Our findings suggest that the LUR model can be used to estimate NO2 concentrations at the regulatory stations in the 2008-2011 period. This was expected, since annual NO2 concentrations were highly correlated and stable over time. Changes in urbanisation, vehicle technologies or fleet mix, which may cause spatial contrasts in NO2 to change over time (Caballero et al., 2012), are negligible over short periods (Beelen et al., 2007; Madsen et al., 2011). Better estimates were obtained by recalibrating the model (Briggs et al., 2000; Mölter et al., 2010). Indeed, spatial contrasts in air pollution appear to be relatively stable across long periods (Eeftens et al., 2011; Gulliver et al., 2013).
4.3 Limitations
The availability and quality of input data are crucial for the development of any predictive model (Wang et al., 2014). Traffic data, which are the most important input of LUR models (Hoek et al., 2008), were not available for this study. The quality of road classification in the regional network also appeared questionable. Several road attributes (roadway width, administrative classification and technical/functional classification) were tested to identify primary roads other than motorways. However, when mapped, highways showed discontinuous patterns, and they were consequently not offered to the model. Neither building height nor street configurations were available. All in all, this may have contributed to poor model performance, especially at street sites. Cross-validation is known to overestimate model performance (Wang et al., 2012, 2013). However, an independent set of regulatory stations was not available for external validation. Model selection via deletion/substitution/addition algorithms might have prevented over-fitting, which is an issue when LUR models are developed using air pollution data from all the available monitoring sites, even when model performance is evaluated by hold-out validation (Jerrett et al., 2013). However, this model selection strategy was not deemed very appropriate in our study due to the small number of monitoring sites.
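As a minimal sketch of how a leave-one-out cross-validation (LOOCV) R2 is obtained for a linear model, the Python example below refits the model once per site and predicts the site that was left out. The data are synthetic and the predictors, coefficients and noise level are hypothetical; only the number of sites (47) echoes Table 1. The sketch compares the LOOCV R2 with the in-sample R2; it does not reproduce a comparison with an independent validation set, which was not available in this study.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares; returns coefficients with the intercept first."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply intercept-first coefficients to a predictor matrix."""
    return coef[0] + X @ coef[1:]

def r_squared(y, y_hat):
    """R2 as 1 - SSE/SST (an assumption; other definitions exist)."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def loocv_predictions(X, y):
    """Predict each site from a model fitted on all the other sites."""
    n = len(y)
    y_hat = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i              # mask that drops site i
        coef = fit_ols(X[keep], y[keep])
        y_hat[i] = predict(coef, X[i:i + 1])[0]
    return y_hat

rng = np.random.default_rng(1)
n_sites = 47                                  # as in Table 1; data are synthetic
X = rng.normal(size=(n_sites, 5))             # five hypothetical predictors
y = 28 + X @ np.array([4.0, 3.0, -2.0, 1.5, 1.0]) + rng.normal(scale=4.0, size=n_sites)

r2_fit = r_squared(y, predict(fit_ols(X, y), X))
r2_loocv = r_squared(y, loocv_predictions(X, y))

print(f"in-sample R2 = {r2_fit:.3f}")
print(f"LOOCV R2     = {r2_loocv:.3f}")
```

The structure of the predictors is fixed before the loop in this sketch. If variable selection were repeated inside each fold, the resulting estimate would usually be less optimistic.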
4.4 Conclusions
Our model, developed for a region of northern Italy using NO2 regulatory data, was unable to capture small-scale variability in NO2 concentrations at traffic sites when transferred to a city nested within that region. This finding supports previous evidence advising against transferring LUR models to areas with different characteristics and ranges of air pollution concentrations. This includes transferring models from regions with a high proportion of background and rural sites to cities (where traffic sources dominate), even if they are nested within those regions. In such situations, additional monitoring needs to be conducted and a new LUR model constructed for use in epidemiological studies. Transferring models developed for a region could still be useful to maximize exposure gradients when selecting sampling sites for ad hoc monitoring campaigns (Allen et al., 2011; Shmool et al., 2014).
In the centres participating in the GEIRD study where neither more appropriate exposure models nor adequate sets of measurements are available, an alternative approach to studying the association between air pollution exposure and health outcomes would be to use road proximity metrics (Allen et al., 2011; Gulliver et al., 2011) in combination with background NO2 concentrations estimated from models developed for the regions, in analogy to what was done in the ESCAPE study (see, for example, Schikowski et al., 2014).
Abbreviations:
aR2, Adjusted R2
ARPAV, Agenzia Regionale per la Prevenzione e Protezione Ambientale del Veneto
ESCAPE, European Study of Cohorts for Air Pollution Effects
GEIRD, Gene-Environment Interactions in Respiratory Diseases (study)
GIS, Geographic information system
LOOCV, Leave-one-out cross-validation
LUR, Land-use regression
NO2, Nitrogen dioxide
RMSE, Root mean square error
Sources of financial support:
Dr. Marcon is the recipient of a European Respiratory Society Fellowship (STRTF 2014-4173), which supported carrying out this work at Imperial College London. The European Study of Cohorts for Air Pollution Effects has received funding from the European Community’s Seventh Framework Program (FP7/2007-2011) under grant agreement number: 211250.
Acknowledgments:
We thank Ketty Lorenzet and Luca Zagolin, Osservatorio Aria, ARPAV, for double-checking the coordinates of NO2 monitoring stations; Matteo Bellodi, Unità Agenti Fisici, ARPAV, for providing Digital Terrain Model data.
Contributorship statement:
AM, KdH, JG and ALH conceived the idea for this paper and planned the analyses. AM did the GIS and statistical analyses and drafted the manuscript. AM, KdH, JG, RB and ALH discussed the analysis, contributed to manuscript drafting and critically reviewed the manuscript.
Conflicts of Financial Interest:
none.
REFERENCES
Akita Y, Baldasano JM, Beelen R, Cirach M, de Hoogh K, Hoek G, Nieuwenhuijsen M, Serre ML, de Nazelle A. Large scale air pollution estimation method combining land use regression and chemical transport modeling in a geostatistical framework. Environ Sci Technol. 2014;48:4452–9.
Allen RW, Amram O, Wheeler AJ, Brauer M. The transferability of NO and NO2 land use regression models between cities and pollutants. Atmos Environ. 2011;45:369–78.
Beelen R, Hoek G, Fischer P, van den Brandt PA, Brunekreef B. Estimated long-term outdoor air pollution concentrations in a cohort study. Atmos Environ. 2007;41:1343–58.
Beelen R, Hoek G, Vienneau D, Eeftens M, Dimakopoulou K, Pedeli X, Tsai MY, Künzli N, Schikowski T, Marcon A, Eriksen KT, Raaschou-Nielsen O, Stephanou E, Patelarou E, Lanki T, Yli-Tuomi T, Declercq C, Falq G, Stempfelet M, Birk M, Cyrys J, von Klot S, Nádor G, Varró MJ, Dėdelė A, Gražulevičienė R, Mölter A, Lindley S, Madsen C, Cesaroni G, Ranzi A, Badaloni C, Hoffmann B, Nonnemacher M, Krämer U, Kuhlbusch T, Cirach M, de Nazelle A, Nieuwenhuijsen M, Bellander T, Korek M, Olsson D, Strömgren M, Dons E, Jerrett M, Fischer P, Wang M, Brunekreef B, de Hoogh K. Development of NO2 and NOx land use regression models for estimating air pollution exposure in 36 study areas in Europe – The ESCAPE project. Atmos Environ. 2013;72:10–23.
Beelen R, Hoek G, Pebesma E, Vienneau D, de Hoogh K, Briggs DJ. Mapping of background air pollution at a fine spatial scale across the European Union. Sci Total Environ. 2009;407:1852–67.
Beelen R, Voogt M, Duyzer J, Zandveld P, Hoek G. Comparison of the performances of land use regression modelling and dispersion modelling in estimating small-scale variations in long-term air pollution concentrations in a Dutch urban area. Atmos Environ. 2010;44:4614–21.
Briggs DJ, de Hoogh C, Gulliver J, Wills J, Elliott P, Kingham S, Smallbone K. A regression-based method for mapping traffic-related air pollution: application and testing in four contrasting urban environments. Sci Total Environ. 2000;253:151–67.
Caballero S, Esclapez R, Galindo N, Mantilla E, Crespo J. Use of a passive sampling network for the determination of urban NO2 spatiotemporal variations. Atmos Environ. 2012;63:148–55.
Cyrys J, Eeftens M, Heinrich J, Ampe C, Armengaud A, Beelen R, Bellander T, Beregszaszi T, Birk M, Cesaroni G, Cirach M, de Hoogh K, de Nazelle A, de Vocht F, Declercq C, Dėdelė A, Dimakopoulou K, Eriksen K, Galassi C, Gražulevičienė R, Grivas G, Gruzieva O, Gustafsson AH, Hoffmann B, Iakovides M, Ineichen A, Krämer U, Lanki T, Lozano P, Madsen C, Meliefste K, Modig L, Mölter A, Mosler G, Nieuwenhuijsen M, Nonnemacher M, Oldenwening M, Peters A, Pontet S, Probst-Hensch N, Quass U, Raaschou-Nielsen O, Ranzi A, Sugiri D, Stephanou EG, Taimisto P, Tsai MY, Vaskövi É, Villani S, Wang M, Brunekreef B, Hoek G. Variation of NO2 and NOx concentrations between and within 36 European study areas: results of the ESCAPE project. Atmos Environ. 2012;62:374–90.
de Marco R, Accordini S, Antonicelli L, Bellia V, Bettin MD, Bombieri C, Bonifazi F, Bugiani M, Carosso A, Casali L, Cazzoletti L, Cerveri I, Corsico AG, Ferrari M, Fois AG, Lo Cascio V, Marcon A, Marinoni A, Olivieri M, Perbellini L, Pignatti P, Pirina P, Poli A, Rolla G, Trabetti E, Verlato G, Villani S, Zanolin ME. The Gene-Environment Interactions in Respiratory Diseases (GEIRD) Project. Int Arch Allergy Immunol. 2010;152:255-63.
Eeftens M, Beelen R, Fischer P, Brunekreef B, Meliefste K, Hoek G. Stability of measured and modelled spatial contrasts in NO2 over time. Occup Environ Med. 2011;68:765–70.
Gulliver J, de Hoogh K, Fecht D, Vienneau D, Briggs D. Comparative assessment of GIS-based methods and metrics for estimating long-term exposures to air pollution. Atmos Environ. 2011;45:7072–80.
Gulliver J, de Hoogh K, Hansell A, Vienneau D. Development and back-extrapolation of NO2 land use regression models for historic exposure assessment in Great Britain. Environ Sci Technol. 2013;47:7804–11.
Henderson SB, Beckerman B, Jerrett M, Brauer M. Application of land use regression to estimate long-term concentrations of traffic-related nitrogen oxides and fine particulate matter. Environ Sci Technol. 2007;41:2422–28.
Hoek G, Beelen R, de Hoogh K, Vienneau D, Gulliver J, Fischer P, Briggs D. A review of land-use regression models to assess spatial variation of outdoor air pollution. Atmos Environ. 2008;42:7561–78.
Jerrett M, Arain A, Kanaroglou P, Beckerman B, Potoglou D, Sahsuvaroglu T, Morrison J, Giovis C. A review and evaluation of intraurban air pollution exposure models. J Expo Anal Environ Epidemiol. 2005;15:185–204.
Jerrett M, Burnett RT, Beckerman BS, Turner MC, Krewski D, Thurston G, Martin RV, van Donkelaar A, Hughes E, Shi Y, Gapstur SM, Thun MJ, Pope CA. Spatial analysis of air pollution and mortality in California. Am J Respir Crit Care Med. 2013;188:593–9.
Madsen C, Gehring U, Håberg SE, Nafstad P, Meliefste K, Nystad W, Lødrup Carlsen KC, Brunekreef B. Comparison of land-use regression models for predicting spatial NOx contrasts over a three year period in Oslo, Norway. Atmos Environ. 2011;45:3576–83.
Marcon A, Girardi P, Ferrari M, Olivieri M, Accordini S, Bombieri C, Bortolami O, Braggion M, Cappa V, Cazzoletti L, Locatelli F, Nicolis M, Perbellini L, Sembeni S, Verlato G, Zanolin ME, de Marco R. Mild asthma and chronic bronchitis seem to influence functional exercise capacity: a multi-case control study. Int Arch Allergy Immunol. 2013;161:181-8.
Mölter A, Lindley S, de Vocht F, Simpson A, Agius R. Modelling air pollution for epidemiologic research - part II: predicting temporal variation through land use regression. Sci Total Environ. 2010;409:211–7.
Poplawski K, Gould T, Setton E, Allen R, Su J, Larson T, Henderson S, Brauer M, Hystad P, Lightowlers C, Keller P, Cohen M, Silva C, Buzzelli M. Intercity transferability of land use regression models for estimating ambient concentrations of nitrogen dioxide. J Expo Anal Environ Epidemiol. 2009;19:107-117.
Rava M, Crainicianu C, Marcon A, Cazzoletti L, Pironi V, Silocchi C, Ricci P, de Marco R. Proximity to wood industries and respiratory symptoms in children: a sensitivity analysis. Environ Int. 2012;38:37-44.
Ross Z, Jerrett M, Ito K, Tempalski B, Thurston G. A land use regression for predicting fine particulate matter concentrations in the New York City region. Atmos Environ. 2007;41:2255–69.
Schikowski T, Adam M, Marcon A, Cai Y, Vierkötter A, Carsin AE, Jacquemin B, Al Kanani Z, Beelen R, Birk M, Bridevaux PO, Brunekreef B, Burney P, Cirach M, Cyrys J, de Hoogh K, de Marco R, de Nazelle A, Declercq C, Forsberg B, Hardy R, Heinrich J, Hoek G, Jarvis D, Keidel D, Kuh D, Kuhlbusch T, Migliore E, Mosler G, Nieuwenhuijsen MJ, Phuleria H, Rochat T, Schindler C, Villani S, Tsai MY, Zemp E, Hansell A, Kauffmann F, Sunyer J, Probst-Hensch N, Krämer U, Künzli N. Association of ambient air pollution with the prevalence and incidence of COPD. Eur Respir J. 2014;44:614–26.
Shmool JL, Michanowicz DR, Cambal L, Tunno B, Howell J, Gillooly S, Roper C, Tripathy S, Chubb LG, Eisl HM, Gorczynski JE, Holguin FE, Shields KN, Clougherty JE. Saturation sampling for spatial variation in multiple air pollutants across an inversion-prone metropolitan area of complex terrain. Environ Health. 2014;13:28.
Vienneau D, de Hoogh K, Beelen R, Fischer P, Hoek G, Briggs D. Comparison of land-use regression models between Great Britain and the Netherlands. Atmos Environ. 2010;44:688–96.
Wang M, Beelen R, Basagana X, Becker T, Cesaroni G, de Hoogh K, Dedele A, Declercq C, Dimakopoulou K, Eeftens M, Forastiere F, Galassi C, Gražulevičienė R, Hoffmann B, Heinrich J, Iakovides M, Künzli N, Korek M, Lindley S, Mölter A, Mosler G, Madsen C, Nieuwenhuijsen M, Phuleria H, Pedeli X, Raaschou-Nielsen O, Ranzi A, Stephanou E, Sugiri D, Stempfelet M, Tsai MY, Lanki T, Udvardy O, Varró MJ, Wolf K, Weinmayr G, Yli-Tuomi T, Hoek G, Brunekreef B. Evaluation of land use regression models for NO2 and particulate matter in 20 European study areas: the ESCAPE project. Environ Sci Technol. 2013;47:4357–64.
Wang M, Beelen R, Bellander T, Birk M, Cesaroni G, Cirach M, Cyrys J, de Hoogh K, Declercq C, Dimakopoulou K, Eeftens M, Eriksen KT, Forastiere F, Galassi C, Grivas G, Heinrich J, Hoffmann B, Ineichen A, Korek M, Lanki T, Lindley S, Modig L, Mölter A, Nafstad P, Nieuwenhuijsen MJ, Nystad W, Olsson D, Raaschou-Nielsen O, Ragettli M, Ranzi A, Stempfelet M, Sugiri D, Tsai MY, Udvardy O, Varró MJ, Vienneau D, Weinmayr G, Wolf K, Yli-Tuomi T, Hoek G, Brunekreef B. Performance of Multi-City Land Use Regression Models for Nitrogen Dioxide and Fine Particles. Environ Health Perspect. 2014;122:843-9.
Wang M, Beelen R, Eeftens M, Meliefste K, Hoek G, Brunekreef B. Systematic evaluation of land use regression models for NO₂. Environ Sci Technol. 2012;46:4481–9.
Wang M, Gehring U, Hoek G, Keuken M, Jonkers S, Beelen R, Eeftens M, Postma DS, Brunekreef B. Air Pollution and Lung Function in Dutch Children: A Comparison of Exposure Estimates and Associations Based on Land Use Regression and Dispersion Exposure Modeling Approaches. Environ Health Perspect. 2015 Apr 3. http://dx.doi.org/10.1289/ehp.1408541.
WEB REFERENCES
http://dati-censimentopopolazione.istat.it/
TABLES
Table 1: Distribution of percent data capture and annual average NO2 concentrations at the selected^a regulatory stations in Veneto in 2010.
| Type of station | N (%) | Data capture (%), mean±SD | NO2 concentrations, mean±SD (µg/m3) | NO2 concentrations, median (min, max) (µg/m3) | NO2 concentrations, range/mean (%) |
|---|---|---|---|---|---|
| Rural background | 11 (23.4) | 92.9±2.9 | 16.4±5.3 | 16.9 (7.3, 26.7) | 118 |
| Urban background | 22 (46.8) | 94.2±4.7 | 29.9±5.9 | 29.2 (21.0, 40.0) | 64 |
| Industrial | 5 (10.6) | 94.4±2.2 | 30.0±2.6 | 30.2 (26.4, 33.3) | 23 |
| Street | 9 (19.2) | 94.3±2.2 | 39.8±6.0 | 39.1 (29.2, 46.8) | 44 |
| All sites | 47 | 93.9±3.7 | 28.6±9.5 | 28.9 (7.3, 46.8) | 138 |
^a monitoring stations with <75% data capture or located at <5 km from the regional border were excluded
Table 2: regression coefficients of the LUR model developed using regulatory NO2 data for 2010.
| Predictor | Estimate | SE | t | p-value | VIF | R2 (aR2)^a |
|---|---|---|---|---|---|---|
| Intercept | 16.91269 | 2.177982 | 7.77 | <0.001 | | |
| Buildings (5,000) | 0.0007552 | 0.0001349 | 5.60 | <0.001 | 1.18 | 0.45 (0.44) |
| Industry (1,000) | 9.15e-06 | 2.57e-06 | 3.56 | 0.001 | 1.12 | 0.61 (0.59) |
| Transformed altitude | -11.05696 | 3.519141 | -3.14 | 0.003 | 1.14 | 0.68 (0.66) |
| Length of roads (100) | 0.0099249 | 0.0041146 | 2.41 | 0.020 | 1.06 | 0.72 (0.70) |
| 1/distance to motorways | 1611.996 | 731.9124 | 2.20 | 0.033 | 1.16 | 0.75 (0.72) |
^a R2 (and aR2) of the model with the predictor plus all the predictors that had previously entered the model
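The coefficients in Table 2 define a linear prediction equation: the intercept plus the sum of each coefficient multiplied by its predictor value. The Python sketch below applies that equation to one site. The coefficients are copied from Table 2, whereas the dictionary keys, the function name and the predictor values of the example site are hypothetical, chosen only for illustration. The units of each predictor and the altitude transformation are not given in this section, so the example values should not be read as realistic.

```python
# Coefficients copied from Table 2 (LUR model for 2010).
# The key names are illustrative labels, not identifiers from the study.
COEFFICIENTS = {
    "intercept": 16.91269,
    "buildings_5000": 0.0007552,        # buildings, 5,000 buffer
    "industry_1000": 9.15e-06,          # industry, 1,000 buffer
    "transformed_altitude": -11.05696,  # transformation not specified here
    "road_length_100": 0.0099249,       # length of roads, 100 buffer
    "inv_distance_motorway": 1611.996,  # 1 / distance to motorways
}

def predict_no2(site):
    """Return the predicted annual NO2 (µg/m3) for a dict of predictor values.

    Predictors missing from `site` are treated as zero.
    """
    prediction = COEFFICIENTS["intercept"]
    for name, value in site.items():
        prediction += COEFFICIENTS[name] * value
    return prediction

# Hypothetical predictor values for a single site.
example_site = {
    "buildings_5000": 20000.0,
    "industry_1000": 100000.0,
    "transformed_altitude": 0.5,
    "road_length_100": 300.0,
    "inv_distance_motorway": 1.0 / 2000.0,
}

example_prediction = predict_no2(example_site)
print(f"predicted NO2 = {example_prediction:.2f} µg/m3")
```

The negative coefficient for transformed altitude means that, with the other predictors held fixed, a higher transformed altitude lowers the predicted concentration.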
Table 3: annual average NO2 concentrations (µg/m3) at the 41 regulatory monitoring stations in Veneto that were active throughout the 2008-2011 period.
| Type of station | N. | 2008 | 2009 | 2010 | 2011 | 2008-11 |
|---|---|---|---|---|---|---|
| Rural background | 10 | 19.4 (7.4) | 18.1 (6.4) | 17.3 (4.6) | 16.9 (6.0) | 17.9 (5.8) |
| Urban background | 20 | 32.7 (7.3) | 31.4 (6.1) | 30.1 (5.9) | 31.6 (6.7) | 31.4 (6.2) |
| Industrial | 3 | 32.8 (4.8) | 31.5 (2.8) | 30.7 (2.3) | 30.7 (1.2) | 31.5 (2.8) |
| Street | 8 | 44.3 (6.6) | 42.2 (6.5) | 39.5 (6.4) | 39.1 (9.6) | 41.3 (6.9) |
| All stations | 41 | 31.7 (10.8) | 30.3 (10.0) | 28.9 (9.3) | 29.4 (10.3) | 30.1 (9.9) |
FIGURES
For colour reproduction on the Web:
Figure 1: map of the study area.*
* panel A, Italy, with the Veneto region marked in light blue; panel B, Veneto region, with the Verona province (hatched area) and the Verona municipality (white inner area) marked; panel C, Verona municipality. Brown lines represent motorways, circles and triangles represent regulatory and ESCAPE sites, respectively. Green, yellow, grey, and red symbols represent rural background, urban background, industrial, and street type sites, respectively.
For black-and-white reproduction in print:
Figure 1: map of the study area.*
* panel A, Italy, with the Veneto region marked in grey; panel B, Veneto region, with the Verona province (hatched area) and the Verona municipality (white inner area) marked; panel C, Verona municipality. Thick grey lines represent motorways, circles and triangles represent regulatory and ESCAPE sites, respectively. White, grey, dotted grey, and black symbols represent rural background, urban background, industrial, and street type sites, respectively.
Figure 2: distribution of measured NO2 concentrations at the ESCAPE and regulatory sites.
[Plot not reproduced. Y-axis: NO2 concentrations (µg/m3), 0 to 100; x-axis: site type, shown separately for ESCAPE sites (RB, UB, S) and regulatory sites (RB, UB, I, S).]
RB, rural background; UB, urban background; S, street; I, industrial
Figure 3: cumulative distribution of LUR predictors at the regulatory stations (plus symbols) and at ESCAPE sites (circles). A, buildings in 5,000 m; B, industry in 1,000 m; C, transformed altitude; D, length of roads in 100 m; E, 1/distance to motorways.
[Plot not reproduced. Panels A to E; x-axis: predictor values*, from -2 to 6.]
* to fit all distributions on one single graph for comparative purposes, each predictor was centred and standardized on its distribution at the regulatory stations (ST), as follows: $x_{std} = \dfrac{x - \mathrm{mean}(x_{ST})}{\mathrm{SD}(x_{ST})}$. As a consequence, a value of 0 corresponds to the mean value of the predictor at the monitoring stations, and a 1-unit difference corresponds to a 1-SD difference.
Figure 4: comparison of NO2 concentrations predicted by the model transferred to ESCAPE sites with measured concentrations. Panel A, all ESCAPE site types (n=40); panel B, only rural and urban background sites (n=17).*
[Plot not reproduced. Both panels show measured NO2 (µg/m3) against predicted NO2 (µg/m3); R2=0.18 in panel A and R2=0.52 in panel B.]
* hollow, grey and black circles indicate rural background, urban background and street sites, respectively
Figure 5: comparison of R2 values across the models applied to years 2008, 2009, 2011, and to 2008-11.*
[Plot not reproduced. R2 (scale 0.55 to 0.80) of the uncalibrated and recalibrated models for each period: 2008 (43), 2009 (46), 2010 (47), 2011 (52), 2008-11 (41).]
* the numbers in brackets close to the period represent the number of stations available. The performance of the model developed for 2010 is represented by a solid grey line.
Highlights
• Land-use regression (LUR) is often used to estimate urban air pollution exposure
• No studies have looked at transferability of LUR models from regions to cities
• We developed a LUR model using NO2 regulatory data for a region of Italy for 2010
• When transferred to an inner city, the model was unable to capture NO2 variability
• LUR models should not be transferred to nested areas with different characteristics