Atmospheric Environment 45 (2011) 3613–3620
Sampling design for air quality measurement surveys: An optimization approach

Thomas Romary a,*, Chantal de Fouquet a, Laure Malherbe b

a Mines ParisTech, Equipe Géostatistique, Centre de Géosciences, 35 rue Saint Honoré, 77305 Fontainebleau, France
b Institut National de l'Environnement Industriel et des Risques (INERIS), Direction des risques chroniques, Parc Technologique Alata, 60550 Verneuil-en-Halatte, France
Article history: Received 23 August 2010; received in revised form 25 March 2011; accepted 28 March 2011.

Abstract
Measurement surveys using passive diffusion tubes are regularly carried out to produce atmospheric concentration maps over various areas. Sampling schemes must be designed to characterize both contaminant concentrations (of benzene or nitrogen dioxide, for example) and their relations to environmental variables, so as to obtain pollution maps that are as precise as possible. The concentration variable is interpolated by external drift kriging, with the help of exhaustively known covariates. The quality of a sampling scheme is quantified by the spatially averaged external drift kriging variance, which incorporates the drift estimation error as well as the spatial interpolation error. A weighted criterion is also introduced. Optimizing this criterion by simulated annealing provides an optimal sampling scheme. A preliminary study is performed on concentration data available from previous surveys in two different agglomerations, so as to determine the covariance model and the relevant covariates to be used on a third agglomeration. It does not reveal a single model but a whole parametrized family of relevant models. The method is then applied to the third agglomeration with different values of the model parameters. The results are discussed and finally compared to those obtained with a pragmatic approach described in a previous paper. © 2011 Elsevier Ltd. All rights reserved.
Keywords: Optimal design; Geostatistics; External drift kriging; Simulated annealing; Sampling scheme
1. Introduction

Mapping air pollution as precisely as possible is a major issue for French Local Air Quality Monitoring Agencies (the AASQAs), both for regulatory and information purposes and for public health concerns. Seasonal or annual average concentration maps can be obtained from passive sampling data collected at a large number of sites across the area of interest. The AASQAs regularly carry out such sampling surveys over various areas at various scales. They therefore have to design sampling schemes so that the resulting concentration maps will fulfil precision criteria. The interpolation is performed by kriging; see e.g. Chilès and Delfiner (1999), and de Fouquet et al. (2007) for an application in atmospheric sciences. With its internal quantification of spatial variability through the covariance function (or variogram), the kriging methodology can produce maps of optimal predictions and associated prediction error variances from incomplete and possibly noisy spatial data. Several papers in air pollution research have addressed the problem of designing sampling schemes for surveys or pollution monitoring/measurement networks. We can quote Kanaroglou
* Corresponding author. E-mail address: [email protected] (T. Romary).
doi:10.1016/j.atmosenv.2011.03.063
et al. (2005), Abidaa et al. (2008) and Wu and Bocquet (2011), among others. However, to our knowledge, none took advantage of the simple quantification of the quality of the resulting maps provided by the kriging variance. Various methods have been proposed in the statistical literature for the problem of collecting spatial data; see Müller (2001) or De Gruijter et al. (2006) for a review. In this work, we aim to design a sampling scheme for a fixed number of locations, taking into account the information available from previous surveys, in particular the information provided by covariates. The methodology developed here for designing sampling schemes for air quality measurement surveys was proposed by Brus and Heuvelink (2007). It relies on the definition of a quality criterion of the sampling design, defined from the external drift kriging variance (e.g. Chilès and Delfiner, 1999). This criterion is then optimized by simulated annealing (Kirkpatrick et al., 1983). What is appealing in this method is its relative automaticity and the optimality of the resulting design, in a sense we discuss further in Section 2. In the same section, we also propose a new way to generate solutions along the algorithm, making it possible to directly explore the space of auxiliary variables, or feature space. This new proposal distribution, or solution generator, helps to better explore the space of possible designs and hence to optimize the criterion faster.
Since the criterion to be optimized relies on the external drift kriging variance, the method requires knowing beforehand which auxiliary variables are to be used and what the covariance or variogram of the residual is. The proposed methodology therefore consists of two steps:

1. A geostatistical analysis of the data collected during previous surveys is performed in order to set up the model (covariates and covariance) that will be used when applying the optimization method to another area. Data from benzene sampling campaigns conducted in two French cities (Lille and Reims) are used in this part. This analysis is briefly described in Section 3.
2. Once the model has been clearly identified, the optimization method is applied to the newly investigated area. In our study, a third French city (Bordeaux) is taken as the application case. We discuss the results in Section 4.

Finally, in Section 5, we compare the results obtained to those given by the methodology described in de Fouquet and Faucheux (2009) and discuss the pros and cons of both methodologies.

2. Methodology

2.1. Modelling

Benzene emissions in the outdoor environment are due to human activity, principally road traffic. Consequently, covariates such as land use information and population density may be good predictors of benzene concentrations at the agglomeration scale. However, a model based only on those predictors would not account for the diffusion of the pollutant in the air at a local scale, and a regression model alone will therefore not be sufficient to fully explain the concentrations. A spatially correlated term is thus added to the regression model. We denote by c the study domain. The model for atmospheric benzene concentration is then defined by the following equation:

Z(x) = b0 + Y′(x)b + S(x),   (1)

where Z is the benzene concentration variable, x ∈ c ⊂ R² is the spatial coordinate, Y is the matrix of auxiliary variables exhaustively known on c, ′ is the transpose operator, b is a vector of parameters and S(x) is a centred, spatially correlated residual. This model is simply a drift plus a spatially correlated residual. Given a measurement sample Z = (Z(x1), ..., Z(xn)), Y the matrix of auxiliary variables at the observation locations and Y0 the vector of auxiliary variables at the prediction location, we can estimate Z(x0), x0 ∈ c and x0 ∉ {x1, ..., xn}, by external drift kriging:

Ẑ(x0) = (c0 + Y (Y′C⁻¹Y)⁻¹ (Y0 − Y′C⁻¹c0))′ C⁻¹ Z,   (2)

where C is the covariance matrix of S = (S(x1), ..., S(xn)) and c0 is the vector of covariances between the residuals at the observation and prediction locations. The associated prediction variance is given by the formula:

V(Z(x0) − Ẑ(x0)) = V(Z(x0)) − c0′ C⁻¹ c0 + (Y0 − Y′C⁻¹c0)′ (Y′C⁻¹Y)⁻¹ (Y0 − Y′C⁻¹c0),   (3)

where V(Z(x0)) is the point variance, which may vary over the domain c. The external drift kriging variance incorporates both the prediction error variance of the residual (first two terms on the right-hand side of (3)) and the estimation variance of the drift (third term on the right-hand side of (3)), although the drift is only estimated implicitly in (2). External drift kriging is often referred to as universal kriging in the literature. Fundamentally, universal kriging corresponds to a drift built from basis functions (splines, wavelets; see Chilès and Delfiner (1999) for more details) and external drift kriging to the more general case of a drift based on auxiliary variables.

2.2. Criterion building

The prediction variance (3) does not depend on the data values but only on the sampling design h, through c0, C and Y. By minimizing (3) with respect to the sampling scheme, averaged over c, one automatically obtains the right balance between optimization of the sample pattern in geographic and in feature space (Brus and Heuvelink, 2007). Consequently, the criterion to be optimized is the prediction error variance averaged over the domain of interest:

O(h) = (1/|c|) ∫_c V(Z(x) − Ẑ(x)) dx,   (4)

where |c| is the area of c. This integral is evaluated on a regular grid discretizing the domain c. Note that this criterion is proportional to the point variance of Z; hence the optimal design (according to (4)) is independent of a multiplicative factor of the covariance function. The criterion may be modified according to the purpose of the study. To obtain a more accurate map in some areas, for instance as a function of the auxiliary variable values, we can combine O(h) with a weighting function w(x), which leads to:

O_w(h) = (1/|c|) ∫_c V(Z(x) − Ẑ(x)) w(x) dx.   (5)

O_w, where w stands for weighted, thus weights the prediction error variance by the factor w(x). The design resulting from the minimization of O_w will show an increased point density in areas where w takes high values, ensuring kriging maps with locally increased precision. This will be used in Section 4, where the weighting function gives a weight ten times higher to densely populated and built-up areas.
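As a concrete illustration, the kriging variance (3) and the averaged criterion (4) can be evaluated numerically. The sketch below is a minimal Python implementation; all function and variable names are ours, and the drift matrix is assumed to include a column of ones for the intercept b0.

```python
import numpy as np

def edk_variance(x0, y0, X, Y, cov, point_var=1.0):
    """Prediction error variance of external drift kriging, eq. (3).
    x0: prediction location; y0: drift values at x0 (incl. the intercept);
    X: (n, 2) sample locations; Y: (n, p) drift matrix at the samples;
    cov: residual covariance function of two locations."""
    n = len(X)
    C = np.array([[cov(X[i], X[j]) for j in range(n)] for i in range(n)])
    c0 = np.array([cov(x, x0) for x in X])
    Ci = np.linalg.inv(C)
    # residual interpolation part: V(Z(x0)) - c0' C^-1 c0
    v = point_var - c0 @ Ci @ c0
    # drift estimation part: u' (Y' C^-1 Y)^-1 u, with u = Y0 - Y' C^-1 c0
    u = y0 - Y.T @ Ci @ c0
    return v + u @ np.linalg.inv(Y.T @ Ci @ Y) @ u

def criterion(design, grid, drift, cov):
    """Spatially averaged kriging variance O(h), eq. (4), over a grid.
    design: indices of the sample points within the grid."""
    X, Y = grid[design], drift[design]
    return float(np.mean([edk_variance(grid[k], drift[k], X, Y, cov)
                          for k in range(len(grid))]))
```

At a sampled location the variance vanishes, as expected for exact interpolation; elsewhere it combines the residual and drift estimation errors, so the average is strictly positive.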
2.3. Simulated annealing

The optimization of (4) is problematic in practice: it is impossible to compute O(h) for all combinations of sample points on the grid discretizing the domain and then choose the design that minimizes the criterion (Van Groenigen et al., 1999). A gradient-based optimization method cannot be applied either, since O(h) is clearly not differentiable with respect to h. A global optimization method therefore needs to be implemented to minimize the criterion. As proposed in Brus and Heuvelink (2007), we will use simulated annealing (SA). The SA algorithm was introduced by Metropolis et al. (1953) and generalized by Kirkpatrick et al. (1983) for optimization problems. It can be applied to both optimization and simulation problems (Robert and Casella (2004), and references therein). The fundamental idea of this algorithm is that a change of scale, named temperature, allows larger moves on the surface to be optimized and therefore theoretically avoids remaining trapped in a local minimum.
The name and inspiration of SA come from annealing in metallurgy, a technique involving heating and controlled cooling of a material to increase the size of its crystals and reduce their defects. The heat causes the atoms to become unstuck from their initial positions (a local minimum of the internal energy) and wander randomly through states of higher energy; the slow cooling gives them more chances of finding configurations with lower internal energy than the initial one. Practically, SA is an iterative optimization algorithm in which a sequence of states (ht) is generated by deriving each new solution from a random modification of the previous state. Each time a new combination is generated, the quality measure is evaluated and compared with the value of the previous state. If the quality measure has been improved by the change, the new state is systematically accepted. Otherwise, the simulated annealing algorithm may still accept a change that worsens the criterion, with a given probability; this helps to avoid remaining trapped in a local minimum. The probability of accepting a worsening state hp from the current state ht is given by:
exp((O(ht) − O(hp)) / T),   (6)
where T is the system temperature. SA is performed by gradually lowering the temperature T from a high value towards zero. Expression (6) shows that, at a given temperature, the larger the increase of O, the smaller the probability of acceptance. As the temperature gets close to zero, moves worsening the criterion are systematically rejected, ensuring that the generated sequence stays in states with low O. The temperature remains constant during a fixed number of transitions and is then decreased, which makes the acceptance probability gradually decrease as the iterations continue. Heuristic rules are generally applied to ensure the validity of SA: the starting temperature must be high enough and its decrease slow enough. A convergence result exists, for optimization purposes, under the condition that T decreases as c/log(n) for some constant c (e.g. Robert and Casella, 2004). In practice, a geometrically decreasing sequence is generally used: Tt+1 = aTt, where a is a constant close to, and smaller than, 1. Besides this parameter, three more parameters need to be specified by the user: the initial temperature T0; the number of (accepted) transitions during which the temperature is kept constant; and a stopping criterion. Kirkpatrick et al. (1983) provide a way to set T0: it should be chosen such that the average probability of accepting a worsening state equals 0.8. Kirkpatrick (1984) proposes an algorithm to set it automatically; otherwise, it can be set by trial and error. The number of transitions at a given temperature can be fixed, but it may also be specified in terms of the number of accepted transitions. The stopping criterion may simply be a fixed total number of iterations or a fixed final temperature, or alternatively one may decide to stop the iterations upon lack of progress (i.e., when the quality measure has not improved over many tries). The mechanism generating a new solution by random perturbation of the previous one is a key ingredient of SA.
In the literature it is found under the name solution generator or proposal distribution. In Van Groenigen et al. (1999) and Brus and Heuvelink (2007), a new solution is generated by moving one randomly selected sample point of h along a vector of random direction and of random length, drawn uniformly between 0 and hmax. The upper bound hmax is decreased as the iterations continue.
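The SA loop just described, Metropolis acceptance (6) with geometric cooling, can be sketched as follows. This is an illustrative generic minimizer; the function names, parameters and defaults are ours, not the values used in the paper.

```python
import math
import random

def simulated_annealing(h0, objective, propose, t0=1.0, alpha=0.9,
                        n_per_temp=50, n_temp=50, seed=0):
    """Minimize objective(h) by simulated annealing.
    propose(h, rng) returns a random perturbation of the current state;
    the temperature is kept constant for n_per_temp transitions, then
    multiplied by alpha (geometric cooling)."""
    rng = random.Random(seed)
    h, o = h0, objective(h0)
    best, best_o = h, o
    t = t0
    for _ in range(n_temp):
        for _ in range(n_per_temp):
            hp = propose(h, rng)
            op = objective(hp)
            # accept improvements; accept worsenings with prob. exp((o - op)/t)
            if op <= o or rng.random() < math.exp((o - op) / t):
                h, o = hp, op
                if o < best_o:
                    best, best_o = h, o
        t *= alpha
    return best, best_o
```

For instance, minimizing (x − 3)² from x = 10 with Gaussian moves recovers a value near the optimum, while occasionally accepting uphill moves at high temperature.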
Here we propose a new way to move the selected point. Rather than only shifting the point according to its position in the domain, we also propose to move it according to its position in the cloud of covariates. That is, with probability p, we propose a move as in Van Groenigen et al. (1999); otherwise, the selected point is replaced by a randomly chosen point among its n nearest neighbours in the covariates cloud. Again, the number n is decreased along the iterations. This helps to explore the covariates cloud faster and therefore accelerates the convergence. Indeed, a specific problem of the simulated annealing algorithm with the solution generator of Van Groenigen et al. (1999) is its behaviour at low temperatures: only slight moves in the spatial domain can be proposed as the search radius decreases with the temperature. This may prevent the algorithm from proposing criterion-improving moves when the objective function values are already low, in particular moves towards interesting points in the covariates cloud, which can become inaccessible owing to the decrease of the search radius. Proposing moves towards neighbours in the covariates cloud allows the points to perform large moves over the spatial domain while reaching more interesting configurations in the covariates cloud. Moreover, even at high temperatures, the new solution generator performs a faster exploration of the space of configurations, browsing both the spatial domain and the covariates space in a balanced way. To sum up, denoting by q1 the proposal distribution that moves a point in the coordinate space and by q2 the one that moves a point in the covariates cloud, the proposal distribution takes the following form:
q(h1, h2) = p q1(h1, h2) + (1 − p) q2(h1, h2).   (7)
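A possible implementation of the mixture (7), with q1 a bounded random shift in the spatial domain (as in Van Groenigen et al., 1999) and q2 a jump to one of the n nearest neighbours in the covariates cloud, is sketched below; the names and default values are ours.

```python
import numpy as np

def make_proposal(grid, covariates, p=0.5, h_max=20.0, n_nb=30):
    """Return a proposal function implementing the mixture (7).
    grid: (N, 2) candidate locations; covariates: (N, q) covariate values.
    Duplicate design points are not checked for, for brevity."""
    def propose(design, rng):
        design = np.array(design)
        i = rng.integers(len(design))              # pick one sample point
        if rng.random() < p:
            # q1: shift the point by a random vector of length <= h_max,
            # then snap to the nearest grid node
            angle = rng.uniform(0.0, 2.0 * np.pi)
            step = rng.uniform(0.0, h_max)
            target = grid[design[i]] + step * np.array([np.cos(angle),
                                                        np.sin(angle)])
            design[i] = int(np.argmin(np.linalg.norm(grid - target, axis=1)))
        else:
            # q2: replace the point by one of its n_nb nearest neighbours
            # in the covariates cloud (index 0 is the point itself)
            d = np.linalg.norm(covariates - covariates[design[i]], axis=1)
            design[i] = int(rng.choice(np.argsort(d)[1:n_nb + 1]))
        return design
    return propose
```

In a full implementation, h_max and n_nb would be decreased along the iterations, as described above.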
This is a mixture of the two proposal distributions q1 and q2. Several runs were carried out to tune the parameter p, trying out different values; we eventually set it to 0.5. On average, q therefore proposes as many moves in the spatial domain (q1) as in the covariates cloud (q2).

3. Geostatistical analysis

Most of the parameters of model (1), namely all but b, must be known in order to perform the optimization by SA. Indeed, to compute O(h) for a given design h, the covariance function or variogram of the residual S and the relevant covariates Y have to be known. Adding a covariate or modifying the covariance model may have a strong impact on the resulting optimized design. Therefore, before applying the method, a preliminary geostatistical study has to be carried out carefully to determine the covariates and the covariance function of the variable to be studied, the paradox being that this variable has not been observed yet. In practice, those parameters have to be guessed from the information available on the studied phenomenon. This information may be related to the physics of the phenomenon or may come from previously collected datasets.

For this study, data from previous surveys are available. They were collected in two different agglomerations, Lille and Reims, two cities in the north of France. Two measurement surveys performed in Lille at different times with the same sampling scheme are available, and four in Reims. This preliminary data analysis helps to determine the covariates to use in (1): urban fabric proportion and population density. Their translated logarithms are found to be significantly linearly related to the benzene concentration, with great explanatory power for both agglomerations: the determination coefficients R² are always greater than 0.9. Concerning the spatial structure of the residual, the fitted variogram consists of two components for all surveys: a linear component and a nugget effect. However, the proportion of nugget effect with respect to the linear component of the variogram, as well as the slope of the linear component, changed slightly from one survey to another. The impact of these uncertainties will be studied in the following section.

3.1. Non-stationary variance modelling

In the case of Lille, a non-stationary point variance was detected: the variance of the residual obtained from the regression varies over the domain. Fortunately, it can be modelled using the translated logarithm of the population density, as illustrated by Fig. 1, which represents the benzene concentration averaged over the two Lille surveys as a function of the translated logarithm of the population density. The variability of the concentration is clearly far greater in highly populated areas. More precisely, the residual variance is about two times higher in areas where the translated logarithm of the population density is greater than its median (dotted line). To account for this non-stationarity of the variance (or heteroscedasticity), a weighted regression is estimated, where the weight given to each data point is inversely proportional to its variance (in substance, data are given a relative weight of 2 or 1 depending on whether the associated population density value is lower or greater than the median).

Fig. 1. Mean benzene concentration as a function of the population density (Lille).

Fig. 2 shows the fitted (thick line) and experimental variogram (circle radii proportional to the square root of the number of pairs involved in the computation of each value) computed for the residuals resulting from the weighted regression, taking the proportional effect into account. The residuals of a regression computed by ordinary least squares exhibited no structure: they presented high increment values for close pairs, inducing high values of the empirical variogram at small distances and thus hiding the spatial structure. Taking the varying point variance into account allowed the residual structure to be detected.

Fig. 2. Experimental and fitted variogram of the residuals (Lille, average of two surveys).

Finally, we do not have to cope with a single model but with a parametrized family of models, the parameters being the nugget effect, the slope of the linear component of the variogram and the parameters modelling the proportional effect. It seems difficult, though, to determine a single model on which the methodology described in Section 2 can be carried out. Therefore, we will test different values of the model parameters and observe and discuss the differences between the resulting optimal designs.

4. Application

We recall that the aim is to generate a design for a measurement survey of benzene concentration over the Bordeaux agglomeration. The domain size is 38 km × 45 km; it is discretized on a 184 × 154 grid with a 250 m step. As we did not succeed in proposing a unique model for benzene concentration from the available data, the proposed methodology is applied with two different models, corresponding to the bounding values of the covariance parameters observed in the geostatistical analysis, and the results obtained are compared. For both models, the translated logarithms of the urban fabric proportion and of the population density are taken as covariates, according to the findings of the preliminary analysis of Section 3. A proportional effect equivalent to that observed in Lille has been considered, since the urban typology of Bordeaux is closer to that of Lille than to that of Reims. The covariance function C of the residual S takes the following form:

C(|h|) = s1 (1 + 1{Y1(x) > m1}) 1{|h| = 0} + s2 (1 − |h|/R),   (8)

where R = max_{x,y ∈ c} |x − y|, Y1 stands for log(1 + population density), and m1 = 4.7 is its median over the domain. The nugget effect is thus two times higher in areas where the population density is greater than its median. The covariance model consists of two components: a spatially varying nugget effect and a linear component¹. As remarked in Section 2.2, the criterion O(h) is insensitive to a modification of the covariance or variogram by a multiplicative factor, in other words to a multiplicative factor of the point variance of the residual. We will therefore vary only the ratio s1/s2, keeping the point variance V(x) equal to 1 where log(1 + population density) is lower than its median. Hence, the quantity that varies is the proportion of variance governed by either the nugget effect or the linear variogram, and more precisely the ratio of those two proportions. This leaves a single parameter, the ratio s1/s2.

¹ Note that (8) is what is called a generalized covariance function: there is no covariance model associated with the linear variogram, as it corresponds to a non-stationary model. However, when considered on a bounded domain, a generalized covariance function can be built to simplify the calculations (see Chilès and Delfiner, 1999, for more details).
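Under these assumptions, the generalized covariance (8) can be written directly in code; the function name is ours, and s1, s2, R and the population-density function are placeholders to be set from the fitted model.

```python
import numpy as np

def cov_model(x, y, log_density, m1=4.7, s1=0.5, s2=0.5, R=50.0):
    """Generalized covariance (8): a spatially varying nugget (doubled where
    log(1 + population density) exceeds its median m1) plus the generalized
    covariance s2 * (1 - |h| / R) of a linear variogram on a bounded domain."""
    h = np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    nugget = s1 * (1.0 + float(log_density(x) > m1)) * float(h == 0.0)
    return nugget + s2 * (1.0 - h / R)
```

With s1 = s2 = 0.5 (ratio s1/s2 = 1), the point variance is 1 below the median and 1.5 above it, reproducing the proportional effect described above.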
This ratio describes the variogram variations and will ease the uncertainty analysis. We considered two values of s1/s2, 1 and 4. A value of s1/s2 equal to 1 means that the nugget component equals 0.5 and that the slope of the linear component is equal to 0.5 divided by the maximum length in the domain considered. Those variogram models correspond to the bounds of the situations encountered in Section 3.

The SA algorithm was run for 10,000 iterations, with an initial temperature chosen such that the initial acceptance rate of worsening proposals was greater than 0.8. The temperature was lowered geometrically with a = 0.8. The initial design was generated randomly, and both runs were initialized with the same seed. The algorithm was implemented in R (R Development Core Team, 2010) and took about 10 h to run on a desktop computer.

Figs. 3 and 4 each consist of two subfigures. On the left (a), the best design is represented by black circles in the spatial domain together with the urban fabric proportion, which gives an idea of the shape of the agglomeration. The right subfigure (b) represents the same best design in the cloud of auxiliary variables, again as black circles, while grey crosses represent all covariate couples.

Fig. 3. Optimal design for s1/s2 = 1, in the spatial domain (a) and in the covariates cloud (b).

Both runs seem to have converged, as the criterion curve showed a decreasing trend until reaching a floor around the minimal energy found. The empirical acceptance probability also exhibited a satisfactory shape, decreasing from values close to 1, thus performing a good initial exploration, to much smaller values (around 0.05 over the last hundreds of iterations), meaning that new transitions were almost systematically rejected by the end of the run and therefore that an optimal (at least locally) value had been reached.

The best designs obtained in the two cases show quite different patterns, while their representations in the covariates cloud are quite similar. When the linear structure is prominent (s1/s2 = 1, Fig. 3a), the best design exhibits a relatively uniform spatial coverage with a higher density of points in the centre of the agglomeration. In the other case (s1/s2 = 4, Fig. 4a), the best design allows larger distances between points, whereas there are more points in the centre of the agglomeration, where more contrast is found in the values of the covariates. In that second case, the spatial structure is weaker and hence more weight is given to the drift estimation variance term in the criterion. This explains why the best design presents clusters of points while allowing larger areas without any point. The repartitions in the covariates cloud are however similar: the points are located at the extremities of the cloud and also near the line defining the non-stationary variance. This behaviour was expected, as it is well known that the best design to estimate a linear regression by ordinary least squares is to place points at each extremity of the regression line; it is therefore logical to obtain a similar pattern in our optimal design. As the drift is implicitly estimated by weighted least squares, additional points near the limit defining the non-stationary variance have been generated, with an equal number of points on both sides. The only difference between the two representations in the covariates cloud is the presence of three points close to the centre of the cloud in Fig. 3b. As the correlation structure is stronger in that case, the negative impact of those points on the drift estimation variance is less important.

Nevertheless, the designs shown in Figs. 3 and 4 do not fulfil all requirements of the French Local Air Quality Monitoring Agencies. Indeed, the World Health Organization defines thresholds beyond which benzene is considered carcinogenic. Besides, an annual limit value for the protection of human health (5 μg m⁻³) is set by the European legislation on ambient air quality (Directive 2008/50/EC).
As the agencies' missions are mainly directed towards population health, the kriging maps should be more accurate in areas where this threshold is likely to be exceeded. As benzene emissions are mainly caused by traffic, a higher weight should be assigned in the criterion to densely populated (Y1) and highly built-up areas
Fig. 4. Optimal design for s1/s2 = 4, in the spatial domain (a) and in the covariates cloud (b).
Fig. 5. Optimal design for s1/s2 = 1, in the spatial domain (a) and in the covariates cloud (b) (weighted criterion).
(Y2). That is why the following weighting function is now introduced in the criterion (5):

w(x) = 1 + 9 · 1{Y1(x) > sd} · 1{Y2(x) > su},   (9)

where sd and su stand for the thresholds beyond which the weighting applies, namely the median of each auxiliary variable. Such a criterion therefore assigns a relative weight of 10 to points belonging to areas where both the population density and the urban fabric proportion exceed their respective medians observed on the domain. This should imply a higher concentration of sample points in such areas.

Figs. 5 and 6 present the results obtained with the weighted criterion in the same fashion as Figs. 3 and 4. Again, the objective function and empirical acceptance rate curves were checked and show no pathologies. The main difference with respect to the previous results is the higher density of points in densely populated and built-up areas, whatever the value of s1/s2, thanks to the weighting used. Again, the same differences in point repartition are encountered when varying s1/s2, with a higher clustering associated with the weaker spatial covariance structure (s1/s2 = 4). The points' positions in the auxiliary variables cloud show roughly the same pattern as before, with points around the extreme values of each covariate and close to the line delimiting the proportional effect, but higher concentrations of points can be noticed to the right of that line. Indeed, looking at the proportion of points on both sides of the median of the population density logarithm, it is far higher on the right, corresponding to the highly weighted areas.

Finally, it is worth noting that the schemes produced by the simulated annealing algorithm may be slightly suboptimal: it is not run within the conditions that ensure asymptotic convergence, and only for a finite number of iterations. Therefore, two different runs may lead to two different optimal sampling schemes.
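The weighting described here is a simple indicator function; a direct transcription in code could read as follows (names are ours).

```python
import numpy as np

def weight(y1, y2, s_d, s_u, w_hi=10.0):
    """Weight w(x) for the criterion (5): w_hi where both the population
    density proxy y1 and the urban fabric proportion y2 exceed their
    medians s_d and s_u, and 1 elsewhere."""
    y1, y2 = np.asarray(y1), np.asarray(y2)
    return np.where((y1 > s_d) & (y2 > s_u), w_hi, 1.0)
```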
5. Discussion
In a previous work (de Fouquet and Faucheux, 2009), a pragmatic methodology to design sample schemes had been set up and applied to the agglomeration of Bordeaux. That method is not model-based and aims at designing a sample scheme that will be easy to use to fit the variogram and choose the relevant covariates properly. It consists in applying a regular grid, then refining it in the heart of the agglomeration, while dropping some less informative points near the edges of the domain and finally choosing points in the covariates cloud so as to perform a “nice” spanning. The resulting design is plotted in Fig. 7. Obviously, this design is worse than that obtained in the previous section regarding the computed criterion. Precisely, the value of the criterion for this design is as bad as if it had been proposed at random, whatever parameter values considered (s1/s2 ratio and weighting). In other words, the associated criterion value is very close to the values attained at the beginning of the algorithm. While the weighting has been found to be helpful, it is still not clear, on the basis of the available information, what value for s1/s2 should be chosen in practice as both tested values appeared to lead to rather different results. There remain an uncertainty on that special parameter and the practitioner has to make a choice between those two sampling schemes. More generally, if the model on which the quality criterion is based occurs to be unadapted, it
b
Fig. 6. Optimal design for s1/s2 ¼ 4, in the spatial domain (a.) and in the covariates cloud (b.) (weighted criterion).
Fig. 7. Previously proposed design in the spatial domain (a.) and its representation in the covariates cloud (b.).
may be difficult to infer the correct model afterwards. For instance, if a long-range variogram has been fitted in a preliminary analysis and is used in the method, the resulting design may prevent the detection of a shorter-range variogram. To that extent, the sampling scheme proposed in de Fouquet and Faucheux (2009) may be more flexible. On the other hand, if the phenomenon under study is modelled with little or no uncertainty, the method we propose is optimal for prediction purposes. To sum up, the method proposed here, although efficient and easy to implement, reaches its limits when dealing with important modelling uncertainties, as it relies on a unique model with fixed parameters. A solution to overcome this problem may be to consider not a unique model but a variety of models, corresponding to the parametrized family of models identified in a preliminary analysis of the available data, or defined from the physics of the phenomenon under study. A way to cope with such a family of models is to define prior distributions on each parameter of the model, which can be viewed as a Bayesian approach. For instance, one can define a uniform distribution on a specified interval for the range of the variogram or, here, for the ratio s1/s2, or else consider different variogram models, each with an associated probability of occurrence. Concerning the auxiliary variables, the model may include two or more covariates with given probabilities, or prior distributions could be defined on the drift coefficients. The resulting design will then be optimal in the sense of the criterion (3) averaged over the space of the uncertain parameters, with respect to the prior distributions.
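The parameter-averaged criterion suggested above can be illustrated by a minimal Monte-Carlo sketch. Everything here is a hypothetical stand-in: the surrogate criterion below merely mimics a quantity that decreases with design size, whereas the real criterion (3) is the spatially averaged external drift kriging variance, and the uniform prior bounds on s1/s2 and the variogram range are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def surrogate_criterion(design, s1_over_s2, vario_range):
    """Toy surrogate for the spatially averaged kriging variance of a
    design under one model. NOT the paper's criterion: it only mimics
    a positive quantity that decreases with the number of points and
    depends on the uncertain model parameters."""
    return (1.0 + s1_over_s2) / (len(design) * vario_range)

def averaged_criterion(design, n_draws=1000):
    """Criterion averaged over prior draws of the uncertain parameters
    (uniform priors, chosen purely for illustration)."""
    s_ratio = rng.uniform(1.0, 4.0, n_draws)  # prior on s1/s2
    v_range = rng.uniform(0.5, 2.0, n_draws)  # prior on variogram range
    return float(np.mean([surrogate_criterion(design, s, r)
                          for s, r in zip(s_ratio, v_range)]))

design = list(range(50))  # dummy 50-point design
print(averaged_criterion(design))
```

A design optimization would then minimize `averaged_criterion` over candidate designs, e.g. inside the simulated annealing loop, instead of the criterion computed under a single fixed model.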
6. Conclusion

In addition to the continuous monitoring performed by automatic stations, sampling surveys can be useful to evaluate the spatial distribution of atmospheric concentrations more precisely and to provide information about air quality in any part of the territory, as required by the European legislation. Furthermore, the spatial information provided by passive sampling surveys may be very helpful in addressing the question of the spatial representativeness of monitoring stations. This issue has received increasing attention in recent years owing to exposure concerns, cost constraints and air quality modelling considerations. These reflections highlight the necessity of developing efficient sampling design methodologies able to integrate various data about the environment characteristics. In this work, an optimal design methodology, first proposed by Brus and Heuvelink (2007), has been adapted and extended to the problem of generating sampling designs for measurement surveys of air quality. This methodology makes it possible to generate optimal sampling schemes based on previously available information, including known relationships between the pollutant and environmental variables. A new solution generator for the SA algorithm and
a weighted criterion, which can be chosen according to the purpose of the survey, have also been introduced. Moreover, the methodology does not make any assumption on the distribution of the contaminant concentration, such as Gaussianity, which could appear artificial. As this method relies on the definition of a geostatistical model (covariates plus variogram of the residuals), a preliminary geostatistical study has first been carried out on the available benzene concentration data, coming from several surveys on two different cities. It appeared that a whole family of models should be considered. The method has then been applied to the Bordeaux agglomeration with two different a priori models, and the resulting designs have shown important dissimilarities. Finally, the pros and cons of the method have been discussed by comparison with a previous work, which highlighted the need to consider a whole family of possible models in the construction of the sampling scheme, leading to a future Bayesian approach. Although the method has been applied here to the study of atmospheric benzene concentrations, it is generic and can be adapted to a wide range of sampling problems. We did not address the problem of variogram estimation, where the goal is to find a design that minimizes a criterion depending on the quality of the variogram estimation. Such a criterion relies on the Fisher information matrix for a given variogram class, for instance Matérn's, under a Gaussian assumption; see Zhu and Stein (2005) among others. Nevertheless, these approaches generally do not take into account the quality of the spatial prediction. The same authors then proposed a method to combine both objectives in Zhu and Stein (2006), based on a two-step algorithm. Diggle and Lophaven (2006) described an approach similar to the one proposed at the end of the previous section, where the uncertainty about the covariance parameters is taken into account by defining prior distributions.
However, it is worth noticing that the latter approach can become highly numerically intensive, as the dimension of the problem is then multiplied by the number of uncertain parameters. More sophisticated algorithms should then be used instead of SA. Among others, two methods seem particularly promising for this problem: the population Monte-Carlo approach of Cappé et al. (2004) and the interacting Markov chains algorithms described in Romary (2010), as they allow full use of parallel computing facilities. The proposed methodology has been developed with the aim of supplying scientific and technical support to the French local air quality monitoring agencies. For the moment it has been applied to benzene sampling over urban areas but it can be extended to other pollutants such as NO2 and to larger spatial domains like regions.

Acknowledgements

This work has been carried out for the French central laboratory for air quality surveillance. It has been supported by the French
ministry in charge of ecology and sustainable development. We also thank the associations AIRAQ, ATMO Nord-Pas-de-Calais and ATMO Champagne-Ardenne for having supplied the data.

References

Abida, R., Bocquet, M., Vercauteren, N., Isnard, O., 2008. Design of a monitoring network over France in case of a radiological accidental release. Atmospheric Environment 42, 5205–5219.
Brus, D.J., Heuvelink, G.B.M., 2007. Optimization of sample patterns for universal kriging of environmental variables. Geoderma 138, 86–95.
Cappé, O., Guillin, A., Marin, J.-M., Robert, C.P., 2004. Population Monte-Carlo. Journal of Computational and Graphical Statistics 13 (4), 907–929.
Chilès, J.P., Delfiner, P., 1999. Geostatistics, Modeling Spatial Uncertainty. John Wiley & Sons, New York.
de Fouquet, C., Faucheux, C., 2009. Recommandations sur le schéma d'échantillonnage pour les campagnes de mesure de la qualité de l'air. Tech. rep., Mines ParisTech. Available as technical annexe in Malherbe, L., Cárdenas, G., 2008. http://www.lcsqa.org/rapport/2008/ineris/echantillonnage-reconstitution-donnees-espace-temps.
de Fouquet, C., Gallois, D., Perron, G., 2007. Geostatistical characterization of the nitrogen dioxide concentration in an urban area. Part I: spatial variability and cartography of the annual concentration. Atmospheric Environment 41, 6701–6714.
De Gruijter, J.J., Brus, D.J., Bierkens, M.F.P., Knotters, M., 2006. Sampling for Natural Resource Monitoring. Springer Verlag.
Diggle, P.J., Lophaven, S., 2006. Bayesian geostatistical design. Scandinavian Journal of Statistics 33 (1), 53–64.
Van Groenigen, J.W., Siderius, W., Stein, A., 1999. Constrained optimisation of soil sampling for minimisation of the kriging variance. Geoderma 87, 239–259.
Kanaroglou, P.S., Jerrett, M., Morrison, J., Beckerman, B., Arain, M.A., Gilbert, N.L., Brook, J.R., 2005. Establishing an air pollution monitoring network for intra-urban population exposure assessment: a location–allocation approach. Atmospheric Environment 39 (13), 2399–2409.
Kirkpatrick, S., 1984. Optimization by simulated annealing: quantitative studies. Journal of Statistical Physics 34, 975–986.
Kirkpatrick, S., Gelatt Jr., C.D., Vecchi, M.P., 1983. Optimization by simulated annealing. Science 220 (4598), 671–680.
Müller, W.G., 2001. Collecting Spatial Data, second ed. Physica-Verlag, Heidelberg.
Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E., 1953. Equation of state calculations by fast computing machines. Journal of Chemical Physics 21, 1087–1092.
R Development Core Team, 2010. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. http://www.R-project.org.
Robert, C.P., Casella, G., 2004. Monte-Carlo Statistical Methods, second ed. Springer, Berlin.
Romary, T., 2010. Bayesian inversion by parallel interacting Markov chains. Inverse Problems in Science and Engineering 18 (1), 111–130.
Wu, L., Bocquet, M., 2011. Optimal redistribution of the background ozone monitoring stations over France. Atmospheric Environment 45 (3), 772–783.
Zhu, Z., Stein, M.L., 2005. Spatial sampling design for parameter estimation of the covariance function. Journal of Statistical Planning and Inference 134 (2), 583–603.
Zhu, Z., Stein, M.L., 2006. Spatial sampling design for prediction with estimated parameters. Journal of Agricultural, Biological, and Environmental Statistics 11, 24–44.