Atmospheric Environment 84 (2014) 20–27
Improvement of air quality forecasts with satellite and ground based particulate matter observations
M. Hirtl a,*, S. Mantovani b, B.C. Krüger c, G. Triebnig d, C. Flandorfer a, M. Bottoni b, M. Cavicchi e
a Section Environmental Meteorology, ZAMG - Central Institute for Meteorology and Geodynamics, Vienna, Austria
b SISTEMA GmbH, Vienna, Austria
c Institute of Meteorology, BOKU - University of Natural Resources and Life Sciences, Vienna, Austria
d EOX IT Services GmbH, Vienna, Austria
e MEEO S.r.l., Ferrara, Italy
Highlights
- Finely resolved PM10-maps are developed from MODIS AOT using Support Vector Regression.
- Assimilation of PM10 ground measurements and PM-Maps improves model forecasts.
- PM10 simulations are conducted with WRF/Chem.
Article info
Article history: Received 16 September 2013; Accepted 11 November 2013
Keywords: PM10 forecasts; Support Vector Regression; MODIS AOT; WRF/Chem
Abstract
Daily regional scale forecasts of particulate air pollution are simulated for public information and warning. An increasing amount of air pollution measurements is available in real-time from ground stations as well as from satellite observations. In this paper, the Support Vector Regression technique is applied to derive highly-resolved PM10 initial fields for air quality modeling from satellite measurements of the Aerosol Optical Thickness. Additionally, PM10-ground measurements are assimilated using optimum interpolation. The performance of both approaches is shown for a selected PM10 episode. © 2013 Elsevier Ltd. All rights reserved.
* Corresponding author. Central Institute for Meteorology and Geodynamics, Hohe Warte 38, A-1190 Vienna, Austria. Tel.: +43 1 36026 2406; fax: +43 1 36026 74. E-mail address: [email protected] (M. Hirtl).
1352-2310/$ - see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.atmosenv.2013.11.027
1. Introduction
The impact of particulate air pollution on human health has been documented by various epidemiologic studies for decades (e.g. Davidson et al., 2005), and there is evidence that exposure to very fine particulate air pollution is associated with significant increases in lung cancer mortality (Pope et al., 2002). There is therefore increasing interest in reliable air quality forecasts to provide warnings or plan counter-measures. Because PM10 is the sum of many different particles rather than a single species, its prediction is difficult, as many studies have shown (Schaap et al., 2007; Vautard et al., 2007). In general, air quality models underestimate the PM10 concentrations. The complexity of the models makes it quite difficult to attribute the deviations to a certain source of error, as many different input data sources are used. The main reasons are uncertainties in meteorology (Hongisto, 2005), the fraction of anthropogenic and biogenic sources (Simpson et al., 2007; Vautard et al., 2005), and also unknown details of the physical and chemical processes (Simpson et al., 2007). One common attempt to reduce the deviations from the measurements is the adjustment of the chemical initial conditions using data assimilation methods. The availability of air pollution measurements is increasing, and data sets (in situ and remote sensing) can be accessed in real-time. The use of these air quality measurements to increase the reliability of regional scale air quality forecasts is therefore a major task. As for meteorological forecasts, the initial state of the atmosphere is important for models that predict the dispersion and chemical reactions of pollutants (e.g. Elbern and Schmidt, 2001). For air quality models, the initial state is determined by the horizontal and vertical distributions of different pollutants in the atmosphere. In the last few years significant progress towards the increased use of satellite products related to air quality has been made (e.g. Miyazaki et al., 2012). This progress became possible due to advances in sensor
technology and new algorithmic approaches. Satellite data cover regions where no ground-based stations are available. In general they offer broader spatial coverage but less frequent temporal sampling than ground observations.
The Support Vector Regression (SVR) technique is introduced to obtain highly resolved PM10-Maps from the MODIS instrument as initial conditions for air quality forecasts. Comprehensive measurement (ground and satellite) and model data sets were used to develop and optimize the SVR software component. The developed air pollution monitoring system processes multispectral data acquired by the Moderate Resolution Imaging Spectroradiometer (MODIS) sensors on board the TERRA and AQUA satellite platforms. The MODIS Level-1B Calibrated Geolocation Data Set (calibrated and geolocated radiances and reflectances in 36 spectral bands) and the MODIS Geolocation Data Set (geodetic coordinates, ground elevation, solar and satellite zenith, and azimuth angle) are used to retrieve the Aerosol Optical Thickness (AOT) product. Using a tailored version of the International MODIS/AIRS Processing Package (IMAPP), the AOT product is provided with one hundred times more detail than the MODIS Aerosol product (Remer et al., 2004). Dynamical downscaling of the tropospheric aerosol algorithm as well as statistical downscaling using Support Vector Regression techniques have been implemented and validated (Campalani et al., 2011; Nguyen et al., 2011, 2012) to increase the AOT spatial resolution up to 1 × 1 km².
The on-line coupled model WRF/Chem (Weather Research and Forecast model coupled with Chemistry; Grell et al., 2005) is operated at a resolution of 3 km in the core domain (Alpine region). This also allows the usage of highly resolved emission inventories and the improvement of the input data. The latest emission inventories provided by Austrian authorities were collected and harmonized.
These emissions are used together with European data sets (UNECE/EMEP, http://webdab.emep.int/, and TNO, Visschedijk et al., 2007). After the detailed description of the data sets, the following sections describe the SVR approach to estimate PM10-Maps from heterogeneous data sources (ground, satellite and model data) and the implementation of the assimilation scheme.
2. Data sets
The data sources comprise observational data from ground measurements (meteorology and chemical species), satellite measurements (Level-1B Calibrated Geolocation Data Set) and modeling data (model forecasts for meteorological parameters). All the data were collected for a time span of three years (2008–2010).
2.1. Ground data
Two data sources of ground based in situ measurements in Austria are available and were prepared for further analysis with the satellite data. The data for the time period of 3 years (for some stations the data are available for a limited time period) were used for the analysis. Fig. 1 shows the distribution of the meteorological (MET) and the air quality stations (AQ). Fig. 1 also represents the inner grid of the modeling domain with a horizontal resolution of 3 km.
Fig. 1. AQA modeling domain (3 km resolution) and station locations of the Austrian MET- and AQ-networks.
2.2. Satellite data
Satellite data are retrieved for the model domain. MODIS Level 1B (MOD02) and Geolocation (MOD03) products are downloaded via the Level 1 and Atmosphere Archive and Distribution System (LAADS); Aerosol Optical Thickness (AOT) maps at 1 km spatial resolution are then generated using the International MODIS/AIRS Processing Package (IMAPP).
2.3. Model data
Model forecasts of the meteorological parameters (wind, temperature, humidity, pressure and boundary layer height) are created with the modeling system ECMWF-WRF/Chem. The model simulations are based on the global forecasts provided by the IFS (Integrated Forecast System) of the ECMWF (European Centre for Medium-Range Weather Forecasts). These data (horizontal resolution = 0.25°) are further processed with the online-coupled model WRF/Chem, which provides the forecasted meteorological fields with higher spatial resolution (three nested model domains with 27 km/9 km/3 km resolution). In order to save computation time the chemistry module is switched off because only the meteorological parameters are needed.
2.4. Emissions
To provide the best possible anthropogenic emission data, the latest emission inventories of the regional Austrian administrations are collected. The highly spatially resolved data from Austrian administrations are incorporated into a modified emission preparation process for the AQA model system, which distributes the emissions to the differently resolved grids of the modeling system, harmonizes the emission sectors and supplements missing species. In addition, the new data for Austria are combined with emission inventories for the other areas covered by the model grids (Europe, parts of Africa and Asia, marine areas, and the missing Austrian regions). For the areas outside Austria, the data were taken from the TNO (Visschedijk et al., 2007) inventory. In addition, some emissions from the EMEP inventory (http://www.ceip.at/ceip) were included for areas not covered by TNO emissions. These areas are located mainly in Africa and Asia.
3. Methods
3.1. PM10-Maps estimation
The Support Vector Machine (SVM) method analyzes data and recognizes patterns for classification purposes. It is based on an algorithm consisting of two steps, called “training phase” and “test
(or prediction) phase”, respectively. During the training phase, the SVM constructs hyperplanes in a space obtained by functional transformation using a kernel function, called the “feature space”; these hyperplanes are then used in the test phase to classify new items. The effectiveness of the SVM depends on the selection of the kernel, the kernel’s parameters, and the parameter C (constraint on the Lagrangian multipliers used in the classical Lagrangian optimization technique). The final model is selected as the combination of the kernel’s parameters and C that gives the best cross-validation accuracy. Several variants of the SVM method are available. One of the most comprehensive reviews of these statistical techniques is given by Cristianini and Shawe-Taylor (2000). A version of the SVM for regression, called Support Vector Regression (SVR), was proposed by Drucker et al. (1997). The SVR model is built during the training phase (using the TRAINing Data Set) by evaluating different combinations of kernels and kernel parameters.

The essence of the SVR technique can be summarized as follows. Let $\vec{x} = \{x_1, x_2, \ldots, x_m\} = x_j$ ($j = 1, 2, \ldots, m$) be a set of real numbers with an associated “true” value $y$, in an $m$-dimensional hyperspace. Let us consider a number $N$ of these items $\vec{x}$, forming a training data set:

$$ S = \{(\vec{x}_1, y_1), (\vec{x}_2, y_2), \ldots, (\vec{x}_N, y_N)\} = S\{(\vec{x}_i, y_i)\} \quad (i = 1, 2, \ldots, N). $$

Given a kernel $K(\vec{x}_i, \vec{x}_j) = \Phi(\vec{x}_i)\cdot\Phi(\vec{x}_j)$, let $\vec{\lambda}^{*} = (\lambda_1^{*}, \lambda_2^{*}, \ldots, \lambda_N^{*})$ and $\vec{\tilde{\lambda}}^{*} = (\tilde{\lambda}_1^{*}, \tilde{\lambda}_2^{*}, \ldots, \tilde{\lambda}_N^{*})$ be the solution vectors of the following optimization problem (Cristianini and Shawe-Taylor, 2000):

$$ W(\vec{\lambda}, \vec{\tilde{\lambda}}) = -\frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} (\lambda_i - \tilde{\lambda}_i)(\lambda_j - \tilde{\lambda}_j)\, K(\vec{x}_i, \vec{x}_j) - \varepsilon \sum_{i=1}^{N} (\lambda_i + \tilde{\lambda}_i) + \sum_{i=1}^{N} y_i\, (\lambda_i - \tilde{\lambda}_i) \tag{1} $$

with the constraints

$$ \sum_{i=1}^{N} (\lambda_i^{*} - \tilde{\lambda}_i^{*}) = 0, \tag{2} $$

$$ \lambda_i^{*},\ \tilde{\lambda}_i^{*} \in [0, C]. \tag{3} $$

In Eq. (1) the $\Phi$’s are chosen functions and the parameter $\varepsilon$ is the maximum deviation from correct results that one is willing to accept in the application of the SVR technique. The solution of the above stated problem is obtained by identifying a so-called “regression function” which, written in terms of a “weight vector” $\vec{w}$ and a “bias” $b$, has the form

$$ f(\vec{x}) = \vec{w}\cdot\vec{x} + b, \tag{4} $$

where the scalar product is understood in the feature space, i.e. $\vec{x}$ enters through $\Phi(\vec{x})$. The solution for the weight vector is

$$ \vec{w}^{*} = \sum_{i=1}^{N} (\lambda_i^{*} - \tilde{\lambda}_i^{*})\, \Phi(\vec{x}_i^{SV}), \tag{5} $$

where the sum is restricted to a limited set of vectors called “support vectors” (referred to as $\vec{x}_i^{SV}$ in Eq. (5)). Inserting Eq. (5) in Eq. (4), the regression function becomes

$$ f(\vec{x}) = \vec{w}^{*}\cdot\vec{x} + b^{*} = \sum_{i=1}^{N} (\lambda_i^{*} - \tilde{\lambda}_i^{*})\, K(\vec{x}, \vec{x}_i^{SV}) + b^{*}. \tag{6} $$

This analytical expression of the regression function is then used in the test phase to associate new “true” values $y$ with the new items $\vec{x} = x_j$ ($j = 1, 2, \ldots, m$) of a test set.

The SVR techniques are applied to derive PM10 estimates from satellite measurements. The basic idea underlying this data analysis approach is to use a set of preliminary data, characterized by different but already known properties, to derive regression criteria to be applied to a new set of items still to be processed. In the regression procedure applied to the new set of items, properties are assigned according to analogies with the previous data set. Correlations between different data sources (e.g. AOT aggregate values versus PM10 ground measurements) are used to derive PM10 values. To derive data sets which include these correlations, satellite data, model data and ground measurements are used; because of their different spatial and temporal resolutions, these data must be collocated in space and time in order to generate SVM/SVR samples. The spatial-temporal collocation method is presented in Fig. 2: for each available satellite product over Austria, ground measurements as well as model data fields are collocated in time, then satellite data and model data fields are cropped around MET-/AQ-network site locations. Satellite overpass times rarely coincide with the ground measurements: a couple of MODIS images are acquired each day over Austria, while ground stations typically record half-hourly data, and several minutes typically separate the two acquisitions. For these reasons it is appropriate to compare spatial averages from MODIS with temporal averages from ground measurements. A spatio-temporal window is used to extract the data set for which statistics are computed. This is achieved (see Fig. 2) with reference to a radius (R) and a time semi-interval (T). Each match compares the average of the satellite pixels within the radius R around a ground site with the average of the corresponding ground measurements within T minutes of the satellite overpass.

Fig. 2. Spatial-temporal collocation method.

To guarantee the variability of the parameters required to train and to test the products, the samples are separated into the TRAINDS (TRAINing Data Set) and VALDS (VALidation Data Set) using the following criteria:
- Monthly grouping, to model features (if any) correlated to the variability of meteorological fields during a month’s time;
- Seasonal grouping, to model features (if any) linked to the variability of meteorological fields;
- Annual grouping, to model features (if any) linked to the variability of meteorological fields during one year.
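The spatio-temporal matching described above (spatial mean of the satellite pixels within a radius R, temporal mean of the ground measurements within T minutes of the overpass) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary keys (`lat`, `lon`, `aot`, `time`, `pm10`), the use of a great-circle distance and the treatment of cloud-masked pixels as `None` are all hypothetical choices made for the example.

```python
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for the distance calculation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def collocate(site, pixels, ground_records, overpass, radius_km, half_window_min):
    """Return (mean AOT within R, mean PM10 within +/- T) or None if no match.

    site           : dict with 'lat', 'lon' (hypothetical layout)
    pixels         : list of dicts with 'lat', 'lon', 'aot' (None = cloud-masked)
    ground_records : list of dicts with 'time' (datetime) and 'pm10'
    """
    # Spatial average: valid satellite pixels inside the radius R around the site.
    aot = [
        p["aot"]
        for p in pixels
        if p["aot"] is not None
        and haversine_km(site["lat"], site["lon"], p["lat"], p["lon"]) <= radius_km
    ]
    # Temporal average: ground measurements within T minutes of the overpass.
    window = timedelta(minutes=half_window_min)
    pm10 = [r["pm10"] for r in ground_records if abs(r["time"] - overpass) <= window]
    if not aot or not pm10:
        return None  # no usable sample for this site and overpass
    return sum(aot) / len(aot), sum(pm10) / len(pm10)
```

Each returned pair would form one candidate sample; how samples are then assigned to TRAINDS and VALDS is governed by the grouping criteria listed above and is not covered by this sketch.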
3.1.1. PM10-Map design and validation
Several numerical experiments were conducted to evaluate SVR performance both in the “site domain” (i.e. pixels around each location of Austrian air quality ground-stations where PM10 concentration data and meteorological parameters are collected) and in the model domain. Experiments in the site domain were performed using the Radial Basis Function (RBF) kernel. Aerosol Optical Thickness from the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor, meteorological parameters (temperature, wind, relative humidity) and ancillary data (elevation) from ground-stations were matched up to generate training and testing files. The temporal representativeness of the data showed that SVR models should be implemented on a monthly basis: data are in a first step grouped by month (e.g. January 2008, January 2009 and January 2010), then separated into TRAINDS and VALDS samples to be used in the SVR training and prediction phases. In fact, it was found that replacing the data sets containing items collected over all three years under consideration by data sets containing data from one single month considerably improves the quality of the results. We recall that the correlation coefficient between two vectors of data $X = \{x_1, x_2, \ldots, x_N\}$ and $Y = \{y_1, y_2, \ldots, y_N\}$ is defined as follows:

$$ r = \frac{N\sum_{i=1}^{N} x_i y_i - \sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{\sqrt{\left[N\sum_{i=1}^{N} x_i^{2} - \left(\sum_{i=1}^{N} x_i\right)^{2}\right]\left[N\sum_{i=1}^{N} y_i^{2} - \left(\sum_{i=1}^{N} y_i\right)^{2}\right]}} \tag{7} $$

with $r^{2}$ denoting its square. The RBF kernel was selected to investigate SVR performance in estimating PM10: the kernel parameters were first tuned on the month of May 2008, then the same values were adopted to implement the SVR monthly models, also for the years 2009 and 2010. Results obtained for the correlation r² are shown in Fig. 3. A detailed analysis of the impact of satellite data (i.e. AOT) on PM10 estimation shows a considerable increase of the accuracy (considering the correlation) of between 3% and 68% (see Table 1). Since satellite products are affected by cloud cover, with a potential impact on the estimation of PM10 values in the grid domain (see Fig. 4 upper left and upper right), a versatile set of SVR models has been implemented to eliminate the effect of cloud cover. As demonstrated
Table 1. Impact of AOT satellite product to estimate PM concentration values: correlation values between ground measurements and PM values estimated by SVR with (w) and without (wo) AOT.

Period     With AOT        Without AOT     w/wo − 1 (%)
           r       r²      r       r²
2008-01    0.92    0.84    0.88    0.77     9.65
2008-02    0.92    0.84    0.89    0.79     7.30
2008-03    0.88    0.78    0.84    0.71     9.83
2008-04    0.90    0.81    0.85    0.73    10.86
2008-05    0.96    0.92    0.94    0.89     3.38
2008-06    0.88    0.77    0.82    0.68    14.11
2008-07    0.87    0.75    0.81    0.65    15.07
2008-08    0.87    0.76    0.67    0.45    68.16
2008-09    0.91    0.83    0.84    0.71    16.34
2008-10    0.91    0.82    0.86    0.73    12.44
2008-11    0.91    0.83    0.85    0.72    15.41
2008-12    0.89    0.80    0.86    0.74     7.79
in Table 1, the AOT satellite product has a positive effect on the retrieval procedure; the final SVR configuration can handle heterogeneous data sources (AOT from satellite, meteorological fields from the AQA model and/or ground stations) to generate PM10 maps of the best possible quality. Fig. 4 (lower left and lower right) shows the impact of different SVR models on the PM10 estimation when a non-appropriate choice of parameters is used: higher values of the parameter C perform better in the site domain (i.e. across ground stations) but generate unrealistic background values (red color), while SVR models with decreased values of C are able to generate a more realistic geographical distribution. Although the application of the SVR technique is successful, some issues were identified in the course of the year-long applications which are not fully resolved. Some open problems therefore deserve to be investigated in a continuation of the numerical and physical analysis of the available data and of data which may become available in the near future. We refer to two of these open items which are of high theoretical and practical relevance.
i. Evaluation of optimum parameters used in the SVR algorithms. Let us refer to the Radial Basis Function (RBF) kernel given by
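$$ K(\vec{x}, \vec{x}') = \Phi(\vec{x})\cdot\Phi(\vec{x}') = \exp\left[-\gamma\, \lVert \vec{x} - \vec{x}' \rVert^{2}\right] = \exp\left[-\frac{\lVert \vec{x} - \vec{x}' \rVert^{2}}{2\sigma^{2}}\right] \tag{8} $$

where γ = 1/(2σ²) is inversely proportional to the variance of a Gaussian distribution.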
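As an illustration of how the RBF kernel of Eq. (8) enters the regression function of Eq. (6), the following minimal sketch evaluates both in plain Python. It assumes that the support vectors, the coefficient differences (λ*_i − λ̃*_i) and the bias b* are already available from a training step that is not shown; the function names and argument layout are hypothetical and not taken from the software used in the study.

```python
from math import exp


def gamma_from_sigma(sigma):
    """gamma = 1 / (2 sigma^2), the relation stated with Eq. (8)."""
    return 1.0 / (2.0 * sigma ** 2)


def rbf_kernel(x, x_prime, gamma):
    """RBF kernel K(x, x') = exp(-gamma * ||x - x'||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return exp(-gamma * sq_dist)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Regression function f(x) = sum_i c_i K(x, x_i^SV) + b*.

    dual_coefs[i] stands for the difference (lambda_i* - lambda~_i*)
    obtained in training (assumed given here).
    """
    kernel_part = sum(
        c * rbf_kernel(x, sv, gamma) for sv, c in zip(support_vectors, dual_coefs)
    )
    return kernel_part + bias
```

For a test item far from every support vector, or for a very large γ, the kernel terms approach zero and the prediction collapses to the bias alone, which is one way to see why the choice of γ matters for the resulting maps.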
Fig. 3. Correlation (r²) between monthly PM10 calculations and observations for the years 2008, 2009 and 2010.
Fig. 4. Examples of PM10 maps obtained using SVR models with non-appropriate (left panels) and appropriate set of kernel parameters (right panels): high values for C parameter generate unrealistic background values (red color). Bottom panels show PM10 maps obtained by advanced SVR models to remove cloud cover impact: upper panels (both left and right) show AOT maps affected by cloud cover (black areas), while lower panels (both left and right) show how cloud cover has been removed in the estimation of PM10 maps. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
The successful application of the SVR technique to a training data set $S\{(\vec{x}_i, y_i)\}$ (i = 1, 2, ..., N) requires a good estimation of the parameters gamma (γ), epsilon (ε) [maximum acceptable deviation from correct results in Support Vector Regression (SVR)] and C (constraint on Lagrangian multipliers). The Cross Validation (CV) technique allows for a search of these parameters, but it requires a substantial amount of computing time. As an alternative to the CV technique, a procedure for evaluating C and ε has been proposed by Cherkassky and Ma (2004), which however does not give any information about the value of the γ parameter. This reference allows for an estimation of the C parameter depending on the mean value of the “true” values y_i and on their standard deviation. The suggested value of the ε parameter depends essentially on the number N of items and on their noise content. Application of the Cherkassky and Ma method to the training data set used in this study yields results which are acceptable but not optimal. In most cases the suggested parameters need improvement by trial and error. Future work should therefore be dedicated to the enhancement of the method, possibly taking into account other physical characteristics of the data set under examination, for instance the impact of different physical components of the $\vec{x}$ vector.
ii. Evaluation of parameter γ in the Radial Basis Function (RBF) kernel. We recall that the Cherkassky and Ma method does not provide hints for the optimum value of the γ parameter. Computational experience with the RBF kernels in the SVR algorithms has led to the recognition of some practical rules to be observed to guarantee physically acceptable results. The most elementary of these rules requires that the parameter γ be as high as possible, but not too large: one has to ensure that the contribution of the kernel to the regression function does not vanish with respect to the contribution of the bias term. There are instances in which physically consistent results from the regression function require either very small or large values of the γ parameter, depending on the number of items in the data set or on the similarity between the training data set and the test data set. More work should be dedicated in future to analysing this issue and to consolidating the practical rules derived so far by means of a theoretical foundation.
3.2. Assimilation of the ground data
Ground measurements are used in two ways: 1) in the ground site domain to perform several evaluations (e.g. to evaluate the temporal signature (yearly, seasonal, monthly), the benefit of satellite data, and key parameters for the PM10 retrieval) and 2) both at the sites and in the model domain to validate the forecast. Information from ground measurements is not introduced into the satellite based PM10 map at this stage of the work. Preliminary investigations in this direction were recently performed with geostatistical techniques to evaluate PM10 estimation from space and ground data by Campalani et al. (2011, 2012). As the PM10 ground measurements are currently not included in the generation of the PM10 maps by the SVR process, these data are assimilated separately into the model. To determine the initial condition of the PM10 distribution from ground measurements and simulated model distributions, optimal interpolation (OI) analysis (Daley, 1991) is used. The OI is an algebraic simplification of the computation of the weight K in the analysis equations:
$$ x_a = x_b + K\,(y - H[x_b]) \tag{9} $$

$$ K = B H^{T} \left(H B H^{T} + R\right)^{-1} \tag{10} $$

with
x_a analysis values
x_b background values
K optimal weight matrix
y observations
H operator transforming from model space to observation space
B background error covariance
R observation error covariance

Eq. (9) can be regarded as a list of scalar analysis equations, one per model variable in the vector x. For each model variable the analysis increment is given by the corresponding line of K times the vector of background departures (y − H[x_b]). To obtain B, a simplified representation of the covariances between the grid points is used according to Balgovind et al. (1983), which assumes that the covariance is a function of the horizontal and vertical distances between the two points of interest. Assuming that the instrument errors are independent, the covariances between the observation errors can be set to zero, so that R = rI, where I is the identity matrix and r the error variance.

The symbols used in the statistical measures of Eqs. (11) and (12) are: M model values, R measurements, i station number (1, ..., i_max).
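The analysis step of Eqs. (9) and (10) can be sketched for a small one-dimensional example. This is an illustrative sketch under several assumptions that are not taken from the paper: observations are located exactly at grid points (so H simply selects grid values), only horizontal distance along one axis is considered, the background error correlation is taken as (1 + d/L) exp(−d/L), a form often associated with Balgovind et al. (1983), and the names `oi_analysis`, `balgovind_corr`, `sigma_b2`, `r_var` and `length` are hypothetical.

```python
from math import exp


def balgovind_corr(dist, length):
    """Assumed correlation function (1 + d/L) * exp(-d/L)."""
    return (1.0 + dist / length) * exp(-dist / length)


def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z


def oi_analysis(xb, grid_pos, obs, obs_idx, sigma_b2, r_var, length):
    """Return xa = xb + B H^T (H B H^T + R)^-1 (y - H[xb]) with R = r I."""
    n, p = len(xb), len(obs)

    def cov(i, j):
        # Background error covariance between grid points i and j.
        return sigma_b2 * balgovind_corr(abs(grid_pos[i] - grid_pos[j]), length)

    # Background departures y - H[xb]; H picks the observed grid points.
    departures = [obs[k] - xb[obs_idx[k]] for k in range(p)]
    # Matrix H B H^T + R, with uncorrelated observation errors (R = r I).
    hbht_r = [
        [cov(obs_idx[k], obs_idx[l]) + (r_var if k == l else 0.0) for l in range(p)]
        for k in range(p)
    ]
    z = solve(hbht_r, departures)
    # Analysis increment: row i of B H^T applied to z.
    return [xb[i] + sum(cov(i, obs_idx[k]) * z[k] for k in range(p)) for i in range(n)]
```

With equal background and observation error variances, the analysis at an observed grid point lies halfway between background and observation, and the increment decays with distance according to the assumed correlation length; solving the linear system instead of forming the inverse in Eq. (10) explicitly gives the same result.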
4. Case study
The whole modeling system is evaluated with measurement data from the Austrian AQ-network. Based on the analysis of the PM10 measurements from 2008 to 2010, an episode in February 2010 (4 weeks) was chosen for the evaluation. Extensive exceedances of the 50 µg m⁻³ threshold of the daily average PM10 concentrations occurred during that period, when a high pressure regime with low wind conditions developed over central Europe. About 130 AQ-stations in Austria were considered for the model evaluation. The model evaluation software BOOT (Chang and Hanna, 2004) was used to calculate and visualize the following statistical measures. The fractional bias (FB) accounts for the over- and underestimation of the model. The normalized mean square error (NMSE) is an estimator of the overall deviations between predicted and measured values. Contrary to the FB, in the NMSE the squared deviations, instead of the signed differences, are summed. If a model has a very low NMSE, then it is performing well both in space and time. At perfect agreement between model and measurements FB and NMSE are 0. The mentioned parameters are defined as follows:
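$$ FB = \frac{\overline{R} - \overline{M}}{0.5\,(\overline{M} + \overline{R})} \tag{11} $$

$$ NMSE = \frac{\overline{(M_i - R_i)^{2}}}{\overline{M}\,\overline{R}} \tag{12} $$

Here the overbar denotes the mean over the data considered. With this sign convention, which is that of Chang and Hanna (2004), a positive FB corresponds to an underestimation by the model.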
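The two measures of Eqs. (11) and (12) can be computed as in the following minimal sketch. It is not the BOOT software; the function names are hypothetical, and the sign of FB is chosen so that a positive value means the model underestimates the measurements, as stated in the text.

```python
def fractional_bias(model, measured):
    """FB = (mean(R) - mean(M)) / (0.5 * (mean(M) + mean(R))).

    Positive values indicate that the model underestimates the measurements.
    """
    m_mean = sum(model) / len(model)
    r_mean = sum(measured) / len(measured)
    return (r_mean - m_mean) / (0.5 * (m_mean + r_mean))


def nmse(model, measured):
    """NMSE = mean((M_i - R_i)^2) / (mean(M) * mean(R))."""
    m_mean = sum(model) / len(model)
    r_mean = sum(measured) / len(measured)
    mean_sq_dev = sum((m - r) ** 2 for m, r in zip(model, measured)) / len(model)
    return mean_sq_dev / (m_mean * r_mean)
```

As a worked check with made-up numbers, model values of 20 and 30 against measurements of 40 and 60 give means of 25 and 50, hence FB = 25 / 37.5 ≈ 0.67 and NMSE = 650 / 1250 = 0.52; identical series give FB = NMSE = 0.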
Two different model runs were analyzed for the comparison with the measurements. One model run serves as the reference run and is conducted without any inclusion of measurements. The second one uses ground and satellite data as input. The model assimilates PM10 measurements from the Austrian AQ-network daily at 00 UTC and the PM-Maps obtained from the MODIS instrument when the satellite passes the modeling domain (once per day) between 9 and 12 UTC. For each of the model runs a 7 day spin-up time for the chemical species distribution is used, as the model run starts from “clean air” conditions. No meteorological measurements are currently assimilated. WRF/Chem is operated using the RADM2 (Stockwell et al., 1990) module for the gas-phase chemistry and the MADE/SORGAM module to describe the aerosol chemistry (Ackermann et al., 1998; Schell et al., 2001). The spatial distribution of the statistical model performance for all AQ-stations is depicted in Figs. 5 and 6 for the two runs. For the model run without any feedback from the measurements, the values of the FB are mainly above 0.75 and the NMSE is also fairly high in the whole domain. Both FB and NMSE are significantly improved when ground and satellite data are assimilated. Especially outside of Alpine regions the FB lies below 0.75 and the NMSE below 1 at most of the stations. Fig. 7 summarizes NMSE and FB for all stations together. The values were calculated on an hourly basis and for all considered stations. The inclusion of the measurement data improves the results significantly with respect to both FB and NMSE, although the positive FB indicates consistent underestimation by the model for both runs, especially in Alpine regions. A possible explanation is that for many AQ-stations, especially in valleys, the model with a resolution of 3 km cannot represent the influence of the local sources appropriately. Emissions of point or line sources are distributed instantaneously over the whole grid cell, which leads to lower concentrations at the point of measurement. This effect occurs especially when the measurements are mainly affected by local emissions, e.g. during high pressure periods with low wind conditions. The statistical measures at the individual stations in different regions for the model run using ground and satellite data can be found in the parabolic plots in Figs. 8 and 9. The figures are divided into stations lying in flat land (Fig. 8) and stations inside the Alpine region (Fig. 9). For 110 of the 130 stations the NMSE is lower than or equal to 2, which can already be considered quite good model performance in general. At 63 stations the NMSE is even lower than or
Fig. 5. FB (hourly data) between PM10 measurements at Austrian AQ-stations (Feb 2010, hourly values) and model results with different input data. Left: no measurement data included. Right: ground stations and satellite data assimilated.
Fig. 6. NMSE (hourly data) between PM10 measurements at Austrian AQ-stations (Feb 2010, hourly values) and model results with different input data. Left: no measurement data included. Right: ground stations and satellite data assimilated.
equal to 1. The highest NMSE can be found at stations in the Alpine region, where values of up to 4 are reached at individual stations. The FB is positive (model underestimates the measurements) for nearly all stations. An absolute value of the FB below 0.75 (vertical dotted line) means that the deviation between model and measurements is lower than 50% at this location. At 85 AQ-stations the model forecasts have an FB below or equal to 0.75 (deviation of 50%). At 47 AQ-stations the FB is lower than or equal to 0.5 (deviation of 30%). Although the concentrations are generally underestimated by the model, the temporal agreement between model and measurements is satisfactory for most stations. Fig. 10 shows the correlation of the daily mean values for the considered month. For 66 stations the correlation is higher than 0.6; the median of all stations is 0.6. A couple of stations, even in complex terrain, show reasonable correlation coefficients. As the threshold for PM10 is defined by a daily average concentration value, the model is able to give an indication of the further development of the concentrations.
5. Conclusions
Air quality forecasts with WRF/Chem are successfully improved by assimilation of PM10 measurements from ground stations and of satellite based PM10 estimates derived from the MODIS instrument.
Fig. 7. NMSE and FB for all AQ-stations obtained from 2 model runs (nDA: no data assimilation, DA: assimilation of ground and satellite data).
The model is used with three nested grids with the finest resolution of 3 km over Austria and the Alpine region. Interfaces to two different measurement data sources are presented. PM10 data from the Austrian air quality network are assimilated using optimum interpolation routines. Additionally, MODIS measurements of AOT are processed by an SVR/SVM software module in order to obtain PM10 maps, which are also used as initial conditions for the model simulations. To optimize the quality of the PM10 maps, extensive analysis has been conducted based on a 3-year data set which consists of measurement data (satellite and ground) and model data. Highly resolved emission data from the Austrian federal states were collected and processed to provide a data base for model calculations with high spatial resolution. The resulting harmonized data set and the respective conversion programs comprise the most comprehensive emission inventory for Austria that is currently possible. The modeling system is evaluated for a selected episode of one month. In February 2010 extensive exceedances of PM10 thresholds occurred at nearly all Austrian air quality stations. The analysis shows that the use of the PM10 maps for model initialization has a strong influence on the air quality forecasts, and that the statistical agreement between model and observations in regions outside of the Alps is improved significantly. The results also show that the model performance improves even in complex terrain but
Fig. 8. FB and NMSE for AQ-stations outside the Alpine region.
Fig. 9. FB and NMSE for AQ-stations in the Alpine region.
Fig. 10. Correlation coefficients (r2) of the daily mean concentrations.
is still not satisfactory for the highly polluted episode considered. The satellite measurements provide a dense spatial PM10 coverage and complement the point measurements from the ground network. Ground stations alone cannot give appropriate estimates of the PM10 distribution in regions with low station density. This study showed that PM10 forecasts can benefit from the assimilation of satellite data and ground measurements. However, the initial conditions are only one aspect of the model input data that can introduce uncertainties into the model forecasts, especially when forecasts of more than one day are conducted. As the influence of the initial conditions usually vanishes after some hours (depending on the meteorological conditions), further work should also focus on the influence of the meteorological fields using different physical parameterizations (e.g. boundary layer height, cloud parameterization, ...), which affect not only the transport but also the chemical conversions.
Acknowledgments
This work has been supported by the FFG in the frame of the 7th call of the Austrian Space Programme (ASAP 2010).
References
Ackermann, I.J., Hass, H., Memmesheimer, M., Ebel, A., Binkowski, F.S., Shankar, U., 1998. Modal aerosol dynamics model for Europe: development and first applications. Atmos. Environ. 32 (17), 2981–2999.
Balgovind, R., Dalcher, A., Ghil, M., Kalnay, E., 1983. A stochastic-dynamic model for the spatial structure of forecast error statistics. Mon. Weather Rev. 111, 701–722.
Campalani, P., Nguyen, T.N.T., Mantovani, S., Bottoni, M., Mazzini, G., 2011. Validation of PM MAPPER aerosol optical thickness retrieval at 1 × 1 km2 of spatial resolution. In: Proceedings of the 19th International Conference on Software, Telecommunications and Computer Networks.
Campalani, P., Mantovani, S., Hirtl, M., Caglienzi, M., Mazzini, G., 2012. Spatial prediction of PM10 mass concentrations with geostatistics: an Austrian case study. In: Proceedings of GeoENV2012. 19–21 September 2012.
Chang, J.C., Hanna, S.R., 2004. Air quality model performance evaluation. Meteorol. Atmos. Phys. 87, 167–196.
Cherkassky, V., Ma, Y., 2004. Practical selection of SVM parameters and noise estimation for SVM regression. Neural Netw. 17, 113–126.
Cristianini, N., Shawe-Taylor, J., 2000. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press.
Daley, R., 1991. Atmospheric Data Analysis. In: Cambridge Atmospheric and Space Science Series. Cambridge University Press, ISBN 0-521-38215-7, p. 457.
Davidson, C.I., Phalen, R.F., Solomon, P.A., 2005. Airborne particulate matter and human health: a review. Aerosol Sci. Technol. 39 (8), 737–749. http://dx.doi.org/10.1080/02786820500191348.
Drucker, H., Burges, C.J.C., Kaufman, L., Smola, A.J., Vapnik, V.N., 1997. Support Vector Regression Machines.
Elbern, H., Schmidt, H., 2001. Ozone episode analysis by four-dimensional variational chemistry data assimilation. J. Geophys. Res. D4 (106), 3569–3590.
Grell, G.A., Peckham, S.E., Schmitz, R., McKeen, S.A., Frost, G., Skamarock, W.C., Eder, B., 2005. Fully coupled on-line chemistry within the WRF model. Atmos. Environ. 39, 6957–6975.
Hongisto, M., 2005. Uncertainties in the meteorological input of the chemistry-transport models and some examples of their consequences. Int. J. Environ. Pollut. 24 (1/2/3/4), 127–153.
Miyazaki, K., Eskes, H.J., Sudo, K., Takigawa, M., van Weele, M., Boersma, K.F., 2012. Simultaneous assimilation of satellite NO2, O3, CO, and HNO3 data for the analysis of tropospheric chemical composition and emissions. Atmos. Chem. Phys. Discuss. 12, 16131–16218. http://dx.doi.org/10.5194/acpd-12-16131-2012.
Nguyen, T.N.T., Mantovani, S., Campalani, P., 2011. Validation of support vector regression in deriving aerosol optical thickness maps at 1 × 1 km2 spatial resolution from satellite observations. In: Proc. the IEEE International Symposium on Signal Processing and Information Technology, pp. 551–556.
Nguyen, T.N.T., Mantovani, S., Campalani, P., Limone, G.P., 2012. Downscaling aerosol optical thickness from satellite observations using support vector regression replied on domain knowledge. In: Proc. the 1st International Conference on Pattern Recognition Applications and Methods, pp. 230–239.
Pope, C.A., Burnett, R.T., Thun, M.J., Calle, E.E., Krewski, D., Ito, K., Thurston, G.D., 2002. Lung cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air pollution. J. Am. Med. Assoc. 287 (9), 1132–1141. http://dx.doi.org/10.1001/jama.287.9.1132.
Remer, L.A., Tanré, D., Kaufman, Y.J., 2004. Algorithm for Remote Sensing of Tropospheric Aerosol from MODIS: Collection 5. MODIS ATBD.
Schaap, M., Vautard, R., Bergstrom, R., van Loon, M., Bessagnet, B., Brandt, J., Christensen, H., Cuvelier, K., Foltescu, V., Graff, A., Jonson, J.E., Kerschbaumer, A., Krol, M., Langner, J., Roberts, P., Rouil, L., Stern, R., Tarrason, L., Thunis, P., Vignati, E., White, L., Wind, P., Builtjes, P.H.J., 2007. Evaluation of long-term aerosol simulations from seven air quality models and their ensemble in the EURODELTA study. Atmos. Environ. 41, 2083–2097, 3667.
Schell, B., Ackermann, I.J., Hass, H., Binkowski, F.S., Ebel, A., 2001. Modeling the formation of secondary organic aerosol within a comprehensive air quality model system. J. Geophys. Res. 106, 28275–28293.
Simpson, D., Yttri, K.E., Klimont, Z., Kupiainen, K., Caseiro, A., Gelencsér, A., Pio, C., Puxbaum, H., Legrand, M., 2007. Modeling carbonaceous aerosol over Europe: analysis of the CARBOSOL and EMEP EC/OC campaigns. J. Geophys. Res. 112, D23S14.
Stockwell, W.R., Middleton, P., Chang, J.S., 1990. The second generation regional acid deposition model chemical mechanism for regional air quality modeling. J. Geophys. Res. 95 (D10), 16,343–16,367.
Vautard, R., Bessagnet, B., Chin, M., Menut, L., 2005. On the contribution of natural Aeolian sources to particulate matter concentrations in Europe: testing hypotheses with a modeling approach. Atmos. Environ. 39, 3291–3303.
Vautard, R., Builtjes, P., Thunis, P., Cuvelier, K., Bedogni, M., Bessagnet, B., Honoré, C., Moussiopoulos, N., Schaap, M., Stern, R., Tarrason, L., van Loon, M., 2007. Evaluation and intercomparison of ozone and PM10 simulations by several chemistry-transport models over 4 European cities within the City-Delta project, 2007. Atmos. Environ. 41, 173–188.
Visschedijk, A.J.H., Zandveld, P.Y.J., Denier van der Gon, H.A.C., 2007. A High Resolution Gridded European Emission Database for the EU Integrate Project GEMS. TNO-report 2007-A-R0233/B.