PM2.5 analog forecast and Kalman filter post-processing for the Community Multiscale Air Quality (CMAQ) model

Atmospheric Environment 119 (2015) 431–442. Contents lists available at ScienceDirect. Journal homepage: www.elsevier.com/locate/atmosenv


Corrigendum

Irina Djalalova a,*, Luca Delle Monache b, James Wilczak c

a Cooperative Institute for Research in the Environmental Sciences (CIRES), University of Colorado at Boulder, CO, USA
b National Center for Atmospheric Research (NCAR), Research Applications Laboratory (RAL), Boulder, CO, USA
c National Oceanic and Atmospheric Administration (NOAA), Earth Systems Research Laboratory (ESRL), Boulder, CO, USA

Article history: Available online 16 June 2015

Abstract

A new post-processing method for surface particulate matter (PM2.5) predictions from the National Oceanic and Atmospheric Administration (NOAA) developmental air quality forecasting system using the Community Multiscale Air Quality (CMAQ) model is described. It includes three main components:

Keywords: analog forecast; CMAQ; Kalman filtering; PM2.5

- A real-time quality control procedure for surface PM2.5 observations;
- Model post-processing at each observational site using historical forecast analogs and Kalman filtering;
- Spreading the forecast corrections from the observation locations to the entire gridded domain.

The methodology is tested using 12 months of CMAQ forecasts of hourly PM2.5, from December 01, 2009 through November 30, 2010. The model domain covers the contiguous USA, and model data are verified against U.S. Environmental Protection Agency AIRNow PM2.5 observations measured at 716 stations over the CMAQ domain. The model bias is found to have a strong seasonal dependency, with a large positive bias in winter and a small bias in the summer months, and also to have a strong diurnal cycle. Five different post-processing techniques are compared, including a seven-day running mean subtraction, Kalman filtering, analogs, and combinations of analogs and Kalman filtering. The most accurate PM2.5 forecasts are produced when applying the Kalman filter correction to the analog ensemble weighted mean, referred to as KFAN. The choice of analog predictors used in the analog search is also found to have a significant effect. A monthly error analysis is computed, in each case using the remaining 11 months of the data set for the analog searches. The improvement of KFAN errors over the raw CMAQ model errors ranges from 44 to 52% for MAE and 13 to 30% for the correlation coefficient. Since the post-processing analysis is performed only at the locations where observations are available, spreading the post-processing correction information over nearby model grid points is necessary to make forecast contour maps. This spreading of information is accomplished with an eight-pass Barnes-type iterative objective analysis scheme.
The final corrected CMAQ forecast over the entire domain is composed of the sum of the original CMAQ forecasts and the KFAN bias information interpolated over the entire domain, and is applied on an hourly basis. © 2015 Elsevier Ltd. All rights reserved.

1. Introduction

DOIs of original article: http://dx.doi.org/10.1016/j.atmosenv.2015.02.021, http://dx.doi.org/10.1016/j.atmosenv.2015.05.058.
* Corresponding author. 325 Broadway, Boulder, CO, 80303, USA. E-mail address: [email protected] (I. Djalalova).
http://dx.doi.org/10.1016/j.atmosenv.2015.05.057
1352-2310/© 2015 Elsevier Ltd. All rights reserved.

Almost all operational National Oceanic and Atmospheric Administration (NOAA)/National Weather Service (NWS) point forecasts of weather variables have some type of post-processing corrections applied to them to reduce model errors before the information is disseminated to the public. The correction can be of


various forms, ranging from complex statistical regression techniques to the subjective corrections that Weather Forecast Office (WFO) forecasters apply in converting raw model output before a forecast is provided to the public. The most common type of post-processing is based on multi-variate linear regression techniques, usually referred to as Model Output Statistics (MOS) corrections. This technique requires a long history of model output and observations, with no changes to the model formulation over this period, in order to be able to develop a stable set of statistical corrections. The NOAA National Air Quality Forecast Capability (NAQFC, http://www.nws.noaa.gov/ost/air_quality/) currently makes particulate matter (PM2.5, diameters less than 2.5 microns) predictions in a 'developmental testing' mode using the Environmental Protection Agency (EPA) Community Multiscale Air Quality (CMAQ) model, with forecasts produced routinely twice per day and verified in near-real time (Mathur et al., 2008; Gorline and Lee, 2009; Stajner et al., 2012). In contrast to weather variables from operational forecast models, presently no post-processing is applied to the CMAQ air quality variables, despite the fact that the model forecasts contain large systematic and random errors for PM2.5 (Wilczak et al., 2006; Djalalova et al., 2010). In fact, the CMAQ forecasts for PM2.5 in these studies were significantly worse than those that could be obtained simply by using a persistence forecast (that tomorrow's value of PM2.5 concentration will be the same as today's value at the same time of day). Large bias errors for PM2.5 are a characteristic not only of CMAQ but are also found in many other air quality forecast models (Solazzo et al., 2012).
Previous investigations of bias-correction techniques applied to air quality variables include Delle Monache et al. (2006, 2008); McKeen et al. (2005, 2007, 2009); Honoré et al. (2008); Kang et al. (2008, 2010); Wilczak et al. (2009); Djalalova et al. (2010); and Borrego et al. (2011). The present analysis uses a unique combination of both Kalman filtering and analog forecasting, and also implements a methodology for transferring point-location corrected predictions to gridded forecast maps. In the present analysis we use data from the U.S. EPA AIRNow observation network (http://www.airnow.gov/) spanning the contiguous U.S. (CONUS) for the one-year time period from December 01, 2009, to November 30, 2010, and compare these data to the CMAQ forecasts from NAQFC. A detailed description of the observed PM2.5 data set and the implementation of a real-time quality control procedure, as well as a brief introduction to the CMAQ model, are provided in section 2. Section 3 contains a description of the post-processing techniques implemented in this study. In section 4 the application of these techniques to PM2.5 is thoroughly analyzed for the 30 days of November 2010, using the rest of the data as the training period. Statistics for a full year are presented in section 5. Section 6 describes the spreading method used to apply the PM2.5 corrections to the entire CMAQ model domain, and a summary and discussion are provided in section 7.

Fig. 1. Surface PM2.5 monitoring locations from the EPA AIRNow network of 694 sites within the CMAQ model domain. Red indicates sites with more than 20% of the data for the 12 month analysis period found to be missing or flagged as invalid by an automated QC algorithm (124 sites), and blue denotes the final site locations used in this study (570 sites) that have 80% or more valid data. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

2. Data sets and quality control

The observational PM2.5 data set consists of 716 monitoring sites found in the AIRNow data set. The hourly AIRNow data used is the raw data that is available in real-time. Although the EPA also provides quality-controlled (QC'd) data (with a substantial time lag of many months), we have used the non-QC'd real-time observations, as this is the data that would be available in an operational setting. However, this also requires that we develop QC procedures that can be applied to the observational data in real-time. Of the 716 sites, 22 had completely missing or constant-value data. The locations of the remaining 694 sites are shown in Fig. 1. For many of these sites, periods with erroneous data were obvious. To identify and eliminate the data outliers, an automated QC algorithm was applied independently to the one-year time series of PM2.5 for each AIRNow site. This algorithm has the following components and thresholds:

1) Daily constant value check. If MIN(PM) = MAX(PM) for a 24 h period, then the entire day is eliminated.
2) Data over 500 μg m−3 are rejected.

The time-continuity of the hourly observations is a key indication of data quality, and two tests were implemented to identify significant data outliers in neighboring hours:

3) Single spike. If the difference in PM2.5 concentration between the present hour (h) and both the previous hour (h−1) and the following hour (h+1) is greater than 50 μg m−3, the data for hour (h) are rejected.
4) Averaged spike. If the mean concentration over three consecutive hours (h−1, h, h+1) is greater than the data in the previous hour (h−2) and the following hour (h+2) by more than 100 μg m−3, the data for hours (h−1, h, h+1) are rejected.
5) Histogram gap check. A histogram of all hourly values of PM2.5 for the year of data is compiled, and then a search for large gaps in the distribution is performed. If a large gap exists (greater than 50 μg m−3 when the maximum value is less than 200 μg m−3, or greater than 100 μg m−3 otherwise), then all the points in the tail of the distribution beyond the gap are rejected for that site.
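The threshold checks described above can be sketched in code. This is only an illustrative sketch: the function name, the NaN-for-missing convention, and the omission of the histogram gap check (check 5, which requires the full-year distribution) are our assumptions, not the authors' implementation.

```python
import numpy as np

def qc_pm25(pm):
    """Real-time QC sketch for one site's hourly PM2.5 series
    (NaN = missing), applying checks 1-4 from the text."""
    pm = pm.astype(float).copy()
    # Check 2: reject values above 500 ug m-3.
    pm[pm > 500.0] = np.nan
    # Check 1: eliminate any 24-h day whose min equals its max.
    for d in range(pm.size // 24):
        day = pm[d * 24:(d + 1) * 24]
        finite = day[np.isfinite(day)]
        if finite.size and finite.min() == finite.max():
            pm[d * 24:(d + 1) * 24] = np.nan
    # Check 3: single spike, >50 ug m-3 away from both neighbors.
    for h in range(1, pm.size - 1):
        if (abs(pm[h] - pm[h - 1]) > 50.0 and
                abs(pm[h] - pm[h + 1]) > 50.0):
            pm[h] = np.nan
    # Check 4: three-hour averaged spike, >100 ug m-3 above the
    # hours on either side of the three-hour window.
    for h in range(2, pm.size - 2):
        mean3 = np.mean(pm[h - 1:h + 2])
        if mean3 - pm[h - 2] > 100.0 and mean3 - pm[h + 2] > 100.0:
            pm[h - 1:h + 2] = np.nan
    return pm
```

NaN comparisons evaluate false, so hours already flagged by one check are skipped harmlessly by the later ones.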

In addition to the above quality control tests, a check for frequently missing data at each site was implemented. Sites missing more than 20% of the data were eliminated. Following the above tests, 570 valid sites remained to which the post-processing methods were applied, as shown by the blue solid circles in Fig. 1. A visual inspection of the observations after the quality control was applied indicated that clear outliers were almost entirely eliminated from the data set. The quality-controlled set of observed data was then used to evaluate and correct the CMAQ model. Since 2007 the North American Mesoscale (NAM) meteorological model, run using the Nonhydrostatic Mesoscale Model (NMM) core of the Weather


Research and Forecasting (WRF) system (WRF-NMM), has been used to drive the CMAQ model over the contiguous U.S. at 12 km resolution (Eder et al., 2009). During the years 2009–2010 the aerosol module version 4 (AERO-4) was used, with the Carbon Bond Mechanism version IV (CBMIV) as the gas-phase chemical mechanism (Sarwar et al., 2008), and the Sparse Matrix Operator Kernel Emissions (SMOKE) modeling system was used for emission prediction (Houyoux et al., 2000; Eder et al., 2009). In this study we apply post-processing to NMM-CMAQ model output for one year of developmental model forecasts, from December 1, 2009, through November 30, 2010. The model was initialized at 12 UTC each day and ran for 48 h. For this year-long period all of the 48-h CMAQ forecasts, as well as observations from the EPA AIRNow network, have been downloaded and archived. To extract the CMAQ model PM2.5 values at the AIRNow site locations, a parabolic interpolation of CMAQ data over the 16 grid points surrounding each AIRNow location was used. The bias correction methodologies are evaluated using the first 24 h of each forecast.

3. Post-processing techniques

Five different post-processing techniques have been applied to the CMAQ output, in all cases with the corrections calculated and applied independently for each hour of the day. Used as a reference, the first is a simple running-mean bias correction, where the model bias is calculated over a fixed time period of a number of immediately preceding days (7 days are used in this analysis). This bias is then subtracted from the current forecast. The second technique that we have applied is an approach inspired by the Kalman filter (KF). This technique was first applied to air quality forecasting by Delle Monache et al. (2006), and later by Kang et al. (2008, 2010), Djalalova et al. (2010), Tang et al. (2011), and De Ridder et al. (2012).
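The first two techniques can be sketched compactly. The function names, array layouts, and the Kalman-filter noise settings below are illustrative assumptions, not the authors' implementation (which follows Delle Monache et al., 2006).

```python
import numpy as np

def running_mean_correction(fcst_hist, obs_hist, fcst_today, ndays=7):
    """First (reference) scheme: subtract the mean model bias of the
    previous `ndays` days, computed separately for each hour of the
    day. fcst_hist/obs_hist: (n_days, 24); fcst_today: (24,)."""
    bias = np.nanmean(fcst_hist[-ndays:] - obs_hist[-ndays:], axis=0)
    return fcst_today - bias

def kf_bias(bias_series, sigma_ratio=0.06):
    """Second scheme: a recursive Kalman-filter estimate of the
    forecast bias for one site and one hour of the day. The
    process-to-observation error variance ratio and the initial
    state are assumed values for illustration."""
    x, p = 0.0, 1.0           # bias estimate and its error variance
    for y in bias_series:     # y = verified bias (forecast - obs)
        p = p + sigma_ratio   # predict: bias persists, variance grows
        k = p / (p + 1.0)     # Kalman gain (obs error variance = 1)
        x = x + k * (y - x)   # update toward the latest verified bias
        p = (1.0 - k) * p
    return x                  # predicted bias of the next forecast
```

Because the gain `k` stays well above zero, recent biases dominate the estimate, matching the text's statement that the filter weights recent data more heavily.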
The KF is a recursive algorithm that estimates a signal from noisy measurements, and has often been used in the meteorological community to improve the accuracy of initial fields in the data assimilation process. In our application the Kalman filter estimates the prediction error, which is then subtracted from the current forecast. The Kalman filter uses the entire year-long time series of data, but weights recent data more heavily. The third technique is an analog forecast technique. The analog approach has previously been applied to various meteorological forecasting problems, with recent examples being Hamill and Whitaker (2006) and Delle Monache et al. (2011, 2013), but it has not been applied to air quality. The basic concept behind analog post-processing is to search for previous model forecasts that are similar to the current forecast to be corrected, and then to use the errors from those past predictions to correct the current forecast. Here we implement an analog-based approach proposed by Delle Monache et al. (2011), in which the prediction consists of the weighted ensemble mean of the PM2.5 observations corresponding to the 10 closest historical analog forecasts, determined at individual observation sites and independently for every forecast hour. The analog technique is illustrated in Fig. 2 in schematic form for an ensemble of only two analogs. Here the red curve is a time series predicted by an AQ model over an eight-day period, and the black curve is the corresponding set of observations. The same hour of the day is considered. The data to the right of the dashed line at t = 0 represent the new forecast, while the data to the left represent the previous 7 one-day forecasts. As shown in the top panel of Fig. 2, the analog technique searches for previous forecasts at the same lead time (red stars) that are similar to the current new forecast (blue star), and re-orders the time series with the first closest analog directly preceding the new forecast, and the

Fig. 2. Schematic description of the analog ensemble forecast technique. The red curve is the AQ model prediction with today's forecast in blue, while black indicates observed values. The top panel is the original time series, and the bottom panel has the model and observation pairs re-ordered so that the closest analog from the time series immediately precedes the forecast. Predicted (top, red) and observed (bottom, black) values for the two closest forecast analogs are circled.

second closest forecast next (bottom panel of Fig. 2). The metric used to estimate the degree of analogy is the same as in Delle Monache et al. (2011, 2013). The observations (black stars) corresponding to the two best analog forecasts (circled in the bottom panel of Fig. 2) are then weighted by how closely they resemble the current forecast, and their weighted sum provides the corrected forecast estimate (referred to as AN). Although this schematic figure shows only two analogs, in practice any number of analogs using any number of search variables or combinations of variables (e.g., PM2.5, temperature, wind speed, etc.) can be used in the post-processing. The sensitivity to different combinations of analog predictors, as well as to the number of analogs in the ensemble, is presented in Section 4. The fourth technique uses a combination of analog forecasts and the Kalman filter. The difference between this technique (referred to as KFAS, or Kalman filter in Analog Space) and the regular Kalman filter is that the regular Kalman filter is applied to the standard time series of data, while for KFAS the Kalman filter is applied to the historical set of forecasts in analog space, ordered from the worst to the best analog (KFAS was referred to as ANKF in Delle Monache et al., 2011). The fifth technique also uses a combination of analog forecasts and the Kalman filter, this time applying the Kalman filter to the AN time series described above. This procedure is referred to as the KFAN technique. We note that although KF, KFAS, and KFAN use the entire time series of data, in practice several weeks of preceding forecasts are enough for the Kalman-filtering technique.

4. Winter month results

The five different post-processing methods all reduce the model errors, and to understand their comparative value it is useful to first analyze how these errors vary with time. Fig. 3 shows the CONUS-averaged PM2.5 bias of the CMAQ model for each month of the year-long data set, using hourly model output over the full diurnal cycle. A strong seasonal dependence is apparent, with the largest biases of up to 8 μg m−3 in the winter months, and with a slightly negative bias in the June and July summer months.
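The analog (AN) step described in Section 3 can be sketched as follows. The names are illustrative, and the plain Euclidean distance over the analog predictors is a simplification of the metric of Delle Monache et al. (2011), which also spans a short time window around the forecast hour.

```python
import numpy as np

def analog_forecast(hist_pred, hist_obs, today_pred, n_analogs=10):
    """AN sketch: find the n_analogs past forecasts closest to
    today's forecast and return the distance-weighted mean of their
    verifying observations.
    hist_pred: (n_days, n_predictors) past forecasts at this site
               and forecast hour; hist_obs: (n_days,) verifying obs;
               today_pred: (n_predictors,) the forecast to correct."""
    d = np.linalg.norm(hist_pred - today_pred, axis=1)
    idx = np.argsort(d)[:n_analogs]     # the closest analogs
    w = 1.0 / (d[idx] + 1e-6)           # closer analogs weigh more
    return np.sum(w * hist_obs[idx]) / np.sum(w)
```

Running this per site and per forecast hour, with PM2.5, temperature, and wind speed and direction as the predictor columns, mirrors the configuration the paper finds most skillful.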


Fig. 3. Monthly PM2.5 biases for the raw CMAQ model using all AIRNow stations available, and using hourly data values over the full diurnal cycle.

In this section we will show the skill of the various post-processing algorithms for the month of November 2010, because it was the last month available of the year's CMAQ forecasts, and because it was a winter month, for which the CMAQ model generally has larger PM2.5 errors. In section 5 a similar analysis will be shown for the entire year. Fig. 4 shows the Mean Absolute Error (MAE) and correlation for the raw CMAQ forecasts, persistence, 7-day running mean subtraction, KF, KFAS, AN, and KFAN post-processing schemes for the entire month of November 2010. Statistics for each post-processing method are calculated independently at each AIRNow location and for each forecast lead time, using the previous 11 months of data (plus those days within the month of November that would be available in real-time) to train the KF, KFAS, AN and KFAN schemes. The AN and KFAN schemes use the 10 best analogs, and the different colored bars indicate how the analog methods perform when using various combinations of meteorological search variables, referred to as analog predictors. Persistence uses the previous day's observed PM2.5 value at each hour as the forecast for today's value at that same hour. The largest MAE is found for the raw CMAQ forecast, which is considerably worse even than the simple persistence forecast, although for correlation the raw model is better than persistence. The 7-day running mean subtraction technique provides a significant improvement over the raw forecasts for both MAE and correlation, but is less skillful than persistence for MAE. The Kalman filtered forecasts show improvement compared to the 7-day running mean but are still worse than persistence for MAE. Among the three types of analog forecasts, the relative skill depends on which variables are used as analog predictors in the

calculation of the analog metric (Eq. (6) in Delle Monache et al., 2011). For all three analog types, the combination of PM2.5, temperature, and wind speed and direction as analog predictors provides the highest or second highest skill for both MAE and correlation, with the most skillful overall being KFAN. The KFAN scheme with these 4 search variables provides a 52% reduction in MAE over the raw CMAQ forecast, and a 30% increase in correlation coefficient. The remainder of our analysis described below uses these four search variables. Another important issue is to determine the optimal number of members to be included in the analog ensemble. A sensitivity analysis was run using 3, 5, 10, 20 and 30 analogs, and the MAE for each size ensemble is shown in Table 1. As the KFAS method uses the entire series of analogs (ordered from the worst to the best), it is independent of ensemble size, but it is included in the table for reference. The AN method shows a modest dependency on ensemble size, with most of the improvement occurring with a 10-member ensemble and then a smaller incremental improvement for larger size ensembles. KFAN shows a stronger dependency for all sizes of ensembles, and the improvement also appears to gradually asymptote to zero for very large ensembles. For the remainder of the paper, we use an ensemble of 10 analogs for the AN and KFAN schemes, while recognizing that marginally better forecasts could be obtained with a larger ensemble. Fig. 5 shows the November root-mean-square-error (RMSE) for the raw CMAQ forecasts, persistence, 7-day running mean subtraction, KF, KFAS, AN, and KFAN post-processing schemes in the top panel. The middle and lower panels break the RMSE into its unsystematic (random, or error variance) and systematic (mean bias) components, as follows:

RMSE^2 = \frac{1}{N}\sum_{k=1}^{N}\left(F_k - O_k\right)^2 = RMSE_U^2 + RMSE_S^2

where

RMSE_U = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left[\left(F_k - \bar{F}\right) - \left(O_k - \bar{O}\right)\right]^2}

and

RMSE_S = BIAS = \bar{F} - \bar{O}.
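This decomposition is an exact identity (the mean squared error equals the variance of the error plus the squared mean error) and can be verified numerically; the function name below is illustrative.

```python
import numpy as np

def rmse_decomposition(f, o):
    """Split total RMSE^2 into its unsystematic (random) and
    systematic (mean-bias) parts, following the equations above.
    f: forecasts; o: verifying observations (1-D arrays)."""
    rmse2 = np.mean((f - o) ** 2)
    # Unsystematic part: RMSE of the mean-removed error.
    rmse_u = np.sqrt(np.mean(((f - f.mean()) - (o - o.mean())) ** 2))
    # Systematic part: the mean bias.
    rmse_s = f.mean() - o.mean()
    return rmse2, rmse_u, rmse_s
```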

Fig. 4. MAE (left panel) and Spearman correlation coefficient (right panel) using hourly observed and forecast PM2.5 values for the month of November, 2010, for the raw CMAQ model, persistence, and five different post-processing schemes. For the three analog methods, the color bars indicate the various combinations of search variables (i.e., analog predictors) used. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)


Table 1. Analog techniques' dependence on the number of ensemble members. MAE (μg m−3) calculated on an hourly basis for three bias-correction schemes: KFAS, AN and KFAN. Percentages indicate the improvement from the previous smaller ensemble size.

Number of ensemble members |   3   |       5        |      10       |      20       |      30
KFAS                       | 4.974 | 4.974          | 4.974         | 4.974         | 4.974
AN                         | 4.417 | 4.237 (+4.1%)  | 4.097 (+3.3%) | 4.041 (+1.4%) | 4.038 (+0.07%)
KFAN                       | 4.463 | 4.256 (+4.6%)  | 4.082 (+4.1%) | 3.996 (+2.1%) | 3.975 (+0.5%)

All of the post-processing techniques including persistence are found to do a good job of removing the systematic mean bias component, which is not surprising. However, even the unsystematic component can be reduced significantly through postprocessing, with the best results again coming from the KFAN scheme, which produces a 35% reduction. The strong seasonal dependency of bias shown in Fig. 3 may affect the quality of the analogs that are found. For other variables (e.g., wind speed), even a 6-month training has been shown to be sufficient for AN to be able to remove the bias and significantly reduce the random errors (e.g., Delle Monache et al., 2011). However, for PM2.5 the error's seasonal dependency may require multiple copies of the same season (i.e., a multi-year training data set) for AN to be more effective. The diurnal variation of PM2.5 averaged over all 570 AIRNow sites, for the observations, raw CMAQ model, and the best analog scheme, KFAN, is shown in Fig. 6 for the month of November, 2010. All diurnal cycles were calculated considering the local time, i.e., accounting for four different time zones across the study domain. The raw model not only has an overall high bias, but this bias varies considerably through the day, such that the model has a much exaggerated diurnal variation. The KFAN scheme reduces the amplitude of the forecast diurnal variation to be very close to that observed, as a result of applying KF independently to each forecast hour in combination with the analog method's ability to capture the flow-dependent characteristics of the errors (Delle Monache et al., 2013). Fig. 7 shows the hourly variation of PM2.5 for the month of November, again averaged over the 570 AIRNow observation sites. Red indicates the raw CMAQ forecasts, black the observed values, and blue the KFAN post-processed values. 
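The local-time diurnal averaging used for these comparisons can be sketched as follows; the array layout and the single-UTC-offset-per-site assumption are ours for illustration, since the paper does not give its exact implementation.

```python
import numpy as np

def local_diurnal_mean(pm_utc, utc_offset):
    """Average an hourly series (starting at 00 UTC, any length)
    into a 24-h diurnal cycle in local time. `utc_offset` is the
    site's offset in hours (e.g. -5 for Eastern Standard Time)."""
    hours_local = (np.arange(pm_utc.size) + utc_offset) % 24
    return np.array([np.nanmean(pm_utc[hours_local == h])
                     for h in range(24)])
```

Computing this per site and then averaging the 24-h cycles across sites accounts for the four time zones spanned by the study domain.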
The hourly KFAN forecasts are seen to follow the observed PM2.5 diurnal variations very closely through the entire month, with occasional maximum

Fig. 5. Total RMSE (top panel), unsystematic component of the RMSE (middle panel), and systematic component of RMSE (bottom panel) using hourly values for the month of November evaluated at the 570 AIRNow PM2.5 sites and for 1–24 h forecasts.

differences of 2–3 μg m−3. Another aspect of the post-processing that is important to understand is its impact on large PM2.5 forecast errors. Fig. 8 shows the histogram of PM2.5 errors for the raw CMAQ forecasts and the KFAN corrected forecasts, again for the month of November. The raw CMAQ error histogram shows a large skewness towards high, positive PM2.5 errors, with frequent errors greater than 10–20 μg m−3. In contrast, the KFAN corrected forecasts have a Gaussian shape, symmetric about zero error, and with a sharp peak at zero error. Fig. 9 shows the MAE of the raw CMAQ (red) and KFAN (blue) models as a function of the raw CMAQ forecast error. The first and last bins show the MAE for all negative CMAQ errors less than −200 μg m−3 (far left side of the graph) and for all positive CMAQ errors greater than 200 μg m−3 (far right side of the graph). The number of occurrences within each error bin is shown in the lower panel on a log scale. The improvement is significant even for infrequent very large errors. For example, there are rare occurrences of 40–200 μg m−3 errors in the raw CMAQ forecasts which are reduced by 40–100% in the KFAN corrected forecasts. We note that the degree of correction is asymmetrical, with large positive errors (model overforecasts, or false alarm events) being nearly completely corrected, while large negative errors (model underforecasts) have their errors reduced by approximately 9%. The asymmetrical degree of correction may occur if the large negative errors (model underforecasts) are associated with sporadic forest fire events, for which close forecast analogs may not be associated with similar observed fire events, since the emissions used to drive

Fig. 6. The diurnal variation of PM2.5 averaged over all 570 AIRNow sites for 1–24 h forecasts and the month of November, for the raw CMAQ model (red), AIRNow observations (black) and KFAN post-processed forecasts (blue). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)


Fig. 7. Hourly time-series of PM2.5 averaged over the 570 AIRNow sites for 1–24 h forecasts and the month of November, for the raw CMAQ model (red), AIRNow observations (black) and KFAN post-processed forecasts (blue). The vertical lines mark 12 UTC each day.

CMAQ in this study do not include fires. Indeed, the correlation between the CMAQ predictions and the observations for large underforecasts (bias < −50 μg m−3) is 0.022, while for large overforecasts (bias > 50 μg m−3) it is 0.570, indicating a weaker relationship between model and observations for underforecasts that leads to finding analogs of inferior quality. Fig. 10 breaks the MAE improvement of the KFAN forecasts into bins of the observed PM2.5 magnitude. The last bin on the right side of the plot shows the MAE for all observed PM2.5 values greater than 200 μg m−3. Large observed PM2.5 events (>100 μg m−3), which would have the largest health impacts, are found to have reductions of ~13% in the associated errors.

5. Full year results

The analysis for the month of November described above

utilized the preceding year-long data set for training the various post-processing schemes. Next we demonstrate that the improvements found for November for the best post-processing scheme, KFAN, are similar to those found for each month of the year. This yearly error analysis is computed 1 month at a time, from December 2009 through November 2010, using equal-length training periods formed by rotating the 12 available calendar months so that there are always 11 months that follow chronologically through the calendar year. For example, an analysis of the month of August 2010 starts its 11-month training time series with September–November 2010, followed by December 2009, and is then completed with January–July 2010. MAE and correlation for both the raw CMAQ model and for the KFAN corrected forecasts are shown in Fig. 11 on a monthly basis. The MAE of the raw CMAQ model is clearly worse in the winter and early spring months (November–March), reaching a minimum in May and June. The correlation shows a significantly different pattern, generally with a smaller percentage month-to-month variation, but with February–March having the highest (best) correlation, and July–August having the lowest (worst) correlation. For both MAE and correlation, the KFAN method improves the forecasts significantly, with the resulting forecast skill being more constant through the year. The raw CMAQ MAE varies by as much as 4 μg m−3 between months whereas the KFAN MAE varies by less than 2 μg m−3. The raw CMAQ correlation coefficients vary from 0.38 to 0.56 between months, while the KFAN correlation varies from 0.59 to 0.69. The yearly averaged improvement of KFAN compared to the raw CMAQ model is 46% for MAE and 30.5% for correlation, and both of these values are very close to the improvement found for November 2010.

6. Spreading of forecast corrections

Fig. 8. Histogram of hourly forecast errors of PM2.5 for the raw CMAQ model (red) and the KFAN corrected forecasts (blue), for the month of November and 570 AIRNow sites. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

The post-processing algorithms described here are implemented at point locations where verifying observations are available. However, forecast information is frequently disseminated as either a gridded or graphical map product. In this section a


Fig. 9. Top panel: MAE of the KFAN (blue) and the raw CMAQ (red) forecasts, as a function of the error of the raw CMAQ forecast, using hourly values for the month of November. Bottom panel: The number of events in each CMAQ error bin, shown on a log scale. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

procedure to generate a bias-corrected two-dimensional (2D) gridded field or forecast map is presented. To describe the spreading technique we consider three fields: the forecast of what the model bias will be at a future time (KFAN − CMAQ); the observed or verified raw model bias (obs − CMAQ); and the residual bias that remains once the forecast bias is removed from the raw model (obs − KFAN). Fig. 12a shows the KFAN forecast of the CMAQ PM2.5 biases at the 570 AIRNow sites for one particular day, November 12, 2010, at forecast hour 15, which has the highest concentration over the

CONUS for November, as seen in the PM2.5 timeseries in Fig. 7. The KFAN forecast is that CMAQ will overestimate PM2.5 concentrations through most of the eastern U.S. In contrast, the KFAN forecast is that CMAQ will significantly underestimate the PM2.5 concentration in parts of central California as well as a few locations in Oregon and Washington. In Fig. 12b the verifying CMAQ bias is shown for the same hour. In general, the spatial characteristics of the forecast of the bias match the verifying bias quite well, including the overestimate of PM2.5 in the eastern U.S. and the underestimate at some west coast sites. Fig. 12c shows the residual

Fig. 10. Top panel: MAE of the KFAN (blue) and the raw CMAQ (red) forecasts, as a function of the magnitude of the observed PM2.5 concentration, using hourly values for the month of November. Bottom panel: The number of events in each observation data bin, shown on a log scale. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

438

Corrigendum / Atmospheric Environment 119 (2015) 431e442

Fig. 11. Monthly statistics for the entire year of available forecasts, for MAE (top panels) and Spearman correlation coefficient (bottom panels) and for the raw CMAQ model (left panels) and KFAN corrected forecasts (right panels).

bias from the KFAN forecasts at each of the AIRNow sites. It is clear that the residual biases are much smaller than those from the raw model output. However, Fig. 12a (or a similar plot of the KFAN forecast values of PM2.5 at the AIRNow sites) is not optimal as there is no forecast information available at locations other than the AIRNow observation sites. Creation of a continuous contour forecast map requires spreading of the bias-correction information from the points where it is calculated to other nearby model grid points. Our spreading method uses a Barnes-type iterative objective analysis scheme based on the work of Glahn et al. (2012). It employs an iterative 8pass Gaussian scheme that first computes the interpolated error correction using a radius of influence of 2000 km, then at 1000 km, 500 km, 250 km, 125 km, 62 km, 31 km and finally at 15 km, which is comparable to the model resolution of 12 km. The spreading technique is used for every forecast hour if both the CMAQ model output and KFAN are available. The goal of the technique is to spread the KFAN forecast of what the CMAQ model bias Sbiask will be at each observation site k to each model grid point, referred to as the gridded forecast bias Gbiasij. The KFAN predicted site bias Sbiask is simply

Sbias_k = KFAN_k − CMAQ_k    (1)

where CMAQ_k is the raw model forecast at site k. Sbias_k remains the same through all iterations. The iterative process estimates Gbias_ij^m starting from the largest scales and working down to local scales, where m is the scale iteration counter, m = 1, …, 8. At the start of the iterative process we take the initial model grid-point bias to be zero,

Gbias_ij^m = 0  for m = 1    (2)

The gridded model bias is then estimated for the next iteration step as

Gbias_ij^(m+1) = Gbias_ij^m + (1/N_k) Σ_{d_k < R} [(R² − d_k²) / (R² + d_k²)] (Sbias_k − Gbias_ij^m)    (3)

where R is the radius of influence, N_k is the number of sites within R, and d_k (with d_k < R) is the distance from grid point (i,j) to site k. The summation is done over all observation sites inside the circle of influence, d_k < R. After the final iteration, the gridded bias Gbias_ij is added to the raw model forecast to produce the corrected gridded forecast,
CMAQCORR_ij = CMAQ_ij + Gbias_ij    (4)

It is possible that some of the corrected CMAQ grid values will be negative after applying (4). This mostly happens at grid locations far from measurement stations, such as over the ocean. There the model usually has a small PM2.5 value, and when the large positive model bias generally present over land is spread to the ocean grid points, a negative corrected PM2.5 forecast value can result. In this case the negative values are replaced by zero, so that


Fig. 12. Forecast (a), observed (b) and residual (c) CMAQ biases at the AIRNow sites for forecast hour 15 on November 12, 2010. Black dots indicate sites without valid data, magenta dots indicate sites with positive errors larger than 50 μg m−3, and grey dots indicate sites with negative errors less than −50 μg m−3.

if CMAQCORR_ij < 0, then CMAQCORR_ij = 0    (5)
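Eqs. (1)–(5) together define a compact algorithm: start the gridded bias at zero, refine it over eight passes with shrinking radii of influence, add it to the raw model field, and clip negative values. The following is a minimal sketch of that procedure; the function name, the kilometre-based coordinates, and the brute-force distance matrix are our own illustrative choices, not the operational NOAA/NCAR implementation.

```python
import numpy as np

def spread_bias(site_xy, sbias, gx, gy,
                radii=(2000, 1000, 500, 250, 125, 62, 31, 15)):
    """Barnes-type iterative spreading of site biases onto a grid, Eqs. (2)-(3).

    site_xy : (n_sites, 2) station coordinates in km
    sbias   : (n_sites,) KFAN-predicted site biases, Eq. (1)
    gx, gy  : 2-D arrays of grid-point coordinates in km
    """
    site_xy = np.asarray(site_xy, dtype=float)
    sbias = np.asarray(sbias, dtype=float)
    # distance from every grid point to every site, shape (n_grid, n_sites)
    d = np.hypot(gx.ravel()[:, None] - site_xy[None, :, 0],
                 gy.ravel()[:, None] - site_xy[None, :, 1])
    g = np.zeros(d.shape[0])                 # Eq. (2): initial grid bias is zero
    for R in radii:                          # m = 1..8, shrinking radius of influence
        inside = d < R                       # sites within the circle of influence
        w = np.where(inside, (R * R - d * d) / (R * R + d * d), 0.0)
        n_k = inside.sum(axis=1)             # N_k in Eq. (3)
        incr = (w * (sbias[None, :] - g[:, None])).sum(axis=1)
        g = np.where(n_k > 0, g + incr / np.maximum(n_k, 1), g)
    return g.reshape(gx.shape)

# Eq. (1):  sbias = kfan_site - cmaq_site   (computed at each observation site)
# Eqs. (4)-(5): corrected field, clipped at zero:
#   cmaq_corr = np.maximum(cmaq_grid + spread_bias(site_xy, sbias, gx, gy), 0.0)
```

Note that a grid point farther than 2000 km from every station receives no update on any pass and so retains the raw model value, while grid points near a station converge toward that station's bias as the radius shrinks.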

Fig. 13. The gridded KFAN forecast PM2.5 bias field (Gbias_ij) at forecast hour 15 on November 12, 2010 (background colors). The overlain open circles show the corresponding observed bias in the raw CMAQ model at the AIRNow observation sites (shown in Fig. 12b).

Since the PM2.5 national standard is a 24-h average, we present an example of the gridded interpolated forecasts for the 24-h average of a single day, November 12, 2010. Fig. 14 shows the original raw CMAQ gridded forecast (a), the gridded bias-corrected forecast (b), and the observed PM2.5 (c). Modification of the geographical pattern of the original raw PM2.5 field is readily apparent. The KFAN model has less skill at improving under-forecasts than over-forecasts, as was shown in Figs. 9 and 10. The effect of this can be seen in California, where the PM concentration is under-predicted in CMAQ and is not appreciably changed in the corrected CMAQ forecasts. In contrast, in the eastern U.S., the largely over-predicted PM concentrations are significantly reduced to match the observed values. Overall, the corrected forecast map is in much closer agreement with the observed PM2.5, as seen in Fig. 14b and c. We also note that the visual characteristics of the corrected map are similar to those of the original map, so that it could be used for public dissemination. After spreading the PM2.5 bias from the 570 site locations over the entire CMAQ domain, the corresponding forecast PM2.5 values at the AIRNow observation sites are calculated in the same way as was done previously for the CMAQ_RAW forecasts, namely through parabolic interpolation using the 16 nearest grid points of the CMAQ_CORR forecasts. At each AIRNow location there are then four PM2.5 values that can be compared: the observed PM2.5, KFAN calculated at each location, and the raw CMAQ and CMAQ_CORR data interpolated from the 16 surrounding model grid points. The hourly variation of PM2.5 for the month of November, averaged over
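The grid-to-station sampling step can be illustrated with a simpler stand-in. The paper interpolates the corrected field to each AIRNow site parabolically from the 16 nearest grid points; the sketch below uses bilinear interpolation from the 4 surrounding points instead, which conveys the same idea on a regular grid. The function name and the regular-grid assumption are ours.

```python
import numpy as np

def bilinear_at_site(field, grid_x, grid_y, x, y):
    """Bilinearly interpolate `field` (indexed field[i, j] with i along x)
    from a regular grid (grid_x, grid_y) to station location (x, y)."""
    # index of the grid cell containing (x, y), clipped to the domain
    i = np.clip(np.searchsorted(grid_x, x) - 1, 0, len(grid_x) - 2)
    j = np.clip(np.searchsorted(grid_y, y) - 1, 0, len(grid_y) - 2)
    # fractional position inside the cell
    tx = (x - grid_x[i]) / (grid_x[i + 1] - grid_x[i])
    ty = (y - grid_y[j]) / (grid_y[j + 1] - grid_y[j])
    return ((1 - tx) * (1 - ty) * field[i, j]
            + tx * (1 - ty) * field[i + 1, j]
            + (1 - tx) * ty * field[i, j + 1]
            + tx * ty * field[i + 1, j + 1])

# The 24-h average (the NAAQS metric) is simply the mean of 24 hourly
# corrected fields, e.g.:
#   daily_mean = np.mean(np.stack(hourly_fields), axis=0)
```

The same routine can be applied to either the raw CMAQ or the corrected CMAQ_CORR field, so the two interpolated series are directly comparable at every station.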


Fig. 14. The original gridded raw CMAQ forecast for 24-h averaged PM2.5 concentrations on November 12, 2010 (a), the CMAQ_CORR forecast field (b), and the observed PM2.5 at the AIRNow sites (c).

the 570 AIRNow observation sites is shown in Fig. 15, similar to Fig. 7. The magenta line denoting PM2.5 values from the CMAQ_CORR interpolated fields is very close to the KFAN blue line.

Fig. 15. Hourly time series of PM2.5 averaged over the 570 AIRNow sites for the month of November, for the AIRNow observations (black), raw CMAQ model (red), KFAN post-processed point forecasts (blue), and the corrected CMAQ_CORR model (magenta).

To determine how well the KFAN method corrects the model forecast values at locations where no observations exist, an additional analysis has been done in which selected observations are removed from the bias-correction and spreading procedures and used for verification only. We excluded 3, 5, 10, 15, 20, 25, 50, and 90% of the observation sites from the spreading procedure, randomly choosing the sites in 70 different configurations for each percentage value, which were then averaged. The MAE of PM2.5 and correlation coefficients averaged over all the sites (blue bars) and over only the excluded sites (magenta bars) are shown in Fig. 16. The MAE increases with the percentage of excluded sites (Fig. 16, left panel), but even with up to 25% of the sites excluded, the MAE increases only slightly. The correlation (Fig. 16, right panel) is somewhat more sensitive to the number of excluded sites, with an approximately 10% decrease in correlation when 25% of the sites are excluded, and a sharper decrease when excluding more sites. Overall, these results give us confidence that the KFAN and spreading techniques provide accurate corrections even at model grid points where there are no observations.

Fig. 16. MAE (left panel) and correlation coefficients (right panel) of PM2.5, averaged over the 570 AIRNow sites for forecast hour 15 on November 12, 2010, for the raw CMAQ model (red), KFAN post-processed forecasts (blue), and KFAN post-processed forecasts with different percentages of excluded sites (magenta). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
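The site-withholding experiment described above is straightforward to reproduce in outline. The sketch below draws repeated random exclusion sets, rebuilds predictions from the retained sites via a caller-supplied function (a hypothetical stand-in for the full KFAN-plus-spreading chain), and scores MAE on the withheld sites only; `withheld_mae` and its arguments are illustrative names, not from the paper.

```python
import numpy as np

def withheld_mae(obs, pred_fn, frac, n_trials=70, seed=0):
    """MAE over withheld sites, averaged across random exclusion draws.

    obs     : (n_sites,) observed values at all stations
    pred_fn : callable(keep_indices) -> (n_sites,) predictions at ALL sites,
              built using only the kept stations (stand-in for KFAN + spreading)
    frac    : fraction of sites to withhold (e.g. 0.25)
    """
    rng = np.random.default_rng(seed)
    n = len(obs)
    n_excl = max(1, int(round(frac * n)))
    maes = []
    for _ in range(n_trials):
        excl = rng.choice(n, size=n_excl, replace=False)   # withheld stations
        keep = np.setdiff1d(np.arange(n), excl)
        pred = pred_fn(keep)                               # corrections from kept sites only
        maes.append(np.mean(np.abs(pred[excl] - obs[excl])))
    return float(np.mean(maes))
```

Averaging over many random configurations, as done with 70 draws per percentage in the paper, prevents a single unlucky spatial arrangement of withheld stations from dominating the verification statistic.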

7. Summary and future plans

Post-processing techniques that can be applied to real-time air quality forecasts of PM2.5 have been evaluated using one year of CMAQ model output and a year of AIRNow PM2.5 data spanning the CONUS domain. Real-time quality control procedures have been developed to eliminate invalid measurement values in the observational data. An implementation and evaluation of five post-processing techniques has then been completed, including (1) a simple 7-day running-mean bias correction, (2) a Kalman filter (KF) bias correction, (3) the weighted mean of the observed PM2.5 corresponding to the 10 best analog forecasts (AN), (4) a KF applied to the series of ordered analog forecasts (KFAS), and (5) a KF applied to AN (KFAN). All of these techniques show a significant improvement over the original forecasts, which suffer from very large diurnally and seasonally varying biases.

The skill of the analog forecasts depends on which variables are included in the analog search. Although a significant improvement is found when using only PM2.5 itself, combining PM2.5 with temperature generates a further significant improvement. Increasing the set of analog predictors to include wind speed and direction adds incremental skill to the analog forecasts. The overall best forecast skill for PM2.5 is achieved using the KFAN approach.

A technique has been developed for interpolating the station-corrected model forecasts to the entire CMAQ grid, allowing for display of post-processed maps of PM2.5. The visual and spatial characteristics of the corrected forecast maps are similar to those of the raw CMAQ forecasts, such that they could readily be used for public dissemination.

Several ways to improve the model will be the goal of our future work, among them:

• Modifying the spreading technique to create a smoother transition at the zero PM2.5 contour.
• Forest fire emissions. The present analysis used the 2010 version of the developmental CMAQ forecasts, which did not include any forest-fire-associated emissions. Beginning in 2014, fire emissions have been added to the model (Lee et al., 2013), and the bias-correction algorithm will be evaluated and modified if necessary to account for these events.

Acknowledgments

This research is funded by the United States Weather Research Program (USWRP). NCAR is funded in part by the U.S. National Science Foundation (AGS-0753581).

References

Borrego, C., Monteiro, A., Pay, M.T., Ribeiro, I., Miranda, A.I., Basart, S., Baldasano, J.M., 2011. How bias-correction can improve air quality forecasts over Portugal. Atmos. Environ. 45, 6629–6641. http://dx.doi.org/10.1016/j.atmosenv.2011.09.006.
Delle Monache, L., Nipen, T., Deng, X., Zhou, Y., Stull, R., 2006. Ozone ensemble forecasts: 2. A Kalman filter predictor bias correction. J. Geophys. Res. 111, D05308. http://dx.doi.org/10.1029/2005JD006311.
Delle Monache, L., Wilczak, J., McKeen, S., Grell, G., Pagowski, M., Peckham, S., Stull, R., McHenry, J., McQueen, J., 2008. A Kalman-filter bias correction of ozone deterministic, ensemble-averaged, and probabilistic forecasts. Tellus B 60, 238–249. http://dx.doi.org/10.1111/j.1600-0889.2007.00332.x.
Delle Monache, L., Nipen, T., Liu, Y., Roux, G., Stull, R., 2011. Kalman filter and analog schemes to postprocess numerical weather predictions. Mon. Wea. Rev. 139, 3554–3570. http://dx.doi.org/10.1175/2011MWR3653.1.
Delle Monache, L., Eckel, T., Rife, D., Nagarajan, B., 2013. Probabilistic weather prediction with an analog ensemble. Mon. Weather Rev. 141, 3498–3516. http://dx.doi.org/10.1175/MWR-D-12-00281.1.
De Ridder, K., Kumar, U., Lauwaet, D., Blyth, L., Lefebvre, W., 2012. Kalman filter-based air quality forecast adjustment. Atmos. Environ. 50, 381–384. http://dx.doi.org/10.1016/j.atmosenv.2012.01.032.
Djalalova, I., Wilczak, J., McKeen, S., Grell, G., Peckham, S., Pagowski, M., McQueen, J., Lee, P., Tang, Y., McHenry, J., Gong, W., Bouchet, V., Mathur, R., 2010. Ensemble and bias-correction techniques for probabilistic forecast of surface O3 and PM2.5 during the TEXAQS-II experiment of 2006. Atmos. Environ. 44, 455–467. http://dx.doi.org/10.1016/j.atmosenv.2009.11.007.


Eder, B., Kang, D., Mathur, R., Pleim, J., Yu, S., Otte, T., Pouliot, G., 2009. A performance evaluation of the National Air Quality Forecast Capability for the summer of 2007. Atmos. Environ. 43, 2312–2320. http://dx.doi.org/10.1016/j.atmosenv.2009.01.033.
Glahn, R., Im, J.-S., Wagner, G., 2012. Objective analysis of MOS forecasts and observations in sparse data regions. In: 92nd AMS Annual Meeting, 21st Conf. on Probability and Statistics, 22–26 January 2012, New Orleans, LA.
Gorline, J.L., Lee, P., 2009. Performance evaluation of NOAA-EPA developmental aerosol forecasts. Environ. Fluid Mech. 9, 109–120. http://dx.doi.org/10.1007/s10652-008-9090-7.
Hamill, T.M., Whitaker, J.S., 2006. Probabilistic quantitative precipitation forecasts based on reforecast analogs: theory and application. Mon. Wea. Rev. 134, 3209–3229. http://dx.doi.org/10.1175/MWR3237.1.
Honoré, C., et al., 2008. Predictability of European air quality: assessment of 3 years of operational forecasts and analyses by the PREV'AIR system. J. Geophys. Res. 113, D04301. http://dx.doi.org/10.1029/2007JD008761.
Houyoux, M.R., Vukovich, J.M., Coats Jr., C.J., Wheeler, N.J.M., Kasibhatla, P.S., 2000. Emission inventory development and processing for the seasonal model for regional air quality (SMRAQ) project. J. Geophys. Res. 105, 9079–9090, 0148-0227/00/1999JD900975.
Kang, D., Mathur, R., Rao, S.T., Yu, S., 2008. Bias adjustment techniques for improving ozone air quality forecasts. J. Geophys. Res. 113, D23308. http://dx.doi.org/10.1029/2008JD010151.
Kang, D., Mathur, R., Trivikrama Rao, S., 2010. Real-time bias-adjusted O3 and PM2.5 air quality index forecasts and their performance evaluations over the continental United States. Atmos. Environ. 44, 2203–2212. http://dx.doi.org/10.1016/j.atmosenv.2010.03.017.
Lee, P., Kim, H., Estes, H., 2013. Ingestion of intermittent wild fire sources inside and outside the forecasting domain. In: Steyn, D.G., Builtjes, P. (Eds.), Air Pollution Modeling and its Applications, vol. XIII. Springer.
Mathur, R., Yu, S., Kang, D., Schere, K.L., 2008. Assessment of the wintertime performance of developmental particulate matter forecasts with the Eta-Community Multiscale Air Quality modeling system. J. Geophys. Res. 113, D02303.
McKeen, S., Grell, G., Peckham, S., Wilczak, J., Djalalova, I., Hsie, E.-Y., Frost, G., Peischl, J., Schwarz, J., Spackman, R., Middlebrook, A., Holloway, J., de Gouw, J., Warneke, C., Gong, W., Bouchet, V., Gadreault, S., Racine, J., McHenry, J., McQueen, J., Lee, P., Tang, Y., Carmichael, G.R., Mathur, R., 2009. An evaluation of real-time air quality forecasts and their urban emissions over Eastern Texas during the summer of 2006 TexAQS field study. J. Geophys. Res. 114, D00F11. http://dx.doi.org/10.1029/2008JD011697.
McKeen, S., Chung, S.H., Wilczak, J., Grell, G., Djalalova, I., Peckham, S., Gong, W., Bouchet, V., Moffet, R., Tang, Y., Carmichael, G.R., Mathur, R., Yu, S., 2007. Evaluation of several PM2.5 forecast models using data collected during the ICARTT/NEAQS 2004 field study. J. Geophys. Res.-Atmos. 112 (D10), D10S20. http://dx.doi.org/10.1029/2006JD007608.
McKeen, S., Wilczak, J., Grell, G., Djalalova, I., Peckham, S., Hsie, E.Y., Gong, W., Bouchet, V., Menard, S., Moffet, R., McHenry, J., McQueen, J., Tang, Y., Carmichael, G.R., Pagowski, M., Chan, A., Dye, T., Frost, G., Lee, P., Mathur, R., 2005. Assessment of an ensemble of seven real-time ozone forecasts over eastern North America during the summer of 2004. J. Geophys. Res.-Atmos. 110 (D21), D21307. http://dx.doi.org/10.1029/2005JD005858.
Sarwar, G., Luecken, D., Yarwood, G., Whitten, G., Carter, W.P., 2008. Impact of an updated carbon bond mechanism on predictions from the CMAQ modeling system: preliminary assessment. J. Appl. Meteor. Climatol. 47, 3–14. http://dx.doi.org/10.1175/2007JAMC1393.1.
Solazzo, E., et al., 2012. Operational model evaluation for particulate matter in Europe and North America in the context of AQMEII. Atmos. Environ. 53, 75–92. http://dx.doi.org/10.1016/j.atmosenv.2012.02.045.
Stajner, I., Davidson, P., Byun, D., McQueen, J., Draxler, R., Dickerson, P., Meagher, J., 2012. US national air quality forecast capability: expanding coverage to include particulate matter. In: Steyn, D.G., Castelli, S.T. (Eds.), NATO/ITM Air Pollution Modeling and its Application XXI. Springer, Netherlands, pp. 379–384. http://dx.doi.org/10.1007/978-94-007-1359-8_64.
Tang, X., Zhu, J., Wang, Z.F., Gbaguidi, A., 2011. Improvement of ozone forecast over Beijing based on ensemble Kalman filter with simultaneous adjustment of initial conditions and emissions. Atmos. Chem. Phys. Discuss. 11, 7811–7849. http://dx.doi.org/10.5194/acpd-11-7811-2011.
Wilczak, J.M., Djalalova, I., McKeen, S., Bianco, L., Bao, J., Grell, G., Peckham, S., Mathur, R., McQueen, J., Lee, P., 2009. Analysis of regional meteorology and surface ozone during the TexAQS II field program and an evaluation of the NMM-CMAQ and WRF-Chem air quality models. J. Geophys. Res. 114, D00F14. http://dx.doi.org/10.1029/2008JD011675.
Wilczak, J., McKeen, S., Djalalova, I., Grell, G., Peckham, S., Gong, W., Bouchet, V., Moffet, R., McHenry, J., McQueen, J., Lee, P., Tang, Y., Carmichael, G.R., 2006. Bias-corrected ensemble and probabilistic forecasts of surface ozone over eastern North America during the summer of 2004. J. Geophys. Res. 111, D23S28. http://dx.doi.org/10.1029/2006JD007598.