Atmospheric Environment 33 (1999) 4663}4674
An enhanced ozone forecasting model using air mass trajectory analysis W. Geo!rey Cobourn *, Milton C. Hubbard Department of Mechanical Engineering, Speed Scientixc School, University of Louisville, Louisville, KY 40292, USA Automated Analysis Corporation, Peoria, IL, USA Received 15 December 1998; accepted 16 April 1999
Abstract An enhanced ozone forecasting model using nonlinear regression and an air mass trajectory parameter has been developed and "eld tested. The model performed signi"cantly better in predicting daily maximum 1-h ozone concentrations during a "ve-year model calibration period (1993}1997) than did a previously reported regression model. This was particularly true on the 28 `high ozonea days ([O ]'120 ppb) during the period, for which the mean absolute error (MAE) improved from 21.7 to 12.1 ppb. On the 77 days meteorologically conducive to high ozone, the MAE improved from 12.2 to 9.1 ppb, and for all 580 calibration days the MAE improved from 9.5 to 8.35 ppb. The model was "eld-tested during the 1998 ozone season, and performed about as expected. Using actual meteorological data as input for the ozone predictions, the MAE for the season was 11.0 ppb. For the daily ozone forecasts, which used meteorological forecast data as input, the MAE was 13.4 ppb. The high ozone days were all anticipated by the ozone forecasters when the model was used for next day forecasts. 1999 Elsevier Science Ltd. All rights reserved. Keywords: Air pollution; Air quality; Ground-level ozone; Nonlinear regression
1. Introduction and background Ground-level ozone (O ) has been a serious air pol lution problem for several decades in many urban communities in the United States. The US Environmental Protection Agency has recently promulgated a revised National Ambient Air Quality Standard (NAAQS) for ozone, based on the daily maximum 8-h average concentration. The new 80 ppb, 8-h standard is based on health studies which demonstrate serious human health e!ects from prolonged exposures to ozone concentrations signi"cantly below the previous 120 ppb, 1-h standard (US EPA, 1996). In 1996, approximately 39 million people lived in counties where the ozone NAAQS had been exceeded (US EPA, 1998a). The revised NAAQS for ozone will bring many additional communities into a state of non-compliance with the clean air act, and will probably cause delays in reaching compliance
* Corresponding author.
for communities that were already in violation of the NAAQS for ozone. For many of the individuals sensitive to high ozone levels (e.g. asthmatics and children), it may therefore be years or even decades before they can regard the air quality to be `healthya, as de"ned by the NAAQS. Recent regulatory actions by the EPA calling for additional reductions in emissions of VOCs and NO comV pounds, hold promise for eventually improving the ozone air quality. In the meantime, people who su!er ozonerelated respiratory problems would bene"t from access to accurate forecasts of high ozone days, so that they could reduce their health risk on those days by staying indoors or reducing outdoor activities. In the past few years, many urban areas have enacted voluntary ozone control programs, aimed at increasing public awareness and participation in local clean air e!orts. A key component of these programs is the forecasting of probable high ozone days, and subsequent proclamation of an `ozone action daya (OAD) in the community. Health warnings or health advisories for asthmatics, children and others at risk are typically
1352-2310/99/$ - see front matter 1999 Elsevier Science Ltd. All rights reserved. PII: S 1 3 5 2 - 2 3 1 0 ( 9 9 ) 0 0 2 4 0 - X
4664
W.G. Cobourn, M.C. Hubbard / Atmospheric Environment 33 (1999) 4663}4674
issued along with the OAD. In some of the programs, voluntary episodic control measures such as carpooling or industry cutbacks are encouraged. It is clear that the public acceptance and success of the OAD programs will depend crucially on the accuracy of the ozone forecasts. Many di!erent techniques are used for forecasting ozone, and the accuracy of the forecasts varies widely among these programs. Some of the techniques are ad hoc methods, devised by local o$cials, and others are based on more rigorous methods described in the scienti"c literature. These include multiple regression (e.g. Wol! and Lioy, 1978 or Ryan, 1995), generalized additive models (Niu, 1996), arti"cial neural networks (Comrie, 1997), and classi"cation and regression tree analysis (Ryan, 1995). The more accurate of these models can predict the day-to-day peak ozone concentrations reasonably well. Typical explained variances (square of correlation coe$cient) are in the range 60}80%, and typical mean absolute errors are in the range 10}20 ppb. A common problem with these models, however, is that on the very high ozone days, the errors tend to be much larger, and the O concentrations are systematically underpredicted. Of course, it is these very events that are most important from a health perspective. In order to maximize the e!ectiveness of OAD programs, it is essential to develop ozone forecast models with greatly improved accuracy at the upper end of the seasonal ozone distribution. Until more accurate ozone forecast models can be developed, an interim approach is to supplement or modify the objective model predictions with subjective forecasts prepared by `expertsa. The subjective forecasts are based on analysis of additional meteorological information that may be di$cult to include in parametric models, such as satellite imagery, synoptic #ow patterns, frontal analysis, or air mass trajectory analysis. Ryan and Luebehusen (1996) have reported success with this approach in the Baltimore area. Ozone forecasts made using a regression model supplemented by expert analysis were signi"cantly more accurate in predicting 1-h daily maximum ozone concentrations in excess of 120 ppb than the regression model alone. Air mass trajectory analysis has been a useful tool for identifying `source regionsa for various air pollutants. Ashbaugh et al. (1985) used backward air trajectories, combined with a statistical technique called residence time analysis, to identify source regions for transport of particulate sulfur to the Grand Canyon National Park. Employing a similar technique, Poirot and Wishinski (1986) identi"ed the Lower Great Lakes and Ohio River Valley regions as signi"cant contributors to high sulfate concentration and poor visibility in Northern Vermont. Comrie (1994) used backward, three-layer branching trajectories, combined with residence time analysis, to identify the transport corridors and source regions for high ozone in a forested area of western Pennsylvania. The source regions identi"ed included the lower Ohio and middle Mississippi valleys, and a secondary source re-
gion near the southern tip of Lake Michigan. Moy et al. (1994) have reported that certain backward trajectory patterns are associated with high O and NO concentra V tions at a rural site in Virginia. Using a trajectory clustering technique, they identi"ed distinct corridors for the transport of `cleana air and `dirtya air to the monitoring site. The authors hypothesized that the trajectory information could o!er predictive opportunities for air quality, but noted that confounding phenomena such as convective activity along the trajectory path tended to confuse matters. We have recently reported on the development of a multiple-linear regression model to forecast 1-h peak O concentration in Louisville, Kentucky (Hubbard and Cobourn, 1998). This model was composed of eight explanatory variables, viz. peak temperature, midday wind speed, atmospheric transmittance, cloud cover, minimum temperature, day of week, 24-h rainfall and number of nighttime calms. For the 1993}1996 model calibration period, the model predicted the 1-h peak O concentra tions to within$15.0 ppb 80% of the time, and the mean absolute error was 9.4 ppb. However, for the `extreme daysa, in which the peak [O ] was greater than 120 ppb, none of the predictions were within$15 ppb, and the mean absolute error for those days was 31.2 ppb. We reported also on the development of an adjunct model called the `Hi-loa model, to be used for the probable high ozone days, for which the mean absolute error of the extreme ([O ]'120 ppb) days was somewhat lower, at 19.4 ppb. In this paper we report on enhancements to the previously reported models. The enhancements are based on a reformulation of the nonlinear regression equation and the parameterization of air mass trajectory information for inclusion in the models. These enhancements have signi"cantly reduced the high-end prediction errors, and have made possible the forecasting of high ozone days (days near to or above the NAAQS) with a much greater degree of reliability and accuracy than was previously possible. We have concentrated on the problem of forecasting the daily maximum 1-h average ozone concentration for two reasons. First, the Louisville Air Quality Control Region is still subject to the 1-h, 120 ppb standard as of this writing. Second, the enhancements reported herein may be applied to other statistical models used in other communities; the degree of improvement made possible by these developments is best described by comparing to the previously reported models for Louisville.
2. Air quality and meteorological data 2.1. Ozone data Data from the seven ozone monitors that comprise the Louisville Air Quality Control Region were used for this
W.G. Cobourn, M.C. Hubbard / Atmospheric Environment 33 (1999) 4663}4674
study. The monitors are located in a mix of urban, suburban, and rural sites in a region approximately 2000 km in area spanning both sides of the Ohio River. Daily 1-h average O concentrations for each monitor were obtained from the US Environmental Protection Agency (US EPA) Aerometric Information and Retrieval System (AIRS) database, for the months of May}September, over the "ve year period 1993}1997. Since the local pollution control authorities issue a single forecast for the entire Air Quality Control region, the appropriate ozone concentration for "tting the forecast models is the highest value in the region, i.e., the `domain peaka [O ]. If on a particular day the daily peak 1-h concentration could not be reliably determined for two or more monitors, the whole day was excluded from the database. No days were dropped on account of rain, but two days judged to be statistical outliers were excluded. The domain peak [O ] is approximately log-normally distrib uted with a geometric standard deviation of 1.38 and a geometric mean of 71.2 ppb. Relative to the arithmetic mean (76.4 ppb), the NAAQS threshold (120 ppb) is 1.86 standard deviations away.
4665
seasonal cycle. The derivation and application of atmospheric transmittance to ozone forecasting is described in detail in Hubbard and Cobourn (1998). The third class of predictors consisted of two derived meteorological products: the location of upper air mass trajectories into Louisville, KY, and the Ultraviolet (UV) Index (US EPA, 1994). The upper air trajectories are described in detail in the next section. The UV index, issued daily by the National Weather Service, is a forecast of the radiation intensity at Earth's surface for local solar noon for 58 US cities including Louisville. To derive the UV index, stratospheric ozone data from satellite observations, atmospheric pressure and temperature forecasts and expected cloudiness are analyzed and scaled to produce an index between the range 0}15. According to the US EPA (1994), about 90% of the forecasts are accurate to within$two units as veri"ed by ground based sensors. The accuracy of the forecast is very dependent on the accuracy of the cloudiness forecast.
3. Model development and formulation 2.2. Meteorological data and other predictor variables 3.1. Development of the trajectory parameter Three classes of predictor variables were used in this study: (1) surface weather observations, (2) deterministic parameters, and (3) derived meteorological products. The "rst class consisted of surface weather observations made by the National Weather Surface at the Standiford Field site in Louisville, KY. These included hourly observations of: temperature, cloud cover, dew point, relative humidity, and wind speed, as well as daily peak temperature, daily minimum temperature, and daily precipitation. The temperature extrema represented instantaneous values, not extremes of the hourly data. In addition, using the hourly wind speed data, the number of hourly periods for which speed fell below 3 m.p.h. during the nighttime period 12 midnight to 4 a.m. was used to de"ne a new variable } the number of overnight calms. Because the hourly meteorological observations were recorded over a brief interval lasting for just a few minutes, they may not have been representative of conditions lasting over the full time interval between successive observations. Therefore, some of the variables were averaged over several hours to reduce the e!ect of random #uctuations present in the data. The averaging intervals were as follows: temperature, dew point, and relative humidity (9 a.m. to 1 p.m.), cloud cover (9 a.m. to 2 p.m.), and wind speed (9 a.m. to 3 p.m.). The second class of predictors consisted of deterministic parameters that are typically found useful in ozone forecasting: (1) day of the week, (2) solar zenith angle, (3) theoretical atmospheric transmittance and (4) length of day. The last three predictors are a function of the time of year and serve as a proxy for the ultraviolet #ux and
Air mass trajectory analysis has for some time been used to help identify the direction and location of sources of pollutants or pollutant precursors, which are transported over long distances. In addition, it has been well documented that ozone and ozone precursors, particularly NO , can be transported over distances of several V hundred kilometers or more (e.g. Wol! and Lioy, 1977; US EPA, 1997; Ryan et al., 1998). For purposes of improving the ozone forecast model, we sought to identify transport patterns which were associated with high ozone. The original purpose was to use the information to develop subjective guidelines for modifying the regression model predictions. However, a relatively straightforward technique was developed for interpreting the information in terms of a single parameter for inclusion in the regression equation. There are two practical requirements for using any type of meteorological information in an ozone forecast model: (1) archives of the information must be available for studying relationships to daily peak ozone concentration, and (2) reasonably accurate meteorological forecasts of the information can be obtained routinely and quickly, at least 24 h in advance of the forecasted event. The ability to use air mass trajectory data in the ozone forecasts was made possible by a recently developed website which provides access to the Hybrid SingleParticle Lagrangian Integrated Trajectory (HYSPLIT) model for calculating forward or backward trajectories at various levels for continental US locations (NOAA, 1998). The NOAA Transport and Dispersion
4666
W.G. Cobourn, M.C. Hubbard / Atmospheric Environment 33 (1999) 4663}4674
(HYSPLIT) website provides both trajectory forecasts based on numerical weather model output, and past trajectories (analysis trajectories) based on archived model output. The HYSPLIT model (version 4) uses gridded wind "eld data for computing the trajectories (Draxler, 1991). In the case of the forecast trajectories, the wind "elds are model forecasts. In the case of the past trajectories, the gridded wind "elds utilized are analyzed wind "elds, i.e. they are based on model interpolations of rawindsonde data obtained at 12-h intervals. Both the NGM and EDAS output data "les were used for calculating trajectories. The NGM data was available for trajectories prior to May 1997, and the EDAS data for trajectories after that time. Backward trajectory patterns were examined for several categories of daily 1-h peak ozone concentration [O ]. The calculated backward trajectories were all of 36-h duration, originating at 750 m height. The duration of the trajectories was based on the characteristic transport time for ozone and ozone precursors, estimated to be 1}2 days from the Ozone Transport and Assessment Group (OTAG) "nal report (US EPA, 1997). The trajectory height was chosen to be roughly half of the average summertime mixing height (e.g. Holzworth, 1967), so that the trajectories would be a mean representation of the transport, which varies in speed and direction throughout the mixed layer. Consideration was given to using trajectories at multiple heights, but rejected in favor of the simpler approach, in consideration of the time element involved in the ozone forecasting process. The trajectory patterns for the highest ozone category ([O ]*115 ppb) formed a distinct pattern. The average trajectory length was shorter than those of the lower ozone categories, and the trajectory paths tended to fall within distinct corridors, each roughly 300 km;600 km in size (Fig. 1). Three corridors were identi"ed: one extending west, and approximately centered along the Ohio River Valley, one extending east, also including the Ohio River Valley, and one extending to the northwest, encompassing the Indianapolis and Chicago metropolitan areas. These corridors together form an envelope (Fig. 2) which may be considered to be a `region of in#uencea for transport of ozone and ozone precursors into Louisville. The envelope was constructed so as to encompass nearly all (32/34) of the recorded high ozone trajectories, using published maps of large NO emission sources (e.g. US V EPA, 1998b) for additional guidance. For the next highest ozone category examined (100)[O ])114 ppb), 15 of 25 trajectories were within the envelope. Of those originating outside, three were from middle Tennessee, and "ve from below Tennessee. The trajectory patterns for the rest of the lower concentration categories were more dispersed directionally, and the average trajectory lengths were greater. Signi"cantly, on the days which were prone to high ozone concentration, but which did not reach concentrations above 99 ppb, the trajectories
Fig. 1. 36 h backward trajectories at 750 m elevation on highozone days ([O ]*115 ppb) during the period 1993}1997.
Fig. 2. Origins of the 36-h backward trajectories at 750 m elevation on high-ozone days during the period 1993}1997.
tended to come from directions other than along one of the three high ozone corridors, and nearly all (20/22) of the trajectory origin points (the open circles in Fig. 2) were outside of the high ozone envelope. The criteria used for determining the days prone to high ozone were the `sunny, sultry and stagnanta (3S) criteria described by Hubbard and Cobourn (1998), and brie#y summarized in Section 3.4 below.
W.G. Cobourn, M.C. Hubbard / Atmospheric Environment 33 (1999) 4663}4674
Various attempts were made to parameterize the trajectory information for inclusion into the regression model, including correlation with trajectory length, etc. Most of these were only marginally successful. The scheme that worked the best was quite simple. A trajectory classi"cation parameter was formed and added to the `Hi-loa (HL) regression model. For the 1993}1997 limited Hi-lo data set (162 days) the trajectory parameter was given a value of unity when the trajectory was fully contained within the high [O ] envelope. Otherwise, the trajectory parameter was given a value of zero. The inclusion of this parameter into the HL regression model resulted in a signi"cant improvement in the model accuracy (the mean absolute error (MAE) improved from 11.1 to 9.4 ppb), and the parameter itself was determined to have a high degree of statistical signi"cance. This exercise was repeated for the Standard model (580 days), and the results were also positive, with the MAE improving from 8.7 to 8.3 ppb. Signi"cantly, for the 116 `in-corridora days, the average error (bias) of the HL regression model improved from !6.3 to 0.4 ppb. In order for the trajectory parameter to make an e!ective contribution to forecasting ozone, it is essential that the forecast trajectories agree with the corresponding `pasta, or analysis trajectories, since it is the latter parameter that was used for calibrating the model. We studied this, using available trajectories from 1997 and 1998, and found that the distance between the origin points of forecast and hindcast 36 h backward trajectories was on average equal to 40% of the trajectory path length. In terms of just the trajectory parameter described above (i.e. equal to zero or unity), we found that the trajectory forecasts agreed with the analysis trajectories about 70% of the time for the same study period. 3.2. The Standard and Hi-lo regression equations During the 1997 forecasting season we used two regression models: the Standard model and the Hi-lo model. The Standard model was intended to be the default forecast tool for predicting ozone levels with equal accuracy on all days. To this objective the Standard model was "tted to all days in the database with equal weighting given to each day. Since regressionbased air quality forecast models have a tendency to underpredict high ozone days, we paid particular attention during model formulation to minimize this potential problem by appropriate selection and transformation of the predictor variables. Comparison of model hindcasts with observations showed that the Standard model was free of systematic bias. That is, the model did not show on average a tendency to overpredict or underpredict over the full range of expected ozone levels. Nevertheless, on the highest ozone days the Standard model tended to underpredict the daily peaks. To improve upon this limitation of the Standard model, the Hi-lo model was
4667
developed to give improved accuracy on the high ozone days. This was accomplished by removing roughly the middle 80% of the ozone distribution from the original regression database. This technique preserved the numerical range of the predictors and predictand, but reduced the in#uence of the middle days on the outcome of the regression coe$cients. The cuto! values of 50 and 105 ppb were determined so as to leave approximately equal numbers of days at each end of the ozone distribution, and to minimize the MAE for the Hi-lo regression. Compared to Standard model, the Hi-lo model had greater explained variance and lower average error on the high ozone days. The regression model building approach involved performing exploratory analysis using graphical techniques and nonparametric regression to discover the salient relationships between ozone and the predictor variables. Using these estimates provided guidance in formulating a parsimonious parametric model, which was then "t to the data by a combination of linear and nonlinear leastsquares regression. The overall approach was much the same as we previously reported, but the application of nonlinear regression has been substantially expanded in the latest model. The need for greater nonlinear treatment was based on the exploratory analysis by Bloom"eld et al. (1996), which convincingly showed the highly nonlinear relationship of ozone to humidity, wind speed, and temperature. To capture these nonlinear e!ects better, we employed a two step model building approach in which a "rst stage regression model was "tted using nonlinear regression to this same set of meteorological variables. This regression was then used as a new predictor in the second regression, which was "tted using ordinary least squares and the stepwise method (IMSL, 1992). All regression variables were considered in the stepwise procedure. The two-step approach conveniently combines the advantages of nonlinear and linear regression into one model, yielding superior results to using linear regression alone. The "nal form of the Standard model consists of an intercept, seven regression coe$cients, and seven explanatory variables: O "b #b O #b CC#b DOW#b LOD LJ #b NC#b RF#b TRAJ
(1)
The dependent variable, O , is the 1-h domain peak ozone. The "rst explanatory variable, O , is the predic tion from the nonlinear regression equation (Eq. (2)), which is a function of temperature, surface wind speed, and relative humidity. O "(h #(h #h TMAX#h TMAX) ;exp(h WS))exp(h RH)
(2)
4668
W.G. Cobourn, M.C. Hubbard / Atmospheric Environment 33 (1999) 4663}4674
Table 1 Predictors used in the regression equations Predictor
Units
Symbol
Timing
Air mass trajectory corridor Cloud cover Day of week Length of day Maximum temperature Minimum temperature Nighttime calms Nonlinear model prediction Rainfall Relative humidity Surface wind speed
* tenths * hours 3F 3F * ppb in % mph
TRAJ CC DOW LOD TMAX TMIN NC O RF RH WS
Upwind, previous 36 h 9 a.m.}2 p.m. (average) Saturdays Sunrise to sunset Daily 24-h peak Daily 24-h min. 12 midnight}4 a.m. * Daily 24-h total 9 a.m.}1 p.m. (average) 9 a.m.}3 p.m. (average)
Table 2 Regression coe$cients for the Standard model Coe$cient
Fitted value
S.E.
Constant O CC DOW LOD NC RF TRAJ
!43.7 0.800 !0.732 4.14 4.16 1.55 !2.29 11.3
7.55 0.0310 0.199 1.29 0.530 0.566 1.37 1.24
Table 3 Regression coe$cients for the nonlinear regression model t-statistic !5.79 25.83 !3.67 3.21 7.85 2.74 !1.67 9.06
The remaining explanatory variables } cloud cover, day of week, length of day, number of calms, rainfall, and trajectory corridor location, are summarized in Table 1. The "tted coe$cients for the Standard model are shown in Table 2 along with the standard errors and t-statistic. Except for the rainfall variable, the t-values exceeded 2.7 in absolute value, indicating that all the explanatory variables contributed signi"cantly to the regression. The term representing the nonlinear regression was by far the strongest contributor, with a t-value exceeding 25. The overall explained variance (R) was 0.790, the mean absolute error (MAE) was 8.45 ppb, and the root mean square error (RMSE) was 10.58 ppb. In order to optimize the nonlinear model, many variable transformations and groupings of terms were evaluated. The best results were obtained by using a second order transformation of temperature and by applying an exponential transformation to wind speed and relative humidity separately (Eq. (2)). The "tted coe$cients for the nonlinear parameters are shown in Table 3 along with the standard errors and t-statistic. The accuracy of the nonlinear regression model is quite good considering that only three explanatory variables are used. The explained variance, MAE, and RMSE are 0.724, 9.52 ppb, and 12.12 ppb, respectively. This level of performance is su$ciently good that the nonlinear model by itself could
Coe$cient h h h h h h
Fitted value 76.5 181 !9.26 0.0933 !0.115 !0.00654
S.E. 4.62 127 3.09 0.0193 0.0124 0.000703
t-statistic 16.59 1.42 !3.00 4.83 !9.32 !9.30
be used as the primary forecast model for those communities that lack access to the other predictor variables. The Hi-lo model was "t to the days with peak ozone concentrations below 50 ppb and above 105 ppb using the same methods and explanatory variables as for the Standard model, except that the variables for 24-h rainfall and the number of overnight calms were removed from the regression for lack of statistical signi"cance. Though the sample size was much smaller (162 vs. 580 days) the retained regression variables were statistically signi"cant. 3.3. Comparison with previous models To demonstrate the improvements realized in the latest generation of regression models resulting from the expanded nonlinear regression term and inclusion of the trajectory parameter, a stepwise comparison of models is presented in this section. The forecast model used during the 1997 forecasting season (`Standard Ia)
b #b TMAX#b TMAX#b TMAX #b exp(hWS)#b AT >" #b TMIN#b CC#b RF #b NC#b DOW
(3)
W.G. Cobourn, M.C. Hubbard / Atmospheric Environment 33 (1999) 4663}4674
4669
Table 4 Comparison of mean absolute error, root mean square error, and explained variance of various models for the period 1993}1997 (580 days) Forecast model
MAE (ppb)
RMSE (ppb)
R
Hybrid II Standard II Standard II (w/o traj.) Standard I (w/o traj.)
8.35 8.45 8.99 9.50
10.50 10.58 11.31 11.97
0.844 0.790 0.760 0.729
di!ers from the 1998 Standard model (`Standard IIa) in several ways: (1) The use of nonlinear regression was limited to the wind speed term only. (2) Maximum temperature was treated as a linear variable and was transformed up to the fourth order versus second order for Standard II. (3) The e!ects of humidity were captured via minimum temperature instead of relative humidity. (In e!ect, minimum temperature served as a proxy for the dewpoint, but actually gave better results than dewpoint with the Louisville data set.) (4) The proxy term for the seasonal cycle was theoretical atmospheric transmittance instead of length of day. (5) Standard I bene"ted from a square-root transformation of the dependent variable in order to stabilize variance. Standard II did not bene"t from any stabilization transformation probably owing to the greater use of nonlinear regression which better captured interactions between the variables. Finally, Standard I lacked the air mass trajectory term. When we re"t the Standard II model to exclude the trajectory term, but keeping everything else the same and using the same days to "t both models, the Standard II (w/o traj.) had 5.4% lower mean absolute error (8.99 vs. 9.50 ppb), and 4.3% greater explained variance (0.760 vs. 0.729), as compared to Standard I. The bulk of this improvement is attributed to incorporating temperature, wind speed, and relative humidity into a single nonlinear regression term. Inclusion of the air mass trajectory provided similar improvements. The mean absolute error decreased an additional 6.0% (8.45 vs. 8.99 ppb), and the explained variance increased an additional 3.9% (0.790 vs. 0.760). To summarize (see Table 4), the Standard II regression had 11.1% less mean absolute error and 8.4% greater explained variance than Standard I but used only one additional explanatory variable. 3.4. Hybrid model By using the 3S criteria to identify the probable high ozone days, a strategy can be employed to switch in an optimal way between the Standard model to the Hi-lo model to improve forecast performance beyond what could be achieved using either model alone. Combining
Fig. 3. Scatterplot of Hybrid II model hindcasts against observations for the period 1993}1997. The diagonal indicates the line of perfect correspondence between hindcasts and observations.
the Standard II and Hi-lo II models in this way gives rise to the Hybrid II model. The following 3S criteria were used to invoke the Hi-lo II model: (1) expected maximum temperature greater than 87 3F, (2) expected wind speed less than 6 mph, and (3) expected cloud cover less than 2.5 tenths. Relative to the Standard II model, this switching strategy reduced the mean absolute error by 1.2% (8.35 vs. 8.45 ppb) and increased the explained variance by 6.8% (0.844 vs. 0.790). Inspection of the scatterplot of ozone observations vs. Hybrid II hindcasts (Fig. 3) con"rms that the switching does not introduce systematic overprediction or underprediction error anywhere in the range: the scatter of points coalesces along a straight line having a slope of 1 and an intercept of 0.
4. Prediction of high ozone concentrations It stands to reason that on the few days of extremely high concentration during the ozone season, all of the factors which contribute to high ozone must be at or near to their extreme values for the season. Good regression models can account for a substantial portion of the daily ozone variation, but the models cannot account for the e!ects of factors not included in the model. Consequently, regression models (and other types of statistical models) will tend to underpredict on days when all ozone related factors are nearly maximal. The prediction errors tend to be substantially negative on these days, due to the strong e!ect of the unexplained factors. There are at least two possible means of improving the prediction errors on high ozone days. First, if some separate method (independent of the principal regression model) can be found to identify probable high ozone days, then an adjunct regression equation can be developed which is specially "tted for high ozone. This is the hybrid model approach.
4670
W.G. Cobourn, M.C. Hubbard / Atmospheric Environment 33 (1999) 4663}4674
Second, if an additional ozone-related parameter (which is su$ciently independent from the existing ones) can be found, then this parameter can be added to the regression in order to reduce the amount of unexplained variance. Adding the trajectory parameter follows this approach. This last development has been particularly useful, since there is a strong association between high ozone days, and the `in-corridora back trajectories. For example, during 1993}1996, on the days where [O ]*120 ppb, 93% were `in-corridora days. By contrast, of the days where [O ](60 ppb, only 10% were `in-corridora days. These developments have made it possible to formulate a hybrid regression model which is much more accurate in forecasting extreme ozone events than the previously reported regression model. In most cases, very little subjective adjustment to the model forecast is necessary. Subjective or `experta skill is still required, however, in evaluating all relevant meteorological information to make the best forecasts of the meteorological input parameters. With the current hybrid model, the high-end errors have been signi"cantly reduced, as compared to previous models (Fig. 4). The graph shows the successive improvements achieved by the updated regression equation (Standard II w/o traj.), the trajectory parameter (Standard II), and the hybrid technique (Hybrid II). Based on the retrospective predictions, or `hindcastsa of the 1993}1997 model calibration period, using archived meteorological data for the model input, the Hybrid II model predicted 63% of the [O ]*120 ppb days to within 15 ppb; and the MAE for those days was 12.1 ppb. In contrast, the Standard II model without the trajectory parameter predicted only 36% of the [O ]*120 ppb days to within 15 ppb, and the corre sponding MAE was 18.1 ppb. For the Standard I model,
Fig. 4. Distribution of ranked model hindcast errors on days during the 1993}1997 period when observed [O ]*120 ppb, by model.
Table 5 Comparison of mean absolute error, root mean square error, and explained variance of various models for the 3S-days (77 days) during the 1993}1997 period Forecast model
MAE (ppb)
RMSE (ppb)
R
Hybrid II Standard II Standard II (w/o traj.) Standard I (w/o traj.)
9.07 9.84 11.67 12.25
11.30 11.85 14.15 14.99
0.633 0.458 0.362 0.318
re"tted to the 1993}1997 data, these statistics were respectively 21% and 21.7 ppb. For the 77 days during the 1993}1997 period which were meteorologically prone to high ozone, according to the `3Sa criteria, the Hybrid II model also performed signi"cantly better than the other models discussed above (Table 5). The capability of predicting concentrations exceeding the air quality standard, or other relevant threshold levels, for purposes of issuing health advisories, etc., is a critical aspect of OAD programs. The `detection ratea (DR), de"ned as the fraction of observed threshold exceedances predicted by the model, is a useful indicator of the model performance in predicting high threshold values, such as the 1-h, 120 ppb standard still in e!ect in the Louisville MSA as of this writing. The DR is also referred to as the `probability of detectiona (e.g. Ryan, 1995). In simple terms, an event is detected if both the model and observed values exceed the threshold. This could be called the `zero tolerancea DR, because observations just over the threshold (e.g. by 1 ppb) would be regarded as undetected, or `misseda, even if the prediction was very close, but just under the threshold. It is possible to assign a `tolerancea to the DR computation, wherein exceedances would be regarded as detected as long as the predictions exceeded an `alarm levela de"ned as the threshold minus a certain tolerance (e.g. 15 ppb). This scheme is rational in that it takes into account the model imperfections and tendency to underpredict the extreme values. Another useful indicator of model performance is the false alarm rate (FAR), nominally de"ned as the fraction of predicted threshold exceedances in which the observed values are below the threshold. It is appropriate to use the same tolerance discussed above for computation of the FAR, so that predicted exceedances would not be counted as false alarms as long as the observed concentration exceeded the alarm level. Using the 1-h, 120 ppb NAAQS as the currently applicable threshold for Louisville, the theoretical DR and FAR values were computed for the 1993}1997 model calibration period. Both indicators were good to excellent for the Hybrid II model, depending on the chosen
W.G. Cobourn, M.C. Hubbard / Atmospheric Environment 33 (1999) 4663}4674 Table 6 Theoretical detection rates and false alarm rates for the Hybrid II and Standard II (w/o traj.) models, based on a 120 ppb threshold, for various tolerances during the 1993}1997 period Tolerance (ppb)
Hybrid II DR (%)
FAR (%)
Std. II (w/o traj.) DR (%) FAR (%)
0 5 10 15
53 64 75 96
35 31 29 26
18 39 57 68
50 48 33 30
tolerance (Table 6). For example, at a tolerance of 15 ppb, corresponding roughly to the 85th percentile of the prediction errors for the period, the DR was 96%, and the FAR was 26%. The DR and FAR indicators for the Hybrid II model were signi"cantly better than those of the Standard II model without the trajectory parameter (Table 6), at several tolerance levels. These statistics are based on hindcasts made for the 1993}1997 calibration period, and so are theoretical values. In the forecast mode, the values would be expected to di!er somewhat (lower for the DR and higher for the FAR) for two reasons. First, because the model input is meteorological forecast data, and there are errors in these forecasts. Second, because in the forecast mode the model is being tested against new data, and the e!ect of the unexplained factors contributing to ozone may vary from year to year. The tolerance that leads to an acceptably high DR and acceptably low FAR can be used to establish the alarm level for issuing OADs in the community. Thus, if the ozone forecast would equal or exceed the threshold minus the tolerance value, the forecaster would recommend that an ozone action day be called. During the 1998 ozone forecast season in Louisville, a 15 ppb tolerance was chosen, so OADs were recommended on days when the model forecasts exceeded 105 ppb.
5. Model validation 5.1. Daily predictions for the 1998 ozone season The Hybrid II model was deployed by the Air Pollution Control District of Je!erson County (APCDJC) in cooperation with the Indiana Department of Environmental Management (IDEM), during the period 14 May to 16 September 1998. The o$cial forecasts for each day were prepared by IDEM meteorologists. Since these forecasts were geared primarily towards the identi"cation of the days appropriate for designation as ozone action days, here they are referred to as the `ozone actiona, or OA forecasts. Normally, these were prepared each
4671
Monday, Wednesday and Friday. Some OA forecasts were updated on Tuesdays and Thursdays if high ozone was expected. Thus, less than half of the OA forecasts were next day forecasts, a shortcoming which was due to sta$ng limitations. A second set of forecasts, called `raw forecastsa, was prepared at the university. The raw ozone forecasts were all next day forecasts, and were generated by directly entering the unadjusted meteorological forecast data obtained from the NGM numerical weather prediction model (the NGM MOS output is available twice daily from several internet sites) and the trajectory parameter as determined by the HYSPLIT trajectory model. In addition, the hindcasts from the Hybrid II model were prepared and archived once the recorded meteorological data became available from the National Weather Service (NWS). The observed daily 1-h maxima of ground level ozone were provided by the APCD. The raw forecasts tracked the day-to-day ozone variation reasonably well (Fig. 5). Some of the high peaks (e.g. [O ]'120 ppb) were seriously missed by the raw forecasts, however. The o$cial forecasts were more accurate on these days, particularly when next day forecasts were made. The hindcasts of the high peaks were also more accurate than those of the raw forecast. One problem with forecasting the high peaks is that in this realm, the model predictions are much more sensitive to small changes in certain meteorological parameters such as temperature and wind speed. Therefore, careful review of all pertinent meteorological information, taking into account expected errors in the raw forecast information, is crucial to the process. Such expert skill requires experience with both meteorological and ozone forecasting, and can be very e!ective in improving the next-day forecasts when relatively high ozone concentrations are expected. The overall statistical comparison of the 1998 hindcasts, raw forecasts, and OA forecasts is presented in Table 7. The hindcast MAE was 11.0 ppb, which is about 2.7 ppb higher than the value from the hindcasts of the 1993}1997 model calibration period. This deviation was higher than expected, since experience with using the original model on the 1997 ozone forecasts led us to expect only about 1 ppb added to the hindcast MAE. The hindcast average error, or bias was !6.8 ppb, whereas from the 1997 results we had anticipated a near zero bias. However, the original model exhibited similar increases in the MAE and bias in calculating a set of hindcasts for 1998. Part of the explanation for these anomalies is that some is due to normal year-to-year statistical variation. For example, the annual MAE varied from 7.0 to 9.1, and the annual bias varied from !1.3 to 1.0 for the hindcasts of the 1993}1997 model calibration period. It is possible also that uncharacteristic di!erences in meteorology and ozone climatology between 1998 and the 1993}1997 calibration period played an important role. We note that for Louisville, May and September of 1998 were much
4672
W.G. Cobourn, M.C. Hubbard / Atmospheric Environment 33 (1999) 4663}4674
Fig. 5. Time series of observed and predicted daily maximum 1-h ozone concentration for the raw forecasts during the 1998 ozone season. Table 7 Error statistics for the 1998 daily 1-h maximum ozone hindcasts, raw ozone forecasts, Ozone Action forecasts, and next day Ozone Action forecasts Index
Hindcast
Raw forecast
OA forecast
1-d OA forecast
90th pct. (ppb) 10th pct. (ppb) Average (ppb) MAE (ppb)
7.8 !22.6 !6.8 11.0
21.8 !20.3 0.9 13.4
27.0 !18.8 4.1 15.4
31.1 !22.1 3.4 15.2
warmer and drier than usual, and July was cooler and rainier, as compared to the model calibration period. The MAE of the raw forecast, at 13.4 ppb, was slightly higher than that of the hindcast. The bias, at 0.9 ppb, was actually a signi"cant improvement. This `boosta may have come from the fact that for our area, the NGM model tends to over-predict peak temperatures during the summer months. The MAE of the OA forecasts was greater than that of the raw forecasts by about 2 ppb, and the bias was higher by about 3 ppb. This was mostly due to the fact that the forecasters' "rst priority was to identify and recommend OADs, rather than to minimize the overall forecast error. Therefore, uncertainties in the various forecasted meteorological parameters were typically hedged in the direction of higher predicted ozone, in order to avoid a `missed exceedance.a The error statistics for the next day OA forecasts (2-d and 3-d forecasts removed) were only slightly improved. Probably, the aforementioned `miss avoidancea technique contributed strongly to the overall error statistics of the next day OA forecasts as well. Nevertheless, the technique was successful when measured against its objective, since OADs were called on all days exceeding the 120 ppb threshold for which next day forecasts were available.
Table 8 Detection rates and false alarm rates for the 1998 daily 1-h maximum ozone hindcasts, raw forecasts, ozone action forecasts and next day ozone action forecasts Index
Hindcast
Raw forecast
OA forecast
1-d OA forecast
DR FAR
1.00 0.20
0.40 0.50
0.60 0.47
1.00 0.33
5.2. Predictions of high ozone in 1998 The detection rate (DR) for the 1998 hindcasts, based upon the nominal threshold value of 120 ppb, and a 15 ppb tolerance, was 5/5, or 100% (Table 8). The false alarm rate (FA) was 2/10, or 20%. These values are close to the expected values of 96% and 26% from the calibration hindcasts. For the raw forecasts, the corresponding DR was 2/5, or 40%, and the FAR was 6/12, or 50%. The disparity in these "gures is due to the critical relationship between ozone and meteorological factors on the days when conditions conducive to high ozone exist.
W.G. Cobourn, M.C. Hubbard / Atmospheric Environment 33 (1999) 4663}4674
This makes the meteorological forecast errors more important on these critical days. For logistical reasons associated with disseminating an ozone action alert for the next day, the forecasts must be issued by 11 a.m. This time is usually just prior to the release of the next updated numerical weather model output, which is the 1200 UTC (8 a.m. EDT) model initialization run. As a result, the available output is the 0000 UTC run, which is about 15 h old at the time of the forecast. Since ozone normally peaks near 3 p.m. EDT, the next day forecasts are actually for about 42 h ahead. A skilled ozone forecaster can take the critical ozone relationships and forecast uncertainty into account, as well as departures between the raw model forecasts and other expert forecasts, e.g. from NWS. Other information, such as the location and strength of high pressure systems may also be taken into account. Thus, the DR for the next day OA forecasts was 3/3, or 100%, and the FAR was 4/12 or 33%. These indicators compare well to those of the calibration hindcasts. When the 2-d and 3-d forecasts are added in, the results are less impressive: 3/5 (60%) for the DR and 8/17 (47%) for the FAR. Of the two missed forecasts, one was for two days ahead, and the other was for three days ahead. Both occurred in middle September, near the end of the ozone forecast season. It appears doubtful that forecasts of high ozone events beyond a next day outlook can be made with any degree of reliability. 5.3. Application to forecasts of 8-h daily maximum ozone An optimized version of an 8-h ozone forecast model would involve re-examining all possible meteorological input parameters, using various averaging periods for the data, and combining these parameters into several di!erent formulations with the aim of minimizing the model error. As a prelude to this major undertaking, a rudimentary 8-h forecast model was developed by using a correlation equation to convert the 1-h forecasted ozone values to equivalent 8-h forecasts. The correlation equation was a second order polynomial, "tted to the 1-h and 8-h ozone daily maxima from the 1993}1997 calibration period. The purpose was to test the feasibility of using the same or similar model formulation for an 8-h forecast model, and to arrive at an upper bound estimate of the model MAE. The resulting preliminary 8-h model predicted the 8-h ozone daily maxima reasonably well. The time series of predicted and observed values appeared much the same as those in Fig. 5, when rescaled appropriately. As for the overall forecast statistics, the MAEs for the converted hindcasts, raw forecasts, and OA forecasts were respectively 10.1, 10.8 and 11.9 ppb. The DR values, based on the nominal threshold of 80 ppb, and using a 15 ppb tolerance were respectively 84, 92 and 89%. Of course, there were a far greater number of threshold exceedances for the 8-h, 80 ppb threshold: 38 as compared to 5 for the 1-h, 120 ppb threshold. The corresponding FAR values
4673
were 22, 24 and 29%. These results seem rather good for such a crude estimation procedure. Thus, it seems likely that it is feasible to develop an 8-h forecast model which is at least as accurate as the 1-h forecast model.
6. Conclusions The Hybrid II model was signi"cantly more accurate than the original Standard I model for predicting 1-h maximum daily ozone for the Louisville area during the 1993}1997 model calibration period. The improvement was even more dramatic at the higher end of the ozone distribution. The improved accuracy is attributed to a more advanced regression equation, use of the hybrid concept, and the addition of the trajectory parameter. The Hybrid II model was "eld tested during the 1998 ozone season, and it performed reasonably well. When the model was used by ozone forecasters during the 1998 ozone season to predict the days that were expected to approach or exceed the current 1-h ozone standard, the model performed well when the forecasts were next day forecasts. When the forecasts were made two or three days out, the model did poorly, due to the added uncertainty in the meteorological forecasts. It appears feasible to use the same or similar formulation for an 8-h ozone forecast model. The accuracy would be expected to be about the same, or slightly better than for the 1-h model described herein. Acknowledgements This work was partially supported by the Air Pollution Control District of Je!erson County (APCDJC). The authors wish to thank Art Williams, Arthur Chang, Cynthia Lee, and other APCDJC o$cials for their encouragement and cooperation. LGE Energy Corporation and the University of Louisville Research Foundation also provided support. We wish also to thank meteorologists Steve Sherman, John Welch, Mark Derf, and Mark Neyman of the Indiana Department of Environmental Management for their cooperation and technical assistance throughout the model development and evaluation. Graduate students Becky Hall, Richard Decker and Chutikoon Gajaseni faithfully performed many data collection and analysis chores required for this project. References Ashbaugh, L.L., Malm, W.C., Sadeh, W.Z., 1985. A residence time probability analysis of sulfur concentrations at Grand Canyon National Park. Atmospheric Environment 19, 1263}1270. Bloom"eld, P., Royle, J.A., Steinburg, L.J., Yang, Q., 1996. Accounting for meteorological e!ects in measuring urban
4674
W.G. Cobourn, M.C. Hubbard / Atmospheric Environment 33 (1999) 4663}4674
ozone levels and trends. Atmospheric Environment 30, 3067}3077. Comrie, A.C., 1994. Tracking ozone: air-mass trajectories and pollutant source regions in#uencing ozone in Pennsylvania forests. Annals of the Association of American Geographers 84, 635}651. Comrie, A.C., 1997. Comparing neural networks and regression models for ozone forecasting. Journal of the Air and Waste Management Association 47, 653}663. Draxler, R.A., 1991. The accuracy of trajectories during ANATEX calculated using dynamic model analysis versus rawindsonde observations. Journal of Applied Meteorology 30, 1446}1467. Holzworth, G.C., 1967. Mixing depths, wind speeds and air pollution potential for selected locations in the United States. Journal of the American Meteorological Association, 1039}1044. Hubbard, M.C., Cobourn, W.G., 1998. Development of a regression model to forecast ground-level ozone concentration in Louisville, KY. Atmospheric Environment 32, 2637}2647. IMSL, 1992. FORTRAN Subroutines for Statistical Analysis. User's manual-version 3.0. IMSL, Houston, TX, USA. Moy, L.A., Dickerson, R.R., Ryan, W.F., 1994. Relationship between back trajectories and tropospheric trace gas concentrations in rural Virginia. Atmospheric Environment 17, 2789}2800. Niu, W., 1996. Nonlinear additive models for environmental time series, with applications to ground-level ozone data analysis. Journal of the American Statistical Association 91, 1310}1321. NOAA, 1998. HYSPLIT4 (Hybrid Single-Particle Lagrangian Integrated Trajectory) Model. Internet URL: http:// www.arl.noaa.gov/ready/hysplit4.html, NOAA Air Re-sources Laboratory, Silver Spring, MD, USA. Poirot, R.L., Wishinski, P.R., 1986. Visibility, sulfate and air-mass history associated with summertime aerosol in northern Vermont. Atmospheric Environment 20, 1457}1469.
Ryan, W.F., 1995. Forecasting severe ozone episodes in the Baltimore metropolitan area. Atmospheric Environment 29, 2387}2398. Ryan, W.F., Luebehusen, E., 1996. Accuracy of ozone air quality forecasts in the Baltimore metropolitan area. 96-TA45.03, Presentation at 89th Annual Meeting and Exhibition, Nashville, Tennessee, of the Air and Waste Management Association, 23}28 June 1996. Ryan, W.F., Doddridge, R.R., Dickerson, R.R., Morales, R.M., Hallock, A.H., Roberts, P.T., Blumenthal, D.L., Anderson, J.A., 1998. Pollutant transport during a regional O episode in the mid-Atlantic states. Journal of the Air and Waste Management Association 48, 786}797. US EPA, 1994. The experimental UV index factsheet: explaining the index to the public. EPA 430-F-94-019, US. Environmental Protection Agency}National Oceanic and Atmospheric Administration. US EPA, 1996. Review of National Ambient Air Quality Standards for Ozone: Assessment of Scienti"c and Technical Information. EPA 452/R-96-007, U.S. Environmental Protection Agency, Research Triangle Park, NC, USA. US EPA, 1997. OTAG technical supporting document. Internet URL: http://www.epa.gov/ttn/otag/"nalrpt/. US EPA, 1998a. National Air Quality and Emissions Trends Report, 1996. EPA 454/R-97-013, U.S. Environmental Protection Agency, Research Triangle Park, NC, USA. US EPA, 1998b. NO emissions coal-"red plants continental V U.S., Internet URL:http://www.epa.gov/acidrain/emission/ index.htm. Wol!, G.T., Lioy, P.J., 1977. An investigation of longrange transport of ozone across the Midwestern and Eastern United States. Atmospheric Environment 11, 797}802. Wol!, G.T., Lioy, P.J., 1978. An empirical model for forecasting maximum daily ozone levels in the Northeastern U.S. Journal of the Air Pollution Control Association 28, 1035}1038.