Climatological variability in modeled maximum concentrations


Atmospheric Environment Vol. 19, No. 5, pp. 739-742, 1985. Printed in Great Britain.

0004-6981/85 $3.00 + 0.00 Pergamon Press Ltd

CLIMATOLOGICAL VARIABILITY IN MODELED MAXIMUM CONCENTRATIONS

WILLIAM B. PETERSEN* and JOHN S. IRWIN*
Meteorology and Assessment Division, Environmental Sciences Research Laboratory, Research Triangle Park, NC 27711, U.S.A.

(First received 13 July 1984 and in final form 1 October 1984)

Abstract-A 17-year data base of estimated high and high second-high concentrations is analyzed for year-to-year variability. This paper addresses the uncertainty in maximum concentration estimates as a function of period of record. Resampling techniques are used to estimate the ratios of the annual high and high second-high concentrations to the high and high second-high values from the 17-year record. The analysis suggests that if one year of data were used to estimate the high second-high concentration, the climatological highest high second-high would be 15-18% higher, depending on source type. The major intent of this paper is to stimulate interest and research, and to suggest that if fewer than five years of meteorological data are collected, then some sort of correction may need to be applied to the high second-high concentration estimate.

Key word index: Climatological variability, maximum concentrations, bootstrap.

INTRODUCTION

Recently an E.P.A. study compared various periods from a 17-year meteorological data set to determine the minimum number of years of data needed to approximate the highest modeled concentration estimates at one station (Burton et al., 1983). That study indicated that the variability of model estimates due to the meteorological input data was adequately reduced if a 5-year period of record was used. These findings are consistent with the Environmental Protection Agency’s recommendation that 5 years of meteorological data are required to characterize the climatological variability of maximum concentrations (U.S. EPA, 1978). However, for many applications 5 years of data may not be available or may be too costly to obtain. Assuming that 5 years of meteorological data are required to ensure that maximum concentrations have been estimated, and that one year of data has been measured at random, then there is an 80% likelihood that, if another 4 years of data were gathered, a higher maximum concentration would be estimated, since the maximum is equally likely to fall in any of the five years.

Meteorological data bases less than 5 years in length (usually 1 year or less) are generally measured on-site. On-site meteorological measurements provide representative input data to the air quality models. However, no matter how representative a 1-year data base is, it cannot provide any information on the likely year-to-year variability in maximum concentrations. The uncertainty in maximum concentration estimates as a function of period of record is an interesting environmental issue. In this work we ask ourselves the question, “If less than five years of data are collected, how uncertain are we in our estimates of maximum concentrations?” The major purpose of this paper is to stimulate interest in this area.

In this paper we reanalyze the 17-year data base mentioned above. In the earlier analysis the meteorological parameters were analyzed for their variability. In this analysis we look at the high and the high second-high (HSH) concentrations for two typical sources for the 17 years as modeled by the CRSTER air quality model. Resampling techniques are used to assign probabilities of choosing the highest second-high concentration from the 17-year record, assuming that 1-5 years of meteorological data were measured. It is not the intent of this paper to suggest that the exact probabilities estimated from this data base are universal to every city or site location. This is not likely. The important issue is to indicate the shortcomings in accepting as little as one year of on-site data for regulatory applications and to suggest a method that will better ensure that maximum concentrations are properly estimated. Of course, acquiring periods of record longer than one year is better than any approach to correcting the high second-high for climatological variability.

* On assignment from the National Oceanic and Atmospheric Administration, Dept. of Commerce.

ANALYSIS

Burton et al. (1983) used 17 years of National Weather Service (NWS) meteorological data from Philadelphia, PA to investigate the temporal representativeness of short-term meteorological data sets. In that analysis the 17 years of meteorological data (1965-1981) were used as input to the CRSTER (U.S. EPA, 1977) air quality model. Concentration estimates were made for 180 receptors for two typical point sources, a 400-MW unscrubbed coal-burning power plant and a 1000-MW power plant. In their analysis the usual assumptions of flat terrain and homogeneous flow are applicable to model estimates.

The 17 years of CRSTER output for the high and the HSH concentrations for the two sources for averaging times of 1, 3 and 24 h were analyzed. Because of the limited number of receptors used in the CRSTER runs, we suspected that the maxima for short averaging times might not truly reflect the highest ground-level concentrations. As we pursued the analysis, inconsistencies in the results indicated that this was so. Therefore, only the 24-h averages, which should be less sensitive to the exact receptor locations, were used in this study.

Table 1 lists the high and HSH 24-h average concentrations for the two source types. The highest estimated concentration for the 1000-MW plant was 93.1 µg m⁻³, occurring with the 1973 data. The lowest maximum estimated concentration for the large point source was 63.58 µg m⁻³, occurring with the 1965 meteorological data. Since the emissions used in the model runs remained the same, only differences in meteorology can account for the range of nearly 30 µg m⁻³. HSH concentrations for the larger source had a range of 26.6 µg m⁻³. The small point source (400-MW plant) showed a range in the high and HSH concentrations of 8.1 and 6.6 µg m⁻³, respectively, over the 17-year period. The ratio of the minimum HSH to the maximum HSH was 0.68 for the large source and 0.71 for the smaller source.

From Table 1, it is evident that the estimates of maximum concentration for this 17-year meteorological record exhibit considerable variability. If one analyzed the meteorological data set in pairs, i.e. if the maximum HSH from two consecutive years of data were compared with the highest HSH from all 17 years, the ratio of the minimum HSH from the pairs to the 17-year highest HSH would be closer to unity. The following analysis pursues this idea of selection to investigate the climatological variability of the HSH concentrations.
From the 17 years of HSH concentrations, 1-5 consecutive years were selected and the maximum HSH of the selected years was used in forming the ratio with the maximum HSH from the 17-year record. In the selection process for 1 year, 17 ratios were computed. For two consecutive years 16 ratios were computed by selecting the maximum concentration from the sixteen couples. For example, in Table 1, the first 3 HSH values that would be chosen for the smaller source are 22.26, 22.26 and 19.37 from couples (1, 2), (2, 3) and (3, 4), respectively. This process was repeated for 1-5 years. From the ratios formed in this process a worst case and a typical value are recorded in Table 2. Worst case is defined as the lowest ratio formed from the set of 17-13 ratios for 1-5 consecutive years, respectively. The typical value is the median of the distribution of ratios.

For the smaller source with one year of meteorological data, worst case assumptions would suggest that the HSH is underestimated by 30%. Typically, however, the HSH may be underestimated by 15%. Suppose that one would like to be within 10% of the 17-year HSH. Under worst case assumptions, even five years of data bring concentrations only to within 14%, outside our allowable range of 10%. On the other hand, median values would suggest that a little over two years of data would usually be sufficient to be within 10% of the 17-year HSH. The larger source shows similar results.

Table 1. High and high second-high 24-h concentrations (µg m⁻³) output from CRSTER model runs for a 17-year period of record

        Small source                  Large source
Yr.     High     High second-high     High     High second-high
65      19.12    17.90                63.58    56.94
66      23.16    22.26                76.20    74.44
67      19.92    19.37                68.60    65.51
68      20.56    17.06                75.06    65.37
69      19.69    15.73                69.09    56.37
70      21.72    18.85                80.30    67.15
71      22.85    19.21                74.39    69.01
72      20.88    16.64                77.43    71.06
73      22.26    21.52                93.10    83.01
74      22.44    17.81                75.42    59.66
75      22.59    22.32                79.50    77.45
76      27.21    22.20                87.35    79.43
77      20.51    20.34                75.28    65.76
78      21.44    16.89                82.82    60.95
79      19.87    19.36                85.06    76.14
80      24.89    18.81                87.38    68.76
81      22.90    18.40                84.88    68.56

Table 2. Ratios formed from 17-year data base for 1-5 years of high second-high concentrations

         Small source             Large source
N (y)    Worst case    Median     Worst case    Median
1        0.705         0.845      0.679         0.826
2        0.764         0.890      0.787         0.897
3        0.845         0.964      0.789         0.917
4        0.861         0.980      0.831         0.937
5        0.861         0.997      0.831         0.957
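The consecutive-year selection described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' original program: the function names `consecutive_ratios` and `worst_and_median` are hypothetical, and the median is taken with Python's `statistics.median`, which may differ slightly from whatever convention the authors used. The input is the small-source HSH column of Table 1.

```python
from statistics import median

# Small-source HSH values (µg m⁻³), 1965-1981, copied from Table 1.
SMALL_HSH = [17.90, 22.26, 19.37, 17.06, 15.73, 18.85, 19.21, 16.64, 21.52,
             17.81, 22.32, 22.20, 20.34, 16.89, 19.36, 18.81, 18.40]


def consecutive_ratios(hsh, n_years):
    """Ratio of the max HSH in each run of n_years consecutive years
    to the max HSH of the whole record."""
    record_max = max(hsh)
    n_windows = len(hsh) - n_years + 1  # 17, 16, ..., 13 for n_years = 1..5
    return [max(hsh[i:i + n_years]) / record_max for i in range(n_windows)]


def worst_and_median(hsh, n_years):
    """Worst case (lowest ratio) and typical value (median ratio)."""
    ratios = consecutive_ratios(hsh, n_years)
    return min(ratios), median(ratios)


for n in range(1, 6):
    worst, typical = worst_and_median(SMALL_HSH, n)
    print(n, round(worst, 3), round(typical, 3))
```

For the small source this sketch reproduces the worst-case column of Table 2, and its medians agree with the tabulated ones to within about 0.001.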

RESAMPLING TECHNIQUES

In the above analysis we were limited to at most 17 HSH samples. A relatively new statistical technique called “bootstrap” (Efron, 1982) is used to reanalyze the concentrations for the 17 years of CRSTER output. The “bootstrap” method allows one to estimate the distribution of a population parameter, e.g. mean, median, variance, or any percentile, based on a random sample. The method does not require any assumptions regarding the distribution of the underlying population. “Bootstrap” samples are formed by randomly drawing an item from the original sample, recording it and replacing it in the original sample. This process of selection with replacement is repeated until the “bootstrap” sample contains a large number of items. It is evident that an item in the original sample may be chosen many times or not at all to form the “bootstrap” sample.

In reanalyzing the 17 HSH concentration values, a concentration was selected, recorded and replaced in the original sample. This process was repeated 500 times. Our “bootstrap” sample now consists of 500 high second-highs. In a process analogous to the procedure described above, HSH concentrations were chosen at random for two years, the highest concentration was recorded, and both items were replaced. This process of selecting multiple years was repeated up to 10 years. Figure 1 is a plot of the geometric mean of the 500 ratios of the HSH to the maximum HSH for each of the ten samples of 500. The solid and dashed lines are for the small and large source, respectively. This plot suggests that if one year of data were used to estimate the HSH, the climatological highest HSH based on the “bootstrap” method would be 15-18% higher, depending on the source.


Fig. 1. Geometric mean of 500 ratios of the HSH to the maximum HSH for 24-h average concentrations.
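The resampling procedure behind Fig. 1 might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `bootstrap_ratios` and the random seed are hypothetical choices, the paper does not say how its random numbers were generated, and the years within each draw are sampled with replacement (an assumption consistent with the text's remark that the lowest year can be selected more than once). Results will therefore differ somewhat from the published values.

```python
import random
from statistics import geometric_mean

# Small-source HSH values (µg m⁻³), 1965-1981, copied from Table 1.
SMALL_HSH = [17.90, 22.26, 19.37, 17.06, 15.73, 18.85, 19.21, 16.64, 21.52,
             17.81, 22.32, 22.20, 20.34, 16.89, 19.36, 18.81, 18.40]


def bootstrap_ratios(hsh, n_years, n_draws=500, rng=None):
    """Draw n_years HSH values at random (with replacement), keep the
    highest, and form its ratio to the record maximum; repeat n_draws times."""
    if rng is None:
        rng = random.Random()
    record_max = max(hsh)
    # rng.choices samples with replacement, so a year may be drawn twice.
    return [max(rng.choices(hsh, k=n_years)) / record_max
            for _ in range(n_draws)]


rng = random.Random(1985)  # arbitrary fixed seed so the sketch is repeatable
for n in range(1, 11):
    ratios = bootstrap_ratios(SMALL_HSH, n, rng=rng)
    print(n, round(geometric_mean(ratios), 3))
```

The printed geometric means should rise toward unity as the number of years increases, which is the behavior Fig. 1 displays.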

Figure 1 was generated based on 17 years of data. If less than 17 years of data had been utilized in the analysis, what would be the effect on the geometric mean ratios? Intuitively one would expect that this type of analysis should be performed on as large a data base as possible. Shorter periods of record would provide more variability in the ratio of the one-year HSH to the HSH over the period of record. To investigate this variability, 5- and 10-year periods of record were selected from the 17 years of data. For the 5-year period of record the ratios of the HSH for each year to the HSH for the 5-year period were generated and the geometric mean of the five values computed. This was done for the thirteen 5-year periods possible in the 17-year data base. Similarly, the ratios of the HSH for one year to the HSH for the 10-year period of record were generated and the geometric mean of the 10 values computed. The means for the 5-year periods of record range from 0.824 to 0.930 for the smaller source. The mean ratios for the 10-year records range from 0.833 to 0.870. The range of the means is approximately three times as great for the 5-year records as for the 10-year records. These results suggest that further analyses conducted to verify and extend these results should include at least 10 years of data.

The “bootstrap” samples can be used to generate a table similar to Table 2, which shows the worst case and median ratios. Table 3 provides not only the worst case and median but also the 25th and 75th percentiles. The worst case conditions for the 17-year analysis and the “bootstrap” analysis are the same for one year of data, as they should be. However, when two or more years are analyzed, the “bootstrap” estimate of the lowest ratio is always lower than the lowest ratio using the 17-year data base. This occurs because selections are made with replacement in “bootstrapping”, allowing the year with the lowest HSH to be selected more than once. The median values for the 17-year data base and the “bootstrap” samples are about the same. The 25th and 75th percentiles span 50% of the data. The range of ratio values between the 25th and 75th percentiles becomes smaller as the number of years approaches 5. Based on the 50th percentile results, the tendency is to underestimate the HSH values by as much as 17%, depending on the number of years of data employed in the analysis.

The values of the ratios in Table 3 can be used to adjust the HSH based on N years of data for year-to-year variability. If C is the HSH concentration based on N years of model runs, N < 5, and R is a ratio from Table 3, then

C_a = C[1 + (1 - R)],

where C_a is the adjusted concentration. Any R could be chosen from the table. Choosing a value of R from the worst case column would provide the largest increase to the HSH. The median values would provide a more typical correction. The corrections for the two sources used in this analysis are not too different. For multiple source problems perhaps an average of the small and large source ratios would be appropriate. However, differences in the ratios for different source types need to be verified with other data bases.
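As a worked illustration of the adjustment, the short Python sketch below applies the correction C[1 + (1 - R)] to the 1965 small-source HSH from Table 1, using the one-year median and worst case ratios tabulated for the small source. The function name `adjust_hsh` and its range check are hypothetical and not part of the original analysis.

```python
def adjust_hsh(c, r):
    """Adjusted concentration C_a = C[1 + (1 - R)] for an HSH value c
    and a ratio r (0 < r <= 1) taken from the table of ratios."""
    if not 0.0 < r <= 1.0:
        raise ValueError("ratio R must lie in (0, 1]")
    return c * (1.0 + (1.0 - r))


c_1965 = 17.90  # small-source HSH (µg m⁻³) from the 1965 meteorology
print(adjust_hsh(c_1965, 0.845))  # median (typical) correction, about 20.67
print(adjust_hsh(c_1965, 0.705))  # worst case correction, about 23.18
```

Note that 1 + (1 - R) is the first-order approximation of 1/R, so the adjusted value is always slightly lower than C/R; with the worst case ratio the adjusted 1965 value nevertheless exceeds the 17-year maximum HSH of 22.32 µg m⁻³.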

Table 3. Ratios formed from “bootstrapped” samples from 1-5 years of selected high second-high concentrations

         Small source                           Large source
         Worst    Percentile                    Worst    Percentile
N (y)    case     25       50       75          case     25       50       75
1        0.705    0.798    0.845    0.911       0.679    0.787    0.826    0.897
2        0.705    0.845    0.868    0.995       0.679    0.826    0.897    0.933
3        0.746    0.867    0.964    0.997       0.719    0.831    0.917    0.957
4        0.764    0.868    0.995    0.997       0.788    0.897    0.933    0.957
5        0.802    0.911    0.995    1.000       0.734    0.897    0.933    0.957

CONCLUSIONS

Before approaches such as those described in this paper can provide a confident method for adjusting HSH concentrations for climatological variability, more data need to be analyzed. Besides differences in meteorology, differences in source characteristics and terrain may affect the ratios. Further analysis of data bases of similar length may show that the ratios are stable and that one can make an adjustment with a fairly high degree of precision. These issues need to be addressed. If further analysis shows that the ratios are insensitive to different data sets or can be partitioned into well-defined categories, the methods described in this paper provide a simple adjustment to the HSH in situations where 5 years of on-site data are not available.

The data employed in this analysis resulted from computer simulations using the CRSTER air quality dispersion model. A previous assessment of the CRSTER model did not reveal any tendency for this model to yield unusually high 24-h concentration estimates for situations when the terrain is well below the stack top (Turner and Irwin, 1982). For the model simulations, level terrain was assumed. Hence, although some differences would likely result if a different model were used, we do not anticipate that such differences would significantly alter our conclusions.

This discussion investigated uncertainty in estimated maximum concentration values resulting from use of meteorological records of less than five years in length. We have suggested that use of less than 5-year records could typically result in underestimating the high second-high concentration by 15%, and that a worst case assumption could result in underestimating the high second-high by 30%. If this degree of underestimation is not tolerable, then some sort of correction needs to be applied. The exact form of the correction is open to question, as is the magnitude of the uncertainty. One might question whether five years are sufficient and/or decide that the topics under discussion are best handled using a completely different approach. This paper has served its intended purpose if discussions are forthcoming in the literature that better address the issues raised in this work.

Acknowledgements-The authors wish to express their appreciation to the Office of Air Quality Planning and Standards of the U.S. Environmental Protection Agency for the 17 years of CRSTER runs used in this analysis. They also wish to express appreciation to Mr. D. Bruce Turner for his many helpful suggestions in the analysis and presentation of the results in this paper.

REFERENCES

Burton C. S., Stoeckenius T. E. and Nordin J. P. (1983) The temporal representativeness of short-term meteorological data sets: implications for air quality impact assessments. SYSAPP-83/092, Systems Applications, Inc.

Efron B. (1982) The jackknife, the bootstrap and other resampling plans. Society for Industrial and Applied Mathematics, No. 38.

Turner D. B. and Irwin J. S. (1982) Extreme value statistics related to performance of a standard air quality simulation model using data at seven power plants. Atmospheric Environment 16, 1907-1914.

U.S. EPA (1978) Guideline on air quality models. U.S. Environmental Protection Agency, EPA-450/2-78-027.

U.S. EPA (1977) User’s manual for single-source (CRSTER) model. U.S. Environmental Protection Agency, EPA-450/2-77-013.