AN ADAPTIVE DIAGNOSTIC MODEL FOR AIR QUALITY MANAGEMENT

ROBERT CARBONE
Academic Faculty of Management Science
and
WILPEN L. GORR
School of Public Administration, The Ohio State University, Columbus, OH 43210, U.S.A.

(First received 1 November 1976 and in final form 13 January 1978)

Abstract - This paper draws attention to the program evaluation and regulation adjustment capabilities required for implementing and maintaining air quality standards. A diagnostic approach is presented for determining the causes of air quality problems based on a multivariate time series model. The model formulation is based on routinely available data and the concept of time-varying parameters with estimation via an adaptive filtering technique. The diagnostic approach is illustrated with some total suspended particulates data from Allegheny County, Pennsylvania, recorded over a period of time during which a series of pollution control milestone events occurred.
1. INTRODUCTION
Implementation and maintenance of air quality standards tend to be treated as once-only planning efforts, with the resulting pollutant emission regulations being thought of as “permanent” (see Federal Register, 1971 and 1974). Ambient air quality resulting from plans may miss the targeted standards, however, for several reasons. These include emission inventory errors, forecasting errors, diffusion modeling prediction errors, non-compliance with regulations and unfavorable changes in uncontrollable variables such as background concentration. In some cases, then, it may be necessary to institute feedback control systems to adjust regulations or make other corrections in air quality management according to the experienced air quality trends.

The purpose of this paper is to draw attention to the adjustment-of-air-pollution-regulations problem and to develop and test an appropriate diagnostic approach for the adjustment process - the Adaptive Diagnostic Model (ADM). Diagnostic models have the purpose of inexpensively limiting the explanation of an event to a subset of all possible causes of the event. Thus explanation becomes a staged process, with the first stage being made with readily available information and routine procedures. The ADM, which is presented for this first stage analysis, is a multivariate time series model based on the concept of time-varying parameters, with parameter estimates calculated by the Adaptive Estimation Procedure (AEP) developed by Carbone (1975) and Carbone and Longini (1975, 1976). Through the application of this adaptive filtering algorithm, ADM is self-adaptive to changing structures of an airshed so that it has the capability of continuously providing diagnostics without the attention of systems analysts.

The next section of the paper describes the empirical setting for a demonstration of ADM using total suspended particulates (TSP) concentrations recorded over a period of five years at the Logans Ferry monitoring site in Allegheny County (Pittsburgh), Pennsylvania. In an earlier paper (Carbone and Gorr, 1976) we reported initial experience in applying AEP to the estimation of multivariate models of pollutant concentrations, specifically background concentrations, based in part on these data. These data are reexamined in this paper because of specific milestone events of air pollution control which have occurred over the period at the various point sources in the vicinity of the site in question. An analysis of such data allows us to see whether or not ADM is capable of detecting and separating such events from the series of ambient concentrations without using any a priori knowledge. Section 3 presents ADM and a brief discussion of the estimation approach. Finally, the results of our study are summarized in Section 4.
2. DATA
Figure 1 shows the locations of major particulate sources in Allegheny County including sources of fugitive emissions (largely from by-product coke plants) and point sources (largely integrated steel plants and fossil-fueled power plants). Also in Fig. 1 is the monitoring site under study, Logans Ferry, operated by the Allegheny County Bureau of Air Pollution Control. Additional information on the Logans Ferry monitor and nearby sources can be found in Rubin and Bloom (1975), Carbone and Gorr (1976), and Gorr and Dunlap (1977).

Fig. 1. Distribution of particulate sources and site of Logans Ferry monitor in Allegheny County (1972). (Legend: emission rate; fuel combustion source; industrial process source.)

The data set from Logans Ferry has 365 observations of 24-h average TSP taken by a hi-vol sampler from May 1970 to April 1975. A standard Unico shelter and Gelman filter paper with a collection efficiency of 99.7% for particles of size 0.3 μm and larger are used for this instrument. Meteorological data include surface measurements and Rawinsonde data taken at the Greater Pittsburgh Airport in the western portion of Allegheny County. The Rawinsonde data were processed by Denardo and McFarland Weather Services, West Mifflin, Pennsylvania to yield inversion-related variables.

Lending interest to this data set are air pollution control activities of two large fossil-fueled power plants, A and B, just to the west of Logans Ferry (see Fig. 1), which followed a trend of decreasing particulate emissions over the five year study period:

Plant A
(a) In 1970 there were five coal-fired boilers in service with 322 MW capacity: Nos. 2, 4 and 6 - 565 MMBTU h⁻¹ each; No. 77 - 1020 MMBTU h⁻¹; No. 88 - 1450 MMBTU h⁻¹.
(b) In January 1973 Nos. 2, 4 and 6 were derated and finally taken out of service.
(c) In December 1974 No. 77 was converted to low sulfur oil, with the plant serving a minor role to provide peak power.

Plant B
(a) Old coal-fired units rated at 262 MW were gradually derated from 1970 and taken out of service in May 1971.
(b) A new coal-fired unit with precipitators and a tall stack commenced operation in June 1970 and by May 1971 was operating at capacity.

3. MODEL FORMULATION AND ESTIMATION APPROACH
The model is formulated to relate pollution concentration, $c(t)$, in real time at a monitoring site during period $t$ to a set of meteorological and other indicator variables, denoted by $x$ and $z$:

$$c(t) = \Big[\prod_{i \in G} \alpha_i(t)^{z_i(t)}\Big] \Big\{ \Big[\prod_{i \in I} \alpha_i(t)^{z_i(t)}\Big] \Big(\sum_{j \in J} \beta_j(t)\, x_j(t)\Big) + \Big[\prod_{i \in NI} \alpha_i(t)^{z_i(t)}\Big] \Big(\sum_{j \in NJ} \beta_j(t)\, x_j(t)\Big) \Big\} + u(t) \tag{1}$$

where $G$ is a set of general variables, $I$ and $J$ are sets of inversion related variables, and $NI$ and $NJ$ are sets of non-inversion related variables. The $\alpha_i(t)$ and $\beta_j(t)$ in (1) are the corresponding time-varying parameters and $u(t)$ is an undefined error term.

All the variables in (1) are restricted to the values of 0 or 1. Wind-directional sectors for the $NJ$ subset and “inversion” for the $J$ subset correspond to the x-type variables. In the present context, inversion is assumed to be a non-wind-directional phenomenon. Only one of the x-type variables can be assigned the value 1 for any given period, since observations are classified either as a particular wind-direction sector or as inversion. In our empirical study, we have defined 45-135°, 135-225°, 225-315° and 315-45° as the wind-directional sectors, and a period $t$ is determined to be an “inversion” if there is a morning ground-based inversion with a temperature differential of at least 2°C across the thickness of the inversion.

The z-type variables by definition indicate the occurrence of one of a finite number of nominal classes of explanatory variables such as precipitation levels, wind speed, etc. They are assumed to indirectly affect concentration by weighting or correcting the estimated levels determined by the $\beta$ parameters. While
some adjustment factors in (1) are assumed to affect both wind-directional and inversion classified observations (the elements of $G$), others are viewed as having an adjusting effect only when one of the two types of period prevails. In our study, four classes for precipitation (none, 0-0.254 cm, 0.254-0.889 cm, and greater than 0.889 cm in water equivalent), two classes for degree day (“heating” if the 24-h average ambient temperature is less than 18.3°C, “cooling” otherwise), and two classes reflecting general activity level for particulate sources (“weekday” and “weekend”) constitute the elements of $G$. Moreover, wind speed adjustments are applied only to the four wind directional estimates, and not to the inversion condition. Five discrete classes for the variable “24-h average wind speed” (0-2.25 m s⁻¹, 2.25-4.47 m s⁻¹, 4.47-6.71 m s⁻¹, 6.71-8.94 m s⁻¹, and greater than 8.94 m s⁻¹) are the z-type variables contained in $NI$. No z-type factors have been used for $I$.

Attention is now directed to the estimation problem, which is to capture the path that each parameter may follow over time so as to adapt (1) to changing structural conditions. This is accomplished by applying the AEP algorithm. According to this algorithm, updated parameter estimates of (1) are obtained with each new observation via the following two recursive formulae:

$$\beta_j(t) = \beta_j(t-1) + |\beta_j(t-1)| \left[ \frac{c(t) - \hat{c}(t)}{\hat{c}(t)} \cdot \frac{x_j(t)}{\bar{x}_j(t)} \cdot \frac{1}{D} \right] \quad \text{for all } j \tag{2}$$

and

$$\alpha_i(t) = \alpha_i(t-1) + \alpha_i(t-1) \left[ \frac{c(t) - \hat{c}(t)}{\hat{c}(t)} \cdot z_i(t) \cdot \frac{1}{P D} \right] \quad \text{for all } i \tag{3}$$

where:
$\hat{c}(t)$ = a short run prediction of concentration calculated by applying the observed data at $t$ to the parameter estimates of $t - 1$;
$D$ = a damping parameter > 1;
$P$ = the number of variables described by z-type indicators;
$\bar{x}_j(t)$ = updated average for the $j$-th x variate.

An exponential smoothing scheme is used to calculate this latter average as follows:

$$\bar{x}_j(t) = \delta_j\, x_j(t) + (1 - \delta_j)\, \bar{x}_j(t-1)$$

where $0 < \delta_j < 1$ and depends on the “forgetting rate” of past observations applied to the process.

This algorithm is easy to implement; for example, it requires no matrix inversion or calculation of correlation functions. The algorithm functions as a servomechanism. Given initial values for the parameters, it is designed to automatically capture, with some lag, the types of processes governing the change in parameter values. This aspect is crucial since it is generally impossible to assume a priori knowledge of the processes governing structural changes in an airshed. Note that no restriction in (2) and (3) is imposed as to the type of parameter variation that may arise.

In reference to the empirical study, the following non-unique steps were taken to estimate the $\alpha_i(t)$ and $\beta_j(t)$ coefficients using AEP:
(a) “None” for precipitation, “cooling” for degree day, “weekend” for day and 2.25-4.47 m s⁻¹ for wind speed were defined to be standard conditions for normalization of the z-type variables. The $z_i(t)$ variables associated with the standard classes are suppressed by setting their values to zero for all observations. (Then the $\beta_j(t)$ coefficients represent the TSP concentration of the given wind direction or inversion during the standard condition.)
(b) At $t = 0$, $\alpha_i(t) = 1\ \forall i$ and $\beta_j(t) = 80$ μg m⁻³ $\forall j$ (an approximation of average TSP concentration at Logans Ferry).
(c) $\bar{x}_j(t) = 1.0\ \forall j$ at $t = 0$ and $\delta_j = 0.04$.
(d) $D = 40$ was applied for updating the $\beta_j(t)$'s in (2) and $D = 10$ for the $\alpha_i(t)$'s in (3). A smaller value of $D$ was chosen for the $\alpha_i(t)$'s since they are updated only when the class of explanatory variable corresponding to $i$ occurs.
(e) The set of observations was then reiterated several times through the AEP equations (2) and (3) via a forward-backward input procedure (see Longini et al., 1975, and Carbone and Longini, 1975) to adjust the value of $D$, if necessary, and to converge to the pattern of change in the parameter estimates. In this case, no adjustments to $D$ were found to be necessary.

Note that because of the small values of $D$ used, the algorithm and parameter estimates have virtually no memory of individual events of past iterations.

A “purged” data set was also created by deleting the observations that resulted in an absolute estimation error greater than 60 μg m⁻³ after the above steps were performed. This second data set was similarly processed through the above steps. The purpose here is to investigate robustness properties of the estimated parameter paths derived.
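A single pass of the recursions (2) and (3) can be sketched in a few lines. This is a minimal illustration, not the authors' program. It assumes the multiplicative reading of model (1), in which each observed z-type class scales the active $\beta$ level, and it assumes that damping enters as $1/D$ in (2) and $1/(P D)$ in (3). The function names, the `applies` matrix (which encodes which z-type class adjusts which level) and the default `P=4` are hypothetical choices made for the example.

```python
import numpy as np

def predict(alpha, beta, z, x, applies):
    """Short-run prediction c_hat(t) under the assumed reading of model (1).

    alpha   : (n_z,) multiplicative parameters, one per z-type class
    beta    : (n_x,) level parameters (wind sectors and inversion)
    z, x    : 0/1 indicator vectors for the current period (x is one-hot)
    applies : (n_z, n_x) 0/1 matrix, applies[i, j] = 1 if class i adjusts
              level j (hypothetical encoding of the G / NI / I subsets)
    """
    # Exponent is 1 only where class i is observed and adjusts level j.
    exponents = z[:, None] * applies
    factors = np.prod(alpha[:, None] ** exponents, axis=0)
    return float(np.sum(beta * x * factors))

def aep_step(alpha, beta, xbar, z, x, c, applies,
             D_beta=40.0, D_alpha=10.0, delta=0.04, P=4):
    """One update of (2) and (3) for a new observation c (assumed form)."""
    c_hat = predict(alpha, beta, z, x, applies)
    rel_err = (c - c_hat) / c_hat
    # Exponentially smoothed average of each x-type indicator.
    xbar = delta * x + (1.0 - delta) * xbar
    # Equation (2): only the observed level moves, since x is one-hot.
    beta = beta + np.abs(beta) * rel_err * x / (xbar * D_beta)
    # Equation (3): only observed classes that adjust the active level move.
    z_active = z * (applies @ x)
    alpha = alpha + alpha * rel_err * z_active / (P * D_alpha)
    return alpha, beta, xbar, c_hat

# Toy setting: one wind sector plus inversion, one z-type class (weekday).
alpha = np.array([1.0])
beta = np.array([80.0, 80.0])
xbar = np.array([1.0, 1.0])
applies = np.array([[1.0, 1.0]])
alpha, beta, xbar, c_hat = aep_step(
    alpha, beta, xbar,
    z=np.array([1.0]), x=np.array([1.0, 0.0]), c=100.0, applies=applies)
print(c_hat, beta, alpha)
```

In this toy step the prediction is 80 against an observation of 100, so the observed sector's level moves from 80 to 80.5 and the weekday factor from 1 to 1.00625, while the inversion level is untouched. The small steps illustrate the lag that the damping parameter introduces.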
4. RESULTS
Table 1 presents some measures of performance for the full and purged data sets. The simple correlation coefficient in this table is between observed and estimated TSP concentrations, and the serial correlation coefficient is for a first-order autoregressive scheme. The average observed and estimated concentrations reported in Table 1 show little error in central tendency for both data sets. The standard deviation of estimated concentrations is smaller than that of the observed in both cases, but this is to be expected when significant measurement errors exist, as with hi-vol data, since the variance of observed concentrations includes the variance of true concentrations and the error distribution variance. Furthermore, we note that statistics reflecting closeness of fit (root mean square error, simple correlation coefficient and mean absolute error) are comparable to those derived from other empirical modeling efforts. See for example Islitzer and Slade (1968), McCollister and Wilson (1975), Bankoff and Hanzevack (1975), Samson et al. (1975), and Sidik and Neustader (1976). Finally, no evidence of serial correlation is detected, which is an indication that the time-lag introduced by damping is causing no adverse effects.

Table 1. Some descriptive measures of model performance (concentrations and errors in μg m⁻³)

Statistic                           Model from        Model from
                                    full data set     purged data set
Average observed concentration          87.4              81.9
Average predicted concentration         88.9              83.2
Standard deviation observed             43.3              34.2
Standard deviation estimated            27.9              23.8
Root mean square error                  36.0              26.3
Mean absolute error                     27.0              20.9
Simple correlation coefficient           0.563             0.643
Serial correlation coefficient          -0.022            -0.052

Table 2 presents parameter estimates for the Logans Ferry TSP model. The first column of Table 2 is the % occurrence of each class of a variable in the sample. The next two columns give parameter estimates for the start and end of the five year study period as estimated from the full data set. The last column contains the
mean absolute deviation between parameter estimates from the full and purged data sets computed over observations common to both data sets.

Table 2. Parameter estimates for model with full data set and comparison with purged data set (wind direction and inversion entries are β estimates in μg m⁻³; all other entries are multiplicative α estimates)

Variable class           % of          Parameter estimates      Mean absolute error
                         occurrence    Start        End         full vs purged
Wind direction (°)
  45-135                    7.4         43.1         52.9           1.20
  135-225                   8.5        119.0         97.3          11.44
  225-315                  40.6        118.6         95.1           2.62
  315-45                   14.2         63.7         54.3           2.51
  Inversion                29.3        106.7        112.2           4.21
                          100.0
Wind speed (m s⁻¹)
  0-2.24                    6.3          1.092        1.106         0.116
  2.24-4.47*               51.6          1.000        1.000         0.000
  4.47-6.71                36.4          1.013        0.960         0.027
  6.71-8.94                 4.9          1.088        1.055         [illegible]
  > 8.94                    0.8          1.205        1.196         0.004
                          100.0
Precipitation
  none*                    53.4          1.000        1.000         0.000
  low                      20.3          0.906        0.854         0.012
  medium                   15.9          0.873        0.851         0.034
  heavy                    10.4          0.890        0.895         0.018
                          100.0
Degree day
  cooling*                 28.5          1.000        1.000         0.000
  heating                  71.5          0.854        0.807         0.012
                          100.0
Day
  weekend*                 26.3          1.000        1.000         0.000
  weekday                  73.7          1.250        1.127         0.038
                          100.0

* Parameter estimates for these classes are always equal to 1 because they were defined as standard classes with respect to which the effects of other multiplicative variables are compared.

From this table, it is easy to calculate the range of estimated concentrations, which is 32.1-179.2 μg m⁻³ at the start and 34.1-131.1 μg m⁻³ at the end. The values reported give results as theoretically expected. For example, the results reveal that concentration on a weekday is 25% greater than on a weekend at the beginning and about 12% greater at the end of the period (which may provide some measure of the general effectiveness of control policies); that precipitation reduces concentration by about 12-15% in contrast to no precipitation over the period; and that a resultant wind direction from the major point sources located close to the monitor and from the city (135-315°, see Fig. 1) for days characterized by the standard condition leads to concentration levels close to three times greater than those for the background resultant wind direction (45-135°) at the beginning and less than two times greater at the end (which also provides some indication of the effectiveness of control policies). In fact, the results reveal that concentrations from the urban area directions have decreased by 20-40% over the period, depending upon the type of indicator condition. Also of interest is the fact that the coefficient for the highest wind speed class is approximately 20% higher than those of the three middle classes. This provides some evidence that the Steady Wind Incident phenomenon of high pollutant concentration exists for TSP as well as sulfur dioxide (see Gorr and Dunlap, 1977).

To further explore the diagnostic capabilities of ADM, Figs 2-6 present the patterns of variation over time of the estimates of certain parameters. In all cases examined, the paths derived from both data sets are plotted with the heavier line corresponding to the full data set.
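The start-of-period range quoted from Table 2 can be checked with a short calculation. The sketch below assumes the multiplicative reading of model (1): a level $\beta$ is scaled by one factor from each general variable, and wind speed factors apply only to the wind-direction levels. The values are the start-of-period estimates as read from Table 2, and the function name is hypothetical.

```python
from itertools import product
from math import prod

def concentration_range(beta_wind, beta_inversion, wind_speed, general):
    """Smallest and largest estimated concentration over all class combinations."""
    levels = []
    for combo in product(*general):
        g = prod(combo)
        # Inversion days receive no wind speed adjustment.
        levels.append(beta_inversion * g)
        # Wind-direction days are also scaled by one wind speed factor.
        levels.extend(b * w * g for b, w in product(beta_wind, wind_speed))
    return min(levels), max(levels)

# Start-of-period estimates as read from Table 2.
beta_wind = [43.1, 119.0, 118.6, 63.7]          # four wind sectors, ug/m3
beta_inversion = 106.7                           # inversion, ug/m3
wind_speed = [1.092, 1.000, 1.013, 1.088, 1.205]
general = [
    [1.000, 0.906, 0.873, 0.890],                # precipitation classes
    [1.000, 0.854],                              # degree day: cooling, heating
    [1.000, 1.250],                              # day: weekend, weekday
]

lo, hi = concentration_range(beta_wind, beta_inversion, wind_speed, general)
print(round(lo, 1), round(hi, 1))  # 32.1 179.2
```

The minimum comes from the background sector on a heating weekend with medium precipitation at the standard wind speed, and the maximum from the 135-225° sector on a dry, cooling weekday in the highest wind speed class.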
Observations not common to both data sets have also been deleted in the parameter plots. Since parameter estimates are only updated when a class of a variable is observed, it is assumed in these figures that they persist in value until such an occurrence.

Figures 2, 3 and 4 depict the paths of the $\beta$ parameters associated, respectively, with the 45-135° (background) and 225-315° (major point sources) wind directional classes, and inversion. The plotted values in these cases represent the patterns of variation of expected TSP given the standard condition; i.e. no precipitation, cooling, weekend, and 2.25-4.47 m s⁻¹ wind speed for non-inversion days. It is evident from these figures that the paths from the full and purged data sets follow nearly identical patterns. Our results portrayed in Fig. 2 reveal an increasing trend in TSP advected into the region under the standard condition. In contrast, we note from Fig. 3 a decreasing trend for the same condition whenever the wind was blowing from the direction of major nearby point sources (power plants A and B in Fig. 1). The milestone events in particulate emission controls at these sources mentioned in Section 2 appear to be evidenced in Fig. 3.

Also plotted in Fig. 4 is a running annual average (see Rubin and Bloom, 1975) computed from monthly summaries of district steel production reported by the American Iron and Steel Institute (AISI), which combines Allegheny County steel production with that of some more distant facilities. Significant contributions to inversion concentrations are more likely due to low level emissions, such as from by-product coke plants, as opposed to elevated sources such as power plants. A qualitative comparison of these plots suggests that inversion concentration followed a pattern of change similar to the steel production trend until 1973. Thereafter a significant departure in the opposite direction is noticed. The year 1973 may then indicate the point in time when local emission control policies for steel plants impacted ambient air quality.

Fig. 2. Parameter paths of the $\beta$ estimates for 45-135° wind direction (heavy line indicates full data set, light line purged data set).
Fig. 3. Parameter paths of the $\beta$ estimates for 225-315° wind direction.
Fig. 4. Parameter paths of the $\beta$ estimates for inversion.
Fig. 5. Parameter paths of the $\alpha_i$ estimates for mild precipitation.
Fig. 6. Parameter paths of the $\alpha_i$ estimates for weekday.
(Horizontal axis in Figs 2-6: observation number, 0-375, with the years 1971-1975 marked.)
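The running annual average applied to the monthly AISI production summaries can be illustrated as follows. This is only a sketch: it assumes a trailing 12-month window, which the text does not specify (see Rubin and Bloom, 1975, for the definition actually used), and the function name is hypothetical.

```python
def running_annual_average(monthly, window=12):
    """Trailing mean over `window` consecutive monthly values.

    Returns one value per month once a full window is available,
    so the output has len(monthly) - window + 1 entries.
    """
    averages = []
    total = 0.0
    for k, value in enumerate(monthly):
        total += value
        if k >= window:
            total -= monthly[k - window]   # drop the month leaving the window
        if k >= window - 1:
            averages.append(total / window)
    return averages

# Thirteen months of made-up production figures yield two annual averages.
print(running_annual_average(list(range(1, 14))))  # [6.5, 7.5]
```

Smoothing over a full year removes seasonal swings in production, which makes the index easier to compare by eye with the slowly varying parameter paths.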
Figures 5 and 6 present the patterns of variation of the estimates of the multiplicative parameters assigned, respectively, to the classes “mild precipitation” and “weekday”. Here again, we can visually observe consistency between the patterns derived from the full and purged data sets. Figure 5 reveals that the impact of mild precipitation in contrast to no precipitation has remained somewhat constant over the study period, as is expected. It is interesting to note from Fig. 6 that the path derived for weekday in contrast to weekend, which reflects changes due to general source activity level, follows a pattern similar to the inversion trend portrayed in Fig. 4. Comparing this path to the running annual AISI steel production index as in Fig. 4, we can easily observe the same phenomenon. The same pattern of variation between these two paths seems to persist until 1973, and thereafter a decreasing trend in the weekday parameter exists, as opposed to an increasing trend in the steel production index. This further supports our earlier contention as to when local emission control policies show some sign of effectiveness.

5. CONCLUDING REMARKS
The main thrust of this paper has been to propose an adaptive diagnostic framework for air quality management based exclusively on routinely available information. The results reported in the previous section suggest that this framework is promising. These results demonstrate the ability of the system to capture and analyze changes in trends in concentration levels accurately without using any a priori knowledge. Large scale computational experiments with real time data from a telemetry network are planned. Also, we are currently exploring the use of this type of modeling process for the dynamic calibration of monitoring instruments, long-run forecasting via the extrapolation of the estimated paths of the parameters, and finally, refinement of ADM for real time prediction of concentration levels at any point in an airshed given an existing monitoring network.

Acknowledgments - Appreciation is expressed to members of the Allegheny County Bureau of Air Pollution Control and especially Deputy Director R. J. Chleboski and S. Feigenbaum, whose efforts resulted in much of the information for this study. Partial support was provided by an Ohio State University Research Grant.
REFERENCES
Bankoff S. G. and Hanzevack E. L. (1975) The adaptive filtering transport model for prediction and control of pollutant concentration in an urban airshed. Atmospheric Environment 9, 792-808.
Carbone R. (1975) The design of an automated mass appraisal system using feedback. Unpublished Ph.D. dissertation, Carnegie-Mellon University, Pittsburgh.
Carbone R. and Gorr W. L. (1976) Environmental Modeling and Simulation. (Edited by Ott) pp. 478-482. USEPA 600/9-76-016, Washington, D.C.
Carbone R. and Longini R. L. (1975) An adaptive stochastic approximation algorithm for estimating time-varying parameters. College of Administrative Science, The Ohio State University WPS 75-56.
Carbone R. and Longini R. L. (1977) A feedback approach to automated real estate assessment. Mgt Sci. 24, 241-248.
Federal Register. (1971) Environmental Protection Agency 36, No. 158, 15486-15505.
Federal Register. (1974) Environmental Protection Agency 39, No. 167, 31000-31009.
Gorr W. L. and Dunlap R. W. (1977) Characterization of steady wind incidents for air quality management. Atmospheric Environment 11, 59-64.
Islitzer N. F. and Slade D. H. (1968) Meteorology and Atomic Energy. (Edited by Slade) pp. 132,143. U.S. Atomic Energy Commission TID-24190, Springfield, Virginia.
Longini R. et al. (1975) Filtering without phase shift. IEEE Trans. biomed. Engng 432-433.
McCollister G. M. and Wilson K. R. (1975) Linear stochastic models for forecasting daily maximum and hourly concentrations of air pollutants. Atmospheric Environment 9, 417-423.
Rubin E. X. and Bloom H. T. (1975) Maintenance of ambient particulate standards in an industrialized region. Presented at the 68th Annual Meeting of the Air Pollution Control Association, Boston.
Samson P. J., Neighmond G. and Yencha A. J. (1975) The transport of suspended particulates as a function of wind direction and atmospheric conditions. J. Air Pollut. Control Ass. 25, 1232-1237.
Sidik S. M. and Neustader H. E. (1976) Environmental Modeling and Simulation. (Edited by Ott) pp. 678-682. USEPA 600/9-76-016, Washington, D.C.
Widrow B. et al. (1975) Adaptive noise cancelling: principles and applications. Proc. IEEE 63, 1692-1716.