Regional air pollution assessment by cumulative semivariogram technique


Pergamon

Atmospheric Environment Vol. 29, No. 4, pp. 543-548, 1995
Copyright © 1995 Elsevier Science Ltd
Printed in Great Britain. All rights reserved
1352-2310/95 $9.50 + 0.00

1352-2310(94)00246-O

REGIONAL AIR POLLUTION ASSESSMENT BY CUMULATIVE SEMIVARIOGRAM TECHNIQUE

ZEKAI ŞEN

Technical University of Istanbul, Faculty of Aeronautics and Astronautics, Department of Meteorology, Maslak 80626, Istanbul, Turkey

(First received 20 December 1992 and in final form 18 June 1994)

Abstract--Cumulative semivariogram (CSV) methodology is proposed for depicting qualitative regional features of air pollutant dispersion within an area. This technique provides useful information about the regional isotropy, continuity and dependence of the air pollutant distribution. It is also possible to obtain evidence that the data can be approximated by certain types of regional stochastic models. The application of the proposed methodology is presented for smoke, sulfur dioxide and suspended particulate matter pollutants measured at different sites in the city of Istanbul.

Key word index: Air pollution, cumulative semivariogram, regional variability, models.

INTRODUCTION

The air pollution phenomenon takes place within the atmospheric planetary boundary layer under the combined effects of meteorological factors, earth surface topographic features and the releases of air pollutants from various sources. Although the concentrations of these pollutants may be measured rather exactly at source sites, they become more dispersed and less dense as the distance from the emission points increases. Meteorological factors such as wind speed and direction, humidity, temperature and pressure on the one hand, and the earth surface roughness on the other, act as dominant agents in the regional mixing of air pollutants. The mixing process is relatively straightforward only for a pollutant that is effectively chemically inert. If particular pollutants react with others, additional complications appear that have not yet been overcome.

Furthermore, measurements of air pollutants at regular time intervals are usually made at irregularly distributed sites within an area. Such irregularities in measurement sites are inevitable in practical sampling procedures, and they cause difficulties in any regional study of air pollution, whether by numerical models, for which a regular grid and hence a regular node distribution is a prerequisite, or by spatial simulation of the air pollution phenomenon with stochastic processes. An additional practical difficulty arises from the fact that most often the meteorological station locations do not coincide with the air pollution measurement sites. Consequently, it is not possible to seek meaningful relationships between meteorological factors and air pollutant concentrations. Hence, it is very convenient to have a methodology whereby the meteorological factor information can be conveyed to the air pollution measurement stations or vice versa.

Due to the aforementioned difficulties, for a long time almost all researchers concentrated on modeling air pollution at a single site by employing statistical methods (Larsen, 1971). The investigation of air pollutant variables at a single site for different periods of time is accomplished mostly by univariate time series analysis (Merz et al., 1972; Chock et al., 1975; Zinsmeister and Redman, 1980; Taylor et al., 1986). Such approaches cannot provide any regional information about the spatial variations of air pollutants; they may be successful only for temporal variations, if implemented properly. The spatial variation of air pollutants has so far not been studied by appropriate techniques, but only by constructing conventional contour maps. However, relationships between air pollutant concentrations and meteorological factors have been investigated on a regional basis by employing multivariate statistical modeling techniques, especially principal component analysis (PCA) (Peterson, 1970, 1972; Henry and Hidy, 1979). PCA has been successfully applied to air pollution analysis for the past several decades and the literature is full of these applications. It has also been used since the mid-1950s to investigate the spatial properties of global-scale stationary waves in the atmosphere, the geophysical correlation of sea-surface temperatures during the El Niño-Southern Oscillation, and in many other applications.

It is the main purpose of this paper to present a new technique for evaluating qualitatively the regional features in air pollution concentration dispersions. This technique is the cumulative semivariogram (CSV) function as suggested by Şen (1989). It provides significant clues about the regional variability of any air pollutant within the study area. The method is applied to Istanbul air pollutant variables, namely, smoke, SO2 and suspended particulate matter (SPM).

CUMULATIVE SEMIVARIOGRAM TECHNIQUE

The agents that cause the mixing of air pollutant concentrations within the atmosphere vary in time as well as in space. Measurement, assessment and modeling of these variations provide illuminating insights into the analysis, control and prediction of air pollution phenomena. So far in the literature, temporal measurements made at individual sites appear as time series sampled at equal intervals. They provide a common basis for identifying single-site variation patterns by suitable stochastic models, and consequently one can control the standard levels of air pollution concentrations. As already mentioned in the introduction, many researchers have investigated time variability only. On the other hand, the spatial variability of air pollutants from one site to another leads to the concept of regional variability of emission concentrations within the area. This variability determines the regional behavior as well as the predictability of the concentrations, on the basis of which interpretations are derived, provided that suitable techniques and models are identified.

For spatial variability, the classical time-series techniques yield useful interpretations, but only for sampling at equal distances. A great deal of progress has been made in the adaptation of statistical techniques to unevenly sampled data (North et al., 1982). These techniques do yield useful information, which is, however, significantly different from the information obtained by the cumulative semivariogram technique that is the main theme of this paper. A regular scatter of sites might not provide as much regional information as irregular sites, since meteorological agents and surface features are almost always heterogeneous. Consequently, the following significant questions remain unanswered so far in the literature on air pollution.

(i) How to quantify from irregular site measurements whether the regional air pollution distribution is homogeneous, continuous, dependent, etc.?
(ii) How to model the heterogeneity so as to represent continuous variability within the area concerned?
(iii) How to construct maps concerning the regional variability such that the estimates are unbiased?

Only answers to the questions in (i) fall within the scope of this paper. It is well known that even though the measurement sites are irregularly distributed one can find statistical parameters such as the mean, median, mode, variance, skewness, etc., but they do not yield any detailed information about the phenomenon concerned. The greater the variance the greater the variability, but unfortunately this is a global interpretation without detailed useful information. The structural variability of any phenomenon within an area can best be measured by comparing the relative change between two sites. For instance, if any two sites a distance d apart have measured concentration values $Z_i$ and $Z_{i+d}$, then the relative variability can simply be written as $(Z_i - Z_{i+d})$. However, similar to Taylor's (1915) theory concerning turbulence, the squared difference, V(d), represents this relative change in the best possible way. Hence,

$$V(d) = (Z_i - Z_{i+d})^2. \tag{1}$$

This function first appeared in the Russian literature as the "structure function" of a regional variable. It subsumes the assumption that the smaller the distance, d, the smaller will be the structure function. Large variability implies that the degree of dependence among pollutant concentrations might be rather small even for sites close to each other. Such variability may be a product of active weather phenomena such as changes in wind speed and direction, pressure, relative humidity, temperature, etc. In order to quantify the degree of regional variability, variance and correlation techniques have been frequently used in the literature. However, these methods cannot account correctly for the regional dependence because of non-normal distribution functions and/or the irregularity of sampling positions. The classical semivariogram (SV) technique was proposed by Matheron (1963) to eliminate the aforementioned drawbacks. Mathematically, it is defined as a version of equation (1) that considers all of the available sites within the study area (Matheron, 1962; Clark, 1979):

$$\gamma_d = \frac{1}{2(N-d)} \sum_{i=1}^{N-d} (Z_i - Z_{i+d})^2 \tag{2}$$

where $\gamma_d$ is the SV value at distance d and N is the total number of equally spaced observations. The elegance of this formulation is that the probability distribution function of the regional variable is not important in obtaining the SV, and furthermore, it is effective for regular data points. It is to be recalled that the classical variogram, autocorrelation and autorun techniques all require equally spaced data values. Because the point sources are irregularly spaced, the use of these classical techniques is highly questionable; at best they provide biased, approximate results. The SV technique, although suitable for irregularly spaced data, has practical difficulties as summarized by Şen (1989). Among such difficulties is the grouping of distance data into classes of equal or variable lengths for SV construction; the result then appears in an inconsistent pattern and does not have the nondecreasing form expected in theory. However, adopting a cumulative semivariogram (CSV) provides, with the same data, a nondecreasing pattern without any grouping of distances, simply by taking successive summations of squared differences. Hence, each one of the distances contributes individually to the description of regional variability. The construction of the CSV can be explained step by step as follows.

(i) Calculate the distances between every possible pair of sites. If the number of sites is n, then m = n(n - 1)/2 half-squared differences and distances, $d_i$ (i = 1, 2, ..., m), exist.
(ii) For each pair of sites, find the half-squared difference, $D(d_i)$, between the air pollution data values. Hence, for each distance, $d_i$, a corresponding half-squared difference may be calculated.
(iii) Rank the distances in ascending order together with their attached half-squared differences, $D[d^{(i)}]$, where the superscript (i) indicates the rank. For instance, $D[d^{(1)}]$ is the half-squared difference corresponding to the smallest distance.
(iv) Successive summation of the ordered half-squared differences yields the CSV expression as

$$\gamma_c(d_k) = \sum_{i=1}^{k} D[d^{(i)}] \qquad (k = 1, 2, 3, \ldots, m) \tag{3}$$

where $\gamma_c(d_k)$ is the CSV value at the kth-order distance. This expression can be written for practical uses as

$$\gamma_c(d_k) = \frac{1}{2} \sum_{i=1}^{k} (Z_i - Z_{i-1})^2. \tag{4}$$

Fig. 1. Location map (symbol: sampling site).
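Steps (i) to (iv) can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the author's implementation: the function name `cumulative_semivariogram`, the planar (x, y) coordinates in km and the three-site data set are hypothetical, and ties between equal distances are simply kept in pair order.

```python
import numpy as np

def cumulative_semivariogram(coords, values):
    """Return (ranked pair distances, CSV values) following steps (i)-(iv)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    # Step (i): all m = n(n-1)/2 site pairs and their distances
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    # Step (ii): half-squared difference attached to each pair
    half_sq = 0.5 * (values[i] - values[j]) ** 2
    # Step (iii): rank the pairs by ascending distance
    order = np.argsort(dist, kind="stable")
    # Step (iv): successive summation of the ranked half-squared differences
    return dist[order], np.cumsum(half_sq[order])

# Hypothetical data: three sites (coordinates in km) and their concentrations
coords = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
values = [10.0, 14.0, 20.0]
d, csv = cumulative_semivariogram(coords, values)
print(d)    # ranked distances: 3, 4, 5 km
print(csv)  # cumulative sums: 8, 58, 76
```

With n = 3 sites there are m = 3 pairs; the half-squared differences 8, 50 and 18 are attached to the distances 3, 4 and 5 km, so the CSV never decreases as the distance grows.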

It is also possible to easily estimate CSVs over different compass directions in order to get some clues and interpretations about the anisotropy in the regional variable. For this purpose the projections of sites along any desired direction are obtained with a distance measurement from an arbitrary original point on the projection line. Subsequently, aforementioned steps are applied leading to a directional CSV.
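The projection idea can be illustrated with the following minimal Python sketch. It assumes planar site coordinates given as (east, north) pairs in km and takes the origin of the projection line at (0, 0); the function name `directional_csv`, the `direction` argument and the data are hypothetical choices made for illustration only.

```python
import numpy as np

def directional_csv(coords, values, direction):
    """CSV along one direction, using distances between projected sites."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    unit = np.asarray(direction, dtype=float)
    unit = unit / np.linalg.norm(unit)
    # Position of each site along the projection line, measured from (0, 0)
    s = coords @ unit
    i, j = np.triu_indices(len(values), k=1)
    dist = np.abs(s[i] - s[j])                    # directional pair distances
    half_sq = 0.5 * (values[i] - values[j]) ** 2  # half-squared differences
    order = np.argsort(dist, kind="stable")       # rank by projected distance
    return dist[order], np.cumsum(half_sq[order])

# Hypothetical data: (east, north) coordinates in km and concentrations
coords = [(0.0, 0.0), (3.0, 1.0), (1.0, 4.0)]
values = [10.0, 14.0, 20.0]
d_ew, csv_ew = directional_csv(coords, values, direction=(1.0, 0.0))  # east-west
d_sn, csv_sn = directional_csv(coords, values, direction=(0.0, 1.0))  # south-north
```

Both directional CSVs end at the same total (76 here) because the same pairs are summed; only the ranking of the pairs, and hence the shape of the curve, changes with direction.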

APPLICATION

The field data for the implementation of the proposed methodology are extracted from a previous study by Ayalp (1976) concerning air pollution monitoring in the city of Istanbul. Istanbul, with an estimated population of over ten million, is located in a basin of approximately 5712 km² surface area, with the famous strait (Bosphorus) separating the city along the north-south direction. Hence, the predominant wind is from the north, coming over the Black Sea and reaching the Bosphorus. The micrometeorological structure of the lower atmosphere over the city was investigated by Incecik (1986). He stated that the low surface wind velocities, the stable nighttime atmosphere and the weak dispersion ability observed in the vertical wind speed profiles indicate that the meteorological conditions favour air pollution accumulation in the region. The average annual smoke, sulfur dioxide and suspended particulate matter concentrations were recorded at eight different sites. The locations of these sites are shown in Fig. 1. Most of these sites were selected in highly polluted areas of the city, and they include different sources such as industrial, domestic and commercial areas. The frequency distribution functions of these air pollutant variables were found by Ayalp (1976) to comply with lognormal distribution functions.

In the Istanbul region the cloud cover and precipitation peak in the winter period, and breaks occur when transitory highs move across Turkey bringing low-level southwesterly winds. Cyclogenesis occurs over the northern part of the study area when cold fronts move slowly southwestward. Frontal passages are observed west of the winter. Moderate low-level turbulence and wind shear are common with southerly winds.

The sulfur dioxide observations are actual measurements of the gaseous acidity of the air caused by this gas, and the acidimetric titration method is used for its measurement. The air sample is bubbled through a dilute hydrogen peroxide solution for 24 h, where SO2 is absorbed and oxidized to form H2SO4. The acidity of the resulting solution is estimated by titration with alkali, NaOH, and the results are related to the sulfur dioxide concentration in the sample. In the smoke measurement, on the other hand, the air is pulled for 24 h through a filter paper held in a lacquered brass clamp, leading to a dark circular stain on the clean filter paper. The darkness of the stain is converted to concentration values. The "EP/ILAC/101 8-Day Automatic Sampler" is the instrument used worldwide for the detection of sulfur dioxide and smoke in ambient air.

Many authors (Liu et al., 1985) have suggested the use of the correlation function technique for exploring the spatial structure of air pollutants. However, correlation techniques are most suitable for normally distributed variables (Şen, 1977). In general, non-normal and especially lognormal distribution functions can be transformed to a normal distribution function rather easily, but the transformation of a negative exponential function presents great difficulties. Moreover, even if the transformation is possible, the transformed values will not reflect the genuine properties of the original observations. As mentioned in the previous sections, the CSVs are rather robust and valid for any distribution function. In fact, the central limit theorem of classical statistics states that whatever the underlying probability distribution function of a random variable, its successive summations or averages approach a normal distribution (Feller, 1968).

Herein, two perpendicular directions, namely, south-north (SN) and east-west (EW), are considered for the experimental CSV, in addition to the general CSV irrespective of direction, which yields a global picture of the regional variability of air pollutants. The experimental CSVs for each air pollutant considered in this study are presented in Figs 2-4. In order to show clearly the general trends in each of these graphs, the scatter of relevant points for each direction is represented by smooth curves and straight lines. All stations were used in each of the three figures; for example, the east-west component of the distance between two stations was used in the east-west calculation of the CSV. The different curves in fact have the same number of points, but some of them fall outside the area scaled in the figures and are therefore not shown. However, the points shown in the figures are enough to indicate the general trend along each direction. Although objective methods such as the least-squares technique can be employed in deriving the functional forms of these trends, in this paper the best curves and lines are fitted simply by eye. Interpretations of these graphs lead to the following significant points.

Fig. 2. Experimental CSV for smoke.

Fig. 3. Experimental CSV for sulfur dioxide.

Fig. 4. Experimental CSV for SPM.

(Figs 2-4: horizontal axis, distance (km), with R0 marked; curves labelled General, W-E and N-S.)

(i) Individual air pollution CSVs along various directions are very different from each other, which indicates that the regional air pollution distribution over the area is anisotropic. It is a necessary and sufficient requirement for isotropic areal distributions that the experimental CSVs are independent of the direction selected, within the limits of sampling error.

(ii) The SN and general CSVs do not pass through the origin. This is tantamount to saying that the pollutant occurrences along these directions within the city of Istanbul cannot be considered as regionally smooth processes; rather, atmospheric diffusion is under the control of some topographic and/or meteorological factors. This further implies that uniform conditions did not prevail in the dispersion of air pollutants; instead, a rather complex combination of a multitude of meteorological events such as wind direction and speed, pressure, temperature, humidity, etc., took place concurrently and sequentially in addition to the topographic factors.

(iii) The initial portion of each general experimental CSV has an intercept, $R_0$, on the horizontal (distance) axis. Within the distance $R_0$ the CSV value is equal to zero; hence, from equation (4), $Z_i \approx Z_{i-1}$, which implies structural control within the regional variable. Furthermore, in general, large pollutant concentrations follow large concentrations and small ones follow small concentrations, i.e. there are islands of high or low concentration locations.

(iv) Each experimental CSV fluctuates about a straight line at large distances. The existence of straight-line portions in the CSV implies that the pollutant concentrations concerned are independent of each other over these distances. These portions correspond to the horizontal segments at large distances in the classical SV. Furthermore, this is the only range where the classical formulations keep their validity. Local deviations from the straight line indicate hidden or minor dependencies in the pollution concentration distributions over a region. Elaboration on this point is beyond the scope of this paper.

(v) The EW direction CSVs pass through the origin for each air pollutant. Such a property on the CSV diagram implies the continuity of air pollutants along this direction. Continuity means that there are no nugget effects or discontinuities within the regional air pollution variables. Notice that the EW direction coincides with the cross-wind direction in the city of Istanbul.

(vi) All the CSVs along the EW direction and the general CSV for SO2 in Fig. 3 have curved portions over a moderate range of distances. In fact, such a range corresponds to the distance scale as defined for turbulent flow by Taylor (1915). Beyond this range, the CSVs converge to straight lines as discussed in (iv). The initial curvature implies that the air pollutants along these directions have regional dependencies which weaken towards the end of the curvature distance range (Şen, 1989). Since the curvatures are convex, there is positive regional structural dependence (Şen, 1992). Furthermore, the curvature implies that the pollutant is serially dependent, i.e. not only external factors but also the pollutant dispersion in the atmosphere contributes to the regional correlation structure.

(vii) The general CSV of suspended particulate matter has no curved part at all. Such a situation is valid in cases where the regional concentration distributions arise predominantly from the activities of external factors only. Furthermore, there is no structural correlation, i.e. suspended particulate matter concentrations evolve randomly. This is in accordance with the Brownian motion of very fine particles within the free atmosphere.

(viii) As suggested by Şen (1992), the experimental CSVs help to identify the underlying generating mechanism of a regional phenomenon. For instance, if the CSV passes through the origin and has straight-line portions only, then the regional phenomenon concerned complies with an independent (white noise) process with no regional dependence at all. None of the experimental CSVs in Figs 2-4 shows such a case. However, when the CSV is in the form of a straight line but does not pass through the origin, then a moving average process is the underlying generating mechanism of the regional variability (Şen, 1992). It is clear from Figs 2-4 that all of the SN direction experimental CSVs as well as the general CSV of suspended particulate matter have such a property, and it is possible to conclude that moving average mechanisms are dominant along these directions. Curvature in the CSV generally implies autoregressive processes, among which the Markov and ARIMA process CSVs are presented in a recent work by Şen (1992) for a set of parameters. Generally, the existence of a straight line following an initial curved portion indicates that the underlying generating process of the regionalized variable concerned accords with a Markov process, whereas if the curvature continues at reduced rates at large distances then an ARIMA process is the convenient model. Future studies are necessary for a complete picture of the CSVs of other stochastic processes, which will help to identify the most suitable one for the regionalized data at hand.

(ix) The slope of the long-distance straight-line portion is related to the standard deviation of the underlying pollutant concentration. It is possible to consider this slope as the population standard deviation of the pollutant concentration.

(x) The vertical distance between the long-distance straight line and the line drawn parallel to it through the origin reflects the magnitude of the regional correlation coefficient of the pollutant concentration. Obviously, the smaller the distance, the smaller will be the regional correlation of the pollutant concentration.

(xi) Regional discontinuity in the air pollution distribution along the SN direction is obvious from the CSVs along this direction owing to the intercept, $\gamma_0$, on the vertical axis. Such an intercept implies abrupt changes (discontinuities) within the regional variable. The dominant wind direction in the city of Istanbul is almost along the SN direction, and cessation of the wind from time to time embeds discontinuities in the dispersion process of air pollutants.
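Several of the points above rest on the slope and intercept of the long-distance straight-line portion, which the paper fits by eye while noting that least squares could be used instead. A minimal Python sketch of such a least-squares fit is given below; the function name `fit_long_distance_line`, the threshold `d_min` at which the straight-line portion is assumed to begin, and the CSV points themselves are all hypothetical.

```python
import numpy as np

def fit_long_distance_line(distances, csv_values, d_min):
    """Least-squares straight line through the CSV points with distance >= d_min."""
    d = np.asarray(distances, dtype=float)
    g = np.asarray(csv_values, dtype=float)
    mask = d >= d_min                      # keep the assumed straight-line range only
    slope, intercept = np.polyfit(d[mask], g[mask], 1)
    return slope, intercept

# Hypothetical CSV points: curved up to 3 km, then exactly on gamma = 5 d + 10
d = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
g = np.array([0.0, 2.0, 12.0, 30.0, 35.0, 40.0, 45.0, 50.0])
slope, intercept = fit_long_distance_line(d, g, d_min=4.0)
# slope     -> slope of the long-distance line, as discussed in point (ix)
# intercept -> vertical offset from the parallel line through the origin, point (x)
```

The choice of `d_min` is a judgment call in the same way as the fit by eye; varying it and checking that the fitted slope stays stable is one simple way to see whether the straight-line range has been reached.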

CONCLUSIONS

It is significant to depict the regional dependence structure of air pollutants dispersed over an area. In the past, air pollutant modeling was carried out without addressing regional dependence, by employing multivariate statistical techniques or time series analysis, all of which are restricted by the assumption that the concentration distribution is normal, and in most cases the concentrations are regionally considered as independent processes. The difficulty was due to the lack of a reliable technique for quantifying the correlation structure of the regional variable.

In this paper, however, the cumulative semivariogram (CSV) is proposed as a practical tool for assessing the regional dependence structure. The application of the CSV is straightforward, without any conceptual difficulty or ambiguity. The CSV graphs show the change of the half-squared differences between air pollution concentrations at distinct sites with ordered distance. The pollutants have an independent structure only when the CSV variation with distance appears as a straight line passing through the origin. Otherwise, they are dependent, and the CSVs take different curved shapes according to the intensity and variation of the dependence. However, all of these curves have in common that they merge into straight lines at large distances. The slope of this straight line and its vertical distance from the straight line of the independent process play an important role in determining some regional correlation properties of the underlying pollutant. It is hoped that the methodology presented in this paper for identifying qualitatively the regional variation features of air pollutant distributions over an area will be extended to quantitative modeling of regional air pollution variations.

REFERENCES

Ayalp A. (1976) Istanbul'da atmosfer kirlenmesi. Unpublished Ph.D. thesis, Technical University of Istanbul, 246 pp. (in Turkish).
Chock D. P., Terrell T. R. and Levitt S. B. (1975) Time series analysis of Riverside, California air quality data. Atmospheric Environment 9, 978-989.
Clark I. (1979) The semivariogram-Part 1. Engng Mining J. 180, 90-94.
Feller W. (1968) An Introduction to Probability Theory and its Application. Wiley, New York, 509 pp.
Henry R. C. and Hidy G. M. (1979) Multivariate analysis of particulate sulfate and other air quality variables by principal components-Part I. Atmospheric Environment 13, 1591-1596.
Incecik S. (1986) Hava kirliliğinin meteorolojik parametrelerinin analizi ile ilgili bir uygulama. Çevre 1, 15-21 (in Turkish).
Larsen R. I. (1971) A mathematical model for relating air quality measurements to air quality standards. Publication AP-89, Environmental Protection Agency, North Carolina.
Matheron G. (1962) Traité de Géostatistique Appliquée, Tome 1. Edition Technique, Paris, 334 pp.
Matheron G. (1963) Principles of geostatistics. Econ. Geol. 58, 1246-1266.
Merz P. H., Painter L. J. and Ryason P. R. (1972) Aerometric data analysis-time series analysis and forecast and an atmospheric smog diagram. Atmospheric Environment 6, 319-342.
North G. R., Bell T. L. and Cahalan R. F. (1982) Sampling errors in the estimation of empirical orthogonal functions. Mon. Wea. Rev. 110, 699-706.
Peterson J. J. (1970) Distribution of sulfur dioxide over metropolitan St. Louis, as described by empirical eigenvectors, and its relation to meteorological parameters. Atmospheric Environment 14, 501-518.
Peterson J. J. (1972) Calculations of sulfur dioxide concentrations over metropolitan St. Louis. Atmospheric Environment 16, 433-442.
Şen Z. (1977) Autorun analysis of hydrologic time series. J. Hydrol. 36, 1189-1210.
Şen Z. (1989) Cumulative semivariogram models of regionalized variables. Int. J. Math. Geol. 21, 891-903.
Şen Z. (1992) Standard cumulative semivariograms of stationary stochastic processes and regional correlation. Int. J. Math. Geol. 24, 417-435.
Taylor G. I. (1915) Eddy motion in the atmosphere. Phil. Trans. R. Soc. A 215, 1.
Taylor J. A., Jakeman A. J. and Simpson R. W. (1986) Modeling distributions of air pollutant concentrations-I. Identification of statistical models. Atmospheric Environment 20, 1781-1789.
Zinsmeister A. R. and Redman T. C. (1980) A time series analysis of aerosol composition measurements. Atmospheric Environment 14, 201-215.