Multivariate data analysis of key pollutants in sewage samples: a case study


Analytica Chimica Acta 393 (1999) 181–191

Mari Pantsar-Kallio a,*, Satu-Pia Mujunen b, George Hatzimihalis c, Paul Koutoufides d, Pentti Minkkinen b, Philip J. Wilkie e, Michael A. Connor e

a Department of Ecological and Environmental Sciences, University of Helsinki, Lahti, Finland
b Department of Chemical Technology, Lappeenranta University of Technology, Lappeenranta, Finland
c Hatlar Environmental P/L, Melbourne, Vic. 3000, Australia
d Melbourne Water, Environment Risk Management, Melbourne, Australia
e University of Melbourne, Department of Chemical Engineering, Melbourne, Australia

Received 30 October 1998; received in revised form 2 March 1999; accepted 5 March 1999

Abstract

Waste water treatment plants often need detailed information about the sources and levels of pollutants in sewage in order to maintain stable process conditions and to achieve permitted levels for hazardous compounds in their effluents. A high content of pollutants is usually traceable to industrial inputs. In this study the main objective was to study the factors affecting the composition of sewage of domestic origin. Sixty-five domestic sewage samples collected during 9 months at eight different sites in Melbourne, Australia, were analyzed for 83 chemical variables. The data set also included two samples of combined domestic/industrial wastewaters, seven samples from waste water treatment plant influent streams and five domestic water supply samples. The data were studied with multivariate data analysis methods: principal component analysis (PCA) and partial least squares (PLS). With multivariate methods, the effects of the lifestyle of residents, the day of the week, the sampling time and the weather on the pollutant levels could be determined. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Wastewater; Sewage treatment; Domestic; Priority pollutants; PLS; PCA; Multivariate analysis; Melbourne

*Corresponding author. Tel.: +358-3-892-20333; fax: +358-89220189; e-mail: [email protected]

0003-2670/99/$ – see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0003-2670(99)00287-1

1. Introduction

Melbourne is a city of over three million residents. Most of the domestic and industrial sewage is purified in Melbourne Water's two major sewage treatment plants. The Western Treatment Plant handles 450 × 10⁶ l of sewage a day from the region containing many of Melbourne's industries, while the Eastern Treatment Plant treats 380 × 10⁶ l of sewage a day from a more residential area containing fewer industrial areas. The Environment Protection Authority of Victoria (EPAV) has established maximum permitted levels for pollutants in the effluents from the waste water treatment plants. These requirements are in place to protect the environment, especially the receiving waters into which the effluents are discharged. To meet these requirements, knowledge about the hazardous compounds and elements entering the sewer system in the first place is needed. Industry is the main source of many pollutants in sewage, and limits have been established to reduce the volume and strength of industrial discharges. However, in order to reach an acceptable compromise between profitable industrial activity and overly strict discharge limits, detailed knowledge of sewage composition and of the sources of the pollutants must be obtained [1,2].

In this study the main objective was to characterize the levels of pollutants in sewage originating mainly from domestic activities or industry in Melbourne. Since the data obtained were multivariate in nature and many of the variables studied were correlated, the multivariate data analysis methods principal component analysis (PCA) and partial least squares (PLS) were used. PCA and PLS have proven to be efficient methods for analyzing large data sets in environmental (e.g., biological, chemical and ecotoxicological) case studies [3–11].

2. Experimental

2.1. Location of sampling sites

The quality of analytical data depends on how representative the sampling procedures are. In this

study, the sampling sites were chosen so as to be widely distributed geographically and such that their catchments encompassed a diverse collection of residential area types. When selecting representative sampling sites, several factors (related to lifestyle and residential density, age of suburbs and geographical location) were taken into account. The majority of the sites selected had catchment areas contributing mainly domestic waste. This was confirmed by determining the exact catchment boundaries for each selected site using detailed computerized sewer maps and checking that the catchment was free of trade waste inputs. Despite the emphasis placed on domestic sewage, there were obvious advantages also in sampling from sites where the sewage had a trade waste component. Data from such sites would be obtained at similar times and the results would be relevant for comparative purposes in the study.

When all the presented criteria were considered, eight sampling sites for domestic samples were chosen. The sampling sites are presented in Fig. 1. Sites 1–4 spanned the western and eastern suburbs, and had no licensed Trade Waste Discharges in their catchments. One older suburban site (Site 5), closer to the centre of Melbourne, received trade waste from an auto electrical services company, a petrol station and a medical laboratory. The sixth site, originally expected to receive only domestic sewage, later proved to be receiving a moderate trade waste input through a sewer connection shown on maps as being closed. The seventh site was the Keilor waste water treatment plant, a small plant in the western suburbs with a comparatively low trade waste component. The eighth site was on a main sewer receiving a substantial input from industry [1,2]. Samples from this site were taken to establish the kinds of changes caused by industrial inputs. The characteristics of the sampling sites are presented in Table 1.

Fig. 1. Map of the locations of the domestic sampling sites in Melbourne. Site 1 Warrandyte, 2 Croydon Hills, 3 Mooroolbark, 4 Keilor, 5 Camberwell, 6 Box Hill North, 7 Keilor Treatment Plant and 8 Aberfeldie.

Table 1
Characteristics of domestic sampling sites

Site number   Suburb                     Trade waste input   Area/ha   Number of tenements   Number of residents
1             Warrandyte                 None                225       880                   2880
2             Croydon Hills              None                155       540                   1865
3             Mooroolbark                None                301       1300                  4200
4             Keilor                     None                263       675                   2487
5             Camberwell                 Slight              242       3600                  9250
6             Box Hill North             Moderate            138       1500                  3925
7             Keilor (treatment plant)   Moderate            2168      5550                  20180
8             Aberfeldie                 Substantial         2360      18530                 56740

Domestic water supply samples were collected from garden taps at Sites 3 and 9. Site 9 spanned two areas: the first adjacent to Sites 1–6 and the second adjacent to Site 8. This was done to ascertain how much of each pollutant in the sewage was present in the original water supply and how much would result from household activities. Sampling Site 10 denotes the Eastern and Western Treatment Plants, which are the major treatment plants in Melbourne. The samples from these plants were analyzed to determine the levels of pollutants in waste waters entering such plants.

2.2. Sampling program

Seventy-nine samples were analyzed, comprising 65 domestic sewage samples (Sites 1–6), two sewage waste water samples containing a significant industrial component (Site 8), seven samples from waste water treatment plant influents (Sites 7 and 10) and five domestic water supply samples (Sites 3 and 9). The samples were collected during the 9-month period from March to December 1994. The sampling was done only when the weather was dry or when there was little or no rain. This precaution was taken to

ensure that the sewage samples were not diluted or contaminated with storm water. The sampling program consisted of six runs. During every sampling run some replicate samples were also collected to test the repeatability of the sampling and the analyses. A test sample (25 March) was taken to evaluate the capabilities of the laboratory (Australian Laboratory Services) which carried out all the analyses. During sampling runs 1–3 (Run 1: 2 June, Run 2: 17 August, Run 3: 24 August) the samples were collected from Sites 1–8. Runs 4 (23 November) and 6 (18 December) were done in order to study diurnal trends in sewage composition and the variations in sewage between a typical weekday (Wednesday) and a weekend day (Sunday). For these analyses sewage samples were collected from sampling Site 2 on Wednesday (23 November) and on Sunday (18 December) at 7, 8, 9 and 10 a.m., 12 noon, and 2, 3, 4, 5, 6, 7, 8 and 10 p.m. In addition, the influence of the day of the week on pollutant concentrations was studied for samples collected on each day of the week at sampling Site 2 (Run 5: 24–30 November). Five domestic water supply samples (Sites 3 and 9; runs 2 and 5) and seven samples from wastewater treatment plants (Sites 7 and 10; runs 1–3) were also collected.

2.3. Chemical variables measured

The chemical components to be analyzed were chosen based on a list of priority pollutants maintained by the EPAV. The list of components measured is presented in Table 2. In the multivariate data analysis, sulphite, total phenols, Be, Hg, Se, Tl, V, all the amines and ethers, all the PAHs except naphthalene, and all the chlorinated hydrocarbons except 1,2-dichlorobenzene, 1,3-dichlorobenzene and chloroform were omitted because the data sets for these variables contained only a few values above the detection limit of the analytical method. The number of variables used for data analysis was 46.

Table 2
The variables measured in the samples. The abbreviations used in data analysis (Figs. 1–5) are given in parentheses after the variable name. The variables used for data analysis are marked in bold.

Type of the variables: Variables

Basic wastewater characteristics: Biochemical oxygen demand (BOD), chemical oxygen demand (COD), total organic carbon (TOC), oil and grease (FEM), total suspended solids (TSS), total dissolved solids (TDS), ammonia nitrogen (NH4), total nitrogen (N-tot), total phosphorus (P-tot), total oxidised sulphur (TOS), sulphate, sulfite, sulfide, total phenols, color

Elements: Aluminium (Al), antimony (Sb), arsenic (As), barium (Ba), beryllium (Be), boron (B), cadmium (Cd), calcium (Ca), chromium (Cr), cobalt (Co), copper (Cu), iron (Fe), lead (Pb), magnesium (Mg), manganese (Mn), mercury (Hg), molybdenum (Mo), nickel (Ni), potassium (K), selenium (Se), silver (Ag), sodium (Na), strontium (Sr), thallium (Tl), tin (Sn), titanium (Ti), vanadium (V), zinc (Zn)

Phthalate esters: Bis (2-ethylhexyl)phthalate [B(2ethhex)], butylbenzyl phthalate (Butbenzphth), dibutyl phthalate (DB Phth), diethyl phthalate (DE Phth), dimethyl phthalate (DM Phth), dioctyl phthalate (DO Phth)

Chlorinated hydrocarbons: Chlorobenzene, 1,2-dichlorobenzene (1,2DiClbenz), 1,3-dichlorobenzene (1,3DiClbenz), 1,4-dichlorobenzene, 1,2,3-trichlorobenzene, 1,2,4-trichlorobenzene, hexachlorobenzene, chloroform (CHCl3)

Polynuclear aromatic hydrocarbons (PAHs): Acenaphthalene, acenaphthylene, anthracene, benzo (a) anthracene, benzo (g,h,i) perylene, benzo (b) fluoranthene, benzo (k) fluoranthene, chrysene, dibenzo (a,h) anthracene, fluoranthene, fluorene, indeno (1,2,3-c,d) pyrene, naphthalene (naphth), phenanthrene, pyrene

Amines: Nitrosodimethylamine, nitrosodiphenylamine, nitrosodipropylamine, 1,2-diphenylhydrazine

Ethers: Bis (2-chloroethoxy) methane, bis (2-chloroethyl) ether, bis (2-chloroisopropyl) ether, 3-bromophenyl phenyl ether, 4-chlorophenyl phenyl ether

2.4. Multivariate modeling

2.4.1. PCA model

PCA is a bilinear projection method in which the original m-dimensional measurement space described by the matrix X (n samples × m variables) is projected into a lower, A-dimensional space. The X matrix is decomposed into a sample score matrix T and a variable loading matrix P′, whose product models the systematic variation in the data, and a residual matrix E, which in the ideal case contains only the measurement errors [12,13].
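The decomposition described above can be illustrated with a minimal sketch. It is not the workflow of the software used in this study; it assumes NumPy, uses a singular value decomposition to obtain the scores and loadings, and runs on random stand-in data rather than the sewage measurements. The function names `autoscale` and `pca` are hypothetical and introduced only for this illustration.

```python
import numpy as np

def autoscale(X):
    """Mean-centre each column and scale it to unit variance.

    Assumes no column is constant (a constant column would give a zero
    standard deviation and a division by zero).
    """
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def pca(X, n_components):
    """Decompose X (n samples x m variables) as X = T P' + E."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]   # sample scores (n x A)
    P = Vt[:n_components].T                      # variable loadings (m x A)
    E = X - T @ P.T                              # residual matrix (n x m)
    # Fraction of the total sum of squares captured by each component
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return T, P, E, explained

# Synthetic stand-in data (assumption): 20 samples, 6 variables
rng = np.random.default_rng(0)
X = autoscale(rng.normal(size=(20, 6)))
T, P, E, explained = pca(X, n_components=2)
```

Plotting the two columns of `T` against each other gives a scores plot, and plotting the rows of `P` gives the corresponding loadings plot. The sum of `explained` is the fraction of the total variance captured by the retained components.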

By plotting two columns of the T matrix against each other, a two-dimensional projection of the original data set is obtained. A plot of the rows of the P′ matrix shows how the variables are correlated. Fig. 2 shows an example of scores and loadings plots. PCA can be applied to classification problems and used to display data as informative plots. The score values have the same properties as weighted averages, i.e., they are not sensitive to random noise but show processes affecting several variables simultaneously in a systematic way. This makes them suitable for detecting multivariate trends, e.g., in multivariate time series, and clustering of either objects or variables in multivariate data sets. PCA can be seen as a data compression method which can be used (1) to display multivariate data sets, (2) to filter noise and (3) to study and interpret multivariate processes.

In the PCA and PLS models the original data matrices were autoscaled (the variables were mean centred and scaled to unit variance) before the data analysis. The multivariate analyses were made using Unscrambler 7.01 (Camo ASA, Norway) [14].

Fig. 2. (a) PCA scores and (b) loadings for the first two principal components for the analysis of all samples. The numbers in the scores plot indicate sampling sites. The group of five samples on the left side of the Figure is the group of water supply samples.

2.5. PLS Modeling

There are also a number of previously published papers on the mathematical basis of PLS [13,15]. In

PLS modeling the data are divided into two groups of variables: x (descriptor) variables and y (response) variables. A causal relationship is assumed to exist between them. The x set and the y set are modeled separately (with models based on 'the principal component idea') and the solution is rotated so that the correlation between the score matrices is maximized. In matrix notation the model can be expressed with three equations:

$$X = TP' + E$$

$$Y = UQ' + F$$

$$u_a = d_a t_a + g_a$$

The third equation describes the dependence of the latent variables on each other, i.e., the correlations between the columns of matrices T and U. Here

U   Object score matrix from original Y space
Q   Variable loading matrix from original Y space
E   Matrix of model and measurement error (X space)
F   Matrix of model and measurement error (Y space)
g   Vector of model and measurement error (T-U space)
D   Diagonal matrix containing the slopes (d_a) of equations joining the columns of T and U
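The three equations above can be made concrete with a short sketch. It assumes NumPy and uses the classical NIPALS iteration, which is one common way to compute a PLS model; the source does not state which algorithm the software used, so this is an illustration rather than the authors' procedure. The function name `pls_nipals` and the synthetic data are hypothetical.

```python
import numpy as np

def pls_nipals(X, Y, n_components, tol=1e-10, max_iter=500):
    """PLS by NIPALS: X = T P' + E, Y = U Q' + F, u_a = d_a t_a + g_a."""
    E = np.array(X, dtype=float)          # working copy, ends as X residual
    F = np.array(Y, dtype=float)          # working copy, ends as Y residual
    if F.ndim == 1:
        F = F[:, None]
    n, m = E.shape
    k = F.shape[1]
    T = np.zeros((n, n_components)); U = np.zeros((n, n_components))
    P = np.zeros((m, n_components)); Q = np.zeros((k, n_components))
    d = np.zeros(n_components)
    for a in range(n_components):
        u = F[:, [np.argmax(F.var(axis=0))]]   # start from most variable y
        t_old = np.zeros((n, 1))
        for _ in range(max_iter):
            w = E.T @ u
            w /= np.linalg.norm(w)             # X weights, unit length
            t = E @ w                          # X scores
            q = F.T @ t
            q /= np.linalg.norm(q)             # Y loadings, unit length
            u = F @ q                          # Y scores
            if np.linalg.norm(t - t_old) <= tol * np.linalg.norm(t):
                break
            t_old = t
        tt = (t.T @ t).item()
        p = E.T @ t / tt                       # X loadings
        d[a] = (u.T @ t).item() / tt           # inner-relation slope d_a
        E -= t @ p.T                           # deflate X
        F -= d[a] * t @ q.T                    # deflate Y using d_a * t_a
        T[:, a], U[:, a] = t[:, 0], u[:, 0]
        P[:, a], Q[:, a] = p[:, 0], q[:, 0]
    return T, P, U, Q, d, E, F

# Synthetic stand-in data (assumption), mean-centred before modeling
rng = np.random.default_rng(1)
X0 = rng.normal(size=(30, 5))
Y0 = X0 @ rng.normal(size=(5, 2)) + 0.1 * rng.normal(size=(30, 2))
X0 -= X0.mean(axis=0)
Y0 -= Y0.mean(axis=0)
T, P, U, Q, d, E, F = pls_nipals(X0, Y0, n_components=2)
```

In this sketch the Y block is reconstructed from the X scores through the inner relation, so `Y0` equals `(T * d) @ Q.T + F`, where `d` holds the diagonal of D.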

Discriminant PLS (DPLS) is a special form of PLS analysis whose purpose is to find the variables and directions in multivariate space which discriminate the known classes in the calibration set. To find the discriminating directions a dummy or indicator Y matrix is constructed. This Y contains as many columns as there are known classes in the calibration set, i.e., each class has a column in Y. Each class variable is assigned a value of 1 or 0 depending on which class an object belongs to. Here DPLS was used to clarify the differences in sewage found on different weekdays and in different weather conditions.

When undesigned data are analyzed there is always a possibility that chance correlation will be modeled instead of real phenomena. To guard against over-fitting, cross-validation was used for selecting the dimension of the models (especially in PLS modeling) [13]. The cross-validated coefficient of determination, Q², which indicates the variance captured in cross-validation, was used as an indicator of over-fitting. There is also another widely used coefficient, R². It is also called the coefficient of determination, but it indicates the variance captured by the fitted model. The R² values for the components are presented in the relevant Figures.
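The indicator matrix and the two coefficients can be sketched in a few lines of plain Python. The formula used here, one minus the residual sum of squares divided by the total sum of squares about the mean, is one common definition and is an assumption, since the source does not give the exact expression. The helper names and all numerical values below are hypothetical; the predictions are made up for illustration and do not come from the study.

```python
def indicator_matrix(labels):
    """Build the dummy Y matrix: one column per class, entries 1 or 0."""
    classes = sorted(set(labels))
    Y = [[1 if label == c else 0 for c in classes] for label in labels]
    return classes, Y

def coefficient_of_determination(y_obs, y_pred):
    """Return 1 - SS_residual / SS_total (total taken about the mean)."""
    mean = sum(y_obs) / len(y_obs)
    ss_total = sum((y - mean) ** 2 for y in y_obs)
    ss_resid = sum((y - yp) ** 2 for y, yp in zip(y_obs, y_pred))
    return 1.0 - ss_resid / ss_total

# Hypothetical class labels for five samples
labels = ["weekday", "weekday", "weekend", "weekday", "weekend"]
classes, Y = indicator_matrix(labels)

# Observed values of the "weekday" indicator column
y_obs = [row[0] for row in Y]

# Made-up predictions: from the fitted model (gives R2) and from
# cross-validation, where each sample is predicted while left out (gives Q2)
fitted = [0.9, 1.1, 0.1, 0.8, 0.2]
cross_validated = [0.7, 1.2, 0.3, 0.6, 0.4]

r2 = coefficient_of_determination(y_obs, fitted)
q2 = coefficient_of_determination(y_obs, cross_validated)
```

The same function yields R² or Q² depending only on where the predictions come from. A Q² that falls well below R² as components are added is the usual sign that the extra components are fitting chance correlation.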

3. Results and discussion

The main purpose of this study was to characterize the contribution of domestic pollutants to sewage. This was done by interpreting the results of the laboratory measurements using multivariate analysis methods: PCA and PLS. Diurnal trends in sewage composition and the difference between weekdays and weekends were also studied for sewage of domestic origin. In order to establish the contribution of domestic pollutants to sewage, the pollutant concentrations in tap water samples, industrially contaminated effluents and water treatment plant influents were also analyzed. There were no outliers in the data set, which was tested by checking the residual Q statistics and also the studentized residual versus leverage plots. The data were also evaluated based on univariate statistics and expert knowledge before data analysis.

3.1. Differences between sample types

The differences between the sample types (sewage samples of domestic origin, industrially affected waste water samples, waste water treatment plant influents and domestic water supply samples) were studied by PCA. The scores and loadings of the first two principal components (PCs) are presented in Fig. 2. Despite the fact that the first two components explained only 36% of the total variance, Fig. 2 reflected the main groupings in the data set. Based on cross-validation there were four significant components. The first PC illustrated the variation of overall pollutant levels in the samples (and the presence of CHCl3). As could be expected, the domestic water supply samples separated clearly from the other samples. In these samples CHCl3 was at its highest level (due to chlorination of raw waters) while the levels of other pollutants were lower. In other samples CHCl3 was observed in lower concentrations, since it tends to evaporate as the water is used.
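The outlier screening mentioned at the start of this section relies on two per-sample quantities from a PCA model. The following sketch assumes NumPy and computes the raw Q residual (squared distance of a sample from the model plane) and the leverage (position of a sample within the model plane). It does not studentize the residuals or compute control limits, and some texts add 1/n to the leverage. The function name `pca_diagnostics` and the toy matrix are hypothetical.

```python
import numpy as np

def pca_diagnostics(X, n_components):
    """Return per-sample Q residuals and leverages for a PCA model.

    X is assumed to be centred (or autoscaled) already.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]     # scores
    P = Vt[:n_components].T                        # loadings
    E = X - T @ P.T                                # residuals
    q_resid = np.sum(E ** 2, axis=1)               # squared residual per sample
    # Leverage = diagonal of T (T'T)^-1 T', which equals the row sums of
    # squares of the retained left singular vectors
    leverage = np.sum(U[:, :n_components] ** 2, axis=1)
    return q_resid, leverage

# Toy centred matrix (assumption): four samples lie in the plane of the
# first two variables, two samples deviate only in the third variable
X_toy = np.array([
    [ 3.0,  0.0,  0.0],
    [-3.0,  0.0,  0.0],
    [ 0.0,  2.0,  0.0],
    [ 0.0, -2.0,  0.0],
    [ 0.0,  0.0,  0.5],
    [ 0.0,  0.0, -0.5],
])
q_resid, leverage = pca_diagnostics(X_toy, n_components=2)
```

In this toy case the first four samples have zero Q residual and a leverage of 0.5 each, while the last two are not described by the two-component model at all and show up only in the Q residual. A sample that is extreme in either quantity relative to the rest is a candidate outlier.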
The domestic samples contained higher overall pollutant levels than the water supply samples, but these levels were in turn lower than in the industrially contaminated samples and most of the water treatment plant influents. The differences in the concentrations of pollutants between the water treatment plant influents and the domestic samples arise from the contribution of industrial effluents. A difference between influents to different plants can also be seen in Fig. 2. The influent to the waste water treatment plant at Site 7 clearly has a different mix of industrial effluents compared to the two major treatment plants (Site 10). The influent to the former plant (Site 7) contained high levels of FEM, Ba, dichlorobenzenes, phthalate esters, naphthalene and Zn. A sample collected at 7 a.m. from the same site contained much lower levels of these pollutants, which can be explained by the early sampling time, prior to the commencement of much industrial activity in the catchment area. The influents to the two major waste water treatment plants (Site 10) were dominated by pollutants like Sr, Ca, TDS, Mg, Mo, Na, Fe, K, Mn, Sn, TOS and TSS. These differences reflected the major contributions of industry to the influents to these two plants.

3.2. Influence of weekday and diurnal trends on domestic samples

PCA and DPLS analysis showed that the mix of pollutants in the domestic samples differed between typical weekdays and weekends. A DPLS model for samples collected on different days of the week provided a graphic illustration of the differences. The samples collected on Saturdays and Sundays (marked with o in Fig. 3) are clearly separated (on the right in the Figure) from the samples collected on weekdays (from Monday to Friday; symbol x). Based on cross-validation there were two significant latent variables (LVs). The composition of sewage on weekends reflected typical human functions and weekend activities, compared to sewage on Wednesday, which was characterized by the presence of more industrial pollutants. The sewage on Sunday contained higher levels of Fe, SO4²⁻, Ca, K, Mg, Sr, Mn, Mo, P-tot, N-tot, TSS, BOD and NH4⁺ compared to typical industrial pollutants. Industrial pollutants like Al, Ni, Co, As and FEM were dominant in the sewage on weekdays.

There was also a trend related to sampling time, which was studied in more detail for samples collected on Wednesdays and Sundays (two significant LVs). The observations made early on Sunday morning differed from those made later in the day and in the evening, as presented in Fig. 4. The dominating feature in sewage on Sunday morning was the high concentrations of N-tot, NH4⁺, Mo, Mn, Mg, Ca, Sr, Fe, TSS and also P-tot (7 and 8 a.m.). Later in the day, the concentration of P-tot was even higher, while the levels of N-tot, NH4⁺, Mo, Mn, Mg, Ca and Sr decreased. The high nitrogen concentrations early in the morning suggest that sewage at this time is dominated by toilet wastes. Later in the day wastes come from a variety of other activities, and the increased phosphorus level in the evening could conceivably be related to the increased use of detergents. However, these interpretations have to remain speculative as we have no firm data on people's habits. For samples collected on Wednesday the dominating feature was the high metal (Co, Cd, Al, Pb, As, Ag and Ni), TOC and FEM concentrations, especially in the afternoon at 2 p.m., as shown in Fig. 4.

Fig. 3. (a) DPLS scores and (b) loadings for samples collected at sampling Site 2 on weekdays (Monday, Tuesday, Wednesday, Thursday, Friday; marked with x) and on weekends (Saturday, Sunday; marked with o).

3.3. The influence of weather

The sewage composition also depended on the weather at the time of sampling. The DPLS scores and loadings are presented in Fig. 5 (two significant LVs). The first latent variable described the main influence, that of temperature, on sewage composition. The scores for samples collected in cool weather are located on the left in the Figure and those for samples taken during hot/sultry weather on the right. The samples collected during warm weather are generally located between these two groups. The samples taken on hot and sultry days were dominated by variables such as BOD, COD, TSS, TOC, P-tot, N-tot, NH4⁺, K, Ca, Co, Sn and Mn. The samples collected in cool and rainy weather contained lower levels of these pollutants and higher levels of metals (Cr, B, Pb, Sb) and phthalates (dimethyl and dibutyl). The variation in sewage composition as the weather changes is in line with the observation that people's activities and behavior in Melbourne are influenced considerably by weather. Gardening, for example, is a potential contributor of a variety of soil components to sewage (e.g. from hand washing afterwards), and this is a favored occupation on warm days. Also, on hot days one would expect a different balance of food and drink to be consumed compared to cool days. Lack of data means, however, that at present we cannot confirm these rather speculative conclusions about people's habits. Since special care was taken to stop sampling before any rain or drizzle would have a chance to affect the samples, the samples were not diluted by rainwater.

The influence of diurnal trends can also be seen in Fig. 5. Group 1 on the left side refers to the samples collected on Wednesday between 12 noon and 2 p.m., while Group 2 contains samples collected on Wednesday between 7 a.m. and 12 noon. The samples collected early on Sunday morning form Group 3, on the right side of the Figure. The concentrations of N-tot, NH4⁺, Sr and Mn were especially high in these samples.


Fig. 4. (a) PCA scores and (b) loadings for samples collected on Wednesday and on Sunday. The first letter in the sample scores describes the day of sampling (W = Wednesday, S = Sunday) and the numbers after the letter describe the time of sampling, e.g., W7 stands for a sample collected on Wednesday at 7 a.m. and S22 for a sample collected on Sunday at 10 p.m.
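The labelling scheme in this caption can be decoded mechanically when the scores are tabulated. The following small sketch uses only the Python standard library. The function name `parse_label` is hypothetical, and the assumption that the number is an hour on a 24-hour clock is inferred from the two examples in the caption.

```python
import re

# Day codes as given in the caption
DAYS = {"W": "Wednesday", "S": "Sunday"}

def parse_label(label):
    """Split a label such as 'W7' or 'S22' into (day name, hour 0-23)."""
    match = re.fullmatch(r"([WS])(\d{1,2})", label)
    if match is None:
        raise ValueError(f"unrecognised sample label: {label!r}")
    hour = int(match.group(2))
    if hour > 23:
        raise ValueError(f"hour out of range in label: {label!r}")
    return DAYS[match.group(1)], hour

wednesday_morning = parse_label("W7")
sunday_evening = parse_label("S22")
```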

Fig. 5. (a) DPLS scores and (b) loadings describing the influence of weather on the sewage. C stands for the samples collected in cool weather (5–15 °C), W for samples collected in warm weather (15–25 °C) and H for samples collected during hot/sultry weather (25–35 °C). The ellipses indicate certain groups of the objects and have no statistical meaning.

4. Conclusions

The data collection in this case study was far from ideal. The best scheme would have been to collect samples at every site every day at the same time over a long period. This would have allowed a 3-way multivariate technique, such as 3-way PCA or PARAFAC, to be applied to the data set. With these techniques the results on the effects and variation of the variables, sites and time would have been clearer and more precise. Unfortunately a sampling program of this kind could not be carried out in this study for economic and practical reasons.

Multivariate data analysis methods (PCA and PLS), however, proved to be effective tools for analyzing and displaying the composition of sewage. This study showed that the pollutants from different sources could be characterized. The nature of sewage pollutants originating mainly from domestic waste was studied more closely and they were found to depend on many factors like the location of sampling areas, the lifestyle of residents, the day of the week and the sampling time. These results indicate that the sampling results are (among many other factors) sensitive to behavior patterns, which may in turn have implications for those involved in ongoing sewage sampling.

Of special interest were the differences in sewage between weekends and weekdays. The weekend samples were characterized by high levels of pollutants like total phosphorus and total nitrogen, which can be attributed to typical household activities like cooking, cleaning and laundry. On typical weekdays the dominating feature was the higher metal concentrations. Strong diurnal trends between 7 a.m. and 10 p.m. were observed both on weekends and weekdays. Also


the weather was found to have an effect on people's activities.

Acknowledgements

The late Steward Morgan is gratefully acknowledged for the contribution made in the early stages of the research. Alan Bennett, formerly of Melbourne Water, and Keith Evans, for his assistance with sample analyses, are also warmly acknowledged.

References

[1] P.J. Wilkie, The Contribution of Domestic Sources to Levels of Key Organic and Inorganic Pollutants in Sewage, Master of Engineering Science Thesis, University of Melbourne, Australia, 1995.
[2] P.J. Wilkie, G. Hatzimihalis, P. Koutoufides, A. Connor, The contribution of domestic sources to levels of key organic and inorganic pollutants in sewage: The case of Melbourne Australia, Wat. Sci. Tech. 34 (1996) 63–70.
[3] S.-P. Mujunen, P. Minkkinen, B. Holmbom, A. Oikari, PCA and PLS methods applied to ecotoxicological data: Ecobalance project, J. Chemom. 10 (1996) 411–424.
[4] D.O. Tegelmark, Site factors as multivariate predictors of the success of natural regeneration in Scots pine forests, For. Ecol. Manage. 109 (1998) 231–239.
[5] C. Palmborg, L. Bringmark, E. Bringmark, A. Nordgren, Multivariate analysis of microbial activity and soil organic matter at a forest site subjected to low-level heavy metal contamination, Ambio 27 (1998) 53–57.
[6] C. Palmborg, A. Nordgren, E. Baath, Multivariate modelling of soil microbial variables in forest soil contaminated by heavy metals using wet chemical analyses and pyrolysis GC/MS, Soil Biol. Biochem. 30 (1998) 345–357.
[7] C. Andren, B. Eklund, E. Gravefors, Z. Kukulska, M. Tarkpea, A multivariate biological and chemical characterization of industrial effluents connected to municipal sewage treatment plants, Environ. Toxic. Chem. 17 (1998) 228–233.
[8] M. Vega, R. Pardo, E. Barrado, L. Deban, Assessment of seasonal and polluting effects on the quality of river water by exploratory data analysis, Water Res. 32 (1998) 3581–3592.
[9] T. Berg, O. Royset, E. Steinnes, M. Vadset, Atmospheric trace element deposition: Principal component analysis of ICP-MS data from moss samples, Environ. Pollution 88 (1995) 67–77.
[10] C. Rosen, G. Olsson, Disturbance detection in wastewater treatment plants, Water Sci. Tech. 37 (1998) 197–205.
[11] J.J. Mangas, J. Moreno, A. Picinelli, D. Blanco, Characterization of cider apple fruits according to their degree of ripening: A chemometric approach, J. Agric. Food Chem. 46 (1998) 4174–4178.
[12] S. Wold, Principal component analysis, Chemom. Intell. Lab. Syst. 2 (1987) 37–52.
[13] H. Martens, T. Naes, Multivariate Calibration, John Wiley, UK, 1989.
[14] The Unscrambler 6, User's Guide, CAMO AS, Trondheim, 1996.
[15] A. Höskuldsson, Prediction Methods in Science and Technology, vol. 1, Basic Theory, Poland, 1996.