The ‘principal components’ statistical method as a complementary approach to geochemical methods in water quality factor identification; application to the Coastal Plain aquifer of Israel


Journal of Hydrology, 140 (1992) 49-73

Elsevier Science Publishers B.V., Amsterdam

A. Melloul and M. Collin

Israel Hydrological Service, P.O. Box 7043, IL-61070 Tel Aviv, Israel

(Received 30 November 1991; revision accepted 30 May 1992)

ABSTRACT

Melloul, A. and Collin, M., 1992. The 'principal components' statistical method as a complementary approach to geochemical methods in water quality factor identification; application to the Coastal Plain aquifer of Israel. J. Hydrol., 140: 49-73.

In general, Schoeller and Piper diagrams, as well as other classical techniques, facilitate the description of the various geochemical groups of water, and help to explain quality changes in an aquifer. We propose the use mainly of 'principal components' analysis for identification of relevant groups of water and the factors that bring about a change in their quality. The major advantage of this method is its suitability for simultaneous analysis of a great number of variables and observations. It is applied here to the Dan metropolitan region of Israel's Coastal Plain. The variables involved include major ions as well as the physical factors of depth of the well intake filter below sea-level, distance from the sea, and aquifer recharge.

The primary purpose of this work is to demonstrate the use and reliability of principal components analysis (PCA) as a method complementary to classical approaches for hydrogeochemical research. Subsidiary purposes involve determination of the major water groups characterizing the Coastal Plain aquifer, identification of some of the principal variables that influence changes in water quality in the aquifer, and description of some of the geochemical phenomena contributing to change in aquifer water quality.

The following results were achieved for the test area: (1) Two major groups of water were identified: a low-salinity, calcium bicarbonate water occurring in the phreatic portion of the Coastal Plain aquifer, and a more saline, sodium chloride water characterizing the neighboring Cenomanian aquifer and the confined portions of the Coastal Plain aquifer. (2) The major water input sources, such as rain, injection water, waste water, irrigation, etc., may vary in their influence upon resulting water quality; this water quality is affected by such physical factors as distance from the sea, depth of well intake filters below sea-level, proximity to streams and the lithology of the saturated and unsaturated zones. (3) The evolutionary alteration in content of specific ions with time can be clearly visualized on the same graph by means of PCA.

The universal applicability of such a statistical approach to other hydrological settings is noted.

Correspondence to: A. Melloul, Israel Hydrological Service, P.O. Box 7043, IL-61070 Tel Aviv, Israel.

0022-1694/92/$05.00 © 1992 - Elsevier Science Publishers B.V. All rights reserved

INTRODUCTION

Phreatic aquifers are recharged mainly by rainfall. In populated regions, additional recharge of this type of aquifer is likely to involve indirect supply from agriculture, industry and domestic usage, often by way of surface water collector networks such as streams and wadis. The resultant water quality of the aquifer is influenced by all the land-use activities on the ground surface, and by the types of salts found in the soils and lithologies through which the water percolates and flows. Furthermore, change in water quality of a coastal aquifer should be a function of two other factors, i.e. distance from the sea and depth to the water table. Salinity change can thus be related to proximity to the sea, to surface pollution sources, or to both. Thus, a well that is further from the sea and shallower will be less likely to show the influence of seawater than one which is closer to the sea and deeper in the same region of an aquifer. We need to determine the specific source or sources of salinity change if we are to reverse the trend of deteriorating water quality in the course of aquifer and land-use management.

The chemical variables involved in these changes include the major ions, such as Ca, Mg, Na, K, HCO3, SO4, Cl and NO3, and pH. Other factors used here as variables include distance from the sea and depth below the surface. Using the classical methods, i.e. those of Schoeller and Piper (Schoeller, 1962), the chemical variables cannot be related to physical factors such as distance from the sea or from the surface, or to soil-rock characteristics. These methods are also restricted by the number of samples involved, i.e. their clarity suffers as the number of samples graphed increases.

The present study is an attempt to employ principal components analysis (PCA) as a starting point for factor analysis of the data. Factor analysis subsumes a fairly large variety of procedures.
The main steps include preparation of a correlation matrix, extraction of initial factors (exploration of possible data reduction), and transformation (by rotational mathematical processes) to arrive at a final solution. Factor analysis involves probability and stochastic methods and tests to cluster data. This paper uses only the first step of this analysis, commonly termed PCA (Davis, 1984).

The most distinctive aspect of the PCA technique is its data reduction capability. Given an array of correlation coefficients for a set of variables, the method provides a means of finding a pattern of relationships such that the raw data may be rearranged or reduced to a smaller set of factors or components. These may then be taken as source variables accounting for the observed interrelations in the data. One can then note similar relationships between variables and water from sampled wells, to determine, locate, and pinpoint pollution sources, as an RQ-mode procedure. The RQ-mode is a means of carrying out simultaneously the two separate factor analysis procedures, one finding coordinates for variables (R-mode) and the other for observations (Q-mode) (Zhou et al., 1983; Davis, 1984).

Over the past 30 years, factor analysis, including PCA, has been used extensively in many fields. In socio-economics, for example, data for five variables involving 12 census tracts in the Los Angeles Standard Metropolitan Statistical Area have been analyzed by PCA (Harman, 1976). In surface hydrology, the technique has been employed by Morin et al. (1979), who attempted to identify homogeneous precipitation stations for optimal interpolation. The method is often applied in stratigraphy and paleontology (Davis, 1984). Recently, PCA has been used in determining an optimal method of reservoir management (Saad and Turgeon, 1988). The method has also been used in numerous branches of earth science, such as hydrochemistry, to assist in understanding hydrologic processes affecting groundwater and soil salinity (Deverel, 1989). Furthermore, PCA has been used to build a conceptual model of deep aquifers such as the sandstone component of the Albian formation in the Paris basin and the Nubian sandstone aquifer of the Sinai and Negev (Melloul, 1979). In 1983, an attempt was made to identify pollution sources in the Coastal Plain aquifer of Israel by use of this technique; the results were summarized in an internal report (Melloul, 1983). This paper has been prepared owing to the interest aroused by that report.

We attempt here to expand the usefulness of PCA by dealing with the specific case of groundwater pollution. The technique has been employed to visualize similarities between variables in sampled wells. No probability or clustering analysis was involved, but standardization was used to represent the values of sampled wells on the graphs. Thus, PCA was employed as a tool along with the Schoeller and Piper methods to help to identify pollution sources.
Schoeller diagrams indicate ionic quantity levels in samples, whereas the statistical approach of the principal components diagram facilitates determination of assemblages of water quality results which are indicative of genetic processes and points of origin of pollutants. It is thus recommended that the classical and the principal components approaches be used as complementary tools in geochemical analysis.

Israel's Coastal Plain is highly populated, with intensive agricultural land-use (see Fig. 1). Beneath the Coastal Plain lies the Coastal Plain aquifer, composed of sand, sandstone and silt, interbedded with clay lenses (Melloul, 1988). This aquifer forms one of the major components of Israel's water resource system, supplying water for agriculture, drinking and industrial purposes. Most of this water results from rainfall recharge, but some also comes from direct recharge via the National Water Carrier (this water being routed south from the Kineret drainage basin in the Galil by way of the Bet Netofa reservoir). Additional sources of recharge include indirect percolation from agriculture, industry and domestic usage, as well as recharge by way of streams and wadis.

Fig. 1. Map of Israel, with location of the Coastal Plain aquifer.

In the present study, PCA has been tested in the Dan metropolitan region (Tel Aviv area) of the Coastal Plain, specifically in strip 132 of the Hydrological Service's reference network for the aquifer (see Figs. 2 and 3). Within this strip ten wells were selected. Seven of these penetrate the Coastal Plain aquifer, and the three others penetrate the neighboring limestone (Yarkon-Tananim, or 'Mountain') aquifer. Furthermore, four of the Coastal Plain wells draw water from the hydrological sub-horizons A, B, and C (Fig. 3). For each well two samples from differing periods (1959-1964 and 1973-1974) were utilized. The initial period represents the water which occupied the aquifer before the advent of the National Water Carrier, at the inception of heavy industrialization of the Coastal Plain, whereas water from the second period might already have been influenced by the functioning of the National Carrier, begun in 1965. This second period thus represents a time characterized by industrialization and its by-products.
The major assumptions made in this study were: (1) each subaquifer is in itself a hydrological unit, which might have some inter-connection with neighboring units along its boundaries; (2) changes in water quality are indicated by the content of major ions, and might be correlated with such factors as depth, distance from the sea, recharge, and proximity to surface water courses and reservoirs; (3) net circulation is considered here as chiefly involving downward percolation and east-west groundwater flow within the aquifer; (4) the 10 year period separating the two series of analyses accounts for the evolution and change in the quality of water in the aquifer; (5) the data representing the chemical content of seawater and of rainfall at the Reading and Lod stations ought to be considered as stable over the period of sampling.

The objectives of this study include: (1) demonstration of the value of PCA in augmenting the use of classical geochemical forms of presentation (the Piper and Schoeller methods) to explain the potential influence of pollution on water quality; (2) illustration of the utility of the principal components statistical method in identifying some of the major variables which can influence aquifer water chemistry; (3) use of this method in visualizing such variables by means of graphs that plot their relationship to the well water they might influence; (4) determination of the main water chemistry types and their evolution with time as expressions of certain geochemical processes operating within the aquifer; (5) general recommendations on aquifer water quality management. (For original data compiled in the course of this study, see Melloul (1983).)

Fig. 2. Study area; location of strip 132 in the Dan metropolitan region of the Coastal Plain aquifer.

Fig. 3. Hydro-geological cross-section of strip 132 of the Coastal Plain aquifer. (Legend: sub-aquifers A, B and C of the Coastal Plain aquifer; General Pleistocene aquifer, Pl; Cenomanian aquifer, Ce; seawater intrusion zone; well screens and water level. Horizontal axis: distance from the sea, km.)

DEFINITION OF THE PROBLEM

In general, the classification of water chemistry types has been achieved by such geochemical methods as Schoeller graphs and Piper diagrams. These approaches involve only the major ionic components of water, such as Ca, Mg, Na, K, Cl, SO4, HCO3 and NO3, and plot these data for each sample. The first problem one encounters, therefore, is how to present a multitude of chemical data on the same graph or diagram so as to distinguish between the various 'clusters' of data when many samples are represented. The complexity that results on Schoeller graphs and Piper diagrams is evident from Figs. 4-6, which attempt to illustrate the various levels of major ions for samples from only 10 wells in the Coastal Plain region, as well as seawater and rain water (Table 1). As a total of 23 samples are thus involved, two separate Schoeller graphs are required to illustrate the representative chemical types, and even then the resulting geochemical reality is difficult to visualize. Using a Piper diagram, more samples may be represented, and relationships between water chemistry types begin to become more apparent, but there remains a difficulty in clearly delineating these relationships.

In addition, it is impossible to use either of these approaches to visualize the impact of both chemical and physical variables upon water chemistry. Such physical variables might include distance from the sea, from neighboring aquifers or from some other source of pollution such as a stream, or depth to the groundwater level. Having said this, it ought to be clearly stated that Schoeller graphs and Piper diagrams are well suited to the task of illustrating specific geochemical characterizations of a small number of samples.
However, when pollution problems involving a large number of samples require identification of relationships between a multitude of chemical as well as physical variables over time, additional statistical methods are needed to manipulate, reduce and visualize these data relationships. Thus, in the present study, PCA has been used to identify the relevant water chemistry types and the factors which might bring about a change in water quality in the Dan metropolitan region of the Coastal Plain aquifer. The chemical data involved in the Schoeller graphs and Piper diagram (Figs. 4-6), as well as data for the two physical variables, distance from the sea and depth to ground water, have been plotted on principal components graphs, which allow a clear visualization of factors that may have an impact upon water quality over time.

METHOD OF APPROACH

Once a complete set of representative data has been collected for all the variables (v) and well sampling sites (n), this can then be represented by a data matrix [X] of dimension v × n. Without any mathematical transformation it is difficult to see correlations between variables and well sampling sites. Therefore, this data matrix can be used as the basic input to statistical methods for identifying factors that shed light on the changes which water quality has undergone. This allows the graphing of variables and well sampling sites, so that relationships can be visualized.

At this point, factor analysis is used to transform the data matrix [X] into a new set of composite variables or principal components. In this paper, only a limited factor analysis scheme is involved: PCA and some rotational transformations. By using PCA the [X] matrix can be transformed to a variance-covariance matrix, which yields the principal components, which are orthogonal, that is, uncorrelated with each other. By themselves, these components may provide significant insight into the structure of the matrix (Davis, 1984). This is done by the resolution of the data matrix [X], where j = 1, ..., v represents the columns of the matrix, involving such variables as major ions, pH, depth from the ground surface to the water level and distance from the sea. Furthermore, i = 1, ..., n represents the rows of the matrix, involving the number of well sampling sites.

Fig. 4. Schoeller diagram of water samples from strip 132, Coastal Plain aquifer, 1959-1965.

Fig. 5. Schoeller diagram of water samples from strip 132, Coastal Plain aquifer, 1973-1974.

Fig. 6. Piper diagram of water samples from strip 132, Coastal Plain aquifer.

TABLE 1

Chemical and physical data

Cations (Ca, Mg, Na, K) and anions (HCO3, SO4, Cl, NO3) in mequiv. l^-1. (1) Total anions; (2) total cations; (1)+(2) total ions; (1)-(2) difference. D.S., distance from sea (m); D.W., depth(a) to well filter (m).

Observation well no.         Ca     Mg     Na      K   HCO3    SO4     Cl    NO3     (1)     (2)  (1)+(2)  (1)-(2)   D.S.   D.W.
15-1                        3.7    2.6    5.1   0.14    5.2    1.1    5.2   0.02   11.52   11.54    23.06    -0.02      b      b
15-2                        5.6    1.7    4.4   0.09    5.4    1.5    5.2   0.01   12.11   11.79     23.9     0.32      b      b
20-1                        2.3    0.7    0.9   0.01    2.1   0.17    1.2    0.4    3.87    3.91     7.78    -0.04   7800     40
20-2                        3.4    0.9    1.4   0.02    2.6      0    1.6   1.32    5.52    5.72    11.24     -0.2   7800     40
22-1                          2    0.8    1.4   0.02    2.7   0.16   1.04   0.35    4.25    4.22     8.47     0.03   8500   54.8
22-2                          2    0.5    1.1   0.01    2.2      0      1   0.32    3.52    3.61     7.13    -0.09   8500   54.8
23-1                        2.8    1.2    1.9   0.05    3.8   0.23    1.5   0.23    5.76    5.95    11.71    -0.19  10800     40
23-2                        3.1    0.5    1.6   0.04    3.5      0    1.3   0.39    5.19    5.24    10.43    -0.05  10800     40
24-1                        3.4    2.9    4.6   0.11    5.4    1.6    4.4      0    11.4   11.01    22.41     0.39      b      b
24-2                        5.2    1.6      4   0.08    4.7    1.4    4.7      0    10.8   10.88    21.68    -0.08      b      b
26-1                          3    1.5    3.2   0.04    3.9   0.37    3.1   0.32    7.69    7.74    15.43    -0.05   3100   61.8
26-2                          4    2.7    4.9   0.12    4.9    1.7    5.6   0.04   12.24   11.72    23.96     0.52   3100   61.8
30-1                          3    2.7    5.7   0.17    3.3   0.98    7.2   0.09   11.57   11.57    23.14        0      b      b
30-2                        3.4    2.8    2.7   0.08    4.6    0.7    3.4   0.09    8.79    8.98    17.77    -0.19      b      b
31-1                        2.5    1.6    5.7   0.13    5.1    0.5    4.1    0.1     9.8    9.93    19.73     0.13   1400  121.6
31-2                          3    1.5    5.6   0.11    4.7    1.3    4.3   0.09   10.39   10.21     20.6     0.18   1400  121.6
32-1                       3.05    0.9      1   0.03    2.6   0.17    1.4   0.82    4.99    4.98     9.97     0.01   3100   51.5
32-2                        4.6    0.8    2.2   0.02    3.8    0.1    2.8   1.09    7.79    7.62    15.41     0.17   3100   51.5
33-1                       5.75   0.81    1.7   0.03    4.1   0.27      2   0.82    8.19    8.29    16.48     -0.1   3800   50.4
33-2                        5.7    0.2      2   0.02    3.8    0.2    2.3   1.96    8.26    7.92    16.18     0.34   3800   50.4
Seawater                     20    111  457.6    9.7    2.3   56.2  535.9      -   594.4   598.3   1192.7     -3.9      -      -
Reading rainfall station   0.76   0.58   1.93   0.05    0.4   0.99   2.49      -    3.88    3.32      7.2     0.56      -      -
Lod rainfall station       0.45   0.12    0.2   0.03   0.43   0.24   0.24      -    0.91     0.8     1.71     0.11      -      -

(a) Below sea-level.
(b) These wells belong to the Limestone aquifer (which limits the eastern part of the Coastal Plain aquifer), where D.W. and D.S. are not involved.

The initial output given by the program includes statistical data such as mean and standard deviation, correlation coefficients between variables, eigenvalues and eigenvectors, and factor score coefficients for the two major components. This allows these data to be represented on graphs where the major axes are, in fact, the basic solutions of the variance-covariance data matrix. Thus, the first solution (F1) is the principal factor or component representing the initial axis, which explains as much as possible of the total variance of the observations; the second solution (F2) is the second factor or component representing the second axis, which explains as much as possible of the residual variance, and so forth for the other factors or components, each explaining less and less of the total variance. These first two components are then plotted as perpendicular axes representing all parameters and well data involved (see Fig. 7).

On the graph, the variables' coordinates with respect to factors F1 and F2 are obtained from a Varimax rotated factor matrix, after rotation with Kaiser normalization (Davis, 1984). The well samples' coordinates with respect to factors F1 and F2 are obtained in the same manner, by using factor score coefficients and standardization of the raw data. This standardization is given by the formula

$$Z_{ij} = \frac{X_{ij} - \bar{X}_j}{s_j}$$

where $X_{ij}$ are the raw data of variable j in well sample i, $\bar{X}_j$ is the mean value of the variable $X_j$ for all the well samples and $s_j$ is the standard deviation of the variable $X_j$ for all the well samples.

Fig. 7. Principal components analysis of major ions affecting groundwater quality in strip 132 of the Coastal Plain aquifer. (F1 accounts for 68% of the total variance; F1 + F2 for 87%. Legend: Pl, General Pleistocene; A, B, C, Pleistocene sub-aquifers; Ce, Cenomanian; separate symbols mark the 1959-1965 and 1973-1974 samples.)

Figure 7, as an example, shows that the two axes of the diagram represent the salinity of water (axis F1) and nitrate pollution (axis F2). Thus, close to F1 can be found the major ions expressing the water's salinity, and close to F2, the nitrate variable. This approach gathers data for influential variables and well samples, to allow the relationship between these variables and individual well samples to be identified. By delineating the limits of the resultant assemblages of well sample positions relative to the axes, it is possible to note groupings of data which indicate major characteristics of water from the sampled sites. This delineation of assemblage limits is visual and not statistical, and simplifies analysis of data. For instance, in Fig. 7, group A represents more saline water, and group B represents fresher water.

The graph in Fig. 7 involves the same data as had been plotted, using the Schoeller and Piper methods, in Figs. 4-6. Where these two methods were unable to delineate clearly the relationships between well samples and variables influencing the resultant water quality, the PCA method, as illustrated in Fig. 7, distinctly represents about 87% of the total variance of the data on a simple two-dimensional graph. Here, the change in water quality over a period of 10 years is depicted on the same graph by dotted lines joining the two samples from the same well.

It may be helpful to note that in Fig. 9 (below), where physical variables are involved, the axis of each variable (e.g. NO3, Cl, distance from the sea (D.S.) and depth to ground water in the well (D.W.)) goes through the centre of the graph. The influence of each variable on the well samples increases in the direction of the arrow. For example, for well 31, if we project from its coordinates in relation to the two main axes, at right-angles to the D.S. axis as this extends through the centre, we can see the relative influence of the D.S. variable on this sample. Furthermore, if we project the position of well 31 onto the D.W. axis, it will be noted that this well is deeper than most of the other wells. Finally, it may be pointed out that the closer to the axis of a variable the well's position on the graph lies, the stronger is the correlation of this variable with the well's water quality, vis-à-vis other competing variables. It will be pointed out in the Discussion how this form of representation may help to visualize the process of seawater intrusion, as may be noted by the shift closer to F1 of water sampled from well 26 on dates about 10 years apart.
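The chain of steps described in this section (standardization, correlation matrix, extraction of the first two components, Varimax rotation with Kaiser normalization, factor score coefficients) can be sketched in a few lines of Python. The sketch is an illustration only, not the program used for this study: the function names are hypothetical, the use of the sample standard deviation (ddof=1) is an assumption, and the input is the eight major-ion columns of the 20 well samples in Table 1. The published figures may rest on a different selection of samples and variables, so the variance percentages this sketch yields need not match those quoted for Fig. 7.

```python
import numpy as np

# Major ions (mequiv. per litre) of the 20 well samples in Table 1.
# Columns: Ca, Mg, Na, K, HCO3, SO4, Cl, NO3. Rows: 15-1, 15-2, ..., 33-2.
X = np.array([
    [3.7, 2.6, 5.1, 0.14, 5.2, 1.1, 5.2, 0.02],
    [5.6, 1.7, 4.4, 0.09, 5.4, 1.5, 5.2, 0.01],
    [2.3, 0.7, 0.9, 0.01, 2.1, 0.17, 1.2, 0.4],
    [3.4, 0.9, 1.4, 0.02, 2.6, 0.0, 1.6, 1.32],
    [2.0, 0.8, 1.4, 0.02, 2.7, 0.16, 1.04, 0.35],
    [2.0, 0.5, 1.1, 0.01, 2.2, 0.0, 1.0, 0.32],
    [2.8, 1.2, 1.9, 0.05, 3.8, 0.23, 1.5, 0.23],
    [3.1, 0.5, 1.6, 0.04, 3.5, 0.0, 1.3, 0.39],
    [3.4, 2.9, 4.6, 0.11, 5.4, 1.6, 4.4, 0.0],
    [5.2, 1.6, 4.0, 0.08, 4.7, 1.4, 4.7, 0.0],
    [3.0, 1.5, 3.2, 0.04, 3.9, 0.37, 3.1, 0.32],
    [4.0, 2.7, 4.9, 0.12, 4.9, 1.7, 5.6, 0.04],
    [3.0, 2.7, 5.7, 0.17, 3.3, 0.98, 7.2, 0.09],
    [3.4, 2.8, 2.7, 0.08, 4.6, 0.7, 3.4, 0.09],
    [2.5, 1.6, 5.7, 0.13, 5.1, 0.5, 4.1, 0.1],
    [3.0, 1.5, 5.6, 0.11, 4.7, 1.3, 4.3, 0.09],
    [3.05, 0.9, 1.0, 0.03, 2.6, 0.17, 1.4, 0.82],
    [4.6, 0.8, 2.2, 0.02, 3.8, 0.1, 2.8, 1.09],
    [5.75, 0.81, 1.7, 0.03, 4.1, 0.27, 2.0, 0.82],
    [5.7, 0.2, 2.0, 0.02, 3.8, 0.2, 2.3, 1.96],
])


def standardize(X):
    """Z_ij = (X_ij - mean_j) / s_j for every variable j (one per column)."""
    # ddof=1 (sample standard deviation) is an assumption of this sketch.
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def principal_components(Z, k=2):
    """Loadings of the first k components of the correlation matrix."""
    R = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])
    return loadings, eigvals, eigvals / eigvals.sum()


def varimax(L, max_iter=100, tol=1e-8):
    """Varimax rotation with Kaiser (row) normalization of the loadings."""
    h = np.sqrt((L ** 2).sum(axis=1))          # square roots of communalities
    A = L / h[:, None]
    p, k = A.shape
    T = np.eye(k)                              # rotation matrix
    crit_old = 0.0
    for _ in range(max_iter):
        B = A @ T
        G = A.T @ (B ** 3 - B * ((B ** 2).sum(axis=0) / p))
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt
        crit = s.sum()
        if crit_old and crit < crit_old * (1 + tol):
            break
        crit_old = crit
    return (A @ T) * h[:, None], T


Z = standardize(X)
loadings, eigvals, explained = principal_components(Z, k=2)
rotated, T = varimax(loadings)                 # coordinates of the variables
# Factor score coefficients in regression form (R^-1 times rotated loadings),
# written without inverting R explicitly.
score_coeff = (loadings / eigvals[:2]) @ T
scores = Z @ score_coeff                       # coordinates of the well samples
print("share of variance, F1 and F2:", np.round(explained[:2], 2))
```

Plotting `rotated` (one point per variable) and `scores` (one point per well sample) on the same pair of axes gives a diagram of the kind shown in Fig. 7. Scaling the scores to unit variance is a convention of this sketch.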
Thus, it is apparent that this approach can facilitate determination and visualization of certain aspects of geochemical pollution problems. Nonetheless, although it clearly delineates water quality relationships, the PCA graph does not indicate levels of specific ionic variables for the well samples nearly as clearly as does a Schoeller diagram. Therefore, principal components and Schoeller graphics should be used as complementary tools for the identification and explanation of pollution processes and phenomena, tying in both chemical and physical variables.

RESULTS

Over the first and second periods of sampling (from 1959 to 1964 and from 1973 to 1974), for the same set of wells, the data given in Table 1 were accumulated. This information was augmented by appended data characterizing the chemistry of seawater and of rain, sampled on the coast (Reading station) and in the eastern recharge region of the Coastal Plain aquifer (Lod station). Additional data include distance of each well site from the sea, depth below sea-level to the well's intake filter and the stratigraphic sequence of the subaquifer involved (Fig. 3).

The sampling region is a strip, perpendicular to the coast, extending inland around 17 km from the seashore to a point along the eastern aquifer boundary, and having a uniform width of 2 km. A limited geographic zone was selected so as to illustrate the effectiveness of the principal components statistical method for hydrological research.

As mentioned in the Method of approach, two major assemblages of data can be visually noted on the PCA graph, group A and group B. Each group is characterized by differing salinity levels, as indicated by ionic ratios, etc. (Figs. 7-10). Principal components analysis of data represented by these groups can help point out the influence of specific variables upon the patterns of water quality. In the context of sampling location and rainfall, the physical variables here include D.S. and D.W. Chemical variables involve concentrations of major ions (e.g. chlorides, sodium, nitrates, etc.) and the state of geochemical alteration (ionic ratios).
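Table 1 reports, for each sample, the total anions (1), the total cations (2), their sum and their difference (1) - (2). A short sketch of that bookkeeping, using three rows of Table 1, is given below. The function name is hypothetical, and the percentage error is a common charge-balance convention added here for illustration, not a quantity reported in the table.

```python
# Concentrations in mequiv. per litre, copied from three rows of Table 1.
SAMPLES = {
    "15-1": {"Ca": 3.7, "Mg": 2.6, "Na": 5.1, "K": 0.14,
             "HCO3": 5.2, "SO4": 1.1, "Cl": 5.2, "NO3": 0.02},
    "26-2": {"Ca": 4.0, "Mg": 2.7, "Na": 4.9, "K": 0.12,
             "HCO3": 4.9, "SO4": 1.7, "Cl": 5.6, "NO3": 0.04},
    # No NO3 value is listed for seawater in Table 1.
    "Seawater": {"Ca": 20.0, "Mg": 111.0, "Na": 457.6, "K": 9.7,
                 "HCO3": 2.3, "SO4": 56.2, "Cl": 535.9},
}
CATIONS = ("Ca", "Mg", "Na", "K")
ANIONS = ("HCO3", "SO4", "Cl", "NO3")


def charge_balance(sample):
    """Total anions (1), total cations (2), (1)+(2), (1)-(2) and % error."""
    anions = sum(sample.get(ion, 0.0) for ion in ANIONS)
    cations = sum(sample.get(ion, 0.0) for ion in CATIONS)
    return {
        "anions": round(anions, 2),
        "cations": round(cations, 2),
        "total": round(anions + cations, 2),
        "difference": round(anions - cations, 2),
        # Assumed convention: difference as a percentage of total ions.
        "percent_error": round(100.0 * (anions - cations) / (anions + cations), 2),
    }


balance = {name: charge_balance(ions) for name, ions in SAMPLES.items()}
for name, result in balance.items():
    print(name, result)
```

For sample 15-1 this reproduces the tabulated totals of 11.52 and 11.54 mequiv. per litre and the difference of -0.02.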

Rainfall

The Coastal Plain aquifer is naturally recharged each year, mainly by rainfall. Rain falling nearer the sea is more saline, as a result of the higher salinity of its environment, than rain falling further inland. This is illustrated in Fig. 8, where the two rainfall sampling sites are shown in relation to the major ionic concentration zones of sampled wells. In this assemblage of sampling sites, the delineation between two water groups involving major ions is clearly seen. Group A exhibits greater correlation with F1, representing water which is more saline than that of group B (see Fig. 8 and Table 1). Group B involves only Coastal Plain aquifer samples of shallow or moderate depth. The Lod station rainfall sampling site, near the eastern aquifer boundary, is clearly situated on the diagram in closer proximity to group B than the Reading station rainfall sampling site, which is located on the coast. Furthermore, the projection of the Lod station onto the F1 axis is more negative than that of the Reading site, and therefore represents water of lower salinity.

Distance of well from the sea (D.S.)

For the purpose of this analysis, as represented by Fig. 9, only Coastal Plain aquifer wells were involved. As seen in the figure, wells 26 and 31, which are near axis F1 and closest to the sea, appear most saline, drawing their water from the greatest depth, whereas those furthest from the sea were lower in salinity but, being shallower, registered markedly higher nitrate levels.

65

PRINCIPAL COMPONENTS ANALYSIS OF WATER QUALITY

FIt~OR

II~l

tt

~

1.4

/

29

g,

/'H03

-r-"r--r--'~ ~-,

!

i., ',2,.~

,.,

~

~

,.,

:

,-....-~-.~,2_,:i.~

~.,

,.,

22

-0.6

~0

-8.8

V

-i.~: ,, t

-].0'



(')1

i

V •

- Lod rainfall station - Reading rainfall station P1 - General pleistocene ,

B

i l Pleistocene

s.b

F(I)= 54 Z

C

aquifers

F(I) + F(2) : 81 7,

Co -

CenoManian

+

-

u -

1959-1965 sample 1973-1974 sample

Fig. 8. Principal components analysis of major ions affecting groundwater quality in strip 132 of the Coastal Plain aquifer, with location of rainfall measuring stations.

66

A. MELLOUL AND M. COLLIN

YA~02 2 (F2)

1i/ 1.6

1.4

132 2

i

i

f

;

; 12

/

___~cl

"4

.6

~

20

~

2~..22

~o~ 1 , , .

¥

0.8

/

I O

]1,4 -'"

APl- Gen.•al.l.lstocener ?I.istocen.

'¢ "L .;l:.. 4- - 1959-1965 sample

T(I):

62 z

) ' ( 1 ) + l r ( 2 ) : 83 Y

4e - 19?3-19?4sample D.S. - Distance f~oN t ~ sea D.M. - Intake filter depth belov sea level

Fig. 9. Principal c o m p o n e n t s analysis o f effect of m a j o r ions, distance from the sea, and i n t a k e filter d e p t h below sea-level on g r o u n d w a t e r quality in strip 132 of the C o a s t a l Plain aquifer.

Fig. 10. Principal components analysis of effect of major ions and ionic ratios on groundwater quality in strip 132 of the Coastal Plain aquifer. F(1) = 51%; F(1) + F(2) = 75%. Legend: Pl, general Pleistocene; A, B, C, Pleistocene subaquifers; Ce, Cenomanian; samples from 1959-1965 and 1973-1974.

Ionic ratio          Average   St. deviation
Mg/Ca                0.43      0.52
Mg/Na                0.25      0.23
[Cl-(Na+K)]/Cl       0.02      0.27
Na/Ca                0.91      0.59
HCO3/Cl              1.54      0.52
SO4/Cl               0.15      0.18
S.A.R.               1.96      1.62

Depth below sea-level to well's intake filter (D.W.)

Looking again at Fig. 9, it may be observed that most of the shallower wells (between 40 and 55 m depth) are found within group B, on the negative side of the centre of the graph, along the D.W. axis. The deeper wells, on the other hand, are located on the positive side of the D.W. axis, in group A.


Nitrate pollution levels

Nitrates were selected as indicative of a significant level of pollution resulting from land-use on the surface (Ronen et al., 1983). Figures 7 and 9 indicate that most of the shallow wells in group B are characterized by higher nitrate content than those in group A. Figure 7 shows further that the deeper wells of the Coastal Plain aquifer, and the wells tapping the limestone aquifer to its east, are uniformly lower in nitrate content than are the shallower coastal wells. Further, most wells in groups A and B indicate a marked increase in nitrate content between the two sampling periods, with the greatest increases in group B. The closer a sample lies to the nitrate parameter axis, the higher is its nitrate content. Differences in nitrate content between one well and another within the same cluster may possibly be accounted for by the proximity of each to specific, varying pollution sources.

Geochemical state

Data on geochemistry have been assessed by means of ionic ratio analysis (Magaritz et al., 1981). Such analyses include Na/Ca, SAR (sodium adsorption ratio, i.e. $\mathrm{SAR} = \mathrm{Na^{+}} / [(\mathrm{Ca^{2+}} + \mathrm{Mg^{2+}})/2]^{1/2}$), Mg/Ca, SO4/Cl, HCO3/Cl, and the index of base exchange (i.e. (Cl - (Na + K))/Cl). In Fig. 10, the results of this assessment are presented on the basis of the PCA method. Wells in group A have higher Mg/Ca and Na/Ca ratios, and higher SAR, than those of group B. This can be an indication of the influence of lithology and seawater intrusion. Furthermore, Fig. 10 indicates that well water of group B is characterized by higher HCO3/Cl ratios than water of group A. Thus, the increase in HCO3 in relation to Cl over the sampling period is more marked in group B than in group A. Finally, the index of base exchange and the Mg/Na ratio may be seen to characterize a process at work in wells of both groups A and B. A negative index value can indicate enrichment in Na.
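The ionic ratios listed above can be written out as a short sketch. It assumes that concentrations are expressed in meq/l, the usual convention for SAR; this section does not state the units. The function name and dictionary keys are illustrative, and the example input uses the mean concentrations from Table 2 (units likewise assumed).

```python
import math

def ionic_ratios(c):
    """Ionic ratios for one water sample.

    c: dict of concentrations keyed by ion name, assumed to be in meq/l.
    """
    return {
        "Na/Ca": c["Na"] / c["Ca"],
        "Mg/Ca": c["Mg"] / c["Ca"],
        "Mg/Na": c["Mg"] / c["Na"],
        "SO4/Cl": c["SO4"] / c["Cl"],
        "HCO3/Cl": c["HCO3"] / c["Cl"],
        # Sodium adsorption ratio: Na / sqrt((Ca + Mg) / 2)
        "SAR": c["Na"] / math.sqrt((c["Ca"] + c["Mg"]) / 2),
        # Index of base exchange: (Cl - (Na + K)) / Cl; negative values
        # can indicate enrichment in Na.
        "base_exchange_index": (c["Cl"] - (c["Na"] + c["K"])) / c["Cl"],
    }

# Mean concentrations from Table 2. Ratios of means are not the same as the
# means of per-sample ratios, so these values only approximate the averages
# listed with Fig. 10.
mean_sample = {"Ca": 3.57, "Mg": 1.44, "Na": 3.05, "K": 0.06,
               "HCO3": 3.92, "SO4": 0.59, "Cl": 3.17}
ratios = ionic_ratios(mean_sample)
for name, value in ratios.items():
    print(f"{name:20s} {value:6.3f}")
```

In practice the function would be applied to each well sample separately, and the resulting ratios added as extra variables to the PCA, as was done for Fig. 10.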
In group A, wells 30 and 31 show enrichment in Mg and Cl over the sampling period, whereas most of the group B wells show the reverse over the same period.

Statistical values employed in PCA

Table 2 summarizes statistical results (some of which are used in plotting Fig. 7), including factor scores, communalities, correlation values between parameters and eigenvalues of the various principal components. This example allows comparison of PCA with the classical geochemical methods dealt with in this paper.


TABLE 2
Results obtained by PCA for major chemical variables

(1) Statistical results of variables

Variable   Mean   Standard deviation   Factor 1   Factor 2   Communality
Ca         3.57   1.19                  0.17       0.96      0.96
Mg         1.44   0.87                  0.88      -0.17      0.81
Na         3.05   1.75                  0.94      -0.01      0.89
K          0.06   0.05                  0.94      -0.11      0.90
HCO3       3.92   1.07                  0.81       0.35      0.79
SO4        0.59   0.56                  0.91       0.14      0.85
Cl         3.17   1.83                  0.94       0.11      0.91
NO3        0.47   0.61                 -0.65       0.65      0.84

(2) Correlation coefficient values

        Ca      Mg      Na      K       HCO3    SO4     Cl
Mg     -0.01
Na      0.11    0.75
K       0.02    0.84    0.94
HCO3    0.46    0.65    0.77    0.69
SO4     0.32    0.75    0.81    0.77    0.75
Cl      0.28    0.81    0.92    0.91    0.69    0.86
NO3     0.45   -0.63   -0.56   -0.60   -0.36   -0.58   -0.49

(3) Weights of the principal components and eigenvalues

Factor   Eigenvalue   Percentage of variation   Cumulative percentage of variation
1        5.41         67.6                       67.6
2        1.55         19.4                       87.0
3        0.39          4.9                       91.9
4        0.28          3.5                       95.4
5        0.25          3.1                       98.5
6        0.08          1.0                       99.5
7        0.04          0.5                      100.0
8        0.01          0.0                      100.0
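The three parts of Table 2 are linked: the eigenvalues and factor loadings follow from the correlation matrix, and each communality is the sum of the squared loadings on the retained factors. The following minimal sketch recomputes them from the rounded correlation coefficients of part (2). It assumes that the published factors are unrotated principal components of the correlation matrix, which the table does not state explicitly; because the inputs are rounded to two decimals, the output can only approximate the published values.

```python
import numpy as np

variables = ["Ca", "Mg", "Na", "K", "HCO3", "SO4", "Cl", "NO3"]

# Lower triangle of the correlation matrix from Table 2(2), one row per
# variable; the diagonal of ones is supplied by np.eye below.
lower = [
    [],
    [-0.01],
    [0.11, 0.75],
    [0.02, 0.84, 0.94],
    [0.46, 0.65, 0.77, 0.69],
    [0.32, 0.75, 0.81, 0.77, 0.75],
    [0.28, 0.81, 0.92, 0.91, 0.69, 0.86],
    [0.45, -0.63, -0.56, -0.60, -0.36, -0.58, -0.49],
]

n = len(variables)
R = np.eye(n)
for i, row in enumerate(lower):
    for j, r in enumerate(row):
        R[i, j] = R[j, i] = r

# eigh handles symmetric matrices and returns ascending eigenvalues,
# so sort them into descending order.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

percent = 100 * eigvals / eigvals.sum()
cumulative = np.cumsum(percent)

# Loadings on the first two factors: eigenvector scaled by sqrt(eigenvalue).
loadings = eigvecs[:, :2] * np.sqrt(eigvals[:2])
# Eigenvector signs are arbitrary; orient them to match the table
# (Cl positive on Factor 1, Ca positive on Factor 2).
for k, ref in enumerate(["Cl", "Ca"]):
    if loadings[variables.index(ref), k] < 0:
        loadings[:, k] *= -1

communality = (loadings ** 2).sum(axis=1)

for name, (f1, f2), h2 in zip(variables, loadings, communality):
    print(f"{name:5s} F1 = {f1:5.2f}  F2 = {f2:5.2f}  communality = {h2:4.2f}")
print("percentage of variation:", np.round(percent, 1))
```

Since the matrix has eight standardized variables, the eigenvalues sum to 8, and the percentage of variation for each factor is its eigenvalue divided by 8.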

DISCUSSION

Figures 4-6, when compared with Fig. 7, indicate the ease with which PCA delineates groups of well samples that have similar geochemical characteristics, as opposed to such classical geochemical methods as the Schoeller or Piper approaches. For example, in Fig. 5, wells 15, 24, 30 and 31, which are represented by the lines in the upper portion of the graph, are clearly seen as group A in Fig. 7. This indicates that PCA can be used to support the Schoeller or Piper methods in visualizing assemblages of individual data points.

The principal components statistical method is thus shown to delineate the relationship between a large number of samples and the variables which might contribute to a change in an aquifer's water quality. The direction and length of the vectors make it easy to visualize on a graph the influence of specific variables upon ultimate water quality, while discounting the influence of other variables. Thus, in Fig. 7, the vectors representing Na and K are in close proximity, indicating parallel and significant influences of Na and K salts on water quality. As Cl ions are highly correlated with total salinity (Table 2), the correspondence of the Na and Cl vectors may be noted in the graph. Therefore, in the case of salinity analysis it is redundant to plot both Na and Cl. Likewise, in Fig. 10, the vectors for SAR and the Na/Ca ratio are identical in direction and approximately the same in length. This allows the use of only one of these to represent the effect of Na upon water quality. In the case of the SAR formula it is not necessary to use both SAR and Mg to indicate water quality changes; one of these variables is sufficient. Thus, this method can enhance the efficiency and economy of environmental monitoring by eliminating the consideration of redundant chemical factors.

The principal components method allows parallel consideration of chemical as well as physical variables in determining changes in water quality. In this study, the relationship of D.S. and D.W. vis-à-vis several chemical parameters may be visualized, as shown in Fig. 9. On this graph, the influence of the major ions can be seen to be in the opposite direction to the distance from the sea.
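The reading of vector directions used in this discussion (nearby vectors as redundant, opposite vectors as inversely related) can be sketched numerically as the angle between loading vectors in the F1-F2 plane. The sketch uses the Factor 1 and Factor 2 loadings of Table 2, which correspond to Fig. 7; loadings for D.S. and D.W. in Fig. 9 are not tabulated in the text and are therefore not included. The angle thresholds are arbitrary illustrative choices, not values from the paper.

```python
import math
from itertools import combinations

# Factor 1 and Factor 2 loadings from Table 2(1).
loadings = {
    "Ca": (0.17, 0.96), "Mg": (0.88, -0.17), "Na": (0.94, -0.01),
    "K": (0.94, -0.11), "HCO3": (0.81, 0.35), "SO4": (0.91, 0.14),
    "Cl": (0.94, 0.11), "NO3": (-0.65, 0.65),
}

def angle_deg(u, v):
    """Angle between two loading vectors in the F1-F2 plane, in degrees."""
    cos = (u[0] * v[0] + u[1] * v[1]) / (math.hypot(*u) * math.hypot(*v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

def classify_pairs(loadings, parallel_below=10.0, opposed_above=120.0):
    """Label variable pairs whose vectors are nearly parallel or opposed.

    Both thresholds (in degrees) are hypothetical defaults for illustration.
    """
    result = {}
    for a, b in combinations(loadings, 2):
        angle = angle_deg(loadings[a], loadings[b])
        if angle < parallel_below:
            result[(a, b)] = ("near-parallel", angle)
        elif angle > opposed_above:
            result[(a, b)] = ("opposed", angle)
    return result

pairs = classify_pairs(loadings)
for (a, b), (label, angle) in pairs.items():
    print(f"{a:4s} {b:4s} {label:13s} {angle:6.1f} deg")
```

A small angle in the plane suggests redundancy only when both variables are well represented by the two factors; in Table 2 all communalities are 0.79 or higher, so the plane is a reasonable summary for these eight variables.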
The position on the graph of all the ions lies on the positive side of the F1 axis, whereas the position of D.S. is on the negative side of the same axis. Thus, the closer the well sample's position to that of D.S., the less saline will be the water involved. Furthermore, with time, it can be seen that most well samples are increasingly on the positive, ionic side of the F1 axis, and are thus more saline. (Note the drastic shift of values over time for water from well 26 (dotted line), the closest to the sea of the wells on the graph.) Likewise, the two variables D.W. and nitrate lie in almost opposite directions on the graph. Samples with higher nitrate content tend to typify shallower wells; this emphasizes the influence of ground surface activities on aquifer water quality. Thus, a cursory glance at the graph emphasizes the interrelationships of physical and chemical variables in their influence on aquifer water quality.

Furthermore, the lack of connection between local pollution sources of nitrates as well as sources of salinity and ground water of the non-phreatic subaquifer C is evident in Fig. 9 when comparing the direction of nitrate increase with the nitrate content of such wells as 26 and 31. To illustrate the point further, Fig. 9 shows that the variable 'distance from the sea' clearly affects wells 26 and 31 in group A, involving water of higher salinity, more than wells of group B. This correlates with present awareness of a marked degree of seawater intrusion characterizing wells close to the sea in subaquifers A, B, and C in this strip.

Finally, as seen in Fig. 10, the HCO3/Cl ratio of wells of group B uniformly exceeds that of group A. This can be indicative of greater contact with the atmosphere as a result of the phreatic character of the aquifer of wells of group B, as opposed to those of group A, which belong to a more confined aquifer.

An additional feature of the principal components method allows for the visualization of increased levels of individual chemical variables in well water over a period of time. Figure 7 shows increased levels of all the major ions (along the dotted lines) over the period of sampling, and indicates which of these increase more radically, as well as the relative directions of increase. Water samples from the phreatic aquifers (wells 20 and 32 in group B) show the sharpest increase in nitrate. In the case of well 32, its proximity to the Ayalon stream channel, which normally contains highly polluted effluents (Kanfi and Ronen, 1982), might well account for increased nitrate content (see Figs. 2 and 3). This point is borne out in Fig. 10, which shows that the decrease in the index of base exchange (indicating greater influence of Na and K in relation to Cl in the environment) is parallel in both wells 20 and 32, indicating either a greater responsiveness to pollution sources in the ground water of the phreatic Coastal Plain aquifer or an expression of contact and exchange of water with Na in aquifer clays.
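The shift of a well between the two sampling periods, drawn as a dotted line on the factor plane, can be expressed as a displacement vector and projected onto the direction of a variable to see how much of the shift lies along that variable. The sketch below is only illustrative: the factor scores and well names are hypothetical (the paper does not tabulate per-well scores), and only the nitrate direction is taken from the loadings of Table 2.

```python
import numpy as np

# Direction of the nitrate vector in the F1-F2 plane, from Table 2(1).
nitrate_loading = np.array([-0.65, 0.65])

# Hypothetical (F1, F2) scores for the same wells in the two sampling periods.
# Neither the well names nor the values come from the paper.
scores_early = {"well_a": (-0.8, 0.2), "well_b": (1.1, -0.3)}
scores_late = {"well_a": (-1.0, 0.9), "well_b": (1.2, -0.2)}

def shift_along(variable_loading, early, late):
    """Component of each well's displacement along a variable's direction."""
    unit = variable_loading / np.linalg.norm(variable_loading)
    return {
        well: float((np.array(late[well]) - np.array(early[well])) @ unit)
        for well in early
    }

shifts = shift_along(nitrate_loading, scores_early, scores_late)
for well, value in shifts.items():
    print(f"{well}: shift along nitrate direction = {value:+.2f}")
```

A positive value means the sample moved toward the nitrate vector between the two periods; a value near zero means the shift was perpendicular to it, that is, dominated by other variables.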
Furthermore, the deeper wells of group A, which show an increase in this index with time, thus have a lower sensitivity to pollution sources at the ground surface.

The process of mixing over time between the Cenomanian aquifer and the eastern portion of the Coastal Plain aquifer is illustrated by Fig. 7. It can be seen that the salinity levels and major ionic contents of the deeper group A wells (from the Cenomanian) decrease in time in the direction of the group B wells of the phreatic Coastal Plain aquifer. In Fig. 10, for the same Cenomanian wells (15, 24, and 30), the ratios of SO4/Cl, Mg/Ca and Na/Ca (SAR) decrease in time, whereas the HCO3/Cl ratio increases, thus emphasizing the possibility of mixing with coastal water, as represented by group B wells. In this region there is a hydraulic contact and gradient between the Coastal Plain and the Cenomanian aquifers, as shown in Fig. 3, which fits the pattern noted in Fig. 10.

The negative base exchange index in wells 22 and 23, and its decrease with time, may be explained by the wells' proximity to the clay basement of the Coastal Plain aquifer (see Fig. 3), with the consequent likelihood of higher base exchange capacities of these clays with the ground water. The fact that this index does not become more negative with time indicates that the process of salinization is of greater importance than the ionic exchange process resulting from contact of aquifer water with clays. A greater period of time might be required for the effect of this exchange process to be significant.

The influence of rainwater on phreatic aquifer wells is more pronounced than it is upon deeper Cenomanian wells, as shown in Fig. 8. This figure indicates that the rainwater itself at sampling station Reading, near the sea, is more saline than that at sampling station Lod, at the eastern margin of the Coastal Plain. Thus, it may be noted that fresher water is likely to result from the higher levels of rainfall typical of rainfall stations located along the Coastal Plain aquifer's eastern boundary.

Another point of explanation afforded by these figures is that the higher Na/Ca and Mg/SO4 ratios and SAR levels of group A vis-à-vis group B wells may be explained by the richer Na, Mg and SO4 sources, such as dolomite, marl and chalk, found in the Cenomanian limestone aquifer than in the Coastal aquifer. Furthermore, nitrates appear to be derived from agricultural run-off as well as from sewage in streams. Another source of input might be the National Water Carrier, the recharge water from which might affect water quality of some Cenomanian and Pleistocene wells over a period of years.

From this example of Israel's Coastal Plain aquifer it is evident that a wide range of potential causes might have resulted in specific effects. To determine the true cause-effect relationship, principal components analysis allows the isolation of the variables that exhibit the greatest parallel influence upon water quality results.
For example, proximity to the sea, to natural recharge regions, to stream channels, to aquifer water tables, to clay beds or to the contacts with neighboring aquifers, as well as to artificial recharge sites, has each been found capable of explaining specific water quality changes in the ground water of the Dan metropolitan region.

CONCLUDING REMARKS

The present concern for effective, efficient stewardship of natural resources, properly balanced with maintenance of a healthy state of the environment, makes it critical that management decisions involved in water supply and quality control be based upon as accurate as possible an assessment of the interrelationships between the multitude of factors which are involved as causes or effects of these decisions. An attractive aspect of the principal components statistical method is its potential use as a tool for sorting through a vast quantity of data and subsequently identifying the most significant influences on and sources of water quality changes. By employing field data, this method depicts the empirical situation in situ, and provides a real-world basis for the development and calibration of water resource models. Such a means of facilitating assemblages of water quality types and well locations which are indicative of genetic processes and points of origin of pollutants can aid in the development of modeling concepts. As a result, management decisions can be made with a clearer picture of existing or potential cause-effect connections between the variables and ultimate water quality.

In this study, a small area was investigated to indicate the utility of the PCA method as a complementary tool to classical geochemical methods. Specific attention was paid to nitrates as a pollution indicator. It should be noted that this method may be equally well applied to larger areas with a greater number of samples, and a wider range of pollution indicators.

REFERENCES

Davis, J.C., 1984. Statistics and Data Analysis in Geology, 2nd edn. Wiley, New York, 646 pp.
Deverel, S.J., 1989. Geostatistical and principal components analysis of ground water chemistry and soil-salinity data, San Joaquin Valley, California. Proceedings of the Baltimore Symposium, May 1989, Regional Characterisation of Water Quality, IAHS Publ. 182, pp. 11-18.
Harman, H.H., 1976. Modern Factor Analysis, 3rd edn. University of Chicago Press, pp. 355-360.
Kanfi, Y. and Ronen, D., 1982. Changes and trends in the nitrate contents of the Coastal Plain aquifer of Israel. Ministry of Agriculture, Israel Water Commission, Jerusalem, 10 pp. (in Hebrew).
Magaritz, M., Nadler, A., Koyumdjisky, H. and Dan, J., 1981. The use of Na/Cl ratios to trace solute sources in a semi-arid zone. Water Resour. Res., 17(3): 602-608.
Melloul, A., 1979. Hydrogeological knowledge of deep aquifers in porous media and with few data by Principal Components analysis based on geochemical and isotopical parameters. Ph.D. Thesis, University of Neuchâtel, 268 pp. (in French).
Melloul, A., 1983. Identification of factors that influence water quality in the Dan area of the Coastal Plain aquifer. Israel Hydrological Service Work Rep. Hydro./2/1983, Jerusalem, 49 pp. (in Hebrew).
Melloul, A., 1988. A hydrogeological atlas of Israel's Coastal Plain aquifer - its geometry and physical properties. Israel Hydrological Service Work Rep. Hydro./8/1988, Jerusalem, 33 pp. (in Hebrew).
Morin, G., Fortin, J.P., Sochanska, W., Lardeau, J.P. and Charbonneau, R., 1979. Use of Principal Component Analysis to identify homogeneous precipitation stations for optimal interpolation. Water Resour. Res., 16(6): 1841-1850.
Ronen, D., Kanfi, Y. and Magaritz, M., 1983. Nitrogen presence in groundwater as affected by the unsaturated zone. Israel Water Commission, Work Rep., Jerusalem, 30 pp.
Saad, M. and Turgeon, A., 1988. Application of Principal Components analysis to long-term reservoir management. Water Resour. Res., 24(7): 907-912.
Schoeller, H., 1962. Les eaux souterraines. Ed. Masson et Cie, 619 pp.
Zhou Chang, T. and Davis, J.C., 1983. Dual extraction of R-mode and Q-mode factor solution. Math. Geol., 15(5): 581-596.