Analytica Chimica Acta, 258 (1992) 1-10
Elsevier Science Publishers B.V., Amsterdam
Application of principal component analysis to the interpretation of rainwater compositional data

Pexun Zhang ¹, Nelley Dudley, Allan M. Ure and David Littlejohn *

Department of Pure and Applied Chemistry, University of Strathclyde, Cathedral Street, Glasgow G1 1XL (UK)

(Received 7th September 1991; revised manuscript received 3rd September 1991)
Abstract

Principal component analysis (PCA), based on non-linear iterative partial least squares (NIPALS) coupled with a cross-validation approach, is applied to data obtained from the chemical analysis of rainwater. The correlation between variables is obtained and their sources identified. The classification of samples into groups by PCA is also investigated. The problem of data scaling and the evaluation of methods for assessing the number of significant components in the data are also discussed.

Keywords: Atomic absorption spectrometry; Principal component analysis; Acid rain; Cross-validation; Factor analysis; Rainwater; Waters
The phenomenon of acid rain is currently an important aspect of environmental pollution. Many studies have investigated the cause and effects of acid rain and various strategies have been proposed to control the problem [1-4]. The most important objective in such research is to understand the phenomenon. This requires an optimal method of obtaining as much information as possible from the multivariate data analysis. There are two points [5], the nature of the variables and the purpose of the study, to be taken into account when selecting the best procedure to analyse a set of data. Principal component analysis (PCA), sometimes called principal factor analysis (PFA), is one of the most efficient and widely used approaches for analysis of multivariate data. It is described in most standard texts on multivariate analysis [6-8]. It has been applied to the study of rain chemistry in the Puget Sound Region [9], air pollution sources [10], the number of species present in chemical equilibria [11], and in the classification of Chinese tea samples [12].

In this paper, principal component analysis is used to analyse compositional data for rainwater collected at Largs, a coastal town in the West of Scotland. Correlations between the variables and the sources of samples are obtained. A new function for the influence of wind direction is proposed, which allows the correlation between wind direction and the concentration of species to be studied quantitatively. The investigations highlight the problem of preparing the data prior to analysis (e.g. transformation of pH to hydrogen ion concentration) and classify samples into groups according to the principal components. In addition, the precise number of principal components assessed with different methods is evaluated in detail.

¹ On leave from Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, China.
0003-2670/92/$05.00 © 1992 - Elsevier Science Publishers B.V. All rights reserved
EXPERIMENTAL
Sample collection

Daily rainwater samples were collected from 1 January 1988 to 20 March 1988 (80 days) at a rural site in Largs on the west coast of Scotland. The samples were collected in polythene containers with guarded funnels. This multiple daily sampling ensured reasonable freedom from contamination of rain samples. Rainfall volumes were not measured, but were obtained from a nearby auxiliary Meteorological Office Station 649501, where rainfall was collected using the standard MO raingauge.

Species determination

The pH of Largs rainwater was measured daily. The pH meter was calibrated regularly with pH 4 and pH 7 standard buffer solutions. The daily general wind direction during the study was obtained from the Daily Telegraph and Sunday Telegraph. The rainwater samples were analysed quantitatively for the cations Na⁺, Mg²⁺, Li⁺, Pb²⁺, Cd²⁺ and Cu²⁺, and the anions Cl⁻, NO₃⁻ and SO₄²⁻. Flame atomic absorption spectrometry was used for the determination of sodium and magnesium, and electrothermal atomic absorption spectrometry with a graphite furnace atomiser was used for lithium, lead, cadmium and copper. The anions chloride, sulphate and nitrate were determined by ion chromatography. The determinations were made daily except when the volume of rainwater collected was too small for analysis. In total, 54 samples of rainwater were analysed for 12 variables.

Every precaution was taken to minimise contamination of solutions and all vessels were acid and distilled water-washed before use. Standard solutions for cations were made up from stock solutions commercially prepared for spectrochemical analysis. AnalaR reagents were used for the standard solutions in the determination of anions and these were chemically standardised.

Well-established flame atomic absorption spectrometry procedures, using direct calibration with standard solutions, were applied for Na and Mg. For Cd, Cu, Li and Pb, the electrothermal atomic absorption spectrometry procedures used were first optimised for furnace parameters (temperatures and times) and the analysis checked for possible chemical interferences by the standard additions method. This showed that no interferences of this nature occurred for Cd and Cu, and direct calibration with standard solutions was used. In the case of Li and Pb, small interferences from matrix components could occur and these elements were determined by the standard additions procedure. In all analyses, blank solutions were analysed with each batch of samples.

Data analysis

The analytical results were expressed as a matrix X of size M × N, where M and N are the numbers of samples and variables, respectively. The raw data were tailored by the mean-centred, variance scaling method (autoscaling or z-transformation) [13]. A cross-validation procedure [14] was used on the scaled data to assess the number of significant components present in the data. When the number of significant components was known, the correlation matrices were decomposed using a non-linear iterative partial least squares (NIPALS) method [15] for the eigenvalues and eigenvectors. The loadings were rotated by the varimax procedure [16] to aid interpretation. The samples which had the greatest loadings, either positive or negative, on the same varimax-rotated principal component were put into one group. A VAX computer was used for all calculations. The programme was written in FORTRAN 77.
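The autoscaling and NIPALS steps described above can be illustrated with a short sketch. It is written in Python rather than the FORTRAN 77 of the original programme, and the function names, the choice of starting vector, the tolerance and the iteration limit are all assumptions made for illustration, not details taken from the paper.

```python
import math

def autoscale(X):
    """Mean-centre each column and scale it to unit variance (z-transformation)."""
    m = len(X)
    cols = list(zip(*X))
    means = [sum(c) / m for c in cols]
    sds = [math.sqrt(sum((v - mu) ** 2 for v in c) / (m - 1))
           for c, mu in zip(cols, means)]
    return [[(v - mu) / sd for v, mu, sd in zip(row, means, sds)] for row in X]

def nipals_pca(Z, n_components, tol=1e-9, max_iter=5000):
    """Extract components one at a time; returns (scores, loadings, eigenvalues)."""
    E = [row[:] for row in Z]                      # residual matrix, deflated below
    m, n = len(E), len(E[0])
    scores, loadings, eigenvalues = [], [], []
    for _ in range(n_components):
        # Assumed starting vector: the residual column with the largest sum of squares.
        j0 = max(range(n), key=lambda j: sum(E[i][j] ** 2 for i in range(m)))
        t = [row[j0] for row in E]
        for _ in range(max_iter):
            tt = sum(v * v for v in t)
            p = [sum(E[i][j] * t[i] for i in range(m)) / tt for j in range(n)]
            norm = math.sqrt(sum(v * v for v in p))
            p = [v / norm for v in p]              # unit-length loading vector
            t_new = [sum(E[i][j] * p[j] for j in range(n)) for i in range(m)]
            change = math.sqrt(sum((a - b) ** 2 for a, b in zip(t_new, t)))
            t = t_new
            if change <= tol * math.sqrt(sum(v * v for v in t)):
                break                              # score vector has stopped changing
        else:
            raise RuntimeError("NIPALS did not converge")
        for i in range(m):                         # deflate: E <- E - t p'
            for j in range(n):
                E[i][j] -= t[i] * p[j]
        scores.append(t)
        loadings.append(p)
        eigenvalues.append(sum(v * v for v in t) / (m - 1))
    return scores, loadings, eigenvalues
```

For autoscaled data the returned eigenvalues are those of the correlation matrix of the variables. The explicit failure on non-convergence is one possible way of handling the convergence hazard that NIPALS is known to have; varimax rotation and cross-validation are not covered by this sketch.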
RESULTS AND DISCUSSION
Wind direction and sources of species

A study of the correlation between the wind direction and the concentration of species is useful because it can reveal where the species come from. For this purpose, a function is needed to transfer the wind direction into numerical values. In this work, west to east was made the reference axis. To account for the periodicity of the wind direction, the function $f = 2\sin\theta - \cos\theta$ was used. As westerly winds predominate at Largs, a west wind was assigned a value of $\theta = 0$. This function has a maximum for a south southeast wind and a minimum for a north northwest wind, as shown in Fig. 1. Hence, if the wind direction is known, a numerical value or identifier can be obtained from the function. The reverse operation is more problematic, because for every value of $f$ there are two possible corresponding wind directions. Nevertheless, this can be determined with a knowledge of the predominant wind direction at the sampling site. For instance, for $f = -1$, there are two possible wind directions: (a) west and (b) between northeast and north northeast. However, the latter seldom occurs at Largs and so it is reasonable to assign a value of $f = -1$ to a westerly wind.

[Fig. 1. Wind direction and its corresponding numerical value. Horizontal axis: wind direction/angle, 0° (W), 45° (SW), 90° (S), 135° (SE), 180° (E), 225° (NE), 270° (N), 315° (NW), 360° (W); vertical axis: f.]

If the concentration of a species is positively correlated with the wind, that means its concentration will be enhanced during the period when a southeast or south southeast wind predominates. In other words, this species mainly came from that direction. The contrary is the case where the species mainly come from the north or north northwest. If the concentration is not significantly correlated with wind, it indicates that the species is not influenced by the wind and is mainly emitted near the sampling site.
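As a small worked illustration of the wind function, the sketch below evaluates $f = 2\sin\theta - \cos\theta$ on a 16-point compass and recovers the two candidate angles for a given value of $f$. The compass list, the function names and the use of the identity $f = \sqrt{5}\,\sin(\theta - \varphi)$ with $\tan\varphi = 1/2$ are assumptions introduced here; only the function itself and the convention that west is $\theta = 0$ come from the text.

```python
import math

# Compass points counted from west (theta = 0) through south, east and north,
# following the axis of Fig. 1. The 16-point list is an illustrative assumption.
COMPASS = ["W", "WSW", "SW", "SSW", "S", "SSE", "SE", "ESE",
           "E", "ENE", "NE", "NNE", "N", "NNW", "NW", "WNW"]

def wind_value(direction):
    """Return f = 2 sin(theta) - cos(theta) for a 16-point compass direction."""
    theta = math.radians(22.5 * COMPASS.index(direction))
    return 2.0 * math.sin(theta) - math.cos(theta)

def candidate_angles(f):
    """Return the two angles (degrees, 0-360) whose wind value equals f.

    Uses f = sqrt(5) * sin(theta - phi) with phi = atan2(1, 2); valid for |f| <= sqrt(5).
    """
    phi = math.atan2(1.0, 2.0)
    base = math.asin(f / math.sqrt(5.0))
    return sorted(math.degrees(a + phi) % 360.0 for a in (base, math.pi - base))

strongest = max(COMPASS, key=wind_value)   # direction with the largest f
weakest = min(COMPASS, key=wind_value)     # direction with the smallest f
```

On this compass the largest and smallest values fall on SSE and NNW, in line with the description of Fig. 1, and `candidate_angles(-1.0)` gives one angle at west and one at about 233°, which is why knowledge of the prevailing wind is needed to invert the function.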
Data preparation

Generally, the raw data are not suitable for statistical analysis. It is necessary to tailor the results prior to analysis, and mean-centred, variance scaling is an approach often used [13]. If the aim of the study is to investigate the correlation between variables, this procedure is not essential even when the magnitudes of the data vary by two to three orders. However, if the aim is to study the correlation between samples, mean-centred, variance scaling is very important for data with a wide range of magnitudes, otherwise artificially high correlation coefficients will be obtained. To overcome this potential hazard, it is suggested that arbitrary units be used for different variables, with the decimal points in the data moved so that all the data have numerical values of a similar order of magnitude. Another way to achieve this would be to autoscale the data before the correlation of samples is calculated. However, autoscaling is unsuitable in some cases where one or more variables are nearly constant.

Another problem associated with data preparation is data transformation. Sometimes a measurement is not linearly correlated to others. For validation using a linear model, one or more variables should be transformed. Logarithmic and exponential transformations are the methods often used. In this work, the pH values are transformed into hydrogen ion concentrations because there is not a linear relationship between pH and the concentrations of the other ions.
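The two preparation issues discussed above can be made concrete with a minimal sketch. The three samples below are invented numbers chosen only to span several orders of magnitude; they are not data from this study, and the helper names are assumptions. The pH conversion uses the usual definition $[\mathrm{H}^+] = 10^{-\mathrm{pH}}$.

```python
import math

def hydrogen_ion_concentration(ph):
    """Convert pH to hydrogen ion concentration in mol/l: [H+] = 10**(-pH)."""
    return 10.0 ** (-ph)

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def autoscale(X):
    """Mean-centre each column and scale it to unit variance."""
    m = len(X)
    cols = list(zip(*X))
    means = [sum(c) / m for c in cols]
    sds = [math.sqrt(sum((v - mu) ** 2 for v in c) / (m - 1))
           for c, mu in zip(cols, means)]
    return [[(v - mu) / sd for v, mu, sd in zip(row, means, sds)] for row in X]

# Hypothetical samples (rows) measured on four variables of very different magnitude.
samples = [[0.10, 5.0, 40.0, 900.0],
           [0.30, 2.0, 90.0, 1000.0],
           [0.20, 9.0, 10.0, 1100.0]]

raw_r = pearson(samples[0], samples[1])      # dominated by the largest variable
scaled = autoscale(samples)
scaled_r = pearson(scaled[0], scaled[1])     # correlation between the same samples after autoscaling
```

With the raw values the correlation between the first two samples is close to unity simply because every sample shares the same pattern of magnitudes; after autoscaling that artefact disappears.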
Correlation between variables

Correlation coefficients indicate the linear relationships between parameters. Correlation coefficient values of 0 or ±1 are the extreme cases, indicating either no relationship or a perfect linear relationship, respectively. In this study, a value of 1 would show that the two species come from precisely the same source and a value of zero indicates independent sources. In practice, $|r_{ij}|$ is in the range 0-1. Therefore, criteria are required to judge whether the correlation is not significant, just significant or highly significant. These cases can also be described as not correlated, correlated and well correlated, respectively. If the random errors in the data are known, the F-test can reveal the extent of correlation. Another simple method involves a comparison of the calculated $r_{ij}$ with $r_{\alpha,f}$ found in statistical tables of significance, where $\alpha$ is the significance level (generally 0.05) and $f$ the degree of freedom, i.e. the number of samples minus two. Because data from environmental studies are unrepeatable, there is no method of assessing a correlation coefficient for high significance unless the coefficients are close to unity. Here it is suggested arbitrarily that $(r_{\alpha,f})^{1/2}$ be used as a criterion for highly significant correlation. If $|r_{ij}| > r_{\alpha,f}$, this indicates a significant correlation and if $|r_{ij}| > (r_{\alpha,f})^{1/2}$ a highly significant correlation is suggested.

Table 1 lists the correlation between species in Largs rainwater. Na⁺, Mg²⁺, Li⁺, SO₄²⁻ and Cl⁻ show highly significant correlation with each other. These species are negatively correlated with a south-east wind. This implies that the species come from the west, as components of the marine aerosol. The fact that NO₃⁻ and H⁺ are well correlated, positively, with the wind direction shows that the two species come mainly from the east, as expected, probably from a man-made or pollutant source. The correlation between Pb²⁺ and Cu²⁺ is unexpectedly high. This may be caused by outliers and is difficult to explain.

TABLE 1
Correlation between variables in rainwater a

            Wind  Volume       H      Na      Mg      Li      Pb      Cd      Cu     SO4     NO3      Cl
Wind       1.000
Volume     0.490   1.000
H          0.473   0.159   1.000
Na        -0.534  -0.288  -0.475   1.000
Mg        -0.509  -0.247  -0.474   0.962   1.000
Li        -0.385  -0.071  -0.422   0.816   0.787   1.000
Pb         0.102  -0.087   0.256   0.153   0.054   0.137   1.000
Cd         0.156  -0.048   0.533  -0.107  -0.091  -0.033   0.046   1.000
Cu        -0.238  -0.226   0.006   0.264   0.128   0.168   0.782  -0.081   1.000
SO4       -0.399  -0.331  -0.326   0.874   0.823   0.767   0.292  -0.099   0.411   1.000
NO3        0.376  -0.038   0.798  -0.346  -0.382  -0.126   0.377   0.382  -0.102  -0.102   1.000
Cl        -0.554  -0.302  -0.480   0.981   0.970   0.835   0.142  -0.107   0.254   0.886  -0.334   1.000

a The criterion for significant correlation in this case is 0.354 ($\alpha$ = 0.01, $f$ = 52); $(r_{\alpha,f})^{1/2}$ = 0.595. Values in italics indicate the highly significant correlations.
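The two-level significance criterion can be written as a short sketch. The critical value 0.354 is the tabulated figure quoted in the footnote of Table 1 and the example coefficients are taken from that table; the function name and the wording of the three labels are assumptions made for illustration.

```python
import math

def classify_correlation(r, r_critical):
    """Grade a correlation coefficient against a tabulated critical value.

    |r| > sqrt(r_critical)  -> "highly significant"
    |r| > r_critical        -> "significant"
    otherwise               -> "not significant"
    """
    if abs(r) > math.sqrt(r_critical):
        return "highly significant"
    if abs(r) > r_critical:
        return "significant"
    return "not significant"

R_CRITICAL = 0.354   # Table 1 footnote: alpha = 0.01, f = 52

# A few coefficients read from Table 1.
examples = {("Na", "Cl"): 0.981, ("H", "NO3"): 0.798,
            ("Wind", "H"): 0.473, ("Volume", "H"): 0.159}
grades = {pair: classify_correlation(r, R_CRITICAL) for pair, r in examples.items()}
```

The square root of 0.354 is about 0.595, so Na/Cl and H/NO₃ are graded as highly significant, wind/H as significant and volume/H as not significant.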
Estimating the number of significant components

The most important role of PCA is to estimate the number of significant components present in the data. To do this, PCA reduces the information to a minimum set of a few, usually perpendicular component vectors that contain most or all of the important information in the data. Unfortunately, there are no known methods that can establish with certainty how many components are present in a set of data. Ramos et al. [17] considered that cross-validation may be one of the best methods for solving this problem. There are several modified versions of this procedure but all retain the same essential pattern [14,18,19]. Wold [14] preferred dividing samples into 4-7 groups in order to save computer time. Eastment and Krzanowski [18] suggested that to extract the maximum possible information, each deleted group should be as small as possible. Computers are now more available and saving computer time is less important than the accuracy of results. Thus, we divide samples into m groups and delete n data on diagonals or pseudodiagonal lines each time. The criterion used to stop the iteration follows that proposed in Refs. 17-19, that is, the number of significant components is set to the value of K, the current number of components, at which the predictive residual error sum of squares (PRESS) has a minimum.

Using the above-mentioned approach on the mean-centred, variance scaled data, the number of significant components is assessed to be four, which includes 86% of the total variance of the data. This means there are four kinds of variables. Table 2 gives the loadings of the variables after varimax-rotation.

TABLE 2
Varimax loadings of variables in Largs rainwater a

Variable        F1        F2        F3        F4        h2
Wind        0.3631   -0.3247   -0.0188    0.7284    0.7683
Volume      0.1539    0.0986    0.1113    0.8907    0.8392
H           0.3561   -0.8197   -0.1687    0.1966    0.8660
Na         -0.9454    0.1805   -0.0868   -0.1745    0.9644
Mg         -0.9446    0.1752    0.0398   -0.1376    0.9436
Li         -0.8835    0.0289   -0.0445   -0.1146    0.7965
Pb         -0.1268   -0.1938   -0.9149    0.0797    0.8971
Cd         -0.0457   -0.8282    0.1916   -0.0771    0.7306
Cu         -0.1450    0.0514   -0.9102   -0.2018    0.8930
SO4        -0.8761    0.0502   -0.2857   -0.1540    0.8756
NO3         0.2151   -0.7879   -0.3460    0.0718    0.7920
Cl         -0.9522    0.1738   -0.0767    0.1927    0.9799
PVAF b        38.4      18.6      16.3      12.9      86.3

a Values in italics indicate the most important loadings.
b Percentage of variance accounted for by the factor.

From Table 2, it can be seen that the first component represents seawater factors (Na⁺, Mg²⁺, Li⁺, SO₄²⁻ and Cl⁻), the second (H⁺, Cd²⁺ and NO₃⁻) and third (Pb²⁺ and Cu²⁺) components represent pollutant factors, or man-made sources, and the fourth component concerns meteorological factors (wind and volume of rainwater). It is indicated that NO₃⁻ is highly correlated with H⁺ and may be the species responsible for the rainwater acidity at Largs. The same phenomenon has been observed elsewhere [9].

In Table 2, the communality factor, $h^2$, is the fraction of the total variance of each variable spanned by the factor components, or chemically, the fraction of all information on each variable covered by the four components. The values of communality for some variables in Table 2, Li⁺, NO₃⁻ and Cd²⁺, are less than 0.8. There are two possibilities for this. One is that the precision and accuracy of the determination of these variables are not good enough. Alternatively, to explain all the variance, another component or source for the variables may be required. For Li⁺ and Cd²⁺, many of the concentrations measured are near the detection limits of the analytical procedures (AAS), hence poor precision and accuracy may be the cause of the communality failure. A similar problem may exist for the wind direction, which was not measured accurately at the sampling site.

The fourth principal component in Table 2 is a factor of meteorology. If the study is mainly focussed on the chemical species, then wind and the sample volume can be excluded from the data when the principal components are extracted. When the cross-validation method is applied to the data, excluding the wind and rainwater volume, the results show there are three significant components, as expected. The species linked by a particular component are the same as they are in three of the four components obtained when all the variables are considered.

Estimation of the number of significant components by cross-validation is affected by the dimensions of the matrix to which the method is applied [14]. To investigate how important this effect is, cross-validation was also used on another two matrices, the correlation matrix of variables, R (size 12 × 12 in this case) and the correlation matrix of samples, R' (size 54 × 54), in addition to the mean-centred, variance scaled data, X (size 54 × 12). The number of significant components obtained is two from R and four from R'. Two components is unreasonable (the reason will be mentioned below). If the total number of samples is greater than 200, using R' will consume too much computer time, although it generally gives more accurate results. So, it is recommended that X should be used in this type of study to establish the number of significant components.

In Table 2, each principal component is composed of several variables. It is of interest to consider how they behave when the number of significant components is equal to the number of variables, i.e. 12, and then rotate them by the varimax technique. The results show that Na⁺, Mg²⁺, Li⁺, SO₄²⁻ and Cl⁻ still have maximum loadings on the first component, Cd²⁺ on the second, Pb²⁺ and Cu²⁺ on the third, volume on the fourth, H⁺ and NO₃⁻ on the fifth, and wind on the sixth. No variables have significant loadings on the other four components. This indicates that as long as the correlation between the variables is reasonably high, the variables cannot be separated during varimax-rotation, even though the number of components increases. This behaviour can confirm whether the variables come from the same source or not.
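The communality and PVAF entries of Table 2 follow directly from the rotated loadings, which the sketch below illustrates. The loadings are copied from Table 2; the function names are assumptions, and the formulas used are the standard ones (communality as the sum of squared loadings of a variable, PVAF as the sum of squared loadings of a factor divided by the number of variables).

```python
VARIABLES = ["Wind", "Volume", "H", "Na", "Mg", "Li",
             "Pb", "Cd", "Cu", "SO4", "NO3", "Cl"]

# Varimax loadings from Table 2; rows follow VARIABLES, columns are F1..F4.
LOADINGS = [
    [0.3631, -0.3247, -0.0188, 0.7284],
    [0.1539, 0.0986, 0.1113, 0.8907],
    [0.3561, -0.8197, -0.1687, 0.1966],
    [-0.9454, 0.1805, -0.0868, -0.1745],
    [-0.9446, 0.1752, 0.0398, -0.1376],
    [-0.8835, 0.0289, -0.0445, -0.1146],
    [-0.1268, -0.1938, -0.9149, 0.0797],
    [-0.0457, -0.8282, 0.1916, -0.0771],
    [-0.1450, 0.0514, -0.9102, -0.2018],
    [-0.8761, 0.0502, -0.2857, -0.1540],
    [0.2151, -0.7879, -0.3460, 0.0718],
    [-0.9522, 0.1738, -0.0767, 0.1927],
]

def communalities(loadings):
    """Communality h2 of each variable: sum of its squared loadings."""
    return [sum(v * v for v in row) for row in loadings]

def pvaf(loadings):
    """Percentage of variance accounted for by each factor (autoscaled data)."""
    n_var = len(loadings)
    n_fac = len(loadings[0])
    return [100.0 * sum(row[k] ** 2 for row in loadings) / n_var for k in range(n_fac)]

h2 = dict(zip(VARIABLES, communalities(LOADINGS)))
low_communality = [name for name, value in h2.items() if value < 0.8]
factor_percentages = pvaf(LOADINGS)
```

Computed this way, the variables with communality below 0.8 are wind, Li, Cd and NO₃, which is consistent with the discussion of poorly explained variables above.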
Classification of samples into groups

Classification of samples into groups is possible with PCA. There are two requirements in classification. One is that the difference between groups should be as great as possible. Another is that differences within the group should be as small as possible. To meet the two requirements, samples are classified by choosing their maximum absolute loadings on the factors, that is, the samples which have the maximum loadings (either positive or negative) on the same component are assigned to a group. The number of principal components determines the number of groups. In essence this procedure is the same as clustering samples according to their correlations, but it would be difficult to obtain visually, by cluster analysis, the numbers of sample categories and the percentage of information covered by each group that is obtained by PCA.

Table 3 gives the results for the mean concentrations of variables and the percentage of variance spanned by the group obtained by PCA. Classifying samples into 3 or 4 groups is reasonable and acceptable for the data, because the mean concentrations of variables are significantly different. In the case of the four groups in the upper half of Table 3, the species in the first group, associated with a southwest wind, have mixed origins of man-made and marine sources, because the concentrations of all species are high and those of Pb²⁺ and Cu²⁺ are dominant. The second group, which originates from the west, has its origin in marine aerosol. The third group is from the southeast, but the abundant rainfall dilutes the concentration of all the species. The fourth group is also from the southeast and has a man-made pollutant source. For the three groups in the lower half of Table 3, the character of each group is more distinct. So it may be more reasonable to classify samples into three groups. If the samples are divided into 2 groups, the intergroup difference becomes vague. If the number of principal components is set to five, no samples have maximum loadings on the fifth component and the group is empty. This confirms that the number of principal components in the data should be three or four, as established in the first analysis.

TABLE 3
Mean concentration of variables in each group

Group    NS a  PVAG b    Wind  Volume      pH      Na      Mg      Li      Pb      Cd      Cu     NO3      Cl     SO4
                  (%)           (cm³)         µg ml⁻¹ µg ml⁻¹  µg l⁻¹  µg l⁻¹  µg l⁻¹  µg l⁻¹ µg ml⁻¹ µg ml⁻¹ µg ml⁻¹
1          22    36.6  -0.528    43.0    4.80    5.81    0.72    0.11    1.87    0.09    4.10    0.77    12.2    3.22
2          19    36.2  -0.840    43.9    5.04   10.61    1.47    0.19    0.74    0.17    0.96    0.51    22.8    4.35
3           7    13.6   1.761   283.7    4.54    2.96    0.40    0.07    0.68    0.11    0.43    0.49    5.57    2.34
4           6    12.0   1.628    80.3    4.06    1.88    0.21    0.11    1.85    0.28    0.49    1.91    4.08    2.04

1'         21    36.3  -0.445    45.0    4.56    5.00    0.65    0.10    1.85    0.10    4.04    0.78   10.76    3.00
2'         20    37.0  -0.903    41.9    5.06   11.22    1.51    0.20    0.52    0.16    1.18    0.51   23.84    4.54
3'         13    16.9   1.700   189.9    4.25    2.46    0.31    0.09    1.22    0.19    0.45    1.14    4.88    2.20

a Number of samples in the group.
b Percentage of variance accounted for by the group.

After classification of samples, the correlation between species within groups was also investigated and the results are listed in Table 4. The correlations of some pairs of species differ greatly between groups. For instance, H⁺ and NO₃⁻ are highly or significantly correlated in the first and third groups (pollutant sources), but uncorrelated in the second (seawater). These results can also confirm whether the classification of samples is reasonable, although the effect of different numbers of samples on the correlations should be taken into account when the comparison is made.

TABLE 4
Correlations between some species within groups

Pair        Correlations in group
               1       2       3
Na/Mg       0.94    0.92    0.99
Na/Li       0.81    0.81    0.47
Na/SO4      0.78    0.87    0.78
Na/Cl       0.97    0.97    0.99
Mg/Li       0.76    0.77    0.42
Mg/SO4      0.66    0.84    0.79
Mg/Cl       0.95    0.94    0.98
Pb/Cu       0.86    0.53    0.20
Pb/SO4      0.71    0.29   -0.17
Pb/NO3      0.42   -0.04    0.39
H/NO3       0.87    0.17    0.62
H/SO4      -0.21   -0.31   -0.27
Cd/NO3      0.55    0.27   -0.15

It should be pointed out that the order of principal components in Table 2 (for variables) may not be consistent with that in Table 3 (for samples). This is because the principal components take the order of the fraction of variance of the data they account for. Nevertheless, the chemical meaning of principal components in Table 3 is not difficult to identify if the mean concentration of species is examined.

Evaluation of methods to establish the number of significant components

There are several other approaches to establish the significant components present in a set of data besides the cross-validation method. All of these methods have been evaluated in this work and the results are listed in Table 5. If the correlation matrix method is used, it is advocated [20] that only those components that correspond to eigenvalues greater than unity be retained. Another common method includes as many components as can span a certain fraction, e.g. 90%, of the variance of the data [21]. Application of the ratio of two connected eigenvalues (RCE) has also been employed [22]. The number of significant components is determined at the maximum value of the ratio (when there is only one maximum) or at the second maximum value (when there is more than one maximum). Other criteria [23], such as real error (RE), embedded error (IE), and indicator function (IND) have been investigated as well.

In the present problem (see Table 5), the first four eigenvalues are greater than unity and they span about 86% of the total variance. RCE has the maximum and IND and PRESS have minima at the fourth eigenvalue. The values of RE and IE decline smoothly after the fourth eigenvalue. All of those different approaches consistently show there are four significant components in the data, although they can give discordant results in some problems [14,18]. Besides cross-validation, RCE is worthy of attention because it is easy to calculate and the result is distinct.

TABLE 5
Evaluation of the different approaches for assessing the number of significant components a

No.   Eigenvalue    FV b     RCE      RE      IE   IND (×10³)   PRESS
1         5.0503   0.458    2.30   0.801   0.231         6.62   0.797
2         2.3934   0.658    1.78   0.528   0.152         5.28   0.749
3         1.3422   0.770    1.21   0.353   0.102         4.36   0.741
4         1.1077   0.862    1.90   0.163   0.049         2.55   0.705
5         0.5819   0.911    1.45   0.132   0.038         2.69   0.741
6         0.4004   0.944    1.36   0.121   0.035         3.36   0.773
7         0.2933   0.968    1.54   0.110   0.031         4.40
8         0.1897   0.984    2.29   0.085   0.027         5.32
9         0.0827   0.991    1.23   0.072   0.024         8.00
10        0.0670   0.997    2.62   0.066   0.019        16.50
11        0.0255   0.999    1.96   0.052   0.015        52.00
12        0.0130   1.000

a Values in italics indicate the number of significant components determined by the approach.
b Fraction of variance spanned by the factors counted here.

This study has confirmed that principal component analysis based on NIPALS coupled with cross-validation is a robust method for the study of multivariate data. The NIPALS approach is convenient for programming; it is also necessary for a good understanding of PCA. In most real problems, it converges quickly. The results obtained are the same as those calculated by other methods. Nevertheless, it has the potential hazard of non-convergence [15], especially when the method is used with a reduced matrix where some data are deleted and two or more very similar eigenvalues exist.

One of the authors, P. Zhang, gratefully acknowledges the award of a subsistence grant from the Chinese Academy of Sciences.
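As a supplementary illustration of the simpler criteria compared in Table 5, the sketch below applies the eigenvalue-greater-than-unity rule, the cumulative variance fraction, the RCE rule and the indicator function to the tabulated values. The function names are assumptions. Two further assumptions are made: an end point of the RCE series is allowed to count as a local maximum, and IND is computed from RE as RE/(c − n)² with c = 12 variables, which reproduces the IND column of Table 5. The first eigenvalue is used as printed, although the printed FV and RCE entries suggest a value nearer 5.5; the conclusions below do not depend on it.

```python
EIGENVALUES = [5.0503, 2.3934, 1.3422, 1.1077, 0.5819, 0.4004,   # first value as printed
               0.2933, 0.1897, 0.0827, 0.0670, 0.0255, 0.0130]
RE = [0.801, 0.528, 0.353, 0.163, 0.132, 0.121, 0.110, 0.085, 0.072, 0.066, 0.052]

def kaiser_count(eigenvalues):
    """Number of eigenvalues greater than unity (correlation-matrix rule)."""
    return sum(1 for ev in eigenvalues if ev > 1.0)

def variance_count(eigenvalues, fraction):
    """Smallest number of components spanning at least the given variance fraction."""
    total, running = sum(eigenvalues), 0.0
    for k, ev in enumerate(eigenvalues, start=1):
        running += ev
        if running / total >= fraction:
            return k
    return None

def rce(eigenvalues):
    """Ratios of connected (consecutive) eigenvalues."""
    return [a / b for a, b in zip(eigenvalues, eigenvalues[1:])]

def rce_count(eigenvalues):
    """Position of the only local maximum of RCE, or of the second if there are several."""
    r = rce(eigenvalues)
    peaks = [k + 1 for k in range(len(r))
             if (k == 0 or r[k] > r[k - 1]) and (k == len(r) - 1 or r[k] > r[k + 1])]
    return peaks[0] if len(peaks) == 1 else peaks[1]

def ind_count(re_values, n_variables):
    """Number of components at which IND = RE / (c - n)**2 is smallest."""
    ind = [re / (n_variables - n) ** 2 for n, re in enumerate(re_values, start=1)]
    return ind.index(min(ind)) + 1
```

Applied to the tabulated values, the unity rule, the RCE rule and the IND minimum all point to four components, whereas a 90% variance threshold would retain five.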
REFERENCES

1 R.M. Harrison, J.N.B. Bell and J.N. Lester, in R. Perry (Ed.), Acid Rain: Scientific and Technical Advances, Selper Ltd., London, 1987.
2 J.C. White (Ed.), Acid Rain, The Relationship between Sources and Receptors, Elsevier, Amsterdam, New York, 1988.
3 The Effects of Acid Deposition on Buildings and Building Materials in the United Kingdom, Building Effects Review Group Report, HMSO, London, 1989.
4 Acid Deposition in the United Kingdom, Warren Spring Laboratory, London, 1983.
5 M. Flrbery, Anal. Chim. Acta, 191 (1986) 75.
6 S. Wold, K. Esbensen and P. Geladi, Chemometrics Intell. Lab. Systems, 1 (1987) 135.
7 K. Mardia, J. Kent and J. Bibby, Multivariate Analysis, Academic Press, London, 1980.
8 W.R. Dillon and M. Goldstein, Multivariate Analysis, Methods and Applications, Wiley, New York, 1984.
9 E.J. Knudson, D.L. Duewer, G.D. Christian and T.V. Larson, in B.R. Kowalski (Ed.), Chemometrics: Theory and Application (ACS Symposium Series 52), ACS, Washington, DC, 1977, pp. 80-116.
10 P.K. Hopke, Trends Anal. Chem., 4 (1985) 104.
11 J. Saltiel and D.W. Eaker, J. Am. Chem. Soc., 106 (1984) 7624.
12 X. Liu, P. Van Espen, F. Adams, S. Yan and M. Vanbelle, Anal. Chim. Acta, 200 (1987) 421.
13 M.A. Sharaf, D.L. Illman and B.R. Kowalski, Chemometrics, Wiley, New York, 1986.
14 S. Wold, Technometrics, 20 (1978) 397.
15 P. Geladi and B.R. Kowalski, Anal. Chim. Acta, 185 (1986) 1.
16 H.F. Kaiser, Educational and Psychological Measurement, 19 (1959) 413.
17 L.S. Ramos, K.R. Beebe, W.P. Carey, E. Sanchez, B.C. Erickson, B.E. Wilson, L.E. Wangen and B.R. Kowalski, Anal. Chem., 58 (1986) 294R.
18 H.T. Eastment and W.J. Krzanowski, Technometrics, 24 (1982) 73.
19 M.E. Kargancm and B.R. Kowalski, Anal. Chem., 58 (1986) 2300.
20 J.N.R. Jeffers, Applied Statistics, 16 (1967) 225.
21 M.S. Watanabe and N. Pakvasa, Subspace Methods in Pattern Recognition, Proc. 1st Int. Joint Conf. on Pattern Recognition, Washington, DC, IEEE Cat. No. 73 CHO 821-9C, 1973.
22 X. He, H. Li and H. Shi, Fenxi Huaxue, 14 (1986) 34.
23 E.R. Malinowski and D.G. Howery, Factor Analysis in Chemistry, Wiley, New York, 1980.