Chemometrics and Intelligent Laboratory Systems 41 Ž1998. 127–134
Stabilization of cyclic subspace regression Jason M. Brenchley a , Patrick M. Lang b, Reinaldo G. Nieves a , John H. Kalivas
a,)
a b
Department of Chemistry, Idaho State UniÕersity, Pocatello, ID 83209, USA Department of Mathematics, Idaho State UniÕersity, Pocatello, ID 83209, USA
Abstract Developments that produced a stable numerical algorithm for cyclic subspace regression ŽCSR. are described. This simple algorithm produces solutions for principal component regression, partial least squares, least squares, and other related intermediate regression methodologies by exactly the same procedure. The development begins with a theoretical CSR algorithm that should produce accurate results. However, when used in numerical form, it does not produce accurate results because numbers are generated which are too small for most computational tools to accurately represent. Several strategies to deal with this numerical instability are described. Results obtained using each approach are reported as applied to two data sets. The development ends with presentation of the stable algorithm as well as MATLAB code for the algorithm. q 1998 Elsevier Science B.V. All rights reserved. Keywords: Cyclic subspace regression; Moore–Penrose generalized inverse; Krylov sequence; Gram–Schmidt; Lanczos
1. Introduction Three regression techniques commonly used in analytical chemistry are partial least squares ŽPLS., principal component regression ŽPCR., and least squares ŽLS.. These three techniques have been shown to be special cases of a more general, easily implemented regression method called cyclic subspace regression ŽCSR. w1x. The focus of this paper is to discuss steps taken in creation of the stable numerical CSR algorithm applied in Ref. w1x. In doing so, the authors show that well known protocols for numerically stabilizing theoretically valid procedures sometimes are of limited utility. By understanding the steps leading to numerical stabilization of CSR, realization of the relationships between the practical sta-
)
Corresponding author.
ble numerical algorithm and the theoretical form of CSR can be achieved. MATLAB code for the algorithm is provided in this paper. With CSR, some of the interrelationships between PCR, PLS, and LS can be explained. Other studies have investigated these interrelationships in greater details w1–3x. It should be noted that CSR is different from continuum regression w4,5x and the H-Principle w6x which are other methods describing interrelationships between PCR, PLS, and LS.
2. Theory and primary problem In all that follows, R denotes a m = p matrix of rank k w7x containing near infrared ŽNIR. calibration spectra, where m is the number of samples and p is the number of wavelengths over which each sample
0169-7439r98r$19.00 q 1998 Elsevier Science B.V. All rights reserved. PII S 0 1 6 9 - 7 4 3 9 Ž 9 8 . 0 0 0 2 9 - X
128
J.M. Brenchley et al.r Chemometrics and Intelligent Laboratory Systems 41 (1998) 127–134
is measured. The m = 1 vector c contains quantitative information about the desired prediction property. The p = 1 vector runk represents the responses of the unknown sample. The scalar quantity c unk denotes the quantified value to be predicted for each respective runk . A superscript t indicates transpose operations, a superscript q denotes the Moore– Penrose generalized inverse and 5x 5 2 refers to the Euclidean norm of a vector x w8x. The following steps represent the theoretical CSR algorithm presented in Ref. w1x. It is numerically unstable and should not be implemented as presented, for reasons that will become apparent in the discussion that follows. Step 1: Obtain a singular value decomposition of R, i.e., find orthogonal m atrices U s Žu 1 ,u 2 , . . . ,u k . and V s Žv1 ,v2 , . . . ,vk . such that R s U S V t , where S s Ž si j . is a m = k matrix whose entries are all zero except for si i which has the value si s l i ,i s 1,2, . . . ,k ŽThe value l i is the ith largest eigenvalue of Rt R..
first k eigenvectors to form j factors whereas PCR uses only a scaled sum of the first j eigenvectors.. The numerical instability of this CSR algorithm arises from the nature of the vectors used in creating A lj and B lj in Step 6. Specifically, since c s Ý mis1 a i u i where a i s c t u i , it follows that z l s R t Pl c s jy 1 Ý lis 1 s i a i vi , A lj s Ž Ý lis 1 s i 2 nq 1 a i vi . ns 0 , and B lj js1 l 2 nq2 s Ž Ý is1 si a i u i . ns0 . Should some of the singular values si be small, as is the case with many matrices of spectroscopic data, the above shows that coefficients multiplying vi and u i in A lj and B lj can be extremely small. Numerically, these small values get set to zero resulting in theoretically full rank j matrices A lj and B lj being rank deficient. This rank deficiency is problematic when the inversion involving B lj is implemented in Step 7. The following discusses several approaches that were taken to bypass the numerical Žbut not theoretical. problems.
(
Step 2: Set a s U Sq V t runk ŽNote, a s ŽRt .q runk .. Step 3: Let j and l be fixed integers satisfying 1 F j F k and j F l F k. Step 4: Set Pl s Žu 1 ,u 2 , . . . ,u l .Žu 1 ,u 2 , . . . ,u l . t , where the u vectors come from the first l columns of U. Step 5: Set z l s Rt Pl c ŽNote, z l s Ýlis1 si a i vi , where a i s c t u i .. Step 6: Set Alj s wz l ŽRt R.z l ŽRt R. 2 z l , . . . , ŽRt R. jy1 z l x and B lj s RA lj . Step 7: Set cˆlj s a t B lj ŽŽB lj . t B lj .y1 ŽB lj . t c. In this notation A lj and B lj represent matrices based on j factors which are formed from scaled sums of the first l eigenvectors from V. The value cˆlj represents the estimate of c unk produced by CSR using the corresponding j factors and the first l eigenvectors. By letting j range from 1 to k and l range from j to k, it is apparent that the algorithm produces k 2r2 q kr2 prediction values. For a fixed j with l s j, the value cˆ jj is that of PCR using j factors, if l s k, the value cˆkj is that of PLS using j factors, and if l s j s k, the value cˆkk is that of LS. ŽNote that in Step 5, PLS always uses a scaled sum of the
3. Experimental 3.1. Computations Computations were performed with MATLAB 4.2C ŽThe MathWorks, MA. on an IBM compatible pentium computer under Windows 95. Data was mean centered before analysis. In all cases, correct PCR, PLS, and LS predictions were obtained by PCR, PLS, and LS algorithms described in Ref. w4x. In order to validate CSR solutions obtained for each stabilization attempt, standard error of validation ŽSEV. w9x values from CSR PCR, CSR PLS, and CSR LS were compared to SEV values obtained using PCR, PLS, and LS algorithms. Respective SEV values were compared to seven decimal places. 3.2. Data Two data sets of NIR spectroscopic data were studied. One data set ŽFISH. contains NIR spectra of 38 fish samples measured at nine wavelengths w10x. The fat content of each fish sample is known. Twenty-eight samples formed the calibration set and 10 samples served as the validation set. The other data set ŽWHEAT. contains NIR spectra of 38 wheat
J.M. Brenchley et al.r Chemometrics and Intelligent Laboratory Systems 41 (1998) 127–134
samples measured at 6 wavelengths w11x. The protein content of each wheat sample is known. Twelve samples formed the calibration set and 26 samples served as the validation set.
4. Results and discussion
Table 1 Singular values for WHEAT and FISH R matrices WHEAT
FISH
1.513=10 2 3.368=10 1 1.672=10 1 5.587 1.234 9.023=10y1
1.506 1.228=10y1 6.360=10y2 2.958=10y2 1.319=10y2 3.461=10y3 3.203=10y3 2.035=10y3 8.178=10y4
4.1. Original algorithm As can be seen in Fig. 1a and c, the original algorithm, presented earlier, produced correct PCR and PLS solutions corresponding to the incorporation of four factors Ž j s 4. for the WHEAT data. Fig. 1b and d show that this CSR algorithm produced correct PCR and PLS solutions up to three factors for the FISH data. It should be noted that the deviations for CSR PCR are not large for FISH based on four and five factors. The reason deviations occur sooner for the FISH data is due to the small magnitude of singular values for R. Listed in Table 1 are the singular values of R for WHEAT and FISH. Table 1 reveals that the singular values for FISH are closer to zero than are those for WHEAT. By inspecting condition numbers and indices in Table 2 for the two data sets further insight is gained. The condition number for a matrix is determined by dividing the first and largest singular value, s 1 , by the
129
smallest singular value, s k . A matrix is said to be well-conditioned when the condition number is one and ill-conditioned when the condition number is much greater than one. The ith condition index is formed from the ratio of s 1 to si . Table 2 and Fig. 1a and b demonstrate that the original CSR algorithm deviates from actual PCR results when the condition index becomes significantly greater than 27 and 23 for WHEAT and FISH, respectively. This problem is further substantiated by looking at Tables 3–5 which contain singular values and condition numbers of PCR B lj matrices for WHEAT and FISH. Again, singular values for WHEAT are substantially larger than those for FISH. It appears that the original CSR algorithm becomes unstable when condition numbers of PCR B lj matrices range from 10 10 to 10 16 . ŽIt should be noted that for the described computers, inversion routines utilized by MATLAB work poorly when condition numbers approach such numbers.. Specifically, in Step 7 the inverse of ŽB lj . t B lj must be computed. The inverse will exist if and only if B lj is
Table 2 Condition indices for WHEAT and FISH R matrices
Fig. 1. Comparison of SEV values using the original CSR algorithm and traditional algorithms. Actual PCR Ž-`-. and CSR PCR Ž –)– . results for Ža. WHEAT and Žb. FISH. Actual PLS Ž-q-. and CSR PLS Ž –=– . results for Žc. WHEAT and Žd. FISH.
Singular value ratio
WHEAT
FISH
s1 rs1 s1 rs 2 s1 rs 3 s 1 r s4 s 1 r s5 s 1 r s6 s1 rs 7 s1 rs 8 s1 rs 9
1 4.492 9.049 2.708=10 1 1.226=10 2 1.677=10 2
1 1.226=10 1 2.368=10 1 5.091=10 1 1.142=10 2 4.351=10 2 4.702=10 2 7.400=10 2 1.841=10 3
J.M. Brenchley et al.r Chemometrics and Intelligent Laboratory Systems 41 (1998) 127–134
130
Table 3 WHEAT singular values for PCR B lj matrices B11 3.457 = 10
B 22 3
B 33 7
B 44 11
4.954 = 10 8.771 = 10 2
6.101 = 10 1.051 = 10 6 8.629 = 10 2
8.105 = 10 1.133 = 10 9 2.537 = 10 5 1.422 = 10 1
Table 4 WHEAT and FISH condition numbers for PCR B lj matrices B tj
WHEAT
FISH
B11 B 22 B 33 B 44 B 55 B 66 B77 B 88 B99
1 5.648=10 4 7.070=10 8 5.700=10 14 7.030=10 16 1.057=10 20
1 8.269=10 2 5.137=10 5 1.482=10 10 1.292=10 14 3.804=10 16 6.066=10 16 4.296=10 17 1.804=10 18
B 55 15
B 66 20
1.430 = 10 24 1.452 = 10 15 1.988 = 10 10 1.715 = 10 7 1.353 = 10 4 0
1.077 = 10 1.280 = 10 12 7.112 = 10 7 1.532 = 10 3 0
point is a function of the inadequacy of the computational tools described in Section 3 to accurately represent small numbers w12x. Similar results for singular values and condition numbers were obtained for WHEAT and FISH using PLS B lj matrices. 4.2. Moore–Penrose generalized inÕerse
rank j. Table 3 shows that successive PCR B lj matrices for WHEAT are indeed rank j until B 55 which is rank 4. Hence, the inverse cannot be computed. Fig. 1b shows only small deviations for 4 and 5 factor PCR B lj matrices for FISH. The small deviations stem from the fact that MATLAB found an inverse that is a poor approximation of the true inverse. Beyond this, respective B lj matrices have become even closer to singular and further deviations occur. The inability of MATLAB to produce correct solutions beyond this
The first attempt to stabilize the algorithm employed the Moore–Penrose generalized inverse at Step 7 of the original algorithm. In this case, Step 7 was modified to read cˆ lj s a t B lj Ž B lj . q c. The Moore–Penrose generalized inverse of B lj is obtained by first performing a singular value decomposition on B lj, producing B lj s Uj S j Vjt , where Uj , S j , and Vj are m = j, j = j, and j = j, respectively, and t then computing Vj Sq j U j . Fig. 2a and c show that CSR PCR and PLS values, respectively, are not correct for the WHEAT data after j s 4. As expected, no improvement is possible for WHEAT due to B 55 and B 66 being rank 4 and 5 respectively. Fig. 2b and d reveal that CSR PCR and PLS values differ from actual PCR and PLS solutions after j s 5 for the FISH data. This represents an improvement of 2 factors
Table 5 FISH singular values for PCR B lj matrices B11
B 22
B 33
B 44
3.212 = 10 1 7.961 = 10 1 1.894 = 10 2 4.172 = 10 2 9.628 = 10y2 1.111 = 10y1 1.126 = 10y1 3.687 = 10y4 3.994 = 10y4 2.816 = 10y8
B 55
B 66
B 77
B 88
B99
9.467 = 10 2 1.129 = 10y1 4.048 = 10y4 3.285 = 10y8 7.327 = 10y1 2
2.147 = 10 3 1.130 = 10y1 4.059 = 10y4 3.330 = 10y8 7.948 = 10y12 6.961 = 10y1 4
4.870 = 10 3 1.130 = 10y1 4.061 = 10y4 3.338 = 10y8 8.052 = 10y12 1.333 = 10y13 8.029 = 10y1 4
1.104 = 10 4 1.130 = 10y1 4.061 = 10y4 3.340 = 10y8 8.076 = 10y12 3.959 = 10y13 1.169 = 10y13 2.570 = 10y1 4
2.505 = 10 4 1.130 = 10y1 4.061 = 10y4 3.340 = 10y8 8.092 = 10y12 1.661 = 10y12 2.204 = 10y13 8.887 = 10y14 2.310 = 10y1 4
J.M. Brenchley et al.r Chemometrics and Intelligent Laboratory Systems 41 (1998) 127–134
Fig. 2. SEV values using the Moore–Penrose generalized inverse. Actual PCR Ž-`-. and CSR PCR Ž –)– . results for Ža. WHEAT and Žb. FISH. Actual PLS Ž-q-. and CSR PLS Ž –=– . results for Žc. WHEAT and Žd. FISH.
compared to the original CSR results shown in Fig. 1b and d. 4.3. Gram–Schmidt orthogonalization The second attempt at stabilization was to use the Gram–Schmidt method of orthogonalization on B lj to form another matrix with orthogonal columns, G lj. This matrix was then used in Step 7 instead of B lj, i.e., cˆlj s a t G lj ŽŽG lj . t G lj .y1 ŽG lj . t c. The Gram–Schmidt methodology has been successfully used to orthogonalize matrices similar to those described in this paper w13x. Fig. 3a shows that CSR PCR with this Gram–Schmidt orthogonalization produced correct PCR results for WHEAT. The CSR PLS results for WHEAT were also close as seen in Fig. 3c. However, Fig. 3b reveals that CSR PCR solutions for FISH are only correct up to j s 6. After this, the algorithm becomes unstable with the use of G lj. The reasoning for the achievement of correct PCR solutions for WHEAT and FISH up to j s 6 can be found in Tables 6 and 7. The singular values of PCR G lj matrices for WHEAT and FISH show that B lj matrices for WHEAT were better orthogonalized than those for FISH. In this case, a condition number of approximately 10 10 indicates instability. PLS solutions are not achievable due to the difficulty associated with the orthogonalization of the PLS B lj matri-
131
Fig. 3. SEV values using the Gram–Schmidt method of orthogonalization with B lj. Actual PCR Ž-`-. and CSR PCR Ž –)– . results for Ža. WHEAT and Žb. FISH. Actual PLS Ž-q-. and CSR PLS Ž –=– . results for Žc. WHEAT and Žd. FISH.
ces. By the time the algorithm has formed the PLS B lj matrices, they are already too linearly dependent for the Gram–Schmidt algorithm to orthogonalize. This stems from the fact that PLS uses all the eigenvectors and singular values in forming each B lj. Table 1 shows that many of these singular values are small for the FISH data. The Gram–Schmidt orthogonalization was then applied to the A lj matrices formed in Step 6. By orthogonalizing A lj instead of B lj, correct PCR solutions were again obtained for the WHEAT data, see Fig. 4a. The PLS solutions obtained for the WHEAT were also very close to the actual PLS solutions and were exactly the same solutions as those obtained by orthogonalizing B lj. For the FISH data, correct PCR solutions were obtained up to j s 6 and no correct
Table 6 WHEAT singular values for orthogonalized PCR G lj matrices G11
G 22
G 33
G44
G 55
G66
1
1 1
1 1 1
1 1 1 1
1 1 1 1 1
1.414 1.000 1.000 1.000 1.000 5.000=10y4
J.M. Brenchley et al.r Chemometrics and Intelligent Laboratory Systems 41 (1998) 127–134
132
Table 7 FISH singular values for the orthogonalized PCR G lj matrices G11
G 22
G 33
G 44
G 55
G66
G77
G 88
G 99
1
1 1
1 1 1
1.000 1.000 1.000 1.000
1.414 1.000 1.000 1.000 2.497 = 10y3
1.732 1.000 1.000 1.000 7.863 = 10y3 2.185 = 10y8
2.000 1.000 1.000 1.000 7.975 = 10y3 4.567 = 10y9 3.222 = 10y1 0
2.236 1.000 1.000 1.000 2.251 = 10y2 5.286 = 10y9 2.917 = 10y10 1.839 = 10y1 0
2.450 1.000 1.000 1.000 6.856 = 10y3 7.146 = 10y9 3.713 = 10y10 2.858 = 10y10 2.207 = 10y1 0
PLS solutions were computed. Condition numbers of approximately 10 11 implied instability. 4.4. Lanczos method of orthogonalization The fourth attempt at stabilization was the Lanczos method w12,14x, which is similar to Gram – Schmidt orthogonalization. The Lanczos method is commonly implemented in order to stabilize computa tio n s in v o lv in g m a tr ic e s o f th e f o r m wx,Ax, . . . ,Any 1 xx. Such matrices are often referred to as Krylov matrices. The columns of these matrices are the first n terms of a Krylov sequence wAi xx`is0 . After implementation of the Lanczos method to B lj, Fig. 5 shows that correct results were achieved for
Fig. 4. SEV values using the Gram–Schmidt method of orthogonalization with A lj . Actual PCR Ž-`-. and CSR PCR Ž –)– . results for Ža. WHEAT and Žb. FISH. Actual PLS Ž-q-. and CSR PLS Ž –=– . results for Žc. WHEAT and Žd. FISH.
WHEAT, but the FISH results are still incorrect. Solutions for CSR PCR and CSR PLS were correct up to seven factors for the FISH data. In this case, a condition number on the order of 10 14 or bigger indicated instability. 4.5. Stabilized algorithm The final attempt at stabilization of the CSR algorithm entailed creating an algorithm that closely resembles the PLS-1 algorithm designed by Manne w15x, which is a coupled iterative algorithm that produces two orthogonal sets of vectors for the subspaces spanned by the columns of A kj and B kj , respectively. The developed algorithm described below
Fig. 5. SEV values using the Lanczos method. Actual PCR Ž-`-. and CSR PCR Ž –)– . results for Ža. WHEAT and Žb. FISH. Actual PLS Ž-q-. and CSR PLS Ž –=– . results for Žc. WHEAT and Žd. FISH.
J.M. Brenchley et al.r Chemometrics and Intelligent Laboratory Systems 41 (1998) 127–134
133
The Z lj and Wlj matrices are the loading and score arrays, respectively. When l s j, the value cˆ jj is the PCR solution with j factors, when l s k the value cˆkj is the PLS solution with j factors, and when j s l s k, the value cˆkk is the LS solution. All other cˆlj are intermediate regression values more fully described in Ref. w1x. Fig. 6 shows that this algorithm produced correct PLS, PCR, and LS solutions for both data sets studied. Successful applications of this stable CSR algorithm to several other NIR spectral data sets are described in Refs. w1–3x. These data sets are substantially larger than those presented in this paper, with more than 60 samples and 400 wavelengths. Fig. 6. Comparison of SEV values using the stabilized CSR algorithm and traditional algorithms. Actual PCR Ž-`-. and CSR PCR Ž –)– . results for Ža. WHEAT and Žb. FISH. Actual PLS Ž-q-. and CSR PLS Ž –=– . results for Žc. WHEAT and Žd. FISH.
produces orthonormal vectors that span the same subspaces spanned by the columns of A lj and B lj, respectively. The Appendix contains MATLAB code based on this algorithm. It differs only in that it produces all CSR solutions, i.e., it allows j to run from 1 to k and l to run from j to k.
5. Conclusion It is due to the inability of computational tools to handle numbers beyond their operational range that the theoretical CSR algorithm fails. The stable CSR algorithm presented here provides scientists with a single regression algorithm capable of producing accurate PCR, PLS, LS, and a finite number of intermediate regression solutions.
Step 1: Repeat step 1 of the original CSR algorithm. Step 2:
Let j and l be fixed integers satisfying 1 F j F k and j F l F k. Step 3: Set Pl s Žu 1 ,u 2 , . . . ,u l .Žu 1 ,u 2 , . . . ,u l . t. Rtl s R
and
yl1 s Pl c
Step 4:
Set
Step 5:
For i s 1,2, . . . , j do the following steps; t Ž R il . yli Ži. Set z il s . t 5 Ž R il . yli 5 2 Žii. Set w li s
R il z il
Step 7:
. 5R il z il 5 2 Žiii. Set R iq1 s ŽI y w li Žw li . t .R il . l Živ. Set yliq1 s ŽI y w li Žw li . t .yli. Create matrices Z lj s Žz 1l ,z 2l , . . . ,z lj ., Wlj s Ž w l1 , w l2 , . . . ,w l j . , a n d R lj s Wlj ŽWlj . t RZ lj ŽZ lj . t Set pˆ lj s ŽR lj .q c.
Step 8:
t Set cˆlj s runk pˆ lj.
Step 6:
Acknowledgements The work described was supported by the Idaho State Board of Education, Camille and Henry Dreyfus ScholarrFellow Program for Undergraduate Institutions, NSF-Idaho EPSCor Program, and National Science Foundation grant number OSR-935039.
Appendix A MATLAB code for the CSR algorithm as defined by the final stabilized algorithm %R: response matrix of m samples and p wavelengths Ž m = p . %Rs: R saved %RP: projected R based on l eigenvectors and j factors
134
J.M. Brenchley et al.r Chemometrics and Intelligent Laboratory Systems 41 (1998) 127–134
%c: concentration vector of m samples Ž m = 1. %cs: c saved %runk : spectra of t samples to be predicted Ž t = p . %chat: predicted values Ž t = 1. of c unk %phat: estimated regression vector Ž p = 1. cs s c; Rs s R; %perform an SVD on R wU,S,Vx s svdŽR.; w t, p x s sizeŽrunk .; w m, p x s sizeŽR.; %perform the CSR algorithm for l s 1:rankŽRs. R s Rs; P s UŽ:,1:l .)UŽ:,1:l .X ; c s P)cs; W s zerosŽsizeŽc., 1.; Z s zerosŽsizeŽRX )c., 1.; for i s 1:l a s RX )c; z s arnormŽ a.; Z s wZ zx; b s R)z; w s brnormŽb.; W s wW wx; R s ŽeyeŽsizeŽU.. y w)wX .)R; c s ŽeyeŽsizeŽU.. y w)wX .)c; end W s WŽ:,2:l q 1.; Z s ZŽ:,2:l q 1.; for j s 1:l
%project R with the orthonormal vectors Z and W RP s WŽ:,1: j .)WŽ:,1: j .X )Rs)Z Ž:,1: j .)ZŽ:, 1: j .X ; %obtain the regression vector, and predicted c unk phat s pinvŽRP.)cs; chat s runk )phat; end end References w1x P.M. Lang, J.M. Brenchley, R.G. Nieves, J.H. Kalivas, J. Multivariate Analysis, in press. w2x J.H. Kalivas, submitted to Chemom. Intell. Lab. Syst. w3x G.A. Bakken, T.P. Houghton, J.H. Kalivas, submitted to Chemom. Intell. Lab. Syst. w4x A. Lorber, L.E. Wangen, B.R. Kowalski, J. Chemom. 1 Ž1987. 19–31. w5x M. Stone, R.J. Brooks, Stat. Soc. 2 Ž1990. 237–268. w6x A. Hoskldsson, Chemom. Intell. Lab. Syst. 2 Ž1994. 1–28. ¨ w7x J.M. Brenchley, U. Horchner, J.H. Kalivas, Appl. Spec¨ troscopy 5 Ž1997. 689–699. w8x J.H. Kalivas, P.M. Lang, Mathematical Analysis of Spectral Orthogonality, Marcel Dekker, New York, 1994. w9x Standard Practices for Infrared, Multivariate, Quantitative Analysis, E 1655, American Society for Testing and Materials, 1995. w10x T. Næs, Techometrics 27 Ž1985. 301. w11x T. Fearn, Appl. Stat. 32 Ž1983. 73. w12x G.H. Golub, C.F. Van Loan, Matrix Computations, The Johns Hopkins Univ. Press, Baltimore, 1983. w13x S. de Jong, J. Chemom. 4 Ž1995. 323–326. w14x G. Strang, Linear Algebra and its Applications, 3rd edn., Harcourt Brace Jovanovich, San Diego, 1986. w15x R. Manne, Chemom. Intell. Lab. Syst. 2 Ž1987. 187–197.