On the ℓ1 Procrustes problem


Future Generation Computer Systems 19 (2003) 1177–1186

Nickolay T. Trendafilov

Faculty of Computing, Engineering and Mathematical Sciences, University of the West of England, Bristol BS16 1QY, UK

Abstract

In this paper, the well-known Procrustes problem is reconsidered. The usual least squares objective function is replaced by a more robust one, based on a smooth approximation of the ℓ1 matrix norm. This smooth approximation to the ℓ1 Procrustes problem is solved making use of the projected gradient method. The Procrustes problem with a partially specified target is treated and solved as well. Several classical numerical examples from factor analysis (well known for their least squares Procrustes solutions) are solved with respect to the smooth approximation of the ℓ1 matrix norm goodness-of-fit measure. © 2003 Elsevier Science B.V. All rights reserved.

Keywords: Fitting configurations; Constrained optimization; Dynamical system on manifolds; Descent flows; Optimality conditions; Factor-pattern and reference-structure; Partially specified target; MATLAB

1. Introduction

This paper proposes a dynamical system approach to a problem from multivariate data analysis (called in jargon linear algebra with data matrices). Let A, B and C be given and fixed matrices with conformal dimensions, and consider the problem of finding a matrix X to solve:

minimize ‖AXC − B‖ subject to X ∈ C, (1)

where C is some smooth constraint matrix manifold. This matrix matching problem is known as the Procrustes problem (strictly speaking, it is called weighted if the matrix C is present). In most applications the norm in (1) is taken to be the Frobenius norm. Then the Procrustes problem (1) is an example of a least squares matrix matching problem (very popular since it was introduced by Gauss in 1795), i.e. least squares matching of AXC to the target B. In [25], the class of Schatten norms is used as discrepancy measure in (1), of which the Frobenius norm is a special case. In multivariate data analysis, the choice of the ℓ1 matrix norm in (1) may be of great importance if the target B contains 'wild' entries.

There are a number of Procrustes problems depending on the choice of the constraint manifold C. They arise in a great variety of scientific areas, e.g. multivariate data analysis (factor analysis, scaling, etc.) [8,14], shape and image analysis (global positioning systems, morphometrics, neuro and brain imaging, etc.) [6,16], numerical analysis [1], etc. A compact introduction to the Procrustes problems can be found in [8,11]. In this paper, we replace the Frobenius norm in (1), standard for most applications, with the ℓ1 matrix norm, where the sum of the moduli of the errors e_ij is minimized:

‖AXC − B‖_{ℓ1} = ‖E‖_{ℓ1} = ∑_{i,j} |e_ij|.

This criterion in fact predates least squares, going back to Laplace in 1786. Obviously, the ℓ1 norm goodness-of-fit criterion is less sensitive to large errors

0167-739X/$ – see front matter © 2003 Elsevier Science B.V. All rights reserved. doi:10.1016/S0167-739X(03)00043-8


e_ij than the least squares one. When least squares tries to compensate for these large errors, the solution of the problem may be 'shifted' towards an incorrect one. The least squares theory is well developed and works well if there are small changes in all data points (noise). Moreover, there are well-established estimates of how big this 'small' can be in order to have stable solutions. However, if there are large changes in a few data points (outliers), the least squares approach yields biased solutions [17,24]. Most frequently the outliers are simply some kind of recording/transmission errors. Sometimes the outliers can be interesting objects that are not typical for the data collected for the particular analysis. If the outliers can be detected and identified as errors, they are simply removed from further analysis or replaced following some imputation technique. Unfortunately, this is not always possible, and one is interested in alternative data analysis approaches and techniques which are less sensitive to outliers in the data. One possible strategy is to replace the least squares goodness-of-fit criterion with less sensitive ones such as the ℓ1 norm, the least median of squares, the Huber estimator and a number of redescending functions [17,24].

Example 1. Consider the following small artificial example influenced by the example described in [24]. It illustrates the poor behaviour of the least squares solution when an outlier is introduced. We are given two configurations A and B, both consisting of four points in two dimensions. The points in these configurations are vertices of squares around the origin, rotated towards each other by a 45° angle. The coordinates of the vertices of the initial (∗) and the target (○) squares A and B are collected in the following 4 × 2 matrices (see Fig. 1):

A = [  √2   0
        0  √2
      −√2   0
        0 −√2 ]

and

B = [  1   1
      −1   1
      −1  −1
       1  −1 ].
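For the least squares version of this matching problem (the classical OPP introduced just below), the solution has a well-known closed form: the orthogonal polar factor of A^T B, computable from its SVD. A sketch in Python/NumPy (the paper's own computations are in MATLAB; the variable names are ours, and depending on the rotation convention the result may appear transposed relative to the matrices printed in the text):

```python
import numpy as np

# Configurations from Example 1: two squares around the origin,
# rotated 45 degrees relative to each other.
A = np.array([[np.sqrt(2), 0.0],
              [0.0, np.sqrt(2)],
              [-np.sqrt(2), 0.0],
              [0.0, -np.sqrt(2)]])
B = np.array([[1.0, 1.0],
              [-1.0, 1.0],
              [-1.0, -1.0],
              [1.0, -1.0]])

# Least squares OPP: minimize ||AX - B||_F over orthogonal X.
# The minimizer is the orthogonal polar factor of A^T B.
U, _, Vt = np.linalg.svd(A.T @ B)
X = U @ Vt

print(np.round(X, 4))         # a 45-degree rotation
print(np.allclose(A @ X, B))  # here the fit is exact
```

Rerunning the same computation after inserting the outlier into B reproduces the distorted ℓ2 fit discussed next.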

Suppose we know the initial and the target configurations only. One is interested in finding a (rigid) rotation, i.e. an orthogonal 2 × 2 matrix X, that transforms the initial square A as close as possible, in terms of the Frobenius norm, to the target square B. This is the classical orthogonal Procrustes problem (for short, OPP):

minimize ‖AX − B‖_F subject to X^T X = XX^T = I_2. (2)

More generally, one may be interested in a rotation that permits stretching in one dimension and shrinking in another. Such a transformation is known as an oblique rotation, and in this case the classical Procrustes problem is called oblique (for short, ObPP):

minimize ‖AX − B‖_F subject to diag(X^T X) = I_2. (3)

The solutions of the orthogonal and oblique Procrustes problems, where the initial configuration A is fitted to the target B in the least squares (ℓ2) sense, are a 45° rotation. The rotation matrices are

X_OPP = [ 0.7071 −0.7071
          0.7071  0.7071 ]

and

X_ObPP = [  0.7071  0.7071
           −0.7071  0.7071 ].

In Fig. 1, these Procrustes solutions are denoted by (+) and they exactly coincide with the desired target B (○). A single outlier is introduced in the target by multiplying by −10 the second coordinate of the first point of the configuration B, i.e. the new target becomes:

B = [  1 −10
      −1   1
      −1  −1
       1  −1 ].

Then the ℓ2 solutions of the standard OPP and ObPP are as follows:

X_OPP = [  0.0000 −1.0000
          −1.0000 −0.0000 ]

and

X_ObPP = [  0.7071 −0.9762
           −0.7071  0.2169 ].

The orthogonal Procrustes rotation X_OPP is simply a permutation of the initial points. In fact the rotated


Fig. 1. Plot of ℓ2 orthogonal and oblique Procrustes rotations (+) to configuration without outlier.

configuration coincides with the initial one when an outlier is introduced in the target configuration. The initial configuration (∗), the target one with outlier (○), and the configuration after oblique Procrustes rotation (+) are depicted in Fig. 2. The solutions of the ℓ1 OPP and ObPP are almost identical for this artificial example:

X_OPP = [  0.7067 −0.7075
          −0.7075 −0.7067 ]

and

X_ObPP = [  0.7071  0.7068
           −0.7071  0.7073 ].

Fig. 2. Plot of ℓ2 oblique Procrustes rotation (+) to configuration containing an outlier.


Fig. 3. Plot of ℓ1 oblique Procrustes rotation (+) to configuration containing an outlier.

Obviously, the ℓ1 orthogonal and oblique Procrustes rotations are only very slightly affected by the outlier introduced in B. The rotated orthogonal and oblique configurations are very close for this example; only the oblique configuration (+) is depicted in Fig. 3.

In the statistical literature, methods aiming to reduce the influence of such large errors are known as robust. The utilization of the ℓ1 norm as a goodness-of-fit measure leads to an important class of robust methods, especially efficient in regression-type problems, and particularly the Procrustes problem. Applying the ℓ1 norm goodness-of-fit criterion can be quite relevant when the target B is not known or specified [2,3,9]. An important assumption is that errors are only present in the target matrix B, while the matrices A and C are error-free. See Chapter 1 in [17] for a comprehensive introduction and further details. To our knowledge, the first consideration of a Procrustes problem aiming at a robust or resistant fit is given in [19,20] for two- and three-dimensional orthogonal rotations, i.e. X^T X = XX^T = I_n for n = 2 or 3. The method is known as the repeated medians method [17]. See also [16] for a brief review and some generalizations. There is no explicit goodness-of-fit measure to be optimized in this approach, which makes it difficult to judge the quality of the result. Explicit robust loss functions are employed in [24], namely the ℓ1, Huber

and Hampel loss functions [17]. The algorithms provided in [24] solve the unweighted (C missing in (1)) OPPs for arbitrary n.

The main difficulty in using the ℓ1 norm goodness-of-fit criterion is that the standard differentiable optimization theory does not apply. One approach to get round this is to smooth the ℓ1 norm goodness-of-fit criterion. For example, it can be replaced by the Huber M-estimator [17], which has continuous derivatives. See [15] for other convenient approximations of the ℓ1 norm. We employ here a smooth approximation of the ℓ1 norm, making use of the function tanh(x) [23]. Thus we consider in this paper an approximation of the Procrustes problem (1), which for short will be called the ℓ1 Procrustes problem.

The plan of the paper is as follows. In Section 2, we solve the ℓ1 Procrustes problem (1) for X ∈ C = O(n1, n2), where O(n1, n2) is the Stiefel manifold of all n1 × n2 orthonormal matrices, i.e. all n1 × n2 full column-rank matrices such that X^T X = I_{n2}. The result is extended to the case when some of the entries of the target B are not specified or missing. This is demonstrated in Example 2, where the ℓ1 factor-pattern matrix is found by unweighted orthogonal fitting of a partially specified target for the Holzinger and Harman data of 24 psychological tests [14]. In Section 3, we solve the ℓ1 Procrustes problem


(1) for X ∈ C = OB(n1, n2), where OB(n1, n2) is the smooth manifold of all n1 × n2 oblique matrices, i.e. all n1 × n2 full column-rank matrices such that diag(X^T X) = I_{n2}. The particular case n1 = n2 is considered separately, because in this case there are two different oblique Procrustes subproblems important for the applications. Example 2 is continued, and the ℓ1 factor-pattern and factor-structure are found by oblique unweighted fitting to a partially specified target. In [1], several other constraint manifolds C are studied, e.g. C being the cone of all symmetric positive definite matrices. To our knowledge, such problems do not appear in multivariate analysis applications and are left without consideration here, although they can also be solved by the dynamical system approach employed in the paper.

The computations in the numerical examples listed above are carried out in MATLAB 5.3 under Windows NT. The solver used for the initial value problem is ode15s from the MATLAB ODE suite [13,18]. The code ode15s is a quasi-constant step size implementation of the Klopfenstein–Shampine family of numerical differentiation formulas for stiff systems. Another efficient integrator for the problems is ode23t, while ode23s and ode23tb are not appropriate, especially the former. The nonstiff integrators are not suitable for the problems. In all examples, the tolerance for the absolute error is set at 10^−9, and for the relative error at 10^−5. This criterion is used to control the accuracy in following the solution path. Higher accuracy does not change the dynamics of the solution and simply requires more CPU time. We examine the output values at a time interval of 100. The integration terminates automatically when the relative improvement between two consecutive output points is less than 10^−7, indicating that a local minimizer has been found.
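The quality of the tanh-based smoothing can be checked numerically: approximating sign(e) by tanh(αe) amounts to approximating |e| = e · sign(e) by e · tanh(αe), with a worst-case gap of order 1/α. A quick sketch in Python (the paper itself uses MATLAB):

```python
import numpy as np

alpha = 1000.0                      # the value suggested in the text
x = np.linspace(-2.0, 2.0, 100001)

smooth = x * np.tanh(alpha * x)     # differentiable surrogate for |x|
gap = np.abs(np.abs(x) - smooth)

# The worst error over the grid is a small multiple of 1/alpha
# (about 2.8e-4 here), attained near x = 0.
print(gap.max())
```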

2. The weighted orthonormal Procrustes problem (WOPP)

The general form of the WOPP can be formulated as follows. Let A ∈ R^{m×n1}, B ∈ R^{m×p} and C ∈ R^{n2×p} be given and consider the problem:

minimize ‖AXC − B‖_{ℓ1} subject to X ∈ O(n1, n2). (4)


The special case of the unweighted orthogonal Procrustes problem (n1 = n2 = p and C = I_p) is considered in [24]. Note that the objective function of the WOPP can be rewritten as

‖AXC − B‖_{ℓ1} = trace[(AXC − B)^T sign(AXC − B)] (5)

and we approximate sign(AXC − B) with tanh(α(AXC − B)) for some sufficiently large α, e.g. α = 1000. One should bear in mind that the choice of α depends on the particular problem. In practical situations, it always makes sense to solve the problem for several values of α. The gradient ∇ of this approximation to the objective function (5) can be calculated by standard matrix differentiation (e.g. [12]) as the following n1 × n2 matrix:

∇ = A^T Y C^T, (6)

where

Y = tanh(Z) + Z ⊙ cosh^{−2}(Z), (7)

where ⊙ denotes the standard elementwise (Hadamard) matrix product and Z = α(AXC − B). The projection of this gradient onto the Stiefel manifold O(n1, n2) at a point X can be calculated as follows (e.g. [4,10]):

π_{O(n1,n2)}(∇) := (1/2)X(X^T ∇ − ∇^T X) + (I_{n1} − XX^T)∇. (8)

Thus the ℓ1 WOPP (4) can be solved approximately by solving a descent matrix ordinary differential equation (ODE):

dX/dt = −π_{O(n1,n2)}(∇), (9)

where ∇ is given in (6) and the flow is started with an appropriate initial value X0 ∈ O(n1, n2). First-order necessary conditions for identifying a stationary point can be readily stated by simply setting the right-hand side of Eq. (9) to zero. Outside a small neighbourhood of zero residuals AXC − B, the necessary conditions simply become:

• X^T A^T sign(AXC − B) C^T must be symmetric;
• (I_{n1} − XX^T) A^T sign(AXC − B) C^T = 0.


Note that the corresponding necessary conditions for the least squares WOPP [4] look the same, but without the sign function applied to the residual matrix AXC − B.

We can extend this result to the situation when the target matrix B is not specified entirely. This is the case in many applications when the user wants to fix some of the elements of B while leaving the rest of them unspecified, or simply part of the entries in B are unknown or missing. In such a situation one is interested in fitting only the specified elements of the target. This problem is known in psychometrics as a Procrustes rotation problem to a partially specified target, e.g. [2], and as the partial Procrustes problem in computer science and geodesy (global positioning systems), e.g. [9]. So far this type of Procrustes problem has been studied in the literature assuming the least squares discrepancy function. Here we consider the same type of Procrustes problem but with the ℓ1 norm discrepancy function.

For given fixed A ∈ R^{m×n1}, C ∈ R^{n2×p} and V ∈ R^{m×p} and given partially fixed B ∈ R^{m×p}, the WOPP to a partially specified target concerns the optimization:

minimize ‖(AXC − B) ⊙ V‖_{ℓ1} subject to X ∈ O(n1, n2), (10)

where the matrix V = {v_ij} is defined as

v_ij = 1 if b_ij is specified, v_ij = 0 otherwise.

In other words, this rotation problem seeks an orthonormal X such that AXC gives the best fit to the specified elements of the target matrix B in the ℓ1 sense. The gradient ∇_V of the approximation to the objective function in (10) is the following n1 × n2 matrix:

∇_V = A^T (Y ⊙ V) C^T, (11)

where the operator Y is defined in (7) and in this case Z = α(AXC − B) ⊙ V. Then the solution of the WOPP (10) can be found by numerical integration of the ODE (9), where the gradient ∇ should be replaced by ∇_V from (11).

Example 2. We consider the Holzinger and Harman data of 24 psychological tests, which are well-known in factor analysis [14]. We apply the algorithm developed in this paper and find an approximate ℓ1 unweighted orthogonal Procrustes solution for these data. The initial configuration to be rotated (i.e. the matrix A) is the normalized VARIMAX four-factor solution for the 24 psychological tests as provided in Table 10-2 in [14]. The target configuration (i.e. the matrix B) is prepared by the PROMAX method and is given in Table 12-3 in [14]. Hypothetical zero entries can be defined as any element in the matrix B within the interval (−0.06, 0.06), e.g. [14, p. 3017]. The rest of the entries are considered unspecified. Here we have m = 24, n1 = n2 = p = 4 and C = I_4.

We solve the problem starting the algorithm with the following rational start. Let U1 D U2^T be the SVD of the matrix A^T B; then the initial value X0 is taken to be U1 U2^T. Note that here we work with the original target B ([14], Table 12-3) before considering part of its entries unspecified. For the particular example we find:

X0 = [ 0.9994 −0.0202 −0.0116 −0.0252
       0.0203  0.9994  0.0267 −0.0058
       0.0107 −0.0270  0.9995 −0.0139
       0.0255  0.0049  0.0137  0.9996 ].

The solution is given by the following orthogonal matrix obtained with α = 1000:

X∞ = [  0.9746 −0.0682  0.2027 −0.0657
        0.0407  0.9856  0.1540  0.0559
       −0.2192 −0.1279  0.9475 −0.1946
        0.0196 −0.0864  0.1935  0.9771 ].

The interested reader can easily find the corresponding approximate ℓ1 factor-matrix AX∞. Solving the problem with random starts gives practically the same solution but requires more CPU time.

3. The weighted oblique Procrustes problem (WObPP)

The general form of the WObPP can be formulated as follows. Let A ∈ R^{m×n1}, B ∈ R^{m×p} and C ∈ R^{n2×p} be given and consider the problem:

minimize ‖AXC − B‖_{ℓ1} subject to X ∈ OB(n1, n2). (12)


The objective function of the WObPP (12) can be rewritten in the following form:

‖AXC − B‖_{ℓ1} = trace[(AXC − B)^T sign(AXC − B)]. (13)

As before, we approximate sign(AXC − B) by tanh(α(AXC − B)) for some sufficiently large α, e.g. α = 1000. The gradient ∇ of this approximation to the right-hand side of (13) has already been calculated in (6) and (7) by standard matrix differentiation (e.g. [12]). This gradient ∇ differs from A^T sign(AXC − B) C^T only in a small neighbourhood of zero residuals, whose diameter depends on the magnitude of α. The projection of the gradient ∇ of the approximation to (13) onto the feasible set OB(n1, n2) at a point X can be calculated following the technique developed in [21]:

π_{OB(n1,n2)}(∇) := X(X^T X)^{−1} off(X^T ∇) + (I_{n1} − X(X^T X)^{−1} X^T)∇, (14)

where off(X) is the matrix identical to X but with zero main diagonal. Thus the ℓ1 WObPP (12) can be solved approximately by solving a descent matrix ODE:

dX/dt = −π_{OB(n1,n2)}(∇), (15)

starting the flow with an appropriate initial value X0 ∈ OB(n1, n2). A first-order necessary condition for a stationary point can readily be stated by simply setting the right-hand side of Eq. (15) to zero. Outside a small neighbourhood of the zero residual AXC − B in R^{m×p}, the necessary conditions simply become:

• X^T A^T sign(AXC − B) C^T must be an n2 × n2 diagonal matrix;
• (I_{n1} − X(X^T X)^{−1} X^T) A^T sign(AXC − B) C^T = 0.
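The oblique flow (15) can be prototyped in the same fashion as the orthonormal one; a hedged Python/SciPy sketch (function names and parameter values are ours, not the paper's), using the gradient (6)-(7) and the projection (14):

```python
import numpy as np
from scipy.integrate import solve_ivp

def l1_wobpp_flow(A, B, C, X0, alpha=100.0, t_final=20.0):
    """Integrate the descent ODE (15) for the smoothed l1 WObPP (12)."""
    n1, n2 = X0.shape

    def grad(X):                                  # eqs. (6)-(7)
        Z = np.clip(alpha * (A @ X @ C - B), -30.0, 30.0)  # clip: tanh/sech saturate
        Y = np.tanh(Z) + Z * (1.0 / np.cosh(Z))**2
        return A.T @ Y @ C.T

    def off(M):                                   # zero out the main diagonal
        return M - np.diag(np.diag(M))

    def project(X, G):                            # eq. (14): tangent space of OB(n1, n2)
        S = np.linalg.inv(X.T @ X)
        return X @ S @ off(X.T @ G) + (np.eye(n1) - X @ S @ X.T) @ G

    def rhs(t, x):
        X = x.reshape(n1, n2)
        return -project(X, grad(X)).ravel()

    sol = solve_ivp(rhs, (0.0, t_final), X0.ravel(),
                    method="LSODA", rtol=1e-8, atol=1e-10)
    return sol.y[:, -1].reshape(n1, n2)
```

Along the exact flow diag(X^T X) = I_{n2} is invariant, so numerically the unit-length column constraint is preserved to integration accuracy.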

Here we concentrate on the particular case of square oblique rotations (n1 = n2 = n). In this case, OB(n1 , n2 ) is denoted simply by OB(n) and is the smooth manifold of all n × n non-singular matrices such that diag(XT X) = In . The reason to consider the WObPP for square oblique rotations is that in the applications there are


two important WObPP subproblems. The oblique Procrustes problem (12) becomes

minimize ‖AXC − B‖_{ℓ1} subject to X ∈ OB(n) (16)

and is known in factor analysis as a 'rotation to a factor-structure matrix' [14]. When an approximation to a factor-pattern matrix [14] is sought, one should solve the following oblique Procrustes problem:

minimize ‖AX^{−T}C − B‖_{ℓ1} subject to X ∈ OB(n). (17)

The projection (14) of the gradient ∇ of the approximation to the WObPP objective function (13) onto the feasible set OB(n) simplifies considerably (e.g. [21]):

π_{OB(n)}(∇) := X^{−T} off(X^T ∇). (18)

The gradient of the approximation to the objective function in (16) has already been calculated. The corresponding gradient of the approximation to the objective function in (17) can be calculated as the following n × n matrix:

∇ = −X^{−T} C Y^T A X^{−T}, (19)

where the operator Y is defined in (7) and in this case Z = α(AX^{−T}C − B). Its projection onto the feasible set OB(n) at a point X is computed making use of the projection formula (18). Thus the ℓ1 WObPPs (16) and (17) can be approximately solved by solving a descent matrix ODE (15) with the appropriate choice of ∇ and starting the flow with an initial value X0 ∈ OB(n). Outside a small neighbourhood of the zero residual AX^{−T}C − B in R^{m×p}, the necessary condition for problem (17) becomes:

X^{−1} A^T sign(AX^{−T}C − B) C^T must be diagonal. (20)

Note that the corresponding necessary conditions for the least squares WObPPs [21] look the same, but without the sign function applied to the residual matrices AXC − B and AX^{−T}C − B.

As in the previous section, these results can be extended to the situation when the target matrix B is not specified entirely. The unweighted oblique Procrustes problem (C missing) of the type (16) with unspecified target has been studied in [3] assuming a least squares discrepancy function. The Procrustes problem of the


type (17) with unspecified target has never been solved in the literature. Here we reconsider these two types of Procrustes problems with unspecified targets, but with the ℓ1 norm discrepancy function.

For given fixed A ∈ R^{m×n}, C ∈ R^{n×p} and V ∈ R^{m×p} and given partially fixed B ∈ R^{m×p}, the WObPPs (16) and (17) to partially specified targets lead to the following optimization problems:

minimize ‖(AXC − B) ⊙ V‖_{ℓ1} subject to X ∈ OB(n) (21)

and, respectively,

minimize ‖(AX^{−T}C − B) ⊙ V‖_{ℓ1} subject to X ∈ OB(n), (22)

where the matrix V is defined in the previous section. The gradient ∇_V of the approximation to the objective function in (21) is the following n × n matrix:

∇_V = A^T (Y ⊙ V) C^T, (23)

where the operator Y is defined in (7) and in this case Z = α(AXC − B) ⊙ V. Similarly, the gradient ∇_V of the approximation to the objective function in (22) is the following n × n matrix:

∇_V = −X^{−T} C (Y ⊙ V)^T A X^{−T}, (24)

where the operator Y is defined in (7) and in this case Z = α(AX^{−T}C − B) ⊙ V.

Example 2 (continued). We consider again the Holzinger and Harman data of 24 psychological tests [14] to illustrate the ℓ1 solutions of both of the oblique Procrustes subproblems (21) and (22). We start with the WObPP (21). We solve the problem starting the algorithm with the following rational start. The initial value X0 is taken to be X0 = A\(B/C) in MATLAB [13] notation and then normalized so that diag(X0^T X0) = I_n. For the particular example we find:

X0 = [  0.9588 −0.0849 −0.2056 −0.1296
       −0.0950  0.9675 −0.1366 −0.1994
       −0.2191 −0.1557  0.9498 −0.1664
       −0.1539 −0.1803 −0.1921  0.9569 ].

The solution is given by the following oblique matrix obtained with α = 1000:

X∞ = [  0.9632 −0.1443 −0.1969 −0.0880
       −0.1576  0.9609 −0.1299 −0.0907
       −0.1592 −0.0766  0.9531 −0.1865
       −0.1486 −0.2236 −0.1897  0.9743 ].

Next we solve the WObPP (22), starting the algorithm with the following rational start. The initial value X0 is taken to be X0 := [C^T\(B^T/A^T)]^{−T} in MATLAB [13] notation and then normalized so that diag(X0^T X0) = I_n. For the particular example we find:

X0 = [ 0.9259 0.1624 0.2764 0.2235
       0.1874 0.9422 0.2383 0.2844
       0.2528 0.2029 0.8960 0.2460
       0.2090 0.2116 0.2529 0.8992 ].

The solution is given by the following oblique matrix obtained with α = 1000:

X_{ι,∞} = [ 0.9383 0.1893 0.2153 0.2101
            0.1954 0.9722 0.1579 0.2605
            0.2432 0.0672 0.9394 0.1390
            0.1496 0.1201 0.2151 0.9320 ].

The corresponding approximate ℓ1 factor-matrices can be easily found: AX∞ for the factor-structure matrix and AX_{ι,∞}^{−T} for the factor-pattern matrix. These two ℓ1 solutions can be compared with the corresponding least squares solutions given in [14, Tables 12-7 and 12-8a]. Note that the primary-factor-pattern matrix reported in [14, Table 12-8a] is not a result of direct computation (Procrustes matching), but is derived from the already found reference-structure matrix reported in [14, Table 12-7]. Solving the problems with random starts gives practically the same solutions but requires more CPU time. The rational start based on the SVD and used in Example 2 for the WOPP can be a pretty good choice for such unweighted oblique Procrustes problems as well.
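Both partially specified variants share the same mechanics: unspecified entries of B are masked out of the residual before the smoothing is applied. A minimal Python sketch of the masked gradient (23) (the array names are ours):

```python
import numpy as np

def masked_l1_gradient(A, X, C, B, V, alpha=1000.0):
    """Gradient (23) of the smoothed l1 objective for a partially
    specified target: entries of B with V == 0 do not influence the fit."""
    Z = np.clip(alpha * (A @ X @ C - B) * V, -30.0, 30.0)  # masked, scaled residual
    Y = np.tanh(Z) + Z * (1.0 / np.cosh(Z))**2             # eq. (7)
    return A.T @ (Y * V) @ C.T                             # Hadamard mask, eq. (23)
```

Changing an unspecified entry of B leaves this gradient, and hence the descent flow, untouched, which is exactly the intended behaviour of fitting a partially specified target.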


The correlations among the primary factors are given in

X_{ι,∞}^T X_{ι,∞} = [ 1.0000 0.4019 0.4935 0.4213
                      0.4019 1.0000 0.2833 0.4144
                      0.4935 0.2833 1.0000 0.4174
                      0.4213 0.4144 0.4174 1.0000 ].

Obviously, these ℓ1 primary factors are considerably less correlated than those reported in [14, Table 12-8b].

The ℓ1 versions of the Procrustes problems considered in this paper have been solved numerically by 'off the shelf' solvers such as ode15s from the MATLAB ODE suite [13,18]. The algorithms for these ℓ1 Procrustes problems are more CPU time consuming compared to their ℓ2 analogues considered in [4]. There are a number of alternative approaches for solving such matrix matching problems with respect to the least squares discrepancy function. They work directly on the constraint matrix manifold C, e.g. [5,7,22], and can be adapted for application to other smooth discrepancy functions, such as the one employed in this paper.

Acknowledgements

The author thanks the anonymous referee for the comments and suggestions which improved and clarified the motivation of the work.

References

[1] L.E. Andersson, T. Elfving, A constrained Procrustes problem, SIAM J. Matrix Anal. Appl. 18 (1997) 124–139.
[2] M.W. Browne, Orthogonal rotation to partially specified target, Br. J. Math. Stat. Psychol. 25 (1972) 115–120.
[3] M.W. Browne, Oblique rotation to a partially specified target, Br. J. Math. Stat. Psychol. 25 (1972) 207–212.
[4] M.T. Chu, N.T. Trendafilov, The orthogonally constrained regression revisited, J. Comput. Graph. Stat. 10 (4) (2001) 746–771.
[5] N. Del Buono, L. Lopez, Geometric integration on manifold of square oblique rotation matrices, SIAM J. Matrix Anal. Appl. 23 (4) (2002) 974–989.
[6] I.L. Dryden, K.V. Mardia, Statistical Shape Analysis, Wiley, Chichester, 1998.


[7] A. Edelman, T.A. Arias, S.T. Smith, The geometry of algorithms with orthogonality constraints, SIAM J. Matrix Anal. Appl. 20 (2) (1998) 303–353.
[8] J.C. Gower, Multivariate analysis: ordination, multidimensional scaling and allied topics, in: E. Lloyd (Ed.), Handbook of Applicable Mathematics, vol. VI, Statistics, Part B, Wiley, New York, 1984.
[9] M. Gulliksson, Algorithms for the partial Procrustes problem, Report UMINF-95, Department of Computer Science, Umeå University, 1995.
[10] U. Helmke, J.B. Moore, Optimization and Dynamical Systems, Springer, London, 1994.
[11] N.J. Higham, Matrix Procrustes problems, 1994. http://www.ma.mman.ac.uk/~higham/talks.html.
[12] J.R. Magnus, H. Neudecker, Matrix Differential Calculus with Applications in Statistics and Econometrics, Wiley, New York, 1988.
[13] MATLAB, Using MATLAB Ver. 5, The MathWorks Inc., 1999.
[14] S. Mulaik, The Foundations of Factor Analysis, McGraw-Hill, New York, 1972.
[15] M.R. Osborne, B. Presnell, B.A. Turlach, On the LASSO and its dual, J. Comput. Graph. Stat. 9 (2000) 319–337.
[16] F.J. Rohlf, D. Slice, Extensions of the Procrustes method for the optimal superimposition of landmarks, Syst. Zool. 39 (1) (1990) 40–59.
[17] P.J. Rousseeuw, A.M. Leroy, Robust Regression and Outlier Detection, Wiley, New York, 1987.
[18] L.F. Shampine, M.W. Reichelt, The MATLAB ODE suite, SIAM J. Sci. Comput. 18 (1997) 1–22.
[19] A.F. Siegel, R.H. Benson, A robust comparison of biological shapes, Biometrics 38 (1982) 341–350.
[20] A.F. Siegel, J.R. Pinkerton, Robust comparison of three-dimensional shapes with an application to protein molecule configurations, Technical Report #217, Series 2, Department of Statistics, Princeton University, 1982.
[21] N.T. Trendafilov, A continuous-time approach to the oblique Procrustes problem, Behaviormetrika 26 (2) (1999) 167–181.
[22] N.T. Trendafilov, R.A. Lippert, The multi-mode Procrustes problem, Lin. Alg. Appl. 349 (2002) 245–264.
[23] N.T. Trendafilov, G.A. Watson, The ℓ1 oblique Procrustes problem, Statistics and Computing, submitted for publication.
[24] P. Verboon, A Robust Approach to Nonlinear Multivariate Analysis, DSWO Press, Leiden, 1994.
[25] G.A. Watson, The solution of the orthogonal Procrustes problems for a family of orthogonally invariant norms, Adv. Comput. Math. 26 (2) (1993) 167–181.

Nickolay T. Trendafilov was born in Burgas, Bulgaria in 1954. He received his MSc and PhD in Mathematical Sciences in 1979 and 1986, both from the University of Sofia "St. Kl. Ohridski". In 1986, he joined the Laboratory of Computational Stochastics, Bulgarian Academy of Sciences. He held a Research Associate position at NORC, The University of Chicago in 1991–1993, and visiting


positions at the School of Education, Nagoya University in 1996, the Department of Mathematics, University of Bari in 1999, and the Department of Electrical Engineering, Catholic University Leuven in 1997–2000. Currently he is a Senior Lecturer in Statistics, School of Mathematical Sciences, The University of the West of England. His research interests are in the fields of multivariate data analysis, matrix computing and dynamical systems on matrix manifolds.