Linear Algebra and its Applications 490 (2016) 145–161
Structured Procrustes problem

Bibhas Adhikari (Department of Mathematics, IIT Kharagpur, India), Rafikul Alam (Department of Mathematics, IIT Guwahati, India; corresponding author)
Article history: Received 5 January 2015; Accepted 24 October 2015; Available online 12 November 2015; Submitted by V. Mehrmann.
MSC: 15A24, 65F15, 65F18, 15A18.
Keywords: Structured matrices; Jordan and Lie algebras; Moore–Penrose pseudoinverse; SVD; Hadamard product; Procrustes problem.
Abstract. Let S be a class of structured matrices. Given a pair of matrices X and B, we consider the structured Procrustes problem (SPP)

  A = argmin_{G in S} ‖GX − B‖_F

and provide a complete solution when S is either a Jordan algebra or a Lie algebra associated with an orthosymmetric scalar product. We characterize and determine all solutions of the structured Procrustes problem as well as those solutions which have the smallest norm. We show that, for the spectral norm, there may be infinitely many smallest norm solutions of the structured Procrustes problem whereas, for the Frobenius norm, the smallest norm solution is unique.

© 2015 Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.laa.2015.10.026
1. Introduction

Let {x1, . . . , xp} and {b1, . . . , bp} be two sets of vectors in K^n and let S be a class of structured matrices in K^{n×n}, where K = R or C. In many applications, it is necessary to determine a structured matrix A in S that minimizes

  Σ_{j=1}^p ‖A xj − bj‖_2^2  or  Σ_{j=1}^p wj ‖A xj − bj‖_2^2,

where w1, . . . , wp are positive weights. Equivalently, setting X := [x1, · · · , xp], B := [b1, · · · , bp] and W := diag(√w1, · · · , √wp), we have the structured Procrustes problem (SPP)

  A = argmin_{G in S} ‖GX − B‖_F  or  A = argmin_{G in S} ‖GXW − BW‖_F.
Notice that the weighted minimization problem can be reduced to an un-weighted (absolute) minimization problem by setting X̂ := XW and B̂ := BW. It is therefore sufficient to consider the Procrustes problem with W = Ip, the identity matrix.

The orthogonal Procrustes problem, in which S is chosen to be the (special) group of orthogonal/unitary matrices, arises in many applications such as computer vision, factor analysis, and multidimensional scaling, see [9,19,7,17,8] and references therein. On the other hand, the symmetric Procrustes problem, in which S is chosen to be the set of symmetric/Hermitian (positive definite) matrices, arises in applications such as the determination of spacecraft altitudes and of the stiffness matrix or the compliance matrix of an elastic structure [5,13,11]. Indeed, in the case of the stiffness matrix of an elastic structure, the columns of X represent displacements of the elastic structure and the columns of B represent the forces acting on it; they are related by AX = B, where A is a symmetric positive definite stiffness matrix which has to be estimated from the experimental data X and B. Since the experimental data X and B are contaminated by errors, an optimal choice of A is obtained by solving the Procrustes problem

  A = argmin_{G in S} ‖GX − B‖_F,   (1)
where S is the set of symmetric positive definite matrices. See also [21,23] for positive semidefinite and skew-Hermitian Hamiltonian Procrustes problems. A slight generalization of (1), in which A is additionally required to be closest to a given fixed matrix K (that is, ‖A − K‖_F is minimized), has been considered in [18] when S is the set of real symmetric matrices and in [22] when S is the set of real skew-symmetric matrices.¹

¹ We thank one of the referees for bringing these papers to our notice.
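The weighted-to-unweighted reduction above is easy to check numerically. A minimal sketch in NumPy with randomly generated (hypothetical) data:

```python
import numpy as np

# Hypothetical data: p = 4 sample vectors in R^3 with positive weights w_j.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))
B = rng.standard_normal((3, 4))
w = np.array([0.5, 1.0, 2.0, 4.0])
G = rng.standard_normal((3, 3))   # an arbitrary candidate matrix G

# Weighted objective: sum_j w_j * ||G x_j - b_j||_2^2.
weighted = sum(w[j] * np.linalg.norm(G @ X[:, j] - B[:, j]) ** 2 for j in range(4))

# Unweighted objective on the scaled data XW, BW with W = diag(sqrt(w_1), ..., sqrt(w_p)).
W = np.diag(np.sqrt(w))
unweighted = np.linalg.norm(G @ X @ W - B @ W, 'fro') ** 2

assert np.isclose(weighted, unweighted)
```

Since the two objectives agree for every candidate G, their minimizers over any class S coincide, which is why W = Ip may be assumed.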
Structured eigenvalue perturbation theory for structured matrices, structured matrix distance problems, and the passivation problem for a stable linear time-invariant control system crucially depend [3,4,10,20] on solutions of the structured mapping problem. Given a pair of matrices X and B in K^{n×p} and a class S of structured matrices, finding a matrix A in S such that AX = B is known as the structured mapping problem [2,14,15]. For example, the passivation problem for a stable linear time-invariant (LTI) control system

  ẋ = Ax + Bu, x(0) = 0,
  y = Cx + Du,

gives rise to a structured mapping problem for Hamiltonian matrices [3,4,10], where A in K^{n×n}, B in K^{n×p}, C in K^{p×n}, D in K^{p×p}, u is the input, x is the state and y is the output. Note that the structured mapping problem can be reformulated as the structured Procrustes problem. Thus, the structured Procrustes problem (1) subsumes the structured mapping problem AX = B and provides a robust framework for solutions of the structured mapping problem. Moreover, it provides an optimal solution when the structured mapping problem is inconsistent and has no solution. See [2,1] for consistency and solutions of the structured mapping problem.

Motivated by these applications, we consider the structured Procrustes problem for various structured matrices. More precisely, we consider the structured Procrustes problem when S is either a Jordan algebra or a Lie algebra associated with an appropriate scalar product on K^n. The Jordan/Lie algebra framework encompasses many important classes of structured matrices such as Hamiltonian, skew-Hamiltonian, symmetric, skew-symmetric, pseudosymmetric, persymmetric, Hermitian, skew-Hermitian, pseudo-Hermitian, pseudo-skew-Hermitian, to name only a few, see [14].

Problem-SPP (Structured Procrustes problem). Let S ⊂ K^{n×n} be a class of structured matrices and let X and B be matrices in K^{n×p}. Set

  SP(X, B) := {A in S : A = argmin_{G in S} ‖GX − B‖_F},
  ρ^S(X, B) := min{‖AX − B‖_F : A in S},
  σ^S(X, B) := min{‖A‖ : A = argmin_{G in S} ‖GX − B‖_F},
  SP_min(X, B) := {A in SP(X, B) : ‖A‖ = σ^S(X, B)},

where ‖·‖ is either the spectral norm or the Frobenius norm. Determine ρ^S(X, B) and all matrices in SP(X, B). Also determine σ^S(X, B) and all matrices in SP_min(X, B).

Observe that by considering X to be the identity matrix in Problem-SPP we obtain a best structured approximation of B from S.
The main contributions of this paper are as follows. We provide a complete solution of the structured Procrustes problem when the class S ⊂ K^{n×n} of structured matrices is either a Jordan algebra or a Lie algebra associated with an orthosymmetric scalar product on K^n. We characterize the solution set SP(X, B) and show that SP(X, B) is an affine subspace. Further, we characterize the smallest norm solution set SP_min(X, B) and show that SP_min(X, B) is a convex (resp., singleton) set when ‖·‖ is the spectral (resp., Frobenius) norm. Furthermore, we provide a characterization of SP(X, B) via a Lyapunov equation, which offers an alternative approach to solving the Procrustes problem.

Notation. Let K^{m×n} denote the set of all m-by-n matrices with entries in K, where K = R or C. We denote the transpose of a matrix X in K^{m×n} by X^T, the conjugate transpose by X^H and the complex conjugate by X̄. We often write A^* for * in {T, H} to denote the transpose A^T or the conjugate transpose A^H. We denote the Moore–Penrose pseudoinverse of X by X†. The spectral norm ‖X‖_2 and the Frobenius norm ‖X‖_F of X are given by

  ‖X‖_2 := max{‖Xu‖_2 : ‖u‖_2 = 1}  and  ‖X‖_F := √(Tr(X^H X)),

where ‖u‖_2 := √(u^H u), u^H u being the usual inner product, and Tr(A) is the trace of a square matrix A.
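The two norms can be cross-checked directly in NumPy (a trivial but useful sanity check of the definitions above, on random hypothetical data):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))

# ||X||_F = sqrt(Tr(X^H X))
assert np.isclose(np.linalg.norm(X, 'fro'),
                  np.sqrt(np.trace(X.conj().T @ X).real))

# ||X||_2 = max over unit u of ||X u||_2 = largest singular value of X
assert np.isclose(np.linalg.norm(X, 2),
                  np.linalg.svd(X, compute_uv=False)[0])
```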
2. Structured matrices

We now briefly define the structured matrices considered in this paper; see [14] for further details. Let M in K^{n×n} be unitary. Assume further that M is either symmetric or skew-symmetric or Hermitian or skew-Hermitian. Define the scalar product ⟨·, ·⟩_M : K^n × K^n → K by

  ⟨x, y⟩_M := y^T M x (bilinear form)  or  ⟨x, y⟩_M := y^H M x (sesquilinear form).   (2)

Then for A in K^{n×n} there is a unique adjoint operator A* relative to the scalar product (2) such that ⟨Ax, y⟩_M = ⟨x, A* y⟩_M for all x, y in K^n. The adjoint A* is explicitly given by

  A* = M^{-1} A^T M (bilinear form)  or  A* = M^{-1} A^H M (sesquilinear form).   (3)

Consider the Lie algebra L and the Jordan algebra J associated with the scalar product (2), given by

  L := {A in K^{n×n} : A* = −A}  and  J := {A in K^{n×n} : A* = A}.   (4)
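As a concrete instance of (3) and (4), take M = J (the standard skew-symmetric form); its Lie algebra then consists of the Hamiltonian matrices, as recalled later in the paper. A minimal NumPy check with hypothetical random blocks:

```python
import numpy as np

n = 2
J = np.block([[np.zeros((n, n)), np.eye(n)],
              [-np.eye(n), np.zeros((n, n))]])   # M = J is real orthogonal and skew-symmetric

def star(A, M):
    """Adjoint A* = M^{-1} A^T M of A for the bilinear form <x, y>_M = y^T M x."""
    return np.linalg.solve(M, A.T @ M)

# A real Hamiltonian matrix built from symmetric blocks F and G2.
rng = np.random.default_rng(2)
E = rng.standard_normal((n, n))
F = rng.standard_normal((n, n)); F = F + F.T
G2 = rng.standard_normal((n, n)); G2 = G2 + G2.T
A = np.block([[E, F], [G2, -E.T]])

assert np.allclose(star(A, J), -A)        # A* = -A, so A lies in the Lie algebra L
assert np.allclose((J @ A).T, J @ A)      # equivalently, JA is symmetric, as in (5) below
```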
In this paper, we consider S = L or S = J and refer to the matrices in S as structured matrices. The algebras so defined provide a general framework for analyzing many important classes of structured matrices including Hamiltonian, skew-Hamiltonian, symmetric, skew-symmetric, pseudosymmetric, persymmetric, Hermitian, skew-Hermitian, pseudo-Hermitian, pseudo-skew-Hermitian matrices, to name only a few, see Table 2.1 in [14]. For the rest of the paper, we set

  sym := {A in K^{n×n} : A^T = A},  skew-sym := {A in K^{n×n} : A^T = −A},
  Herm := {A in C^{n×n} : A^H = A},  skew-Herm := {A in C^{n×n} : A^H = −A}.

Also define the set MS by MS := {MA : A in S}. Then, in view of (3) and (4), it follows that

  S in {J, L} ⟺ MS in {sym, skew-sym, Herm, skew-Herm}.   (5)
This shows that the four classes of structured matrices, namely, symmetric, skew-symmetric, Hermitian and skew-Hermitian matrices, are prototypes of more general structured matrices belonging to the Jordan and Lie algebras given in (4).

Let A, B, C and D be matrices. Then the matrix

  T := [A C; B D]

is called a dilation of A. The norm preserving dilation problem is then stated as follows. Given matrices A, B, C and a positive number

  μ ≥ max{ ‖[A; B]‖_2, ‖[A C]‖_2 },   (6)

find all possible D such that ‖[A C; B D]‖_2 ≤ μ.
Theorem 2.1 (Davis–Kahan–Weinberger [6]). Let A, B, C be given matrices. Then for any positive number μ satisfying (6), there exists D such that ‖[A C; B D]‖_2 ≤ μ. Indeed, those D which have this property are exactly those of the form

  D = −K A^H L + μ (I − K K^H)^{1/2} Z (I − L^H L)^{1/2},

where K^H := (μ^2 I − A^H A)^{-1/2} B^H, L := (μ^2 I − A A^H)^{-1/2} C and Z is an arbitrary contraction, that is, ‖Z‖_2 ≤ 1.

We mention that when (μ^2 I − A^H A) is singular, the inverses in K^H and L are replaced by their Moore–Penrose pseudo-inverses (see [16]). An interesting fact about Theorem 2.1 is that if T(D) is symmetric (resp., skew-symmetric, Hermitian, skew-Hermitian) then the solution matrices D are symmetric (resp., skew-symmetric, Hermitian, skew-Hermitian).
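Theorem 2.1 can be exercised numerically. The sketch below (hypothetical random data; `psd_sqrt` and `psd_inv_sqrt` are our own eigendecomposition-based helpers, not from the paper) builds one admissible D and checks the norm bound:

```python
import numpy as np

def psd_sqrt(S):
    # principal square root of a symmetric positive semidefinite matrix
    w, Q = np.linalg.eigh(S)
    return Q @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ Q.T

def psd_inv_sqrt(S):
    # S^{-1/2} for a symmetric positive definite matrix
    w, Q = np.linalg.eigh(S)
    return Q @ np.diag(1.0 / np.sqrt(w)) @ Q.T

rng = np.random.default_rng(3)
A = rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2))
C = rng.standard_normal((2, 2))

# Strict inequality in (6) keeps mu^2 I - A^T A and mu^2 I - A A^T nonsingular.
mu = 1.05 * max(np.linalg.norm(np.vstack([A, B]), 2),
                np.linalg.norm(np.hstack([A, C]), 2))

I2 = np.eye(2)
K = (psd_inv_sqrt(mu**2 * I2 - A.T @ A) @ B.T).T   # K^H := (mu^2 I - A^H A)^{-1/2} B^H (real case)
L = psd_inv_sqrt(mu**2 * I2 - A @ A.T) @ C         # L  := (mu^2 I - A A^H)^{-1/2} C

Z = rng.standard_normal((2, 2))
Z /= max(1.0, np.linalg.norm(Z, 2))                # an arbitrary contraction, ||Z||_2 <= 1
D = -K @ A.T @ L + mu * psd_sqrt(I2 - K @ K.T) @ Z @ psd_sqrt(I2 - L.T @ L)

T = np.block([[A, C], [B, D]])
assert np.linalg.norm(T, 2) <= mu + 1e-8           # the dilation satisfies ||T||_2 <= mu
```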
Let S in {sym, skew-sym, Herm, skew-Herm} and let A in S be given by

  A := [A11 ±A12^*; A12 A22].  Then  ‖A‖_F = (‖A11‖_F^2 + 2‖A12‖_F^2 + ‖A22‖_F^2)^{1/2},   (7)
where * in {T, H}. We repeatedly use this fact in the sequel.

3. Solution of the structured Procrustes problem

Consider the Lie algebra L and the Jordan algebra J given in (4), associated with the scalar product induced by M. Let S in {L, J}. For any subset V ⊂ S, we define the set MV by MV := {MA : A in V}. Then we have the following, where SP^S and SP^{MS} denote the solution sets of Problem-SPP for the classes S and MS, respectively.

Theorem 3.1. Let (X, B) in K^{n×p} × K^{n×p} and S in {J, L}. Then we have

  G(X, MB) in SP^{MS}(X, MB) ⟺ M^{-1} G(X, MB) in SP^S(X, B)  and  ρ^S(X, B) = ρ^{MS}(X, MB).

Further, we have

  G(X, MB) in SP_min^{MS}(X, MB) ⟺ M^{-1} G(X, MB) in SP_min^S(X, B)  and  σ^S(X, B) = σ^{MS}(X, MB).

Proof. Since M is unitary and ‖AX − B‖_F = ‖MAX − MB‖_F for A in S, it follows that

  min_{A in S} ‖AX − B‖_F = min_{G in MS} ‖GX − MB‖_F.

Hence it follows that ρ^S(X, B) = ρ^{MS}(X, MB) and

  G(X, MB) = argmin_{A in MS} ‖AX − MB‖_F ⟺ M^{-1} G(X, MB) = argmin_{A in S} ‖AX − B‖_F.

This shows that G(X, MB) in SP^{MS}(X, MB) ⟺ M^{-1} G(X, MB) in SP^S(X, B). Further, we have ‖MA‖_2 = ‖A‖_2 and ‖MA‖_F = ‖A‖_F. Consequently, for the spectral and the Frobenius norms, we have σ^S(X, B) = σ^{MS}(X, MB) and G(X, MB) in SP_min^{MS}(X, MB) ⟺ M^{-1} G(X, MB) in SP_min^S(X, B). This completes the proof. □

Theorem 3.1 shows that G(X, MB) is a solution of Problem-SPP for the structured matrices MS if and only if A := M^{-1} G(X, MB) is a solution of Problem-SPP for the general case when S in {J, L}. Consequently, in view of (5), the structured Procrustes problem can be solved by reducing it to the special case when S in {sym, skew-sym, Herm, skew-Herm}. We proceed as follows.
Let (X, B) in K^{n×p} × K^{n×p}. Suppose that rank(X) = r and consider the "trimmed" SVD X = U1 Σ1 V1^H, where Σ1 = diag(σ1, . . . , σr). Let D in C^{r×r} be given by Dij := 1/(σi^2 + σj^2). For the rest of this section, define

  F±^T(X, B) := Ū1 [D ◦ (U1^T B V1 Σ1 ± Σ1 V1^T B^T U1)] U1^H ± (BX†)^T (I − XX†) + (I − XX†)^T BX†,
  F+^H(X, B) := U1 [D ◦ (U1^H B V1 Σ1 + Σ1 V1^H B^H U1)] U1^H + (BX†)^H (I − XX†) + (I − XX†)^H BX†,

where ◦ denotes the Hadamard product and X† is the Moore–Penrose pseudoinverse of X. We repeatedly employ the following elementary result.

Lemma 3.2. Let α and β be real numbers and b1, b2 in C. Then min_{x in C} (|xα − b1|^2 + |xβ − b2|^2) is attained at x = (α b1 + β b2)/(α^2 + β^2).

Theorem 3.3 characterizes the solution set SP(X, B) and determines ρ^S(X, B). A similar result can be found in [11,18] for real symmetric matrices and in [22] for real skew-symmetric matrices.

Theorem 3.3. Let (X, B) in K^{n×p} × K^{n×p} and S in {L, J}. Suppose that rank(X) = r and consider the SVD X = U Σ V^H. Partition U = [U1 U2] and V = [V1 V2], where U1 in K^{n×r} and V1 in K^{p×r}. Let Σ1 := Σ(1:r, 1:r) = diag(σ1, . . . , σr). Then we have

  ρ^S(X, B) = ( ‖(D ◦ (U1^* MB V1 Σ1 + Σ1 V1^* (MB)^* U1)) Σ1 − U1^* MB V1‖_F^2 + ‖BV2‖_F^2 )^{1/2}, if MS in {sym, Herm},
  ρ^S(X, B) = ( ‖(D ◦ (U1^T MB V1 Σ1 − Σ1 V1^T (MB)^T U1)) Σ1 − U1^T MB V1‖_F^2 + ‖BV2‖_F^2 )^{1/2}, if MS = skew-sym,

where * = T when MS = sym, and * = H when MS = Herm. Here D in C^{r×r} is given by Dij := 1/(σi^2 + σj^2) and ◦ denotes the Hadamard product.

We have A in SP(X, B) if and only if A is of the form

  A = M^{-1} F+^T(X, MB) + M^{-1} (I − XX†)^T Z (I − XX†), if MS = sym,
  A = M^{-1} F−^T(X, MB) + M^{-1} (I − XX†)^T Z (I − XX†), if MS = skew-sym,
  A = M^{-1} F+^H(X, MB) + M^{-1} (I − XX†)^H Z (I − XX†), if MS = Herm,

for some Z in MS.
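Before the proof, the case M = I, K = R (so that MS = sym) of Theorem 3.3 can be sanity-checked numerically. The sketch below, with hypothetical random rank-deficient data, builds the Z = 0 solution from the trimmed SVD and verifies, as an independent check, the Lyapunov characterization established later in Theorem 3.7:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p, r = 5, 4, 3
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, p))  # rank-3 data, so I - X X† is nonzero
B = rng.standard_normal((n, p))

U, s, Vh = np.linalg.svd(X)
U1, V1, S1 = U[:, :r], Vh[:r, :].T, s[:r]

D = 1.0 / (S1[:, None] ** 2 + S1[None, :] ** 2)    # D_ij = 1/(sigma_i^2 + sigma_j^2)
C = U1.T @ B @ V1
A11 = D * (C * S1[None, :] + S1[:, None] * C.T)    # D o (U1^T B V1 S1 + S1 V1^T B^T U1)

Xp = np.linalg.pinv(X)
P = np.eye(n) - X @ Xp                             # the projector I - X X†
A = U1 @ A11 @ U1.T + (B @ Xp).T @ P + P @ (B @ Xp)   # F_+^T(X, B): the Z = 0 member of SP(X, B)

assert np.allclose(A, A.T)                                         # A is symmetric
assert np.allclose(X @ X.T @ A + A @ X @ X.T, B @ X.T + X @ B.T)   # Theorem 3.7(a) holds
```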
Proof. Observe that, in view of Theorem 3.1 and (5), we only need to consider the case when S in {sym, skew-sym, Herm, skew-Herm}. Note that G is a skew-Hermitian solution of min_A ‖AX − iB‖_F if and only if −iG is a Hermitian solution of min_A ‖AX − B‖_F. Consequently, we only need to prove the result for the case when S in {sym, skew-sym, Herm}.

Suppose that S = sym. Then we have

  A = Ū (U^T A U) U^H = [Ū1 Ū2] [A11 A12^T; A12 A22] [U1^H; U2^H],

where A11 = A11^T in C^{r×r} and, A12 and A22 = A22^T are matrices of compatible sizes. Then

  ‖AX − B‖_F^2 = ‖U^T A U U^H X − U^T B‖_F^2
               = ‖ [A11 A12^T; A12 A22] [U1^H X; 0] − [U1^T B; U2^T B] ‖_F^2
               = ‖A11 U1^H X − U1^T B‖_F^2 + ‖A12 U1^H X − U2^T B‖_F^2.

Note that

  ‖A12 U1^H X − U2^T B‖_F^2 = ‖A12 U1^H U Σ V^H − U2^T B‖_F^2 = ‖A12 U1^H U Σ − U2^T B V‖_F^2
    = ‖A12 Σ1 − U2^T B V1‖_F^2 + ‖U2^T B V2‖_F^2 ≥ ‖U2^T B V2‖_F^2,

and the minimum is attained when A12 Σ1 − U2^T B V1 = 0, that is, when A12 = U2^T B V1 Σ1^{-1}. Also note that

  ‖A11 U1^H X − U1^T B‖_F^2 = ‖A11 Σ1 − U1^T B V1‖_F^2 + ‖U1^T B V2‖_F^2

is minimized when ‖A11 Σ1 − U1^T B V1‖_F^2 is minimized over the symmetric matrices A11. Suppose that A11 = [aij] and U1^T B V1 = [bij]. Then by Lemma 3.2

  ‖A11 Σ1 − U1^T B V1‖_F^2 = Σ_{1≤i<j≤r} (|aij σj − bij|^2 + |aij σi − bji|^2) + Σ_{i=1}^r |σi aii − bii|^2

is minimized when

  aij = (σj bij + σi bji)/(σi^2 + σj^2), aij = aji, for i, j = 1:r.

This yields A11 = D ◦ (U1^T B V1 Σ1 + Σ1 V1^T B^T U1). Hence
  ρ^S(X, B) = ( ‖(D ◦ (U1^T B V1 Σ1 + Σ1 V1^T B^T U1)) Σ1 − U1^T B V1‖_F^2 + ‖U1^T B V2‖_F^2 + ‖U2^T B V2‖_F^2 )^{1/2}
            = ( ‖(D ◦ (U1^T B V1 Σ1 + Σ1 V1^T B^T U1)) Σ1 − U1^T B V1‖_F^2 + ‖BV2‖_F^2 )^{1/2},

as desired. Now, substituting A11 and A12, we have

  A = Ū [D ◦ (U1^T B V1 Σ1 + Σ1 V1^T B^T U1)  Σ1^{-1} V1^T B^T U2; U2^T B V1 Σ1^{-1}  A22] U^H   (8)

which upon simplification yields

  A = Ū1 [D ◦ (U1^T B V1 Σ1 + Σ1 V1^T B^T U1)] U1^H + (BX†)^T (I − XX†) + (I − XX†)^T BX† + (I − XX†)^T Z (I − XX†),

for some Z in S, as desired.

Next, suppose that S = skew-sym. Then we have

  A = [Ū1 Ū2] [A11 −A12^T; A12 A22] [U1^H; U2^H],

where A11 = −A11^T in C^{r×r} and, A12 and A22 = −A22^T are matrices of compatible sizes. Then, as before,

  ‖AX − B‖_F^2 = ‖A11 U1^H X − U1^T B‖_F^2 + ‖A12 U1^H X − U2^T B‖_F^2

is minimized when A12 = U2^T B V1 Σ1^{-1} and ‖A11 U1^H X − U1^T B‖_F^2 is minimized over all skew-symmetric matrices A11. Again by Lemma 3.2, a minimizer is given by

  aij = (σj bij − σi bji)/(σi^2 + σj^2), aji = −aij, for i, j = 1:r,

where U1^T B V1 = [bij], that is, A11 = D ◦ (U1^T B V1 Σ1 − Σ1 V1^T B^T U1). Consequently, we have

  ρ^S(X, B) = ( ‖(D ◦ (U1^T B V1 Σ1 − Σ1 V1^T B^T U1)) Σ1 − U1^T B V1‖_F^2 + ‖BV2‖_F^2 )^{1/2}

and

  A = Ū [D ◦ (U1^T B V1 Σ1 − Σ1 V1^T B^T U1)  −Σ1^{-1} V1^T B^T U2; U2^T B V1 Σ1^{-1}  A22] U^H   (9)

which upon simplification yields the desired form of A.
Finally, suppose that S = Herm. Then we have

  A = U (U^H A U) U^H = [U1 U2] [A11 A12^H; A12 A22] [U1^H; U2^H],

where A11^H = A11 in C^{r×r} and, A12 and A22^H = A22 are matrices of compatible sizes. Again note that

  ‖AX − B‖_F^2 = ‖A11 U1^H X − U1^H B‖_F^2 + ‖A12 U1^H X − U2^H B‖_F^2

is minimized when A12 = U2^H B V1 Σ1^{-1} and ‖A11 U1^H X − U1^H B‖_F^2 is minimized over all Hermitian matrices A11. In view of Lemma 3.2, a minimizer is given by

  aij = (σj bij + σi b̄ji)/(σi^2 + σj^2), aji = āij, for i, j = 1:r,

where U1^H B V1 = [bij]. In other words, A11 = D ◦ (U1^H B V1 Σ1 + Σ1 V1^H B^H U1). Thus, we have

  ρ^S(X, B) = ( ‖(D ◦ (U1^H B V1 Σ1 + Σ1 V1^H B^H U1)) Σ1 − U1^H B V1‖_F^2 + ‖BV2‖_F^2 )^{1/2}

and

  A = U [D ◦ (U1^H B V1 Σ1 + Σ1 V1^H B^H U1)  Σ1^{-1} V1^H B^H U2; U2^H B V1 Σ1^{-1}  A22] U^H   (10)

which upon simplification yields the desired form of A. This completes the proof. □

Theorem 3.4 characterizes the smallest norm solution set SP_min(X, B) and determines σ^S(X, B). For the Frobenius norm, a similar result can be found in [11,18] for real symmetric matrices and in [22] for real skew-symmetric matrices.
Theorem 3.4. Let (X, B) in K^{n×p} × K^{n×p} and S in {L, J}. Suppose that rank(X) = r and consider the SVD X = U Σ V^H. Partition U = [U1 U2] and V = [V1 V2], where U1 in K^{n×r} and V1 in K^{p×r}. Let Σ1 := Σ(1:r, 1:r) = diag(σ1, . . . , σr).

1. Frobenius norm: We have

  σ^S(X, B) = ( ‖D ◦ (U1^* MB V1 Σ1 + Σ1 V1^* (MB)^* U1)‖_F^2 + 2‖U2^* MB V1 Σ1^{-1}‖_F^2 )^{1/2}, if MS in {sym, Herm},
  σ^S(X, B) = ( ‖D ◦ (U1^T MB V1 Σ1 − Σ1 V1^T (MB)^T U1)‖_F^2 + 2‖U2^T MB V1 Σ1^{-1}‖_F^2 )^{1/2}, if MS = skew-sym,

where * = T when MS = sym, and * = H when MS = Herm. Here D in C^{r×r} is given by Dij := 1/(σi^2 + σj^2) and ◦ denotes the Hadamard product.
Further, we have SP_min(X, B) = {Ao}, that is, the smallest norm solution is unique and is given by

  Ao = M^{-1} F+^T(X, MB), if MS = sym,
  Ao = M^{-1} F−^T(X, MB), if MS = skew-sym,
  Ao = M^{-1} F+^H(X, MB), if MS = Herm.

2. Spectral norm: We have

  σ^S(X, B) = ‖ [D ◦ (U1^* MB V1 Σ1 + Σ1 V1^* (MB)^* U1); U2^* MB V1 Σ1^{-1}] ‖_2 =: μ1, if MS in {sym, Herm},
  σ^S(X, B) = ‖ [D ◦ (U1^T MB V1 Σ1 − Σ1 V1^T (MB)^T U1); U2^T MB V1 Σ1^{-1}] ‖_2 =: μ2, if MS = skew-sym,

where * = T when MS = sym, and * = H when MS = Herm. Further, we have Ao in SP_min(X, B) if and only if Ao = M^{-1} G(X, MB), where

  G(X, B) = F+^T(X, B) − (I − XX†)^T K [D ◦ (U1^T B V1 Σ1 + Σ1 V1^T B^T U1)] K^T (I − XX†) + φ(Z), if MS = sym,
  G(X, B) = F−^T(X, B) − (I − XX†)^T K [D ◦ (U1^T B V1 Σ1 − Σ1 V1^T B^T U1)] K^T (I − XX†) + φ(Z), if MS = skew-sym,
  G(X, B) = F+^H(X, B) − (I − XX†)^H K [D ◦ (U1^H B V1 Σ1 + Σ1 V1^H B^H U1)] K^H (I − XX†) + φ(Z), if MS = Herm,

where

  K = BX† U1 [μ1^2 I − (D ◦ (U1^T B V1 Σ1 + Σ1 V1^T B^T U1)) (D ◦ (U1^T B V1 Σ1 + Σ1 V1^T B^T U1))]^{-1/2}, if MS = sym,
  K = BX† U1 [μ2^2 I − (D ◦ (U1^T B V1 Σ1 − Σ1 V1^T B^T U1)) (D ◦ (U1^T B V1 Σ1 − Σ1 V1^T B^T U1))]^{-1/2}, if MS = skew-sym,
  K = BX† U1 [μ1^2 I − (D ◦ (U1^H B V1 Σ1 + Σ1 V1^H B^H U1)) (D ◦ (U1^H B V1 Σ1 + Σ1 V1^H B^H U1))]^{-1/2}, if MS = Herm,
and

  φ(Z) = μ1 Ū2 [I − U2^T K K^H Ū2]^{1/2} Z [I − U2^H K K^T U2]^{1/2} U2^H, with Z = Z^T, ‖Z‖_2 ≤ 1, if MS = sym,
  φ(Z) = μ2 Ū2 [I + U2^T K K^H Ū2]^{1/2} Z [I + U2^H K K^T U2]^{1/2} U2^H, with Z = −Z^T, ‖Z‖_2 ≤ 1, if MS = skew-sym,
  φ(Z) = μ1 U2 [I − U2^H K K^H U2]^{1/2} Z [I − U2^H K K^H U2]^{1/2} U2^H, with Z = Z^H, ‖Z‖_2 ≤ 1, if MS = Herm.

Proof. By the same arguments as in the proof of Theorem 3.3, we only need to prove the results for the case when S in {sym, skew-sym, Herm}. So, suppose that S = sym. Then by Theorem 3.3 and (8) we have

  Ao = Ū [D ◦ (U1^T B V1 Σ1 + Σ1 V1^T B^T U1)  Σ1^{-1} V1^T B^T U2; U2^T B V1 Σ1^{-1}  A22] U^H.

This shows that

  ‖Ao‖_F^2 = ‖D ◦ (U1^T B V1 Σ1 + Σ1 V1^T B^T U1)‖_F^2 + 2‖U2^T B V1 Σ1^{-1}‖_F^2 + ‖A22‖_F^2

is minimized when A22 = 0, which yields the desired results for the Frobenius norm.

For the spectral norm, note that ‖Ao‖_2 ≥ μ1, where

  μ1 = ‖ [D ◦ (U1^T B V1 Σ1 + Σ1 V1^T B^T U1); U2^T B V1 Σ1^{-1}] ‖_2.

By Theorem 2.1, we have ‖Ao‖_2 = μ1 when

  A22 = −K (D ◦ (U1^T B V1 Σ1 + Σ1 V1^T B^T U1)) K^T + μ1 (I − K K^H)^{1/2} Z (I − K K^T)^{1/2},

where

  K = U2^T B X† U1 [μ1^2 I − (D ◦ (U1^T B V1 Σ1 + Σ1 V1^T B^T U1)) (D ◦ (U1^T B V1 Σ1 + Σ1 V1^T B^T U1))]^{-1/2}

and Z is an arbitrary contraction such that Z = Z^T. Hence the desired form of Ao follows for the spectral norm. The proof is similar when S = Herm and follows from Theorem 3.3 and (10). So, suppose that S = skew-sym. Then by Theorem 3.3 and (9), we have

  Ao = Ū [D ◦ (U1^T B V1 Σ1 − Σ1 V1^T B^T U1)  −Σ1^{-1} V1^T B^T U2; U2^T B V1 Σ1^{-1}  A22] U^H.
As before, setting A22 = 0, the desired results follow for the Frobenius norm. On the other hand, applying Theorem 2.1 to the matrix Ao and following steps similar to those in the case when S = sym, we obtain the desired results for the spectral norm. □

Remark 3.5. We mention that the inverses in the expression of K in Theorem 3.4 are replaced with their Moore–Penrose pseudo-inverses when the matrices are singular.

As an immediate consequence of Theorems 3.3 and 3.4, we have the following result.

Corollary 3.6. Let (X, B) in K^{n×p} × K^{n×p} and S in {L, J}. Then the following results hold.

(a) The solution set SP(X, B) is an affine subspace of S. In particular, if rank(X) = n (that is, X has full row-rank) then SP(X, B) = SP_min(X, B) is a singleton set.
(b) For the Frobenius norm, the smallest norm solution set SP_min(X, B) is a singleton subset of SP(X, B).
(c) For the spectral norm, the smallest norm solution set SP_min(X, B) is a nontrivial convex subset of SP(X, B) when X is rank deficient (rank(X) < n).

We now show that solutions of the Hermitian, real symmetric and real skew-symmetric Procrustes problems can be characterized via solutions of a Lyapunov equation. This in turn provides a characterization of solutions of the structured Procrustes problem for the case when MS = Herm and when MS in {sym, skew-sym} with K = R. Note that Herm = sym when K = R. Hence solutions of the real symmetric Procrustes problem follow from the Hermitian Procrustes problem by considering K = R. We therefore only need to characterize SP(X, B) via a Lyapunov equation for the case when S = Herm and when S = skew-sym with K = R.

Theorem 3.7. Let (X, B) in K^{n×p} × K^{n×p} and S in {sym, skew-sym, Herm}. Then we have

  A in SP(X, B) ⟺ A in S and re(Tr(Y^H (AX − B) X^H)) = 0 for all Y in S.

Consider the Lyapunov operator L_X : K^{n×n} → K^{n×n} given by L_X(A) = XX^H A + A XX^H. Then the following results hold.

(a) Let S := Herm. Then SP(X, B) = {A in S : L_X(A) = BX^H + XB^H}. In particular, if rank(X) = n (that is, X has full row-rank) then
  SP(X, B) = { ∫_0^∞ e^{−(XX^H)t} (BX^H + XB^H) e^{−(XX^H)t} dt }
which shows that the Hermitian Procrustes problem has a unique positive definite (resp., semi-definite) solution whenever BX^H + XB^H is positive definite (resp., semi-definite).

(b) Let K := R and S := skew-sym. Then SP(X, B) = {A in S : L_X(A) = BX^T − XB^T}. In particular, if rank(X) = n (that is, X has full row-rank) then

  SP(X, B) = { ∫_0^∞ e^{−(XX^T)t} (BX^T − XB^T) e^{−(XX^T)t} dt }

which shows that the skew-symmetric Procrustes problem has a unique solution.
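Before the proof, part (a) with K = R (so Herm = sym) can be checked numerically using SciPy's continuous Lyapunov solver; a sketch with hypothetical random full row-rank data:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(6)
n, p = 4, 6
X = rng.standard_normal((n, p))   # generically of full row-rank, so X X^T is invertible
B = rng.standard_normal((n, p))

# Part (a) with K = R: solve X X^T A + A X X^T = B X^T + X B^T for A.
A = solve_continuous_lyapunov(X @ X.T, B @ X.T + X @ B.T)

assert np.allclose(A, A.T)   # the solution is symmetric
# Optimality: no symmetric competitor achieves a smaller residual.
res = np.linalg.norm(A @ X - B, 'fro')
for _ in range(20):
    Y = rng.standard_normal((n, n)); Y = Y + Y.T
    assert np.linalg.norm((A + Y) @ X - B, 'fro') >= res - 1e-8
```

The optimality loop works because, by the trace condition of Theorem 3.7, the cross term in the expansion of the perturbed residual vanishes for every symmetric Y.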
which shows that the skew-symmetric Procrustes problem has a unique solution. Proof. For Y ∈ S, define φY : R → R by φY (t) := (A + tY )X − B2F . Then we have φY (t) = AX − B2F + 2t re(Tr(Y H (AX − B)X H )) + t2 Y X2F which shows that φY (0) = 2 re(Tr(Y H (AX − B)X H )). Now if A ∈ SP(X, B) then obviously φY (t) has a global minimum at t = 0 for all Y ∈ S and hence φY (0) = 0 for all Y ∈ S gives the desired condition. Conversely, suppose that re(Tr(Y H (AX − B)X H )) = 0 for all Y ∈ S. Then we have (A + Y )X − B2F = AX − B2F + 2 re(Tr(Y H (AX − B)X H )) + Y X2F ≥ AX − B2F for all Y ∈ S which shows that A ∈ SP(X, B). Next, let S := Herm and A ∈ S. Then for all Y ∈ S we have 2 re(Tr(Y H (AX − B)X H )) = Tr((XX H A + AXX H − BX H − XB H )Y ). Hence re(Tr(Y H (AX − B)X H )) = 0 for all Y ∈ S ⇐⇒ LX (A) = (BX H + XB H ). This shows that A ∈ SP(X, B) ⇐⇒ LX (A) = (BX H +XB H ). Note that XX H is invertible if and only if the row-rank of X equals rank(X). In such a case, the solution of the Lyapunov ∞ H H equation can be written as [12, p. 415] A = 0 e−(XX )t (BX H +XB H )e−(XX )t dt from which the desired conclusions follow. Finally, let K := R, S := skew-sym and A ∈ S. Then for all Y ∈ S we have 2 re(Tr(Y H (AX − B)X H )) = Tr((XX T A + AXX T − BX T + XB T )Y ) from which it follows that A ∈ SP(X, B) ⇐⇒ LX (A) = (BX T − XB T ). If the row-rank of X equals rank(X) then XX T is invertible and, in such a case, the desired result follows from the solution of the Lyapunov equation LX (A) = BX T − XB T . This completes the proof. 2 4. Computation The proof of Theorem 3.4 provides a computational method for solving the structured Procrustes problem. Indeed, a backward stable algorithm for computing the smallest
Frobenius norm solution of the SPP can be derived as follows. Consider the singular value decomposition X = U diag(σ1, · · · , σr, 0, · · · , 0) V^H and set Σr := diag(σ1, · · · , σr), where r := rank(X). Define the real symmetric matrix D in R^{r×r} by D(i, j) := (σi^2 + σj^2)^{-1} for i, j = 1:r. Compute B̂ := U^H B V when S = Herm and B̂ := U^T B V when S in {sym, skew-sym}. Partition B̂ as

  B̂ = [B11 B12; B21 B22]

with B11 in K^{r×r}, and compute the structured matrix Ar in K^{r×r} by

  Ar := D ◦ (B11 Σr + Σr B11^H) if S = Herm,
  Ar := D ◦ (B11 Σr + Σr B11^T) if S = sym,
  Ar := D ◦ (B11 Σr − Σr B11^T) if S = skew-sym.   (11)

Note that in order to compute the structured matrix Ar, it is enough to compute the upper triangular part of Ar and then obtain the lower triangular part by imposing the structure on Ar. For example, when S = Herm, the upper triangular part of Ar can be computed as

  Ar(i, i) := re(B11(i, i))/σi  and  Ar(i, j) := (B11(i, j) σj + B̄11(j, i) σi)/(σi^2 + σj^2) for j > i.

Similarly, when S = skew-sym, the upper triangular part of Ar can be computed as

  Ar(i, i) := 0  and  Ar(i, j) := (B11(i, j) σj − B11(j, i) σi)/(σi^2 + σj^2) for j > i.

Now defining A in S by

  A := U Â U^H when S = Herm  and  A := Ū Â U^H when S in {sym, skew-sym},   (12)

we obtain the desired smallest Frobenius norm solution of the SPP, where

  Â = [Ar  (B21 Σr^{-1})^H; B21 Σr^{-1}  0] if S = Herm,
  Â = [Ar  (B21 Σr^{-1})^T; B21 Σr^{-1}  0] if S = sym,
  Â = [Ar  −(B21 Σr^{-1})^T; B21 Σr^{-1}  0] if S = skew-sym.

Further, the residual ρ^S(X, B) and the norm σ^S(X, B) of the smallest norm solution A are given by

  ρ^S(X, B) = ( ‖Ar Σr − B11‖_F^2 + ‖B12‖_F^2 + ‖B22‖_F^2 )^{1/2},
  σ^S(X, B) = ( ‖Ar‖_F^2 + 2‖B21 Σr^{-1}‖_F^2 )^{1/2} = ‖Â‖_F.

Thus we have the following algorithm.
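The computation above, that is, (11), (12) and the residual formulas, can be transcribed directly into NumPy. This is a sketch under the stated SVD conventions; the function name `spp_min_frobenius` and the rank tolerance are our own choices, and the test data are hypothetical:

```python
import numpy as np

def spp_min_frobenius(X, B, structure):
    """Smallest Frobenius-norm solution of min ||AX - B||_F over
    structure in {'herm', 'sym', 'skew-sym'}, via (11) and (12)."""
    n, p = X.shape
    U, s, Vh = np.linalg.svd(X)
    r = int(np.sum(s > 1e-12 * max(s[0], 1.0)))          # numerical rank
    Sr = s[:r]
    V = Vh.conj().T
    Bh = (U.conj().T if structure == 'herm' else U.T) @ B @ V
    B11, B12 = Bh[:r, :r], Bh[:r, r:]
    B21, B22 = Bh[r:, :r], Bh[r:, r:]
    D = 1.0 / (Sr[:, None] ** 2 + Sr[None, :] ** 2)      # D(i, j) = 1/(s_i^2 + s_j^2)
    if structure == 'herm':
        Ar = D * (B11 * Sr[None, :] + Sr[:, None] * B11.conj().T)
    elif structure == 'sym':
        Ar = D * (B11 * Sr[None, :] + Sr[:, None] * B11.T)
    else:                                                # 'skew-sym'
        Ar = D * (B11 * Sr[None, :] - Sr[:, None] * B11.T)
    B21s = B21 / Sr[None, :]                             # B21 Sigma_r^{-1}
    off = B21s.conj().T if structure == 'herm' else B21s.T
    if structure == 'skew-sym':
        off = -off
    Ahat = np.block([[Ar, off],
                     [B21s, np.zeros((n - r, n - r), dtype=Bh.dtype)]])
    A = (U if structure == 'herm' else U.conj()) @ Ahat @ U.conj().T
    rho = np.sqrt(np.linalg.norm(Ar * Sr[None, :] - B11, 'fro') ** 2
                  + np.linalg.norm(B12, 'fro') ** 2 + np.linalg.norm(B22, 'fro') ** 2)
    sigma = np.linalg.norm(Ahat, 'fro')                  # = sqrt(||Ar||_F^2 + 2||B21 Sigma_r^{-1}||_F^2)
    return A, rho, sigma

# Rank-deficient real test data (hypothetical).
rng = np.random.default_rng(7)
X = rng.standard_normal((5, 4)); X[:, 3] = X[:, 0] + X[:, 1]
B = rng.standard_normal((5, 4))

A, rho, sigma = spp_min_frobenius(X, B, 'sym')
assert np.allclose(A, A.T)
assert np.isclose(np.linalg.norm(A @ X - B, 'fro'), rho)
assert np.isclose(np.linalg.norm(A, 'fro'), sigma)

A2, rho2, _ = spp_min_frobenius(X, B, 'skew-sym')
assert np.allclose(A2, -A2.T)
assert np.isclose(np.linalg.norm(A2 @ X - B, 'fro'), rho2)
```

The returned residual ρ and norm σ agree with direct evaluation of ‖AX − B‖_F and ‖A‖_F, which is exactly what the assertions verify.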
Algorithm: Smallest Frobenius norm solution of the SPP
Input: Two n × p matrices X and B and a structure S in {Herm, sym, skew-sym}.
Output: A in SP_min(X, B), ρ^S(X, B) and σ^S(X, B).

1. Compute the SVD X = U diag(σ1, · · · , σr, 0, · · · , 0) V^H, where r := rank(X).
2. If S = Herm then compute B̂ := U^H B V else compute B̂ := U^T B V.
3. Compute the structured matrix Ar as in (11).
4. Compute B̂(r+1:n, j) ← B̂(r+1:n, j)/σj, for j = 1:r, and σ^S(X, B) := (‖Ar‖_F^2 + 2‖B̂(r+1:n, 1:r)‖_F^2)^{1/2}.
5. Compute the structured matrix A as in (12).
6. Compute Ar(:, j) ← Ar(:, j) σj, for j = 1:r, and ρ^S(X, B) := (‖Ar − B̂(1:r, 1:r)‖_F^2 + ‖B̂(:, r+1:p)‖_F^2)^{1/2}.

We illustrate our computational method by considering a numerical example. We consider the Hamiltonian Procrustes problem. A matrix A in K^{2n×2n} is Hamiltonian if (JA)^H = JA, where J := [0 I; −I 0] and I is the identity matrix of size n. In other words, the Lie algebra L associated with M := J consists of Hamiltonian matrices. Equivalently, A is Hamiltonian if it is of the form

  A := [E F; G −E^H],

where G^H = G in K^{n×n} and F^H = F in K^{n×n}. The following results are computed using Matlab and are correct to the digits shown. For
  X := [1.9805 1.7235 0.4305 2.0193; 1.7235 1.7610 0.2943 1.8172; 0.4305 0.2943 0.1185 0.4213; 2.0193 1.8172 0.4213 2.0754]

and

  B := [0.9572 0.4218 0.6557 0.6787; 0.4854 0.9157 0.0357 0.7577; 0.8003 0.7922 0.8491 0.7431; 0.1419 0.9595 0.9340 0.3922],

the smallest Frobenius norm solution of the Hamiltonian Procrustes problem is given by

  A = [67.7293 8.7880 −15.7491 −70.0422; −27.9298 −37.4144 −70.0422 74.5316; 10.6982 −27.5901 −67.7293 27.9298; −27.5901 −9.7250 −8.7880 37.4144],

with σ^S(X, B) = 176.1678 and ρ^S(X, B) = 1.2681.

Conclusion. We have provided a complete solution of the structured Procrustes problem (SPP) A = argmin_{G in S} ‖GX − B‖_F when S is either a Jordan algebra or a Lie algebra associated with an appropriate scalar product. We have characterized solutions of the
SPP (Theorem 3.3) and have determined all solutions which have the smallest norm (Theorem 3.4). We have also provided a characterization of solutions of the SPP based on a Lyapunov equation (Theorem 3.7) and have presented a computational method for obtaining the smallest Frobenius norm solution of the SPP.

References

[1] B. Adhikari, Backward perturbation and sensitivity analysis of structured polynomial eigenvalue problem, PhD thesis, Department of Mathematics, Indian Institute of Technology Guwahati, India, 2008.
[2] B. Adhikari, R. Alam, Structured mapping problems for linearly structured matrices, Linear Algebra Appl. 444 (2014) 132–145.
[3] R. Alam, S. Bora, M. Karow, V. Mehrmann, J. Moro, Perturbation theory for Hamiltonian matrices and the distance to bounded-realness, SIAM J. Matrix Anal. Appl. 32 (2011) 484–514.
[4] C. Antoulas, Approximation of Large-Scale Dynamical Systems, SIAM, Philadelphia, 2005.
[5] J.E. Brock, Optimal matrices describing linear systems, AIAA J. 6 (1968) 1292–1296.
[6] C. Davis, W.M. Kahan, H.F. Weinberger, Norm-preserving dilations and their applications to optimal error bounds, SIAM J. Numer. Anal. 19 (1982) 445–469.
[7] D.W. Eggert, A. Lorusso, R.B. Fisher, Estimating 3-D rigid body transformations: a comparison of four major algorithms, Mach. Vis. Appl. 9 (1997) 272–290.
[8] J.C. Gower, G.B. Dijksterhuis, Procrustes Problems, Oxford University Press, 2004.
[9] B.F. Green, The orthogonal approximation of an oblique structure in factor analysis, Psychometrika 17 (1952) 429–440.
[10] S. Grivet-Talocia, Passivity enforcement via perturbation of Hamiltonian matrices, IEEE Trans. Circuits Syst. I. Regul. Pap. 51 (2004) 1755–1769.
[11] N.J. Higham, The symmetric Procrustes problem, BIT 28 (1988) 133–143.
[12] P. Lancaster, M. Tismenetsky, The Theory of Matrices, 2nd ed., Academic Press, 1985.
[13] H.J. Larson, Least squares estimation of the components of a symmetric matrix, Technometrics 8 (1966) 360–362.
[14] D.S. Mackey, N. Mackey, F. Tisseur, Structured mapping problems for matrices associated with scalar products, Part I: Lie and Jordan algebras, SIAM J. Matrix Anal. Appl. 29 (2008) 1389–1410.
[15] C. Khatri, S.K. Mitra, Hermitian and nonnegative definite solutions of linear matrix equations, SIAM J. Appl. Math. 31 (1976) 579–585.
[16] J. Meinguet, On the Davis–Kahan–Weinberger solution of the norm-preserving dilation problem, Numer. Math. 49 (1986) 331–341.
[17] S.A. Mulaik, The Foundations of Factor Analysis, McGraw–Hill, New York, 1972.
[18] J.-G. Sun, Two kinds of inverse eigenvalue problems for real symmetric matrices, Math. Numer. Sin. 9 (1987) 206–216 (in Chinese).
[19] P.H. Schonemann, A generalized solution of the orthogonal Procrustes problem, Psychometrika 31 (1966) 1–10.
[20] F. Tisseur, A chart of backward errors and condition numbers for singly and doubly structured eigenvalue problems, SIAM J. Matrix Anal. Appl. 24 (2003) 877–897.
[21] K.G. Woodgate, Least-squares solution of F = PG over positive semidefinite symmetric P, Linear Algebra Appl. 245 (1996) 171–190.
[22] D.X. Xie, L. Zhang, Least-squares solutions of inverse problems for antisymmetric matrices, Chinese J. Engrg. Math. 10 (1993) 25–34 (in Chinese).
[23] Z. Zhang, C. Liu, Least-squares solutions of the equation AX = B over anti-Hermitian generalized Hamiltonian matrices, Numer. Math. J. Chinese Univ. (English Ser.) 15 (2006) 60–66.