Linear Algebra and its Applications 438 (2013) 1946–1961
A “shift-and-deflate” technique for quadratic matrix polynomials

Beatrice Meini
Dipartimento di Matematica, Università di Pisa, largo Bruno Pontecorvo 5, 56127 Pisa, Italy
E-mail address: [email protected]
ARTICLE INFO

Article history: Received 29 November 2010; Accepted 22 November 2011; Available online 27 December 2011
Submitted by V. Mehrmann
AMS classification: 15A18; 65F15; 65F30
Keywords: Matrix polynomial; Eigenvalue; Shift; Deflation; Palindromic matrix polynomial

ABSTRACT

A technique for deflating two eigenvalues λ1, λ2 of an n × n quadratic matrix polynomial A(z) is proposed. The method requires the knowledge of a left eigenvector corresponding to λ1, and of a right eigenvector corresponding to λ2. This technique consists of two stages: the shift of λ1 and λ2 to 0 and ∞, respectively, and the deflation of 0 and ∞. The final result is the construction of an (n − 1) × (n − 1) quadratic matrix polynomial which shares with A(z) all the eigenvalues, except for λ1 and λ2. The particular case of ∗-palindromic quadratic matrix polynomials is treated.
1. Introduction

The technique of deflation is often used in the numerical computation of the roots of a polynomial p(z) of degree m. More specifically, once a root λ is approximated within a required accuracy, this root is deflated, i.e., the problem is reduced to computing the roots of a new polynomial of degree m − 1, which shares with p(z) all the roots except for λ. One important effect of deflation is to prevent the repeated numerical approximation of the same root λ while the remaining roots of p(z) are computed.

The goal of this paper is to extend the deflation technique to the eigenvalues of quadratic matrix polynomials. Let Ai, i = 0, 1, 2, be n × n matrices with complex entries, which define the coefficients of the n × n regular quadratic matrix polynomial
$$A(z) = A_0 + zA_1 + z^2 A_2.$$
The eigenvalues of A(z) are defined as the roots of the polynomial a(z) = det A(z), which has degree at most 2n. By using the customary convention that A(z) has r eigenvalues at infinity if a(z) has degree 2n − r, A(z) has 2n eigenvalues, which we denote by λ1, ..., λ2n. Assume that we know two eigenvalues λ1 and λ2, and the corresponding left and right eigenvectors, i.e., nonzero vectors u1, v2 such that
$$u_1^* A(\lambda_1) = 0, \qquad A(\lambda_2) v_2 = 0.$$
Our goal is to construct a new (n − 1) × (n − 1) quadratic matrix polynomial whose eigenvalues are λ3, ..., λ2n. In this way, the eigenvalues λ1 and λ2 are deflated.

This is not the first attempt to devise a deflation technique for quadratic matrix polynomials. In this regard one can cite the papers [5,15]. In [5] the authors propose a deflation based on a low-rank modification of the coefficients. In [15] the authors propose a deflation based on structure preserving transformations.

In this paper we propose a deflation technique which consists of two stages, which will be denoted by shift and deflate, respectively. The shift stage relies on the ideas developed in [1,13] for shifting a known eigenvalue of a matrix polynomial to a different value, keeping unchanged the remaining eigenvalues. In this stage we shift the eigenvalue λ1 to 0, and the eigenvalue λ2 to ∞. To this purpose we define a new n × n quadratic matrix polynomial Ã(z) having eigenvalues 0, ∞, λ3, ..., λ2n. The matrix coefficients of Ã(z) differ from the original matrix coefficients by rank-one, or rank-two, modifications.

The deflate stage consists in deflating the eigenvalues 0 and ∞ of the matrix polynomial Ã(z). Indeed, by using the property that 0 and ∞ are eigenvalues of Ã(z), we can define a new (n − 1) × (n − 1) quadratic matrix polynomial B(z), whose eigenvalues are λ3, ..., λ2n. To this end, a new matrix polynomial T(z) = H*Ã(z)K is constructed, where H and K are nonsingular matrices such that He1 = u1, Ke1 = v2, where e1 = [1, 0, ..., 0]ᵀ, and the symbol * denotes conjugate transposition. The matrix coefficients of B(z) are rank-one modifications of the trailing (n − 1) × (n − 1) principal submatrices of the coefficients of T(z). The construction of B(z) is possible under the assumption that $u_1^* \widetilde A_1 v_2 \ne 0$.

In the case where the original polynomial is ∗-palindromic, i.e., $A(z) = A_0 + zA_1 + z^2 A_0^*$, where $A_1 = A_1^*$, the new polynomial B(z) can be constructed in such a way that it is still ∗-palindromic. Therefore, the shift-and-deflate technique preserves the ∗-palindromic structure.

There is a lot of freedom in performing the shift-and-deflate technique. In the shift stage, the coefficients of Ã(z) depend on two arbitrary vectors x and y such that x*u1 = 1 and y*v2 = 1. In the deflate stage, the coefficients of B(z) depend on the choice of the nonsingular matrices H and K. When A(z) is ∗-palindromic, a clever choice of x, y, H and K allows one to keep the palindromic property for the matrix polynomial B(z). For generic matrix polynomials, it is still an open issue to find a strategy for the choice of x, y, H and K which guarantees, for instance, good numerical properties of the overall shift-and-deflate process. We have performed some numerical experiments where, in the generic case, we have chosen x = u1/‖u1‖2, y = v2/‖v2‖2, and H, K Householder matrices such that Hu1 = σ1e1, Kv2 = σ2e1, for suitable scalars σ1 and σ2. According to these experiments, in many cases the condition number of the eigenvalues/vectors of B(z) turns out to be lower than the condition number of the eigenvalues/vectors of the original polynomial A(z). Moreover, the repeated application of the shift-and-deflate technique does not deteriorate the accuracy of the approximation of the eigenvalues. Therefore the proposed shift-and-deflate approach seems promising, even though much is still to be investigated.

The paper is organized as follows.
In Section 2 we recall some properties and definitions on matrix polynomials. In Section 3 we describe the shift technique, while in Section 4 we propose the deflation of the eigenvalues 0 and ∞. In Section 5 we synthesize the shift-and-deflate technique. The special case of ∗-palindromic matrix polynomials is discussed in Section 6. Some numerical experiments are reported in Section 7. We draw conclusions and open issues in Section 8.
2. Preliminaries on matrix polynomials

Given an integer k ≥ 1 and the n × n complex matrices Ai, i = 0, ..., k, with $A_k \ne 0$, the function $A(z) = \sum_{i=0}^{k} z^i A_i$ is called an n × n matrix polynomial of degree k. The matrix polynomial A(z) is said to be regular if det A(z) is not identically equal to zero. For a regular matrix polynomial, a(z) = det A(z) is a polynomial of degree at most nk, and its roots are called eigenvalues of A(z). Given an eigenvalue λ, there exist nonzero vectors u, v such that u*A(λ) = 0 and A(λ)v = 0; the vectors u and v are called left and right eigenvectors, respectively, associated with λ. If a(z) has degree r < nk, we adopt the convention that A(z) has nk − r eigenvalues equal to ∞, so that each regular matrix polynomial has exactly nk eigenvalues (including possible eigenvalues equal to ∞). In the following, all the matrix polynomials will be assumed to be regular. Denote by rev(A(z)) the reversed matrix polynomial
$$\mathrm{rev}(A(z)) = z^k A(z^{-1}) = \sum_{i=0}^{k} z^i A_{k-i},$$
obtained by reversing the order of the matrix coefficients of A(z). One can easily verify that λ ∈ C\{0} is an eigenvalue of A(z) if and only if λ⁻¹ is an eigenvalue of rev(A(z)). By this property, using the convention that 1/∞ = 0 and 1/0 = ∞, the eigenvalues of A(z) equal to ∞ can be identified with null eigenvalues of the reversed matrix polynomial. In fact, λ = ∞ is an eigenvalue of A(z) if and only if λ⁻¹ = 0 is an eigenvalue of rev(A(z)). Moreover, any nonzero vectors u, v such that u*rev(A)(0) = 0 and rev(A)(0)v = 0 are called left and right eigenvectors, respectively, of A(z) associated with the eigenvalue ∞, and we will write u*A(∞) = 0 and A(∞)v = 0. Following [6], it is useful to introduce the homogeneous matrix polynomial
$$A(\alpha, \beta) = \sum_{i=0}^{k} \alpha^i \beta^{k-i} A_i. \qquad (1)$$
One has A(z) = A(z, 1) and rev(A(z)) = A(1, z). The pairs (α, β) ≠ (0, 0) such that det A(α, β) = 0 are called eigenvalues of the homogeneous matrix polynomial A(α, β). One may easily verify that A(α, β)v = 0, with v ≠ 0 and β ≠ 0, if and only if λ = α/β is a finite eigenvalue of A(z) and A(λ)v = 0. Moreover, A(α, 0)v = 0, with v ≠ 0, if and only if λ = ∞ is an eigenvalue of A(z) and A(∞)v = 0. For more details on matrix polynomials we refer the reader to [9]. A small computational illustration of these conventions follows.
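The following Python sketch, an illustration added here and not taken from the paper (whose experiments in Section 7 use MATLAB's polyeig), computes all nk eigenvalues of A(z), the infinite ones included, via the standard block companion linearization.

```python
import numpy as np
import scipy.linalg

def polyeig(coeffs):
    """All n*k eigenvalues of A(z) = sum_i z^i A_i, infinite ones
    included, from the block companion pencil C x = z D x.
    `coeffs` is the list [A_0, A_1, ..., A_k] of n x n arrays."""
    k = len(coeffs) - 1
    n = coeffs[0].shape[0]
    N = n * k
    C = np.zeros((N, N), dtype=complex)
    D = np.eye(N, dtype=complex)
    C[:N - n, n:] = np.eye(N - n)                   # shift identities
    for i in range(k):
        C[N - n:, i * n:(i + 1) * n] = -coeffs[i]   # last block row
    D[N - n:, N - n:] = coeffs[k]                   # leading coefficient A_k
    # LAPACK reports inf where the pencil has an eigenvalue at infinity
    return scipy.linalg.eigvals(C, D)
```

For a regular quadratic polynomial (k = 2) this returns 2n values; when A_k is singular, the eigenvalues at infinity appear as inf, matching the convention on a(z) of degree nk − r.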
3. Shift technique

In this section we show how to move an eigenvalue of a matrix polynomial to a different value, keeping unchanged the corresponding eigenvector and the remaining eigenvalues. We will refer to this transformation as the shift technique. A shift technique applied to matrices was proposed by Brauer in [3], and later rediscovered and applied to different fields in [4,8,10,12–14]. The shift can be extended to matrix power series, according to the results of [2, Theorem 3.32]. Here we adapt the results of [2] to the case of a matrix polynomial.

Theorem 1. Let k ≥ 1 and let $A(z) = \sum_{i=0}^{k} z^i A_i$ be an n × n matrix polynomial of degree k. Let λ ∈ C and v ∈ Cⁿ, v ≠ 0, be such that A(λ)v = 0. Then for any η ∈ C and for any vector x such that x*v = 1 the following properties hold:

(1) the function
$$\widetilde A(z) = A(z)\Big(I + \frac{\lambda - \eta}{z - \lambda}\, v x^*\Big) \qquad (2)$$
is a matrix polynomial of degree k;
(2) $\det \widetilde A(z) = \frac{z - \eta}{z - \lambda} \det A(z)$;

(3) $\widetilde A(\eta)v = 0$ and, if σ ∉ {λ, η} is such that A(σ)w = 0 with w ≠ 0, then $\widetilde A(\sigma)\widetilde w = 0$, where
$$\widetilde w = \big(I - (\lambda - \eta)/(\sigma - \eta)\, vx^*\big)\, w.$$
Moreover, by setting
$$\widetilde A(z) = \sum_{i=0}^{k} z^i \widetilde A_i, \qquad (3)$$
one has
$$\widetilde A_k = A_k, \qquad \widetilde A_i = A_i + (\lambda - \eta) \sum_{j=i+1}^{k} \lambda^{j-i-1} A_j v x^*, \quad i = 0, \ldots, k - 1. \qquad (4)$$
Proof. We first prove that Ã(z) is a matrix polynomial of degree k. Assume that |z| > |λ|; then $1/(z - \lambda) = z^{-1}/(1 - \lambda z^{-1}) = z^{-1} \sum_{i=0}^{\infty} \lambda^i z^{-i}$. By replacing this expression in the definition of Ã(z) one finds that, for |z| > |λ|,
$$\widetilde A(z) = A(z)\Big(I + (\lambda - \eta)\, z^{-1} \sum_{i=0}^{\infty} \frac{\lambda^i}{z^i}\, vx^*\Big).$$
Therefore, by setting $\widetilde A(z) = \sum_{i=-\infty}^{+\infty} z^i \widetilde A_i$ and by comparing the coefficients of $z^i$ on both sides, it is immediate to verify that
$$\widetilde A_i = 0, \quad i > k, \qquad \widetilde A_k = A_k, \qquad \widetilde A_i = A_i + (\lambda - \eta) \sum_{j=i+1}^{k} \lambda^{j-i-1} A_j vx^*, \quad i = 0, \ldots, k - 1,$$
$$\widetilde A_{-i} = (\lambda - \eta) \sum_{j=0}^{k} \lambda^{i-1+j} A_j vx^*, \quad i > 0.$$
The latter equation can be rewritten as $\widetilde A_{-i} = (\lambda - \eta)\lambda^{i-1} \sum_{j=0}^{k} \lambda^j A_j vx^* = (\lambda - \eta)\lambda^{i-1} A(\lambda)vx^*$, which is equal to zero by hypothesis. Therefore Ã(z) is a matrix polynomial of degree k, and the matrix coefficients are given by (4). By computing the determinant of both sides of (2) one obtains $\det \widetilde A(z) = \det A(z) \det(I + (\lambda - \eta)/(z - \lambda)\, vx^*) = \det A(z)\, (z - \eta)/(z - \lambda)$. By direct inspection, one may verify that $(I + (\lambda - \eta)/(\eta - \lambda)\, vx^*)\, v = 0$, therefore Ã(η)v = 0. If A(σ)w = 0 and σ ∉ {λ, η}, then the vector w̃ defined in property (3) is such that $(I + (\lambda - \eta)/(\sigma - \lambda)\, vx^*)\, \widetilde w = w$, therefore Ã(σ)w̃ = 0. □

An immediate consequence of the theorem above is that the matrix polynomial Ã(z) has the same eigenvalues as A(z), except for the eigenvalue λ of A(z), which is replaced by the eigenvalue η in Ã(z), i.e., the eigenvalue λ is shifted to η. Moreover, the corresponding right eigenvector is left unchanged. We observe also that the left eigenvectors of Ã(z) are the same as those of A(z). In some applications it is useful to choose η = 0, or to shift λ to ∞, or to perform the shift by multiplying A(z) on the left by a suitable matrix function, thus keeping unchanged the right eigenvectors. The following proposition synthesizes how to obtain these different shifts:

Proposition 2. Let k ≥ 1 and let $A(z) = \sum_{i=0}^{k} z^i A_i$ be an n × n matrix polynomial of degree k. Let λ ∈ C and u, v ∈ Cⁿ, u, v ≠ 0, be such that u*A(λ) = 0 and A(λ)v = 0, and let x, y ∈ Cⁿ be such that x*v = 1 and y*u = 1. Then the following properties hold:

(1) the matrix polynomial
$$\widetilde A(z) = A(z)\Big(I + \frac{\lambda}{z - \lambda}\, vx^*\Big) \qquad (5)$$
is such that $\det \widetilde A(z) = \frac{z}{z - \lambda} \det A(z)$ and $\widetilde A(0)v = 0$, i.e., the eigenvalue λ is shifted to zero, while keeping the same right eigenvector v;

(2) if λ ≠ 0, the matrix polynomial
$$\widetilde A(z) = A(z)\Big(I + \frac{z}{\lambda - z}\, vx^*\Big) \qquad (6)$$
is such that $\det \widetilde A(z) = \frac{\lambda}{\lambda - z} \det A(z)$ and $\widetilde A(\infty)v = 0$, i.e., the eigenvalue λ is shifted to infinity, while keeping the same right eigenvector v;

(3) the matrix polynomial
$$\widetilde A(z) = \Big(I + \frac{\lambda}{z - \lambda}\, yu^*\Big)\, A(z) \qquad (7)$$
is such that $\det \widetilde A(z) = \frac{z}{z - \lambda} \det A(z)$ and $u^* \widetilde A(0) = 0$, i.e., the eigenvalue λ is shifted to zero, while keeping the same left eigenvector u;

(4) if λ ≠ 0, the matrix polynomial
$$\widetilde A(z) = \Big(I + \frac{z}{\lambda - z}\, yu^*\Big)\, A(z) \qquad (8)$$
is such that $\det \widetilde A(z) = \frac{\lambda}{\lambda - z} \det A(z)$ and $u^* \widetilde A(\infty) = 0$, i.e., the eigenvalue λ is shifted to infinity, while keeping the same left eigenvector u.

Proof. The first property is a direct consequence of Theorem 1, by choosing η = 0. The shift of λ to ∞, obtained with (6), can be proved by applying Theorem 1 to the reversed matrix polynomial rev(A(z)), by shifting λ⁻¹ to 0; the matrix polynomial Ã(z) of (6) is the reversed polynomial of
$$\mathrm{rev}(A(z))\Big(I + \frac{\lambda^{-1}}{z - \lambda^{-1}}\, vx^*\Big).$$
The remaining parts of the proposition can be proved similarly. □
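As an illustration of Theorem 1, the coefficients (4) of the shifted polynomial can be formed directly; the sketch below is added here and is not part of the original paper. The helper defaults to x = v/‖v‖², so that x*v = 1.

```python
import numpy as np

def shift_right(coeffs, lam, eta, v, x=None):
    """Coefficients (4) of A~(z) = A(z)(I + (lam - eta)/(z - lam) v x^*).
    Assumes A(lam) v = 0; x defaults to v / ||v||^2, so that x^* v = 1."""
    k = len(coeffs) - 1
    if x is None:
        x = v / (v.conj() @ v)
    vxs = np.outer(v, x.conj())                      # rank-one matrix v x^*
    new = [np.array(c, dtype=complex) for c in coeffs]
    for i in range(k):                               # A~_k = A_k is untouched
        s = sum(lam**(j - i - 1) * coeffs[j] for j in range(i + 1, k + 1))
        new[i] = new[i] + (lam - eta) * (s @ vxs)
    return new
```

Choosing η = 0 gives the shift to zero of Proposition 2(1); applying the same routine to rev(A(z)) at λ⁻¹ and reversing back realizes the shift to infinity of Proposition 2(2).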
4. Deflation

Let A(z) = A₀ + zA₁ + z²A₂ be an n × n quadratic matrix polynomial. We assume that the eigenvalues of A(z) are λ1 = 0, λ2 = ∞, λ3, ..., λ2n. Our goal is to deflate the eigenvalues λ1 and λ2, i.e., to construct an (n − 1) × (n − 1) quadratic matrix polynomial B(z) having eigenvalues λ3, ..., λ2n. In order to perform the deflation, the left eigenvector u1 corresponding to λ1 and the right eigenvector v2 corresponding to λ2 must be known. Moreover, the vectors u1 and A1v2 must not be orthogonal, i.e., we need that $u_1^* A_1 v_2 \ne 0$. The following result allows us to construct the sought polynomial B(z):

Theorem 3. Assume that the n × n matrix polynomial A(z) = A₀ + zA₁ + z²A₂ has eigenvalues λ1 = 0, λ2 = ∞, λ3, ..., λ2n, and let u1, v2 ∈ Cⁿ, u1, v2 ≠ 0, be such that $u_1^* A_0 = 0$ and $A_2 v_2 = 0$. Assume that $\gamma = u_1^* A_1 v_2 \ne 0$. Let H and K be nonsingular matrices such that He1 = u1 and Ke1 = v2, where e1 = [1, 0, ..., 0]ᵀ. Denote by V(z) the trailing (n − 1) × (n − 1) principal submatrix of the matrix polynomial T(z) = H*A(z)K, and set
$$\begin{bmatrix} s_0(z) & s_1(z) & \cdots & s_{n-1}(z) \end{bmatrix} = u_1^* A_1 K + z\, u_1^* A_2 K, \qquad \begin{bmatrix} t_0(z) \\ t_1(z) \\ \vdots \\ t_{n-1}(z) \end{bmatrix} = H^* A_0 v_2 + z\, H^* A_1 v_2. \qquad (9)$$
Then
$$B(z) = V(z) - \frac{1}{\gamma} \begin{bmatrix} t_1(z) \\ \vdots \\ t_{n-1}(z) \end{bmatrix} \begin{bmatrix} s_1(z) & \cdots & s_{n-1}(z) \end{bmatrix} \qquad (10)$$
is an (n − 1) × (n − 1) quadratic matrix polynomial having eigenvalues λ3, ..., λ2n.

Proof. Since H and K are nonsingular matrices, the matrix polynomial T(z) = H*A(z)K has the same eigenvalues as A(z). Since $e_1^T H^* A_0 = 0$ and $A_2 K e_1 = 0$, the polynomial T(z) can be factorized as
$$T(z) = \begin{bmatrix} z & 0 & \cdots & 0 \\ 0 & 1 & & \\ \vdots & & \ddots & \\ 0 & & & 1 \end{bmatrix} \begin{bmatrix} \gamma & s_1(z) & \cdots & s_{n-1}(z) \\ t_1(z) & & & \\ \vdots & & V(z) & \\ t_{n-1}(z) & & & \end{bmatrix},$$
where $\gamma = e_1^T H^* A_1 K e_1 = u_1^* A_1 v_2$ and the sᵢ(z), tᵢ(z) are defined in (9). Since γ ≠ 0, by performing one step of Gaussian elimination on the matrix on the right in the above equation, one finds that det T(z) = γ z det B(z), where B(z) is defined in (10). The factor γz takes care of the eigenvalues λ1 = 0 and λ2 = ∞, and the eigenvalues of B(z) are the remaining eigenvalues of T(z), namely λ3, ..., λ2n. We may easily check that B(z) is an (n − 1) × (n − 1) quadratic matrix polynomial. □

An assumption needed in the above theorem is that the vector u1 must not be orthogonal to A1v2. The following example shows that γ can be equal to, or different from, zero.

Example 1. Let
$$A(z) = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} + zA_1 + z^2 \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix},$$
where A1 is a generic 2 × 2 matrix. The eigenvalues of A(z) are λ1 = 0, λ2 = ∞, λ3, λ4. With the notation of Theorem 3, $u_1^T = [1\ \ {-1}]$ and $v_2^T = [1\ \ 1]$; therefore, if A1 = I, then γ = 0. On the other hand, if A1 is such that $A_1 v_2 \ne \sigma v_2$ for every scalar σ, i.e., v2 is not an eigenvector of A1, then γ ≠ 0.
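The following Python sketch, my own illustration of Theorem 3 under its hypotheses ($u_1^* A_0 = 0$, $A_2 v_2 = 0$, γ ≠ 0) and not code from the paper, carries out the construction of B(z). The matrices H and K are built by completing u1 and v2 to bases, so that He1 = u1 and Ke1 = v2 hold exactly.

```python
import numpy as np

def basis_with_first_column(w):
    """Nonsingular M with M e_1 = w: orthonormalize [w, I] by QR and
    rescale the first column (w is parallel to Q[:, 0], so M stays
    nonsingular for any nonzero w)."""
    n = w.shape[0]
    Q, _ = np.linalg.qr(np.column_stack([w, np.eye(n)]).astype(complex))
    M = Q.copy()
    M[:, 0] = w
    return M

def deflate_zero_inf(A0, A1, A2, u1, v2):
    """Coefficients B0, B1, B2 of the (n-1)x(n-1) polynomial B(z) of
    Theorem 3; assumes u1^* A0 = 0, A2 v2 = 0 and gamma != 0."""
    H = basis_with_first_column(u1)
    K = basis_with_first_column(v2)
    gamma = u1.conj() @ A1 @ v2
    V = [(H.conj().T @ A @ K)[1:, 1:] for A in (A0, A1, A2)]
    s0 = (u1.conj() @ A1 @ K)[1:]        # s(z) = s0 + z*s1, cf. (9)
    s1 = (u1.conj() @ A2 @ K)[1:]
    t0 = (H.conj().T @ A0 @ v2)[1:]      # t(z) = t0 + z*t1, cf. (9)
    t1 = (H.conj().T @ A1 @ v2)[1:]
    # B(z) = V(z) - t(z) s(z) / gamma, collected by powers of z, cf. (10)
    B0 = V[0] - np.outer(t0, s0) / gamma
    B1 = V[1] - (np.outer(t0, s1) + np.outer(t1, s0)) / gamma
    B2 = V[2] - np.outer(t1, s1) / gamma
    return B0, B1, B2
```

With the 2 × 2 data of Example 1 and any A1 for which γ ≠ 0, the routine returns a 1 × 1 quadratic polynomial carrying the two remaining eigenvalues λ3, λ4.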
There is a lot of freedom in the choice of the matrices H and K. Specific properties of the matrix polynomial can help in choosing these matrices optimally; for instance, in Section 6 we will show how to choose H and K in order to preserve the palindromic properties. In general, H and K may be chosen to be Householder matrices such that Hu1 = σ1e1 and Kv2 = σ2e1, respectively, for suitable scalars σ1 and σ2. This choice guarantees a low computational cost and numerical stability in the construction of the coefficients of T(z). In particular, the coefficients of T(z) are low-rank corrections of the coefficients of A(z) and, according to the results of [6], the numerical conditioning of the eigenvalues/vectors of T(z) is the same as that of the eigenvalues/vectors of A(z).

5. Shift-and-deflate technique

Here we use the results of Sections 3 and 4 in order to deflate a pair of eigenvalues of a quadratic matrix polynomial.
Let A(z) = A₀ + zA₁ + z²A₂ be an n × n quadratic matrix polynomial, with eigenvalues λ1, ..., λ2n. Our goal is to deflate the eigenvalues λ1 and λ2, i.e., to construct an (n − 1) × (n − 1) quadratic matrix polynomial B(z) having eigenvalues λ3, ..., λ2n. In order to construct B(z) we first define a new n × n quadratic matrix polynomial Ã(z) having eigenvalues 0, ∞, λ3, ..., λ2n, and then we apply the deflation to the eigenvalues 0 and ∞ of Ã(z). The construction of the polynomial Ã(z) relies on the shift technique described in Section 3, and is based on the use of the left eigenvector associated with λ1 and the right eigenvector associated with λ2.

Theorem 4. Let A(z) be a quadratic matrix polynomial, and let λ1, ..., λ2n be its eigenvalues, where we assume that λ2 ≠ 0. Let u1, v2 ∈ Cⁿ, u1, v2 ≠ 0, be such that $u_1^* A(\lambda_1) = 0$ and $A(\lambda_2) v_2 = 0$. Let x, y ∈ Cⁿ be such that x*u1 = 1 and y*v2 = 1, and set $Q = xu_1^*$, $P = v_2 y^*$. Then the function
$$\widetilde A(z) = \Big(I + \frac{\lambda_1}{z - \lambda_1}\, Q\Big)\, A(z)\, \Big(I + \frac{z}{\lambda_2 - z}\, P\Big) \qquad (11)$$
is a quadratic matrix polynomial, having eigenvalues 0, ∞, λ3, ..., λ2n and matrix coefficients
$$\widetilde A_0 = A_0 - QA_0, \qquad \widetilde A_1 = A_1 + \lambda_1 QA_2 + \lambda_2^{-1}(A_0 - QA_0)P, \qquad \widetilde A_2 = A_2 - A_2 P. \qquad (12)$$
Moreover, one has $u_1^* \widetilde A(0) = 0$ and $\widetilde A(\infty) v_2 = 0$.

Proof. We proceed as in the proof of Theorem 1. By expanding the function $I + \frac{\lambda_1}{z - \lambda_1} Q$ in power series of z, we find that the matrix polynomial $\widehat A(z) = (I + \frac{\lambda_1}{z - \lambda_1} Q)\, A(z)$ has matrix coefficients
$$\widehat A_0 = A_0 - QA_0, \qquad \widehat A_1 = A_1 + \lambda_1 QA_2, \qquad \widehat A_2 = A_2. \qquad (13)$$
Similarly, the matrix polynomial $\widetilde A(z) = \widehat A(z)\,(I + \frac{z}{\lambda_2 - z} P)$ has matrix coefficients
$$\widetilde A_0 = \widehat A_0, \qquad \widetilde A_1 = \widehat A_1 + \lambda_2^{-1} \widehat A_0 P, \qquad \widetilde A_2 = \widehat A_2 - \widehat A_2 P. \qquad (14)$$
By replacing (13) in (14) we arrive at (12). Therefore, the polynomial Ã(z) has the eigenvalues of A(z), except for λ1, which is shifted to 0, and λ2, which is shifted to ∞. □

In order to perform the deflation, we apply Theorem 3 to Ã(z). The condition for the applicability of the deflation is
$$\gamma = u_1^* \widetilde A_1 v_2 \ne 0. \qquad (15)$$
Observe that, since $u_1^* \widetilde A_0 = 0$, $\widetilde A_2 v_2 = 0$ and $u_1^*(A_0 - QA_0) = 0$, from (12) it follows that
$$\gamma = u_1^* A_1 v_2 + \lambda_1 u_1^* A_2 v_2.$$
Actually, the value of γ does not depend on the choice of the vectors x and y needed in the construction of Ã(z). If λ1 = 0 then $\gamma = u_1^* A_1 v_2$. If λ1 ≠ 0, since $u_1^* A(\lambda_1) = 0$, we obtain that $\gamma = -\lambda_1^{-1} u_1^* A_0 v_2$. Therefore the condition for the applicability of the shift-and-deflate technique is $u_1^* A_0 v_2 \ne 0$ if λ1 ≠ 0, and $u_1^* A_1 v_2 \ne 0$ if λ1 = 0. A small computational sketch of this shift stage is given below.
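A direct transcription of (12) gives the shift stage in a few lines; the sketch below is again an added illustration (assuming λ1 and λ2 finite, with λ2 ≠ 0), and its output can be passed to the deflation of Section 4.

```python
import numpy as np

def shift_pair(A0, A1, A2, lam1, lam2, u1, v2, x=None, y=None):
    """Coefficients (12) of A~(z): lam1 is shifted to 0 and lam2 to
    infinity. Assumes u1^* A(lam1) = 0, A(lam2) v2 = 0 and lam2 != 0;
    x and y default to scalings with x^* u1 = 1 and y^* v2 = 1."""
    if x is None:
        x = u1 / (u1.conj() @ u1)
    if y is None:
        y = v2 / (v2.conj() @ v2)
    Q = np.outer(x, u1.conj())           # Q = x u1^*
    P = np.outer(v2, y.conj())           # P = v2 y^*
    A0t = A0 - Q @ A0
    A1t = A1 + lam1 * (Q @ A2) + ((A0 - Q @ A0) @ P) / lam2
    A2t = A2 - A2 @ P
    return A0t, A1t, A2t
```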
The following results give some insight into the condition γ ≠ 0. These results will be useful in Section 6, when we consider the case where A(z) is a ∗-palindromic matrix polynomial.

Theorem 5. Assume that the matrix polynomial A(z) can be factorized as
$$A(z) = (zI - R)\, W\, (I - zG), \qquad (16)$$
where R, G and W are n × n matrices. Let λ1, λ2, u1, v2, Q, P and Ã(z) be as in Theorem 4. Moreover, assume that $u_1^* R = \lambda_1 u_1^*$ and $G v_2 = \lambda_2^{-1} v_2$. Then the matrix polynomial Ã(z) can be factorized as $\widetilde A(z) = (zI - \widetilde R)\, W\, (I - z\widetilde G)$, where $\widetilde R = R - \lambda_1 Q$ and $\widetilde G = G - \lambda_2^{-1} P$. Moreover, the constant γ of (15) is $\gamma = u_1^* W v_2$.
Proof. From the factorization of A(z) and from (11) it follows that
$$\widetilde A(z) = \Big(I + \frac{\lambda_1}{z - \lambda_1}\, Q\Big)\, (zI - R)\, W\, (I - zG)\, \Big(I + \frac{z}{\lambda_2 - z}\, P\Big).$$
The property $u_1^* R = \lambda_1 u_1^*$ implies that $QR = \lambda_1 Q$, therefore
$$\Big(I + \frac{\lambda_1}{z - \lambda_1}\, Q\Big)(zI - R) = zI - (R - \lambda_1 Q) = zI - \widetilde R.$$
Similarly, the property $G v_2 = \lambda_2^{-1} v_2$ implies that $GP = \lambda_2^{-1} P$, therefore
$$(I - zG)\Big(I + \frac{z}{\lambda_2 - z}\, P\Big) = I - z(G - \lambda_2^{-1} P) = I - z\widetilde G.$$
From the factorization of A(z), it follows that its matrix coefficients have the expression
$$A_0 = -RW, \qquad A_1 = W + RWG, \qquad A_2 = -WG.$$
If λ1 ≠ 0 one has $\gamma = -\lambda_1^{-1} u_1^* A_0 v_2$; therefore, from the expression for A₀ and the property $u_1^* R = \lambda_1 u_1^*$, it follows that $\gamma = u_1^* W v_2$. Similarly, if λ1 = 0 one has $\gamma = u_1^* A_1 v_2$; therefore, from the expression of A₁, one deduces that $\gamma = u_1^* W v_2$. □

The above theorem relates the condition γ ≠ 0 to the existence and properties of the factorization (16), where G and R are such that $u_1^* R = \lambda_1 u_1^*$ and $G v_2 = \lambda_2^{-1} v_2$. Conditions for the existence of such a factorization can be derived directly, for instance, from [2, Theorem 3.20]:

Theorem 6. Assume that n eigenvalues of A(z) have modulus less than one, and that n eigenvalues have modulus greater than one. Assume that there exists a matrix R with spectral radius less than one which solves the matrix equation $X^2 A_2 + X A_1 + A_0 = 0$. Then there exists a matrix G with spectral radius less than one which solves the matrix equation $A_0 X^2 + A_1 X + A_2 = 0$. Moreover, the factorization (16) holds.

One may easily check that u*A(λ) = 0 for any λ, u such that u*R = λu*, and that A(μ)v = 0 for any μ, v such that Gv = μ⁻¹v. Therefore the eigenvalues of R are the eigenvalues of A(z) in the open unit disk, while the eigenvalues of G are the reciprocals of the eigenvalues of A(z) outside the closed unit disk.

6. The case of palindromic quadratic matrix polynomials

In the case where A(z) is a ∗-palindromic quadratic matrix polynomial, a suitable choice of the vectors x and y in the shift stage, and of the matrices H and K in the deflation stage, allows one to keep B(z) palindromic.
Assume that A(z) is a quadratic ∗-palindromic matrix polynomial, i.e., $z^2 A^*(z^{-1}) = A(z)$ or, in terms of matrix coefficients,
$$A(z) = A_0 + zA_1 + z^2 A_0^*,$$
where $A_1 = A_1^*$. Then the eigenvalues of A(z) come in pairs $(\lambda, \bar\lambda^{-1})$, and if u*A(λ) = 0, then $A(\bar\lambda^{-1})u = 0$. Our goal is to deflate a pair $(\lambda_1, \bar\lambda_1^{-1})$, where λ1 ≠ 0. We assume without loss of generality that |λ1| ≤ 1.

We first shift λ1 to 0 and $\bar\lambda_1^{-1}$ to ∞. To this purpose, we apply Theorem 4, where we set $\lambda_2 = \bar\lambda_1^{-1}$ and v2 = u1. Moreover, we may assume ‖u1‖2 = 1 and choose x = y = u1, therefore $Q = P = u_1 u_1^*$. The shifted matrix polynomial
$$\widetilde A(z) = \Big(I + \frac{\lambda_1}{z - \lambda_1}\, Q\Big)\, A(z)\, \Big(I + \frac{z}{\bar\lambda_1^{-1} - z}\, Q\Big)$$
is ∗-palindromic, since by direct computation one may verify that $z^2 \widetilde A^*(z^{-1}) = \widetilde A(z)$. Its matrix coefficients are
$$\widetilde A_0 = A_0 - QA_0, \qquad \widetilde A_1 = A_1 + \lambda_1 Q A_0^* + \bar\lambda_1 (A_0 - QA_0) Q, \qquad \widetilde A_2 = \widetilde A_0^*,$$
and its eigenvalues are the eigenvalues of A(z), except for the pair $(\lambda_1, \bar\lambda_1^{-1})$, which is replaced by the pair (0, ∞).

In order to deflate the eigenvalues 0 and ∞, we proceed as in Section 4, applying Theorem 3 to Ã(z) with K = H. Then the matrix polynomial T(z) = H*Ã(z)H is ∗-palindromic, therefore V(z) is ∗-palindromic as well. Moreover, one has
$$t(z) := \begin{bmatrix} t_0(z) \\ t_1(z) \\ \vdots \\ t_{n-1}(z) \end{bmatrix} = H^* \widetilde A_0 u_1 + z\, H^* \widetilde A_1 u_1, \qquad \begin{bmatrix} s_0(z) & s_1(z) & \cdots & s_{n-1}(z) \end{bmatrix} = u_1^* \widetilde A_1 H + z\, u_1^* \widetilde A_0^* H = z\, t^*(z^{-1}),$$
therefore the deflated matrix polynomial B(z) defined in (10) is still ∗-palindromic.
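In this structured case the shift stage simplifies considerably; the following sketch is an added illustration with the normalizations above (‖u1‖2 = 1, x = y = u1, so Q = P = u1u1*), not code from the paper.

```python
import numpy as np

def shift_pair_palindromic(A0, A1, u1, lam1):
    """Coefficients A~0, A~1 of the *-palindromic polynomial
    A~(z) = A~0 + z A~1 + z^2 A~0^*, where the pair (lam1, 1/conj(lam1))
    of A(z) = A0 + z A1 + z^2 A0^* is moved to (0, inf).
    Assumes ||u1||_2 = 1 and u1^* A(lam1) = 0."""
    Q = np.outer(u1, u1.conj())                       # Q = P = u1 u1^*
    A0t = A0 - Q @ A0
    A1t = A1 + lam1 * (Q @ A0.conj().T) \
             + np.conj(lam1) * ((A0 - Q @ A0) @ Q)
    return A0t, A1t                                   # A~2 = A~0^*
```

Choosing K = H with He1 = u1 then keeps T(z) = H*Ã(z)H, and hence B(z), ∗-palindromic.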
Concerning the condition γ ≠ 0, observe that $\gamma = u_1^* \widetilde A_1 u_1 = -\lambda_1^{-1} u_1^* A_0 u_1$. In particular, if A1 is positive definite, then γ is different from zero. Other conditions for γ ≠ 0 can be stated by using Theorem 5. Consider the matrix equation
$$X = A_1 - A_0 X^{-1} A_0^*. \qquad (17)$$
According to the results of [7], if A1 is positive definite, then Eq. (17) has a Hermitian positive definite solution if and only if the function $\varphi(z) = z^{-1}A_0 + A_1 + zA_0^*$ is regular and positive semidefinite for any z on the unit circle. Under this assumption there exists a maximal solution W, such that W ≥ X for any other possible solution. By multiplying (17) on the left by $R = -A_0 W^{-1}$, one finds that the matrix R solves the matrix equation $A_0 + XA_1 + X^2 A_0^* = 0$. According to [11], the matrix R has spectral radius at most one. Moreover, by direct inspection one finds that the following factorization holds:
$$A(z) = (zI - R)\, W\, (I - zR^*).$$
Therefore the eigenvalues of R are the eigenvalues of A(z) in the closed unit disk, and the corresponding left eigenvectors are the same for R and for A(z). In particular, it holds that $u_1^* R = \lambda_1 u_1^*$. By Theorem 5 one has $\gamma = u_1^* W u_1$, which is different from zero since W is Hermitian positive definite.
Therefore, if A1 is positive definite and $\varphi(z) = z^{-1}A_0 + A_1 + zA_0^*$ is regular and positive semidefinite for any z on the unit circle, then the shift-and-deflate technique can be applied.

7. Numerical experiments

We have performed some numerical experiments in MATLAB, in order to investigate the numerical behavior of the shift-and-deflate technique. In all the experiments we have chosen x = u1/‖u1‖2, y = v2/‖v2‖2, and H, K Householder matrices. To evaluate the condition numbers of eigenvalues/eigenvectors, we have used the definitions of [6]. The condition number of an eigenvalue λ of a matrix polynomial A(z) is the condition number C2(A, α, β) of the eigenvalue (α, β) of the homogeneous matrix polynomial A(α, β) defined in (1), where α/β = λ. Similarly, the condition number of an eigenvector v such that A(λ)v = 0 is the condition number C1(A, v), where A(α, β)v = 0 and α/β = λ. According to [6, Theorems 4.1 and 4.2], if (α, β) is a simple eigenvalue of the homogeneous polynomial A(α, β), and if u*A(α, β) = 0 and A(α, β)v = 0, then the condition number C1(A, v) of the eigenvector v and the condition number C2(A, α, β) of the eigenvalue (α, β) are given by
$$C_1(A, v) = \Big(\sum_{i=0}^{k} |\alpha|^{2i} |\beta|^{2(k-i)}\Big)^{1/2}\, \Big\| \big(\Pi_{u^\perp}\, A(\alpha, \beta)\big|_{v^\perp}\big)^{-1} \Big\|,$$
$$C_2(A, \alpha, \beta) = \Big(\sum_{i=0}^{k} |\alpha|^{2i} |\beta|^{2(k-i)}\Big)^{1/2}\, \frac{\|u\|\, \|v\|}{|u^* w|}, \qquad (18)$$
respectively, where $w = (\bar\beta\, D_\alpha A(\alpha, \beta) - \bar\alpha\, D_\beta A(\alpha, \beta))\, v$, the symbols $D_\alpha A(\alpha, \beta)$ and $D_\beta A(\alpha, \beta)$ denote the partial derivatives of A(α, β) with respect to α and β, and, for a nonzero vector x, $\Pi_{x^\perp}$ denotes the projection onto the subspace orthogonal to x.
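For reference, the eigenvalue condition number C2 of (18) can be evaluated literally as in the following sketch (an added illustration, not code from the paper; the eigenvector condition number C1 additionally requires the norm of a projected inverse and is omitted).

```python
import numpy as np

def eig_condition(coeffs, alpha, beta, u, v):
    """C2(A, alpha, beta) of (18) for A(z) = sum_i z^i A_i, with u, v
    left/right eigenvectors of the simple eigenvalue (alpha, beta)."""
    k = len(coeffs) - 1
    Da = sum(i * alpha**(i - 1) * beta**(k - i) * coeffs[i]
             for i in range(1, k + 1))      # D_alpha A(alpha, beta)
    Db = sum((k - i) * alpha**i * beta**(k - i - 1) * coeffs[i]
             for i in range(k))             # D_beta A(alpha, beta)
    w = (np.conj(beta) * Da - np.conj(alpha) * Db) @ v
    scale = sum(abs(alpha)**(2 * i) * abs(beta)**(2 * (k - i))
                for i in range(k + 1)) ** 0.5
    return scale * np.linalg.norm(u) * np.linalg.norm(v) / abs(u.conj() @ w)
```

For a finite eigenvalue λ one calls eig_condition([A0, A1, A2], lam, 1.0, u, v); for λ = ∞ one takes α = 1, β = 0.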
λi
1 0
2 1
⎤ 0 ⎢ ⎥ ⎣1⎦ 0
⎡
⎤ 0 ⎢ ⎥ ⎣1⎦ 0
⎡
⎡ ui
⎡ vi
1
⎤
⎢ 4 ⎥ ⎣ 0 ⎦ 1 ⎤ 1 ⎢ ⎥ ⎣0⎦ 0
3 1+
4 2
⎡
⎤ 0 ⎢ ⎥ ⎣1⎦ 0
⎡
⎤
⎡
⎤ 0 ⎢ ⎥ ⎣0⎦ 1
⎡
⎡
5 ⎢ ⎥ 1 ⎣ 5(1− ) ⎦ 1 ⎡ ⎤ 1 ⎢ ⎥ ⎣0⎦ 0
⎡
⎤ 0 ⎢ ⎥ ⎣0⎦ 1
⎡
⎤ 1 ⎢ −1 ⎥ ⎣ +1 ⎦ 0
5 3 1
6
∞ ⎤ 0 ⎢ ⎥ ⎣0⎦ 1 ⎤ 1 ⎢ ⎥ ⎣0⎦ 1
Table 2
Conditioning of eigenvalues/vectors of the original and deflated matrix polynomial, where λ2 and λ5 are deflated, with ε = 10⁻⁸.

i            1     2          3          4     5     6
λᵢ           0     1          1 + ε      2     3     ∞
C1(A, vᵢ)    1.6   1.7 · 10⁸  5.0 · 10⁸  2.6   6.9   1.4
C2(A, λᵢ)    1.0   3.6        1.2        4.8   0.9   1.4
C1(B, ṽᵢ)    0.1   –          0.3        0.5   –     1.0
C2(B, λᵢ)    1.0   –          1.2        0.9   –     2.4

Table 3
Conditioning of eigenvalues/vectors of the original and deflated matrix polynomials, where λ2 and λ5, or λ2 and λ4, are deflated, with ε = 1 − 10⁻⁸.

i            1           2     3          4          5          6
λᵢ           0           1     1 + ε      2          3          ∞
C1(A, vᵢ)    1.6         1.7   1.2 · 10¹  1.2 · 10¹  1.1 · 10¹  1.4
C2(A, λᵢ)    0.5         3.6   9.2 · 10⁷  9.2 · 10⁷  0.9        1.4

Shift-and-deflate applied to λ2 and λ5:
C1(B, ṽᵢ)    1.2         –     0.5        0.5        –          1.0
C2(B, λᵢ)    0.5         –     9.2 · 10⁷  9.2 · 10⁷  –          0.2

Shift-and-deflate applied to λ2 and λ4:
C1(B, ṽᵢ)    8.1 · 10⁻²  –     1.1        –          3.2        1.0
C2(B, λᵢ)    0.5         –     0.4        –          0.2        0.2
the numerical conditioning, we choose to deflate one eigenvalue between λ2 and λ3. Therefore, we set ε = 10⁻⁸ and apply the shift-and-deflate technique to the pair of eigenvalues λ2 and λ5. The resulting (n − 1) × (n − 1) matrix polynomial B(z) has eigenvalues λ1, λ3, λ4 and λ6. Table 2 reports the condition numbers C1(A, vᵢ) and C2(A, λᵢ) of the eigenvectors/values of the original polynomial A(z), and the condition numbers C1(B, ṽᵢ) and C2(B, λᵢ) of the eigenvectors/values of the new (n − 1) × (n − 1) matrix polynomial B(z). Here C1(·) and C2(·) are computed according to formulas (18).

As one may expect, while the condition number C1(A, v3) of the eigenvector v3 of the original polynomial is high, after deflation the condition number C1(B, ṽ3) is small. Concerning the condition numbers of the remaining eigenvalues/vectors, we observe that they slightly improve after deflation.

By setting ε = 1 − 10⁻⁸, the eigenvalues λ3 and λ4 are poorly conditioned; in fact, their condition number is around 10⁷. In this case, a shift-and-deflate technique applied to the eigenvalues λ2 and λ5, as before, does not lead to any improvement in the numerical conditioning of λ3 and λ4. Instead, we expect an improvement by applying the shift-and-deflate to the eigenvalues λ2 and λ4. Table 3 reports the condition numbers for the original polynomial, for the polynomial obtained by deflating λ2 and λ5, and for the polynomial obtained by deflating λ2 and λ4. In the latter case the eigenvalue λ3 is well conditioned, as are all the other eigenvalues of B(z).
= (zI − R)W (I − zR∗ ),
where R = S −1 DS, D = Diag(1 − , 2, 3, . . . , n) and is a real parameter, S is a nonsingular matrix, and W is a nonsingular Hermitian matrix. In this way the eigenvalues of A(z ) are 1 − , 2, 3, . . . , n and their reciprocals, while the corresponding left eigenvectors are the rows of S. In our experiments we have chosen W = I and S the tridiagonal matrix having n on the main diagonal, 1 on the upper diagonal, and 2 on the lower diagonal. When is small the eigenvalue 1 − has a large condition number. The shift-and-deflate technique is applied to the eigenvalues λ1 = 1 − and λ2 = 1/λ1 , with y = x and K = H, in order to preserve the palindromic structure. Fig. 1(a) and (b) represent the relative approximation error of the eigenvalues outside the unit disk of A(z ), ordered by magnitude, and of the deflated polynomial B(z ), where the eigenvalues are computed by means of the MATLAB function polyeig, for n = 20 and = 10−1 , = 10−8 .
Fig. 1. Relative approximation error for Example 3, for n = 20.
We observe that the accuracy of the computation of the eigenvalues of A(z ) and B(z ) is on average the same. Therefore the shift-and-deflate technique does not seem to introduce ill-conditioning to eigenvalues and eigenvectors of B(z ). Example 4. This example matrices ⎡ 3 1 ⎢ ⎢ . ⎢ ⎢ 2 3 .. ⎢ A0 = − ⎢ ⎢ ... ... ⎢ ⎣ 0 2
is the T-palindromic matrix polynomial whose coefficients are the n ⎤ ⎥ ⎥ ⎥ ⎥ ⎥, ⎥ ⎥ 1⎥ ⎦ 3
⎤
⎡
0
A1
=
⎢ ⎢ ⎢ ⎢ 2n ⎢ ⎢ ⎢ ⎣
×n
1
0 2
.. 0
. n
⎥ ⎥ ⎥ ⎥ ⎥, ⎥ ⎥ ⎦
A2
= AT0 .
(20)
Fig. 2. Eigenvalues of the matrix polynomial with coefficients defined in (20).
The eigenvalues are all real and positive. Fig. 2 represents the eigenvalues inside the unit circle for n = 30 and n = 60, computed with the polyeig function, ordered in increasing modulus.

We have approximated the eigenvalues inside the unit circle by repeatedly using the shift-and-deflate technique. More specifically, starting from $B_0^{(1)} = A_0$ and $B_1^{(1)} = A_1$, for k = 1, ..., n − 1, we have:

(1) approximated the smallest modulus eigenvalue λk and the corresponding left eigenvector $u_k^*$ of the matrix polynomial $B^{(k)}(z) = B_0^{(k)} + zB_1^{(k)} + z^2 B_0^{(k)T}$;
(2) applied the shift-and-deflate technique to $B^{(k)}(z)$, by using λk and uk, thus constructing the (n − k) × (n − k) matrix polynomial $B^{(k+1)}(z) = B_0^{(k+1)} + zB_1^{(k+1)} + z^2 B_0^{(k+1)T}$, where the eigenvalues λk and 1/λk are deflated.

In order to compute the smallest modulus eigenvalue of $B^{(k)}(z)$ we have used Newton's method applied to $p^{(k)}(z) = \det B^{(k)}(z)$, starting from the initial approximation z0 = 0, for any value of k; a sketch of this iteration is given below.
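Newton's method on $p^{(k)}(z) = \det B^{(k)}(z)$ can exploit the classical identity p′(z)/p(z) = trace(B(z)⁻¹B′(z)), avoiding explicit determinants. The following is an added sketch, under the assumption that B⁽ᵏ⁾(0) is nonsingular so that the iteration can start at z0 = 0.

```python
import numpy as np

def smallest_eig_newton(B0, B1, B2, z0=0.0, tol=1e-13, maxit=100):
    """Newton iteration for the root of p(z) = det(B0 + z B1 + z^2 B2)
    closest to z0, using p'(z)/p(z) = trace(B(z)^{-1} B'(z))."""
    z = complex(z0)
    for _ in range(maxit):
        Bz = B0 + z * B1 + (z * z) * B2      # B(z)
        dBz = B1 + (2.0 * z) * B2            # B'(z)
        step = 1.0 / np.trace(np.linalg.solve(Bz, dBz))
        z -= step
        if abs(step) <= tol * max(1.0, abs(z)):
            break
    return z
```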
Fig. 3. Relative errors of the computed eigenvalues and their condition number for Example 4, for n = 30.
We have compared the eigenvalues inside the unit circle λk, k = 1, ..., n, provided by polyeig with the eigenvalues λ̃k, k = 1, ..., n, computed by the repeated application of the shift-and-deflate technique. Figs. 3 and 4 report on the left the relative error |λk − λ̃k|/|λk|, k = 1, ..., n, and on the right the corresponding condition numbers of the eigenvalues provided by polyeig, for n = 30 and n = 60, respectively. It is worth pointing out that the relative error is larger for the eigenvalues having higher condition number, while it is close to the machine precision for the eigenvalues with small condition number. The eigenvalues in the rightmost part of Figs. 3(a) and 4(a) are the last computed, since they have the largest moduli. Despite being the last computed, these eigenvalues have small relative error. Therefore the recursive application of the shift-and-deflate technique does not deteriorate the accuracy of the computation.

8. Conclusions and open issues

We have introduced a deflation technique for eigenvalues of quadratic matrix polynomials, which is based on a shift stage and on a deflate stage. Some issues still need to be investigated:
Fig. 4. Relative errors of the computed eigenvalues and their condition number for Example 4, for n = 60.
• The condition $u_1^* \widetilde A_1 v_2 \ne 0$ is needed in the deflation stage; some hypotheses which guarantee that this condition is satisfied are reported in the paper; however, we think that some issues concerning this condition remain to be understood.
• In the numerical experiments reported in this paper, where H and K are Householder matrices, the numerical conditioning of the eigenvalues/vectors of B(z) is improved with respect to the numerical conditioning of the eigenvalues/vectors of the original matrix polynomial; moreover, the repeated application of the shift-and-deflate technique does not seem to affect the accuracy of the computation. However, a precise error analysis of the shift-and-deflate technique should be performed.
• In the case of quadratic matrix polynomials with Hermitian coefficients, it would be interesting to devise a shift-and-deflate technique which preserves the Hermitianity of the matrix coefficients.
• In the case where the matrix coefficients are real, and we wish to deflate a complex conjugate pair of eigenvalues, the matrix polynomial obtained after the shift-and-deflate technique is not real in general; in fact, the realness of the coefficients can be preserved by shifting both eigenvalues to zero, not by shifting one eigenvalue to zero and the other one to infinity. It would be interesting to adjust the shift-and-deflate technique to preserve the realness of the matrix coefficients when the eigenvalues form complex conjugate pairs.
• The extension of this technique to matrix polynomials of degree greater than 2 is still an open issue.
References

[1] D.A. Bini, L. Gemignani, B. Meini, Solving certain matrix equations by means of Toeplitz computations: algorithms and applications, in: Fast Algorithms for Structured Matrices: Theory and Applications (South Hadley, MA, 2001), Contemp. Math., vol. 323, Amer. Math. Soc., Providence, RI, 2003, pp. 151–167.
[2] D.A. Bini, G. Latouche, B. Meini, Numerical Methods for Structured Markov Chains, Numerical Mathematics and Scientific Computation, Oxford University Press, New York, 2005.
[3] A. Brauer, Limits for the characteristic roots of a matrix. IV. Applications to stochastic matrices, Duke Math. J. 19 (1952) 75–91.
[4] J.B. Carvalho, B.N. Datta, W.W. Lin, C.S. Wang, Symmetry preserving eigenvalue embedding in finite-element model updating of vibrating structures, J. Sound Vibration 290 (3–5) (2006) 839–864.
[5] M. Chu, T.M. Hwang, W.W. Lin, A novel deflation technique for solving quadratic eigenvalue problems, Technical Report NCTS Preprints in Mathematics 2005-3-002, National Center for Theoretical Sciences, National Tsing Hua University, Taiwan, 2005.
[6] J.P. Dedieu, F. Tisseur, Perturbation theory for homogeneous polynomial eigenvalue problems, Linear Algebra Appl. 358 (2003) 71–94.
[7] J.C. Engwerda, A.C.M. Ran, A.L. Rijkeboer, Necessary and sufficient conditions for the existence of a positive definite solution of the matrix equation X + A*X⁻¹A = Q, Linear Algebra Appl. 186 (1993) 255–275.
[8] W.R. Ferng, W.W. Lin, D.J. Pierce, C.S. Wang, Nonequivalence transformation of λ-matrix eigenproblems and model embedding approach to model tuning, Numer. Linear Algebra Appl. 8 (1) (2001) 53–70.
[9] I. Gohberg, P. Lancaster, L. Rodman, Matrix Polynomials, Academic Press, New York, 1982.
[10] C.H. Guo, B. Iannazzo, B. Meini, On the doubling algorithm for a (shifted) nonsymmetric algebraic Riccati equation, SIAM J. Matrix Anal. Appl. 29 (4) (2007) 1083–1100.
[11] C.H. Guo, P. Lancaster, Iterative solution of two matrix equations, Math. Comp. 68 (228) (1999) 1589–1603.
[12] J.S. Guo, W.W. Lin, C.S. Wang, Nonequivalence deflation for the solution of matrix latent value problems, Linear Algebra Appl. 231 (1995) 15–45.
[13] C. He, B. Meini, N.H. Rhee, A shifted cyclic reduction algorithm for quasi-birth–death problems, SIAM J. Matrix Anal. Appl. 23 (3) (2001/02) 673–691.
[14] S. Serra-Capizzano, Jordan canonical form of the Google matrix: a potential contribution to the PageRank computation, SIAM J. Matrix Anal. Appl. 27 (2) (2005) 305–312.
[15] F. Tisseur, S.D. Garvey, C. Munro, Deflating quadratic matrix polynomials with structure preserving transformations, Linear Algebra Appl. 435 (3) (2011) 464–479.