Applied Numerical Mathematics 64 (2013) 50–58
On the ADI method for the Sylvester equation and the optimal-H2 points

Garret M. Flagg*, Serkan Gugercin
Department of Mathematics, Virginia Tech., Blacksburg, VA 24061-0123, USA
Article history: Received 23 January 2012; received in revised form 5 September 2012; accepted 2 October 2012; available online 8 October 2012.

Keywords: ADI method; Rational Krylov; Sylvester equations; H2-optimal interpolation

Abstract
The ADI iteration is closely related to rational Krylov projection methods for constructing low-rank approximations to the solution of Sylvester equations. In this paper we show that the ADI and rational Krylov approximations are in fact equivalent when a special choice of shifts is employed in both methods. We call these shifts pseudo H2-optimal shifts. These shifts are also optimal in the sense that, for the Lyapunov equation, they yield a residual which is orthogonal to the rational Krylov projection subspace. Via several examples, we show that the pseudo H2-optimal shifts consistently yield nearly optimal low-rank approximations to the solutions of the Lyapunov equations.

© 2012 IMACS. Published by Elsevier B.V. All rights reserved.
1. Introduction

Let A ∈ R^{n×n}, B ∈ R^{m×m} and Y ∈ R^{n×m} be given matrices. Then the Sylvester equation for the unknown matrix X ∈ R^{n×m} is given by

A X + X B + Y = 0.  (1)
Eq. (1) has a unique solution if and only if λ_i(A) + λ_j(B) ≠ 0 for i = 1, …, n and j = 1, …, m. A special case of the Sylvester equation is the Lyapunov equation, where B = A^∗ ¹ and Y = Y^∗. Both the Sylvester and Lyapunov equations are an important tool in the analysis of asymptotically stable linear dynamical systems of the form
ẋ(t) = A x(t) + b u(t),    y(t) = c^∗ x(t),  (2)

where A ∈ R^{n×n} and b, c ∈ R^n. In (2), x(t) ∈ R^n, u(t) ∈ R, and y(t) ∈ R are, respectively, the state, the input, and the output of the underlying system. While the cross Gramian X of (2) solves the Sylvester equation
A X + X A + b c^∗ = 0,  (3)
the controllability Gramian P and the observability Gramian Q solve the Lyapunov equations

A P + P A^∗ + b b^∗ = 0   and   A^∗ Q + Q A + c c^∗ = 0,  (4)
respectively. These three Gramians are of fundamental importance, especially in the context of model reduction; see [1]. In what follows, we will mainly focus on the Sylvester equation (1) where Y is rank 1; hence our discussion already contains the Lyapunov equations as a special case.
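As a concrete illustration (an aside, not part of the paper), the three Gramians can be computed for a small dense example with SciPy's Bartels–Stewart-based solvers; the data below are random and purely illustrative.

```python
# Illustrative only: computing the cross, controllability, and observability
# Gramians of a small dense SISO system with SciPy's direct solvers.
import numpy as np
from scipy.linalg import solve_sylvester, solve_continuous_lyapunov

rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n))
A = M - (np.max(np.linalg.eigvals(M).real) + 1.0) * np.eye(n)  # shift to make A stable
b = rng.standard_normal((n, 1))
c = rng.standard_normal((n, 1))

# Cross Gramian, Eq. (3): A X + X A + b c^T = 0  (solve_sylvester solves A X + X B = Q)
X = solve_sylvester(A, A, -b @ c.T)

# Controllability and observability Gramians, Eq. (4)
P = solve_continuous_lyapunov(A, -b @ b.T)     # A P + P A^T = -b b^T
Q = solve_continuous_lyapunov(A.T, -c @ c.T)   # A^T Q + Q A = -c c^T

print(np.linalg.norm(A @ X + X @ A + b @ c.T))  # residual check
```

The residuals of all three equations are near machine precision for a direct solve at this scale; the iterative methods discussed below become relevant only when n is large and sparse.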
* Corresponding author.
E-mail addresses: [email protected] (G.M. Flagg), [email protected] (S. Gugercin).
1 To unify the notation, we will use the complex-conjugate transpose ("∗") even for real matrices, since for a real matrix A, A^T = A^∗.
http://dx.doi.org/10.1016/j.apnum.2012.10.001
The standard direct method for solving (1) is due to Bartels and Stewart [4]. However, this method requires dense matrix operations such as the Schur decomposition and is thus not applicable in large-scale settings. For large-scale settings, iterative methods have been developed that take advantage of the sparsity and the low-rank structure of Y. The two most common are the Alternating Direction Implicit (ADI) method [28,9,25,8,40,18,31,19,29,41,36,42] and (rational) Krylov projection methods [20,23,13,22,30,35,14,3]. The ADI method was first introduced by Peaceman and Rachford [27] for solving parabolic and elliptic PDEs, and was later adapted to the Sylvester equation by Wachspress in [41]. It is a fixed-point iteration scheme for approximating X. Given two sequences of shifts {α_1, α_2, …, α_r, …}, {β_1, β_2, …, β_r, …} ⊂ C and an initial guess X_0, the ADI iteration for (1) proceeds as follows:
X_i = (A − α_i I)(A + β_i I)^{−1} X_{i−1} (B − β_i I)(B + α_i I)^{−1}
      − (α_i + β_i)(A + β_i I)^{−1} Y (B + α_i I)^{−1}.  (5)
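For concreteness, iteration (5) can be sketched in a few lines of dense linear algebra. The example below is illustrative only: the matrices are random, and the shifts are chosen ad hoc as α_i = β_i = −σ_i with σ_i > 0, the sign convention used later in Lemma 1.

```python
# Sketch of the ADI iteration (5) with X_0 = 0, dense linear algebra only.
# Illustrative data; the shifts are ad hoc, not optimized.
import numpy as np

def adi_sylvester(A, B, Y, alphas, betas):
    """Apply len(alphas) steps of iteration (5), starting from X_0 = 0."""
    n, m = A.shape[0], B.shape[0]
    In, Im = np.eye(n), np.eye(m)
    X = np.zeros((n, m))
    for al, be in zip(alphas, betas):
        # X <- (A - al I)(A + be I)^{-1} X (B - be I)(B + al I)^{-1}
        #      - (al + be)(A + be I)^{-1} Y (B + al I)^{-1}
        X = (A - al * In) @ np.linalg.solve(A + be * In, X) @ (B - be * Im) \
            @ np.linalg.inv(B + al * Im) \
            - (al + be) * np.linalg.solve(A + be * In, Y) @ np.linalg.inv(B + al * Im)
    return X

rng = np.random.default_rng(1)
n, m = 8, 7
MA = rng.standard_normal((n, n))
A = MA - (np.max(np.linalg.eigvals(MA).real) + 1.0) * np.eye(n)   # stable
MB = rng.standard_normal((m, m))
B = MB - (np.max(np.linalg.eigvals(MB).real) + 1.0) * np.eye(m)   # stable
b = rng.standard_normal((n, 1))
c = rng.standard_normal((m, 1))
Y = b @ c.T

sig = [1.0, 2.0, 4.0, 8.0]                                        # sigma_i > 0
Xr = adi_sylvester(A, B, Y, [-s for s in sig], [-s for s in sig])
# Since Y has rank one, Xr has rank at most len(sig) (cf. Lemma 1 below).
```

In a genuinely large-scale setting one would never form the dense inverses; each step would instead use sparse solves, and X_i would be stored in the low-rank factored form established in Lemma 1.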
The performance of the ADI iteration depends heavily on the choice of shifts used in the iteration. Several schemes have been developed for making asymptotically optimal shift selections when some information is known about the boundaries of the numerical ranges of A and B. See [39,41,33,34,29] and the references therein for further details on the shift selection problem in the ADI iteration.

A method closely related to the ADI iteration is the rational Krylov projection method (RKPM). In the RKPM, the Sylvester equation A X + X B + Y = 0 is projected onto the special rational Krylov subspaces K_r^rat(A, b, σ) = span{(σ_1 I − A)^{−1} b, …, (σ_r I − A)^{−1} b} and K_r^rat(B^∗, c, μ̄), where σ = {σ_1, …, σ_r} and μ̄ = {μ̄_1, …, μ̄_r} are the sets of shifts used to construct the respective rational Krylov spaces, and ν̄ denotes the conjugate of ν. See [6] for further details regarding K_r^rat(A, b, σ) and the construction of an orthonormal basis via the rational Arnoldi iteration. Let Q_r and U_r denote orthonormal bases for K_r^rat(A, b, σ) and K_r^rat(B^∗, c, μ̄), respectively. Then the RKPM approximation is constructed by first solving
Q_r^∗ A Q_r X̃_r + X̃_r U_r^∗ B U_r + Q_r^∗ b c^∗ U_r = 0  (7)
and then approximating X by Q_r X̃_r U_r^∗. The solution of the projected Sylvester equation (7) is very cheap. Like the ADI method, the RKPM also relies heavily on a good choice of shifts to produce accurate results. In the next section we derive results showing that, for a certain choice of shifts, the RKPM and ADI methods are indeed equivalent. Since in almost all applications the quantities A, B, b, and c are real, we will assume that the sets of shifts σ and μ are closed under conjugation, so that the approximants are real as well. This guarantees that the orthonormal bases Q_r for K_r^rat(A, b, σ) and U_r for K_r^rat(B^∗, c, μ̄) can be computed to be real as well.

2. Equivalence of the ADI and rational Krylov projection methods for pseudo H2-optimal points

In this section, we present our main results illustrating the connection between the ADI and RKPM. Since the discussion requires the concept of H2-optimal points for model reduction, we first briefly review the H2 approximation problem.

2.1. Optimal H2 model reduction

For a full-order model as given in (2), the model reduction problem seeks to construct a dynamical system
ẋ_r(t) = A_r x_r(t) + b_r u(t),    y_r(t) = c_r^∗ x_r(t)  (8)

of much smaller dimension r ≪ n, with A_r ∈ R^{r×r} and b_r, c_r ∈ R^r, such that y_r(t) approximates y(t) well for a wide range of inputs u(t). The reduced model (8) is usually obtained via state-space projection: two matrices V_r, W_r ∈ R^{n×r} are constructed with W_r^∗ V_r = I_r to produce

A_r = W_r^∗ A V_r,    b_r = W_r^∗ b,    and    c_r = V_r^∗ c.  (9)
One can measure the quality of the approximation using the concept of the transfer function. Taking the Laplace transforms of (2) and (8) yields the transfer functions H(s) = c^∗(sI − A)^{−1} b and H_r(s) = c_r^∗(sI_r − A_r)^{−1} b_r, respectively. Hence, one can view model reduction in terms of these transfer functions as approximating a degree-n rational function H(s) with a degree-r one, H_r(s). For more details on model reduction of linear dynamical systems, see [1].

In this paper, we focus on the H2-norm to measure the accuracy of the reduced model. The H2-optimal model reduction problem seeks to construct a reduced system as in (8) so that H_r(s) minimizes the H2 error over all stable linear dynamical systems of the form (8), i.e.,

‖H − H_r‖_{H2} = min_{deg(H̃_r) = r} ‖H − H̃_r‖_{H2},  (10)
where
‖H − H_r‖_{H2} = ( (1/2π) ∫_{−∞}^{∞} |H(ıω) − H_r(ıω)|² dω )^{1/2}.
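As an aside (a standard state-space fact, not derived in the paper), for a stable SISO system this norm can be evaluated without quadrature through the Gramians of Section 1, since ‖H‖²_{H2} = c^∗ P c with P the controllability Gramian from (4). A sketch with random illustrative data:

```python
# Illustrative aside: evaluating an H2 norm via the controllability
# Gramian, ||H||_{H2}^2 = c^T P c with A P + P A^T + b b^T = 0 (real SISO).
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(3)
n = 8
M = rng.standard_normal((n, n))
A = M - (np.max(np.linalg.eigvals(M).real) + 1.0) * np.eye(n)   # stable
b = rng.standard_normal((n, 1))
c = rng.standard_normal((n, 1))

P = solve_continuous_lyapunov(A, -b @ b.T)   # controllability Gramian, Eq. (4)
h2_norm = float(np.sqrt(c.T @ P @ c))        # ||H||_{H2}
```

This identity is why the H2 norm of the error system H − H_r can be computed exactly once a (small) Lyapunov equation for the error realization has been solved, with no frequency-domain integration.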
Several methods have been introduced to solve (10); see, for example, [32,21,43,26,16,38,11,5] and the references therein. Since the optimization problem (10) is nonconvex, the common approach involves finding reduced-order models satisfying the first-order necessary conditions of H2-optimality. The next theorem states the interpolation-based necessary conditions for H2-optimality introduced by Meier and Luenberger [26].

Theorem 1. (See [26,16].) Given a full-order system H(s) of order n, if H_r(s) = Σ_{i=1}^{r} φ_i/(s − λ_i) is an H2-optimal approximation to H(s), then

H(−λ_i) = H_r(−λ_i)   for i = 1, …, r,  (11)

and

H′(−λ_i) = H_r′(−λ_i)   for i = 1, …, r.  (12)
A reduced-order model satisfying the H2-optimality conditions can be obtained using the Iterative Rational Krylov Algorithm (IRKA) of Gugercin et al. [16]. However, in this paper we will focus on satisfying only (11) (without the derivative condition). We will call these interpolation points pseudo H2-optimal points to emphasize that they satisfy only a subset of the optimality conditions. In terms of the projection framework (9) for model reduction, this corresponds to finding interpolation points σ = {σ_1, …, σ_r} and choosing, in (9), V_r = W_r = Q_r, where Q_r is an orthonormal basis for the rational Krylov subspace K_r^rat(A, b, σ), so that {λ_1, …, λ_r}, the eigenvalues of A_r = Q_r^T A Q_r, become the mirror images of the interpolation points σ = {σ_1, …, σ_r}, i.e.,

λ(A_r) = λ(Q_r^T A Q_r) = −σ.  (13)
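Condition (13) suggests a simple fixed-point iteration: build Q_r from the current shifts, take the Ritz values of Q_r^T A Q_r, and use their mirror images as the next shifts. The sketch below is illustrative only; a symmetric random example keeps all shifts real, and this naive loop is an analogue of, not a substitute for, the algorithms of [16,17].

```python
# Sketch of a fixed-point iteration for condition (13):
# repeatedly set sigma <- -eig(Q_r^T A Q_r). Illustrative data only;
# the symmetric test matrix keeps all quantities real.
import numpy as np

def pseudo_h2_shifts(A, b, sigma0, maxit=500, tol=1e-10):
    n = A.shape[0]
    sigma = np.sort(np.asarray(sigma0, dtype=float))
    for _ in range(maxit):
        # Basis of K_r^rat(A, b, sigma) via direct solves plus QR
        V = np.column_stack([np.linalg.solve(s * np.eye(n) - A, b).ravel()
                             for s in sigma])
        Qr, _ = np.linalg.qr(V)
        new = np.sort(-np.linalg.eigvalsh(Qr.T @ A @ Qr))  # mirrored Ritz values
        if np.max(np.abs(new - sigma)) < tol * np.max(np.abs(new)):
            return new
        sigma = new
    return sigma

rng = np.random.default_rng(4)
n = 30
M = rng.standard_normal((n, n))
A = -(M @ M.T) / n - np.eye(n)        # symmetric negative definite
b = rng.standard_normal((n, 1))
sigma = pseudo_h2_shifts(A, b, [0.5, 1.0, 2.0])
```

At a fixed point, the basis built from sigma reproduces sigma as the mirror images of its own Ritz values, which is exactly (13).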
The pseudo H2-optimal interpolation points can be computed iteratively in a manner similar to IRKA [16], as done in [17] for port-Hamiltonian systems.

2.2. The ADI iteration and rational Krylov projection method

The main theorem requires the following lemma, which connects the ADI approximation for the Sylvester equation with rational Krylov subspaces. This extends an earlier result by Li and White [24], which establishes a similar connection for the case of the Lyapunov equation.

Lemma 1. Let Y = bc^∗, where b ∈ R^n and c ∈ R^m. Let {σ_1, …, σ_r} and {μ_1, …, μ_r} be two collections of shifts that satisfy
Re(μ_i), Re(σ_i) > 0 for i = 1, …, r. Suppose X_r is the approximate solution to the Sylvester equation (1) obtained by applying the pair of shifts α_i = −σ_i and β_i = −μ_i in the ADI iteration (5) for i = 1, …, r, with X_0 = 0. Then there exist L_r ∈ C^{n×r} and R_r ∈ C^{m×r} such that X_r = L_r R_r^∗ with colspan(L_r) ⊂ K_r^rat(A, b, μ) and colspan(R_r) ⊂ K_r^rat(B^∗, c, σ̄).

Proof. The proof is by induction on i, the iteration step. First note that for i = 1, X_1 = (μ_1 + σ_1)(A − μ_1 I)^{−1} b c^∗ (B − σ_1 I)^{−1}, so let L_1 = [(μ_1 + σ_1)(A − μ_1 I)^{−1} b] and R_1 = [(B^∗ − σ̄_1 I)^{−1} c]. Then L_1 and R_1 clearly satisfy the hypothesis and X_1 = L_1 R_1^∗. Now suppose that the statement holds for X_i. Then, for j = 1, …, i, the jth column of L_i is T_i^{(j)}(A) b, where T_i^{(j)}(λ) is a proper rational function that lies in span{1/(λ − μ_1), …, 1/(λ − μ_i)}. Similarly, the jth column of R_i is S_i^{(j)}(B^∗) c, where S_i^{(j)}(λ) lies in span{1/(λ − σ̄_1), …, 1/(λ − σ̄_i)}. Therefore X_{i+1} can be written as
X_{i+1} = (A + σ_{i+1} I)(A − μ_{i+1} I)^{−1} L_i R_i^∗ (B + μ_{i+1} I)(B − σ_{i+1} I)^{−1}
          + (μ_{i+1} + σ_{i+1})(A − μ_{i+1} I)^{−1} b c^∗ (B − σ_{i+1} I)^{−1}
        = Σ_{j=1}^{i} (A + σ_{i+1} I)(A − μ_{i+1} I)^{−1} T_i^{(j)}(A) b c^∗ S_i^{(j)}(B) (B + μ_{i+1} I)(B − σ_{i+1} I)^{−1}
          + (μ_{i+1} + σ_{i+1})(A − μ_{i+1} I)^{−1} b c^∗ (B − σ_{i+1} I)^{−1}.

For j = 1, …, i, let the jth column of L_{i+1} be (A + σ_{i+1} I)(A − μ_{i+1} I)^{−1} T_i^{(j)}(A) b, and let the (i+1)th column be T_{i+1}^{(i+1)}(A) b = (μ_{i+1} + σ_{i+1})(A − μ_{i+1} I)^{−1} b. Then clearly colspan(L_{i+1}) ⊂ K_{i+1}^rat(A, b, μ). Similarly, let (B^∗ + μ̄_{i+1} I)(B^∗ − σ̄_{i+1} I)^{−1} S_i^{(j)}(B^∗) c be the jth column of R_{i+1} for j = 1, …, i, and S_{i+1}^{(i+1)}(B^∗) c = (B^∗ − σ̄_{i+1} I)^{−1} c the (i+1)th column. Then colspan(R_{i+1}) ⊂ K_{i+1}^rat(B^∗, c, σ̄). Finally, we note that by construction, X_{i+1} = L_{i+1} R_{i+1}^∗. □
Next, we give our first main result showing that the approximate solutions of the Sylvester equation (1) by ADI and RKPM are indeed equivalent when the shifts are chosen as pseudo H2-optimal points. This result, applied to the special case of the Lyapunov equation, was first presented at the 2010 SIAM Annual Meeting [15] and later published independently in [13]. Our new result here, on the other hand, is more general than both [15] and [13], since it tackles the case of the Sylvester equation and includes the Lyapunov equation as a special case. Moreover, while the proof given in [13] for the special case of the Lyapunov equation makes use of a novel connection between the ADI iteration and the so-called skeleton approximation framework first developed in the work of Tyrtyshnikov [37], the proof we provide here for the more general Sylvester equation case is given directly in terms of rational Krylov interpolation conditions, and in that sense is simpler.

Theorem 2. Given the Sylvester equation (1) with Y = bc^∗, where b ∈ R^n and c ∈ R^m, let Q_r ∈ R^{n×r} be an orthonormal basis for the rational Krylov subspace K_r^rat(A, b, σ), and let U_r ∈ R^{m×r} be an orthonormal basis for the rational Krylov subspace K_r^rat(B^∗, c, σ̄), for a set of shifts σ = {σ_1, …, σ_r} where Re(σ_i) > 0 for i = 1, …, r. Let X̃_r ∈ R^{r×r} solve the projected Sylvester equation
Q_r^∗ A Q_r X̃_r + X̃_r U_r^∗ B U_r + Q_r^∗ b c^∗ U_r = 0,  (14)

and let X_r ∈ R^{n×m} be computed by applying the shifts α_i = −σ_i and β_i = −σ_i to exactly r steps of the ADI iteration (5) for i = 1, …, r. Then X_r = Q_r X̃_r U_r^∗ if and only if either λ(Q_r^∗ A Q_r) = −{σ_1, …, σ_r} or λ(U_r^∗ B U_r) = −{σ_1, …, σ_r}.

Proof. (⇐) First suppose that λ(Q_r^∗ A Q_r) = −{σ_1, …, σ_r}. The proof remains the same if we instead suppose that λ(U_r^∗ B U_r) = −{σ_1, …, σ_r}. Let A_r = Q_r^∗ A Q_r, b_r = Q_r^∗ b, B_r = U_r^∗ B U_r, and c_r = U_r^∗ c. Note that after we apply r steps of the ADI iteration with the shifts α_i = β_i = −σ_i to the projected Sylvester equation (14), we obtain the exact solution X̃_r at the rth step, since λ(A_r) = −{σ_1, …, σ_r}. By Lemma 1, X̃_r = L̃_r R̃_r^∗, where L̃_r = [T^{(1)}(A_r) b_r, …, T^{(r)}(A_r) b_r] and the T^{(i)}(A_r) b_r lie in K_r^rat(A_r, b_r, σ). Similarly, R̃_r = [S^{(1)}(B_r^∗) c_r, …, S^{(r)}(B_r^∗) c_r], where the S^{(i)}(B_r^∗) c_r lie in K_r^rat(B_r^∗, c_r, σ̄). Furthermore, for the same shifts α_i = β_i = −σ_i, i = 1, …, r, applied to r steps of the ADI iteration on the full Sylvester equation (1), we have X_r = L_r R_r^∗ with L_r = [T^{(1)}(A) b, …, T^{(r)}(A) b] and R_r = [S^{(1)}(B^∗) c, …, S^{(r)}(B^∗) c]. Thus it is sufficient to show that Q_r L̃_r = L_r and U_r R̃_r = R_r. Without loss of generality, consider just the former equation. This, in turn, amounts to showing that Q_r T^{(i)}(A_r) b_r = T^{(i)}(A) b. If the T_i(A) b are a set of orthogonal rational functions that span K_r^rat(A, b, σ), then it is sufficient to show that
Q_r T_i(A_r) b_r = T_i(A) b.  (15)
Equality (15) follows readily from the interpolation properties of the Galerkin projection, which we show below. First, note that due to the interpolation properties of the Galerkin projection, Q_r(σ_i I_r − A_r)^{−1} b_r = (σ_i I − A)^{−1} b. Let V_r = [(σ_1 I − A)^{−1} b, …, (σ_r I − A)^{−1} b]. Then, for some x ∈ R^r,

V_r x = T_i(A) b = Q_r [(σ_1 I_r − A_r)^{−1} b_r, …, (σ_r I_r − A_r)^{−1} b_r] x = Q_r T_i(A_r) b_r,  (16)
which proves (15).

(⇒) Let X̃_r be the solution of

Q_r^∗ A Q_r X̃_r + X̃_r U_r^∗ B U_r + Q_r^∗ b c^∗ U_r = 0,  (17)

where Q_r is an orthonormal basis for K_r^rat(A, b, σ) and U_r is an orthonormal basis for K_r^rat(B^∗, c, σ̄). Suppose that Q_r X̃_r U_r^∗ = X_r. Let X̂_r be the approximate solution of (17) resulting from applying the shifts α_i = β_i = −σ_i, i = 1, …, r, to exactly r steps of the ADI iteration (5). By the interpolation result given in the proof above, Q_r X̂_r U_r^∗ = X_r. It follows from the assumptions that Q_r X̃_r U_r^∗ = Q_r X̂_r U_r^∗, so X̃_r = X̂_r. But this means that X̂_r solves (17), and so either λ(Q_r^∗ A Q_r) = −{σ_1, …, σ_r} or λ(U_r^∗ B U_r) = −{σ_1, …, σ_r}. □
Remark 1. This theorem shows that the ADI approximation for the Sylvester equation is equivalent to lifting the solution of the projected Sylvester equation back to the original dimension when either λ(Q_r^∗ A Q_r) = −{σ_1, …, σ_r} or λ(U_r^∗ B U_r) = −{σ_1, …, σ_r}; hence, the two most common approximation methods for solving a Sylvester equation are indeed equivalent for this special shift selection. Recalling the pseudo H2-optimality condition (13), for a given r, these special shifts are exactly the pseudo H2-optimal shifts for a dynamical system H_1(s) = z_1(sI − A)^{−1} b or H_2(s) = z_2(sI − B^∗)^{−1} c, where z_1 and z_2 are vectors of appropriate sizes.

Remark 2. The equivalence between the RKPM and the ADI method can be extended to the case in which the matrix Y of the Sylvester equation (1) has the form Y = F G^∗, where F ∈ R^{n×p}, G ∈ R^{m×p}, m ≫ p, and n ≫ p with rank(Y) = p > 1. This setup corresponds to multi-input/multi-output (MIMO) dynamical systems H_1(s) = Z_1(sI − A)^{−1} F and H_2(s) = Z_2(sI − B^∗)^{−1} G. Assume that a set of r distinct shifts σ = {σ_1, …, σ_r} is used to construct the corresponding block rational Krylov subspace K_r^rat(A, F, σ) = span{(σ_1 I − A)^{−1} F, …, (σ_r I − A)^{−1} F}. Then the equivalence result in Theorem 2 holds as before if and only if A_r is a diagonalizable matrix with exactly r distinct eigenvalues λ = {λ_1, …, λ_r} such that λ = −σ. A similar result holds for K_r^rat(B^∗, G, σ̄) and B_r. The connection to the optimal
H2 approximation proves harder to extend to the MIMO case because the interpolation-based necessary conditions for the H2-optimal approximation of MIMO systems require enforcing tangential interpolation conditions [16,2]. These interpolation conditions can be enforced by constructing appropriate projection subspaces. Unlike in the SISO case, these subspaces do not, in general, correspond to rational Krylov subspaces; thus, the connection to H2-optimality is lost for the MIMO case in general. Determining the special cases under which, for some restricted set of H2-optimality conditions, the connection can be re-established for MIMO systems presents an interesting research direction to pursue.

2.3. Orthogonality in the case of the Lyapunov equation

The parameters for which the ADI iteration and the rational Krylov projections coincide also satisfy orthogonality conditions on the residual for the special case of the Lyapunov equation
A X + X A^∗ + b b^∗ = 0.  (18)
For a given approximation X_r to the solution X, define the residual R as

R = A X_r + X_r A^∗ + b b^∗.  (19)
The following result was first given in [13]. Here we present a new, more concise proof of the orthogonality result in terms of the special interpolation properties of the pseudo H2-optimal shifts.
Theorem 3. Given A X + X A^∗ + b b^∗ = 0, let X̃_r ∈ R^{r×r} solve the projected Lyapunov equation

Q_r^∗ A Q_r X̃_r + X̃_r Q_r^∗ A^∗ Q_r + Q_r^∗ b b^∗ Q_r = 0,

where Q_r is an orthonormal basis for K_r^rat(A, b, σ) with σ = {σ_1, …, σ_r}. Let X_r = Q_r X̃_r Q_r^∗. Then Q_r^∗ R = 0 if and only if λ(Q_r^∗ A Q_r) = −{σ_1, …, σ_r}, where R is the residual defined in (19).

Proof. (⇒) Suppose that Q_r^∗ R = 0. Multiplying (19) with Q_r^∗ from the left and then transposing the resulting equation leads to
A Q_r X̃_r + Q_r X̃_r Q_r^∗ A^∗ Q_r + b b^∗ Q_r = 0.  (20)
Let A_r = Q_r^∗ A Q_r = T Λ T^{−1} be the eigenvalue decomposition of A_r, where Λ = diag(λ_1, …, λ_r). Plugging these expressions into (20) and right-multiplying by T^{−∗} gives
Q_r X̃_r T^{−∗} Λ^∗ + A Q_r X̃_r T^{−∗} + b b^∗ Q_r T^{−∗} = 0.  (21)
Let ζ_i be the ith entry of b^∗ Q_r T^{−∗}. Then it is straightforward to show that the ith column of Q_r X̃_r T^{−∗} must be (−λ̄_i I − A)^{−1} b ζ_i. Thus it follows that K_r^rat(A, b, σ) = K_r^rat(A, b, −λ̄), where λ = {λ_1, …, λ_r}. Since both sets σ and λ are closed under conjugation, after an appropriate reordering we obtain σ_i = −λ_i.

(⇐) Observe that
A_r X̃_r + X̃_r A_r^∗ + Q_r^∗ b b^∗ Q_r = 0  (22)

implies

A_r X̃_r T^{−∗} + X̃_r T^{−∗} Λ^∗ + Q_r^∗ b b^∗ Q_r T^{−∗} = 0.  (23)
Thus, the ith column of X̃_r T^{−∗} is (−λ̄_i I_r − A_r)^{−1} Q_r^∗ b ζ_i. But since Q_r is an orthonormal basis for K_r^rat(A, b, σ) and λ_i = −σ̄_i, this means
Q_r(−λ̄_i I_r − A_r)^{−1} Q_r^∗ b ζ_i = (−λ̄_i I − A)^{−1} b ζ_i = Q_r X̃_r T^{−∗} e_i,  (24)
where e i is the ith unit vector. Thus,
Q_r X̃_r T^{−∗} Λ^∗ + A Q_r X̃_r T^{−∗} + b b^∗ Q_r T^{−∗} = 0,  (25)
which implies
Q_r X̃_r Q_r^∗ A^∗ Q_r + A Q_r X̃_r + b b^∗ Q_r = 0.  (26)
Transpose this last expression and use the fact that Q r∗ Q r = I r to obtain
Q_r^∗ Q_r X̃_r Q_r^∗ A^∗ + Q_r^∗ A Q_r X̃_r Q_r^∗ + Q_r^∗ b b^∗ = Q_r^∗ R = 0,  (27)

which is the desired result. □
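Theorem 3 can be checked numerically. For r = 1 the pseudo H2 condition (13) reduces to the scalar equation q^∗ A q = −σ for q spanning K_1^rat(A, b, σ), which a bracketed root finder can solve on a symmetric example; the residual orthogonality Q_r^∗ R = 0 then holds to roundoff. The root-finding device and the data below are illustrative choices of ours, not a construction from the paper.

```python
# Sketch verifying the residual orthogonality of Theorem 3 for r = 1
# on a symmetric negative definite example. Illustrative data only.
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(6)
n = 15
M = rng.standard_normal((n, n))
A = -(M @ M.T) / n - np.eye(n)           # symmetric negative definite
b = rng.standard_normal((n, 1))

def shift_residual(sigma):
    q = np.linalg.solve(sigma * np.eye(n) - A, b)
    q = q / np.linalg.norm(q)
    return float(q.T @ A @ q) + sigma     # zero exactly at a pseudo-H2 point

lam = np.linalg.eigvalsh(A)               # ascending, all negative
sigma = brentq(shift_residual, -lam[-1], -lam[0])   # bracket [-lam_max, -lam_min]

q = np.linalg.solve(sigma * np.eye(n) - A, b)
q = q / np.linalg.norm(q)
ar, br = float(q.T @ A @ q), float(q.T @ b)
X1 = (-br**2 / (2.0 * ar)) * (q @ q.T)    # lifted solution of the projected equation
R = A @ X1 + X1 @ A.T + b @ b.T           # Lyapunov residual (19)
print(np.linalg.norm(q.T @ R) / np.linalg.norm(R))
```

The bracket is valid because the Ritz value q^∗ A q always lies between λ_min and λ_max, so the function changes sign across [−λ_max, −λ_min]; the printed ratio is at the level of roundoff, as the theorem predicts.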
Remark 3. As we have previously noted, in almost every practical situation one would choose a set of shifts σ that is closed under conjugation. But even when this assumption on σ does not hold, Theorem 2 holds as is, and Theorem 3 applies with a slight modification; to wit, λ_i = −σ̄_i for i = 1, …, r.
Fig. 1. Relative error in the 2-norm as r varies for the EADY model.
Table 1
Comparison of methods with the EADY model (order 598). Method columns give the relative error ‖X − X_r‖_2/‖X‖_2.

r     π_{r+1}/π_1       Method 1        Method 2        Method 3
10    2.31 × 10^−4      8.42 × 10^−2    1.28 × 10^−3    1.96 × 10^−1
20    3.38 × 10^−7      5.73 × 10^−2    4.99 × 10^−7    1.13 × 10^−1
30    7.63 × 10^−9      1.09 × 10^−3    8.46 × 10^−9    8.54 × 10^−2
40    4.83 × 10^−10     1.93 × 10^−4    1.06 × 10^−9    9.70 × 10^−3
3. A numerical study on using the pseudo H2-optimal points as the ADI shifts

Having shown that using the pseudo H2-optimal points in the ADI iteration for the Sylvester equation is equivalent to applying the RKPM, and that the pseudo H2 points lead to an orthogonality condition in the case of the Lyapunov equation, the natural question to ask is what quality of approximation the pseudo H2-optimal points deliver as ADI shifts. We briefly investigate this issue in this section. However, we emphasize that the purpose of our numerical results is not to advocate employing the pseudo H2-optimal shifts in the ADI iteration or in the RKPM; this would be a costly numerical method for approximating the solutions of Sylvester equations, since obtaining the pseudo H2-optimal shifts already requires solving several linear systems. Our numerical results are meant to illustrate the unique quality of these shifts compared with other choices of shifts that do not share the ADI-RKPM equivalency property.

We used three benchmark models in our numerical simulations: the CD Player model with n = 120, the EADY model with n = 598, and the Rail model with n = 1357. The first two models are described in detail in [12] and the Rail model in [10]. For all three models, we compute a rank-r approximation to the solution of the Lyapunov equation (18). The exact and approximate solutions are denoted by X and X_r, respectively. The Rail model has multiple inputs; thus for this model we only use the first column of the input matrix. We use three different approximation methods for each model:
• Method 1: The RKPM is applied with a sequence of shifts that alternates between 0 and ∞. The resulting subspace is generally referred to as the extended Krylov subspace; its use within the RKPM was first introduced by Simoncini in [30].
• Method 2: The RKPM is applied using r pseudo H2-optimal shifts; equivalently, r steps of the ADI iteration are applied using the r pseudo H2-optimal shifts.
• Method 3: r steps of the ADI iteration are applied, with the ADI shifts chosen via Penzl's heuristic method [28].
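A Method-1-style experiment can be sketched on synthetic data (this is not the benchmark setup of the paper): project onto an extended Krylov subspace, solve the projected Lyapunov equation, and compare the relative 2-norm error with the lower bound π_{r+1}/π_1 from the singular values of X.

```python
# Sketch of a Method-1-style comparison on synthetic data:
# Galerkin projection onto an extended Krylov subspace versus the
# best possible rank-r relative error pi_{r+1}/pi_1. Illustrative only.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(7)
n, r = 60, 10
M = rng.standard_normal((n, n))
A = -(M @ M.T) / n - np.eye(n)               # symmetric, stable
b = rng.standard_normal((n, 1))

X = solve_continuous_lyapunov(A, -b @ b.T)   # "exact" solution, for reference
pis = np.linalg.svd(X, compute_uv=False)
best = pis[r] / pis[0]                       # optimal rank-r relative 2-norm error

# Extended Krylov subspace: b, A^{-1} b, A b, A^{-2} b, ...
cols, vp, vm = [], b.copy(), b.copy()
for _ in range(r // 2):
    cols.append(vp); vp = A @ vp
    vm = np.linalg.solve(A, vm); cols.append(vm)
Qr, _ = np.linalg.qr(np.column_stack([v.ravel() for v in cols]))

br = Qr.T @ b
Xt = solve_continuous_lyapunov(Qr.T @ A @ Qr, -br @ br.T)
rel = np.linalg.norm(X - Qr @ Xt @ Qr.T, 2) / np.linalg.norm(X, 2)
print(best, rel)                             # best is a lower bound for rel
```

By the Eckart–Young theorem, no rank-r approximation can beat π_{r+1}/π_1, which is why the tables below report that quantity alongside the three methods.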
Fig. 2. Relative error as r varies for the CD Player model.
Table 2
Comparison of errors for the CD Player model. Method columns give ‖X − X_r‖_2/‖X‖_2.

r     π_{r+1}/π_1      Method 1        Method 2        Method 3
4     3.20 × 10^−4     6.71 × 10^−1    3.20 × 10^−4    8.04 × 10^−1
10    1.68 × 10^−5     2.39 × 10^−1    1.73 × 10^−5    9.92 × 10^−1
20    1.92 × 10^−6     3.70 × 10^−2    1.10 × 10^−5    7.84 × 10^−2
40    3.22 × 10^−8     1.62 × 10^−3    8.24 × 10^−8    8.56 × 10^−2
Table 3
Comparison of errors for the Rail model. Method columns give ‖X − X_r‖_2/‖X‖_2.

r     π_{r+1}/π_1      Method 1        Method 2        Method 3
2     7.47 × 10^−2     3.83 × 10^0     9.47 × 10^−1    2.14 × 10^−1
6     2.31 × 10^−3     1.44 × 10^−1    8.86 × 10^−2    3.80 × 10^−2
14    6.76 × 10^−6     2.88 × 10^−2    8.43 × 10^−5    2.90 × 10^−4
20    4.96 × 10^−8     8.89 × 10^−3    7.29 × 10^−7    6.85 × 10^−7
The quality of the resulting approximations from each method is compared using the relative error in the 2-norm, ‖X − X_r‖_2/‖X‖_2. Fig. 1 shows the relative errors for the EADY model as r varies from 1 to 50, together with the minimum possible error π_{r+1}/π_1, where π_i denotes the ith singular value of the true solution X. Note that for a given r, the pseudo H2-optimal shifts perform remarkably well, almost matching the best low-rank approximation given by the singular value decomposition. For a selected number of r values, these numbers are also tabulated in Table 1, further illustrating the effectiveness of the pseudo H2 points as ADI or RKPM shifts. Similar results are shown for the CD Player model in Fig. 2 and Table 2, and for the Rail model in Fig. 3 and Table 3, illustrating that the pseudo H2-optimal shifts produce a nearly optimal rank-r approximation in several cases. Indeed, this phenomenon was recently explained by Benner and Breiten in [7], where they show that the H2-optimal shifts are optimal in a special energy norm related to the Lyapunov equation; for details we refer the reader to [7].

4. Conclusions

In this paper we presented a new result that solidifies the connection between the ADI iteration and rational Krylov projection methods for solving large-scale Sylvester equations. We have shown that for one-sided projections, the two
Fig. 3. Relative error as r varies for the Rail model.
methods are indeed equivalent for a special choice of shifts called pseudo H2-optimal shifts, so-called because they partially satisfy the first-order necessary conditions for H2-optimal model reduction. These shifts are also optimal in the sense that, in the case of the Lyapunov equation, they produce an approximation whose residual is orthogonal to the rational Krylov projection subspace.

Acknowledgements

The authors thank Prof. Christopher Beattie of Virginia Tech for several fruitful discussions on this subject. This work has been supported in part by NSF Grant DMS-0645347.

References

[1] A.C. Antoulas, Approximation of Large-Scale Dynamical Systems, Advances in Design and Control, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2005.
[2] A.C. Antoulas, C.A. Beattie, S. Gugercin, Interpolatory model reduction of large-scale dynamical systems, in: J. Mohammadpour, K. Grigoriadis (Eds.), Efficient Modeling and Control of Large-Scale Systems, Springer-Verlag, 2010.
[3] L. Bao, Y. Lin, Y. Wei, A new projection method for solving large Sylvester equations, Appl. Numer. Math. 57 (5–7) (2007) 521–532.
[4] R.H. Bartels, G.W. Stewart, Algorithm 432: Solution of the matrix equation AX + XB = C, Commun. ACM 15 (9) (1972) 820–826.
[5] C.A. Beattie, S. Gugercin, A trust region method for optimal H2 model reduction, in: 48th IEEE Conference on Decision and Control, December 2009.
[6] B. Beckermann, S. Güttel, R. Vandebril, On the convergence of rational Ritz values, SIAM J. Matrix Anal. Appl. 31 (4) (2010) 1740–1774.
[7] P. Benner, T. Breiten, On optimality of interpolation-based low-rank approximations of large-scale matrix equations, Max Planck Institute Magdeburg Preprint, December 2011.
[8] P. Benner, J.-R. Li, T. Penzl, Numerical solution of large Lyapunov equations, Riccati equations, and linear-quadratic control problems, Numer. Linear Algebra Appl. 15 (9) (2008) 755–777.
[9] P. Benner, R.-C. Li, N. Truhar, On the ADI method for Sylvester equations, J. Comput. Appl. Math. 233 (4) (2009) 1035–1045.
[10] P. Benner, J. Saak, Efficient numerical solution of the LQR-problem for the heat equation, Proc. Appl. Math. Mech. 4 (1) (2004) 648–649.
[11] A. Bunse-Gerstner, D. Kubalinska, G. Vossen, D. Wilczek, H2-norm optimal model reduction for large scale discrete dynamical MIMO systems, J. Comput. Appl. Math. 233 (5) (2010) 1202–1216.
[12] Y. Chahlaoui, P. Van Dooren, Benchmark examples for model reduction of linear time-invariant dynamical systems, Dimens. Reduct. Large-Scale Syst. 45 (2005) 381–395.
[13] V. Druskin, L. Knizhnerman, V. Simoncini, Analysis of the rational Krylov subspace and ADI methods for solving the Lyapunov equation, SIAM J. Numer. Anal. 49 (5) (2011) 1875–1898.
[14] A. El Guennouni, K. Jbilou, A.J. Riquet, Block Krylov subspace methods for solving large Sylvester equations, Numer. Algorithms 29 (1) (2002) 75–96.
[15] G.M. Flagg, H2-optimal interpolation: New properties and applications, Talk given at the 2010 SIAM Annual Meeting, Pittsburgh, PA, July 2010.
[16] S. Gugercin, A.C. Antoulas, C. Beattie, H2 model reduction for large-scale linear dynamical systems, SIAM J. Matrix Anal. Appl. 30 (2) (2008) 609–638.
[17] S. Gugercin, R.V. Polyuga, C.A. Beattie, A. van der Schaft, Structure-preserving tangential interpolation for model reduction of port-Hamiltonian systems, Automatica 48 (9) (2012) 1963–1974, arXiv:1101.3485v2.
[18] S. Gugercin, D.C. Sorensen, A.C. Antoulas, A modified low-rank Smith method for large-scale Lyapunov equations, Numer. Algorithms 32 (1) (2003) 27–55.
[19] M. Heinkenschloss, D.C. Sorensen, K. Sun, Balanced truncation model reduction for a class of descriptor systems with application to the Oseen equations, SIAM J. Sci. Comput. 30 (2008) 1038.
[20] D.Y. Hu, L. Reichel, Krylov-subspace methods for the Sylvester equation, Linear Algebra Appl. 172 (1992) 283–313.
[21] D. Hyland, D. Bernstein, The optimal projection equations for model reduction and the relationships among the methods of Wilson, Skelton, and Moore, IEEE Trans. Automat. Control 30 (12) (1985) 1201–1211.
[22] I.M. Jaimoukha, E.M. Kasenally, Krylov subspace methods for solving large Lyapunov equations, SIAM J. Numer. Anal. (1994) 227–251.
[23] K. Jbilou, Low rank approximate solutions to large Sylvester matrix equations, Appl. Math. Comput. 177 (1) (2006) 365–376.
[24] J.R. Li, J. White, Low rank solution of Lyapunov equations, SIAM J. Matrix Anal. Appl. 24 (1) (2002) 260–280.
[25] J.R. Li, J. White, Low-rank solution of Lyapunov equations, SIAM Rev. 46 (2004) 693–713.
[26] L. Meier III, D. Luenberger, Approximation of linear constant systems, IEEE Trans. Automat. Control 12 (5) (1967) 585–588.
[27] D.W. Peaceman, H.H. Rachford, The numerical solution of parabolic and elliptic differential equations, J. Soc. Ind. Appl. Math. 3 (1) (1955) 28–41.
[28] T. Penzl, A cyclic low rank Smith method for large sparse Lyapunov equations, SIAM J. Sci. Comput. 21 (4) (2000) 1401–1418.
[29] J. Sabino, Solution of large-scale Lyapunov equations via the block modified Smith method, PhD thesis, Rice University, 2007.
[30] V. Simoncini, A new iterative method for solving large-scale Lyapunov matrix equations, SIAM J. Sci. Comput. 29 (3) (2008) 1268–1288.
[31] D.C. Sorensen, A.C. Antoulas, The Sylvester equation and approximate balanced reduction, Linear Algebra Appl. 351 (2002) 671–700.
[32] J.T. Spanos, M.H. Milman, D.L. Mingori, A new algorithm for L2 optimal model reduction, Automatica 28 (5) (1992) 897–909.
[33] G. Starke, Optimal alternating direction implicit parameters for nonsymmetric systems of linear equations, SIAM J. Numer. Anal. 28 (5) (1991) 1431–1445.
[34] G. Starke, Fejer–Walsh points for rational functions and their use in the ADI iterative method, J. Comput. Appl. Math. 46 (1–2) (1993) 129–141.
[35] T. Stykel, V. Simoncini, Krylov subspace methods for projected Lyapunov equations, Appl. Numer. Math. 62 (1) (2012) 35–50.
[36] N. Truhar, R.C. Li, On the ADI method for Sylvester equations, Technical Report 2008-02, Department of Mathematics, University of Texas at Arlington, 2008, available at http://www.uta.edu/math/preprint/rep200802.pdf.
[37] E. Tyrtyshnikov, Mosaic-skeleton approximations, Calcolo 33 (1) (1996) 47–57.
[38] P. Van Dooren, K.A. Gallivan, P.A. Absil, H2-optimal model reduction of MIMO systems, Appl. Math. Lett. 21 (12) (2008) 1267–1273.
[39] E.L. Wachspress, Extended application of alternating direction implicit iteration model problem theory, J. Soc. Ind. Appl. Math. 11 (4) (1963) 994–1016.
[40] E.L. Wachspress, Iterative solution of the Lyapunov matrix equation, Appl. Math. Lett. 107 (1988) 87–90.
[41] E.L. Wachspress, The ADI minimax problem for complex spectra, Appl. Math. Lett. 1 (3) (1988) 311–314.
[42] E.L. Wachspress, Trail to a Lyapunov equation solver, Comput. Math. Appl. 55 (8) (2008) 1653–1659.
[43] D.A. Wilson, Optimum solution of model-reduction problem, Proc. IEE 117 (6) (1970) 1161–1165.