Stably Computing the Kronecker Structure and Reducing Subspaces of Singular Pencils A − λB for Uncertain Data




Large Scale Eigenvalue Problems, J. Cullum and R.A. Willoughby (Editors), © Elsevier Science Publishers B.V. (North-Holland), 1986


Stably Computing the Kronecker Structure and Reducing Subspaces of Singular Pencils A − λB for Uncertain Data

James Demmel

Bo Kågström

Courant Institute, 215 Mercer Str., New York, NY 10012, USA

Institute of Information Processing, University of Umeå, S-901 87 Umeå, Sweden

We present algorithms and error bounds for computing the Kronecker structure (generalized eigenvalues and eigenvectors) of matrix pencils A − λB. Applications of matrix pencils to descriptor systems, singular systems of differential equations, and state-space realizations of linear systems demand that we accurately compute features of the Kronecker structure which are mathematically ill-posed and potentially numerically unstable. In this paper we show this can be accomplished efficiently and with computable error bounds.

1. Introduction

1.1 Motivation and short summary

During the last few years there has been an increasing interest in the numerical treatment of general matrix pencils A − λB and the computation of the Kronecker canonical form, or KCF. The main reason for this interest is that in many applications, e.g. linear systems theory [40], descriptor systems [1, 24, 31] and singular systems of differential equations [4], [46], problems are modeled in terms of linear matrix pencils. Given information about the KCF, questions about the existence and uniqueness of solutions, the state of a system, or even explicit solutions can easily be answered. Algorithms have been proposed by several authors (see section 4) for computing the KCF. These algorithms share the property of backward stability: they compute the KCF (or some of its features discussed in section 1.2) for a pencil C − λD which lies some small (or at least monitorable) distance from the problem supplied as input, A − λB. A natural question to ask is how such a perturbation, measured as ‖(A,B) − (C,D)‖ ≤ ε (be it roundoff, a user supplied estimate of measurement error, or a user supplied error tolerance), can affect the KCF or related quantities being computed. In particular a perturbation analysis where the size of the perturbation is a parameter (not necessarily small) is needed to rigorously determine the effects of the uncertainty in the data (see sections 4 and 5). It is not at all obvious that such a perturbation analysis is possible at all, since the mathematical problem is frequently unstable, so that arbitrarily small perturbations in the data may cause large changes in the answer. For some applications (e.g. in linear systems theory) it is important to calculate unstable features of the KCF because they still contain pertinent physical information.
In fact one sometimes wants to compute as unstable (nongeneric) a KCF as possible, because this in a certain sense allows low dimensional approximations to high dimensional problems, in particular low dimensional approximate minimal realizations of linear systems ([39], [48], see also sections 2.3 and 2.4). In this setting one wants to know what nongeneric pencils lie within a distance ε of a given pencil A − λB, in order to pick the one supplying the best approximation (see section 4.5).

Before we go into further details we outline the rest of the paper. In section 1.2 we collect the basic algebraic theory of linear matrix pencils, for example regular and singular pencils, the KCF, minimal indices, and generalized subspaces (deflating and reducing subspaces). In section 1.3 we define other notation. Section 2 describes some applications where general matrix pencils A − λB appear, for example descriptor systems (section 2.1), singular systems of differential equations (section 2.2), and state-space realizations of (generalized) linear systems (sections 2.3 and 2.4). Section 3 considers algorithms for computing the Kronecker structure and its relevant features like reducing subspaces. The section starts with a geometric interpretation of a singular pencil in canonical form (KCF) and introduces the RGSVD and RGQZD algorithms ([19, 20]) that are reviewed in section 3.2. In section 3.3 the transformation of a singular pencil to a generalized upper triangular form (GUPTRI) is discussed. A new more efficient implementation of the RGQZD algorithm, which is used as the basic decomposition in GUPTRI, is also presented. Section 3.4 is concluded with a short review of other approaches and algorithms, notably Kublanovskaya [16, 17], Van Dooren [39, 41], and Wilkinson [46]. In section 4 the perturbation theory of singular pencils is formally introduced. The section begins by reviewing the perturbation theory for regular pencils (section 4.1). Section 4.2 discusses why the singular case is harder than the regular case. Perturbation bounds for pairs of reducing subspaces and the spectrum of the regular part are presented (sections 4.3 and 4.4).
For a more complete presentation and proofs the reader is referred to [11]. Finally in section 4.5 we make some interpretations of the perturbation results to linear systems theory by deriving perturbation bounds for the controllable subspace and uncontrollable modes. We illustrate the perturbation results with two numerical examples. Armed with the perturbation analysis for singular pencils, section 5 is devoted to analyzing the error of standard algorithms for computing the Kronecker structure from section 3. An upper bound for the distance from the input pencil A − λB to the nearest pencil C − λD with the computed Kronecker structure and perturbation bounds for computed pairs of reducing subspaces are presented (sections 5.1 and 5.2, respectively). In section 6 we present some numerical examples (one generic and one nongeneric pencil) and assess the computed results by using the theory from sections 4 and 5. Finally in section 7 we give some conclusions and outline directions for our future work.

1.2 Algebraic theory for singular pencils

In this section we outline the algebraic theory of singular pencils in terms of the Kronecker Canonical Form (KCF). Suppose A, B, S and T are m by n complex matrices. If there are nonsingular matrices P and Q, where P is m by m and Q is n by n, such that

A − λB = P (S − λT) Q^{-1}   (1.1)

then we say the pencils A − λB and S − λT are equivalent and that P(·)Q^{-1} is an



equivalence transformation. Our goal is to find P and Q so that S − λT is in a particularly simple form, block diagonal: S = diag(S_11, …, S_bb) and T = diag(T_11, …, T_bb). We can group the columns of P into blocks corresponding to the blocks of S − λT: P = [P_1 | … | P_b] where P_i is m by m_i, m_i being the number of rows of S_ii − λT_ii. Similarly, we can group the columns of Q into blocks corresponding to the blocks of S − λT: Q = [Q_1 | … | Q_b] where Q_i is n by n_i, n_i being the number of columns of S_ii − λT_ii. The diagonal blocks S_ii − λT_ii contain information about the generalized eigenstructure of the pencil A − λB, and P_i and Q_i contain information about the corresponding generalized eigenspaces. One canonical decomposition of the form (1.1) is the Kronecker Canonical Form [12], where each block S_ii − λT_ii must be of one of the following forms:

J_k(λ_0) − λI_k =

  [ λ_0−λ    1               ]
  [         λ_0−λ   .        ]
  [                 .    1   ]
  [                   λ_0−λ  ]

This is simply a Jordan block. λ_0 is called a finite eigenvalue of A − λB.

N_k = I_k − λJ_k(0) =

  [ 1   −λ           ]
  [     1    .       ]
  [          .   −λ  ]
  [              1   ]

This block corresponds to an infinite eigenvalue of multiplicity equal to the dimension of the block. The blocks of finite and infinite eigenvalues together constitute the regular part of the pencil.

L_k =

  [ −λ   1            ]
  [      −λ   .       ]     (k by k+1)
  [           .    1  ]

This k by k+1 block is called a singular block of minimal right (or column) index k. It has a one dimensional right null space for any λ.

L_j^T =

  [ −λ            ]
  [ 1    .        ]
  [      .   −λ   ]     (j+1 by j)
  [          1    ]

This j+1 by j block is called a singular block of minimal left (or row) index j. It has a one dimensional left null space for any λ. The left and right singular blocks together constitute the singular part of the pencil.

If a pencil A − λB has only a regular part, it is called regular. A − λB is regular if and only if it is square and its determinant det(A − λB) is not identically zero. Otherwise, there is at least one singular block L_k or L_j^T in the KCF of A − λB and it is called singular. In the regular case, A − λB has n generalized eigenvalues which may be finite or infinite. The diagonal blocks of S − λT partition the spectrum of A − λB as follows:


σ = σ(A − λB) = ∪_{i=1}^{b} σ(S_ii − λT_ii) = ∪_{i=1}^{b} σ_i .
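As a concrete illustration of the four block types (our own sketch, not part of the paper; it assumes NumPy), each block can be represented as a pair (A, B) standing for the pencil A − λB, and a block diagonal pencil can be assembled from them:

```python
import numpy as np

def jordan_block(k, lam0):
    """J_k(lam0) - lambda*I_k: k-by-k Jordan block for a finite eigenvalue lam0."""
    return lam0 * np.eye(k) + np.eye(k, k=1), np.eye(k)

def infinite_block(k):
    """N_k = I_k - lambda*J_k(0): k-by-k block for an infinite eigenvalue."""
    return np.eye(k), np.eye(k, k=1)

def L_right(k):
    """L_k: k-by-(k+1) singular block of minimal right (column) index k."""
    return np.eye(k, k + 1, k=1), np.eye(k, k + 1)

def L_left(j):
    """L_j^T: (j+1)-by-j singular block of minimal left (row) index j."""
    A, B = L_right(j)
    return A.T.copy(), B.T.copy()

def block_diag_pencil(blocks):
    """diag(S_11 - lambda*T_11, ..., S_bb - lambda*T_bb) as one (S, T) pair."""
    m = sum(a.shape[0] for a, _ in blocks)
    n = sum(a.shape[1] for a, _ in blocks)
    S, T = np.zeros((m, n)), np.zeros((m, n))
    i = j = 0
    for a, b in blocks:
        r, c = a.shape
        S[i:i+r, j:j+c], T[i:i+r, j:j+c] = a, b
        i, j = i + r, j + c
    return S, T

# L_k has the right null vector (1, lam, lam^2, ...) for every lam:
A, B = L_right(2)
lam = 3.0
print(np.allclose((A - lam * B) @ np.array([1.0, lam, lam**2]), 0))  # True
```

The null-vector check at the end illustrates why an L_k block makes the pencil singular: its right null space is one dimensional for any value of λ.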

The subspaces spanned by P_i and Q_i are called left and right deflating subspaces of A − λB corresponding to the part of the spectrum σ_i [32, 41]. As shown in [41], a pair of subspaces P and Q is deflating for A − λB if P = AQ + BQ and dim(Q) = dim(P). They are the generalization of invariant subspaces for the standard eigenvalue problem A − λI to the regular pencil case: Q is a (right) invariant subspace of A if Q = AQ + Q, i.e. AQ ⊆ Q. Deflating subspaces are determined uniquely by the requirement that S − λT in (1.1) be block diagonal: different choices of the P_i or Q_i submatrices will span the same spaces P_i and Q_i. The situation is not as simple in the singular case. The following example shows that the spaces spanned by the P_i and Q_i may no longer all be well defined:

As x grows large, the space spanned by Q_2 (the last column of Q) can become arbitrarily close to the space spanned by Q_1 (the first two columns of Q). Similarly the space spanned by P_2 (the last column of P) can become arbitrarily close to the space spanned by P_1 (the first column of P). Thus, we must modify the notion of deflating subspace used in the regular case, since these subspaces no longer all have unique definitions. The correct concept to use is reducing subspace, as introduced in [41]. P and Q are reducing subspaces for A − λB if P = AQ + BQ and dim(P) = dim(Q) − dim(N_r), where N_r is the right null space of A − λB over the field of rational functions in λ. It is easy to express dim(N_r) in terms of the KCF of A − λB: it is the number of L_k blocks in the KCF [41]. In the example above, N_r is one dimensional and spanned by [1, λ, 0]^T. In this example the nontrivial pair of reducing subspaces are spanned by P_1 and Q_1 and are well defined.

In terms of the KCF, we may define reducing subspaces as follows. Assume in (1.1) that S − λT is in KCF, with diagonal blocks S_11 − λT_11 through S_rr − λT_rr of the type L_k (right singular blocks), S_{r+1,r+1} − λT_{r+1,r+1} through S_{r+reg,r+reg} − λT_{r+reg,r+reg} regular blocks (Jordan blocks or blocks with a single infinite eigenvalue), and the remaining S_ii − λT_ii blocks of the type L_j^T (left singular blocks). Assume for simplicity that there is one 1 by 1 regular block for each distinct eigenvalue (finite or infinite). Then there are exactly 2^reg pairs of reducing subspaces, each one corresponding to a distinct subset of the reg eigenvalues. If P, Q are such a pair, Q is spanned by the columns of Q_1, …, Q_r and those Q_i that correspond to the eigenvalues in the chosen subset. P is spanned by the columns of P_1, …, P_r and those P_i corresponding to the chosen eigenvalues.
The smallest subset of eigenvalues is the null set, yielding P and Q of minimum possible dimension over all pairs of reducing subspaces; this pair is called the pair of minimal reducing subspaces. Similarly, the largest subset of eigenvalues is the whole set, yielding P and Q of maximal dimension; this pair is called the pair of maximal reducing subspaces. These pairs will later play a central role in applications. In any event, given A − λB, any pair of reducing subspaces is uniquely identified by specifying what subset of eigenvalues it corresponds to. We illustrate these concepts with the following 7 by 8 example:

P^{-1}(A − λB)Q = diag(L_0, L_1, J_2(0), 2J_1(∞), L_1^T) =


  [ 0  −λ   1   0   0   0   0   0  ]
  [ 0   0   0  −λ   1   0   0   0  ]
  [ 0   0   0   0  −λ   0   0   0  ]
  [ 0   0   0   0   0   1   0   0  ]
  [ 0   0   0   0   0   0   1   0  ]
  [ 0   0   0   0   0   0   0  −λ  ]
  [ 0   0   0   0   0   0   0   1  ]

Letting e_i denote the unit column vector with a 1 at position i and zeros elsewhere, we see that the minimal reducing subspaces are given by

Q_min = span{e_1, e_2, e_3} and P_min = span{e_1}

and the maximal reducing subspaces are given by Q_max = span{e_1, e_2, e_3, e_4, e_5, e_6, e_7} and P_max = span{e_1, e_2, e_3, e_4, e_5}. By choosing two different subsets of the spectrum of A − λB, we may get different pairs of reducing subspaces of the same dimensions. Thus Q = span{e_1, e_2, e_3, e_4, e_5} and P = span{e_1, e_2, e_3} correspond to the double eigenvalue at 0, and Q = span{e_1, e_2, e_3, e_6, e_7} and P = span{e_1, e_4, e_5} correspond to the double eigenvalue at ∞.

This concludes our summary of the purely algebraic properties of pencils. Since we are interested in computing with them as well, we will now describe some of their analytic properties, and explain why computing the KCF of a singular pencil may be an ill-posed problem. Suppose first that A − λB is a square pencil. It turns out almost all square pencils are regular; in other words almost any arbitrarily small perturbation of a square singular pencil will make it regular. We describe this situation by saying that a generic square pencil is regular. Similarly, if A − λB is an m by n nonsquare pencil, it turns out almost all m by n pencils have the same KCF; in particular they almost all have KCFs consisting of L_k blocks alone (if m < n) or of L_j^T blocks alone (if m > n).
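The dimension relation dim(P) = dim(Q) − dim(N_r) defining reducing subspaces can be checked numerically on the example above. The sketch below is ours (it assumes NumPy) and writes the pencil diag(L_0, L_1, J_2(0), 2J_1(∞), L_1^T) out entrywise:

```python
import numpy as np

# The example pencil diag(L0, L1, J2(0), 2*J1(inf), L1^T), stored as A - lambda*B.
A = np.zeros((7, 8)); B = np.zeros((7, 8))
A[0, 2] = 1; B[0, 1] = 1        # L1      (row 1; -lambda in col 2, 1 in col 3)
A[1, 4] = 1; B[1, 3] = 1        # J2(0)   (rows 2, 3; columns 4, 5)
B[2, 4] = 1
A[3, 5] = 1; A[4, 6] = 1        # two J1(inf) blocks (rows 4, 5; columns 6, 7)
B[5, 7] = 1; A[6, 7] = 1        # L1^T    (rows 6, 7; column 8)
# (the L0 block is the zero column 1)

def left_dim(cols):
    """dim(A*Q + B*Q) for Q spanned by the unit vectors e_i, i in cols (0-based)."""
    return np.linalg.matrix_rank(np.hstack([A[:, cols], B[:, cols]]))

dim_Nr = 2                       # two right singular blocks: L0 and L1
for name, Q in [("minimal", [0, 1, 2]), ("maximal", [0, 1, 2, 3, 4, 5, 6])]:
    print(name, left_dim(Q) == len(Q) - dim_Nr)   # True for both pairs
```

Both the minimal pair (dimensions 3 and 1) and the maximal pair (dimensions 7 and 5) satisfy the defining dimension count.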

σ_i^+ := 0 if σ_i = 0. κ(A) will denote the condition number σ_max(A)/σ_min(A) of the matrix A; this applies to nonsquare A as well. A⊗B will denote the Kronecker product of the two matrices A and B: A⊗B = [A_ij B]. Let col A denote the column vector formed by taking the columns of A and stacking them atop one another from left to right. Thus if A is m by n, col A is mn by 1 with its first m entries being column 1 of A, its second m entries being column 2 of A, and so on.

2. Matrix pencils in some applications

2.1 Descriptor systems

One field of applications is concerned with the solution of the linear implicit system of differential equations

Bx′(t) = Ax(t) + f(t),  x(t_0) = x_0   (2.1)

where A and B are n by n (complex) matrices, possibly singular, f(t) is an n component input vector, x(t) is an n vector, and t_0 is the initial time with the corresponding initial vector x_0. In some applications f(t) = Cu(t), where C is an n by k (complex) matrix. Such systems are called descriptor systems [25] since they arise from formulating the system equations in natural physical variables. Implicit or descriptor systems occur for example in control problems (e.g. the basic quadratic cost problem), singular perturbation problems and when modeling impulsive behavior of electrical circuits (see [4] for more details, examples and further references). If B is nonsingular then the system (2.1) can be transformed to an explicit system

x′(t) = Ex(t) + g(t),  x(t_0) = x_0   (2.2)

where E = B^{-1}A and g = B^{-1}f, which is the usual linear state model. However if B is ill-conditioned with respect to inversion this transformation will introduce instabilities into the model, and should therefore be avoided. A characterization of the general solution of (2.1) is provided by the KCF of the pencil A − λB (see [12] and [46]). By premultiplying this system by P^{-1} and introducing new variables v (x = Qv) we can assume that A − λB is in KCF (see equation 1.1). The solution of the equivalent system

P^{-1}BQ v′(t) = P^{-1}AQ v(t) + P^{-1}Cu(t),  v(t_0) = Q^{-1}x_0   (2.3)

then reduces to solving smaller systems determined by the diagonal blocks of the KCF. If A − λB is regular then systems like (2.1) are called solvable because solutions to the descriptor system exist and two solutions which share the same initial value (admissible initial condition, see below) must be identical [31]. So in the following we assume that A − λB is regular, and by collecting all blocks in the KCF of A − λB corresponding to the finite and infinite eigenvalues in J and N, respectively, i.e.

y_1′(t) = J y_1(t) + f_1(t),  N y_2′(t) = y_2(t) + f_2(t)   (2.4)

where y = [y_1^T, y_2^T]^T, it is easy to verify that the solution of (2.4) is given by

y_1(t) = e^{J(t−t_0)} y_1(t_0) + ∫_{t_0}^{t} e^{J(t−s)} f_1(s) ds   (2.5a)

y_2(t) = −Σ_{i=0}^{m−1} N^i f_2^{(i)}(t)   (2.5b)
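A quick numerical sanity check of the A-part formula (2.5b) (our own sketch, assuming NumPy; the nilpotent N and the driving function f_2 are hypothetical choices):

```python
import numpy as np

# Check (2.5b): y2(t) = -sum_{i=0}^{m-1} N^i f2^{(i)}(t) solves
# N y2'(t) = y2(t) + f2(t) when N is nilpotent of degree m.
N = np.diag([1.0, 1.0], k=1)        # 3x3 upper shift, N^3 = 0, so m = 3

# A hypothetical driving function f2(t) = (t^2, t, 1) and its derivatives.
f2  = lambda t: np.array([t**2, t, 1.0])
df2 = [f2,
       lambda t: np.array([2*t, 1.0, 0.0]),
       lambda t: np.array([2.0, 0.0, 0.0])]

def y2(t):
    return -sum(np.linalg.matrix_power(N, i) @ df2[i](t) for i in range(3))

def dy2(t, h=1e-6):                  # numerical derivative of y2
    return (y2(t + h) - y2(t - h)) / (2 * h)

t = 0.7
print(np.allclose(N @ dy2(t), y2(t) + f2(t), atol=1e-6))   # True
```

Since f_2 is a polynomial of degree 2, the solution depends on f_2 and its first m − 1 = 2 derivatives, exactly as the differentiability requirement on the A-part demands.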


Here m is the size of the largest Jordan block corresponding to the infinite eigenvalue (in the KCF) of A − λB, i.e. N is nilpotent of degree m, and m is called the (nilpotency) index of the system. Transformation methods and methods based on the explicit solutions (2.5a-b) do not preserve any sparsity structure of the problem and can of course only efficiently be used for systems that can entirely be handled in the main memory of the computer. Implicit systems are also called differential-algebraic (D/A) systems, and from (2.5a-b) we see that the solution of (2.1) includes one part (2.5a) corresponding to the solution of an explicit system (the D-part) and one part (2.5b) corresponding to sufficiently differentiable (at least m − 1 times) functions f (the A-part). A simple example of an index 3 system with only an A-part is

  [ 0  1  0 ] [ y_1′ ]   [ y_1 ]   [ 0    ]
  [ 0  0  1 ] [ y_2′ ] = [ y_2 ] + [ 0    ]
  [ 0  0  0 ] [ y_3′ ]   [ y_3 ]   [ c(t) ]

which has the solution y_1 = −c″(t), y_2 = −c′(t) and y_3 = −c(t).

Recently the problem of solving D/A systems has attracted many researchers. Sincovec et al [31] consider difference methods for solving large scale descriptor systems (2.1). Normally we do not know the index of the system (2.1) a priori, so one important part of the problem is to find admissible initial conditions satisfying (2.5b). They propose to use the forward Euler method integrating backwards in time. However in order to use the method they have to know an upper bound for the index of the descriptor system. A backtracking procedure is suggested to resolve that problem (see [31] for details). More general implicit systems are studied by Gear and Petzold [13, 14] and Brenan [3], notably linear systems (2.1) with time-varying coefficients in [13]. For a fixed time the solution can be expressed as in (2.5a-b), but normally the local index (index at a fixed time) changes with time and the behavior of the solution is determined by a global nilpotency index. The above mentioned papers all use the backward difference formulas (BDF) and solve a nonlinear system of equations at each time-step. Campbell [5] also considers systems (2.1) with time-varying coefficients and presents a family of explicit Taylor-type methods. It is worthwhile to mention that existing numerical methods for solving linear time-varying D/A-systems can reliably solve only index 1 systems, some index 2 systems and very few higher index systems [14].

2.2 Singular systems of differential equations

When det(A − λB) ≡ 0 or the matrices A and B are rectangular then the pencil A − λB is singular and the corresponding system (2.1) is called a singular system [46]. The KCF of A − λB will then (except for a possible square regular part) contain nonsquare diagonal blocks corresponding to the minimal indices of A − λB (see section 1.2), and as we will see the singular system (2.1) is not solvable in the sense of the previous section.
The nature of the solution corresponding to the variables associated with the regular part will be as before (see section 2.1), so it remains to deal with variables associated with minimal indices. Without loss of generality we do this by considering a left and a right index of degree 2, respectively.

Example 2.1 - right index 2 corresponding to an L_2 block


  [ 1  0  0 ]          [ 0  1  0 ]
  [ 0  1  0 ] x′(t)  = [ 0  0  1 ] x(t) + f(t)

or equivalently,

x_1′ = x_2 + f_1
x_2′ = x_3 + f_2

and we see that x_3 may be chosen to be an arbitrary function and x_1, x_2 are determined by quadratures.

Example 2.2 - left index 2 corresponding to an L_2^T block

  [ 1  0 ]          [ 0  0 ]
  [ 0  1 ] x′(t)  = [ 1  0 ] x(t) + f(t)
  [ 0  0 ]          [ 0  1 ]

or equivalently,

x_1′ = f_1
x_2′ = x_1 + f_2
0 = x_2 + f_3

and we see that the variables are completely determined by the compatibility relation

f_1 = −f_2′ − f_3″ .
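Example 2.2 can be worked numerically (our own sketch; the concrete choices of f_2 and f_3 are hypothetical):

```python
# Example 2.2: for the L2^T system, x2 = -f3, x1 = -f3' - f2, and a solution
# exists only if the compatibility relation f1 = -f2' - f3'' holds.
# Concrete (hypothetical) choices: f2(t) = t, f3(t) = t**2.
f2   = lambda t: t
df2  = lambda t: 1.0
f3   = lambda t: t**2
ddf3 = lambda t: 2.0

f1 = lambda t: -df2(t) - ddf3(t)     # forced by the compatibility relation

x2 = lambda t: -f3(t)                # from 0 = x2 + f3
x1 = lambda t: -2*t - f2(t)          # from x2' = x1 + f2, since x2'(t) = -2t

# Check the remaining equation x1' = f1 by a central difference:
h, t = 1e-6, 0.3
dx1 = (x1(t + h) - x1(t - h)) / (2 * h)
print(abs(dx1 - f1(t)) < 1e-6)       # True
```

With any other choice of f_1 the third equation 0 = x_2 + f_3 would be inconsistent with the first two, illustrating point (ii) below.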

There are analogous results for L_k blocks and L_j^T blocks of any size, including k = 0. In summary, the characteristic parts in the solution of (2.1) introduced by singularity are: (i) the equations corresponding to an L_k block have solutions for all right hand sides f, but these are not completely determined by the driving function f or the initial data, and (ii) for the equations corresponding to an L_j^T block there is a compatibility relation which the components of f must satisfy if a solution is to exist. So a singular system of differential equations may or may not have a solution, and can even have infinitely many solutions, depending on the KCF of the underlying pencil A − λB.

2.3 State-space realizations of linear systems

Frequently it is not possible to measure all the state variables x in the linear system of (2.2), but instead one has to measure linear combinations of the state variables. In this case the linear state model (2.2) is extended to the general state space representation

x′(t) = Ax(t) + Cu(t)   (2.6a)

y(t) = Dx(t)   (2.6b)

where as before x and u represent the state variables and controls (inputs), respectively, A is an n by n matrix and C an n by k matrix. The p component vector y represents the outputs and D is a p by n matrix. Often the model is further extended by adding a feedthrough term Fu(t) to the outputs in (2.6b). System theorists are interested to know if the system (2.6) is controllable and/or observable. The system (2.6a) is said to be completely controllable if we can choose the


inputs u to give the states x any value in a finite time, and algebraically it is equivalent to the controllable subspace

C(A,C) = R([C, AC, A^2 C, …, A^{n−1} C])

having full dimension n. Wonham [48] gives a geometric characterization of C(A,C) as the smallest invariant subspace of A containing the range of C. Observability is the dual concept to controllability, and the system (2.6) is said to be completely observable if we can determine the states x of the system at time t_0 from the knowledge of the outputs y over a finite time interval [t_0, t_1]. This decision can algebraically be expressed in terms of the unobservable subspace

O(A,D) = N([D^T, (DA)^T, …, (DA^{n−1})^T]^T),

namely (2.6) is completely observable if and only if O(A,D) = {0}. The geometric interpretation of O(A,D) is the largest invariant subspace of A included in the nullspace of D [48]. Recently there has been much interest in computing the controllable and observable subspaces (see e.g. Boley [2], Paige [28] and Van Dooren [40], which contains further references to the control literature). C(A,C) and O(A,D) can also be characterized in terms of matrix pencils [40]. Define the pencils

A_1 − λB_1 = [C  A] − λ[0  I]   and   A_2 − λB_2 = [ A ] − λ [ I ]
                                                   [ D ]     [ 0 ]

Then C(A,C) is the minimal left reducing subspace of the pencil A_1 − λB_1 (see section 1.2), and similarly O(A,D) is the maximal right reducing subspace of the pencil A_2 − λB_2. Since B_1 = [0, I] is of full row rank the KCF of A_1 − λB_1 can only have finite eigenvalues and L_k blocks (right minimal indices) and no infinite eigenvalues or L_j^T blocks. A dual result holds for A_2 − λB_2, i.e. since B_2 is of full column rank the KCF of A_2 − λB_2 can only have finite eigenvalues and L_j^T blocks. If A_1 − λB_1 and A_2 − λB_2 have no regular part then the system (2.6) is said to be completely controllable and observable, respectively. Also the eigenvalues of the regular part of A_1 − λB_1 are of interest in designing control systems. They are called the input decoupling zeroes or uncontrollable modes and are the eigenvalues of the system (2.6a) which cannot be controlled by any choice of feedback u = Kx. Note that similar results hold for the discrete-time versions of the continuous-time systems (2.6).

2.4 State-space realizations of generalized linear systems
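Both characterizations of controllability can be tried out on a small example. The sketch below is ours (it assumes NumPy); the 3-state system is a hypothetical illustration, not taken from the paper:

```python
import numpy as np

def controllable_dim(A, C):
    """dim C(A,C) = rank [C, AC, A^2 C, ..., A^(n-1) C]."""
    n = A.shape[0]
    blocks, K = [C], C
    for _ in range(n - 1):
        K = A @ K
        blocks.append(K)
    return np.linalg.matrix_rank(np.hstack(blocks))

# A hypothetical 3-state, 1-input system whose third state is decoupled
# from the input and from the other states.
A = np.array([[0., 1., 0.],
              [0., 0., 0.],
              [0., 0., 2.]])
C = np.array([[0.], [1.], [0.]])

print(controllable_dim(A, C))          # 2 < 3: not completely controllable

# The same conclusion from the pencil A1 - lambda*B1 = [C, A] - lambda*[0, I]:
# the pencil loses row rank exactly at the eigenvalues of its regular part,
# i.e. at the uncontrollable modes.
A1 = np.hstack([C, A])
B1 = np.hstack([np.zeros((3, 1)), np.eye(3)])
print(np.linalg.matrix_rank(A1 - 2.0 * B1))   # 2: lambda = 2 is uncontrollable
print(np.linalg.matrix_rank(A1 - 1.0 * B1))   # 3: full row rank
```

Here λ = 2 is an input decoupling zero: no feedback u = Kx can move the eigenvalue of the decoupled third state.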

By replacing the explicit linear model (2.6a) with a descriptor system we get a generalized state-space realization [45]

Bx′(t) = Ax(t) + Cu(t)   (2.7a)

y(t) = Dx(t) + Fu(t)   (2.7b)

where B is an n by n matrix. The interesting cases are when B is singular or almost


so, since otherwise the generalized system can be transformed to a system of the state-space form (2.6). Further we assume that the pencil A − λB is regular, so for admissible initial conditions the system of equations (2.7) has a unique solution (see section 2.1). The generalized state-space systems (or descriptor systems, singular systems) allow considerably more generality since the transfer function G(s) = D(A − sB)^{-1}C + F from u to y is not necessarily strictly proper (i.e. G(s) → 0 as s → ∞). However in these cases G(s) can be written as the sum of a strictly proper part G_1(s) and a polynomial part G_2(s) [45]. By knowing the KCF of the regular pencil A − λB we can separate the causal part (finite eigenvalues) from the noncausal part (infinite eigenvalues) of the system (2.7):

G_1(s) = D_1(J − sI)^{-1}C_1   (strictly proper)
G_2(s) = D_2(I − sN)^{-1}C_2   (polynomial in s)

The concepts of controllability and observability are generalizable to descriptor systems. These concepts are of particular interest for the natural system frequencies at infinity (for a thorough discussion see e.g. [1, 7, 24]). The infinite modes are separated into dynamic and nondynamic modes, where a nondynamic mode at infinity is a direction of the descriptor vector x in which there is a purely algebraic relationship between x and the input u and output y. All other infinite modes are called dynamic (or impulsive modes) and are directions in which x can exhibit impulsive behavior due to an initial condition with zero input. These modes can directly be identified from the KCF of A − λB; the nondynamic modes are associated with the 1 by 1 Jordan blocks in N and the dynamic modes with Jordan blocks of size at least 2. A straightforward generalization of controllable and unobservable subspaces of state-space systems to generalized state-space systems corresponding to both finite and infinite modes of the system (2.7) is proposed by Van Dooren [40] and he defines

C(B,A,C) = inf{X : Y = AX + BX, dim(Y) = dim(X), R(C) ⊆ Y}
O(B,A,D) = sup{X : Y = AX + BX, dim(Y) = dim(X), X ⊆ N(D)}

For B = I we see that these definitions coincide with the controllable subspace C(A,C) and the unobservable subspace O(A,D), respectively, of the state space system (2.6). In words the above geometric characterization says that C(B,A,C) is the smallest deflating subspace of A − λB containing the range of C, and O(B,A,D) is the largest deflating subspace of A − λB included in the nullspace of D. These subspaces can also be expressed in terms of matrix pencils. Define the pencils

A_1 − λB_1 = [C  A] − λ[0  B]   and   A_2 − λB_2 = [ A ] − λ [ B ]
                                                   [ D ]     [ 0 ]

then C(B,A,C) is the minimal left reducing subspace of the pencil A_1 − λB_1, and similarly O(B,A,D) is the maximal right reducing subspace of the pencil A_2 − λB_2. The reducing subspaces corresponding to controllable and unobservable subspaces of state-space or descriptor systems can be computed by applying the algorithms in section 3 to the pencils A_1 − λB_1 and A_2 − λB_2, defined in sections 2.3 and 2.4, respectively. It is also possible to take advantage of the special structure of B_1 and B_2, respectively, when computing these subspaces [40]. Our perturbation theory for singular pencils (presented later) can also be applied to singular A_1 − λB_1 and A_2 − λB_2, respectively, giving perturbation bounds for controllable and unobservable subspaces (see section 4.5).


3. Algorithms for computing the Kronecker structure and pairs of reducing subspaces

3.1 A singular pencil in canonical form - a geometric interpretation

In the study of the Kronecker structure of a singular pencil A − λB the column and row nullities of A and B and their possible common nullspaces play an important role. For example, parts of the singularities in A and B may correspond to zero and infinite eigenvalues of A − λB, and if A and B have a common nullspace there exist minimal right (column) indices in the KCF of A − λB. We illustrate the case of a common column nullspace by considering an m = 3 by n = 6 singular pencil, already in canonical form [19]:

           [ 0  0  −λ   0   1   0 ]
A − λB  =  [ 0  0   0  −λ   0   1 ]  = diag(J_2(0), L_0, L_0, L_1)   (3.1)
           [ 0  0   0   0   0  −λ ]

Its KCF consists of one 2 by 2 Jordan block with eigenvalue 0 (J_2(0)), two right indices of degree 0 (L_0) and one right index of degree 1 (L_1). Notice that the 0 by 1 blocks L_0 only contribute to the column dimension of A − λB. Studying A and B separately and permuting A and B so that all zero-columns of A are the leading ones we obtain:

       [ 0  0  0  0  1  0 ]
A′  =  [ 0  0  0  0  0  1 ]   (3.2a)
       [ 0  0  0  0  0  0 ]

       [ 0  0  1  0  0  0 ]
B′  =  [ 0  0  0  1  0  0 ]   (3.2b)
       [ 0  0  0  0  0  1 ]

The permutations made can be summarized as follows: the L_1 block is retrieved from columns 3, 5 and row 1, and the J_2(0) from columns 4, 6 and rows 2, 3. Finally the two L_0 form columns 1 and 2. Let n_1 denote the column nullity of A and r_1 be the dimension of the column nullspace of A which is not common to B. Directly from (3.2a-b) we see that n_1 is 4 and that A and B have a common column nullspace of dimension 2 (n_1 − r_1), which is the number of L_0 blocks in the KCF. We also see that r_1 = 2 is equal to the number of structure blocks J_k(0) or L_k of order k greater than or equal to one. More information about these blocks can be retrieved by first deflating A − λB at row r_1 and column n_1 and then analyzing the remaining pencil [0 0] − λ[0 1] in a similar way. We see that the column nullspace of the new A has dimension n_2 = 2 and the common column nullspace of A and B has dimension 1 = n_2 − r_2 (r_2 = 1), which now gives the number of L_1 blocks. Generally we have n_1(=4) ≥ r_1(=2) ≥ n_2(=2) ≥ r_2(=1), and r_1 − n_2 is the number of J_1(0) blocks in the KCF, here none. After deflating [0 0] − λ[0 1] at row r_2 and column n_2 we are left with the "empty" pencil, and therefore define n_3 = 0. Now r_2 − n_3 = 1 gives the number of J_2(0) blocks in the KCF, and we are finished.

The above geometric interpretation of the Kronecker structure holds for arbitrary pencils A − λB. Let A^{(i−1)} − λB^{(i−1)} denote the deflated pencil from step i−1 (i = 1, 2, … and A^{(0)} = A, B^{(0)} = B) which will be studied next, and let

n_i = dim N(A^{(i−1)})  and  n_i − r_i = dim (N(A^{(i−1)}) ∩ N(B^{(i−1)})) .   (3.3)
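The nullities (3.3) can be computed by repeated deflation. The sketch below is ours and uses ordinary SVDs rather than the GSVD of section 3.2; applied to the 3 by 6 example (3.1), it reproduces the indices found above:

```python
import numpy as np

def deflation_indices(A, B, tol=1e-10):
    """Indices n_i, r_i of (3.3) by repeated deflation (plain SVDs, not the GSVD)."""
    ns, rs = [], []
    while A.shape[1] > 0:
        s, Vh = np.linalg.svd(A)[1:]
        rank_a = int((s > tol).sum())
        ni = A.shape[1] - rank_a
        if ni == 0:
            break
        null, comp = Vh[rank_a:].conj().T, Vh[:rank_a].conj().T
        Um, sm, _ = np.linalg.svd(B @ null)     # rows hit by B on N(A) come first
        ri = int((sm > tol).sum())
        ns.append(ni); rs.append(ri)
        X = np.hstack([null, comp])             # N(A) columns first
        A, B = Um.conj().T @ A @ X, Um.conj().T @ B @ X
        A, B = A[ri:, ni:], B[ri:, ni:]         # deflate at row r_i, column n_i
        if ri == 0:
            break
    return ns, rs

# The 3 by 6 example (3.1):
A = np.zeros((3, 6)); B = np.zeros((3, 6))
A[0, 4] = 1; A[1, 5] = 1
B[0, 2] = 1; B[1, 3] = 1; B[2, 5] = 1

ns, rs = deflation_indices(A, B)
print(ns, rs)    # [4, 2] [2, 1]
# n1-r1 = 2 L0 blocks, r1-n2 = 0 J1(0), n2-r2 = 1 L1, r2-n3 = 1 J2(0)
```

This is only a sketch of the geometric idea; the algorithms of section 3.2 organize the same deflation around the GSVD so that the transformations can be accumulated stably.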

In general these column nullities expose the Jordan structure of the zero eigenvalue and the right indices of a singular A − λB in the following way.

Theorem 3.1: (Geometric interpretation of the structure blocks). The number of L_{i−1} blocks equals n_i − r_i and the number of J_i(0) blocks equals r_i − n_{i+1}.

For a proof see [19], [39] or [46]. We can also extract the left (row) indices similarly by considering row nullities of A and B, or equivalently column nullities of A^T and B^T. If we interchange the roles of A and B (i.e. B − μA is considered), instead of the Jordan structure of the zero eigenvalues we now obtain the Jordan structure of the infinite eigenvalues of A − λB. So it is possible with entirely geometric arguments to extract the Jordan structures of the zero and infinite eigenvalues and the right and left minimal indices. In the next section we make use of this geometric interpretation of the KCF to formulate algorithms for computing the indices n_i and r_i and transforming (A,B) to a form equivalent to (3.2a-b) via a sequence of equivalence transformations of A − λB.

3.2 The RGSVD and RGQZD algorithms

One way of doing a column range/nullspace separation of a matrix pair (A,B) (i.e. to compute n_1 and r_1) is via the generalized singular value decomposition (GSVD), and we start by formulating the GSVD of the m by n matrices A and B where m ≥ n [19]. The case m < n can be treated analogously.
Theorem 3.2: (GSVD [19]). Let A and B be m by n complex matrices, m ≥ n. Then there exist unitary matrices U, V ∈ C^{m x m} and a nonsingular matrix X ∈ C^{n x n} such that

  [ U*  0  ] [ A ]       [ D_A ]
  [ 0   V* ] [ B ] X  =  [ D_B ]   (3.4)

where D_A, D_B ∈ C^{m x n} are given by

  D_A = [ [C; 0]   X_A ]   (3.5a)

  D_B = [ [S; 0]   X_B ]   (3.5b)

with

  C = diag(c_1, c_2, …, c_r),  0 ≤ c_1 ≤ c_2 ≤ … ≤ c_r ≤ 1   (3.6a)

  S = diag(s_1, s_2, …, s_r),  1 ≥ s_1 ≥ s_2 ≥ … ≥ s_r ≥ 0   (3.6b)

satisfying C^2 + S^2 = I. Here X_A and X_B are m by n − r. Since we only need V and X (or U and X) in an equivalence transformation of A − λB, we do not come into the numerical difficulties one encounters when trying to compute the complete GSVD. (See [30, 34, 35, 42, 43]. In [36, 37] Sun presents a perturbation analysis of the GSVD problem, and an improved version of one of his results using straightforward matrix ideas is presented in [29].) By knowing the GSVD (3.4) of A and B we can write

  A = U [ [C; 0]   X_A ] X^{-1}   (3.7a)

and

295

Stably Coinputirig the Kronecker Structure

B = V [ 0  S ] X^{-1}        (3.7b)
      [ 0  0 ]

Let n_1 and r_1 be defined as in equation (3.3). Then the first r_1 c_i values must be zero, and therefore the corresponding s_i values must be ones (C² + S² = I). By multiplying A and B in (3.7a-b) with X from the right we see that the columns x_i of X span the column nullspace of A and the common column nullspace of A and B in the following way:

N(A) = span(x_1, x_2, ..., x_{n_1})
N(A) ∩ N(B) = span(x_1, x_2, ..., x_{n_1−r_1})

Now we can make use of V and X from the GSVD of (A,B) in an equivalence transformation (3.8), whose right-hand side has a leading block column of n_1 − r_1 zero columns followed by blocks M_1, A_1 and B_1, where M_1 and A_1 are full matrices and B_1 is diagonal. In fact the transformed B part of (3.8) is D_B in the GSVD (see eq. (3.5b)). Further, the right-hand side of equation (3.8) is in a form similar to A' and B' in equations (3.2a-b). This is the first step in a reduction theorem that, apart from the Jordan structure of the zero eigenvalue, computes the right minimal indices, similar to what was illustrated in the previous section. In the next step we do a GSVD of (A_1, B_1) and decompose A_1 − λB_1 similarly with an equivalence transformation. The deflation process continues until n_s = 0 (dim N(A) = 0) or n_s ≠ 0 but r_s = 0 (dim(N(A) ∩ N(B)) = 0). At the same time we obtain an associated pair of reducing subspaces. We summarize the reduction in the following theorem [20].

Theorem 3.3 (Reduction based on GSVD): Let A and B be m by n complex matrices. Then there exists a unitary matrix V ∈ C^{m×m} and a nonsingular matrix X ∈ C^{n×n} such that

V*(A − λB)X = [ A_0r − λB_0r   *          ]        (3.9a)
              [ 0              A_s − λB_s ]

where A_0r is block upper triangular with zero diagonal blocks and full superdiagonal blocks M_k, and B_0r is block upper triangular with zero diagonal blocks and identity superdiagonal blocks I_{r_k}. Here I_{r_k} denotes the r_k by r_k identity matrix. The indices n_i and r_i satisfy

n_1 ≥ r_1 ≥ n_2 ≥ r_2 ≥ ... ≥ n_s ≥ r_s = n_{s+1} = 0

and provide part of the KCF as follows: n_k − r_k is the number of L_{k−1} blocks and r_k − n_{k+1} is the number of J_k(0) blocks. The remainder of the KCF is contained in A_s − λB_s. We can now apply Theorem 3.3 to A − λB and B* − μA*, respectively, as discussed in section 3.1, and we get the reiterating generalized singular value deflation (RGSVD) algorithm for computing the Kronecker structure of a singular matrix pencil [20].

It is also possible to do a column (or row) range/nullspace separation of (A,B) in terms of a unitary equivalence transformation that displays n_1 and n_1 − r_1 similarly to the GSVD. In [19] a generalized QZ decomposition (GQZD) is presented.

Theorem 3.4 (GQZD): Let A, B ∈ C^{m×n} where m ≥ n and r = rank([A*, B*]*) ≤ n. Then there exist unitary matrices U, V ∈ C^{m×m} and a unitary matrix Q ∈ C^{n×n} such that

[ U*  0  ] [ A ]       [ T_A ]
[ 0   V* ] [ B ] Q  =  [ T_B ]        (3.10)

where T_A, T_B ∈ C^{m×n} are given by

T_A = [ 0  R_C ]        (3.11a)
T_B = [ 0  R_S ]        (3.11b)

R_C and R_S are upper triangular and satisfy

‖R_C* R_C + R_S* R_S‖_E = ‖Σ_1²‖_E        (3.12)

where Σ_1 is the diagonal matrix of singular values of [A*, B*]*.

By comparing the GSVD and GQZD we see that the unitary Q in (3.10) and the norm identity (3.12) take the place of X and the cosine-sine identity (C² + S² = I), respectively, in Theorem 3.2. If the first r_1 c_i's in the GSVD of (A,B) are zeros, then the first r_1 columns of R_C will be zero columns [19], so from (3.10) we see that the n_1 − r_1 first columns of Q span N(A) ∩ N(B) and the n_1 first columns of Q span N(A). If we make use of V and Q in an equivalence transformation similar to (3.8) we get a block form (3.13) where R_11 is r_1 by r_1, upper triangular and of full rank, B_1 has nonzero elements only on and above the diagonal starting at the top left corner, and M_1, A_1 and N_1 are full matrices. In the next step we continue with a GQZD of (A_1, B_1) and decompose A_1 − λB_1 similarly, and so on. In summary, if we replace GSVD by GQZD in the RGSVD algorithm we get the RGQZD algorithm for computing the left and right minimal structures and the Jordan structure of the zero and infinite eigenvalues, respectively (for details see [20]).

As we will see in section 4, the problem of computing the KCF of a nongeneric pencil is generally ill-conditioned. For example, almost all (arbitrarily small) perturbations of a nongeneric pencil will turn it into a generic one. However, if we regularize the problem by restricting the perturbations in A and B such that the unperturbed and perturbed pencils have minimal and maximal pairs of reducing subspaces of the same dimensions, respectively, we can turn the problem into a well-conditioned one. (For more details see sections 4 and 5.) So our aim should be to minimize the influence of rounding errors in the computations. In the following section we therefore use only unitary equivalence transformations in the reduction process to compute the Kronecker structure.

3.3 GUPTRI — Transformation to a generalized upper triangular form

We will now reduce a singular pencil A−λB to a generalized upper triangular (GUPTRI) form by a unitary equivalence transformation such that

P*(A − λB)Q =

[ A_r − λB_r   *            *            *            *          ]
[ 0            A_0 − λB_0   *            *            *          ]
[ 0            0            A_f − λB_f   *            *          ]        (3.14)
[ 0            0            0            A_i − λB_i   *          ]
[ 0            0            0            0            A_l − λB_l ]

where A_r − λB_r has only the right minimal indices (the L_k blocks) of the KCF of A−λB in its KCF, and A_l − λB_l has only the left minimal indices (the L_k^T blocks) of A−λB. A_reg − λB_reg is the regular part of the KCF of A−λB and consists of the following three subparts: A_0 − λB_0 has the Jordan structure of the zero eigenvalues of A−λB; A_f − λB_f has the nonzero finite eigenstructure of the regular part of A−λB, in any desired order (det(A_f) ≠ 0 and det(B_f) ≠ 0); A_i − λB_i has the Jordan structure of the infinite eigenvalues of A−λB. The diagonal blocks A_r − λB_r and A_l − λB_l are block upper triangular, and the three diagonal blocks corresponding to the regular part of the pencil, A_reg − λB_reg, are all

upper triangular (in real arithmetic A_f − λB_f may have two by two diagonal blocks corresponding to pairs of complex conjugate eigenvalues). The stars (*) in (3.14) denote full matrix pencils of appropriate sizes. Further reduction of GUPTRI to a block diagonal form will rely on nonunitary equivalence transformations (see section 4). Note that A−λB must be nongeneric in order to have both right and left indices, or indices of either kind and a regular part, in its GUPTRI form. Since a square generic pencil is regular, it cannot have either of the blocks A_r − λB_r and A_l − λB_l in its GUPTRI form. If A−λB is a generic rectangular pencil with m < n it reduces to A_r − λB_r, and with m > n it reduces to A_l − λB_l. (See also sections 1.2 and 4.2.) Since the user controls the ordering of the eigenvalues in A_f − λB_f, it is possible directly from the GUPTRI decomposition to produce a pair of reducing subspaces associated with any part of the spectrum σ(A−λB) of the regular part of A−λB. By partitioning P and Q as

Q = [Q_r, Q_reg, Q_l]  and  P = [P_r, P_reg, P_l]

we see that the minimal and maximal pairs of reducing subspaces, respectively (see section 1.2), are

P_min = span(P_r)  and  Q_min = span(Q_r)
P_max = span([P_r, P_reg])  and  Q_max = span([Q_r, Q_reg])

A further partition of Q_reg = [Q_0, Q_f, Q_i] and P_reg = [P_0, P_f, P_i] gives us, for example, that

P = span([P_r, P_0, P_f])  and  Q = span([Q_r, Q_0, Q_f])

is a pair of reducing subspaces corresponding to the finite eigenvalues of A−λB. If, for example, we want a pair of reducing subspaces associated with the right minimal indices and all eigenvalues in the left complex half plane Re λ < 0 (stable modes in connection with linear systems, see sections 2.3-4), we ask for an ordering of the eigenvalues of A_f − λB_f such that these eigenvalues appear in the upper leftmost (north-west) corner of A_f − λB_f. Then we just have to partition Q_f and P_f accordingly in order to produce a pair of reducing subspaces associated with the stable modes. However, if for any reason we want to mix the zero and/or the infinite eigenvalues together with the nonzero eigenvalues, the computed Jordan block structures of A_0 − λB_0 and A_i − λB_i will be destroyed. Such a reordering can optionally be imposed directly when reducing A−λB to GUPTRI form, or by postprocessing.

The GUPTRI decomposition (3.14) is based on the RGQZD algorithm (see section 3.2). It utilizes two variants of a GQZD version of the reduction theorem 3.3. The first one, RZSTR, is a new, more efficient implementation of this reduction theorem; it computes the right minimal indices (structure) and the Jordan structure of the zero eigenvalues of A−λB. One step of deflation in RZSTR is computed in the following way. Let A^(i) − λB^(i) denote the pencil we are to decompose in step i via the unitary transformations P_i and Q_i.

We will use the following notation in the algorithm below:

<U, Σ, V> := svd(A) computes the singular value decomposition A = UΣV* of A, with the singular values on the diagonal of Σ = diag(σ_1, ..., σ_k) in decreasing order. If one of U, Σ or V is "nil", it need not be computed.
n := column_nullity(Σ, epsu) is the number of singular values of Σ which are less than or equal to epsu.
<Q, R> := qr(A) computes the QR decomposition A = QR.

Algorithm RZSTR-kernel:
Step 1. Compress the columns of A^(i) to give n_i:
  1.1 <nil, Σ_A^(i), V_A^(i)> := svd(A^(i)); n_i := column_nullity(Σ_A^(i), epsu)
  1.2 Apply V_A^(i) to A^(i) and B^(i): [0 A_2] := A^(i) V_A^(i); [B_1 B_2] := B^(i) V_A^(i) (B_1 and the zero block to the left of A_2 each consist of n_i columns)
Step 2. Compress the columns of B_1 to give n_i − r_i:
  2.1 <nil, Σ_B^(i), V_B^(i)> := svd(B_1); n_i − r_i := column_nullity(Σ_B^(i), epsu)
  2.2 Apply V_B^(i) to B_1 and build Q_i, the final right transformation in step i: [0 B_0] := B_1 V_B^(i) (B_0 has r_i columns); Q_i := V_A^(i) · diag(V_B^(i), I)
Step 3. Triangularize [B_0, B_2] to give P_i, the final left transformation in step i:
  3.1 <P_i, R_i> := qr([B_0, B_2]); partition R_i = [R_11 R_12; 0 B^(i+1)], where R_11 is r_i by r_i
  3.2 Apply P_i to A_2; the trailing rows of P_i* A_2 define A^(i+1)
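A simplified, hypothetical numpy rendering of this deflation loop is sketched below. It keeps only the indices n_i and r_i, discarding the transformations P_i and Q_i, and uses a fixed tolerance in place of epsu; the test pencil is built to have KCF L1 ⊕ J2(0), so by Theorem 3.3 the indices should come out n = (2, 2), r = (2, 1):

```python
import numpy as np

def nullspace_first_svd(M):
    """Return singular values of M and a unitary V whose leading
    columns span the (numerical) column nullspace of M."""
    m, n = M.shape
    if m == 0 or n == 0:
        return np.zeros(0), np.eye(n)
    _, s, Vh = np.linalg.svd(M)
    return s, Vh.conj().T[:, ::-1]               # ascending: null vectors first

def structure_indices(A, B, tol=1e-10):
    """Indices n_k, r_k of Theorem 3.3 via repeated column
    compressions (one RZSTR-style deflation per loop iteration)."""
    ns, rs = [], []
    while A.shape[1] > 0:
        n = A.shape[1]
        sA, V = nullspace_first_svd(A)
        ni = n - int(np.count_nonzero(sA > tol))  # dim N(A)
        if ni == 0:
            break
        AV, BV = A @ V, B @ V                     # AV[:, :ni] is ~0
        B1, B2 = BV[:, :ni], BV[:, ni:]
        sB, W = nullspace_first_svd(B1)
        ri = int(np.count_nonzero(sB > tol))      # ni - ri = dim N(A) ∩ N(B)
        ns.append(ni); rs.append(ri)
        if ri == 0:
            break
        B0 = (B1 @ W)[:, ni - ri:]                # full column rank, ri columns
        P, R = np.linalg.qr(np.hstack([B0, B2]), mode='complete')
        A = (P.conj().T @ AV[:, ni:])[ri:, :]     # deflated pencil
        B = R[ri:, ri:]
    return ns, rs

# Test pencil with KCF = L1 (+) J2(0)
A = np.array([[0., 1., 0., 0.],
              [0., 0., 0., 1.],
              [0., 0., 0., 0.]])
B = np.array([[1., 0., 0., 0.],
              [0., 0., 1., 0.],
              [0., 0., 0., 1.]])
ns, rs = structure_indices(A, B)
print(ns, rs)   # [2, 2] [2, 1]: one L1 block (n2-r2=1), one J2(0) block (r2-n3=1)
```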

Next we apply the RZSTR-kernel to A^(i+1) − λB^(i+1), and so on until n_{i+1} = 0, or n_{i+1} ≠ 0 but r_{i+1} = 0, as in Theorem 3.3. In steps 1.1 and 2.1 above we use epsu (a bound on the uncertainty in A and B that is given by the user) as a tolerance for deleting small singular values when computing nullities. All singular values smaller than epsu are deleted, but we always insist on a gap between the singular values we interpret as zeros and nonzeros, respectively. In section 5 we further discuss the influence of finite arithmetic on computed results.

The second decomposition, LISTR, computes the left minimal structure and the Jordan structure of the infinite eigenvalues similarly to RZSTR, by working on B − μA. The main difference is that column compressions are replaced by row compressions, and we start working from the lower rightmost (south-east) corner of the pencil. As we said earlier, it is also possible to apply RZSTR to B* − μA* to get the left minimal and infinity structures (see section 3.1). However, RZSTR working on B* − μA* would produce a block lower triangular form of A−λB, so in order to keep GUPTRI block upper triangular we make use of LISTR. In summary, the GUPTRI decomposition is computed by the following 6 reductions:

1: Apply RZSTR to A−λB. Gives the right minimal structure and the Jordan structure of the zero eigenvalue in A_0r − λB_0r:

P_1*(A − λB)Q_1 = [ A_0r − λB_0r   *          ]
                  [ 0              A_1 − λB_1 ]

The block structure of A_0r − λB_0r is exposed by the structure indices n_i and r_i (column nullities, see Theorems 3.1 and 3.3).

2: Separate the right and zero structures by applying RZSTR to B_0r − μA_0r. Gives the right minimal structure in A_r − λB_r:

P_2*(A_0r − λB_0r)Q_2 = [ A_r − λB_r   *          ]
                        [ 0            A_2 − λB_2 ]

Insist on the same right minimal indices as in reduction 1.

3: Apply RZSTR to A_2 − λB_2, which contains the zero eigenvalues of A−λB, giving

P_3*(A_2 − λB_2)Q_3 = A_0 − λB_0

Note that reduction 2 destroys the block structure of the zero eigenvalues, but this reduction gives it back by insisting on the same Jordan structure as in reduction 1. Reductions 1 to 3 in summary:

diag(P_3*, I) diag(P_2*, I) P_1* (A − λB) Q_1 diag(Q_2, I) diag(Q_3, I) =

[ A_r − λB_r   *            *          ]
[ 0            A_0 − λB_0   *          ]
[ 0            0            A_1 − λB_1 ]

where diag(·, I) denotes a block diagonal matrix with an identity block of appropriate size.

The pencil A_1 − λB_1 comes from reduction 1 and contains the possible left minimal structure and the remaining regular part of A−λB.

4: Apply LISTR to A_1 − λB_1. Since A_1 − λB_1 cannot have any zero eigenvalues, the reduction produces the block structure associated with the left indices in A_l − λB_l:

P_4*(A_1 − λB_1)Q_4 = [ A_2 − λB_2   *          ]
                      [ 0            A_l − λB_l ]

5: Apply LISTR to B_2 − μA_2. Gives the Jordan structure of the infinite eigenvalues in A_i − λB_i:

P_5*(A_2 − λB_2)Q_5 = [ A_3 − λB_3   *          ]
                      [ 0            A_i − λB_i ]

6: Apply the QZ algorithm [27] to A_3 − λB_3, and a reordering process to the resulting upper triangular pencil, to give the desired pencil A_f − λB_f:

P_6*(A_3 − λB_3)Q_6 = A_f − λB_f

Let α_f and β_f be the diagonal elements of A_f and B_f, respectively. Then the finite eigenvalues of A−λB are given by α_f/β_f. Reductions 4 to 6 in summary:

diag(P_6*, I) diag(P_5*, I) P_4* (A_1 − λB_1) Q_4 diag(Q_5, I) diag(Q_6, I) =

[ A_f − λB_f   *            *          ]
[ 0            A_i − λB_i   *          ]
[ 0            0            A_l − λB_l ]

Note that the transformation matrices P_i and Q_i in the 6 reductions above are all unitary, and the composite transformations P and Q in GUPTRI are given by

P = P_1 diag(P_2, I) diag(I, P_3, I) diag(I, P_4) diag(I, P_5, I) diag(I, P_6, I)        (3.15a)
Q = Q_1 diag(Q_2, I) diag(I, Q_3, I) diag(I, Q_4) diag(I, Q_5, I) diag(I, Q_6, I)        (3.15b)

where the identity matrices in (3.15a-b) are of appropriate sizes.

3.4 Some history and other approaches

During the last years we have seen an intensive study of computational aspects of spectral characteristics of general matrix pencils A−λB. Already in 1977 Vera Kublanovskaya presented her first paper (in Russian) on the AB-algorithm for handling spectral problems of linear matrix pencils; for more recent publications see [16, 17]. The AB-algorithm computes two sequences of matrices {A_k} and {B_k} satisfying

A_k B_{k+1} = B_k A_{k+1},  k = 0, 1, ... ;  A_0 = A, B_0 = B

where A_{k+1} and B_{k+1} are blocks (one of them upper triangular) of the nullspace of the augmented matrix C_k = [A_k  B_k] in the following way:

N(C_k) = R([ B_{k+1} ; −A_{k+1} ])

By applying the AB-algorithm to a singular pencil A−λB we get the right minimal indices and the Jordan structure of the zero eigenvalues. Different ways to compute N(C_k) give rise to different algorithms. Kublanovskaya presents the AB-algorithm in terms of the QR (or LR) decomposition. In [18] a modified AB-SVD algorithm is suggested that more stably computes the range/nullspace separations in terms of the singular value decomposition.

In 1979 both Van Dooren and Wilkinson presented papers on computing the KCF. The paper [46] has already been mentioned in sections 2.1-2; there Wilkinson derives the KCF in algorithmic terms, starting from a singular system of differential equations. In [39] Van Dooren presents an algorithm for computing the Kronecker structure which is a straightforward generalization of Kublanovskaya's algorithm for determining the Jordan structure of A−λI, as further developed by Golub and Wilkinson [15] and Kågström and Ruhe [21, 22]. His reduction is obtained under unitary equivalence transformations and is similar in form to the one obtained from the RGQZD algorithm. The main difference is that in RZSTR and LISTR we compute n_i and n_i − r_i (dim N(A^(i−1)) and dim(N(A^(i−1)) ∩ N(B^(i−1))), respectively) from two column compressions, while Van Dooren computes n_i and r_i from one column compression and one row compression. Thus Van Dooren never computes a basis for the common nullspace of A^(i−1) and B^(i−1). The operation counts for one deflation step in Van Dooren's algorithm and in RZSTR are of the same order. For example, if m = n the first deflation step costs 2n³ + O(n²) operations for Van Dooren and n³ + O(n²) for RZSTR. Note that both algorithms use singular value decompositions for the critical rank/nullity decisions. If one considers less stable and robust procedures for determining a range/nullspace separation (like the QR or LR decompositions), it is possible to formulate faster variants of our algorithms. However, this is not recommended, except possibly in cases where we know for physical or modeling reasons that the underlying pencil must have a certain Kronecker structure.


4. Perturbation Theory for Pencils

4.1 Review of the Regular Case

The perturbation theory of regular pencils shares many of the features of the theory for the standard eigenproblem A−λI. In particular, for sufficiently small smooth perturbations, the eigenvalues and eigenvectors (or deflating subspaces) of a regular pencil also perturb smoothly. One can prove generalizations of both the Bauer-Fike ([6, 10, 11, 36]) and Gerschgorin [33] theorems for regular pencils. In order to deal with infinite eigenvalues, one generally uses the chordal metric

χ(λ, λ') = |λ − λ'| / ((1 + |λ|²)^{1/2} (1 + |λ'|²)^{1/2})

to measure the distance between eigenvalues, since this extends smoothly to the case λ = ∞.
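In floating point the chordal metric is easy to evaluate if ∞ is treated as a special case, using the limit χ(λ, ∞) = 1/(1 + |λ|²)^{1/2}; a small sketch:

```python
import numpy as np

def chord(x, y):
    """Chordal distance between two (possibly infinite) eigenvalues."""
    if np.isinf(x) and np.isinf(y):
        return 0.0
    if np.isinf(x):
        x, y = y, x                      # put the finite one first
    if np.isinf(y):
        return 1.0 / np.hypot(1.0, abs(x))
    return abs(x - y) / (np.hypot(1.0, abs(x)) * np.hypot(1.0, abs(y)))

print(chord(0.0, np.inf))                # 1.0
print(chord(1e8, np.inf))                # ~1e-8: a huge eigenvalue is chordally
                                         # close to an infinite one
```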

In our context we are interested in bounds on the perturbations of eigenvalues and deflating subspaces when the size of the perturbation ε is a (not necessarily small) parameter, possibly supplied by the user as an estimate of measurement or other error. Thus, we would like to compute a decomposition of the form A−λB = P(S−λT)Q^{-1}, with S−λT block diagonal as in (1.1), which supplies as much information as possible about all matrices in a ball P(ε) = { (A+E) − λ(B+F) : ‖(E,F)‖_E ≤ ε } around A−λB.
To illustrate and motivate our approach to regular pencils, we indicate how we would decompose P(ε) for various values of ε, where A−λB is given by

A − λB = [ 1/η  0  0 ]     [ 1  0  0 ]
         [ 0    0  0 ] − λ [ 0  1  0 ]
         [ 0    0  η ]     [ 0  0  1 ]

where η is a small number. It is easy to see that σ, the spectrum of A−λB, is σ = {1/η, 0, η}. The three deflating subspaces corresponding to these eigenvalues are spanned by the three columns of the 3 by 3 identity matrix. For ε sufficiently small, the spectrum of any pencil in P(ε) will contain 3 points, one each inside disjoint sets centered at 1/η, 0 and η. In fact, we can draw 3 disjoint closed curves surrounding 3 disjoint regions, one around each λ ∈ σ, such that each pencil in P(ε) has exactly one eigenvalue in the region surrounded by each closed curve. Similarly, the three deflating subspaces corresponding to each eigenvalue remain close to orthogonal. Thus, for ε sufficiently small, we partition σ into three sets, σ_1 = {1/η}, σ_2 = {0} and σ_3 = {η}.
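This can be checked numerically. The sketch below uses the reconstruction above (B = I) with the illustrative value η = 10⁻², and also exhibits a pencil at Frobenius distance ≈ η/√2 from A−λB whose eigenvalues 0 and η have merged into a double eigenvalue at η/2 (the perturbed matrix is one standard construction consistent with the discussion that follows, not a quotation from the text):

```python
import numpy as np

eta = 1e-2                               # illustrative small value
A = np.diag([1 / eta, 0.0, eta])         # with B = I: sigma = {1/eta, 0, eta}
w, V = np.linalg.eig(A)                  # eigenvectors: columns of the identity

# A nearby pencil (still B = I) with a defective double eigenvalue at eta/2:
xi = 1e-8                                # arbitrarily small and nonzero
A2 = np.array([[1 / eta, 0.0, 0.0],
               [0.0, eta / 2, xi],
               [0.0, 0.0, eta / 2]])
print(np.linalg.norm(A2 - A))            # ~ eta/sqrt(2)
print(np.sort(np.linalg.eigvals(A2)))    # double eigenvalue eta/2, plus 1/eta
```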

As ε increases to η/√2, it becomes impossible to draw three such curves, because there is a pencil almost within distance η/√2 of A−λB with a double eigenvalue at η/2:

[ 1/η  0    0   ]
[ 0    η/2  ξ   ] − λI
[ 0    0    η/2 ]

where ξ is an arbitrarily small nonzero quantity. Furthermore, there are no longer three independent deflating subspaces, because the ξ causes the two deflating subspaces originally belonging to 0 and η to merge into a single two-dimensional deflating subspace. We can, however, draw two disjoint closed curves, one around


1/η and the other around 0 and η, such that every pencil in P(ε) has one eigenvalue inside the curve around 1/η and two eigenvalues inside the other curve. In this case we partition σ = {1/η} ∪ {0, η} = σ_1 ∪ σ_2.

As ε increases to 1, it is no longer possible to draw two disjoint closed curves, but merely one around all three eigenvalues, since it is possible to find a pencil inside P(ε) with η and 1/η having merged into a single eigenvalue near 1, as well as another pencil inside P(ε) where 0 and η have merged into a single eigenvalue near η/2. In this case we cannot partition σ into any smaller sets.

This example motivates the definition of a stable decomposition of a regular pencil: the decomposition in (1.1) is stable if the entries of P, Q, S and T are continuous and bounded functions of the entries of A and B as A−λB varies inside P(ε). In particular, we insist that the dimensions n_i of the S_ii − λT_ii remain constant for A−λB in P(ε). This corresponds to partitioning σ = ∪_{i=1}^{b} σ_i into disjoint pieces which remain disjoint for A−λB in P(ε). We illustrated this disjointness in the example by surrounding each σ_i by its own disjoint closed curve. For numerical reasons we will also insist that the matrices P and Q in (1.1) have their condition numbers bounded by some (user specified) threshold TOL for all pencils in P(ε). This is equivalent to insisting that the deflating subspaces belonging to different σ_i not contain vectors pointing in nearly parallel directions. In the above example, as ε grows to η/√2, there are pencils in P(ε) where the deflating subspaces belonging to η and 0 become nearly parallel:

[ 1/η  0          0   ]
[ 0    (1+ξ)η/2   η/2 ] − λI
[ 0    0          η/2 ]

The two right deflating subspaces in question are spanned by [0, 1, 0]^T and [0, −1/ξ, 1]^T, respectively, which become nearly parallel as ξ approaches 0. The numerical reason for constraining the condition numbers of P and Q is that they indicate approximately how much accuracy we expect to lose in computing the decomposition (1.1) [8]. Therefore the user might wish to specify a maximum condition number TOL he is willing to tolerate in a stable decomposition, as well as specifying the uncertainty ε in his data. With this introduction, we can state our perturbation theorem for regular pencils; it is a criterion for deciding whether a decomposition (1.1) is stable or not. Theorem 4.1: Let A−λB,

ε and TOL be given. Let σ = ∪_{i=1}^{b} σ_i be some partitioning of σ into disjoint sets. Define χ_i for 1 ≤ i ≤ b as

χ_i = ε · max(p_i, q_i) / min(Dif_u(σ_i, σ−σ_i), Dif_l(σ_i, σ−σ_i))        (4.1)

where p_i, q_i, Dif_u and Dif_l will be explained below. The corresponding decomposition (1.1) is stable if the following two criteria are satisfied:

max_{1≤i≤b} χ_i < 1        (4.2)

and

b · max_{1≤i≤b} max(p_i, q_i) ≤ TOL        (4.3)


For a proof and further discussion see [11]. The criterion (4.1) is due essentially to Stewart [32]. If we have no constraint on the condition numbers (i.e. TOL = ∞), then there is a stronger test for stability (see [11] for details). The quantities p_i, q_i, Dif_u(σ_i, σ−σ_i) and Dif_l(σ_i, σ−σ_i) play a central role in the analysis of singular pencils, and will be discussed further in the next section. For now let us say that Dif_u(σ_i, σ−σ_i) and Dif_l(σ_i, σ−σ_i) measure the "distance" between the eigenvalues in σ_i and the remainder in σ−σ_i, and that p_i and q_i measure the "velocity" with which the eigenvalues move under perturbations. Thus the factor multiplying ε in (4.1) is velocity over distance, or the reciprocal of how large a perturbation is needed to make an eigenvalue of σ_i coalesce with one from σ−σ_i. This bound is fairly tight, and in fact the factor b·max_i(p_i, q_i) in (4.3) is essentially the least possible value for the maximum of κ(P) and κ(Q), where P and Q block diagonalize A−λB, the center of P(ε). The quantities Dif_u, Dif_l, p_i and q_i may be straightforwardly computed using standard software packages. Also, nearly best conditioned block diagonalizing P and Q in (1.1) can be computed. Therefore, it is possible to computationally verify conditions (4.1) to (4.3), and so to determine whether or not a decomposition is stable as defined above, as well as to compute the decomposition.

4.2 Why is the Singular Case Harder than the Regular Case?

Our analysis of the regular case in the last section depended on the fact that eigenvalues and deflating subspaces of regular pencils generally change continuously under continuous changes of the pencil. This situation is completely different for singular pencils. As mentioned in section 1.2, the features of a nongeneric singular pencil may change abruptly under arbitrarily small changes in the pencil entries.
In this section we illustrate this with some examples. As stated in section 1.2, almost any arbitrarily small perturbation of a square singular pencil will make it regular. The singular pencils form a lower dimensional surface called a proper variety in the space of all pencils. A variety is the solution set of a set of polynomial equations; in this case the polynomial equations are obtained by equating the coefficients of all the different powers of λ in det(A−λB) to zero [44]. In fact, given a square singular pencil one can find arbitrarily small perturbations such that the resulting regular pencil has its eigenvalues at arbitrary preassigned points in the extended complex plane [46]. For example, the pencil

[ −λ  1   0 ]
[ 0   −λ  1 ]
[ 0   0   0 ]

is singular, and it is easy to see that the perturbed pencil

[ −λ  1   0      ]
[ 0   −λ  1      ]
[ a   b   c + dλ ]

has determinant dλ³ + cλ² + bλ + a. Clearly we can choose a, b, c and d arbitrarily small so that this polynomial has any three roots desired.
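The determinant claim can be checked numerically: evaluating the determinant at four sample values of λ and interpolating recovers the cubic's coefficients. A small sketch (the coefficient values are illustrative):

```python
import numpy as np

a, b, c, d = 1e-3, 2e-3, 3e-3, 4e-3      # arbitrarily small perturbations

def det_pencil(lam):
    return np.linalg.det(np.array([[-lam, 1.0, 0.0],
                                   [0.0, -lam, 1.0],
                                   [a, b, c + d * lam]]))

# interpolate the cubic through 4 points; expect coefficients [d, c, b, a]
ts = [0.0, 1.0, -1.0, 2.0]
coeffs = np.polyfit(ts, [det_pencil(t) for t in ts], 3)
print(np.round(coeffs, 6))               # [0.004 0.003 0.002 0.001]
```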


Similarly, almost all nonsquare pencils have the same KCF, which consists entirely of L_k blocks if there are more columns than rows, and of L_k^T blocks if there are more rows than columns. For example, a pencil of the form

[ 1−λ  0  0  ]
[ 0    1  −λ ]

is nongeneric because it has a regular part with an eigenvalue at 1, but the perturbed pencil

[ 1−λ  0  a  ]
[ 0    1  −λ ]

has a generic KCF consisting of a single L_2 block for any nonzero a. In control theory, the fact that a generic nonsquare pencil has no regular part implies that a generic control system is completely controllable and observable (see section 2.3).

4.3 Perturbation Results for Pairs of Reducing Subspaces

How do we do perturbation theory for nongeneric pencils in the face of the results of the last section? Clearly, we need to make some restrictions on the nature of the perturbations we allow, so that the structure we wish to preserve changes smoothly. In particular, we will assume that our unperturbed pencil A−λB has a pair of left and right reducing subspaces P and Q, and that the perturbed pencil (A+E) − λ(B+F) has reducing subspaces P_EF and Q_EF of the same dimensions as P and Q, respectively. Under these assumptions we will ask how much P_EF and Q_EF can differ from the unperturbed P and Q as a function of ε = ‖(E,F)‖_E.

Theorem 4.2: Let P and Q be the left and right reducing subspaces of A−λB corresponding to the subset σ_1 of the set of eigenvalues σ of A−λB. Let ε = ‖(E,F)‖_E, and suppose (A+E) − λ(B+F) has reducing subspaces P_EF and Q_EF of the same dimensions as P and Q, respectively, and that

χ = ε · max(p, q) / min(Dif_u(σ_1, σ−σ_1), Dif_l(σ_1, σ−σ_1)) < 1        (4.4)

(where Dif_u(σ_1, σ−σ_1), Dif_l(σ_1, σ−σ_1), p and q will be defined below). Then one of the following two cases must hold:

Case 1: θ_max(P, P_EF) ≤ arctan(χ·(p + (p²−1)^{1/2})) and θ_max(Q, Q_EF) ≤ arctan(χ·(q + (q²−1)^{1/2})); since p + (p²−1)^{1/2} ≤ 2p and q + (q²−1)^{1/2} ≤ 2q, these bounds are at most arctan(2χp) and arctan(2χq). In other words, both angles are small, bounded above by a multiple of the norm ε of the perturbation.

Case 2: At least one of θ_max(P, P_EF) and θ_max(Q, Q_EF) is bounded below by a fixed positive quantity depending only on χ, p and q.

In other words, at least one of the angles between perturbed and unperturbed reducing subspaces is bounded away from 0.

Note that the definition (4.4) of χ is identical to the definition (4.1) for regular pencils. We will see, when we sketch the proof of this theorem below, how this result for singular pencils is a natural generalization of the result for regular pencils. Without loss of generality we assume A−λB is in the GUPTRI form defined in section 3:

A − λB = [ A_r − λB_r   *                *          ]
         [ 0            A_reg − λB_reg   *          ]        (4.5)
         [ 0            0                A_l − λB_l ]

where A_r − λB_r has only L_k blocks in its KCF (i.e. all right minimal indices, hence the subscript r), A_l − λB_l has only L_k^T blocks in its KCF (i.e. all left minimal indices), A_reg − λB_reg is upper triangular and regular, and each * is an arbitrary conforming pencil. Now repartition (4.5) as follows:

A − λB = [ A_11  A_12 ] − λ [ B_11  B_12 ]        (4.6)
         [ 0     A_22 ]     [ 0     B_22 ]

where A_11 − λB_11 includes all of A_r − λB_r and possibly part of A_reg − λB_reg, and A_22 − λB_22 contains all of A_l − λB_l and the remainder of A_reg − λB_reg. In particular, we assume A_11 − λB_11 and A_22 − λB_22 contain disjoint parts (σ_1 and σ_2 = σ−σ_1, respectively) of the spectrum σ of A−λB. We denote the numbers of rows and columns of A_11 − λB_11 by m_1 and n_1; it is easy to see that m_1 ≤ n_1 (with equality if and only if A_r − λB_r is null) and m_2 ≥ n_2 (with equality if and only if A_l − λB_l is null). In this coordinate system our unperturbed reducing subspaces P and Q are spanned by the columns of [I_m1, 0]^T and [I_n1, 0]^T, respectively. Our first step in computing P_EF and Q_EF is to block diagonalize the pencil in (4.6), the upper left block containing σ_1 and the lower right block σ_2. Thus, we seek P and Q of the following special forms such that P^{-1}(A−λB)Q = diag(A_11 − λB_11, A_22 − λB_22):

P = [ I_m1  L    ]        (4.7a)
    [ 0     I_m2 ]

Q = [ I_n1  R    ]        (4.7b)
    [ 0     I_n2 ]

Setting the off-diagonal blocks of P^{-1}(A−λB)Q to zero yields the equations

A_11 R − L A_22 = −A_12 ,  B_11 R − L B_22 = −B_12        (4.8)

This is a set of 2m_1n_2 linear equations in n_1n_2 + m_1m_2 unknowns, the entries of L and R. Since m_1 ≤ n_1 and m_2 ≥ n_2, there are at least as many unknowns as equations. The smallest singular value of the matrix Z_u representing the linear map (L, R) → (A_11R − LA_22, B_11R − LB_22) defines

Dif_u(A_11, A_22; B_11, B_22) = σ_min(Z_u)

with the trivial consequence

‖(L, R)‖_E ≤ ‖(A_12, B_12)‖_E / Dif_u(A_11, A_22; B_11, B_22)

Dif_u is specified merely by choosing σ_1 and σ_2 = σ−σ_1, permitting us to write Dif_u(σ_1, σ_2) if A−λB is known from context, or even just Dif_u if σ_1 is known as well. If A−λB is singular and (4.8) has nonunique solutions, we will choose the (L, R) of least norm, since it leads to P and Q as well conditioned as possible. Call this minimum norm solution (L_0, R_0), and denote (1 + ‖L_0‖²)^{1/2} by p and (1 + ‖R_0‖²)^{1/2} by q. Just as Dif_u is specified only by σ_1, so are ‖L_0‖, ‖R_0‖, p and q. Now consider the perturbed pencil (A+E) − λ(B+F). Premultiplying by P^{-1} and postmultiplying by Q yields the pencil

P^{-1}((A+E) − λ(B+F))Q = [ A_11+E_11  A_12+E_12 ] − λ [ B_11+F_11  B_12+F_12 ]        (4.9)
                          [ E_21       A_22+E_22 ]     [ F_21       B_22+F_22 ]

We now seek P_EF and Q_EF of the forms

P_EF = [ I_m1  L_1  ]  and  Q_EF = [ I_n1  R_1  ]
       [ −L_2  I_m2 ]              [ −R_2  I_n2 ]

such that premultiplying (4.9) by P_EF^{-1} and postmultiplying it by Q_EF block diagonalizes it. It is easy to see that if we can find such P_EF and Q_EF, the perturbed pair of reducing subspaces P_EF and Q_EF will be spanned by the columns of [I_m1, −L_2]^T and [I_n1, −R_2]^T, so that if L_2 and R_2 are small, the angles between the unperturbed and perturbed subspaces will be small. Performing these multiplications and rearranging the result yields, for the A part,

[ (A_11+E_11−L_1E_21)(I+R_1R_2)                        (A_11+E_11)R_1 − L_1(A_22+E_22) + E_12 − L_1E_21R_1 ]        (4.10)
[ L_2(A_11+E_11) − (A_22+E_22)R_2 + E_21 − L_2E_12R_2  (A_22+E_22+L_2E_12)(I+R_2R_1)                        ]

and, for the B part,

[ (B_11+F_11−L_1F_21)(I+R_1R_2)                        (B_11+F_11)R_1 − L_1(B_22+F_22) + F_12 − L_1F_21R_1 ]
[ L_2(B_11+F_11) − (B_22+F_22)R_2 + F_21 − L_2F_12R_2  (B_22+F_22+L_2F_12)(I+R_2R_1)                        ]

Setting the upper right and lower left corners of the pencil to zero yields two sets of equations:

(A_11+E_11)R_1 − L_1(A_22+E_22) = −E_12 + L_1E_21R_1        (4.11a)
(B_11+F_11)R_1 − L_1(B_22+F_22) = −F_12 + L_1F_21R_1        (4.11b)
L_2(A_11+E_11) − (A_22+E_22)R_2 = −E_21 + L_2E_12R_2        (4.12a)
L_2(B_11+F_11) − (B_22+F_22)R_2 = −F_21 + L_2F_12R_2        (4.12b)

(4.11ab) is a set of 2m_1n_2 nonlinear equations in n_1n_2 + m_1m_2 unknowns, and (4.12ab) is a set of 2m_2n_1 nonlinear equations in the same number of unknowns. If A−λB is regular (m_i = n_i), both these systems have the same number of equations as unknowns; this was the case considered by Stewart [32]. He shows that as long as the E_ij and F_ij are small enough, both (4.11ab) and (4.12ab) have unique solutions in a neighborhood of the origin. If A−λB is singular, (4.11ab) will be underdetermined and (4.12ab) will be overdetermined. Overdetermined systems have no solution in general; that this one does is guaranteed by our assumption that the perturbed pencil (A+E) − λ(B+F) has reducing subspaces P_EF and Q_EF of the same dimensions as P and Q. To solve (4.11ab) we make the identifications

x = [L_1, R_1]        (4.13a)
Tx = [(A_11+E_11)R_1 − L_1(A_22+E_22), (B_11+F_11)R_1 − L_1(B_22+F_22)]        (4.13b)
g = −[E_12, F_12]  and  φ(x) = [L_1E_21R_1, L_1F_21R_1]        (4.13c)

so that (4.11ab) becomes

Tx = g + φ(x)        (4.13d)

Expressing T in terms of Kronecker products yields

T = [ I_n2 ⊗ (A_11+E_11)   −(A_22+E_22)^T ⊗ I_m1 ]
    [ I_n2 ⊗ (B_11+F_11)   −(B_22+F_22)^T ⊗ I_m1 ]

so that [32]

σ_min(T) = Dif_u(A_11+E_11, A_22+E_22; B_11+F_11, B_22+F_22) ≥ Dif_u(A_11, A_22; B_11, B_22) − √2 ‖(E_11, E_22, F_11, F_22)‖_E

Similarly, to solve (4.12ab) we make the identifications

x = [L_2, R_2]        (4.14a)
Tx = [L_2(A_11+E_11) − (A_22+E_22)R_2, L_2(B_11+F_11) − (B_22+F_22)R_2]        (4.14b)
g = −[E_21, F_21]  and  φ(x) = [L_2E_12R_2, L_2F_12R_2]        (4.14c)

so that (4.12ab) becomes

Tx = g + φ(x)        (4.14d)

Expressing T in terms of Kronecker products yields

T = [ (A_11+E_11)^T ⊗ I_m2   −I_n1 ⊗ (A_22+E_22) ]
    [ (B_11+F_11)^T ⊗ I_m2   −I_n1 ⊗ (B_22+F_22) ]

Swapping the first m_1m_2 columns with the last n_1n_2 columns and negating the whole matrix, neither of which changes its singular values, yields

[ I_n1 ⊗ (A_22+E_22)   −(A_11+E_11)^T ⊗ I_m2 ]
[ I_n1 ⊗ (B_22+F_22)   −(B_11+F_11)^T ⊗ I_m2 ]

which we recognize as the matrix whose smallest singular value is defined as Dif_u(A_22+E_22, A_11+E_11; B_22+F_22, B_11+F_11). Thus

σ_min(T) ≥ Dif_u(A_22, A_11; B_22, B_11) − √2 ‖(E_11, E_22, F_11, F_22)‖_E

The quantity Dif_u(A_22, A_11; B_22, B_11) does not generally equal Dif_u(A_11, A_22; B_11, B_22) (unless the A_ii and B_ii are symmetric). In the interests of retaining our coordinate-free formulation of our bounds, we therefore define

Dif_l(σ_1, σ_2) = Dif_u(A_22, A_11; B_22, B_11)

where the fact that Dif_l depends only on σ_1 and σ_2 follows just as for Dif_u. Also, one may prove that Dif_l(σ_1, σ_2) > 0 if and only if σ_1 and σ_2 are disjoint, as we have assumed. Thus

σ_min(T) ≥ Dif_l(σ_1, σ_2) − √2 ‖(E_11, E_22, F_11, F_22)‖_E
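Dif_u (and hence Dif_l, by swapping the block pairs) can be computed directly as the smallest singular value of the Kronecker-structured matrix above. A small numpy sketch, using 1-by-1 blocks so the "pencils" are scalars (values illustrative):

```python
import numpy as np

def dif_u(A11, A22, B11, B22):
    """sigma_min of the matrix representing
    (L, R) -> (A11 R - L A22, B11 R - L B22)."""
    m1, n2 = A11.shape[0], A22.shape[1]
    In2, Im1 = np.eye(n2), np.eye(m1)
    Z = np.block([[np.kron(In2, A11), -np.kron(A22.T, Im1)],
                  [np.kron(In2, B11), -np.kron(B22.T, Im1)]])
    return np.linalg.svd(Z, compute_uv=False).min()

one = np.array([[1.0]])
zero = np.array([[0.0]])
print(dif_u(one, one, one, one))         # ~0: identical spectra
print(dif_u(zero, one, one, one) > 0)    # True: eigenvalues 0 and 1 are disjoint
```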

Using these bounds on urn&), we can find bounds on the solution L, and R, of (4.11ab) and (4.12ab): Lemma 4.3: [ll] Consider the equations


    Tx = g + φ(x)    (4.15a)

and

    x = T⁺(g + φ(x))    (4.15b)

where T⁺ is the pseudoinverse of T. Assume that σ_min(T), ‖φ‖, and ‖g‖ satisfy

    4 · ‖g‖ · ‖φ‖ / σ_min²(T) < 1 .

Then equation (4.15b) has a unique solution x inside the ball

    ‖x‖ ≤ 2‖g‖ / σ_min(T) .    (4.15c)

This solution x of (4.15b) has the following relationship to the solution x̂ of (4.15a):

Case 1: If m = n, then x̂ = x is the unique solution of (4.15a) (in the ball).

Case 2: If m < n, then x is one of possibly many solutions of (4.15a) in the ball.

Case 3: If m > n, and if (4.15a) has a solution x̂ in the ball, then x̂ = x. Thus, (4.15b) may have a solution whereas (4.15a) may not.

Furthermore, in cases 1 and 3, if (4.15a) has a solution which does not lie inside the ball in (4.15c), it must lie outside the larger ball given by (4.15d).

Substituting the values of σ_min(T), ‖g‖, and ‖φ‖ from equations (4.13abc) and (4.14abc) yields Theorem 4.2.

4.4 Perturbation Results for the Spectrum of the Regular Part

If m2 = n2, then A22 − λB22 is square and hence regular (otherwise it would have both L_k and L_j^T blocks in its KCF). Then after block diagonalizing the perturbed pencil (A+E) − λ(B+F) as in (4.10), we see the perturbed pencil also has a square part

    [(A22 + E22 + L2·E12) − λ(B22 + F22 + L2·F12)] · (I + R2·R1) .

If this square part is regular, we can use our bounds on ‖L2‖ and ‖R2‖ from the last section in conjunction with some of the perturbation theory for regular pencils in section 4.1 to derive bounds on the difference between its eigenvalues and the eigenvalues of the regular part of the unperturbed pencil A22 − λB22:

Theorem 4.4: Suppose that Case 1 of Theorem 4.2 holds. Suppose further that the block A22 − λB22 is regular (i.e. m2 = n2 > 0). Then the spectrum of the perturbed pencil (A+E) − λ(B+F) includes the spectrum of (A22+E'22) − λ(B22+F'22), where

    ‖(E'22, F'22)‖_E ≤ √2 · q · ‖(E,F)‖_E .

Similarly, if we instead assume A11 − λB11 is regular (m1 = n1 > 0), then the spectrum of the perturbed pencil (A+E) − λ(B+F) includes the spectrum of (A11+E'11) − λ(B11+F'11), where

    ‖(E'11, F'11)‖_E ≤ √2 · p · ‖(E,F)‖_E .
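The fixed point equation (4.15b) of Lemma 4.3 can be solved constructively by the natural iteration x ← T⁺(g + φ(x)); a minimal numpy sketch (our illustration, with a made-up T, g, and quadratic φ, not the authors' code):

```python
import numpy as np

def fixed_point_solve(T, g, phi, tol=1e-14, maxit=200):
    # Iterate x <- T^+ (g + phi(x)), starting from the linearized
    # solution x0 = T^+ g, as in the contraction argument of Lemma 4.3.
    Tp = np.linalg.pinv(T)
    x = Tp @ g
    for _ in range(maxit):
        x_new = Tp @ (g + phi(x))
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x_new

T = np.array([[2.0, 0.3],
              [0.1, 1.5]])
g = np.array([0.05, -0.02])
phi = lambda x: 0.1 * x * x        # elementwise square: ||phi(x)|| <= 0.1 ||x||^2
x = fixed_point_solve(T, g, phi)
residual = np.linalg.norm(T @ x - g - phi(x))
```

For this small well-conditioned example the condition 4‖g‖‖φ‖/σ_min²(T) < 1 holds easily, and the iteration converges to a solution of (4.15a).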


See [11] for a detailed proof.

4.5 Some Applications to Linear Systems Theory

Now let us interpret the results of the last two sections in terms of linear system theory: perturbation bounds for controllable subspaces and uncontrollable modes. Let C(A,C) denote the controllable subspace of the pair of matrices (A,C) as in section 2.3. As discussed in that section, this subspace is simply the left reducing subspace corresponding to σ1 = ∅ of the pencil Â − λB̂ = [C | A−λI]. It is easy to see that this pencil can have neither infinite eigenvalues nor L_k^T blocks in its KCF since B̂ = [0 | I] has full rank; hence it can only have finite eigenvalues and L_k blocks in its KCF. Also, one can see that the number of L_k blocks is a constant equal to the number of columns of C. Thus, the assumption in Theorem 4.2 about the perturbed pencil having reducing subspaces of the same size is implied by the assumption that the perturbed system (A+E_A, C+E_C) has a controllable subspace of the same dimension as C(A,C).

Corollary 4.5: Suppose that the perturbed system (A+E_A, C+E_C) has a controllable subspace of the same dimension as C(A,C), where x measures ‖(E_A,E_C)‖_E relative to the perturbation bound Δ of Theorem 4.2 and p is as in Theorem 4.2. Then either

Case 1:

    θ_max(C(A,C), C(A+E_A, C+E_C)) ≤ arctan( x·(p + (p²−1)^(1/2)) ) ,

with θ_max(C(A,C), C(A+E_A, C+E_C)) ≤ arctan(2x/p) if x < 1/2, or

Case 2: θ_max(C(A,C), C(A+E_A, C+E_C)) is bounded below, as in Case 2 of Theorem 4.2.

For a detailed proof, see [11]. It is also easy to see that this result applies immediately to observable subspaces as well by duality [48]. Another feature of a control system (A,C) for which we can derive perturbation bounds using this approach is the spectrum of the regular part, also called the input decoupling zeroes or uncontrollable modes of the control system. Theorem 4.4 of the last section yields:

Corollary 4.6: Assume we are in Case 1 of Corollary 4.5. Then the uncontrollable modes of the perturbed control system (A+E_A, C+E_C) are the eigenvalues of the pencil (A22+E'22) − λB22, where

    ‖E'22‖_E ≤ √2 · q · ‖(E_A, E_C)‖_E .
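The controllable subspace itself is easy to compute for small dense systems as span{C, AC, A²C, ...}; a numpy sketch (our illustration, using a hypothetical pair chosen to match the worked example of this section):

```python
import numpy as np

def controllable_subspace(A, C, tol=1e-10):
    # Orthonormal basis for span{C, AC, A^2 C, ...} via a
    # rank-revealing SVD of the controllability matrix.
    n = A.shape[0]
    K = np.hstack([np.linalg.matrix_power(A, k) @ C for k in range(n)])
    U, s, _ = np.linalg.svd(K)
    r = int(np.sum(s > tol * s[0]))
    return U[:, :r]

A = np.diag([1.0, 2.0, 3.0])
C = np.array([[1.0], [0.0], [0.0]])
P = controllable_subspace(A, C)   # one-dimensional, spanned by [1,0,0]^T
```

Here only the first mode is controllable, so the computed basis spans [1,0,0]^T.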


In [40], P and Q are chosen in (4.1) so that P⁻¹BQ = [I | 0], implying B22 = I. Thus the problem of finding the eigenvalues of the perturbed pencil (A22+E'22) − λB22 above reduces to perturbation theory for the standard eigenproblem. To illustrate these points consider the pencil

    Â − λB̂ = [C | A−λI] =

    [ 1   1−λ    0     0  ]
    [ 0    0    2−λ    0  ]
    [ 0    0     0    3−λ ]

This pencil is in the canonical form (4.2) with m1 = 1, m2 = 2, n1 = 2 and n2 = 2. In other words A11 − λB11 = [1, 1−λ] has KCF L1, and A22 − λB22 = diag(2−λ, 3−λ) is a regular diagonal pencil with eigenvalues 2 and 3. The left reducing subspace P, which is spanned by [1,0,0]^T, is the controllable subspace C(A,C) of the system (A,C). The quantities of interest in Corollary 4.5 are Dif_u(∅,{2,3}) = Dif_l(∅,{2,3}) = .112, and p = q = 1. Thus Corollary 4.5 implies that if E_A and E_C are such that 1 = dim(C(A,C)) = dim(C(A+E_A, C+E_C)) and ‖(E_A,E_C)‖ = .028·x where x < 1, then either

Case 1: θ_max(C(A,C), C(A+E_A, C+E_C)) ≤ arctan(x), or

Case 2: θ_max(C(A,C), C(A+E_A, C+E_C)) ≥ arctan(1/√3) = arctan(.577) = .52 .

Thus, for example, if we have ‖(E_A,E_C)‖_E < .001, then any one dimensional controllable subspace of (A+E_A, C+E_C) will either be within .036 radians of C(A,C) or at least .52 radians away from C(A,C). A simple example of the latter situation is

a perturbed pencil [C+E_C | (A+E_A) − λI] (whose displayed entries are not reproduced here), in which case the new controllable subspace is spanned by [0,1,0]^T and so is orthogonal to C(A,C). In case the perturbed controllable subspace falls into Case 1, then we can use Corollary 4.6 to bound the uncontrollable modes of the perturbed system: the perturbed modes are eigenvalues of the matrix

    [ 2  0 ]
    [ 0  3 ]  + E'

where ‖E'‖_E ≤ √2 · q · ‖(E_A,E_C)‖_E. Gershgorin's theorem supplies the simple bound that the perturbed uncontrollable modes lie in disks centered at 2 and 3 with radii √2 · ‖(E_A,E_C)‖_E ≤ .04·x.

4.6 Other Approaches


Sun uses a somewhat different approach for deriving perturbation theorems for the eigenvalues of singular pencils [38]. He generalizes his results for regular pencils [37] by relating the perturbation of the eigenvalues of a diagonalizable A−λB to the variation of the orthogonal projection onto the column space R([A,B]*). His assumption that A−λB is diagonalizable means that A−λB may have neither Jordan blocks of dimension greater than 1 nor singular blocks other than L0 and L0^T. Sun must also make restrictions on the perturbations in A and B. His assumptions are that the space W = R([A,B]*) contains no vectors orthogonal to Z = R([A+E, B+F]*), and that W' = R([A*,B*]*) contains no vectors orthogonal to Z' = R([(A+E)*, (B+F)*]*). By using the chordal metric (see section 4.1) to measure the perturbations of eigenvalues, he obtains a Wielandt-Hoffman and a Bauer-Fike type of theorem for singular diagonalizable pencils, showing that the eigenvalues of A−λB are insensitive to small perturbations in A and B.

5. Analyzing the error of standard algorithms for computing the Kronecker structure

Using the perturbation analysis from section 4, this section analyzes the error of the standard algorithms of section 3 for computing the Kronecker structure. Here we focus on the GUPTRI form presented in section 3.3, but the analysis holds for any backward stable algorithm.

5.1 Distance from A−λB to the nearest pencil with the computed Kronecker structure

Let α be a description of the KCF of A−λB (a list of left minimal indices, right minimal indices, and eigenvalues along with the number and sizes of Jordan blocks for each one). Then we define

    Sing_α(A,B) = min{ ‖(E,F)‖_E : (A+E) − λ(B+F) has KCF α }    (5.1)

i.e. Sing_α(A,B) is the smallest perturbation of A−λB (measured as ‖(E,F)‖_E) that makes the perturbed pencil (A+E) − λ(B+F) have the Kronecker structure α. One estimate of Sing_α(A,B) comes from running one of the algorithms in section 3 for computing the KCF.
We focus on the GUPTRI form and its RZSTR kernel discussed in section 3.3. The main contribution to errors caused by the finite arithmetic in RZSTR comes from the computation of n_i and r_i. For example, the computation n_i := column-nullity(A^(i), epsu) means that all singular values σ_k of A^(i) less than epsu are interpreted as zeros. In practice we make column-nullity(·,·) more robust by also requiring a gap between the deleted and retained singular values (see section 3.3). The resulting reduction step can be written as an exact reduction (5.2) of a perturbed pencil (A^(i)+E^(i)) − λ(B^(i)+F^(i)). Here ‖E^(i)‖²_E is the sum of the squares of the n_i singular values interpreted as zeros in the column compression of A^(i) (see step 1.1 of RZSTR


in section 3.3). Similarly, ‖F^(i)‖²_E is the sum of the squares of the n_i − r_i singular values of B^(i) interpreted as zeros in the column compression of the n_i columns of B^(i)V^(i) (see step 2.1 of RZSTR in section 3.3). We get similar contributions from all reductions. Since we only make use of unitary equivalence transformations when computing the GUPTRI form, the resulting decomposition can be expressed in finite arithmetic as

    P*((A+E) − λ(B+F))Q = the GUPTRI form (3.14) ,    (5.3)

where

    ‖(E,F)‖²_E = δ²    (5.4)

and δ² is the sum of the squares of all singular values interpreted as zeros during the six reductions (see section 3.3). The equations (5.3–4) express that the GUPTRI form is backward stable and that it computes a Kronecker structure α for a pencil C−λD within distance δ of A−λB. This computed δ is an upper bound on Sing_α(A,B) for the Kronecker structure α the algorithm reports.

How good is this upper bound? We cannot expect δ to be smaller than O(epsu · max(‖A‖_E, ‖B‖_E)), since epsu is used as the tolerance for deleting small singular values. However, if A−λB has a well-defined Kronecker structure (all deleted singular values are O(macheps · max(‖A‖_E, ‖B‖_E)), where macheps is the machine roundoff error and macheps ≤ epsu, and all retained singular values are O(1)), then δ will be of size O(macheps · max(‖A‖_E, ‖B‖_E)) too. The described situation shows a case where we have overestimated the uncertainties in A and B. On the other hand, examples (see [39]) show that δ may be as bad as a root of the true distance (e.g. ε^(1/2) versus ε), but some function of it may still provide a crude lower bound on Sing_α(A,B), much as the output of Kågström's and Ruhe's algorithm for the Jordan canonical form [21] provides a lower bound on the distance from the input matrix to one with the reported Jordan form. This problem is under investigation.

From the backward stability of the GUPTRI form we see that the computed bases for a pair of reducing subspaces (P,Q) correspond exactly to a nearby pencil C−λD within distance δ. How to compute error bounds for these bases is the topic of the coming section.

5.2 Perturbation bounds for computed pairs of reducing subspaces

In this section we apply the perturbation theory of chapter 4 to analyze the results of the GUPTRI algorithm (or any similar backward stable algorithm) of chapter 3. Let A−λB be a pencil supplied by the user; it may or may not be singular.
The algorithm of chapter 3 had the property that it found an exactly singular pencil A'−λB' within a small distance δ of A−λB. Theorem 4.2 showed that if we perturb A'−λB' such that the perturbed pencil (A'+E) − λ(B'+F) has reducing subspaces of the same dimensions as A'−λB', and if the norm of the perturbation ‖(E,F)‖ is small enough, then we can compute bounds on the maximum angle between the unperturbed and perturbed reducing subspaces. Let Δ denote the upper bound on ‖(E,F)‖ of Theorem 4.2:

    (5.5)

Then clearly, if any singular pencil A''−λB'' with reducing subspaces of the same dimensions as A'−λB' is sufficiently close to A−λB so that it is also within Δ of


A'−λB', the perturbation bounds of Theorem 4.2 will apply to it. For A''−λB'' to be sufficiently close to A'−λB' it clearly suffices that ‖(A''−A, B''−B)‖_E < Δ−δ, if Δ−δ > 0. This leads to the following algorithm:

Algorithm 5.1: Perturbation Bounds for Computed Reducing Subspaces

1) Reduce A−λB to A'−λB' of the form GUPTRI using an algorithm such as the one in section 3.3 based on RGQZD. Let δ bound the perturbation ‖(A'−A, B'−B)‖_E. Let P' and Q' be reducing subspaces of A'−λB' corresponding to the subset σ1 of the spectrum of A'−λB'.

2) Compute the bound Δ given above in equation (5.5). Let η = Δ−δ.

3) If η > 0, then if A''−λB'' is any singular pencil within distance d = ‖(A''−A, B''−B)‖_E < η of A−λB with reducing subspaces Q'' and P'' of the same dimensions as Q' and P', one of the two cases of Theorem 4.2 must

hold:

Case 1:

    θ_max(P', P'') ≤ arctan( ((δ+d)/Δ) · (p + (p²−1)^(1/2)) )

and

    θ_max(Q', Q'') ≤ arctan( ((δ+d)/Δ) · (q + (q²−1)^(1/2)) ) ,

with better bounds if A''−λB'' is within η/2 of A−λB, or

Case 2: Either θ_max(P', P'') or θ_max(Q', Q'') is at least as large as the corresponding lower bound of Case 2 of Theorem 4.2.

4) If η ≤ 0, no perturbation bound can be made, because the perturbation δ made by the algorithm is outside the range of our estimate.

One other interpretation of chapters 3 and 4 can be made. If the rank decisions made by the algorithm of chapter 3 are sufficiently well defined, i.e. at each reduction step there is a large gap between the largest singular value σ_{k+1} less than the threshold epsu and the smallest singular value σ_k greater than epsu:

    σ_k >> epsu >> σ_{k+1} ,

then the algorithm will make the same rank decisions for all pencils in a disk around A−λB, and hence will compute the same Kronecker structure for the left and right indices and zero and infinite eigenvalues. In particular, the computed minimal and maximal reducing subspaces will have the same dimensions for all inputs in the disk around A−λB. Combined with Theorem 4.2, this means that if A−λB is intended to be singular, but fails to be because of roundoff or measurement errors, then the minimal and maximal reducing subspaces will be computed accurately anyway. Applied to linear systems theory (see sections 2.3 and 4.5), this means that the algorithms of chapter 3 compute accurate controllable subspaces and uncontrollable modes of uncontrollable systems.
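The gap condition above is easy to monitor; a sketch (our illustration, with an arbitrary gap factor of 100):

```python
import numpy as np

def rank_decision_is_robust(sigmas, eps_u, gap=100.0):
    # True when sigma_k >> eps_u >> sigma_{k+1}: every kept singular
    # value is well above the threshold and every deleted one well below,
    # so the same rank decision holds for all nearby pencils.
    s = np.sort(np.asarray(sigmas, dtype=float))[::-1]
    kept = s[s >= eps_u]
    dropped = s[s < eps_u]
    ok_kept = kept.size == 0 or kept[-1] >= gap * eps_u
    ok_dropped = dropped.size == 0 or dropped[0] * gap <= eps_u
    return ok_kept and ok_dropped

r1 = rank_decision_is_robust([1.0, 0.5, 1e-12], eps_u=1e-8)   # clear gap
r2 = rank_decision_is_robust([1.0, 2e-8, 1e-12], eps_u=1e-8)  # borderline
```

A failed check signals that the computed Kronecker structure may change for pencils within the data uncertainty.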

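The bound computed in steps 2 and 3 of Algorithm 5.1 can be sketched as follows (our paraphrase; δ, d, Δ, and q as in the algorithm):

```python
import numpy as np

def case1_angle_bound(delta, d, Delta, q):
    # Case 1 upper bound on theta_max(Q', Q'') from Algorithm 5.1.
    # Requires eta = Delta - delta > 0 and d < eta for the bound to apply.
    eta = Delta - delta
    if eta <= 0:
        raise ValueError("algorithm error delta exceeds the estimate Delta")
    if d >= eta:
        raise ValueError("pencil is too far away for the bound to apply")
    return np.arctan((delta + d) / Delta * (q + np.sqrt(q * q - 1.0)))

# With q = 1 the bound reduces to arctan((delta + d) / Delta).
b = case1_angle_bound(delta=0.0, d=0.001, Delta=0.028, q=1.0)
```

With q = 1 and the numbers of the worked example of section 4.5, this reproduces the arctan(x) bound given there.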

6. Numerical examples

A Fortran 77 program implementing the GUPTRI form with accuracy estimates of the computed results (see sections 3.3 and 5) is under development. The inputs consist of a matrix pair (A,B), a bound epsu on the size of the uncertainty in (A,B), and flags indicating whether complete or just partial information (e.g. the minimal indices or the regular part of A−λB) is desired. The output is a GUPTRI form (3.14) which holds for almost all pencils within distance δ of A−λB (the generic case), or else a nongeneric GUPTRI form lying within distance δ of the input pencil A−λB. The bound δ is expressed in terms of the small singular values (< epsu) that are deleted during the deflation process to compute the Kronecker structure α (see section 5.1). Here we report results from a Matlab [26] implementation in double precision arithmetic (the relative machine accuracy macheps = 1.4E−17) of the basic reduction RZSTR used to transform A−λB to a GUPTRI form. The inputs are as for the Fortran program, while the output of RZSTR is a block upper triangular form

    [ A_0r − λB_0r       *       ]
    [       0        A1 − λB1 ]

where A_0r − λB_0r contains the Jordan structure of the zero eigenvalue and the right minimal indices. The pencil A1 − λB1 contains the possible left minimal structure and the remaining regular part of A−λB. This decomposition is the first of the six reductions used for computing the GUPTRI form and corresponds to what we get when replacing GSVD by GQZD in the reduction theorem 3.3.

6.1 A generic pencil

Let A and B be m = 7 by n = 8 random matrices (generated in Matlab by the RAND function) and

    epsu = macheps · max(‖A‖_E, ‖B‖_E) .

RZSTR computed the following decomposition corresponding to one L7 block (see section 1.2):

Since we mainly are interested in the computed block structure, we only display numbers to 4 decimals accuracy. However, the zeros displayed are all accurate to the rounding level. RZSTR does not know in advance that A−λB is generic, so it


has to go through eight deflation steps in order to realize that A−λB is equivalent to one L7 block. (The generic structure of a k by k+1 pencil is one L_k block.) Below we display n_i, r_i, and the smallest singular value of [A^(i)*, B^(i)*]* in each of the deflation steps.

    i   n_i   r_i   σ_min([A^(i)*, B^(i)*]*)
    1    1     1    0.38
    2    1     1    0.34
    3    1     1    0.33
    4    1     1    0.21
    5    1     1    0.41
    6    1     1    0.36
    7    1     1    0.67
    8    1     0

Locally at each step i, the smallest singular value displayed is the 2-norm of the smallest perturbation of [A^(i)*, B^(i)*]* that makes A^(i) and B^(i) have a common null space (n_i − r_i > 0). So at least one of these singular values would have to be interpreted as zero for the pencil to be nongeneric; none is smaller than epsu here, so the computed structure is the generic one.
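The rank decision behind each n_i, and the staircase bookkeeping that turns the indices into block counts, can be sketched as follows (our illustration; n_i − r_i counts the L_{i−1} blocks revealed at step i of a staircase reduction, which reproduces the single L_7 block of this example):

```python
import numpy as np

def column_nullity(A, eps_u):
    # Number of columns of A mapped (numerically) to zero: the column
    # dimension minus the number of singular values at or above eps_u.
    s = np.linalg.svd(A, compute_uv=False)
    return A.shape[1] - int(np.sum(s >= eps_u))

def l_block_counts(n, r):
    # n_i - r_i is the number of L_{i-1} blocks revealed at step i.
    return [ni - ri for ni, ri in zip(n, r)]

# A tiny singular value and an identically zero column both count:
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1e-12, 0.0]])
nullity = column_nullity(A, eps_u=1e-8)

# The generic 7-by-8 example above: n_i = r_i = 1 for i = 1..7 and
# n_8 = 1, r_8 = 0, i.e. a single L_7 block.
counts = l_block_counts([1] * 8, [1] * 7 + [0])
```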
[The displayed decomposition of section 6.1 and the 7 by 8 example matrices A and B of the next example are garbled in this copy and omitted.]

6.2 A nongeneric pencil

The example matrices A and B

are displayed only to four decimals accuracy. The pencils A11−λB11 and A22−λB22 have the KCFs diag(L0, L1, J2(0)) and diag(L1^T, 2·J1(∞)), respectively. Since A−λB is known to full machine precision, RZSTR is called with an epsu similar to the one in the preceding example. In order to obtain the complete Kronecker structure, RZSTR is called twice, first with A−λB as input and then with the deflated pencil from the first reduction as input (see the discussion in section 3.3 for more details). The following indices determining a nongeneric Kronecker structure α1 are computed:

    [Table of the numbers of L_{i−1} and L_{i−1}^T blocks determining α1; the tabulated values are not reliably recoverable in this copy.]

Table 6.1

During the reduction process all singular values that should be zeros are zeros to the rounding level. What happens if we perturb the nongeneric pencil A−λB? Consider (A+E) − λ(B+F), where E and F are full matrices and ‖E‖_E and ‖F‖_E are O(macheps^(1/2) · max(‖A‖_E, ‖B‖_E)). In section 4.2 we pointed out that almost all (arbitrarily small) perturbations of a nongeneric pencil will make it generic, so (A+E) − λ(B+F) ought to be generic, and it is! When we apply RZSTR with the same epsu as for the unperturbed pencil, it computes the generic structure as in section 6.1 (which we denote by α2). However, if we consider (A+E) − λ(B+F) as a pencil with uncertainties in the data of size ‖E‖_E and ‖F‖_E, respectively, RZSTR computes the nongeneric Kronecker structure α1 corresponding to the unperturbed pencil A−λB. Below we display the computed reduction in equations (6.1a) and (6.1b). [The displayed matrices are garbled in this copy and omitted.] The pencil A_0r − λB_0r is identical to rows 1 to 4 and columns 1 to 5 of

P*((A+E) − λ(B+F))Q, and its block upper triangular form exposes the Jordan structure of the zero eigenvalue and the right minimal structure of (A+E) − λ(B+F). The pencil A11 − λB11 is identical to rows 5 to 7 and columns 6 to 8 of P*((A+E) − λ(B+F))Q and corresponds to the Jordan structure of the infinite eigenvalues and the left minimal indices. Note that since we applied RZSTR to a transposed pencil in the second reduction, A11 − λB11 is block lower triangular. The perturbation matrices E' and F' in (6.1a–b) correspond to the small but nonzero singular values interpreted as zeros during the deflation process to determine the Kronecker structure. However, our Matlab implementation of RZSTR does not delete these small singular values in the process of determining the n_i's in Table 6.1, so the decomposition (6.1a–b) is to the rounding level an exact equivalence transformation of the pencil (A+E) − λ(B+F) specified as input. How do we interpret the resulting decomposition? One way is to say that (6.1a–b) is exactly nongeneric to the accuracy max(‖E'‖_E, ‖F'‖_E) ≈ 7e−8, with the Kronecker structure of A−λB (the unperturbed pencil). In the context of section 5.1,

the pencil

    [ A_0r  A12 ]       [ B_0r  B12 ]
    [  0    A11 ]  −  λ [  0    B11 ]

from (6.1a–b) is nongeneric to the rounding level and is an exact equivalence transformation of the perturbed pencil (A+E+E') − λ(B+F+F').

6.3 Assessment of the (nongeneric) computed Kronecker structures

In the following figure we try to visualize our nongeneric example:

Figure 6.1 – A surface of equivalent nongeneric pencils (same KCF)

Here epsu = O(ε) means that we claim to know the input pencil to full machine accuracy, and epsu = O(ε^(1/2)) means that we claim to know the input pencil to around half the machine accuracy. Let δ_i denote the distance between two matrix pencils (see Fig. 6.1); it corresponds to deleted singular values (see equations (5.3) and (5.4)). First the unperturbed nongeneric pencil A−λB and epsu = O(ε) were specified as inputs to RZSTR, and a nongeneric GUPTRI form with Kronecker structure α1 was computed. This form corresponds exactly to the pencil A'−λB', which is very close to A−λB; thus Sing_α1(A,B) ≤ δ1 = O(ε). Then RZSTR was called with the perturbed (generic) pencil (A+E) − λ(B+F) and epsu = O(ε) as inputs, and as a result the generic GUPTRI form corresponding to (A+E)' − λ(B+F)' was produced. In this case Sing_α2(A+E, B+F) ≤ δ2 = O(ε). Finally (A+E) − λ(B+F) and epsu = O(ε^(1/2)) were specified as inputs to RZSTR, which computed a nongeneric GUPTRI form with Kronecker structure α1 again, corresponding to (A+E+E') − λ(B+F+F'). Thus Sing_α1(A+E, B+F) ≤ δ3 = O(ε^(1/2)). To summarize, A'−λB' and


(A+E+E') − λ(B+F+F')

have the same KCF and belong to the proper variety of nongeneric pencils. Let P' and Q' denote the computed pair of reducing subspaces corresponding to A_0r − λB_0r in (6.1a–b), and P and Q the corresponding pair of A'−λB'. What can be said about the maximal angle between these two pairs of reducing subspaces, associated with the perturbed pencil (A+E) − λ(B+F) and the unperturbed A−λB, respectively? By applying the error bounds as described in section 5.2 the following results were obtained:

    max(p,q) = 3.2847,  Dif_u = 0.0315,  Dif_l = 0.0147,  Δ = 0.0013

Case 1:
    θ_max(P,P') ≤ 5.3E−5 radians (approximately 0.003 degrees)
    θ_max(Q,Q') ≤ 1.3E−4 radians (approximately 0.007 degrees)
or Case 2:
    θ_max(P,P') ≥ 0.19 radians (approximately 10.8 degrees)
    θ_max(Q,Q') ≥ 0.08 radians (approximately 4.6 degrees)

Since we computed a GUPTRI form of both (A+E) − λ(B+F) and A−λB, we can show the actual angles between the computed pairs of reducing subspaces corresponding to the perturbed and unperturbed pencils:

    θ_max(P,P') = 1E−7 radians
    θ_max(Q,Q') = 5E−7 radians

7. Conclusions and future work

With applications of matrix pencils to descriptor systems, singular systems of differential equations, and state-space realizations of linear systems in mind (section 2), we have shown how to stably and accurately compute features of the KCF (for example minimal indices, pairs of reducing subspaces, regular part, and Jordan structures) which are mathematically ill-posed and potentially numerically unstable. The stability is guaranteed by a finite number of unitary equivalence transformations of A−λB, giving a generalized upper triangular (GUPTRI) form (sections 3 and 5.1). We make the mathematical problem well-posed by restricting the allowed perturbations E−λF of A−λB such that A−λB and (A+E) − λ(B+F) have minimal and maximal pairs of reducing subspaces of the same dimension (section 4). This regularization leads to perturbation bounds for pairs of reducing subspaces and for the eigenvalues of the regular part (section 4), which can be used for analyzing the error of standard algorithms (section 5). The accuracy of the computed Kronecker features is confirmed via error bounds (sections 5 and 6). Our future work will be concentrated on the completion of our Fortran 77 program that implements the GUPTRI form and the error bounds. Then we want to experiment with the software on a large and diverse set of applications and encourage other people to make use of it. Hopefully this will lead to feedback permitting us to build in more application-based knowledge and focus the software on different and specialized applications (see sections 2.3-4). The first version of the program will


treat all pencils as a general A - XB problem.

8. References

[1] D.J. Bender, "Descriptor Systems and Geometric Control Theory", PhD Dissertation, Dept of Electrical Engineering and Computer Science, Univ. of California Santa Barbara, September 1985
[2] D.L. Boley, "Computing the Controllability/Observability Decomposition of a Linear Time-Invariant Dynamic System, A Numerical Approach", PhD Dissertation, Dept of Computer Science, Stanford University, June 1981
[3] K.E. Brenan, "Stability and Convergence of Difference Approximations for Higher Index Differential-Algebraic Systems with Applications in Trajectory Control", PhD Dissertation, Dept of Mathematics, Univ. of California Los Angeles, 1983
[4] S.L. Campbell, Singular systems of differential equations, Vol. 1 and 2, Pitman, Marshfield, MA, 1980 and 1982
[5] S.L. Campbell, "The numerical solution of higher index linear time varying singular systems of differential equations", SIAM J. Sci. Stat. Computing, Vol. 6, No. 2, 1985, pp 334-348
[6] K.E. Chu, "Exclusion Theorems and the Perturbation Analysis of the Generalized Eigenvalue Problem", NA-report 11/85, Dept of Mathematics, Univ of Reading, UK, 1985
[7] D.J. Cobb, "Controllability, Observability and Duality in Singular Systems", IEEE Trans. Aut. Contr., Vol. AC-29, No. 12, 1984, pp 1076-1082
[8] J. Demmel, "The Condition Number of Equivalence Transformations that Block Diagonalize Matrix Pencils", SIAM J. Num. Anal., Vol. 20, 1983, pp 599-610
[9] J. Demmel, "Computing Stable Eigendecompositions of Matrices", to appear in Linear Alg. Appl., 1985
[10] J. Demmel and B. Kågström, "Stable Eigendecompositions of Matrix Pencils A-λB", Report UMINF-118.84, Inst of Information Processing, Univ of Umeå, Sweden, June 1984
[11] J. Demmel and B. Kågström, "Computing Stable Eigendecompositions of Matrix Pencils", Technical report #164, Courant Institute of Mathematical Sciences, 251 Mercer Str., New York, NY 10012, May 1985
[12] F.R. Gantmacher, The theory of matrices, Vol. 1 and 2, (Transl.) Chelsea, New York, 1959
[13] C.W.
Gear and L.R. Petzold, "Differential Algebraic Systems and Matrix Pencils", in [23], pp 75-89
[14] C.W. Gear and L.R. Petzold, "ODE Methods for the Solution of Differential/Algebraic Systems", SIAM J. Numer. Anal., Vol. 21, 1984, pp 716-728
[15] G.H. Golub and J.H. Wilkinson, "Ill-conditioned Eigensystems and the Computation of the Jordan Canonical Form", SIAM Review, Vol. 18, No. 4, 1976, pp 578-619
[16] V.N. Kublanovskaya, "An Approach to Solving the Spectral Problem of A-λB", in [23], pp 17-29
[17] V.N. Kublanovskaya, "AB-algorithm and Its Modifications for the Spectral Problem of Linear Pencils of Matrices", Numer. Math., Vol. 43, 1984, pp 329-342
[18] B. Kågström, "On Computing the Kronecker Canonical Form of Regular


(A-λB)-pencils", in [23], pp 30-57
[19] B. Kågström, "The Generalized Singular Value Decomposition and the General A-λB Problem", BIT 24, 1984, pp 568-583
[20] B. Kågström, "RGSVD - An Algorithm for Computing the Kronecker Structure and Reducing Subspaces of Singular Matrix Pencils A-λB" (Report UMINF-112.83), to appear in SIAM J. Sci. Stat. Comp., Vol. 7, No. 1, 1986
[21] B. Kågström and A. Ruhe, "An Algorithm for Numerical Computation of the Jordan Normal Form of a Complex Matrix", ACM Trans. Math. Software, Vol. 6, No. 3, 1980, pp 398-419
[22] B. Kågström and A. Ruhe, "ALGORITHM 560, JNF, An Algorithm for Numerical Computation of the Jordan Normal Form of a Complex Matrix [F2]", ACM Trans. Math. Software, Vol. 6, No. 3, 1980, pp 437-443
[23] B. Kågström and A. Ruhe (eds), Matrix Pencils, Proceedings, Pite Havsbad 1982, Lecture Notes in Mathematics Vol. 973, Springer-Verlag, Berlin, 1983
[24] F.L. Lewis, "Fundamental, Reachability, and Observability Matrices for Descriptor Systems", IEEE Trans. Aut. Contr., Vol. AC-30, No. 5, May 1985, pp 502-505
[25] D.G. Luenberger, "Dynamic Equations in Descriptor Form", IEEE Trans. Aut. Contr., Vol. AC-22, No. 3, 1977, pp 312-321
[26] C. Moler, "MATLAB - An Interactive Matrix Laboratory", Dept of Computer Science, Univ of New Mexico, Albuquerque, New Mexico
[27] C. Moler and G.W. Stewart, "An Algorithm for the Generalized Eigenvalue Problem", SIAM J. Num. Anal., Vol. 10, 1973, pp 241-256
[28] C.C. Paige, "Properties of Numerical Algorithms Related to Computing Controllability", IEEE Trans. Aut. Contr., Vol. AC-26, No. 1, 1981, pp 130-138
[29] C.C. Paige, "A Note on a Result of Sun Ji-Guang: Sensitivity of the CS and GSV Decompositions", SIAM J. Num. Anal., Vol. 21, 1984, pp 186-191
[30] C.C. Paige and M.A. Saunders, "Towards a Generalized Singular Value Decomposition", SIAM J. Num. Anal., Vol. 18, 1981, pp 398-405
[31] R.F. Sincovec, B. Dembart, M.A. Epton, A.M. Erisman, J.W. Manke, E.L.
Yip, "Solvability of Large Scale Descriptor Systems", Final Report, DOE Contract ET-78-G01-2876, Boeing Computer Services Co., Seattle, Washington, 1979
[32] G.W. Stewart, "Error and Perturbation Bounds for Subspaces Associated with Certain Eigenvalue Problems", SIAM Review, Vol. 15, 1973, pp 727-764
[33] G.W. Stewart, "Gershgorin Theory for the Generalized Eigenproblem Ax=λBx", Math. of Comp., Vol. 29, No. 130, April 1975, pp 600-606
[34] G.W. Stewart, "A Method for Computing the Generalized Singular Value Decomposition", in [23], pp 207-220
[35] G.W. Stewart, "Computing the CS Decomposition of a Partitioned Orthonormal Matrix", Num. Math., 1982, pp 297-306
[36] J-G. Sun, "Perturbation Analysis for the Generalized Eigenvalue and the Generalized Singular Value Problem", in [23], pp 221-244
[37] J-G. Sun, "Perturbation Analysis for the Generalized Singular Value Problem", SIAM J. Num. Anal., Vol. 20, 1983, pp 611-625
[38] J-G. Sun, "Orthogonal Projections and the Perturbation of the Eigenvalues of Singular Pencils", Journal of Computational Mathematics, China, Vol. 1, No. 1, 1983, pp 63-74
[39] P. Van Dooren, "The Computation of Kronecker's Canonical Form of a Singular Pencil", Lin. Alg. Appl., Vol. 27, 1979, pp 103-141
[40] P. Van Dooren, "The Generalized Eigenstructure Problem in Linear System


Theory", IEEE Trans. Aut. Contr., Vol. AC-26, No. 1, 1981, pp 111-129
[41] P. Van Dooren, "Reducing Subspaces: Definitions, Properties and Algorithms", in [23], pp 58-73
[42] C. Van Loan, "Generalizing the Singular Value Decomposition", SIAM J. Numer. Anal., Vol. 13, 1976, pp 76-83
[43] C. Van Loan, "Computing the CS and the Generalized Singular Value Decompositions", Numer. Math., Vol. 46, No. 4, 1985, pp 479-491
[44] W. Waterhouse, "The Codimension of Singular Matrix Pairs", Lin. Alg. Appl., Vol. 57, 1984, pp 227-245
[45] G.C. Verghese, B.C. Levy, T. Kailath, "A Generalized State-Space for Singular Systems", IEEE Trans. Aut. Contr., Vol. AC-26, No. 4, 1981, pp 811-831
[46] J.H. Wilkinson, "Linear Differential Equations and Kronecker's Canonical Form", in Recent Advances in Numerical Analysis, Ed. C. de Boor, G. Golub, Academic Press, 1978, pp 231-265
[47] J.H. Wilkinson, "Kronecker's Canonical Form and the QZ Algorithm", Lin. Alg. Appl., Vol. 28, 1979, pp 285-303
[48] W.M. Wonham, Linear Multivariable Control: A Geometric Approach, Second ed., New York, Springer-Verlag, 1979

Acknowledgements

The authors would like to thank several sources for their support of this work. Gene Golub provided both financial support and a pleasant working environment to both authors for several weeks during summer 1985. James Demmel has received support from the National Science Foundation (grant number 8501708). Bo Kågström has received support from the Swedish Natural Science Research Council (contract IWRS-FU1840-101). Both authors have been supported by the Swedish National Board for Technical Development (contract STU-84-5481). Finally, both authors wish to thank Jane Cullum, Ralph Willoughby, and IBM for their support in helping the authors present their results at the Large Eigenvalue Problems conference of the IBM Europe Institute at Oberlech, Austria, in July 1985.