On Inhomogeneous
Eigenvalue
Problems.
I*
R. M. M. Mattheij
Mathemutisch Znstituut Katholieke Universiteit Toemooiveld 6525 ED Nijmegen, The Netherlands G. Siiderlind Department of Numerical Analysis Royal Znstitute of Technology S-100 44 Stockholm, Sweden In memory of James H. Wilkinson
Submitted by Gene H. Golub
ABSTRACT In studying stability of solutions of linear differential equations one naturally encounters Liapunov equations. In a suitable setting they can be interpreted as equations for the normalized “directions” of these solutions. When applying discretizations to the Liapunov equations one is led to a problem which in its most elementary form can be stated as: Given a matrix A and a vector b, determine a vector x with xrx = 1 and a scalar p such that Ax - b = px. Here p is called the inhomogeneous eigenvalue. We consider the question of how many solution pairs (CL,X) of this problem exist. We also give some numerical methods to compute such a pair; they are based on (generalizations of) shifted power iterations. Finally we consider the case that jlbll is small so that the inhomogeneous eigenvalues can be viewed as perturbations of the homogeneous ones (i.e. b = 0).
*Supported in part by the Netherlands Organization for the Advancement of Pure Research (Z.W.O.).
LINEAR ALGEBRA
AND ITS APPLICATIONS 88/89:597-531
(1987)
597
0 Elsevier Science Publishing Co., Inc., 1987 52 Vanderbilt Ave., New York, NY 10017
0024-3795/87/$3.50
R. M. M. MA’ITHEIJ AND G. S6DEkLIND
508 1.
INTRODUCTION
In a recent paper [9], the authors considered asymptotic estimates of solutions to linear differential systems. The technique used to obtain the estimates was based on a generalized kinematic similarity transformation [7], which decouples the given system
0.1)
ti = L(t)u + g.
Thus, by introducing a new dependent variable y defined by the generalized Liapunov transformation x( t ) = T( t ) y( t ), one easily finds that y satisfies the differential equation
(1.2)
tj = (T_‘LT
- T-‘l;)y
+ T-lg.
In [9] it was shown that the matrix T can be chosen so that
0.3)
T-‘LT-T-If=
A.
where A is a diagonal matrix. The transformation matrix T is nonsingular for all t and is continuously differentiable. Furthermore, T is uniformly bounded, although T- ’ need not be. This is a relaxation of the classical requirements for Liapunov transformations (cf. [2, p. 1171) which were found to be unnecessarily restrictive for our purposes in [9]. We say that A contains the kinematic eigenvalues of A with respect to T. The Liapunov equation (1.3), rewritten as (1.3’)
?=LT-TA,
can be used for stability investigations. Thus, the asymptotic stability properties can be expressed in terms of the largest element of A, i.e. the largest kinematic eigenvalue. If this is denoted by A and the corresponding vector by Z, it is clearly seen that these quantities satisfy
(14
i = Lz
- AZ.
The purpose of this paper is to discuss some methods for the practical computation of X. Rather than solving yet another differential equation in order to find h, we shall transform (1.4) into an algebraic problem by discretizing the derivative appearing in the equation. We shall not discuss the
ON INHOMOGENEOUS
EIGENVALUE PROBLEMS. I
509
choice of discretization; see e.g. [8]. Suffice it to say that any finite-difference approximation yields a problem of the structure 0.5)
px=Ax-b,
where A is a matrix related to L but depending on the particular discretization, p and x are approximations to X and z(t) respectively, and b is a “constant” vector depending on previously computed approximations to Z. Note that if A is n X n, then (1.5) provides n equations for the n + 1 unknowns p and x. The system is therefore underdetermined, and we have to impose an extra condition in order to obtain a welldefined problem. We shall ask that x be a unit vector in Euclidean norm, i.e.
(1.6)
ll4ls = 1.
Numerous other normalization requirements could be used (e.g., x could be required to have the same nom? as b), but we find conditions of the latter type inappropriate, since it is desirable that the solution of (1.5)-(1.6) approach an eigenpair of the matrix when b + 0. We note that (1.5)-(1.6) may be regarded as an eigenvalue problem with an inhomogeneity added to the right-hand side. For this reason we shall refer to this problem as the inhomogeneous eigewalue problem. We note that problems quite closely related to this have been considered by Golub [4] in connection with investigations of stationary values of quadratic forms satisfying constraints of the type (1.6). Other related problems are dealt with in Gander [l], where least-squares problems with quadratic constraints are considered. Indeed, we may think of the inhomogeneous eigenvalue problem (1.5), (1.6) as a Lagrangian system (in which p is the Lagrange multiplier) obtained by trying to solve 0 = Ax - b, subject to the quadratic constraint rTr = 1. Thus we minimize ]IAx - bllz over the unit sphere, and the corresponding value of the Lagrange multiplier is of particular interest. It is interesting to see that the solution to the problem (1.5)-(1.6) is not invariant with respect to resealings of the solution vector X. This stands in sharp contrast to the properties of the “usual” homogeneous eigenvalue problem, the solution of which is independent of the way in which the solution vector is normalized. One may then ask whether or not it is justified to solve our original problem (1.4) in the proposed way. The answer is yes, because (1.4) is indeed homogeneous in the sense that if z(t) is a solution, then so is oz(t), and they both correspond to the same kinematic eigenvalue A. In [9] the transformation matrix was constructed in such a way that the
510
R. M. M. MA’ITHEIJ AND G. SijDERLIND
columns all have unit Euclidean norms, i.e., z(t) has unit length. Thus it is most natural to impose (1.6) in the application under consideration. At present, we are not aware of any other applications in which the inhomogeneous eigenvalue problem arises, but one should keep in mind the different possibilities in choosing the normalization requirement. As will turn out in the sequel, the actual computation of p and x is less straightforward than might be thought beforehand, viewing the inhomogeneous eigenvalue problem as a generalization of the homogeneous one. As we shall see, the character of the former may already cause a breakdown in the generalized power method, whereas the inverse power method (with shifts) only has a formal relationship. Besides this inhomogeneous eigenvalueeigenvector problem and its related computational methods, it is appealing to look for a generalization of the Schur form of the analogous matrix problem (cf. [8, 91). In our context this means, given matrices A and B, that we would like to find an orthogonal matrix Q and an upper triangular matrix U such that AQ=QU+B.
(I.71
For reasons of numerical stability (cf. [6, lo]), this should be preferred to trying to find a matrix T with unit column vertcrs and a diagonal matrix D such that AT=TD+B, as was mentioned in [9]. The existence and computation of such Q and U require a separate analysis, which we intend to deal with in paper II of this series [lo]. The paper is built up as follows. In Section 2 w-e consider basic properties of the inhomogeneous eigenvalue problem and the question of existence and uniqueness of solutions. In Section 3 we discuss a power iteration method and analyze its convergence properties. Then, in Section 4, we consider an inverse power iteration. Finally, in Section 5 we develop a first-order perturbation analysis for the inhomogeneous eigenvalue problem considered as a perturbation of the homogeneous problem. 2.
EXISTENCE
For convenience, we shall reformulate the inhomogeneous eigenvalue problem (1.5)-(1.6) as (2.1)
/LX= Ax-b, x5 = 1.
ON INHOMOGENEOUS EIGENVALUE PROBLEMS. I
511
We shall now investigate the existence of solutions to (2.1), and in this discussion we limit ourselves to the real case. As we shall see later on, a solution in the complex domain is of little interest in our application. The question of the existence of solutions to (2.1) is quite closely related to the classical Fredholm theory for integral equations. For the moment, we shall consider the solvability of Z.KK = Ax - b when p is a given parameter. Thus, given a CL,we have the following Fredholm alternative: (1) A unique ~LX= Ax only has matrix A. (2) If ZJ is an orthogonal to any Clearly, in the can be directly 9( 0) and &“(a) Then it follows
solution exists if and only if the homogeneous problem the trivial solution x = 0, i.e., ZJ is not an eigenvalue of the eigenvalue of A, then a solution exists if and only if b is corresponding left eigenvector of A.
first case the resolvent operator (A - pZ)-’ exists, so that x expressed in terms of A, p and b. In the second case, let denote, respectively, the range and nullspace of an operator. that a solution exists if and only if
Let y be a left-hand eigenvector of A corresponding to the eigenvalue ZJ, and note that y E N(AT - pZ). We then obtain yTb=yT(A-pZ)x=O, which proves the second alternative. It is worthwhile to note that yTb = 0 is only a necessary condition in order to have an inhomogeneous eigenvalue that coincides with an eigenvalue of A. However, the case yTb = 0 for some left eigenvector y requires special attention, and we shall refer to this as the degenerate case. In (2.1), we have to solve for x and (L simultaneously. The sole purpose of assuming that p is given is that the Fredholm alternative reveals the structure of the case when the inhomogeneous eigenvalue is also an eigenvalue of A. Dropping the assumption that p is a given parameter, we can still write the solution formally in the following way:
(2.2)
x=(A-pZ)-lb.
We note that the resolvent operator is a meromorphic function of Z.L,i.e., the solution given by (2.2) is well defined and smooth away from the spectrum of A. The problem of solving (2.1) then amounts to finding a value
R. M. M. MATTHEIJ AND G. SijDERLIND
512
FIG. 1.
f(p) = ll(A - /dm’bll,.
of p that makes r a unit vector in Euclidean norm. Consequently, the question of existence of solutions can be rephrased in terms of the solvability of the single nonlinear equation
(2.3)
l=
(I(A - pZ)-?(I,.
Assume that A and b are real and let p be a complex number. Evidently the expression on the right-hand side of (2.3) tends to zero as p tends to infinity. Similarly, it becomes arbitrarily large when p approaches any one of the eigenvalues of A unless b is orthogonal to any corresponding left eigenvector. By continuity it follows that (2.3) defines a nonempty set of “level curves” in the complex plane that are symmetric with respect to the real axis. All points on such a level curve correspond to solutions to (2.1). See Figure 1 for an example where A is of third order. Clearly, a level curve is closed, since it would otherwise pass through infinity. It therefore follows that if A has at least one real eigenvalue and b is not orthogonal to any corresponding left eigenvector, then there must exist at least two real solutions to (2.3). If b is orthogonal to that eigenvector, or if A has no real eigenvalues, the problem requires further analysis. For example, roots may still exist in pairs (some of which can coalesce to form double roots) or they may be nonexistent. In any case it is obvious that a solution p to (2.3) also defines a solution vector x via (2.2). To further clarify these matters, we shall give a few examples that illustrate the different cases. EXAMPLE 2.4.
(2.5)
Consider the inhomogeneous eigenvalue problem
ON INHOMOGENEOUS
EIGENVALUE PROBLEMS. I
513
where v is a real parameter. A has no real eigenvalues, and real solutions exist if and only if v2 3 i. If v2 = +, then p = 0 is a double root with x=(v, - v)r. For y2 > +, the two solutions are
EXAMPLE 2.7. Consider the problem
(2.8)
?[i
_H+- [g],
&=l.
A has a real eigenvalue 1, and b is orthogonal to the corresponding left eigenvector. The solution p is given by (24, so that once again there are real solutions if and only if v2 >, i. The corresponding solution vectors x have a zero first component; the second and third components agree with the solution in (2.6). Thus, when v2 is in the interval (i, l), the two solutions p are located to the left of the real eigenvalue. For v 2 = 1 one of the inhomogeneous eigenvalues coincides with the real eigenvalue of A. Note that x is not an eigenvector of A in that case (if it were, b would be zero and the problem would be a homogeneous eigenvalue problem). Finally, if v2 > 1, there is one real solution p on either side of the real eigenvalue 1. Normally (i.e. when the problem is not degenerate) the real solutions would always appear pairwise on either side of the real eigenvalue (so there are at most 2m inhomogeneous eigenvalues if A is of order m). The effect of a degenerate b is that it “removes” one of the singularities of the resolvent operator. In general, the inhomogeneous eigenvalues do not have the interlacing property that between two real eigenvalues of the matrix there are two inhomogeneous eigenvalues, although this may often occur. To clarify these statements we have drawn a few graphs of f(p) = \!(A - pI)-‘bjl, for some typical cases (m = 3). In Figure 2 we have a nondegenerate situation where the number of inhomogeneous eigenvalues is smaller than 2m, and finally, in Figure 3 we have sketched a degenerate problem. One should note that rather complicated bifurcations occur if b is replaced by ab, a scalar. Thus for some (Y< 1 the inhomogeneous eigenvalues /.L 1 and p2 in Figure 3 disappear, but pa,p4 and ps, p6 are continuous functions of LXas IY--+0. Conversely, for some (Y> 1, p2, pa, p4, and p5 vanish, whereas pi and /.~s vary continuously with an increasing 0~.
R. M. M. MATTFIEIJ AND G. S6DERLIND
514
Al
I4
PL2
f(p) = [[(A - pZ)-‘bit,,
FIG. 2.
CL1 FIG. 3.
P2 Al
13
PLj A2
I.4 x2
f(p) = ll(A - pZ)-‘bll,,
p real, nondegenerate
I.4
p5
A3
PLq case.
p6
p real, degenerate case.
We summarize
these results in the following
THEOREM 2.9.
Let A be of order m. For the number of real inhomogek say (counted with their multiplicity) we have
neous eigenvalues, (i) k is even; (ii) 0 < k < 2m.
If all eigenvalue of A are real and k = 2m, then it follows that the largest inhomogeneous eigenvalue is larger than the largest eigenvalue of A and the
ON INHOMOGENEOUS
515
EIGENVALUE PROBLEMS. I
smallest inhmwgeneous eigenvalue is smaller than the smallest eigenvalue of A. The number k can be zero only if rw eigenvalue of A is real. Proof. Consider f’(p) = ll(A - pZ)-‘bll$ Let T be a matrix of which the column vectors are the principal vectors, i.e. such that T-‘AT = J has the Jordan form. Let A have k eigenvalues x . . , A, with multiplicity p,, . . . , pk respectively. For the Jordan block Ji associated with Xi, i.e. r,
xi
Ii =
we find that
.
-1
I.
So we find that the vector (A - pZ)-‘b = T-‘(JIJ.Z)-‘Tb has coordinates typically of the form E~=~C~:~(Y~~/(X~- CL)’for some ais. Hence
fQ)=
i 5 t z i=l s=l j=l t=l
Pisjt
for some fiisjt.
(Xi-/.b)S(Xj-~)t
Now consider f 2(p) = c (some value E R + ). Multiplying through both sides of this equality by p’(p), with p the characteristic polynomial, gives an equation for p of degree 2m. n Before we conclude this section we point out that it was shown in [8] that the kinematic eigenvalues of any linear (nonautonomous) differential system are real. Since one of the goals of this investigation is to present possible ways of computing numerical approximations to these kinematic eigenvalues, it is evident that complex solutions are of no interest. Moreover, since the level curves mentioned earlier are symmetric about the real axis when the data of
R. M. M. MATTHEIJ AND G. SijDERLIND
516
the problem are real, we cannot see any reason why one should prefer one particular complex solution to another (recall that all points on the level curve are solutions). Therefore we shall only consider real solutions in the sequel. Golub [5] has pointed out that the inhomogeneous eigenvalue problem can be viewed as a quadratic eigenvalue problem. When or.is real, Equation (2.3) can be written br(A-pZ)-r(A-PI)-‘b-1=0. The quantity on the left-hand side is the Schur complement of the matrix
b
(A-/LZ)(A-~Z)~
1
b*
1.
Since this matrix must be singular, we have det[(A-pZ)(A-PI)*--bb*]
=O
which reduces to det[$Z+@‘+Q]
=0
where P = - (A + AT), Q = AAT - bb*. Evidently there is an x such that
,u2x+ pPx + Qx = 0. Introducing
y = PLXand transforming to companion matrix form, we obtain
bb’)AA*
.:A*][;]
=‘[
;I7
and our problem has been rewritten as a standard 2n x 2n eigenvalue problem. (As we have already mentioned, only real eigenvalues 1-1are of interest.) However, since this approach has no simple counterpart in differential equations, we have not pursued its analysis further.
ON INHOMOGENEOUS
3.
POWER
EIGENVALUE PROBLEMS. I
517
ITERATION
We shall now proceed to investigate different methods of computing inhomogeneous eigenvalues. Since the simplest method for the computation of an eigenvalue of a matrix is the power iteration, it is natural to generalize that process to the inhomogeneous case. The proper generalization of the power method to the inhomogeneous eigenvalue problem is the iteration (3.1)
Yn+l:=Ax,-b, P”+l:=
CYn+13
x n+l:=
skdk+l> ~
Ynil
llYn+1112*
Evidently, this process generates a sequence of normalized vectors. The vector sequence and the sequence of p-values will converge under certain conditions to a solution to (2.1). It is necessary to use the information provided by the sign of p in the construction of X, since if p is negative, then the angle between x and y is obtuse. In such a case the sign of p ensures that x and y become antiparallel. If the algorithm were not constructed in this way, it would not be able to converge to a solution with p < 0. From the previous discussion of the existence of solutions we know that there may be more than one solution to the given problem. Therefore, any convergence result for the power method must be of a local nature, i.e., it is not meaningful to discuss global convergence properties. In our convergence analysis we consequently limit ourselves to deriving a local criterion, expressed in terms of a solution to (2.1), that is sufficient for local convergence to that solution. Naturally the result is based on a contraction-mapping argument. THEOREM 3.2. Let CL,x denote a solution to (2.1), and let P denote the orthogonal projector Z - xx T. Zf the operator PAP/p is a contraction and x0 is chosen sufficiently close to x, then the inhomogeneous power iteration scheme (3.1) converges linearly to the solution CL,x. Proof.
Let P,,+l = Z - x,,+~x~+~ and note that
(3.3)
O=P(Ax-b),
(3.4)
0 = Pn+l(Ax,
- b).
R. M. M. MA’ITHEIJ AND G. SijDERLIND
518
Introduce the error 5, defined by
(3.5)
x, = x + [“.
Neglecting higher-order error terms, the variational equation of (3.4) becomes
o=(z-(~+S,+l)(x+5”+l)“)(A(x+~,)-~) “(P-X.g+1-< .+lXT)(W + A&J = PM,, - p&,+1- %+1xNow, the last term is also of second order, i.e. t,‘+,x = 0(11,$‘,+,112); see Figure 4. Hence the variational equation is
0 = PM,, - ,d,+
1.
Simfiarly, 5, = PI, + Wi,l12), and consequently the error propagation is given up to first-order perturbations by PAP
W)
5 *+1=-
2
FIG. 4.
tl’
n
ON INHOMOGENEOUS
EIGENVALUE PROBLEMS. I
519
It is important to note that the normalization of the vector sequence at every step implies that the scheme (3.1) generates an infinite bounded sequence. Therefore there is at least one cluster point, although this in turn does not imply convergence of the scheme. On the contrary, it may happen that the sequence approaches a limit cycle. In fact, it may even happen that the sequence converges for some initial vector X, whereas for others it approaches a limit cycle. Numerical experiments indicate that aperiodic nonconverging sequences may also result, even if a limit cycle is the normal indication of a nonconverging iteration. If there is a solution with /J> 11AlI s, it immediately follows from Theorem 3.2 that convergence will take place with an appropriately chosen initial vector. However, this is not a necessary condition: it is not even necessary that p be greater than the spectral radius of A, and the scheme may converge to more than one solution. Thus, the well-known phenomenon that the power method for the homogeneous eigenvalue problem will converge only to the dominant eigenvalue has no simple counterpart in the inhomogeneous case. This is clearly illustrated in the following examples.
REMARK 3.7. Note that shifts can easily be incorporated in the scheme (3.1); one merely replaces A by A - KI and p by p - K. EXAMPLE 3.8. Consider the problem where
Th e problem has six real solutions, two on either side of each of the eigenvalues. If the initial x-vector is parallel to the vector (1, - 1,l)r, the method converges to the solution with p = - 11.008. The corresponding inhomogeneous eigenvector x equals approximately (0.9924,0.0999,0.0714)r. For the matrix PAP we thus find
PAP =
- 0.0030 0.1146 - 0.2015
0.1146 - 1.0783 - 0.08449
- 0.2015 - 0.08449 2.9193
1 .
It is interesting to note that PAP has eigenvalues approximately equal to the diagonal elements. In fact it turns out that p(PAP) = 2.94. Hence we expect a convergence rate = + 0.267. If, on the other hand, the initial vector
R. M. M. MA’ITHEIJ AND G. SijDERLIND
520
is parallel to ( - 1, - 1,0.5)r, the method converges to the solution p = - 8.9885. Convergence does not occur for the other solutions, showing that there is also a dominance phenomenon in the inhomogeneous case. In this example, the matrix A and the vector b are such that both inhomogeneous eigenvalues to which convergence was obtained are directly associated with the largest eigenvalue of A. The following example shows, however, that for nondiagonal matrices and matrices which do not have such well-separated eigenvalues, a clear dominance may not be observed. EXAMPLE 3.9.
Consider the inhomogeneous eigenvalue problem given
by
A=
[!; ,t,],
b= [g].
The problem has two real solutions, /.L = 0.291833 and /.L = - 0.091833 (cf. Example 2.4). When (3.1) is applied to this problem with x = (0, - 1)r as starting vector, we obtain convergence to the positive inhomogeneous eigenvalue. If one takes x = (1,O)r as the starting vector, the sequence eventually reaches a limit cycle, with EL, oscillating between - 0.132 and - 0.0674. There is no convergence to the negative inhomogeneous eigenvalue. Example 3.9 shows that convergence and limit cycles may occur in the same system. Since a limit cycle is the natural sign of divergence, we also conclude that there are cases which do not exhibit a dominance phenomenon, and that convergence of the power method (3.1) necessarily is of a local nature. As we shall see in Section 5, the dominance observed in Example 3.8 can be attributed to the size of the vector b; it is “small,” and therefore the inhomogeneous eigenvalues are close to the homogeneous ones. The dominance of the inhomogeneous iteration can then be investigated in terms of a perturbed power iteration. In the second example b is not “small,” and the two real inhomogeneous eigenvalues have little to do with the two complex homogeneous eigenvalues.
4.
INVERSE ITERATION
For homogeneous eigenvalue problems, inverse iteration has the advantages that one can obtain convergence to any one eigenvalue and that the convergence rate can be enhanced by using shifts in an appropriate way. In
ON INHOMOGENEOUS
EIGENVALUE PROBLEMS. I
521
the inhomogeneous case, however, there is no obvious way of introducing an analogue to inverse iteration. Here we shall describe and investigate two algorithms which both resemble inverse iteration in the homogeneous case. In the previous section, we noted that shifts can easily be used in connection with the inhomogeneous eigenvalueproblem (Remark 3.7). In the following analysis, it is assumed that fi and A represent shifted quantities unless otherwise stated, i.e., for a shift K we have fi := p- K and A := A - KZ. To begin with, we shall study the following algorithm. Introduce the notation
(4.1)
C:=&lb
Then our problem can be written
(4.2)(4 (4.2N-4
pz+c=r, xrx=1.
Now consider the iteration
(4.3)
jinZn+C=xn+l,
where z, = A- lx,, and fi,, is chosen so that x,+ i is a unit vector. Imposing the condition r~+ir,,+i = 1 on (4.3), it can easily be shown that
(4.4)
The sign should naturally be chosen to maximize xz+ix, to promote convergence. Clearly, a real solution fi, exists if cTc < I (the norm of c depends on the shift, but cannot always be made less than unity when the shift is chosen to approximate a certain inhomogeneous eigenvalue p), although this condition is not necessary for convergence, as can easily be seen from a graph like Figure 5. An analysis of the local convergence behavior of this iteration yields the following result.
52.2
R. M. M. MATIFIEIJ AND G. SijDERLIND
FIG. 5. THEOREM 4.5. Let fi, x denote a solution to (2.1), and let P denote the orthogonal projector I - xx T. Zf the operator ,kP(i - bxT)-‘P is a contraction and x,, is chosen sufficiently close to x, then the inverse power iteration scheme (4.3)-(4.4) converges to the solution fi, x. REMARK 4.6. The condition for convergence with a shift K is that (p - K)P(A - KZ- brT)-‘P is a contraction. Clearly, the convergence is faster the closer to p the shift K is taken, unless p is an eigenvalue of A - bxT. Proof. We use the same error notation as in the previous section, i.e. XII = x + E,. In this proof we shall omit A over p and A. Subtracting (4.2a) from (4.3), we obtain (4.7)
t n+l =~A-‘&+(~,,-/J)A-‘x,.
Now, since rT&,+i = 0(115,+i11~), we have, to first-order perturbations, . 0 = ,ux~A-‘&, + (/.L,- /.+TA-‘x,.
ON INHOMOGENEOUS
EIGENVALUE PROBLEMS. I
523
Solving for CL,- p and substituting into (4.7) yields the error propagation formula
(4.8)
Using the relation A -IX = (X - c)/p, we can write A-%X* ---=
(4.9)
XTA -1x
XXT-
cx*
l-x*c
*
In certain problems it may happen that x*c = 1. However, this situation is easily changed by using a shift, and thus we may assume that X*c # 1. Now let P = Z - xx*. We know that [,+r = P.$,+lr [, = P.$, to first-order perturbations, and hence the recursion of interest is
ptnfl=
P( z - ~)PA-~PE,.
Using (4.9), this equation can be written (dropping the orthogonal projector on the left-hand side)
(4.10)
tnfl=
P(Z+ &)p-%,.
By the Sherman-Morrison-Woodbury
(4.11)
z+
formula,
&(z-cx’)-l.
Hence, using the relation AC = b, we have (4.12)
En+l = pP(A
- bx*) -‘P&,
from which the result follows. A useful alternative to (4.3), (4.4) is given by (4.13)
p,(A-KI)-lX,+(A-KI)-l(b-KX,)=X,+l.
R. M. M. MATI’HEIJ AND G. SijDERLIND
524
So we may redefine z, := (A - KZ)-~X, and c( = c,,):= (A - KZ)-‘(b - KX,). The computation effort is much the same as for the first method, and the same holds true for the convergence properties. Next, we shall present an algorithm for the inhomogeneous eigenvalue problem that is based on Newton’s method. The convergence of this algorithm may be quite fast, and we shall see that it is closely related to inverse iteration in the homogeneous case. We shall not, however, analyze its convergence properties. The inhomogeneous problem (2.1) is rewritten in the form (4.14)
Ax-/m-b=& 1 - x*x ~ =o. 2
Introducing the partitioned vector y, defined by (4.15)
Y=
[
b1
the problem (4.14) can be viewed as a nonlinear equation F(y) = 0. The Jacobian of this system is
(4.16)
A-p,,Z
-x,
- xf
0
F’(Y”) =
Defining B, = A - p,Z, and denoting the residual by
(4.17)
we can write Newton’s method in the following way: (4.18)
B,,~x, - x,&,, -x*sx n
= r,,,
” =p n,
and (4.19)
Xnil
=
Lx” +
lL+l=l4t+~PL,~
ax,,
ON INHOMOGENEOUS
EIGENVALUE PROBLEMS. I
525
Thus each iteration consists of solving the system (4.18) and updating the approximation according to (4.19). The first equation in (4.18) can be written (4.20)
B,6x,
= X,&L, + r,,,
showing that the correction Sx, is an affine function of Sp,,, i.e., (4.21)
Sr, = P, + %l4-%~
where (4.22)
%P,
= r n,
%q,
= xn-
Thus we may solve for p, and q, using the same system matrix B,. Once these vectors are found, (4.21) is substituted into the second equation of (4.18). This immediately yields the correction S/J,,:
S/l,=
(4.23)
-
Pn + CP, G7,
.
The method we have proposed here is closely related to inverse iteration in the following way. Suppose that b = 0. Then it follows that p, = - x, and q,, = x,+ 1, where (4.24)
x”+~=(A-P,,Z)-lx..
By (4.23), the updating formula for p in (4.19) can be written
(4.25)
Hence, if the last component of the residual is small (i.e. XL = l), we obtain approximately
(4.26)
pFln+1=
Pn +
x:x, x;(A-,u”Z)-~X,
p, -=c1, or
526
R. M. M. MATTHEIJ AND G. SbDERLIND
in the homogeneous case. We note that (4.24) and (4.26) correspond to inverse iteration of Rayleigh quotient type for homogeneous eigenvalues. Comparing the two methods discussed here, we see that the Newton method is more expensive, since it requires that one solve two linear n X n systems per iteration. (It is only in the homogeneous case that the first of these systems has the simple solution p, = - x,.) Naturally, the cost of the Newton iteration can be reduced to some extent by using a modified Newton iteration; this correspond to using the same shift in several consecutive iterations. Finally, one should note that the Newton method does not generate a normalized sequence of x-vectors. In a practical situation we take the shift equal to the most recent approximate inhomogeneous eigenvalue. From the proof of Theorem 4.5 we then deduce that CL,,- p = [rr(A -p .-~~)-l~,I-‘~T(A - ~,-d-%n(~n-l - p)+O(llE,+,II~). Assuming ll~,lls = O(p,_, - p), we then may draw the conclusion that we have a quadratically convergent process, as in the homogeneous case. This is demonstrated in the next example. EXAMPLE 4.27.
Let A
=
0.01 (0
0.01 0.005)’
b= K::).
This problem has an inhomogeneous eigenvalue p = 0.15397.. . and corresponding eigenvectors 1c= ( - 0.74121, - 0.67127)r. For the sequence of errors we obtain n
P-P,
1 2 3 4 5
5.43 - 1.73 1.53 - 9.1E 1.23
-
2 5 8 16 16
The error p - ps is almost entirely due to rounding errors (IBM FORTRANdouble precision).
5.
PERTURBATION
4341
ANALYSIS
In this section we shall consider the inhomogeneous eigenvalues as perturbations of homogeneous eigenvalues when the inhomogeneous term b
ON INHOMOGENEOUS
EIGENVALUE PROBLEMS. I
527
is small. We shall develop a first-order perturbation analysis where the perturbation parameter is the norm of b. We restrict ourselves to an analysis of how a small inhomogeneity will affect a simple real eigenvalue of the matrix A. Thus we shall not discuss multiple or defective eigenvalues. Let b be a given unit vector, and consider the inhomogeneous eigenvalue problem (5.1)
px=Ax-sb,
xTx=l
as the perturbation parameter E + 0. Next, we express x and Z.Las asymptotic series in .s: (5.2)
X -
xXiEi,
p
&hiEi.
-
We shall prove the following result: THEOREM 5.3. Let p. be a simple real eigenvalue of the matrix A with a unit eigenvector x0. The solution to the perturbed problem (5.1) is then represented by the following asymptotic series:
(5.4) (5.5) where (A - p0 Z ) + denotes the Moore-Penrose geizeralized inverse of A - Z.L ,I (cf. [6, p. 1391). The choice of sign in x0 gives two diffment solutions, i.e. each simple real eigenvalue is split into a pair of solutions. Proof. Substitute the formal series (5.2) into (5.1). Equating like powers of E, we obtain
&O: El: E2:
pox1
pox2
+
PlX1-t
I-W,,
=
Ax,,
+ ~1”~
=
Ax, - b,
ELZXO =
Ax27
and so on. The first equation defines po, x0 as the unperturbed eigenpair under consideration. The normalization of this eigenvector yields xExo = 1,
R. M. M. MATTHEIJ AND G. SijDERLJND
528
and the normalization of the solution x gives (neglecting higher-order terms)
Thus we must have +,
= 0,
x;l;c,+!2x;r,=0,....
Clearly, the first-order perturbation xi of the eigenvector x0 is orthogonal to x0. Forming inner products with x0 in the equation for er and sohing for pi yields pi = x;(Ax,
(5.6)
- b).
Evidently Z.L~may be 0, e.g. if A = AT and xib = 0, in which case x:Ax, = (ATxO)%i = /~a& = 0, or, more generally, if Ax, - b should be orthogonal to xa. xi can now be determined by substituting the expression for pi into the equation, derived from the &‘-term: pox1
Denoting
I - x0x:
+
x,&Ax,
-
b) = Ax, - b.
by Pa, this equation can be written
P,b = (&A - P&,, or, since x1 = Poxi and Pa is idempotent,
P,b = &(A
- poZ)Poq.
There is a consistent singular system, and one solution is therefore given by (5.7)
xi=
[&(A-/QZ)P,]+P,,~,
where + denotes the Moore-Penrose inverse. It remains to simplify (5.7). Since pclois a simple eigenvalue, the unique xi satisfying the orthogonality requirement XTX~= 0 is given by (5.8)
q=(A-poZ)+b.
ON INHOMOGENEOUS
529
EIGENVALUE PROBLEMS. I
Now let U be a matrix consisting of orthonormal column vectors such that UTx,=O. Then P,,=UUT and
[U 1 ~o]=Gh,$)[ U 1 G] =%= [;
;],
where S, is a nonsingular (n - l)st-order matrix. Hence
Po(A-PZ)P,,=UU=[ U 1 G][; =UU=
s [o
;
;][u 1 x,]=uu=
1
UUT=A-p,,Z.
Consequently P,(A-~,,Z)+PO=(A-~OZ)+=U
[
‘6’
;
1
UT.
From this we deduce that the requirement xTrO = 0 is fulfilled. Substituting (5.8) in (5.7) and (5.6) proves the result.
n
Next, we shall proceed to give a brief discussion of the implications of Theorem 5.3 as regards the power iteration presented in Section 3. We may thus give a partial explanation of the dominance phenomenon observed in Example 3.8 as well as a quantitative analysis of the convergence rate when b is small. According to Theorem 3.7, convergence is governed by the spectral properties of the operator PAP/p, where P is the orthogonal projector associated with the solution to the inhomogeneous eigenvalue problem. Therefore we have to investigate the properties of this operator in terms of the asymptotic representation of the solution in order to assess the convergence when b is small. By direct calculation, it can be verified that
(5.9)
PAP=P,A+e[
-( x1x; + &)A
- (P,A - /q,Z)x,x,T]
,
neglecting higher-order terms. Hence the spectral properties of P,A will be decisive when E is small. Once again, we assume that pa, x0 is a simple eigenpair. Clearly, P,Ax,
= Popox
= p,,P,,x, = 0.
R. M. M. MA’ITHEIJ AND G. SODERLIND
530
Since ZQ, is simple, we can find an orthogonal similarity transformation [U]xO] such that
We find that S = UTPoAU, since P,Ax, and it follows that
= 0 and x:P, = OT. Now, P,U = U,
S = UTAU. Consequently,
we have S - AZ = UT(A - XZ)U, and
S-AZ 0
0
(A-XZ)[ 1
U 1 &,I.
Hence S has the same eigenvalues at A, except for the simple eigenvalue po. Moreover P,A has the same eigenvalues as S, and one zero eigenvalue. Thus, by (5.9), the eigenvalues of PAP/p are O(E) perturbations of the eigenvalues of A (except for I*,,) divided by 1-1.This agrees with the theory for power iteration in the homogeneous case. If we consider an inhomogeneous problem in which b is small, and we are looking for the inhomogeneous eigenvalue which is a perturbation of the largest homogeneous eigenvalue (i.e. p. = Ai, where 1h 1I> IA 2j > . . . > IX nI), then the convergence is essentially governed by the ratio
(5.10)
I; I
+0(E),
also in agreement with homogeneous power iteration,. We have avoided to estimate the first-order perturbation term, since the computations become quite complicated. We remark that (5.10) apparently enables convergence to two different solutions (corresponding to the sign ambiguity in the associated eigenvector) as long as the ratio (5.10) is less than unity. This was also observed in Example 3.8. However, one should keep in mind that this convergence analysis is only valid when b is small compared to other parameters, and in the general case, convergence may still take place to more than one root.
ON INHOMOGENEOUS
EIGENVALUE
PROBLEMS.
I
531
We would like to thank G. W. M. Staarink for helping us to prepare some of the numerical examples, and an anonymous referee for a very careful reading of the manuscript. REFERENCES
10
W. Gander, Least squares with a quadratic constraint, Numer. Math. 36:291-307 (1981). F. R. Gantmacher, The Theory OfMatrices, Vol. 2, Chelsea, New York, 1959. G. H. Golub and R. Underwood, Stationary values of the ratio of quadratic forms subject to linear constraints, Z. Angew. Math. Phys. 21:318-326 (1970). G. H. Golub, Some modified matrix eigenvalue problems, SZAM Rev. 15:318-334 (1973). G. H. Golub, private communication. G. H. Golub and C. F. van Loan, Matrix Computations, Johns Hopkins U. P., Baltimore, 1983. E. J. Harris and J. F. Miles, Stability of Linear Systems: Same Aspects of Kinematic Similarity, Academic, New York, 1980. R. M. M. Mattheij, Riccati type transformations and decoupling of singularly perturbed ODE, in Stiff Computation, (R. Aiken, ed.), Oxford U. P., 1985. G. Soderlind and R. M. M. Mattheij, Stability and asymptotic estimates in nonautonomous linear differential systems, SZAM J. Math. Anal. 16:69-92 (1985). G. SSderlind and R. M. M. Mattheij, On inhomogeneous eigenvalue problems. II, to appear. Received 29 April 1986; revised 16 June 1986