19 Homotopy and continuation methods For iterative methods for solving nonlinear operator problems, the difficult and important problem remains of finding an initial guess that is close enough to a solution so that the iterative method converges to the solution. Sometimes this can be done by using additional information about the operator equation derived from a close examination of the particular operator equation involved, or from a crude approximation to a solution found by other means. An elegant approach to the problem of finding a sufficiently close initial ap proximation for a given iterative method to converge is afforded by 'continuation' methods based on the concept of a homotopy. In this chapter, we give an intro ductory treatment of this approach. The basic idea is this: we write an operator equation that we can solve which somehow resembles the one we want to solve. We then find a continuous transformation (a homotopy) depending on a parame ter A such that, for λ = 0, we have the problem we can solve, and for A = 1 we have the problem we want to solve. Thus, we have a continuous transformation of a solvable problem into the problem of interest. We then subdivide the interval [0,1] into a finite number of subintervals, and take the solution (or an approximate solution) at A, as an initial guess for the solution of the problem at A,+i in some it erative method. We hope, in this way, to find a suitable initial guess for the original problem, which corresponds to A = 1. In practice, the approach is most successful if we take a problem corresponding to A = 0 which closely resembles the problem we want to solve, but has a known solution. A family of continuous mappings hx:X-^Y,
Ae[0,l]
Ch. 19]
Homotopy and continuation methods
139
is called a homotopy if the function H:Xx[0,l]->Y defined by H(x,X) = hx(x),
xeX,
λ€[0,1]
is continuous (with the product space topology on X x [0,1]). See Hu [25]. Here X and Y can be any two topological spaces (for instance Banach spaces). The maps ho and h\ are called, respectively, the initial map and the terminal map of the homotopy h\. Two maps, f\X->Y
and
g:X^Y,
are said to be homotopic if there exists a homotopy hx such that h0 = f
and
hi = g.
Then / can be changed continuously into g. In this chapter we consider the use of homotopies in successive perburbation methods, often called continuation methods. By an open set S in a Banach space B we mean a subset S of B such that every point y in 5 has a δ(ν) > 0 neighborhood contained in 5. Thus 5 is open if y in 5 implies that, for some 8{y) > 0, Νδφ)
= {ζ€Β:
\\z-y\\<ô(y)}ÇS.
Note that B is open (in itself). Suppose we are interested in finding a zero of a continuous mapping P:SQBX->B2 where S is an open set in B\. Let XQ be any element of S and consider the homotopy hxiy) = H(y,X) = P(y) + (A - l)P(x0) with initial map ho{y) =
P(y)-P(xo)
and terminal map hi(y) = P(y). Clearly, A0 has a zero at y = XQ. Under various conditions, H(y,X) will have a zero for each λ in [0,1], and the zero xo of H(y,0) may lead to an approximation to a zero of H(y, 1) = P(y) if we can approximately follow a curve of zeros χ(λ) satisfying Η(χ(λ),λ) = 0, λ€[0,1]
140
Homotopy and continuation methods
[ Ch.
with JC(0) = JCO. We can write H(y,X) = Q(y) + XP(x0), where Q(y)=P(y)-P(xo). Suppose that, for every y in 5, P'(y) exists and A(y) is a bounded linear operator from Bi into B\, and that \\A(y)\\ < a for all y in 5. If there is a continuous curve of zeros χ(λ) in the open set 5, then we have the following discrete continuation method. Theorem If, for each X in [0,1], there is a δ(Χ) > 0 such that
ΐμωη*(Α))-/||<θ<ι for every y in Ν§[χ){χ(Χ)) (with Θ independent of X andy), then for any ε > Othere are integers M,N\,N2,..-,NM, and numbers 0 < X\ < X2 < ■ ■ ■ < XM = 1 such that \\χ{1)~χΜ,ΝΜ\\<ε where
xjM l = Fj (-*/,*)>
k=
0,l,...,Nj-u
Xj+ifl=Xj,Nj,
j=l,2,...,M-l,
wHhFjdet\nedbyFj(y)=y-A(y)H(y,Xj),j=\,2,...,M. Proof. From the theorem on p. 131, for any Xj in [0,1] there is a S'(Xj) > 0 such that Xj,k+l
=
Fjixj,k)i
fc
= 0,1,...
converges to x(Xj) from any Xj$ in {y:
\\y-x(Xj)\\<8'(Xj)}.
From the assumed continuity of x(X) on the compact set [0,1] it follows that there is a positive lower bound 0 < δ < ö'(Xj) for Xj in [0,1]. From the uniform continuity ofx(A) on [0,1] we can choose Δλ small enough so that ||JC(AJ+I)— *(λ ; ·)|| < δ for 0 < Xj+\ —Xj
\\y-x(Xj+l)\\<8}.
Homotopy and continuation methods
19]
141
Thus */+1,4+1 -Fj+l
(Xj+l,k)
converges to x(kj+i) from any Jt/+i,o that is close enough to χ(λ ; ) to be in 5χ.+ι. This implies the existence of the integers N\ ,Λ/2, · · · ,ΝΜ in the conclusion of the theorem. D With further assumptions for specific algorithms, we can obtain more precise information. Let us consider, for example, in more detail such an argument as applied to Newton's method. Suppose that
and ||[ΡΌ0ΓΊΙ < B
pM(y)\\
for all y in S. This time, without assuming the existence of zeros of H(y,X) in advance, let us seek sufficient conditions to guarantee that a zero exists and that XQ is a safe starting point for the convergence of the Newton iterates to a zero χ(λ\) of H(y,X\) for some λ\ > 0. In fact, let us seek conditions such that χ(λ) is a safe starting point for convergence to a zero χ(λ +Αλ) of H(y,X +Αλ) if Η(χ(λ),λ) = 0. In order to apply the Kantorovich conditions (see Chapter 17) here, with H(y,X+Ak)=H(y,X)+AXP(xo), and H(y,X) = P(y) +
(X-l)P(x0),
we can take η>Β\Αλ\\\Ρ(χο)\\>\\χι-χ(λ)\\, where χι =χ(λ)-[Η'(χ(λ),λ+Αλ)}"ιΗ{χ{λ),λ since H'(y,X) = P'(y), and Η{χ{λ),λ+Αλ)
= ΑλΡ(χ0).
Now, for h=
Br\K<-,
we can require that hQ =
B2\AX\\\P{xo)\\K<-.
Finally, we require that Ng(x(X)) Ç S where
+ Αλ),
142
Homotopy and continuation methods
[Ch.
For small enough ΔΑ > 0 we can satisfy the above inequality with ff^_
1-(1-2Βΐ\Αλ\\\Ρ(χ0)\\κγ/\^ο_ BK
Thus, if Ng(xo) Ç S for δ satisfying the above inequality, and if /*ο = β 2 |Δλ|||Ρ(*ο)||Κ<1/2 holds, then the Kantorovich theorem applies and H(y,AX) has a zero in Ν§(χο) to which the Newton iterates converge. Call λ\ — AX and denote by χ(λ\) the zero of Η(γ,λι) ΐηΝδ(χ0). We have H(χ(λι),λχ) = 0. If again Ns(x(Xi)) Ç S for some δ as above, then again the Kantorovich theo rem applies, and Newton's method will converge to a zero xfa) of H(y, λι) = 0 in Νξ(χ(Χ\)) where X2 = X\ +AX. Furthermore, since δ > 0, the Newton iterates will also converge to x(Xi) from an approximation to x{X\) (in fact, from any point in N$(x(X\))). Finally, we have the following result. Theorem If there is a ΔΑ > 0 such that ho < 1/2 and Ns(x{Xj)) Ç 5 for some ôj satisfying 8j > So, Xj+\ — Xj + Δλ, then the conclusions of the previous the orem hold, putting A(v) = [^(y)] - 1 ; given an ε > 0, a finite series of Newton iterations will produce an approximation XM,NM t o a z e r o x(\) of P in 5 such that ||x(l) —XM,NM\\ < ε. Example Let B\ and ^2 be the real line and consider the mapping P(y) = ln(l+y) defined and continuous on the open set S = (—b,a) for any 0 < b < 1, a > 0. We have [/"OOl-^i+y,
P{2)(y) = j ^ 2 ,
so we can take
*=(rr^ | p ( 2 ) w | '
*=i+«> ror'i,
^s.
The exact region of safe starting points for Newton's method to converge to the zero of P at y = 0 is easily found to be the open interval (—1, e — 1 ). The Kantorovich conditions will be satisfied in some subinterval. Using the theorem above with H(y,X) = ln(l +y) + (A - l)ln(l +* 0 ) for any S = {-b,a) with 0 < f t < l , a > 0 and any x0 in 5, we can satisfy ho < 1/2 and δ>δο with (l-b)2 AX < 2|1η(1+*ο)|(1+α) 2
19]
Homotopy and continuation methods
143
and -b < x(X) - δ < χ(λ) + δ < a where 1-(ΐ-Τ^#Δλ1η(1+*0))1/2 } δ > ^ y-^-rr — > 0. —
1+α
For any —1 <— b < xo < a < °° v/e can take Αλ sufficiently small to satisfy the above inequalities as long as χ(λ) remains in the interval (—b,a). We can check directly in this simple example that Η(χ(λ),λ)
=0
for ln(l +χ(λ)) + (λ - l)ln(l +*o) = 0, thus (1+χ0γ-λ-1.
χ(λ) =
We can check that Jt(0) = XQ, x(l) = 0, and χ(λ) lies between XQ and 0 in this example. So the discrete continuation method, using Newton's method (A(y) = [^(v)] -1 ) at each step, in this example, can be made to converge from any Jto > — 1 to the zero of P. Whereas the ordinary sequence of Newton iterates converges from XQ to a zero ofP only if — \
+ Ρ(χ0) = 0
(assuming that a differentiable curve of zeros exists, for the moment). We can rewrite this as the differential equation χ'(λ) =
-[Ρ'(χ(λ)Τ>Ρ(χο)
which, together with x(0) = XQ, can be viewed as an initial value problem. If we can find an approximate solution at λ = 1 then this can be taken as an approximation to a zero of P, since P(x(\)) = P(x{0))+ f Jo
Ρ'{χ{λ))χ\λ)άλ.
144
Homotopy and continuation methods
[Ch.
Davidenko's method [47] consists of finding an approximate solution to the initial value problem at λ = 1 by using numerical integration techniques (Runge-Kutta, etc.). For instance, if we apply Euler's method we obtain the algorithm χ{λ+Αλ)=χ(λ)-Αλ[Ρ,(χ(λ)))-ιΡ(χ0), x(0)=x0. If we put Αλ = \/N and denote ** = x(kAX), then Λ 4 + ι=^-Δλ[Ρ'(Λ 4 )]- 1 Ρ(Λο),
* = 0,l,2,...,tf-l.
For N = 1, the algorithm above is just Newton's method! In any case, we can take the result XN after N steps as an approximate solution to P(y) = 0 obtained starting from XQ. Rail's report [47] contains existence and convergence theorems, computational results, illustrations, translations of Russian papers by Davidenko, and numerous references. Ortega and Rheinboldt [44] (especially §7.5 and §10.4 on continuation methods) give extensive discussion and references to methods and results of the type discussed in this chapter. Ficken [16] and Lahaye [34] give theoretical results on discrete continuation similar to the theorems of this chapter. Further develop ment of this approach is contained (for operators on En) in Mayer [39]. Computational aspects For efficient computational procedures based on continuation methods using the discrete approach one should probably do something hke the following. To find (approximately) a zero x{\) of P using the homotopy H(y,X)=P(y)
+
(X-l)P(x0),
we: (1) find the largest (Δλ)ι for which XQ is a good initial approximation for some iterative method to a zero ofH(y, (ΔΑ)ι); (2) compute one or a few iterations, obtaining a point x\ ; (3) find the largest (Δλ)2 for which JCI is a good initial approximation to a zero of H(y,X2) with A2 = (Δλ)ι + (Δλ) 2 ; (4) iterate once or twice, obtaining x2; (5) continue until λ = 1 is reached (if possible).
Homotopy and continuation methods
19]
145
In the general step, then, we should take the result x, of iteratively approximating a zero of H(y,kj) using only one or at most a few iterations and use JC, as a starting point to find, in the same way, an approximate zero of H(y, λ, + (Δλ),·) for as large as possible (Δλ), until we reach λ = 1 (Rail [47]). Of course, if XQ is already a safe starting point (even more so if it is a good starting point) for an iterative method for finding a zero of P, then probably nothing will be gained by taking (Δλ)ι < 1. When using the differential equation approach (Davidenko's method) in a stepby-step numerical integration one should probably take as large a step Δλ as pos sible to stay somewhere near the curve of zeros until λ = 1 is reached with an approximation to x(l) which can then be improved if needed by some iterative method such as Newton's method. In this connection, Mayer [39] shows that if P: En —> £" is twice differentiable and satisfies
ΙΙΡΜΓ^αΜΐ+β n
for all x in E , then for arbitrary JCO in E" we have the following. Suppose x/(t) = -[P'(x(t))}-ip(xo), x(0)=x0, is integrated from t = 0 to t = 1 with a numerical method of order hp, say Xk ~ x(kh) and;c*+i = Φ(*&,Α), and ||JC(1) — XN\\ < Chp where C does not depend on h = \/N. Then the iteration χ Α+ ι=Φ(α&,Α),
k=l,...,N-l
xt+^Xk-lfix^Pixk),
k=
N,N+l,...
converges to the unique solution of P(x) = 0 provided that
A=
i<
[^—L)
and
ChP<8
where j3' and L are constants such that ||[/"W]" 1 ||
and
P^\x)
for all x in Nr+§(xo) and r is given by
ko|| + I
exp(a||/>(x0)||)-|M-§ifa^O,
r=<
L/3||P(x 0 )||ifa = 0. Additional coverage of homotopies, with an introduction to degree theory, can be found in Hutson et al [26].