A convergence theorem for the Newton-like methods under some kind of weak Lipschitz conditions

A convergence theorem for the Newton-like methods under some kind of weak Lipschitz conditions

J. Math. Anal. Appl. 339 (2008) 1425–1431 www.elsevier.com/locate/jmaa A convergence theorem for the Newton-like methods under some kind of weak Lips...

116KB Sizes 2 Downloads 63 Views

J. Math. Anal. Appl. 339 (2008) 1425–1431 www.elsevier.com/locate/jmaa

A convergence theorem for the Newton-like methods under some kind of weak Lipschitz conditions ✩ Min Wu Department of Mathematics, Zhejiang University, 310027 Hangzhou, Zhejiang, People’s Republic of China Received 21 February 2007 Available online 7 August 2007 Submitted by T. Witelski

Abstract We provide sufficient conditions for the convergence of the Newton-like methods in the assumption that the derivative satisfies some kind of weak Lipschitz conditions. Consequently, some important convergence theorems follow from our main result in this paper. © 2007 Elsevier Inc. All rights reserved. Keywords: Banach space; Newton-like methods; Inexact Newton methods; Weak Lipschitz condition

1. Introduction Consider the system of nonlinear equations f (x) = 0,

(1.1)

where f is a Fréchet differentiable operator defined on a convex subset D of a Banach space X with values in a Banach space Y . Let f  (x) denote the Fréchet derivative of f evaluated at x ∈ D. A classical method for finding a solution to (1.1) is Newton’s method. Given an initial guess x0 , Newton sequence {xn } (n  0) generated by xn+1 = xn − f  (xn )−1 f (xn ).

(1.2)

Under some conditions on f one can show that for some x ∗ ,   x ∗ = lim xn and f x ∗ = 0. n→∞

Newton’s method is attractive because it converges rapidly from any sufficiently good initial guess. However, in many applications, the Fréchet derivative f  is tedious to obtain. For these cases, the approximate Newton processes, also called Newton-like methods (see [5,12]) may be better. Thus, let A(x) : D → L(X, Y ) be an approximation of f  , ✩

Supported by the National Science Foundation of China (Grant No. 10471128). E-mail address: [email protected].

0022-247X/$ – see front matter © 2007 Elsevier Inc. All rights reserved. doi:10.1016/j.jmaa.2007.07.073

1426

M. Wu / J. Math. Anal. Appl. 339 (2008) 1425–1431

where L(X, Y ) is the set of all bounded linear mappings from X to Y and I is the identity operator. Assume A(x) is invertible, then xn can be defined by xn+1 = xn − A(xn )−1 f (xn ),

n = 0, 1, . . . .

(1.3)

Sufficient convergence conditions for the sequence {xn } (n  0) based on Lipschitz type conditions on the first Fréchet derivative f  are usually given as follows: the condition    f (x) − f  (x  )  Lx − x  , ∀x, x  ∈ D, (1.4) is usually called the Lipschitz condition in the domain D with constant L. For some kind of domain with center such as a ball B(x0 , r), sometimes it is not necessary for the inequality holding for any x, x  in the domain, but valid for all x ∈ B(x0 , r) and for all x  ∈ B(x, r − ρ(x)), then we call it the center Lipschitz condition in the inscribed sphere. Furthermore, the L in the Lipschitz condition need not be a constant, but a positive integrable function. If this is the case, then (1.4) is replaced by    f (x) − f  (x  ) 

ρ(xx  )

  ∀x ∈ B(x0 , r), ∀x  ∈ B x, r − ρ(x) ,

L(u) du,

(1.5)

ρ(x)

where ρ(x) = x − x0 , ρ(xx  ) = ρ(x) + x − x    r. Moreover, B(x0 , r) denotes the open ball with center x0 and radius r. Its closure is denoted by B(x0 , r). This inequality was first introduced by Wang in [14]. Other special and generalized Lipschitz conditions are also studied in subsequent papers, such as [1,2,15], etc. In this paper, we give a convergence theorem for the scheme (1.3) in the assumption that A(x0 )−1 f  satisfies the center Lipschitz condition in the inscribed sphere, which is defined as follows    A(x0 )−1 f  (x) − f  (x  )  

ρ(xx  )

L(u) du,

  ∀x ∈ B(x0 , r), ∀x  ∈ B x, r − ρ(x) ,

(1.6)

ρ(x)

where ρ(x), ρ(xx  ) are defined as (1.5) and L(u) is a positive nondecreasing function in [0, r]. The so-called inexact Newton method, that is introduced by Dembo, Eisenstat and Steihaug in [3] may be regarded as any iterative procedure which approximates a solution of (1.1) by {xn } and is given by xn+1 = xn + n ,

(1.7)

where the correction n satisfies f  (xn )n = −f (xn ) + rn ,

(1.8)

for a suitable rn ∈ Y . The importance of studying inexact Newton methods comes from the fact that many popular variants of Newton’s method can be considered as procedure of this type. Hence, in this sense, Newton-like method defined by (1.3) is some kind of inexact Newton method with   (1.9) rn = f  (xn ) − A(xn ) n . Given the residuals rn a proper control, we obtain the convergence of the Newton-like methods. This paper is organized as follows. In Section 2 we present our main result and give the proof. Some applications are discussed in Section 3. 2. Main result and its proof We give the convergence theorem for (1.3) as follows: Theorem 2.1. Let f : D ⊂ X → Y be Fréchet differentiable on B(x0 , r) ⊂ D and A(x) be an approximation of f  (x). Moreover, there exists an initial guess x0 ∈ B(x0 , r0 ) such that A(x0 ) invertible, where r0 ∈ [0, r]. Under (1.6) the sequence {xn } (n  0) defined by (1.3) remains in B(x0 , t ∗ ), and converges to a solution x ∗ of (1.1), if the following (I)–(III) hold:

M. Wu / J. Math. Anal. Appl. 339 (2008) 1425–1431

(I) There exists nonnegative constants s0 , α, β and 0  u0 < 1 such that   A(x0 )−1 f (x0 )  s0 ,    A(x0 )−1 f  (x) − A(x)   u0 + β

1427

(2.1)

ρ(x) 

L(u) du,

(2.2)

∀x ∈ B(x0 , r0 ).

(2.3)

0

and   A(x0 )−1 A(x) − I   α

ρ(x) 

L(u) du, 0

(II) There hold for some a  max{1, α + β}, r0 L(u) du = (1 − u0 )/a

(2.4)

0

and r0 s0  r0 (1 − u0 ) − a

L(u)(r0 − u) du.

(2.5)

0

(III) B(x0 , t ∗ ) ⊂ B(x0 , r0 ), for the least positive root t ∗ of φ(t), t φ(t) := a

(t − u)L(u) du − (1 − u0 )t + s0 .

(2.6)

0

The theorem is under the traditional frame of the semi-local convergence theorem of Newton-like methods, which is obtained by Rheinboldt (see [12]) applying his majorant theory to (1.3) and has been studied by many authors (see [4–7,9,16]). Our proof mainly base on the following lemma, which is also an important result presented by Moret in [10]. Lemma 2.2. Let f : D ⊂ X → Y be Fréchet differentiable on B(x0 , r) ⊂ D and satisfy (1.5). Assume the following conditions hold: (I) For constants s0 > 0, σ0 > 0, 0  u0 < 1 and a  1, the function t φ0 (t) := aσ0

(t − u)L(u) du − (1 − u0 )t + s0 0

has roots in [0, r], denoted by t0∗ the least of them. (II) {sn } > 0, {un }  0, {σn }  0, n = 0, 1, . . . , be sequences (starting from s0 , u0 , σ0 , respectively) such that 1 − un+1 − (1 − un )  −aσn νn

sn L(ρn + u) du,

(2.7)

0

where νn = σn+1 /σn and ρn = ρ(xn ) = xn − x0 . Moreover, lim inf σn > 0 is fulfilled for every n. (III) For every n, we have   n   sn  σn f (xn )

(2.8)

1428

M. Wu / J. Math. Anal. Appl. 339 (2008) 1425–1431

and rn  

un sn , σn

(2.9)

where rn = f  (xn )n + f (xn ). Then the sequence {xn } generated by xn+1 = xn + n remains in B(x0 , t0∗ ) and converges to x ∗ , that satisfies f (x ∗ ) = 0. We note that the condition on f  is different from the original one, but the same conclusion can be obtained in the similar way. Proof of Theorem 2.1. By the conditions we have (1 − u0 )α  a. Hence, by (2.4) we obtain r0 L(u) du = α

α

1 − u0  1. a

0

However, as x ∈ B(x0 , r0 ), we get from (2.3) and the positivity of L(u), ρ(x) 

  A(x0 )−1 A(x) − I   α

r0 L(u) du  1.

L(u) du < α 0

0

)−1 A(x)

is invertible for each x in the ball B(x0 , r0 ) and there holds By Neumann lemma A(x0   1 A(x)−1 A(x0 )  .  ρ(x) 1 − α 0 L(u) du

(2.10)

Next let us observe the function φ(t). It is easy to see that φ(t) is minimized at t = r0 and this value can be achieved. On the other hand, by (2.5) there holds φ(r0 )  0. Hence, the t ∗ exists. In what follows, we should use Lemma 2.2 to verify the assertion of our theorem. We show that the function A(x0 )−1 f satisfies all conditions of Lemma 2.2. Thus, x ∗ = limn→∞ xn and A(x0 )−1 f (x ∗ ) = 0, i.e. f (x ∗ ) = 0. We use (2.10) to define σn :=

1−α

1  ρn 0

L(u) du

.

We have lim inf σn > 0. We note also that σ0 = 1. Thus, φ0 (t) defined in Lemma 2.2 is the same as φ(t). Therefore, t0 = t0∗ . We know (see (1.3)) xn+1 = xn − A(xn )−1 f (xn ). Hence, if we define sn = xn+1 − xn , then   sn = xn+1 − xn  = A(xn )−1 f (xn )      A(xn )−1 A(x0 ) · A(x0 )−1 f (xn )    σn A(x0 )−1 f (xn ). In other words, (2.8) is fulfilled. Let us look at rn in Lemma 2.2. Clearly, the relation (1.9) tells us that for A(x0 )−1 f the function rn is defined by A(x0 )−1 (f  (xn ) − A(xn ))n . We have by (2.2)     rn  = A(x0 )−1 f  (xn ) − A(xn ) n      A(x0 )−1 f  (xn ) − A(xn )  · n    ρn = u0 + β

L(u) du sn . 0

M. Wu / J. Math. Anal. Appl. 339 (2008) 1425–1431

1429

Setting 



ρn

un = σn u0 + β

L(u) du , 0

we conclude un sn . σn That is (2.9). It remains to check (2.7). To this end, we use the definition of un , νn and σn to obtain rn  

1 − un+1 − (1 − un ) = −σn (α + β) T := νn

ρn+1

L(u) du. ρn

By changing of the variable and remembering a  max{1, α + β}, we get, as ρn+1 − ρn  sn and ρn > 0, ρn+1

T  −aσn

sn

L(u) du  −aσn ρn

L(ρn + u) du. 0

Now it is clear that all conditions in Lemma 2.2 are fulfilled for A(x0 )−1 f instead of f . Therefore, by Lemma 2.2 the sequence {xn } remains in B(x0 , t0 ) and converges to x ∗ with A(x0 )−1 f (x ∗ ) = 0. 2 3. Applications Let us give two examples to show that some elegant results are special cases of Theorem 2.1. Taking A(x) = f  (x) in (1.3), we have the following Proposition 3.1. Let f : D ⊂ X → Y be Fréchet differentiable on B(x0 , r) ⊂ D, r0 ∈ [0, r]. Suppose that there exists x0 ∈ B(x0 , r0 ) such that f  (x0 ) is invertible. Assume further the following conditions hold:    f (x0 )−1 f (x0 )  s0 ,     f (x0 )−1 f  (x) − f  (x  )  

ρ(xx  )

L(u) du,

  ∀x ∈ B(x0 , r), ∀x  ∈ B x, r − ρ(x) ,

(3.1)

ρ(x)

where ρ(x), ρ(xx  ) and L(u) are defined as before. r0 and s0 satisfy r0

r0 L(u) du = 1,

s0 

0

uL(u) du. 0

Then Newton’s method (1.2) is defined for all n and converges to a solution x ∗ of Eq. (1.1). Proof. According to Theorem 2.1 we have to show that the conditions of this proposition satisfy those of Theorem 2.1 with some suitable choices of constants. To this end, let us choose A(x) = f  (x), u0 = β = 0 and α = a = 1 in Theorem 2.1. Thus, (1.6), (2.1) and (2.2) are fulfilled. Moreover, (2.3) follows from (3.1). Indeed, let x = x0 , so ρ(x) = 0 and ρ(xx  ) = ρ(x  ). Hence, we obtain from (3.1)    f (x0 )−1 f  (x  ) − I  

 ρ(x  )

L(u) du,

∀x  ∈ B(x0 , r),

0

which obviously implies (2.3). Furthermore, (2.4) and (2.5) are satisfied by a simple computation. It remains to verify B(x0 , t ∗ ) ⊂ B(x0 , r0 ), i.e. t ∗  r0 . With our special choice of the constants, φ(t) is given as follows

1430

M. Wu / J. Math. Anal. Appl. 339 (2008) 1425–1431

t φ(t) =

(t − u)L(u) du − t + s0 , 0

which is the same as the majorizing function presented in [14]. Recalling (2.4) and (2.5), we have φ(s0 )  0 and φ(r0 )  0, so s0  t ∗  r0 . Therefore, all the conditions of Theorem 2.1 are satisfied. Consequently, the convergence of {xk } follows from Theorem 2.1. 2 Remark. Proposition 3.1 gives the convergence analysis of Newton’s method as [14]. Our proof is different from [14]. Consequently, we can conclude Kantorovich theorem (see [7,11]) and the Smale type theorem (see [13]) with L(u) evaluated specifically. On the other hand, if we take L(u) ≡ L, where L is a positive number, we obtain the following conclusion. Proposition 3.2. Let f : D ⊂ X → Y be Fréchet differentiable on B(x0 , r) ⊂ D and A(x) be an approximation of f  (x). Suppose there exists an initial guess x0 ∈ B(x0 , r0 ) with A(x0 ) invertible, where r0 ∈ [0, r]. Assume    A(x0 )−1 f  (x) − f  (x  )   Lx − x  , where x ∈ B(x0 , r) and x − x0  + x − x    r, and the following: (I) There exists nonnegative constants s0 , α, β and 0  u0 < 1 such that   A(x0 )−1 f (x0 )  s0 ,    A(x0 )−1 f  (x) − A(x)   u0 + βLx − x0 , and

  A(x0 )−1 A(x) − I   αLx − x0 ,

for every x ∈ B(x0 , r0 ). (II) There hold for some a  max{1, α + β}, r0 =

1 − u0 , aL

h=

2aLs0 (1 − u0 )2

(III) B(x0 , t ∗ ) ⊂ B(x0 , r0 ), with t ∗ =

with h  1.

√ 2s0 (1−u0 )h (1 − 1 − h )

is the least positive root of φ(t), which is defined by

1 φ(t) := aLt 2 − (1 − u0 )t + s0 , 2 then the sequence {xn } (n  0) defined by (1.3) remains in B(x0 , t ∗ ) and converges to a solution x ∗ of (1.1). Remark. Under the common Lipschitz condition (1.4), Proposition 3.2 is obtained by G.J. Miel in [8]. There are only differences in the choice of parameters and our conditions are a little weaker than the ones in [8] since we only need A(x0 )−1 f  to satisfy Lipschitz condition in B(x0 , r) ⊂ D. Moreover, our conditions are affine invariant. Acknowledgments The author is grateful to the referees for their nice suggestions. By these suggestions this paper has been greatly improved.

References [1] I.K. Argyros, A new convergence theorem for the inexact Newton methods based on assumptions involving the second Fréchet derivative, Comput. Math. Appl. 37 (1999) 109–115. [2] Jinhai Chen, Weiguo Li, Convergence behaviour of inexact Newton methods under weak Lipschitz condition, J. Comput. Appl. Math. 191 (2006) 143–164.

M. Wu / J. Math. Anal. Appl. 339 (2008) 1425–1431

1431

[3] R.S. Dembo, S.C. Eisenstat, T. Steihaug, Inexact Newton methods, SIAM J. Numer. Anal. 19 (1982) 400–408. [4] J.E. Dennis Jr., On Newton-like methods, Numer. Math. 11 (1968) 324–330. [5] J.E. Dennis Jr., Toward a unified convergence theory for Newton-like methods, in: L.B. Rall (Ed.), Nonlinear Functional Analysis and Applications, Academic Press, New York, 1971, pp. 425–472. [6] P. Deuflhard, G. Heindl, Affine invariant convergence for Newton’s method and extensions to related methods, SIAM J. Numer. Anal. 16 (1979) 1–10. [7] L.V. Kantorovich, G.P. Akilov, Functional Analysis, second ed., Pergamon Press, Oxford, 1982. [8] G.J. Miel, Majoring sequences and error bounds for iterative methods, Math. Comp. 34 (1980) 185–202. [9] I. Moret, A note on Newton type iterative methods, Computing 33 (1984) 65–73. [10] I. Moret, A Kantorovich type theorem for inexact Newton methods, Numer. Funct. Anal. Optim. 10 (1989) 351–365. [11] J.M. Ortega, W.C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970. [12] W.C. Rheinboldt, A unified convergence theory for a class of iterative processes, SIAM J. Numer. Anal. 5 (1968) 42–63. [13] S. Smale, Newton’s method estimates from data at one point, in: R. Ewing, K. Gross, C. Martin (Eds.), The Merging of Disciplines: New Directions in Pure, Applied and Computational Mathematics, Springer-Verlag, New York, 1986, pp. 185–196. [14] Wang Xinghua, Convergence of Newton’s method and inverse function in Banach space, Math. Comp. 68 (1999) 169–186. [15] Wang Xinghua, Convergence of Newton’s method and uniqueness of the solution of equations in Banach space, IMA J. Numer. Anal. 20 (2000) 123–134. [16] T. Yamamoto, Historical developments in convergence analysis for Newton’s and Newton-like methods, J. Comput. Appl. Math. 124 (2000) 1–23.