Two conditions concerning Newton’s method


Available online at www.sciencedirect.com

Applied Mathematics and Computation 194 (2007) 102–107 www.elsevier.com/locate/amc

Min Wu
Department of Mathematics, Zhejiang University, Zhejiang, Hangzhou 310027, PR China

Abstract

In this note, we compare two conditions concerning Newton’s method and find that even under the weaker of the two one obtains the convergence of Newton’s method, together with a better error bound than under the relatively strong condition. © 2007 Elsevier Inc. All rights reserved.

Keywords: Newton’s method; Majorizing function; Center Lipschitz condition

1. Introduction

In this note, we investigate the convergence of Newton’s method, i.e. the iterative scheme
\[ x_{n+1} = x_n - f'(x_n)^{-1} f(x_n), \qquad n = 0, 1, \ldots \tag{1.1} \]
for solving the equation
\[ f(x) = 0, \tag{1.2} \]

where $f$ is an operator defined on a convex subset $D$ of a Banach space $E_1$ with values in a Banach space $E_2$; in other words, $f : D \subset E_1 \to E_2$.

Many authors have studied the convergence of the sequence (1.1) to a solution of (1.2) under the conditions of the Kantorovich theorem (see [4]) or closely related ones (see [5,7]). Roughly speaking, those results assume that the second Fréchet derivative $f''$ is continuous and bounded in $D$, or that $f'$ is Lipschitz continuous in $D$. In order to get better error bounds, the authors of [2,3] studied the convergence of Newton’s method under the assumption that $f''$ is Lipschitz continuous in $D$. Recently, Argyros (see [1]) investigated this method under the assumption that $f$ is $m$-times Fréchet differentiable. To state his result, let $\overline{B}(x_0, r) = \{x \in D : \|x - x_0\| \le r\}$ and $B(x_0, r) = \{x \in D : \|x - x_0\| < r\}$. The condition in [1] can then be formulated as follows.

0096-3003/$ - see front matter © 2007 Elsevier Inc. All rights reserved. doi:10.1016/j.amc.2007.04.035

Condition A. Let $m \ge 2$ be an integer, $\alpha_2 > 0$, $\eta > 0$, and $\alpha_i \ge 0$ for $3 \le i \le m + 1$. Let further $f$ be an $m$-times Fréchet differentiable operator. Assume there exists $x_0 \in D$ such that
\[ \|f'(x_0)^{-1} f(x_0)\| \le \eta, \tag{1.3} \]
\[ \|f'(x_0)^{-1} f^{(i)}(x_0)\| \le \alpha_i, \qquad i = 2, \ldots, m, \tag{1.4} \]
\[ \|f'(x_0)^{-1} [f^{(m)}(x) - f^{(m)}(x_0)]\| \le \alpha_{m+1} \|x - x_0\| \quad \text{for all } x \in D, \tag{1.5} \]
\[ p(s) \le 0, \tag{1.6} \]
where $p$ is defined by
\[ p(r) = \frac{\alpha_{m+1}}{(m+1)!} r^{m+1} + \frac{\alpha_m}{m!} r^m + \cdots + \frac{\alpha_2}{2} r^2 - r + \eta \]

and $s$ is such that $p'(s) = 0$. He showed, among other things, the following.

Theorem A. Under Condition A the iterative sequence $\{x_n\}$ ($n \ge 0$) generated by (1.1) is well defined and contained in $\overline{B}(x_0, r^*)$. Moreover, $x_n$ converges to a solution $x^*$ of (1.2), which is unique in $\overline{B}(x_0, r^*) \cup B(x_0, r^{**})$, where $r^*$ and $r^{**}$ are the only two positive zeros of $p(r)$. Furthermore, the following error estimates hold for all $n \ge 0$:
\[ \|x_{n+1} - x_n\| \le r_{n+1} - r_n, \tag{1.7} \]
\[ \|x_n - x^*\| \le r^* - r_n, \tag{1.8} \]
where
\[ r_{n+1} = r_n - \frac{p(r_n)}{p'(r_n)}, \qquad r_0 = 0, \quad n = 0, 1, \ldots \]
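The scalar sequence $\{r_n\}$ in Theorem A is simply Newton’s method applied to the polynomial $p$ starting from $r_0 = 0$. The following minimal sketch computes it for one illustrative instance. The values $m = 2$, $\alpha_2 = \alpha_3 = 1$, $\eta = 0.3$ and all function names are assumptions chosen for this illustration and do not come from [1]; for these values $p'(s) = 0$ at $s = \sqrt{3} - 1$ and $p(s) < 0$, so (1.6) holds.

```python
from math import factorial, sqrt

# Illustrative (assumed) data: m = 2, alpha_2 = alpha_3 = 1, eta = 0.3.
ALPHAS = [1.0, 1.0]   # [alpha_2, ..., alpha_{m+1}]
ETA = 0.3


def p(r, alphas, eta):
    """p(r) = sum_{i=2}^{m+1} alpha_i / i! * r^i - r + eta."""
    return sum(a / factorial(i) * r**i for i, a in enumerate(alphas, start=2)) - r + eta


def dp(r, alphas):
    """p'(r) = sum_{i=2}^{m+1} alpha_i / (i-1)! * r^(i-1) - 1."""
    return sum(a / factorial(i - 1) * r**(i - 1) for i, a in enumerate(alphas, start=2)) - 1.0


def majorizing_sequence(alphas, eta, steps=8):
    """Newton iteration r_{n+1} = r_n - p(r_n)/p'(r_n) with r_0 = 0."""
    rs = [0.0]
    for _ in range(steps):
        r = rs[-1]
        rs.append(r - p(r, alphas, eta) / dp(r, alphas))
    return rs


s = sqrt(3.0) - 1.0                    # positive zero of p' for the assumed data
rs = majorizing_sequence(ALPHAS, ETA)  # increases towards the smallest zero r* of p
print(rs[-1], p(rs[-1], ALPHAS, ETA), p(s, ALPHAS, ETA))
```

Because $p$ is convex and decreasing on $[0, s]$, the computed $r_n$ increase monotonically to $r^*$, which is what makes the differences $r_{n+1} - r_n$ usable as bounds in (1.7).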

This result generalizes the classical convergence results for Newton’s method. On the other hand, in 1999 Wang (see [6]) considered a similar problem under quite different conditions. Let $\rho(x) = \|x - x_0\|$, $\rho(xx') = \rho(x) + \|x' - x\|$, and let $L(u)$ be a positive nondecreasing integrable function on $[0, \delta]$ for some $\delta > 0$. For some $\beta > 0$ denote
\[ h(t) := \beta - t + \int_0^t L(u)(t - u)\,du, \qquad 0 \le t \le R, \tag{1.9} \]
where $R$ satisfies $\int_0^R L(u)(R - u)\,du = 1$. We need the following:

Condition B. Suppose that $f$ has a continuous derivative in $\overline{B}(x_0, \delta)$ and that $f'(x_0)^{-1}$ exists. Let $f'(x_0)^{-1} f'$ satisfy the so-called center Lipschitz condition in the inscribed sphere with the average $L$, i.e. for all $x \in B(x_0, \delta)$ and $x' \in \overline{B}(x, \delta - \rho(x))$ there holds
\[ \|f'(x_0)^{-1} (f'(x') - f'(x))\| \le \int_{\rho(x)}^{\rho(xx')} L(u)\,du, \tag{1.10} \]
where $\rho(xx') \le \delta$. Let $b = \int_0^{\delta_0} L(u)\,u\,du$ with $\delta_0$ satisfying $\int_0^{\delta_0} L(u)\,du = 1$. Suppose
\[ \beta = \|f'(x_0)^{-1} f(x_0)\| \le b \quad \text{and} \quad t^* \le \delta, \tag{1.11} \]
where $t^*$, $t^{**}$ are two positive zeros of $h(t)$.
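The constants $\delta_0$ and $b$ in Condition B are determined by $L$ alone, so they can be computed numerically before any information about $f(x_0)$ is used. The sketch below is one possible way to do this, assuming a composite Simpson rule and bisection; the helper names and the search bracket `hi` are hypothetical choices for illustration. For $L(u) = u$ the exact values are $\delta_0 = \sqrt{2}$ and $b = 2\sqrt{2}/3$, and for a constant $L(u) = 2$ they are $\delta_0 = 1/2$ and $b = 1/4$.

```python
from math import sqrt


def integrate(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if b == a:
        return 0.0
    step = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * step)
    return total * step / 3.0


def delta0_and_b(L, hi=10.0, tol=1e-12):
    """Return (delta0, b) with int_0^delta0 L(u) du = 1 and b = int_0^delta0 L(u) u du.

    Assumes L is positive and that int_0^hi L(u) du >= 1, so bisection applies.
    """
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if integrate(L, 0.0, mid) < 1.0:
            lo = mid
        else:
            hi = mid
    delta0 = 0.5 * (lo + hi)
    return delta0, integrate(lambda u: L(u) * u, 0.0, delta0)


d0, b = delta0_and_b(lambda u: u)        # L(u) = u
d0c, bc = delta0_and_b(lambda u: 2.0)    # constant L, the Kantorovich-type case
print(d0, b, d0c, bc)
```

With $\delta_0$ and $b$ in hand, the first part of (1.11) reduces to comparing the single number $\beta = \|f'(x_0)^{-1} f(x_0)\|$ with $b$.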

One of the results in [6] is the following:

Theorem B. Under Condition B the sequence from (1.1) is well defined for all $n$ and converges to a solution $x^*$ of (1.2) which satisfies
\[ x^* \in \overline{B}(x_1, t^* - \beta) \subset \overline{B}(x_0, t^*). \tag{1.12} \]
Moreover, for all $n \ge n_0 \ge 0$ the best possible error bounds are
\[ \|x^* - x_n\| \le (t^* - t_n) \left( \frac{\|x^* - x_{n_0}\|}{t^* - t_{n_0}} \right)^{2^{n - n_0}} \tag{1.13} \]
and
\[ \frac{2\|x_{n+1} - x_n\|}{1 + \sqrt{1 + 4\,\dfrac{t^* - t_{n+1}}{(t^* - t_n)^2}\,(t_{n+1} - t_n)}} \le \|x^* - x_n\| \le (t^* - t_n) \left( \frac{\|x_{n_0+1} - x_{n_0}\|}{t_{n_0+1} - t_{n_0}} \right)^{2^{n - n_0}}, \tag{1.14} \]
where $t_0 = 0$ and $t_n$ is such that
\[ t_{n+1} = t_n - \frac{h(t_n)}{h'(t_n)}, \qquad n = 0, 1, \ldots \tag{1.15} \]
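To see how the scalar sequence (1.15) controls the Newton iterates, the following sketch runs both iterations side by side and compares step lengths, i.e. it checks the majorization $\|x_{n+1} - x_n\| \le t_{n+1} - t_n$ numerically. The test function $f(x) = \sin x - 5x - 2$ with $x_0 = 0$ is a hypothetical example chosen for this illustration and is not taken from [6]. For it, $|f'(x_0)^{-1} f''(x)| = |\sin x|/4 \le |x|/4$, so $L(u) = u/4$ is an admissible choice, $\beta = 1/2$, and $h(t) = \beta - t + t^3/24$.

```python
from math import sin, cos


def newton(g, dg, start, steps):
    """Plain Newton iteration z_{n+1} = z_n - g(z_n)/g'(z_n), returning all iterates."""
    zs = [start]
    for _ in range(steps):
        z = zs[-1]
        zs.append(z - g(z) / dg(z))
    return zs


# Assumed test problem (not from the paper): f(x) = sin(x) - 5x - 2, x0 = 0.
f = lambda x: sin(x) - 5.0 * x - 2.0
df = lambda x: cos(x) - 5.0
beta = abs(f(0.0) / df(0.0))             # beta = |f'(x0)^{-1} f(x0)| = 0.5

# Majorizing function (1.9) for L(u) = u/4, evaluated in closed form.
h = lambda t: beta - t + t**3 / 24.0
dh = lambda t: -1.0 + t**2 / 8.0

xs = newton(f, df, 0.0, 6)               # Newton iterates x_n
ts = newton(h, dh, 0.0, 6)               # majorizing sequence t_n

steps_x = [abs(b - a) for a, b in zip(xs, xs[1:])]
steps_t = [b - a for a, b in zip(ts, ts[1:])]
for sx, st in zip(steps_x, steps_t):
    print(f"{sx:.3e} <= {st:.3e}")
```

The comparison uses a small floating-point tolerance in practice, since after convergence both step lengths are at the level of rounding error.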

On the one hand, it is easy to see that the assertions in Theorem B are essentially stronger than those in Theorem A. On the other hand, Conditions A and B are quite different. One may therefore ask whether Condition A is essentially weaker than Condition B. In this note, we show that in fact Condition B is weaker than Condition A. Consequently, we can use Theorem B to improve the assertions of Theorem A. The main result of this note is

Theorem 1.1. If $f$ satisfies Condition A, then $f$ satisfies Condition B. Consequently, under Condition A the sequence $\{x_n\}$ and $x^*$ satisfy (1.12)–(1.14).

2. Proof

We begin with the following simple fact:

Lemma 2.1. In Condition B, a sufficient condition for (1.10) is that $f$ is twice Fréchet differentiable and
\[ \|f'(x_0)^{-1} f''(x)\| \le L(\rho(x)). \tag{2.1} \]

Proof. We have to verify that (2.1) implies (1.10). To this end, we note that
\[ f'(x') - f'(x) = \int_0^1 f''(x + s(x' - x))(x' - x)\,ds. \]
Hence,
\[ \|f'(x_0)^{-1}(f'(x') - f'(x))\| \le \int_0^1 \|f'(x_0)^{-1} f''(x + s(x' - x))\|\,ds\,\|x' - x\|. \]
Since $L(u)$ is nondecreasing, we conclude from (2.1) that
\[ \begin{aligned}
\|f'(x_0)^{-1}(f'(x') - f'(x))\| &\le \int_0^1 L(\|x + s(x' - x) - x_0\|)\,ds\,\|x' - x\| \\
&\le \int_0^1 L(\|x - x_0\| + s\|x' - x\|)\,ds\,\|x' - x\| \\
&= \int_{\|x - x_0\|}^{\|x - x_0\| + \|x' - x\|} L(u)\,du = \int_{\rho(x)}^{\rho(xx')} L(u)\,du,
\end{aligned} \]
which gives (1.10). □
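As a sanity check of Lemma 2.1 in the scalar case, the sketch below samples random pairs $(x, x')$ and verifies (1.10) for the function $f(x) = x^3/6 - x + 1/3$ on $[-5, 5]$ with $x_0 = 0$ and $L(u) = u$, for which $f''(x) = x$ and (2.1) holds with equality. The sampling scheme, the seed and the helper names are assumptions made for this illustration; a random test of this kind can expose a violation but does not prove the inequality.

```python
import random

x0 = 0.0
df = lambda x: 0.5 * x * x - 1.0                   # f'(x) for f(x) = x^3/6 - x + 1/3
L_integral = lambda a, b: 0.5 * (b * b - a * a)    # exact value of int_a^b u du, L(u) = u


def gap(x, xp):
    """Right-hand side minus left-hand side of (1.10); nonnegative if (1.10) holds."""
    lhs = abs((df(xp) - df(x)) / df(x0))
    rho_x = abs(x - x0)                            # rho(x)
    rho_xxp = rho_x + abs(xp - x)                  # rho(xx')
    return L_integral(rho_x, rho_xxp) - lhs


rng = random.Random(0)                             # fixed seed for reproducibility
worst = min(gap(rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0)) for _ in range(10000))
print(worst)   # should be nonnegative up to rounding error
```

Equality in (1.10) occurs here whenever $x'$ lies beyond $x$ on the same side of $x_0$, so the smallest sampled gap is expected to be zero up to rounding.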

We are now in a position to prove Theorem 1.1.

Proof of Theorem 1.1. We first prove that if $x \in \overline{B}(x_0, r)$ and $r \in [0, r^*]$, then
\[ \|f'(x_0)^{-1} f''(x)\| \le p''(\|x - x_0\|). \tag{2.2} \]

For this purpose let $e$, $b_1$, $b_i$, $i = 2, \ldots, m$, be defined by $e = x - x_0$, $b_1 = x_0 + s_1 e$, $b_i = x_0 + s_i (b_{i-1} - x_0)$, $s_i \in [0, 1]$. Thus (see also [1])
\[ \begin{aligned}
f'(x_0)^{-1} f''(x) &= f'(x_0)^{-1} f''(x_0) + f'(x_0)^{-1} [f''(x) - f''(x_0)] \\
&= f'(x_0)^{-1} f''(x_0) + f'(x_0)^{-1} \int_0^1 f'''[x_0 + s_1 (x - x_0)](x - x_0)\,ds_1.
\end{aligned} \]
Using the same approach for $f'''$ instead of $f''$, we conclude from the above that
\[ \begin{aligned}
f'(x_0)^{-1} f''(x) = {} & f'(x_0)^{-1} f''(x_0) + f'(x_0)^{-1} \int_0^1 f'''(x_0)(x - x_0)\,ds_1 \\
& + f'(x_0)^{-1} \int_0^1 [f'''[x_0 + s_1 (x - x_0)] - f'''(x_0)](x - x_0)\,ds_1.
\end{aligned} \]
Clearly,
\[ \int_0^1 [f'''[x_0 + s_1 (x - x_0)] - f'''(x_0)](x - x_0)\,ds_1 \]
can be written as
\[ \int_0^1 \int_0^1 f^{(4)}\{x_0 + s_2 [x_0 + s_1 (x - x_0) - x_0]\}[s_1 (x - x_0)](x - x_0)\,ds_2\,ds_1. \]
Repeating this argument, we obtain
\[ \begin{aligned}
f'(x_0)^{-1} f''(x) = {} & f'(x_0)^{-1} f''(x_0) + f'(x_0)^{-1} \int_0^1 f'''(x_0) e\,ds_1 + \cdots \\
& + \underbrace{\int_0^1 \cdots \int_0^1}_{m-2} f'(x_0)^{-1} f^{(m)}(x_0)(b_{m-3} - x_0) \cdots (b_1 - x_0) e\,ds_{m-2} \cdots ds_1 + T,
\end{aligned} \]
where
\[ T = \underbrace{\int_0^1 \cdots \int_0^1}_{m-2} f'(x_0)^{-1} [f^{(m)}(b_{m-2}) - f^{(m)}(x_0)](b_{m-3} - x_0) \cdots (b_1 - x_0) e\,ds_{m-2} \cdots ds_1. \]
Next we apply (1.5) to the last term and (1.4) to the other terms to obtain
\[ \begin{aligned}
\|f'(x_0)^{-1} f''(x)\| \le {} & \alpha_2 + \alpha_3 \int_0^1 \|e\|\,ds_1 + \cdots + \alpha_m \underbrace{\int_0^1 \cdots \int_0^1}_{m-2} \|(b_{m-3} - x_0) \cdots (b_1 - x_0) e\|\,ds_{m-2} \cdots ds_1 \\
& + \alpha_{m+1} \underbrace{\int_0^1 \cdots \int_0^1}_{m-2} \|b_{m-2} - x_0\| \cdot \Big\| e \prod_{i=1}^{m-3} (b_i - x_0) \Big\|\,ds_{m-2} \cdots ds_1.
\end{aligned} \]

The inequality (2.2) now follows from the definitions of $p$ and $b_i$. To verify Condition B for $f$, let
\[ L(u) = p''(u) = \frac{\alpha_{m+1}}{(m-1)!} u^{m-1} + \cdots + \alpha_3 u + \alpha_2. \]
Then $L(u)$ is positive, nondecreasing and integrable on $[0, r]$. Thus, (2.2) can be rewritten as
\[ \|f'(x_0)^{-1} f''(x)\| \le L(\rho(x)), \]
which implies (1.10) by Lemma 2.1. To determine the remaining quantities in Condition B, let $\delta_0 > 0$ be such that $\int_0^{\delta_0} p''(u)\,du = 1$. Since $p'(0) = -1$, we get $p'(\delta_0) = 0$, while (1.6) tells us that with $\delta_0 = s$ there holds $p(\delta_0) \le 0$. Hence, for
\[ b = \int_0^{\delta_0} L(u)\,u\,du = \int_0^{\delta_0} p''(u)\,u\,du = \eta - p(\delta_0) \ge \eta, \tag{2.3} \]
we have by (1.3)
\[ \beta = \|f'(x_0)^{-1} f(x_0)\| \le b, \]
i.e. (1.11) is satisfied. Thus, Condition B is satisfied. Moreover, taking $\eta = \beta$ in (1.3), so that $p(0) = \beta$, we have
\[ \int_0^t p''(u)(t - u)\,du = -t p'(0) + p(t) - \beta = t + p(t) - \beta, \]
and we obtain by (1.9)
\[ h(t) = \beta - t + \int_0^t p''(u)(t - u)\,du = p(t). \]
Hence, the majorizing functions of the two theorems are the same, and so are the sequences $\{r_n\}$ and $\{t_n\}$ in Theorems A and B, respectively. Consequently, $t^* = r^*$, $t^{**} = r^{**}$, and $x^* \in \overline{B}(x_1, t^* - \beta) \subset \overline{B}(x_0, t^*) = \overline{B}(x_0, r^*)$. Furthermore, using (1.13), we get for all $n \ge n_0 \ge 0$
\[ \|x^* - x_n\| \le (r^* - r_n) \left( \frac{\|x^* - x_{n_0}\|}{r^* - r_{n_0}} \right)^{2^{n - n_0}} \le r^* - r_n, \]
which is better than (1.8), while (1.7) is contained in the proof of Theorem B (see [6]). □
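The two identities used at the end of the proof, $h(t) = p(t)$ when $L = p''$ and $\eta = \beta$, and $b = \eta - p(\delta_0)$, can be checked numerically for a small instance. The sketch below assumes $m = 2$, $\alpha_2 = \alpha_3 = 1$ and $\eta = \beta = 0.3$; these values and the helper names are illustrative assumptions only. For them $\delta_0 = s = \sqrt{3} - 1$.

```python
from math import sqrt

# Assumed illustrative data: m = 2, alpha_2 = alpha_3 = 1, eta = beta = 0.3.
ALPHA2, ALPHA3, ETA = 1.0, 1.0, 0.3


def p(r):
    """Polynomial of Condition A for m = 2."""
    return ALPHA3 / 6.0 * r**3 + ALPHA2 / 2.0 * r**2 - r + ETA


def L(u):
    """L(u) = p''(u)."""
    return ALPHA3 * u + ALPHA2


def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if b == a:
        return 0.0
    step = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * step)
    return total * step / 3.0


def h(t, beta=ETA):
    """Majorizing function (1.9), with the integral evaluated by quadrature."""
    return beta - t + simpson(lambda u: L(u) * (t - u), 0.0, t)


grid = [0.1 * k for k in range(11)]
max_diff = max(abs(h(t) - p(t)) for t in grid)        # h and p should coincide

delta0 = sqrt(3.0) - 1.0                               # int_0^delta0 L(u) du = 1
b = simpson(lambda u: L(u) * u, 0.0, delta0)           # left-hand side of (2.3)
print(max_diff, b, ETA - p(delta0))
```

Simpson’s rule integrates polynomials of degree at most three exactly, so for this instance the agreement is limited only by rounding error.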

Remark 1. Let us observe (2.3). Clearly, with $\eta = \beta$, a necessary and sufficient condition for $\beta \le b$ is $p(\delta_0) \le 0$, i.e. $p(s) \le 0$. Therefore, we cannot find an $m$-times Fréchet differentiable function $f$ which satisfies Condition B but not Condition A. In this sense Theorem B is superior to Theorem A. On the other hand, Theorem A gives a method to compute $L(u)$ in Theorem B.

3. More about Theorems A and B

In Theorem B, if $L(u)$ is a positive constant $L$, then the function $h(t) = \beta - t + Lt^2/2$ is the majorizing function of the Kantorovich theorem. Consequently, we obtain the convergence of Newton’s method and the corresponding a posteriori error estimate. Moreover, if $L(u) = 2\gamma/(1 - \gamma u)^3$, where $\gamma$ satisfies $\|f'(x_0)^{-1} f^{(n)}(x_0)\| \le n!\,\gamma^{n-1}$, $n \ge 2$, we get a convergence theorem under a premise of Smale type and the corresponding error estimate (see [6]).

From the computational point of view, if $m$ in Theorem A is large, it takes some time to verify (1.3)–(1.6). However, with the choice $L(u) = 2\gamma/(1 - \gamma u)^3$ we only need to compute $\|f'(x_0)^{-1} f''(x)\|$, which may save computation time.

Setting $m = 2$ in Theorem A, we get the main result of Huang (see [3]). Since Condition B is weaker than Condition A (see Theorem 1.1), we can also deduce Huang’s result from Theorem B with the choice $L(u) = Ku + \gamma$, where $K$ and $\gamma$ are defined as in [3]. In the same way, one can also obtain the result concerning Newton’s method in [2].

Although Theorem B can be used to obtain convergence theorems in some cases, Theorem B does not seem to be comparable with the Kantorovich theorem. In other words, one can construct examples for which the Kantorovich assumptions fail while those of Theorem B hold, and vice versa. To this end, we note that the Kantorovich theorem assumes: $f$ is Fréchet differentiable and there exists $x_0 \in D$ such that
\[ \|f'(x_0)^{-1} f(x_0)\| \le \alpha, \]
\[ \|f'(x_0)^{-1} [f'(x) - f'(y)]\| \le \gamma \|x - y\| \quad \text{for all } x, y \in D \]

and $\alpha\gamma \le 1/2$. (This is a little weaker than the original Kantorovich assumptions.)

Example 1. Let $E_1 = E_2 = \mathbb{R}$, $D = [-5, 5]$, $x_0 = 0$, and let $f$ be defined on $D$ by
\[ f(x) = \frac{1}{6} x^3 - x + \frac{1}{3}. \]

We have $\alpha = 1/3$ and $\gamma = 5$. Then $\alpha\gamma = 5/3 > 1/2$. Therefore, the Kantorovich condition fails and the Kantorovich theorem does not yield the convergence of Newton’s sequence starting with $x_0$. On the other hand, (1.11) is satisfied with $\beta = \alpha = 1/3$, $\delta_0 = \sqrt{2}$, $b = 2\sqrt{2}/3$. Moreover, $f'(x_0) = -1$, $f''(x) = x$ and $L(u) = u$, so $\|f'(x_0)^{-1} f''(x)\| \le L(\|x - x_0\|)$. Therefore, (1.10) is fulfilled. Consequently, Theorem B can be applied to get the convergence of Newton’s sequence starting with $x_0$.

Example 2 (Gutiérrez [2]). Let $E_1 = E_2 = \mathbb{R}$, $x_0 = 0$. Let $f : E_1 \to E_2$ be given by
\[ f(x) = \sin(x) - 5x - 8. \]
In this case, $\alpha = 2$ and $\gamma = 1/4$, so $\alpha\gamma = 1/2$. Hence the hypothesis of the Kantorovich theorem holds. However, for $L(u) = u/4$, $\delta_0 = 2\sqrt{2}$, $b = 4\sqrt{2}/3$ and $\beta = \alpha = 2$, one has $\beta > b$. So the conditions of Theorem B are not satisfied.

References

[1] I.K. Argyros, A Newton–Kantorovich theorem for equations involving m-Fréchet differentiable operators and applications in radiative transfer, J. Comput. Appl. Math. 131 (2001) 149–159.
[2] J.M. Gutiérrez, A new semilocal convergence theorem for Newton’s method, J. Comput. Appl. Math. 79 (1997) 131–145.
[3] Z. Huang, A note on the Kantorovich theorem for Newton iteration, J. Comput. Appl. Math. 47 (1993) 211–217.
[4] L.V. Kantorovich, G.P. Akilov, Functional Analysis, Pergamon Press, Oxford, 1982.
[5] A.M. Ostrowski, Solution of Equations in Euclidean and Banach Spaces, Academic Press, New York, 1979.
[6] Xinghua Wang, Convergence of Newton’s method and inverse function theorem in Banach space, Math. Comput. 68 (1999) 169–186.
[7] T. Yamamoto, A method for finding sharp error bounds for Newton’s method under the Kantorovich assumptions, Numer. Math. 49 (1986) 203–220.