Two conditions concerning Newton’s method



Available online at www.sciencedirect.com

Applied Mathematics and Computation 194 (2007) 102–107 www.elsevier.com/locate/amc

Min Wu

Department of Mathematics, Zhejiang University, Zhejiang, Hangzhou 310027, PR China

Abstract

In this note we compare two conditions concerning Newton’s method and find that, even under the weaker of the two, one obtains convergence of Newton’s method together with a better error bound than under the relatively strong condition.

© 2007 Elsevier Inc. All rights reserved.

Keywords: Newton’s method; Majorizing function; Center Lipschitz condition

1. Introduction

In this note we investigate the convergence of Newton’s method, i.e. the iterative scheme

$$x_{n+1} = x_n - f'(x_n)^{-1} f(x_n), \qquad n = 0, 1, \ldots \tag{1.1}$$

for solving the equation

$$f(x) = 0, \tag{1.2}$$

where $f$ is an operator defined on a convex subset $D$ of a Banach space $E_1$ with values in a Banach space $E_2$; in other words, $f : D \subseteq E_1 \to E_2$.

Many authors have studied the convergence of the sequence (1.1) towards a solution of (1.2) under the conditions of the Kantorovich theorem (see [4]) or closely related ones (see [5,7]). Roughly speaking, those results assume that the second Fréchet derivative $f''$ is continuous and bounded in $D$, or that $f'$ is Lipschitz continuous in $D$. In order to get better error bounds, the authors of [2,3] studied the convergence of Newton’s method under the assumption that $f''$ is Lipschitz continuous in $D$. Recently, Argyros (see [1]) investigated this method under the assumption that $f$ is $m$-times Fréchet differentiable. To state his result, let $\bar{B}(x_0, r) = \{x \in D : \|x - x_0\| \le r\}$ and $B(x_0, r) = \{x \in D : \|x - x_0\| < r\}$. The condition in [1] can then be formulated as follows.

E-mail address: [email protected]
0096-3003/$ - see front matter © 2007 Elsevier Inc. All rights reserved. doi:10.1016/j.amc.2007.04.035


Condition A. Let $m \ge 2$ be a positive integer, $a_2 > 0$, $g > 0$, and $a_i \ge 0$ for $3 \le i \le m + 1$. Let further $f$ be an $m$-times Fréchet differentiable operator. Assume there exists $x_0 \in D$ such that

$$\|f'(x_0)^{-1} f(x_0)\| \le g, \tag{1.3}$$

$$\|f'(x_0)^{-1} f^{(i)}(x_0)\| \le a_i, \qquad i = 2, \ldots, m, \tag{1.4}$$

$$\|f'(x_0)^{-1} [f^{(m)}(x) - f^{(m)}(x_0)]\| \le a_{m+1} \|x - x_0\| \quad \text{for all } x \in D, \tag{1.5}$$

$$p(s) \le 0, \tag{1.6}$$

where $p$ is defined as follows:

$$p(r) = \frac{a_{m+1}}{(m+1)!} r^{m+1} + \frac{a_m}{m!} r^m + \cdots + \frac{a_2}{2} r^2 - r + g$$

and $s$ is such that $p'(s) = 0$. He showed, among other results, the following.

Theorem A. Under Condition A the iterative sequence $\{x_n\}$ $(n \ge 0)$ generated by (1.1) is well defined and contained in $\bar{B}(x_0, r^*)$. Moreover, $x_n$ converges to a solution $x^*$ of (1.2), which is unique in $\bar{B}(x_0, r^*) \cup B(x_0, r^{**})$, where $r^*$ and $r^{**}$ are the only two positive zeros of $p(r)$. Furthermore, the following error estimates hold for all $n \ge 0$:

$$\|x_{n+1} - x_n\| \le r_{n+1} - r_n, \tag{1.7}$$

$$\|x_n - x^*\| \le r^* - r_n, \tag{1.8}$$

where

$$r_{n+1} = r_n - \frac{p(r_n)}{p'(r_n)}, \qquad r_0 = 0, \quad n = 0, 1, \ldots$$
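The estimates (1.7) and (1.8) can be checked numerically on a small scalar problem. The following is only a sketch under stated assumptions: the test function $f(x) = x^2/4 - x + 1/3$, the starting point $x_0 = 0$ and the constants $a_2 = 1/2$, $a_3 = 1$, $g = 1/3$ are hypothetical choices made here so that (1.3)–(1.6) hold with $m = 2$; they are not taken from [1].

```python
from math import factorial, sqrt

def make_p(a, g):
    """Build p and p' from a = [a_2, ..., a_{m+1}] and g, as in Condition A."""
    def p(r):
        return sum(ai * r**i / factorial(i) for i, ai in enumerate(a, start=2)) - r + g
    def dp(r):
        return sum(ai * r**(i - 1) / factorial(i - 1) for i, ai in enumerate(a, start=2)) - 1.0
    return p, dp

def newton(fn, dfn, start, steps):
    """Return [z_0, ..., z_steps] with z_{k+1} = z_k - fn(z_k) / dfn(z_k)."""
    z = [start]
    for _ in range(steps):
        z.append(z[-1] - fn(z[-1]) / dfn(z[-1]))
    return z

# Hypothetical scalar test problem (an assumption, not from the paper), x_0 = 0.
f = lambda x: x * x / 4 - x + 1 / 3
df = lambda x: x / 2 - 1

# |f(0)/f'(0)| = 1/3 = g, |f''(0)/f'(0)| = 1/2 = a_2; f'' is constant, so any a_3 >= 0 works.
p, dp = make_p([0.5, 1.0], 1 / 3)

x = newton(f, df, 0.0, 4)             # Newton iterates x_n from (1.1)
r = newton(p, dp, 0.0, 4)             # majorizing sequence r_n
x_star = 2 - sqrt(8 / 3)              # smaller root of f
r_star = newton(p, dp, 0.0, 60)[-1]   # r_n increases to the smaller positive zero of p

for n in range(4):
    step_ok = abs(x[n + 1] - x[n]) <= r[n + 1] - r[n] + 1e-12   # estimate (1.7)
    err_ok = abs(x[n] - x_star) <= r_star - r[n] + 1e-12        # estimate (1.8)
    print(n, step_ok, err_ok)
```

For these constants $p'(1) = 0$ and $p(1) = -1/4 \le 0$, so (1.6) holds with $s = 1$. The small tolerance only absorbs floating-point rounding.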

This result generalizes the classical results on Newton’s method. On the other hand, in 1999 Wang (see [6]) considered a similar problem under quite different conditions. Let $q(x) = \|x - x_0\|$, $q(xx') = q(x) + \|x' - x\|$, and let $L(u)$ be a positive nondecreasing integrable function on $[0, d]$ for some $d > 0$. For some $\beta > 0$ denote

$$h(t) := \beta - t + \int_0^t L(u)(t - u)\,du, \qquad 0 \le t \le R, \tag{1.9}$$

where $R$ satisfies $\int_0^R L(u)(R - u)\,du = 1$. We need the following:

Condition B. Suppose that $f$ has a continuous derivative in $\bar{B}(x_0, d)$ and that $f'(x_0)^{-1}$ exists. Let $f'(x_0)^{-1} f'$ satisfy the so-called center Lipschitz condition in the inscribed sphere with the $L$ average, i.e. for all $x \in B(x_0, d)$ and $x' \in \bar{B}(x, d - q(x))$ there holds

$$\|f'(x_0)^{-1} (f'(x') - f'(x))\| \le \int_{q(x)}^{q(xx')} L(u)\,du, \tag{1.10}$$

where $q(xx') \le d$. Let $b = \int_0^{d_0} L(u)\,u\,du$ with $d_0$ satisfying $\int_0^{d_0} L(u)\,du = 1$. Suppose

$$\beta = \|f'(x_0)^{-1} f(x_0)\| \le b \tag{1.11}$$

and $t^* \le d$, where $t^*$, $t^{**}$ are the two positive zeros of $h(t)$.
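The quantities $d_0$ and $b$ in Condition B are defined through integrals of $L$, so for a concrete $L$ they can be approximated numerically. The sketch below is a minimal illustration, not a method from [6]: the helper names `simpson`, `find_d0` and `check_1_11` are hypothetical, the quadrature is a plain composite Simpson rule, and the sample inputs ($L(u) = u$ with $\beta = 1/3$, and $L(u) = u/4$ with $\beta = 2$) are chosen only for demonstration.

```python
def simpson(fn, lo, hi, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    step = (hi - lo) / n
    total = fn(lo) + fn(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * fn(lo + k * step)
    return total * step / 3

def find_d0(L):
    """Bisection for d0 with integral_0^{d0} L(u) du = 1 (L >= 0, so the integral is nondecreasing)."""
    hi = 1.0
    while simpson(L, 0.0, hi) < 1.0:   # enlarge the bracket until the integral reaches 1
        hi *= 2
    lo = 0.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if simpson(L, 0.0, mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def check_1_11(L, beta):
    """Return (d0, b, beta <= b), i.e. the data entering condition (1.11)."""
    d0 = find_d0(L)
    b = simpson(lambda u: L(u) * u, 0.0, d0)
    return d0, b, beta <= b

print(check_1_11(lambda u: u, 1 / 3))      # d0 near sqrt(2), b near 2*sqrt(2)/3, (1.11) holds
print(check_1_11(lambda u: u / 4, 2.0))    # d0 near 2*sqrt(2), b near 4*sqrt(2)/3, (1.11) fails
```

Simpson’s rule is exact for the polynomial integrands used here; for a general $L$ the number of subintervals would have to be chosen with more care.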

One of the results in [6] is the following:

Theorem B. Under Condition B the sequence from (1.1) is well defined for all $n$ and converges to a solution $x^*$ of (1.2) which satisfies

$$x^* \in \bar{B}(x_1, t^* - \beta) \subseteq \bar{B}(x_0, t^*). \tag{1.12}$$


Moreover, for all $n \ge n_0 \ge 0$ the best possible error bounds are

$$\|x^* - x_n\| \le (t^* - t_n) \left( \frac{\|x^* - x_{n_0}\|}{t^* - t_{n_0}} \right)^{2^{n - n_0}} \tag{1.13}$$

and

$$\frac{2\|x_{n+1} - x_n\|}{1 + \sqrt{1 + 4\,\dfrac{t^* - t_{n+1}}{(t^* - t_n)^2}\,(t_{n+1} - t_n)}} \le \|x^* - x_n\| \le (t^* - t_n) \left( \frac{\|x_{n_0+1} - x_{n_0}\|}{t_{n_0+1} - t_{n_0}} \right)^{2^{n - n_0}}, \tag{1.14}$$

where $t_0 = 0$ and $t_n$ is given by

$$t_{n+1} = t_n - \frac{h(t_n)}{h'(t_n)}, \qquad n = 0, 1, \ldots \tag{1.15}$$
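To see how the sequence (1.15) and the bound (1.13) behave in practice, one can build $h$ from a given $L$ by quadrature and compare with Newton iterates for a scalar function. This is a rough sketch under assumptions made here: $L(u) = u$, $\beta = 1/3$, and the hypothetical test function $f(x) = x^3/12 - x + 1/3$ with $x_0 = 0$ (for which $|f'(0)^{-1} f''(x)| = |x|/2 \le L(|x|)$); the helper names are illustrative only.

```python
def simpson(fn, lo, hi, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    step = (hi - lo) / n
    total = fn(lo) + fn(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * fn(lo + k * step)
    return total * step / 3

def make_h(L, beta):
    """h from (1.9) and its derivative h'(t) = -1 + integral_0^t L(u) du."""
    def h(t):
        return beta - t + simpson(lambda u: L(u) * (t - u), 0.0, t)
    def dh(t):
        return -1.0 + simpson(L, 0.0, t)
    return h, dh

def newton(fn, dfn, start, steps):
    """Return [z_0, ..., z_steps] with z_{k+1} = z_k - fn(z_k) / dfn(z_k)."""
    z = [start]
    for _ in range(steps):
        z.append(z[-1] - fn(z[-1]) / dfn(z[-1]))
    return z

# Assumed data: L(u) = u, beta = |f(0)/f'(0)| = 1/3 for the hypothetical f below.
h, dh = make_h(lambda u: u, 1 / 3)
f = lambda x: x**3 / 12 - x + 1 / 3
df = lambda x: x * x / 4 - 1

t = newton(h, dh, 0.0, 3)              # t_n from (1.15)
x = newton(f, df, 0.0, 3)              # x_n from (1.1)
t_star = newton(h, dh, 0.0, 40)[-1]    # numerical limit of t_n
x_star = newton(f, df, 0.0, 40)[-1]    # numerical limit of x_n

ratio = abs(x_star - x[0]) / (t_star - t[0])
for n in range(4):
    bound = (t_star - t[n]) * ratio ** (2 ** n)   # right-hand side of (1.13) with n_0 = 0
    print(n, abs(x_star - x[n]), bound)
```

Only the first few indices are compared, since beyond them both sides fall below double-precision resolution.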

On the one hand, it is easy to see that the assertions of Theorem B are essentially stronger than those of Theorem A. On the other hand, the two conditions (Conditions A and B) are quite different. One may therefore ask whether Condition A is essentially weaker than Condition B. In this note we show that in fact Condition B is weaker than Condition A. Consequently, we can use Theorem B to improve the assertions of Theorem A. Our main result is

Theorem 1.1. If $f$ satisfies Condition A, then $f$ satisfies Condition B. Consequently, under Condition A the sequence $\{x_n\}$ and $x^*$ satisfy (1.12)–(1.14).

2. Proof

We begin with the following simple fact:

Lemma 2.1. In Condition B, a sufficient condition for (1.10) is that $f$ is twice Fréchet differentiable and

$$\|f'(x_0)^{-1} f''(x)\| \le L(q(x)). \tag{2.1}$$

Proof. We have to verify that (2.1) implies (1.10). To this end, we notice

$$f'(x') - f'(x) = \int_0^1 f''(x + s(x' - x))(x' - x)\,ds.$$

Hence,

$$\|f'(x_0)^{-1}(f'(x') - f'(x))\| \le \int_0^1 \|f'(x_0)^{-1} f''(x + s(x' - x))\|\,ds\,\|x' - x\|.$$

Since $L(u)$ is nondecreasing we conclude from (2.1) that

$$\begin{aligned}
\|f'(x_0)^{-1}(f'(x') - f'(x))\| &\le \int_0^1 L(\|x + s(x' - x) - x_0\|)\,ds\,\|x' - x\| \\
&\le \int_0^1 L(\|x - x_0\| + s\|x' - x\|)\,ds\,\|x' - x\| \\
&= \int_{\|x - x_0\|}^{\|x - x_0\| + \|x' - x\|} L(u)\,du = \int_{q(x)}^{q(xx')} L(u)\,du,
\end{aligned}$$

which gives (1.10). □
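The last two steps of the chain in the proof of Lemma 2.1 are stated without comment. Spelled out, and assuming $x' \ne x$ so that the substitution is non-degenerate, they read as follows (a worked restatement, not an addition to the argument):

```latex
% Step 1: triangle inequality, then monotonicity of L
\|x + s(x' - x) - x_0\| \le \|x - x_0\| + s\,\|x' - x\|
\;\Longrightarrow\;
L\bigl(\|x + s(x' - x) - x_0\|\bigr) \le L\bigl(\|x - x_0\| + s\,\|x' - x\|\bigr),
\qquad 0 \le s \le 1.

% Step 2: substitution u = \|x - x_0\| + s\,\|x' - x\|, \; du = \|x' - x\|\,ds
\int_0^1 L\bigl(\|x - x_0\| + s\,\|x' - x\|\bigr)\,ds\;\|x' - x\|
= \int_{\|x - x_0\|}^{\|x - x_0\| + \|x' - x\|} L(u)\,du .
```

The requirement $q(xx') \le d$ in Condition B keeps the upper limit inside the interval $[0, d]$ on which $L$ is defined.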

We are now in a position to prove Theorem 1.1.

Proof of Theorem 1.1. We first prove that if $x \in \bar{B}(x_0, r)$ and $r \in [0, r^*]$, then

$$\|f'(x_0)^{-1} f''(x)\| \le p''(\|x - x_0\|). \tag{2.2}$$


To this end let $e$, $b_1$ and $b_i$, $i = 2, \ldots, m$, be defined by $e = x - x_0$, $b_1 = x_0 + s_1 e$, $b_i = x_0 + s_i(b_{i-1} - x_0)$, $s_i \in [0, 1]$. Thus (see also [1])

$$\begin{aligned}
f'(x_0)^{-1} f''(x) &= f'(x_0)^{-1} f''(x_0) + f'(x_0)^{-1}[f''(x) - f''(x_0)] \\
&= f'(x_0)^{-1} f''(x_0) + f'(x_0)^{-1} \int_0^1 f'''[x_0 + s_1(x - x_0)](x - x_0)\,ds_1.
\end{aligned}$$

Using the same approach for $f'''$ instead of $f''$ we conclude from the above that

$$\begin{aligned}
f'(x_0)^{-1} f''(x) = {} & f'(x_0)^{-1} f''(x_0) + f'(x_0)^{-1} \int_0^1 f'''(x_0)(x - x_0)\,ds_1 \\
& + f'(x_0)^{-1} \int_0^1 [f'''[x_0 + s_1(x - x_0)] - f'''(x_0)](x - x_0)\,ds_1.
\end{aligned}$$

Clearly,

$$\int_0^1 [f'''[x_0 + s_1(x - x_0)] - f'''(x_0)](x - x_0)\,ds_1$$

can be written as

$$\int_0^1 \int_0^1 f^{(4)}\{x_0 + s_2[x_0 + s_1(x - x_0) - x_0]\}[s_1(x - x_0)](x - x_0)\,ds_2\,ds_1.$$

Repeating this argument, we obtain

$$\begin{aligned}
f'(x_0)^{-1} f''(x) = {} & f'(x_0)^{-1} f''(x_0) + f'(x_0)^{-1} \int_0^1 f'''(x_0)e\,ds_1 + \cdots \\
& + \underbrace{\int_0^1 \cdots \int_0^1}_{m-2} f'(x_0)^{-1} f^{(m)}(x_0)(b_{m-3} - x_0) \cdots (b_1 - x_0)e\,ds_{m-2} \cdots ds_1 + T,
\end{aligned}$$

where

$$T = \underbrace{\int_0^1 \cdots \int_0^1}_{m-2} f'(x_0)^{-1}[f^{(m)}(b_{m-2}) - f^{(m)}(x_0)](b_{m-3} - x_0) \cdots (b_1 - x_0)e\,ds_{m-2} \cdots ds_1.$$

Next we apply (1.5) to the last term and (1.4) to the other terms to obtain

$$\begin{aligned}
\|f'(x_0)^{-1} f''(x)\| \le {} & a_2 + a_3 \int_0^1 \|e\|\,ds_1 + \cdots + a_m \underbrace{\int_0^1 \cdots \int_0^1}_{m-2} \|(b_{m-3} - x_0) \cdots (b_1 - x_0)e\|\,ds_{m-2} \cdots ds_1 \\
& + a_{m+1} \underbrace{\int_0^1 \cdots \int_0^1}_{m-2} \|b_{m-2} - x_0\|\,\Bigl\|e \prod_{i=1}^{m-3}(b_i - x_0)\Bigr\|\,ds_{m-2} \cdots ds_1.
\end{aligned}$$

The inequality (2.2) now follows from the definitions of $p$ and $b_i$.

To verify Condition B for $f$, let

$$L(u) = p''(u) = \frac{a_{m+1}}{(m-1)!} u^{m-1} + \cdots + a_3 u + a_2.$$

Then $L(u)$ is positive, nondecreasing and integrable in $[0, r]$. Thus, (2.2) can be rewritten as

$$\|f'(x_0)^{-1} f''(x)\| \le L(q(x)),$$

which implies (1.10) by Lemma 2.1. To determine the remaining quantities in Condition B, let $d_0 > 0$ be such that $\int_0^{d_0} p''(u)\,du = 1$. Since $p'(0) = -1$, we get $p'(d_0) = 0$, while (1.6) tells us that with $d_0 = s$ there holds $p(d_0) \le 0$. Hence, for


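$$b = \int_0^{d_0} L(u)\,u\,du = \int_0^{d_0} p''(u)\,u\,du = g - p(d_0), \tag{2.3}$$

we have by (1.3), together with $p(d_0) \le 0$,

$$\beta = \|f'(x_0)^{-1} f(x_0)\| \le b,$$

i.e. (1.11) is satisfied. Thus, Condition B is satisfied. Moreover, as

$$\int_0^t p''(u)(t - u)\,du = -t\,p'(0) + p(t) - \beta = t + p(t) - \beta$$

(where $p(0) = g$ has been identified with $\beta$), we obtain by (1.9)

$$h(t) = \beta - t + \int_0^t p''(u)(t - u)\,du = p(t).$$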

Hence, the majorizing functions of the two theorems are the same, and so are the sequences $\{r_n\}$ and $\{t_n\}$ of Theorems A and B, respectively. Consequently, $t^* = r^*$, $t^{**} = r^{**}$, and $x^* \in \bar{B}(x_1, t^* - \beta) \subseteq \bar{B}(x_0, t^*) = \bar{B}(x_0, r^*)$. Furthermore, using (1.13), we get for all $n \ge n_0 \ge 0$

$$\|x^* - x_n\| \le (r^* - r_n) \left( \frac{\|x^* - x_{n_0}\|}{r^* - r_{n_0}} \right)^{2^{n - n_0}} \le r^* - r_n,$$

which is better than (1.8), while (1.7) is contained in the proof of Theorem B (see [6]). □
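The two identities used at the end of the proof, $b = g - p(d_0)$ from (2.3) and $h = p$, can be sanity-checked numerically for a concrete polynomial. The sketch below assumes $m = 2$ with hypothetical constants $a_2 = 1/2$, $a_3 = 1$, $g = 1/3$, takes $\beta = g$, and uses a simple Simpson rule; none of these choices come from the paper.

```python
from math import factorial

a = [0.5, 1.0]   # hypothetical a_2, a_3 (so m = 2)
g = 1 / 3        # hypothetical bound from (1.3); beta is taken equal to g

def p(r):
    return sum(ai * r**i / factorial(i) for i, ai in enumerate(a, start=2)) - r + g

def dp(r):
    return sum(ai * r**(i - 1) / factorial(i - 1) for i, ai in enumerate(a, start=2)) - 1.0

def ddp(u):
    # L(u) = p''(u)
    return sum(ai * u**(i - 2) / factorial(i - 2) for i, ai in enumerate(a, start=2))

def simpson(fn, lo, hi, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    step = (hi - lo) / n
    total = fn(lo) + fn(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * fn(lo + k * step)
    return total * step / 3

# d0 solves p'(d0) = 0, equivalently integral_0^{d0} p''(u) du = 1; bisection on [0, 4].
lo, hi = 0.0, 4.0
for _ in range(60):
    mid = (lo + hi) / 2
    if dp(mid) < 0:
        lo = mid
    else:
        hi = mid
d0 = (lo + hi) / 2

b = simpson(lambda u: ddp(u) * u, 0.0, d0)   # left-hand side of (2.3)

def h(t):
    # h from (1.9) with L = p'' and beta = g
    return g - t + simpson(lambda u: ddp(u) * (t - u), 0.0, t)

print(d0, b, g - p(d0))
print([abs(h(t) - p(t)) for t in (0.0, 0.2, 0.5, 1.0)])
```

With these constants $d_0 = 1$ and $b = 7/12$, and the differences $|h(t) - p(t)|$ are at rounding level.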

Remark 1. Let us look at (2.3). Clearly, a necessary and sufficient condition for $\beta \le b$ with $\beta = g$ is $p(d_0) \le 0$, i.e. $p(s) \le 0$. Therefore, we cannot find an $m$-times Fréchet differentiable function $f$ which satisfies Condition B but not Condition A. In this sense Theorem B is superior to Theorem A. On the other hand, Theorem A gives a method to compute $L(u)$ in Theorem B.

3. More about Theorems A and B

In Theorem B, if $L(u)$ is a positive constant $L$, then the function $h$ defined by $h(t) = \beta - t + Lt^2/2$ is the majorizing function of the Kantorovich theorem. Consequently, we obtain the convergence of Newton’s method and the corresponding a posteriori error estimate. Moreover, if $L(u) = 2c/(1 - cu)^3$, where $c$ satisfies $\|f'(x_0)^{-1} f^{(n)}(x_0)\| \le n!\,c^{n-1}$, $n \ge 2$, we get a convergence theorem under a premise of Smale type and the corresponding error estimate (see [6]).

From the computational point of view, if $m$ in Theorem A is large, it takes some time to verify (1.3)–(1.6). However, with the choice $L(u) = 2c/(1 - cu)^3$ we only need to compute $\|f'(x_0)^{-1} f''(x)\|$, which may save computation time.

Taking $m = 2$ in Theorem A, we get the main result of Huang (see [3]). Since Condition B is weaker than Condition A (see Theorem 1.1), we can also deduce Huang’s result from Theorem B with the choice $L(u) = Ku + c$, where $K$ and $c$ are defined as in [3]. In the same way, one can also obtain the result concerning Newton’s method in [2].

Although Theorem B can be used to obtain convergence theorems in some cases, Theorem B does not seem comparable with the Kantorovich theorem. In other words, one can construct examples for which the Kantorovich assumptions fail while those of Theorem B hold, and vice versa. To this end, we note that the Kantorovich theorem assumes: $f$ is Fréchet differentiable and there exists $x_0 \in D$ such that

$$\|f'(x_0)^{-1} f(x_0)\| \le a,$$

$$\|f'(x_0)^{-1}[f'(x) - f'(y)]\| \le c\,\|x - y\| \quad \text{for all } x, y \in D,$$

and $ac \le 1/2$. (This is a little weaker than the original Kantorovich assumptions.)

Example 1. Let $E_1 = E_2 = \mathbb{R}$, $D = [-5, 5]$, $x_0 = 0$, and let $f$ be defined on $D$ by

$$f(x) = \frac{1}{6}x^3 - x + \frac{1}{3}.$$


We have $a = 1/3$ and $c = 5$, so $ac = 5/3 > 1/2$. Therefore, the Kantorovich condition fails and the convergence of Newton’s sequence starting from $x_0$ cannot be deduced from the Kantorovich theorem. On the other hand, (1.11) is satisfied with $\beta = a = 1/3$, $d_0 = \sqrt{2}$, $b = 2\sqrt{2}/3$. Moreover, $f'(x_0) = -1$, $f''(x) = x$ and $L(u) = u$, so $\|f'(x_0)^{-1} f''(x)\| \le L(\|x - x_0\|)$. Therefore, (1.10) is fulfilled. Consequently, Theorem B can be applied to obtain the convergence of Newton’s sequence starting from $x_0$.

Example 2 (Gutiérrez [2]). Let $E_1 = E_2 = \mathbb{R}$, $x_0 = 0$. Let $f : E_1 \to E_2$ be given by

$$f(x) = \sin(x) - 5x - 8.$$

In this case $a = 2$ and $c = 1/4$, so $ac = 1/2$. Hence the hypothesis of the Kantorovich theorem holds. However, for $L(u) = u/4$, $d_0 = 2\sqrt{2}$, $b = 4\sqrt{2}/3$ and $\beta = a = 2$, one has $\beta > b$. So the conditions of Theorem B are not satisfied.

References

[1] I.K. Argyros, A Newton–Kantorovich theorem for equations involving m-Frechet differentiable operators and applications in radiative transfer, J. Comput. Appl. Math. 131 (2001) 149–159.
[2] J.M. Gutiérrez, A new semilocal convergence theorem for Newton’s method, J. Comput. Appl. Math. 79 (1997) 131–145.
[3] Z. Huang, A note on the Kantorovich theorem for Newton iteration, J. Comput. Appl. Math. 47 (1993) 211–217.
[4] L.V. Kantorovich, G.P. Akilov, Functional Analysis, Pergamon Press, Oxford, 1982.
[5] A.M. Ostrowski, Solution of Equations in Euclidean and Banach Spaces, Academic Press, New York, 1979.
[6] Xinghua Wang, Convergence of Newton’s method and inverse function theorem in Banach space, Math. Comput. 68 (1999) 169–186.
[7] T. Yamamoto, A method for finding sharp error bounds for Newton’s method under the Kantorovich assumptions, Numer. Math. 49 (1986) 203–220.
