Extending the applicability of Newton’s method on Lie groups

Applied Mathematics and Computation 219 (2013) 10355–10365 Contents lists available at SciVerse ScienceDirect Applied Mathematics and Computation jo...

Download PDF

284KB Sizes 0 Downloads 39 Views

Report

PDF Reader
Full Text

Applied Mathematics and Computation 219 (2013) 10355–10365

Contents lists available at SciVerse ScienceDirect

Applied Mathematics and Computation journal homepage: www.elsevier.com/locate/amc

Extending the applicability of Newton’s method on Lie groups Ioannis K. Argyros a,⇑, Saïd Hilout b a b

Cameron University, Department of Mathematics Sciences, Lawton, OK 73505, USA Poitiers University, Laboratoire de Mathématiques et Applications, Bd. Pierre et Marie Curie, Téléport 2, B.P. 30179, 86962 Futuroscope Chasseneuil Cedex, France

a r t i c l e

i n f o

a b s t r a c t We use Newton’s method to approximate a zero of a mapping from a Lie group into its Lie algebra. Under the same computational cost as before, we show the semilocal convergence of Newton’s method with the following advantages over earlier works [55]: weaker sufﬁcient convergence conditions, tighter error bounds on the distances involved and at least as precise information on the location of the solution. Numerical examples are also given for solving equations in cases not covered before. Ó 2013 Elsevier Inc. All rights reserved.

Keywords: Newton’s method Lie group Lie algebra Riemannian manifold Semilocal/local convergence Majorizing sequence Kantorovich hypothesis

1. Introduction In this study, we are concerned with the problem of approximating a zero xH of C1 -mapping F : G ! Q , where G is a Lie group and Q the Lie algebra of G that is the tangent space T e G of G at e, equipped with the Lie bracket ½; : Q Q ! Q [7,20,27,28,52]. The study of numerical algorithms on manifolds for solving eigenvalue or optimization problems on Lie groups [1–4,6– 9,11,13–37,39–47,49–56] is very important in Computational Mathematics. Newton-type methods are the most popular iterative procedures used to solve equations, when these equations contain differentiable operators. A local as well as a semilocal convergence of Newton-type methods has been given by several authors under various conditions [2–4,6–11,13– 37,39–47,49–57]. There are two types of convergence results: the ﬁrst uses information from the domain of an operator (see the well-known Kantorovich theorem [7,31]); where as the second uses information only at a point (see Smale’s paper [46]). In particular, a convergence analysis of Newton’s method on manifolds can be found in [1]–[4,8,11]–[16,24,25,32– 34,49,51]. Newton’s method (NM) with initial point x0 2 G was ﬁrst introduced by Owren and Welfert [42] in the form 1

xnþ1 ¼ xn exp ðdF xn Fðxn ÞÞ ðn P 0Þ:

ð1:1Þ H

(NM) is undoubtedly the most popular method for generating a sequence fxn g approximating x [7–37,39–47,49–55,31]. Let

g ¼ kdF 1 x0 Fðx0 Þk and L > 0 be a Lipschitz constant (precised in (3.2)). Wang and Li [55] recently used the Kantorovich hypothesis

h0 ¼ L g 6

1 2

together with majorizing sequence

⇑ Corresponding author. E-mail addresses: [email protected] (I.K. Argyros), [email protected] (S. Hilout). 0096-3003/$ - see front matter Ó 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.amc.2013.04.007

ð1:2Þ

10356

I.K. Argyros, S. Hilout / Applied Mathematics and Computation 219 (2013) 10355–10365

0

t nþ1 ¼ tn h ðtn Þ1 hðt n Þ ¼ t n þ

t0 ¼ 0; t 1 ¼ g;

L ðtn tn1 Þ2 ; 2ð1 L t n Þ

ð1:3Þ

where

hðtÞ ¼

L 2 t tþg 2

ð1:4Þ

and

lim t n ¼ .0 ¼

1

n!1

pﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ 1 2 h0 L

ð1:5Þ

to provide a convergence analysis for (NM) that improved earlier works [33–35]. Hypothesis (1.2) is the famous for its simplicity and clarity sufﬁcient convergence condition for solving nonlinear equations due to Kantorovich [31]. However, condition (1.2) can be violated. Based on the same information ðF; x0 ; LÞ, we provided weaker sufﬁcient convergence conditions

h1 ¼ ‘1 g 6

1 2

½4—9; 12

ð1:6Þ

h2 ¼ ‘2 g 6

1 2

½7;

ð1:7Þ

or

using the majorizing sequence

s0 ¼ 0;

s1 ¼ g;

snþ2 ¼ snþ1 þ

Lðsnþ1 sn Þ2 2ð1 L0 snþ1 Þ

ðn P 0Þ;

ð1:8Þ

where L0 2 ½0; L is a center-Lipschitz constant (precised in (3.3), which always exists if L does),

‘1 ¼

L þ L0 ; 2

‘2 ¼

qﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ 1 L þ 4 L0 þ L2 þ 8 L0 L ; 8

sH ¼ lim sn 6 .1 ¼ n!1

ð1:9Þ

ð1:10Þ

g

ð1:11Þ

1a

and

a¼

2L qﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ : L þ L2 þ 8 L0 L

ð1:12Þ

Note that

h0 6

1 1 1 ) h1 6 ) h2 6 ; 2 2 2

ð1:13Þ

but not necessarily vice versa unless if L ¼ L0 . We also showed for L0 < L and n P 2:

sn < tn ;

ð1:14Þ

snþ1 sn < t nþ1 t n

ð1:15Þ

sH 6 .0 :

ð1:16Þ

and

In particular, note that

h1 1 ! ; 2 h0

h2 1 ! ; 2 h0

h2 1 ! 2 h1

as

L0 ! 0: L

Hence, the following advantages have been obtained in semilocal case (As ): (i) weaker sufﬁcient convergence conditions; (ii) tighter error estimates on the distances involved;

ð1:17Þ

I.K. Argyros, S. Hilout / Applied Mathematics and Computation 219 (2013) 10355–10365

10357

and (iii) An at least as precise information on location of the solution xH . That is, the applicability of (NM) has been extended. These advantages are obtained since we use the needed, less expensive to compute and more precise (3.3) (i.e., L0 ) instead of (3.2) (i.e., L) in the computation for the upper bounds on 1 kdF x dF x0 k (x 2 G). Indeed, we get (see (3.5)) 1 kdF x dF x0 k

6

!1 k X 1 L0 kzi k

ð1:18Þ

i¼0

instead of the less tight and more expensive to compute estimate (for L0 < L)

1

kdF x dF x0 k 6

1L

k X kzi k

!1 ð1:19Þ

i¼0

used in [55]. This modiﬁcation leads to more precise majorizing sequence fsn g (for fxn g) than ft n g, which makes our approach different and ﬁner from the one in [55] under the same computational cost and information, since in practice the computation of L requires that of L0 . In this paper, we are motivated by the work in [55] and optimization considerations. Using again a combination of Lipschitz and center-type Lipschitz conditions, but even more precise majorizing sequences, we provide a convergence analysis with advantages (As ) (under the same computational cost) over the work in [55]. We are wondering if in particular the weakest hypothesis (1.7) can be improved. It turns out that in the case of nonlinear equations when F is an operator between two Banach spaces, the sufﬁcient convergence condition is

h3 ¼ ‘3 g 6

1 2

½14;

ð1:20Þ

where

‘3 ¼

pﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ 1 pﬃﬃﬃﬃﬃﬃﬃﬃ L0 L þ 4 L0 þ L0 ðL þ 8 L0 Þ 8

and the majorizing sequence given by

t0 ¼ 0;

t1 ¼ g;

t2 ¼ g þ

L0 g2 ; 2ð1 L0 gÞ

t nþ2 ¼ t nþ1 þ

Lðtnþ1 t n Þ2 2ð1 L0 t nþ1 Þ

ðn P 1Þ;

ð1:21Þ

where

tH ¼ lim tn 6 .3 ¼ g þ n!1

1 L0 g2 : 1 a 2ð1 L0 gÞ

ð1:22Þ

In this case

h3 ! 0; h0

h3 ! 0; h1

h3 L0 ! 0 as !0 h2 L

ð1:23Þ

and for L0 < L; n P 2:

t n < sn ;

ð1:24Þ

tnþ1 t n < snþ1 sn

ð1:25Þ

t H < sH :

ð1:26Þ

and

Hence, the applicability of (NM) for solving nonlinear equations is extended even further. The convergence of majorizing sequences is quadratic unless if equality holds in the corresponding ’’h’’ conditions, in which case it is only linear. In this study, we show that the last conditions can be used in the setting of (NM). Then we have the advantages (As ). The paper is organized as follows. Section 2 contains the necessary background on Lie groups. In Section 3, we present the semilocal convergence analysis of (NM). Finally, numerical examples are given in the concluding Section 4, where L0 < L and our ’’h’’ conditions hold but not Kantorovich hypothesis (1.2) [55].

10358

I.K. Argyros, S. Hilout / Applied Mathematics and Computation 219 (2013) 10355–10365

2. Preliminaries In order us to make the paper as self contained as possible, we re-introduce standard concepts and notations [7,20,27,28,48,52,55]. A Lie group (G,) is a Hausdorff topological group with countable bases which also has the structure of a smooth manifold such that the group product and the inversion are smooth operations in the differentiable structure given on the manifold. The dimension of a Lie group is that of the underlying manifold and we shall always assume that it is ﬁnite. The symbol e designates the identity element of G. Let Q be the Lie algebra of the Lie group G which is the tangent space T e G of G at e, equipped with Lie bracket ½; : Q Q ! Q . In the sequel we will make use of the left translation of the Lie group G. We deﬁne for each y 2 G

Ly : G ! G;

ð2:1Þ

z ! y z;

the left multiplication in the group. The differential of Ly at e denoted by ðdLy Þe determines an isomorphism of Q ¼ T e G with the tangent space T y G via the relation

ðdLy Þe ðQ Þ ¼ T y G or, equivalently,

Q ¼ ðdLy Þ1 e ðT y GÞ ¼ ðdLy1 Þy ðT y GÞ: The exponential map is noted by exp and deﬁned by

exp : Q ! G u ! expðuÞ; which is certainly the most important construct associated to G and Q. Given u 2 Q , the left invariant vector ﬁeld X u : y ! ðdLy Þe ðuÞ determines a one-parameter subgroup of G : ru : R ! G such that

ru ð0Þ ¼ e and r0u ðtÞ ¼ X u ðru ðtÞÞ ¼ ðdLru ðtÞ Þe ðuÞ 8t 2 R:

ð2:2Þ

Consequently, the exponential map is deﬁned by the relation

expðuÞ ¼ ru ð1Þ: Consider f : G ! Q ¼ T e G be a C1 -mapping. The differential of f at a point x 2 G is a linear map fx0 : T x G ! Q deﬁned by

fx0 ðMx Þ ¼

d f ðx expðtððdLx1 Þx ÞðMx ÞÞÞjt¼0 dt

for any Mx 2 T x G:

ð2:3Þ

The differential fx0 can be expressed via a function dfx : Q ! Q given by

dfx ¼ ðf Lx Þ0e ¼ fx0 ðdLx Þe : Thus, by (2.3), it follows that

dfx ðuÞ ¼ fx0 ððdLx Þe ðuÞÞ ¼

d f ðx expðtuÞÞjt¼0 dt

for any u 2 Q :

Therefore the following lemma is clear. Lemma 2.1. Let x 2 G; u 2 Q and t 2 R. Then

d f ðx expðtuÞÞ ¼ dfxexpðtuÞ ðuÞ dt

ð2:4Þ

and

f ðx expðtuÞÞ f ðxÞ ¼

Z

t

dfxexpðsuÞ ðuÞ ds:

ð2:5Þ

0

3. Convergence analysis for (NM) We shall study the semilocal/local convergence of (NM). In the rest of the paper we assume h; i the inner product and k k on Q. As in [55] we deﬁne a distance on G for x; y 2 G as follows:

mðx; yÞ ¼ inf

I.K. Argyros, S. Hilout / Applied Mathematics and Computation 219 (2013) 10355–10365

10359

( ) k X kzi k : there exist k P 1 and z1 ; . . . ; zk 2 Q such that y ¼ x exp z1 exp zk :

ð3:1Þ

i¼1

By convention inf ; ¼ þ1. It is easy to see that mð; Þ is a distance on G and the topology induced is equivalent to the original one on G. Let w 2 G and r > 0, we denote by

Uðw; rÞ ¼ fy 2 G : mðw; yÞ < rg; the open ball centered at w and of radius r. Moreover, we denote the closure of Uðw; rÞ by Uðw; rÞ. Let also LðQ Þ denotes the set of all linear operators on Q. We need the deﬁnition of Lipschitz continuity for a mapping. Deﬁnition 3.1. Let M : G ! LðQ Þ; x0 2 G and r > 0. We say that M satisﬁes the L-Lipschitz condition on Uðx0 ; rÞ if

kMðx exp uÞ MðxÞk 6 L kuk

ð3:2Þ

holds for any u 2 Q and x 2 Uðx0 ; rÞ such that kuk þ mðx; x0 Þ < r . It follows from (3.2) that there exists L0 such that

kMðx exp uÞ Mðx0 Þk 6 L0 ðkuk þ mðx; x0 ÞÞ

ð3:3Þ

holds for any u 2 Q and x 2 Uðx0 ; rÞ such that kuk þ mðx; x0 Þ < r. We then say M satisﬁes the L0 -center Lipschitz condition at x0 2 G on Uðx0 ; rÞ. Note that if M satisﬁes the L-Lipschitz condition on Uðx0 ; rÞ, then it also satisﬁes the L0 -center Lipschitz condition at x0 2 G on Uðx0 ; rÞ. Clearly,

L0 6 L

ð3:4Þ

holds in general and LL0 can be arbitrarily large [4–15]. Let us show that (3.2) indeed implies (3.3). If (3.2) holds, then for x ¼ x0 there exists K 2 ð0; L such that

kMðx0 exp uÞ Mðx0 Þk 6 K kuk for any u 2 Q

such that kuk < r:

There exist k P 1 and z0 ; z1 ; . . . ; zk 2 Q such that x ¼ x0 exp z0 exp zk . Let y0 ¼ x0 , yiþ1 ¼ yi exp ui ; i ¼ 0; 1; . . . ; k. Then, we have that x ¼ ykþ1 . We can write the identity

Mðx exp uÞ Mðx0 Þ ¼ ðMðx exp uÞ MðxÞÞ þ ðMðxÞ Mðx0 ÞÞ ¼ ðMðx exp uÞ MðxÞÞ þ ðMðyk exp uk Þ Mðyk ÞÞ þ ðMðyk Þ Mðx0 ÞÞ ¼ ðMðx exp uÞ MðxÞÞ þ ðMðyk exp uk Þ Mðyk ÞÞ þ ðMðyk1 exp uk1 Þ Mðx0 ÞÞ ¼ ðMðx exp uÞ MðxÞÞ þ ðMðyk exp uk Þ Mðyk ÞÞ þ þ ðMðy1 exp u1 Þ Mðy1 ÞÞ þ ðMðx0 exp u0 Þ Mðx0 ÞÞ: Using (3.2) and the triangle inequality, we obtain in turn that

kMðx exp uÞ Mðx0 Þk 6 kMðx exp uÞ MðxÞk þ kMðyk exp uk Þ Mðyk Þk þ þ kMðy1 exp u1 Þ Mðy1 Þk þ kMðx0 exp u0 Þ Mðx0 Þk 6 L ðkuk þ kuk k þ þ ku1 kÞ þ K ku0 k ¼ L ðkuk þ kuk k þ þ ku0 kÞ þ ðK LÞ ku0 k 6 L ðkuk þ kuk k þ þ ku0 kÞ; which implies (3.3). We need a Banach type lemma on invertible mappings. The conclusion of the next result is weaker and more precise than in [[55] Lemma 2.2]. However, since the proof is essentially the same (simply replace L by L0 ) it is omitted. 1

1

Lemma 3.2. Let x0 2 G such that dF x0 exists and let 0 < r 6 L10 . Suppose dF x0 dF satisﬁes the L0 -center Lipschitz condition at P x0 2 G on Uðx0 ; rÞ; for x 2 Uðx0 ; rÞ, there exist k P 1 and z0 ; z1 ; . . . ; zk 2 Q , such that x ¼ x0 exp z0 exp zk and ki¼0 kzi k < r. 1 Then, linear mapping dF x exists and 1 kdF x dF x0 k

6

!1 k X 1 L0 kzi k :

ð3:5Þ

i¼0

Remark 3.3. The corresponding proof in [[55] Lemma 2.2] uses (3.2) (see also (1.19)) instead of the less expensive and tighter (for L0 < L) (3.3). Our approach differs from [55] after this point, since (3.5) leads to tighter majorizing sequence and estimates. We can show the main semilocal convergence result for (NM). 1

Theorem 3.4. Let F : G ! Q be a C1 -mapping. Let x0 2 G be such that dF x0 exists and let r > 0. Set

10360

I.K. Argyros, S. Hilout / Applied Mathematics and Computation 219 (2013) 10355–10365 1

kdF x0 Fðx0 Þk ¼ g: Suppose

1 dF x0 dF

ð3:6Þ

satisﬁes the L-Lipschitz condition on Uðx0 ; rÞ; (1.20) holds and

Uðx0 ; .3 Þ Uðx0 ; rÞ;

ð3:7Þ

where .3 is given in (1.22). Then, sequence fxn g (n P 0) generated by (NM) is well deﬁned, remains in Uðx0 ; .3 Þ for all n P 0 and converges to a zero xH 2 Uðx0 ; .3 Þof mapping F. Moreover, the following estimates hold for all n P 1:

mðxnþ1 ; xn Þ 6 kdF xn Fðxn Þk 6 tnþ1 tn

1

ð3:8Þ

mðxH ; xn Þ 6 tH t n ;

ð3:9Þ

and

where ftn g (n P 0) and t H are given in (1.21) and (1.22), respectively. 1

Proof. We shall ﬁrst show using induction that (3.8) holds for all n P 0. Set zn ¼ dF xn Fðxn Þ. We have by (3.6) that z0 is well 1 deﬁned and x1 ¼ x0 exp z0 . We have mðx1 ; x0 Þ 6 kz0 k and kz0 k ¼ kdF x0 Fðx0 Þk ¼ g ¼ t1 t0 , which shows (3.8) for n ¼ 0. Assume zn is well deﬁned and (3.8) holds for all n 6 k 1. Then, we have that k1 X kzi k 6 tk t0 ¼ tk 6 .3

and xk ¼ x0 exp z0 exp zk1 :

ð3:10Þ

i¼0 1

It follows from Lemma 3.2 and (3.7) that dF xk exists and 1

kdF xk dF x0 k 6 ð1 L0 tk Þ1 :

ð3:11Þ

We also have that

Fðx1 Þ ¼ Fðx1 Þ Fðx0 Þ dF x0 z0 ¼

Z 0

1

ðdF x0 expðt z0 Þ dF x0 Þ z0 dt

ð3:12Þ

and 1

kdF x0 Fðx1 Þk 6

Z

1

1

kdF x0 ðdF x0 expðt z0 Þ dF x0 Þk kz0 k dt 6

0

Z

1

L0 kt z0 k kz0 k dt 6

0

L0 ðt1 t0 Þ2 : 2

ð3:13Þ

Then, we obtain that 1

1

1

kz1 k ¼ k dF x1 Fðx1 Þk 6 kdF x1 dF x0 k kdF x0 Fðx1 Þk 6

L0 ðt1 t0 Þ2 ¼ t2 t1 ; 2ð1 L0 t 1 Þ

ð3:14Þ

which shows (3.8) for n ¼ 1. Similarly for k > 1, we have that

Fðxk Þ ¼ Fðxk Þ Fðxk1 Þ dF xk1 zk1 ¼ 1

kdF x0 Fðxk Þk 6

Z 0

1

Z

1

0

ðdF xk1 expðt zk1 Þ dF xk1 Þ zk1 dt

1

kdF x0 ðdF xk1 expðt zk1 Þ dF xk1 Þk kzk1 k dt 6

Z

1

L kt zk1 k kzk1 k dt 6

0

ð3:15Þ L ðt k tk1 Þ2 : 2

ð3:16Þ

Hence, we get that 1

1

1

kzk k ¼ k dF xk Fðxk Þk 6 kdF xk dF x0 k kdF x0 Fðxk Þk 6

L ðtk tk1 Þ2 ¼ t kþ1 t k : 2ð1 L0 t k Þ

ð3:17Þ

We also have mðxkþ1 ; xk Þ 6 kzk k, since xkþ1 ¼ xk exp zk , which completes the induction for (3.8). It follows from (3.8) that fxk g is a Cauchy sequence in a Banach space and as such it converges to some xH 2 Uðx3 ; .3 Þ (since Uðx3 ; .3 Þ is a closed set). By letting k ! 1 in (3.16), we get that xH is a zero of F. Estimate (3.9) follows from (3.8) by using standard majorization techniques [7,31]. That completes the proof of Theorem 3.4. h Remark 3.5. If L ¼ L0 , Theorem 3.4 reduces to Theorem [[55] Theorem 3.1, p. 328]. Otherwise (i.e., if L0 < L), it constitutes an improvement under the same information as already stated in the Introduction of this study. Clearly, the conclusions of Theorem 3.4 hold true if (h1 ; .1 ) or (h2 ; .1 ) replace (h3 ; .3 ).

I.K. Argyros, S. Hilout / Applied Mathematics and Computation 219 (2013) 10355–10365

10361

4. Applications to optimization Following Mahony [38,39] and Wang and Li [55], we provide some applications to optimization problems. Let f : G ! R be a C2 -mapping. Consider the optimization problem:

min f ðxÞ:

ð4:1Þ

x2G

~ the left-invariant vector ﬁeld associated with X and given by Let X 2 Q . Denote by X

~ XðxÞ ¼ ðL0x Þe X

for x 2 G;

ð4:2Þ

~ f is the Lie derivative of f with respect to the left-invariant vector ﬁeld X. ~ That is, we have for each x 2 G where X

~ f ÞðxÞ ¼ d j f ðx exp t XÞ: ðX dt t¼0

ð4:3Þ

Let fX 1 ; . . . ; X n g be an orthonormal basis of Q. Then grad f [52] is a vector ﬁeld on G given by

grad f ðxÞ ¼ ðX~1 ; . . . ; X~n ÞðX~1 f ðxÞ; . . . ; X~n f ðxÞÞT ¼

n X

X~j f ðxÞ X~j

for x 2 G:

ð4:4Þ

j¼1

Then, (NM) according to Mahony [39] is given. Algorithm 4.1. Find X k 2 G such that

~ k ¼ ðL0 Þ X k X x e

~ k f Þðxk Þ ¼ 0: and grad f ðxk Þ þ grad ðX

Set xkþ1 ¼ xk exp X k . Set k k þ 1 and repeat. Let F : G ! Q be a mapping deﬁned by

FðxÞ ¼ ðL0x Þ1 for x 2 G: e grad f ðxÞ

ð4:5Þ

Deﬁne linear operator Hx f : Q ! Q by

~ for X 2 Q: ðHx f ÞX ¼ ðL0x Þ1 e gradðX f ÞðxÞ

ð4:6Þ

Then, HðÞ deﬁnes a mapping from G to LðQ Þ such that

d F x ¼ Hx f

½55:

ð4:7Þ

It follows from (4.7) that sequence generated by Algorithm 4.1 for f coincides with the one generated by (NM) for F deﬁned by (4.5). Let x0 2 G be such that ðHx0 f Þ1 exists and set

gf ¼ kðHx0 f Þ1 ðL0x0 Þ1 e grad f ðx0 Þk: With the above notation the main results of Section 3 can be re-written as follows: Theorem 4.2. Suppose

h3 ¼ ‘3 gf 6

1 ; 2

ðHx0 f Þ1 ðHðÞ f Þ satisﬁes L-Lipschitz condition and L0 -center Lipschitz condition at x0 2 G on Uðx0 ; .3 Þ. Then, sequence generated by Algorithm 4.1 starting at x0 is well deﬁned and converges to a critical point xH of f with grad f ðxH Þ ¼ 0. Moreover, if Hx0 f is positive deﬁnite

kðHx0 f Þ1 k kHxexp z f Hx f k 6 L kzk and

kðHx0 f Þ1 k kHx0 exp z f Hx0 f k 6 L0 kzk for all x 2 G and z 2 Q with mðx0 ; xÞ þ kzk < .3 , then xH is a local solution of (4.1). Corollary 4.3. Let G be an Abelian group. Let xH 2 G be a local optimal solution of (4.1) such that ðHxH f Þ1 exists. Suppose ðHxH f Þ1 ðHðÞ f Þ satisﬁes L-Lipschitz condition and LH -center Lipschitz condition at xH on UðxH ; L1H Þ. Let also x0 2 UðxH ; 4aLÞ. Then, the sequence generated by Algorithm 4.1 starting at x0 is well deﬁned, remains in UðxH ; 4aLÞ for all n P 0 and converges to some critical point yH of f such mðxH ; yH Þ < 1=LH . Next, we provide numerical example, where tighter analysis than [55] is given.

10362

I.K. Argyros, S. Hilout / Applied Mathematics and Computation 219 (2013) 10355–10365

Example 4.4. Let dmx : Q 2 ! Q for a C2 -map 2

d

m : G ! Q . Consider the second differential equation [55]

@2 mðx exp t2 u2 exp t1 u1 Þ @t2 @t 2

mx u1 u2 ¼

! ð4:8Þ

: t 2 ¼t 1 ¼0

We can write that

dmxexp t u dmx ¼

Z

T

2

d

mxexp s u u ds for all u 2 Q and t 2 R:

ð4:9Þ

0

Let N 2 NH and let e ¼ I N the identity matrix of order N. Consider the linear group G ¼ SLðN; RÞ ¼ fx 2 RNN : det x ¼ 1g endowed with standard matrix multiplication. Then G is a connected Lie group and Q is deﬁned by Q ¼ T e G ¼ slðN; RÞ ¼ fx 2 RNN : tr x ¼ 0g. Q is endowed with the inner product hx; yi ¼ tr ðxT yÞ for each x and y in Q. The corresponding norm is the Frobenius nom. IQ denotes the identity on Q. The exponential map exp : Q ! G is given by P P exp ðuÞ ¼ n2N un =n! for each u 2 Q and its inverse is exp1 ðxÞ ¼ n2NH ð1Þn1 ðx eÞn =n for each x 2 G with kx ek < 1. Let g : G R ! Q be a differentiable map and let x0 be a random initial guess. Consider the initial value problem on G as follows [42]

x0 ¼ x gðxÞ

ð4:10Þ

xð0Þ ¼ x0 where gðxÞ ¼ ðsin xÞ ð2x 5 x2 Þ ððsin xÞ ð2 x 5 x2 ÞÞT for each x 2 G. The function sin is deﬁned as

sin x ¼

X

ð1Þk1

k2NH

x2 k1 ð2 k 1Þ!

for each x 2 G:

Then, we can write g as

gðxÞ ¼ 2

X

ð1Þk1

k2NH

X x2k ðx2 k ÞT x2kþ1 ðx2kþ1 ÞT 5 ð1Þk1 ð2k 1Þ! ð2k 1Þ! H

for all x 2 G:

ð4:11Þ

k2N

Let D the size of the Euler discretization method of (4.10). Then, (4.10) leads to the ﬁxed-point problem

x ¼ x0 exp ðD gðxÞÞ

ð4:12Þ

Let F : G ! Q deﬁned by

FðxÞ ¼ exp1 ðx1 for each x 2 G: 0 xÞ D gðxÞ Then the resolution of (4.12) is equivalent to ﬁnding a zero of F. Consider the special case D ¼ 1 and x0 ¼ exp v 0 with and x ¼ exp v 1 exp v 2 exp v k for some v i 2 Q (1 6 i 6 k) with

vP0 2 Q satisfying kv 0 k 6 1=11. Let uðÞ ¼ exp1 ðx1 0 ðÞÞ k i¼0 kv i k < 1=4. By [42,55], we have that k 5X 5 kv i k < < 1: 4 i¼0 16

kx1 0 x ek 6

ð4:13Þ

By the deﬁnition of exp1 and (4.13), we have that

dux ðuÞ ¼

X

ð1Þj1

Pj1

1 i¼0 ðx0 x

j2N

j1i 1 eÞi x1 0 x uðx0 x eÞ j

for each u 2 Q

ð4:14Þ

and

kdux IQ k 6

P 2 kx1 10 ki¼0 kv i k 0 x ek 6 : P 1 kx1 4 5 ki¼0 kv i k 0 x ek

ð4:15Þ

We also have that 2

kd ux k 6

P 3 þ 54 ki¼0 kv i k 6 2 : 2 P ð1 kx1 0 x ekÞ 1 54 ki¼0 kv i k 3 þ kx1 0 x ek

ð4:16Þ

Let v 2 Q and x 2 Uðx0 ; 1=50Þ. Then there exist k P 1; v 1 ; . . . ; v k 2 Q such that x ¼ exp v 1 exp v 2 exp v k and P kv k þ ki¼1 kv i k < 1=50. Let s 2 ½0; 1 and y ¼ x exp s v . By (4.15) and since kv 0 k < 1=10, we have that

kdux0 IQ k 6

10 kv 0 k 2 6 : 4 5 kv 0 k 7

By (4.16) and since

Pk

v i k þ kv k < ð1=10Þ þ

i¼0 k

ð4:17Þ Pk

v i k þ kv k < 3=25, we have that

i¼1 k

10363

I.K. Argyros, S. Hilout / Applied Mathematics and Computation 219 (2013) 10355–10365 3 3 þ 20 144 2 kd uy k 6 2 6 P 2 : P k k 5 1 10 1 4 10 þ i¼1 kv i k þ s kv k 35 1 7 i¼1 kv i k þ s kv k

ð4:18Þ

We also have that dg x0 ¼ ð6 cos 1 16 sin 1Þ IQ . Hence, we get that 1

dg x0 ¼

1 IQ 6 cos 1 16 sin 1

and ðIQ dg x0 Þ1 ¼

1 IQ : 1 þ 6 cos 1 þ 16 sin 1

ð4:19Þ

By (4.17), we obtain that

kðIQ dg x0 Þ1 k kdF x0 ðIQ dg x0 Þk 6

2 1 < 1: 7 1 þ 6 cos 1 þ 16 sin 1

By the Banach lemma on invertible operators [7,31], we have that 1

1

kdF x0 k 6

1

1þ6 cos 1þ16 sin 1 6 1 1þ6 cos 1þ16 sin 1

2 7

1 : 10

ð4:20Þ

By (4.11), we have that 2

kd g y k 6 2

X k2NH

2

X ð2 k þ 1Þ2 ky2 kþ1 k ð2 kÞ2 ky2 k k þ5 2 : ð2 k 1Þ! ð2 k 1Þ! H

ð4:21Þ

k2N

P We also have that kyi k 6 5 i=4 ð km¼1 kv m k þ s kv kÞ þ 1 for each i P 1. By (4.21), we deduce that 2

kd g y k 6

5 4

! k X X 4 ð2 jÞ3 þ 10 ð2 j þ 1Þ3 X 4 ð2 jÞ2 þ 10 ð2 j þ 1Þ2 kv i k þ s kv k 2 þ 2 : ð2 j 1Þ! ð2 j 1Þ! H H i¼1 j2N

ð4:22Þ

j2N

Using elementary analysis and (4.22), we obtain that

! k X kd g y k 6 1320 e kv i k þ s kv k þ 136 e 2

ð4:23Þ

i¼1

where e is the exponential number. Combining (4.18) and (4.20), we obtain the following estimate 1

2

1

2

1

2

kdF x0 d F y k 6 kdF x0 k kd uy k þ kdF x0 d g y k 72

6

k X kv i k þ s kv k

175 1 10 7

e !!2 þ 10

! ! k X 1320 kv i k þ s kv k þ 136 :

ð4:24Þ

i¼1

i¼1

Since y ¼ x exp s v and by (4.9), we can write 1

kdF x0 ðdF xexp v dF x Þk 6

Z

1

Z 6 0

1

2

kdF x0 d F xexp s v k kv k ds 0

0 1

B @

72

P k 175 1 10 i¼1 kv i k þ skv k 7

1 ! ! k X e C 1320 kv i k þ skv k þ 136 Akv kds 2 þ 10 i¼1

6 44:58088305 kv k 6 51 kv k: ð4:25Þ 1 dF x0 dF

We conclude that is L-Lipschitz at x0 on Uðx0 ; 1=50Þ with L ¼ 51. Since r0 < 1=L < 1=50, then, at x0 on Uðx0 ; r 0 Þ. Note that Fðx0 Þ ¼ v 0 . Using (4.20), we have that 1

1

h0 ¼ L g ¼ L kdF x0 Fðx0 Þk 6 L kdF x0 k kv k 6

1 dF x0 dF

is L-Lipschitz

51 1 kv 0 k < : 10 2

We conclude by Theorem 3.4 that (NM) starting at x0 converges to a zero xH of F. However, if kv 0 k < 100=1015, then, all the computation remain the same. But we can only deduce that

h0 ¼ L g 6

51 100 1 ¼ :5024630542 > : 10 1015 2

Hence, since the Kantorovich hypothesis (1.2) is not satisﬁed, Theorem 3.1 in [[34] p. 328] cannot guarantee the converP gence of (NM) to xH . If we choose v ; v 1 such that kv k þ kv 1 k 6 1=51 and kv k þ ki¼1 kv i k < 1=50, then we get that

10364

I.K. Argyros, S. Hilout / Applied Mathematics and Computation 219 (2013) 10355–10365

1

kdF x0 ðdF x0 exp v dF x0 Þk 6

Z

1 0

1

2

kdF x0 d F x0 exp s v k kv k ds Z 0

1

! e ð1320 ðkv 1 k þ s kv kÞ þ 136Þ kv k ds 2 þ 10 ðkv 1 k þ s kv kÞ 175 1 10 7 72

6 44:43966955 kv k: Hence, we can choose K ¼ 44:43966955. Moreover, according to the discussion above Lemma 3.2, we can certainly choose L0 ¼ 50 < L ¼ 51. Our strongest condition (1.6) is now satisﬁed, since

h1 ¼ ‘1 g 6

50 þ 51 1 100 1 ¼ :497536946 < : 2 10 1015 2

Hence, our Theorem 3.4 guarantees the convergence of (NM) starting at x0 to a zero xH of F in Uðx0 ; sH Þ, where sH is given in (1.11). Finally, note that if kv 0 k < 1=11, then clearly our Theorem 3.4 provides tighter error estimates and a more precise information on the location of the solution. References [1] P.A. Absil, C.G. Baker, K.A. Gallivan, Trust-region methods on Riemannian manifolds, Found. Comput. Math. 7 (2007) 303–330. [2] R.L. Adler, J.P. Dedieu, J.Y. Margulies, M. Martens, M. Shub, Newton’s method on Riemannian manifolds and a geometric model for the human spine, IMA J. Numer. Anal. 22 (2002) 359–390. [3] F. Alvarez, J. Bolte, J. Munier, A unifying local convergence result for Newton’s method in Riemannian manifolds, Found Comput. Math. 8 (2008) 197– 226. [4] I.K. Argyros, An improved unifying convergence analysis of Newton’s method in Riemannian manifolds,, J. Appl. Math. Comput. 25 (2007) 345–351. [5] I.K. Argyros, On the local convergence of Newton’s method on Lie groups, PanAm. Math. J. 17 (2007) 101–109. [6] I.K. Argyros, A Kantorovich analysis of Newton’s method on Lie groups, J. Concr. Appl. Anal. 6 (2008) 21–32. [7] I.K. Argyros, Convergence and Applications of Newton-type Iterations, Springer Verlag Publ., New York, 2008. [8] I.K. Argyros, Newton’s method in Riemannian manifolds, Rev. Anal. Numér. Théor. Approx. 37 (2008) 119–125. [9] I.K. Argyros, Newton’s method on Lie groups, J. Appl. Math. Comput. 31 (2009) 217–228. [10] I.K. Argyros, A semilocal convergence analysis for directional Newton methods, Math. Comput. AMS 80 (2011) 327–343. [11] I.K. Argyros, Y.J. Cho, S. Hilout, Numerical Methods for Equations and Its Applications, CRC Press/Taylor and Francis Group, New-York, 2012. [12] I.K. Argyros, S. Hilout, Newton’s method for approximating zeros of vector ﬁelds on Riemannian manifolds, J. Appl. Math. Comput. 29 (2009) 417–427. [13] I.K. Argyros, S. Hilout, Extending the Newton–Kantorovich hypothesis for solving equations, J. Comput. Appl. Math. 234 (2010) 2993–3006. [14] I.K. Argyros, S. Hilout, Weaker conditions for the convergence of Newton’s method, J. Complexity 28 (2012) 364–387. [15] I.K. Argyros, S. Hilout, Numerical Methods in Nonlinear Analysis, World Scientiﬁc Publ., ISBN-13: 978-9814405829, in press. [16] D.A. Bayer, J.C. Lagarias, The nonlinear geometry of linear programming I. Afﬁne and projective scaling trajectories, Trans. Am. Math. Soc. 314 (1989) 499–526. [17] R.W. Brockett, Differential geometry and the design of gradient algorithms. Differential geometry: partial differential equations on manifolds, Los Angeles, CA, 1990, pp. 69–92, in: Proc. Sympos. Pure Math., vol. 54, Part 1, The American Mathematical Society, Providence, RI, 1993. [18] M.T. Chu, K.R. Driessel, The projected gradient method for least squares matrix approximations with spectral constraints, SIAM J. Numer. Anal. 27 (1990) 1050–1060. [19] J.P. Dedieu, P. Priouret, G. Malajovich, Newton’s method on Riemannian mnifolds: convariant atheory, IMA J. Numer. Anal. 23 (2003) 395–419. [20] M.P. Do Carmo, Riemannian Geometry, Birkhauser, Boston, 1992. [21] A. Edelman, T.A. Arias, S.T. Smith, The geometry of algorithms with orthogonality constraints, SIAM J. Matrix Anal. Appl. 20 (1999) 303–353. [22] J.A. Ezquerro, M.A. Hernández, Generalized differentiability conditions for Newton’s method, IMA J. Numer. Anal. 22 (2002) 187–205. [23] J.A. Ezquerro, M.A. Hernández, On an application of Newton’s method to nonlinear operators with x-conditioned second derivative, BIT 42 (2002) 519–530. [24] O.P. Ferreira, B.F. Svaiter, Kantorovich’s theorem on Newton’s method in Reimannian manifolds, J. Complexity 18 (2002) 304–329. [25] D. Gabay, Minimizing a differentiable function over a differential manifold, J. Optim. Theory Appl. 37 (1982) 177–219. [26] J.M. Gutiérrez, M.A. Hernández, Newton’s method under weak Kantorovich conditions, IMA J. Numer. Anal. 20 (2000) 521–532. [27] B.C. Hall, Lie Algebras and Representations. An Elementary Introduction, Graduate Texts in Mathematics, vol. 222, Springer-Verlag, New York, 2003. [28] S. Helgason, Differential Geometry, Lie Groups and Symmetric Spaces, Pure and Applied Mathematics, vol. 80, Academic Press, Inc., Harcourt Brace Jovanovich, Publishers, New York, London, 1978. [29] U. Helmke, J.B. Moore, Singular-value decomposition via gradient and self-equivalent ﬂows, Linear Algebra Appl. 169 (1992) 223–248. [30] U. Helmke, J.B. Moore, Optimization and dynamical systems, in: R. Brockett (Ed.), Communications and Control Engineering Series, Springer-Verlag London, Ltd., London, 1994. [31] L.V. Kantorovich, G.P. Akilov, Functional Analysis in Normed Spaces, Pergamon Press, New York, 1982. [32] C. Li, G. López, V. Martín-Márquez, Monotone vector ﬁelds and the proximal point algorithm on Hadamard manifolds, J. Lond. Math. Soc. 79 (2) (2009) 663–683. [33] C. Li, J.H. Wang, Convergence of the Newton method and uniqueness of zeros of vector ﬁelds on Riemannian manifolds, Sci. Chin. Ser. A 48 (2005) 1465–1478. [34] C. Li, J.H. Wang, Newton’s method on Riemannian manifolds: Smale’s point estimate theory under the c-condition, IMA J. Numer. Anal. 26 (2006) 228– 251. [35] C. Li, J.H. Wang, J.P. Dedieu, Smale’s point estimate theory for Newton’s method on Lie groups, J Complexity 25 (2009) 128–151. [36] P.D. Proinov, General local convergence theory for a class of iterative processes and its applications to Newton’s method, J. Complexity 25 (2009) 38– 62. [37] D.G. Luenberger, The gradient projection method along geodesics, Manage. Sci. 18 (1972) 620–631. [38] R.E. Mahony, Optimization algorithms on homogeneous spaces: with applications in linear systems theory, Ph.D. Thesis, Department of Systems Engineering, University of Canberra, Australia, 1994. [39] R.E. Mahony, The constrained Newton method on a Lie group and the symmetric eigenvalue problem, Linear Algebra Appl. 248 (1996) 67–89. [40] R.E. Mahony, U. Helmke, J.B. Moore, Pole placement algorithms for symmetric realisations, in: Proceedings of 32nd IEEE Conference on Decision and Control, San Antonio, TX, 1993, pp. 1355–1358. [41] J.B. Moore, R.E. Mahony, U. Helmke, Numerical gradient algorithms for eigenvalue and singular value calculations, SIAM J. Matrix Anal. Appl. 15 (1994) 881–902.

I.K. Argyros, S. Hilout / Applied Mathematics and Computation 219 (2013) 10355–10365

10365

[42] B. Owren, B. Welfert, The Newton iteration on Lie groups, BIT 40 (2000) 121–145. [43] P.D. Proinov, New general convergence theory for iterative processes and its applications to Newton–Kantorovich type theorems, J. Complexity 26 (2010) 3–42. [44] W.C. Rheinboldt, An Adaptive Continuation Process for Solving System of Nonlinear Equations, Banach Center Publ., 1977. vol. 3, pp. 129–142. [45] M. Shub, S. Smale, Complexity of Bézout’s theorem I. Geometric aspects, J. Am. Math. Soc. 6 (1993) 459–501. [46] S. Smale, Newton’s method estimates from data at a point, in: R. Ewing, K. Gross, C. Martin (Eds.), The Merging of Disciplines: New Directions in Pure, Applied and Computational Mathematics, Springer Verlag Publ., New-York, 1986, pp. 185–196. [47] S.T. Smith, Dynamical systems that perform the singular value decomposition, Syst. Control Lett. 16 (1991) 319–327. [48] S.T. Smith, Geometric optimization methods for adaptive ﬁltering, Thesis Ph.D., Harvard University, ProQuest LLC, Ann Arbor, MI, 1993. [49] S.T. Smith, Optimization techniques on Riemannian manifolds, Hamiltonian and Gradient Flows, Algorithms and Control, Fields Institute Communications, vol. 3, The American Mathematical Society, Providence, RI, 1994. [50] J.F. Traub, H. Wozniakowski, Convergence and complexity of Newton iteration for operator equations, J. Assoc. Comput. Mach. 26 (1979) 250–258. [51] C. Udrisßte, Convex Functions and Optimization Methods on Riemannian Manifolds, Mathematics and Its Applications, vol. 297, Kluwer Academic Publishers Group, Dordrecht, 1994. [52] V.S. Varadarajan, Lie groups, Lie algebras and their representations, Graduate Texts in Mathematics, vol. 102, Springer-Verlag, New York, 1984. Reprint of the 1974 edition. [53] J.H. Wang, C. Li, Uniqueness of the singular points of vector ﬁelds on Riemannian manifolds under the c-condition, J. Complexity 22 (2006) 533–548. [54] J.H. Wang, C. Li, Kantorovich’s theorem for Newton’s method on Lie groups, J. Zheijiang Univ. Sci. A 8 (2007) 978–986. [55] J.H. Wang, C. Li, Kantorovich’s theorems for Newton’s method for mappings and optimization problems on Lie groups, IMA J. Numer. Anal. 31 (2011) 322–347. [56] X.H. Wang, Convergence of Newton’s method and uniqueness of the solution of equations in Banach space, IMA J. Numer. Anal. 20 (2000) 123–134. [57] P.P. Zabrejko, D.F. Nguen, The majorant method in the theory of Newton–Kantorovich approximations and the Pták error estimates, Numer. Funct. Anal. Optim. 9 (1987) 671–684.

Extending the applicability of Newton’s method on Lie groups

Extending the applicability of Newton’s method on Lie groups

Recommend Documents