Applied Mathematics and Computation 219 (2013) 10355–10365
Applied Mathematics and Computation journal homepage: www.elsevier.com/locate/amc
Extending the applicability of Newton’s method on Lie groups
Ioannis K. Argyros a,⇑, Saïd Hilout b
a Cameron University, Department of Mathematics Sciences, Lawton, OK 73505, USA
b Poitiers University, Laboratoire de Mathématiques et Applications, Bd. Pierre et Marie Curie, Téléport 2, B.P. 30179, 86962 Futuroscope Chasseneuil Cedex, France
Abstract
We use Newton’s method to approximate a zero of a mapping from a Lie group into its Lie algebra. Under the same computational cost as before, we show the semilocal convergence of Newton’s method with the following advantages over earlier works [55]: weaker sufficient convergence conditions, tighter error bounds on the distances involved and at least as precise information on the location of the solution. Numerical examples are also given for solving equations in cases not covered before. © 2013 Elsevier Inc. All rights reserved.
Keywords: Newton’s method; Lie group; Lie algebra; Riemannian manifold; Semilocal/local convergence; Majorizing sequence; Kantorovich hypothesis
⇑ Corresponding author. 0096-3003/$ - see front matter © 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.amc.2013.04.007
1. Introduction
In this study, we are concerned with the problem of approximating a zero $x^\star$ of a $C^1$-mapping $F : G \to Q$, where $G$ is a Lie group and $Q$ is the Lie algebra of $G$, that is, the tangent space $T_e G$ of $G$ at $e$, equipped with the Lie bracket $[\cdot,\cdot] : Q \times Q \to Q$ [7,20,27,28,52].
The study of numerical algorithms on manifolds for solving eigenvalue or optimization problems on Lie groups [1–4,6–9,11,13–37,39–47,49–56] is very important in Computational Mathematics. Newton-type methods are the most popular iterative procedures used to solve equations that contain differentiable operators. Local as well as semilocal convergence results for Newton-type methods have been given by several authors under various conditions [2–4,6–11,13–37,39–47,49–57]. There are two types of convergence results: the first uses information from the domain of an operator (see the well-known Kantorovich theorem [7,31]), whereas the second uses information only at a point (see Smale’s paper [46]). In particular, a convergence analysis of Newton’s method on manifolds can be found in [1–4,8,11–16,24,25,32–34,49,51].
Newton’s method (NM) with initial point $x_0 \in G$ was first introduced by Owren and Welfert [42] in the form
\[ x_{n+1} = x_n \cdot \exp\left(-dF_{x_n}^{-1} F(x_n)\right) \quad (n \ge 0). \tag{1.1} \]
(NM) is undoubtedly the most popular method for generating a sequence $\{x_n\}$ approximating $x^\star$ [7–37,39–47,49–55]. Let $\eta = \|dF_{x_0}^{-1} F(x_0)\|$ and let $L > 0$ be a Lipschitz constant (made precise in (3.2)). Wang and Li [55] recently used the Kantorovich hypothesis
\[ h_0 = L\,\eta \le \frac{1}{2} \tag{1.2} \]
together with the majorizing sequence
\[ t_0 = 0, \quad t_1 = \eta, \quad t_{n+1} = t_n - h'(t_n)^{-1} h(t_n) = t_n + \frac{L\,(t_n - t_{n-1})^2}{2\,(1 - L\,t_n)}, \tag{1.3} \]
where
\[ h(t) = \frac{L}{2}\,t^2 - t + \eta \tag{1.4} \]
and
\[ \lim_{n \to \infty} t_n = \varrho_0 = \frac{1 - \sqrt{1 - 2\,h_0}}{L} \tag{1.5} \]
to provide a convergence analysis for (NM) that improved earlier works [33–35]. Hypothesis (1.2), famous for its simplicity and clarity, is the sufficient convergence condition for solving nonlinear equations due to Kantorovich [31]. However, condition (1.2) can be violated. Based on the same information $(F, x_0, L)$, we provided the weaker sufficient convergence conditions
\[ h_1 = \ell_1\,\eta \le \frac{1}{2} \quad \text{[4–9,12]} \tag{1.6} \]
or
\[ h_2 = \ell_2\,\eta \le \frac{1}{2} \quad \text{[7]}, \tag{1.7} \]
using the majorizing sequence
\[ s_0 = 0, \quad s_1 = \eta, \quad s_{n+2} = s_{n+1} + \frac{L\,(s_{n+1} - s_n)^2}{2\,(1 - L_0\,s_{n+1})} \quad (n \ge 0), \tag{1.8} \]
where $L_0 \in [0, L]$ is a center-Lipschitz constant (made precise in (3.3); it always exists if $L$ does),
\[ \ell_1 = \frac{L + L_0}{2}, \tag{1.9} \]
\[ \ell_2 = \frac{1}{8}\left(L + 4\,L_0 + \sqrt{L^2 + 8\,L_0\,L}\right), \tag{1.10} \]
\[ s^\star = \lim_{n \to \infty} s_n \le \varrho_1 = \frac{\eta}{1 - \alpha} \tag{1.11} \]
and
\[ \alpha = \frac{2\,L}{L + \sqrt{L^2 + 8\,L_0\,L}}. \tag{1.12} \]
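To make the comparison between the two majorizing sequences concrete, the following Python sketch iterates (1.3) and (1.8) side by side and evaluates the constants in (1.2), (1.5)–(1.7) and (1.9)–(1.12). The values $L = 1$, $L_0 = 0.5$, $\eta = 0.4$ and all function names are assumptions chosen only for illustration; they do not come from the paper.

```python
import math

def kantorovich_sequence(L, eta, n_terms):
    """Sequence (1.3): t_0 = 0, t_1 = eta,
    t_{n+1} = t_n + L (t_n - t_{n-1})^2 / (2 (1 - L t_n))."""
    t = [0.0, eta]
    while len(t) < n_terms:
        t.append(t[-1] + L * (t[-1] - t[-2]) ** 2 / (2.0 * (1.0 - L * t[-1])))
    return t

def center_lipschitz_sequence(L, L0, eta, n_terms):
    """Sequence (1.8): same numerator, but L0 replaces L in the denominator."""
    s = [0.0, eta]
    while len(s) < n_terms:
        s.append(s[-1] + L * (s[-1] - s[-2]) ** 2 / (2.0 * (1.0 - L0 * s[-1])))
    return s

# Illustrative (assumed) data with L0 < L and h0 <= 1/2.
L, L0, eta = 1.0, 0.5, 0.4

root = math.sqrt(L * L + 8.0 * L0 * L)
h0 = L * eta                                  # (1.2)
h1 = 0.5 * (L + L0) * eta                     # (1.6) with (1.9)
h2 = (L + 4.0 * L0 + root) / 8.0 * eta        # (1.7) with (1.10)
alpha = 2.0 * L / (L + root)                  # (1.12)
rho0 = (1.0 - math.sqrt(1.0 - 2.0 * h0)) / L  # (1.5)
rho1 = eta / (1.0 - alpha)                    # (1.11)

t = kantorovich_sequence(L, eta, 12)
s = center_lipschitz_sequence(L, L0, eta, 12)

for n in range(6):
    print(n, t[n], s[n])
print("h0, h1, h2 =", h0, h1, h2)
print("limits:", t[-1], s[-1], "rho0, rho1 =", rho0, rho1)
```

For this choice of data the iterates of (1.8) stay below those of (1.3) from the third term on, and the limit of (1.3) agrees with the closed form (1.5).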
Note that
\[ h_0 \le \frac{1}{2} \;\Longrightarrow\; h_1 \le \frac{1}{2} \;\Longrightarrow\; h_2 \le \frac{1}{2}, \tag{1.13} \]
but not necessarily vice versa unless $L = L_0$. We also showed for $L_0 < L$ and $n \ge 2$:
\[ s_n < t_n, \tag{1.14} \]
\[ s_{n+1} - s_n < t_{n+1} - t_n \tag{1.15} \]
and
\[ s^\star \le \varrho_0. \tag{1.16} \]
In particular, note that, by (1.9) and (1.10),
\[ \frac{h_1}{h_0} \to \frac{1}{2}, \quad \frac{h_2}{h_0} \to \frac{1}{4}, \quad \frac{h_2}{h_1} \to \frac{1}{2} \quad \text{as} \quad \frac{L_0}{L} \to 0. \tag{1.17} \]
Hence, the following advantages have been obtained in the semilocal case $(A_s)$: (i) weaker sufficient convergence conditions; (ii) tighter error estimates on the distances involved;
and (iii) at least as precise information on the location of the solution $x^\star$. That is, the applicability of (NM) has been extended. These advantages are obtained because we use the needed, less expensive to compute and more precise condition (3.3) (i.e., $L_0$) instead of (3.2) (i.e., $L$) when computing the upper bounds on $\|dF_x^{-1}\,dF_{x_0}\|$ ($x \in G$). Indeed, we get (see (3.5))
\[ \|dF_x^{-1}\,dF_{x_0}\| \le \left(1 - L_0 \sum_{i=0}^{k} \|z_i\|\right)^{-1} \tag{1.18} \]
instead of the less tight and more expensive to compute estimate (for $L_0 < L$)
\[ \|dF_x^{-1}\,dF_{x_0}\| \le \left(1 - L \sum_{i=0}^{k} \|z_i\|\right)^{-1} \tag{1.19} \]
used in [55]. This modification leads to a more precise majorizing sequence $\{s_n\}$ (for $\{x_n\}$) than $\{t_n\}$, which makes our approach different from and finer than the one in [55] under the same computational cost and information, since in practice the computation of $L$ requires that of $L_0$. In this paper, we are motivated by the work in [55] and by optimization considerations. Using again a combination of Lipschitz and center-Lipschitz conditions, but even more precise majorizing sequences, we provide a convergence analysis with advantages $(A_s)$ (under the same computational cost) over the work in [55]. In particular, we ask whether the weakest hypothesis (1.7) can be improved. It turns out that, in the case of nonlinear equations where $F$ is an operator between two Banach spaces, the sufficient convergence condition is
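\[ h_3 = \ell_3\,\eta \le \frac{1}{2} \quad \text{[14]}, \tag{1.20} \]
where
\[ \ell_3 = \frac{1}{8}\left(\sqrt{L_0\,L} + 4\,L_0 + \sqrt{L_0\,(L + 8\,L_0)}\right) \]
and the majorizing sequence is given by
\[ t_0 = 0, \quad t_1 = \eta, \quad t_2 = \eta + \frac{L_0\,\eta^2}{2\,(1 - L_0\,\eta)}, \quad t_{n+2} = t_{n+1} + \frac{L\,(t_{n+1} - t_n)^2}{2\,(1 - L_0\,t_{n+1})} \quad (n \ge 1), \tag{1.21} \]
where
\[ t^\star = \lim_{n \to \infty} t_n \le \varrho_3 = \eta + \frac{1}{1 - \alpha}\,\frac{L_0\,\eta^2}{2\,(1 - L_0\,\eta)}. \tag{1.22} \]
In this case
\[ \frac{h_3}{h_0} \to 0, \quad \frac{h_3}{h_1} \to 0, \quad \frac{h_3}{h_2} \to 0 \quad \text{as} \quad \frac{L_0}{L} \to 0 \tag{1.23} \]
and for $L_0 < L$, $n \ge 2$:
\[ t_n < s_n, \tag{1.24} \]
\[ t_{n+1} - t_n < s_{n+1} - s_n \tag{1.25} \]
and
\[ t^\star < s^\star. \tag{1.26} \]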
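The relations (1.17) and (1.20)–(1.25) can be checked numerically. The Python sketch below computes $\ell_1$, $\ell_2$, $\ell_3$ and compares the sequence (1.21) with the sequence (1.8). The data $L = 1$, $L_0 = 0.5$, $\eta = 0.4$ and the function names are illustrative assumptions, not values taken from the paper.

```python
import math

def ell1(L, L0):
    """Constant (1.9)."""
    return (L + L0) / 2.0

def ell2(L, L0):
    """Constant (1.10)."""
    return (L + 4.0 * L0 + math.sqrt(L * L + 8.0 * L0 * L)) / 8.0

def ell3(L, L0):
    """Constant used in hypothesis (1.20)."""
    return (math.sqrt(L0 * L) + 4.0 * L0 + math.sqrt(L0 * (L + 8.0 * L0))) / 8.0

def sequence_1_8(L, L0, eta, n_terms):
    """Majorizing sequence (1.8)."""
    s = [0.0, eta]
    while len(s) < n_terms:
        s.append(s[-1] + L * (s[-1] - s[-2]) ** 2 / (2.0 * (1.0 - L0 * s[-1])))
    return s

def sequence_1_21(L, L0, eta, n_terms):
    """Majorizing sequence (1.21): the first step uses L0 in the numerator too."""
    t = [0.0, eta, eta + L0 * eta ** 2 / (2.0 * (1.0 - L0 * eta))]
    while len(t) < n_terms:
        t.append(t[-1] + L * (t[-1] - t[-2]) ** 2 / (2.0 * (1.0 - L0 * t[-1])))
    return t

L, L0, eta = 1.0, 0.5, 0.4          # assumed illustrative data with L0 < L
s = sequence_1_8(L, L0, eta, 10)
t = sequence_1_21(L, L0, eta, 10)
for n in range(5):
    print(n, t[n], s[n])

# Behaviour of the ratios l_i / L (that is, h_i / h_0) as L0 / L -> 0.
for ratio in (1e-2, 1e-4, 1e-8):
    print(ratio, ell1(1.0, ratio), ell2(1.0, ratio), ell3(1.0, ratio))
```

When $L_0 = L$ all three constants reduce to $L$, so the three conditions coincide with (1.2) in that case.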
Hence, the applicability of (NM) for solving nonlinear equations is extended even further. The convergence of the majorizing sequences is quadratic unless equality holds in the corresponding "h" conditions, in which case it is only linear. In this study, we show that the last conditions can be used in the setting of (NM). Then we have the advantages $(A_s)$. The paper is organized as follows. Section 2 contains the necessary background on Lie groups. In Section 3, we present the semilocal convergence analysis of (NM). Finally, numerical examples are given in the concluding Section 4, where $L_0 < L$ and our "h" conditions hold but the Kantorovich hypothesis (1.2) [55] does not.
2. Preliminaries
In order to make the paper as self-contained as possible, we re-introduce standard concepts and notations [7,20,27,28,48,52,55]. A Lie group $(G, \cdot)$ is a Hausdorff topological group with countable bases which also has the structure of a smooth manifold such that the group product and the inversion are smooth operations in the differentiable structure given on the manifold. The dimension of a Lie group is that of the underlying manifold, and we shall always assume that it is finite. The symbol $e$ designates the identity element of $G$. Let $Q$ be the Lie algebra of the Lie group $G$, which is the tangent space $T_e G$ of $G$ at $e$, equipped with the Lie bracket $[\cdot,\cdot] : Q \times Q \to Q$. In the sequel we will make use of the left translation of the Lie group $G$. We define for each $y \in G$
\[ L_y : G \to G, \quad z \mapsto y \cdot z, \tag{2.1} \]
the left multiplication in the group. The differential of $L_y$ at $e$, denoted by $(dL_y)_e$, determines an isomorphism of $Q = T_e G$ with the tangent space $T_y G$ via the relation
\[ (dL_y)_e(Q) = T_y G \]
or, equivalently,
\[ Q = (dL_y)_e^{-1}(T_y G) = (dL_{y^{-1}})_y(T_y G). \]
The exponential map is denoted by $\exp$ and defined by
\[ \exp : Q \to G, \quad u \mapsto \exp(u); \]
it is certainly the most important construct associated with $G$ and $Q$. Given $u \in Q$, the left-invariant vector field $X_u : y \mapsto (dL_y)_e(u)$ determines a one-parameter subgroup $\sigma_u : \mathbb{R} \to G$ of $G$ such that
\[ \sigma_u(0) = e \quad \text{and} \quad \sigma_u'(t) = X_u(\sigma_u(t)) = (dL_{\sigma_u(t)})_e(u) \quad \forall\, t \in \mathbb{R}. \tag{2.2} \]
Consequently, the exponential map is defined by the relation
\[ \exp(u) = \sigma_u(1). \]
Let $f : G \to Q = T_e G$ be a $C^1$-mapping. The differential of $f$ at a point $x \in G$ is a linear map $f'_x : T_x G \to Q$ defined by
\[ f'_x(M_x) = \frac{d}{dt}\, f\big(x \cdot \exp(t\,(dL_{x^{-1}})_x(M_x))\big)\Big|_{t=0} \quad \text{for any } M_x \in T_x G. \tag{2.3} \]
The differential $f'_x$ can be expressed via a function $df_x : Q \to Q$ given by
\[ df_x = (f \circ L_x)'_e = f'_x \circ (dL_x)_e. \]
Thus, by (2.3), it follows that
\[ df_x(u) = f'_x\big((dL_x)_e(u)\big) = \frac{d}{dt}\, f(x \cdot \exp(t\,u))\Big|_{t=0} \quad \text{for any } u \in Q. \]
Therefore the following lemma is clear.
Lemma 2.1. Let $x \in G$, $u \in Q$ and $t \in \mathbb{R}$. Then
\[ \frac{d}{dt}\, f(x \cdot \exp(t\,u)) = df_{x \cdot \exp(t\,u)}(u) \tag{2.4} \]
and
\[ f(x \cdot \exp(t\,u)) - f(x) = \int_0^t df_{x \cdot \exp(s\,u)}(u)\, ds. \tag{2.5} \]
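Identity (2.5) can be checked numerically on a very simple Lie group. The Python sketch below assumes, purely for illustration, the one-dimensional group $G = (0, \infty)$ under multiplication, whose Lie algebra is $\mathbb{R}$ and whose exponential map is the ordinary exponential. The map $f$, the step size and the quadrature rule are hypothetical choices, not part of the paper.

```python
import math

def f(x):
    """Sample C^1 map on G = (0, inf); values lie in the Lie algebra R."""
    return x * x - 3.0

def df(x, u, h=1e-6):
    """df_x(u) = d/dt f(x * exp(t u)) at t = 0, by a central difference."""
    return (f(x * math.exp(h * u)) - f(x * math.exp(-h * u))) / (2.0 * h)

def simpson(func, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * step)
    return total * step / 3.0

x, u, t = 1.3, 0.7, 0.9

# Left-hand side of (2.5).
lhs = f(x * math.exp(t * u)) - f(x)
# Right-hand side of (2.5): integral of df at the points x * exp(s u).
rhs = simpson(lambda s: df(x * math.exp(s * u), u), 0.0, t)

print(lhs, rhs, abs(lhs - rhs))
```

The two sides agree up to the discretization error of the finite difference and of the quadrature rule.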
3. Convergence analysis for (NM)
We shall study the semilocal/local convergence of (NM). In the rest of the paper we assume that $\langle\cdot,\cdot\rangle$ is an inner product on $Q$ and that $\|\cdot\|$ is the associated norm. As in [55], we define a distance on $G$ for $x, y \in G$ as follows:
\[ m(x, y) = \inf\left\{ \sum_{i=1}^{k} \|z_i\| : \text{there exist } k \ge 1 \text{ and } z_1, \ldots, z_k \in Q \text{ such that } y = x \cdot \exp z_1 \cdots \exp z_k \right\}. \tag{3.1} \]
By convention $\inf \emptyset = +\infty$. It is easy to see that $m(\cdot,\cdot)$ is a distance on $G$ and that the topology it induces is equivalent to the original one on $G$. Let $w \in G$ and $r > 0$; we denote by
\[ U(w, r) = \{ y \in G : m(w, y) < r \} \]
the open ball centered at $w$ and of radius $r$. Moreover, we denote the closure of $U(w, r)$ by $\overline{U}(w, r)$. Let also $L(Q)$ denote the set of all linear operators on $Q$. We need the definition of Lipschitz continuity for a mapping.
Definition 3.1. Let $M : G \to L(Q)$, $x_0 \in G$ and $r > 0$. We say that $M$ satisfies the $L$-Lipschitz condition on $U(x_0, r)$ if
\[ \|M(x \cdot \exp u) - M(x)\| \le L\,\|u\| \tag{3.2} \]
holds for any $u \in Q$ and $x \in U(x_0, r)$ such that $\|u\| + m(x, x_0) < r$.
It follows from (3.2) that there exists $L_0$ such that
\[ \|M(x \cdot \exp u) - M(x_0)\| \le L_0\,(\|u\| + m(x, x_0)) \tag{3.3} \]
holds for any $u \in Q$ and $x \in U(x_0, r)$ such that $\|u\| + m(x, x_0) < r$. We then say that $M$ satisfies the $L_0$-center Lipschitz condition at $x_0 \in G$ on $U(x_0, r)$. Note that if $M$ satisfies the $L$-Lipschitz condition on $U(x_0, r)$, then it also satisfies the $L_0$-center Lipschitz condition at $x_0 \in G$ on $U(x_0, r)$. Clearly,
\[ L_0 \le L \tag{3.4} \]
holds in general and $L / L_0$ can be arbitrarily large [4–15].
Let us show that (3.2) indeed implies (3.3). If (3.2) holds, then for $x = x_0$ there exists $K \in (0, L]$ such that
\[ \|M(x_0 \cdot \exp u) - M(x_0)\| \le K\,\|u\| \quad \text{for any } u \in Q \text{ such that } \|u\| < r. \]
There exist $k \ge 1$ and $u_0, u_1, \ldots, u_k \in Q$ such that $x = x_0 \cdot \exp u_0 \cdots \exp u_k$. Let $y_0 = x_0$ and $y_{i+1} = y_i \cdot \exp u_i$, $i = 0, 1, \ldots, k$. Then we have that $x = y_{k+1}$, and we can write the identity
\[ M(x \cdot \exp u) - M(x_0) = (M(x \cdot \exp u) - M(x)) + (M(y_k \cdot \exp u_k) - M(y_k)) + \cdots + (M(y_1 \cdot \exp u_1) - M(y_1)) + (M(x_0 \cdot \exp u_0) - M(x_0)). \]
Using (3.2) and the triangle inequality, we obtain in turn that
\[ \begin{aligned} \|M(x \cdot \exp u) - M(x_0)\| &\le \|M(x \cdot \exp u) - M(x)\| + \|M(y_k \cdot \exp u_k) - M(y_k)\| + \cdots + \|M(y_1 \cdot \exp u_1) - M(y_1)\| + \|M(x_0 \cdot \exp u_0) - M(x_0)\| \\ &\le L\,(\|u\| + \|u_k\| + \cdots + \|u_1\|) + K\,\|u_0\| \\ &= L\,(\|u\| + \|u_k\| + \cdots + \|u_0\|) + (K - L)\,\|u_0\| \\ &\le L\,(\|u\| + \|u_k\| + \cdots + \|u_0\|), \end{aligned} \]
which implies (3.3).
We need a Banach-type lemma on invertible mappings. The next result uses a weaker hypothesis and gives a more precise conclusion than [55, Lemma 2.2]. However, since the proof is essentially the same (simply replace $L$ by $L_0$), it is omitted.
Lemma 3.2. Let $x_0 \in G$ be such that $dF_{x_0}^{-1}$ exists and let $0 < r \le 1/L_0$. Suppose that $dF_{x_0}^{-1}\,dF$ satisfies the $L_0$-center Lipschitz condition at $x_0 \in G$ on $U(x_0, r)$ and that, for $x \in U(x_0, r)$, there exist $k \ge 1$ and $z_0, z_1, \ldots, z_k \in Q$ such that $x = x_0 \cdot \exp z_0 \cdots \exp z_k$ and $\sum_{i=0}^{k} \|z_i\| < r$. Then the linear mapping $dF_x^{-1}$ exists and
\[ \|dF_x^{-1}\,dF_{x_0}\| \le \left(1 - L_0 \sum_{i=0}^{k} \|z_i\|\right)^{-1}. \tag{3.5} \]
Remark 3.3. The corresponding proof in [55, Lemma 2.2] uses (3.2) (see also (1.19)) instead of the less expensive and tighter (for $L_0 < L$) condition (3.3). Our approach differs from [55] after this point, since (3.5) leads to a tighter majorizing sequence and tighter estimates.
We can now show the main semilocal convergence result for (NM).
Theorem 3.4. Let $F : G \to Q$ be a $C^1$-mapping. Let $x_0 \in G$ be such that $dF_{x_0}^{-1}$ exists and let $r > 0$. Set
\[ \|dF_{x_0}^{-1} F(x_0)\| = \eta. \tag{3.6} \]
Suppose that $dF_{x_0}^{-1}\,dF$ satisfies the $L$-Lipschitz condition on $U(x_0, r)$, that (1.20) holds and that
\[ \overline{U}(x_0, \varrho_3) \subseteq U(x_0, r), \tag{3.7} \]
where $\varrho_3$ is given in (1.22). Then the sequence $\{x_n\}$ ($n \ge 0$) generated by (NM) is well defined, remains in $\overline{U}(x_0, \varrho_3)$ for all $n \ge 0$ and converges to a zero $x^\star \in \overline{U}(x_0, \varrho_3)$ of the mapping $F$. Moreover, the following estimates hold for all $n \ge 0$:
\[ m(x_{n+1}, x_n) \le \|dF_{x_n}^{-1} F(x_n)\| \le t_{n+1} - t_n \tag{3.8} \]
and
\[ m(x^\star, x_n) \le t^\star - t_n, \tag{3.9} \]
where $\{t_n\}$ ($n \ge 0$) and $t^\star$ are given in (1.21) and (1.22), respectively.
Proof. We shall first show using induction that (3.8) holds for all $n \ge 0$. Set $z_n = -dF_{x_n}^{-1} F(x_n)$. We have by (3.6) that $z_0$ is well defined and $x_1 = x_0 \cdot \exp z_0$. We have $m(x_1, x_0) \le \|z_0\|$ and $\|z_0\| = \|dF_{x_0}^{-1} F(x_0)\| = \eta = t_1 - t_0$, which shows (3.8) for $n = 0$. Assume that $z_n$ is well defined and (3.8) holds for all $n \le k - 1$. Then we have that
\[ \sum_{i=0}^{k-1} \|z_i\| \le t_k - t_0 = t_k \le \varrho_3 \quad \text{and} \quad x_k = x_0 \cdot \exp z_0 \cdots \exp z_{k-1}. \tag{3.10} \]
It follows from Lemma 3.2 and (3.7) that $dF_{x_k}^{-1}$ exists and
\[ \|dF_{x_k}^{-1}\,dF_{x_0}\| \le (1 - L_0\,t_k)^{-1}. \tag{3.11} \]
We also have that
\[ F(x_1) = F(x_1) - F(x_0) - dF_{x_0}\,z_0 = \int_0^1 \big(dF_{x_0 \cdot \exp(t\,z_0)} - dF_{x_0}\big)\,z_0\,dt \tag{3.12} \]
and
\[ \|dF_{x_0}^{-1} F(x_1)\| \le \int_0^1 \|dF_{x_0}^{-1}\big(dF_{x_0 \cdot \exp(t\,z_0)} - dF_{x_0}\big)\|\,\|z_0\|\,dt \le \int_0^1 L_0\,\|t\,z_0\|\,\|z_0\|\,dt \le \frac{L_0}{2}\,(t_1 - t_0)^2. \tag{3.13} \]
Then, we obtain that
\[ \|z_1\| = \|-dF_{x_1}^{-1} F(x_1)\| \le \|dF_{x_1}^{-1}\,dF_{x_0}\|\,\|dF_{x_0}^{-1} F(x_1)\| \le \frac{L_0\,(t_1 - t_0)^2}{2\,(1 - L_0\,t_1)} = t_2 - t_1, \tag{3.14} \]
which shows (3.8) for $n = 1$. Similarly, for $k > 1$, we have that
\[ F(x_k) = F(x_k) - F(x_{k-1}) - dF_{x_{k-1}}\,z_{k-1} = \int_0^1 \big(dF_{x_{k-1} \cdot \exp(t\,z_{k-1})} - dF_{x_{k-1}}\big)\,z_{k-1}\,dt \tag{3.15} \]
and
\[ \|dF_{x_0}^{-1} F(x_k)\| \le \int_0^1 \|dF_{x_0}^{-1}\big(dF_{x_{k-1} \cdot \exp(t\,z_{k-1})} - dF_{x_{k-1}}\big)\|\,\|z_{k-1}\|\,dt \le \int_0^1 L\,\|t\,z_{k-1}\|\,\|z_{k-1}\|\,dt \le \frac{L}{2}\,(t_k - t_{k-1})^2. \tag{3.16} \]
Hence, we get that
\[ \|z_k\| = \|-dF_{x_k}^{-1} F(x_k)\| \le \|dF_{x_k}^{-1}\,dF_{x_0}\|\,\|dF_{x_0}^{-1} F(x_k)\| \le \frac{L\,(t_k - t_{k-1})^2}{2\,(1 - L_0\,t_k)} = t_{k+1} - t_k. \tag{3.17} \]
We also have $m(x_{k+1}, x_k) \le \|z_k\|$, since $x_{k+1} = x_k \cdot \exp z_k$, which completes the induction for (3.8). It follows from (3.8) that $\{x_k\}$ is a Cauchy sequence in $G$ and as such it converges to some $x^\star \in \overline{U}(x_0, \varrho_3)$ (since $\overline{U}(x_0, \varrho_3)$ is a closed set). By letting $k \to \infty$ in (3.16), we get that $x^\star$ is a zero of $F$. Estimate (3.9) follows from (3.8) by using standard majorization techniques [7,31]. That completes the proof of Theorem 3.4. □
Remark 3.5. If $L = L_0$, Theorem 3.4 reduces to [55, Theorem 3.1, p. 328]. Otherwise (i.e., if $L_0 < L$), it constitutes an improvement under the same information, as already stated in the Introduction of this study. Clearly, the conclusions of Theorem 3.4 hold true if $(h_1, \varrho_1)$ or $(h_2, \varrho_1)$ replace $(h_3, \varrho_3)$.
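To see iteration (1.1) at work in the simplest possible setting, the following Python sketch runs (NM) on the one-dimensional Lie group $G = (0, \infty)$ under multiplication, with Lie algebra $\mathbb{R}$ and the usual exponential. This group, the mapping $F(x) = x - 2$, the starting point and the function names are assumptions made for illustration only. In this setting $dF_x(u) = \frac{d}{dt} F(x\,e^{t u})|_{t=0} = x\,u$.

```python
import math

def newton_on_positive_reals(F, dF, x0, steps=8):
    """(NM) on G = (0, inf): x_{n+1} = x_n * exp(-dF_{x_n}^{-1} F(x_n)).

    dF(x) returns the scalar that represents the linear map dF_x on R.
    """
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        z = -F(x) / dF(x)            # z_n, an element of the Lie algebra R
        xs.append(x * math.exp(z))   # group update x_n . exp(z_n)
    return xs

F = lambda x: x - 2.0    # zero at x = 2
dF = lambda x: x         # dF_x(u) = x * u for this F

xs = newton_on_positive_reals(F, dF, 1.5)
for n, x in enumerate(xs):
    print(n, x, abs(x - 2.0))
```

Because the update multiplies by an exponential, every iterate stays inside the group, which is the point of formulating Newton’s method with the exponential map rather than with an additive step.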
4. Applications to optimization
Following Mahony [38,39] and Wang and Li [55], we provide some applications to optimization problems. Let $f : G \to \mathbb{R}$ be a $C^2$-mapping. Consider the optimization problem:
\[ \min_{x \in G} f(x). \tag{4.1} \]
Let $X \in Q$. Denote by $\tilde{X}$ the left-invariant vector field associated with $X$ and given by
\[ \tilde{X}(x) = (L'_x)_e X \quad \text{for } x \in G, \tag{4.2} \]
and by $\tilde{X} f$ the Lie derivative of $f$ with respect to the left-invariant vector field $\tilde{X}$. That is, we have for each $x \in G$
\[ (\tilde{X} f)(x) = \frac{d}{dt}\Big|_{t=0} f(x \cdot \exp t X). \tag{4.3} \]
Let $\{X_1, \ldots, X_n\}$ be an orthonormal basis of $Q$. Then $\operatorname{grad} f$ [52] is a vector field on $G$ given by
\[ \operatorname{grad} f(x) = (\tilde{X}_1, \ldots, \tilde{X}_n)\,(\tilde{X}_1 f(x), \ldots, \tilde{X}_n f(x))^T = \sum_{j=1}^{n} \tilde{X}_j f(x)\,\tilde{X}_j \quad \text{for } x \in G. \tag{4.4} \]
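Then, (NM) according to Mahony [39] is given as follows.
Algorithm 4.1. Find $X_k \in Q$ such that
\[ \tilde{X}_k = (L'_{x_k})_e X_k \quad \text{and} \quad \operatorname{grad} f(x_k) + \operatorname{grad}(\tilde{X}_k f)(x_k) = 0. \]
Set $x_{k+1} = x_k \cdot \exp X_k$. Set $k \leftarrow k + 1$ and repeat.
Let $F : G \to Q$ be a mapping defined by
\[ F(x) = (L'_x)_e^{-1} \operatorname{grad} f(x) \quad \text{for } x \in G. \tag{4.5} \]
Define the linear operator $H_x f : Q \to Q$ by
\[ (H_x f) X = (L'_x)_e^{-1} \operatorname{grad}(\tilde{X} f)(x) \quad \text{for } X \in Q. \tag{4.6} \]
Then, $H_{(\cdot)} f$ defines a mapping from $G$ to $L(Q)$ such that
\[ dF_x = H_x f \quad \text{[55]}. \tag{4.7} \]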
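A minimal sketch of Algorithm 4.1 follows, written in Python for the one-dimensional group $G = (0, \infty)$ under multiplication. This group, the objective $f(x) = x + 1/x$, the finite-difference step and all function names are assumptions made for illustration. On this group $Q = \mathbb{R}$, the Lie derivative (4.3) along $X = 1$ is $\frac{d}{dt} f(x\,e^{t})|_{t=0}$, and $H_x f$ in (4.6) is the second derivative of $t \mapsto f(x\,e^{t})$ at $t = 0$, so the equation in Algorithm 4.1 reduces to $X_k = -F(x_k) / H_{x_k} f$.

```python
import math

def lie_derivative(f, x, h=1e-4):
    """Approximates F(x) in (4.5): d/dt f(x * exp(t)) at t = 0."""
    return (f(x * math.exp(h)) - f(x * math.exp(-h))) / (2.0 * h)

def lie_hessian(f, x, h=1e-4):
    """Approximates H_x f in (4.6): d^2/dt^2 f(x * exp(t)) at t = 0."""
    return (f(x * math.exp(h)) - 2.0 * f(x) + f(x * math.exp(-h))) / (h * h)

def newton_minimize(f, x0, steps=10):
    """Algorithm 4.1 on G = (0, inf): solve for X_k, then x_{k+1} = x_k * exp(X_k)."""
    x = x0
    for _ in range(steps):
        X = -lie_derivative(f, x) / lie_hessian(f, x)
        x = x * math.exp(X)
    return x

f = lambda x: x + 1.0 / x     # minimized on (0, inf) at x = 1
x_min = newton_minimize(f, 2.0)
print(x_min, f(x_min))
```

The derivatives are approximated by finite differences only to keep the sketch short; with exact derivatives the iteration is the one generated by (NM) for $F$ in (4.5).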
It follows from (4.7) that the sequence generated by Algorithm 4.1 for $f$ coincides with the one generated by (NM) for $F$ defined by (4.5). Let $x_0 \in G$ be such that $(H_{x_0} f)^{-1}$ exists and set
\[ \eta_f = \|(H_{x_0} f)^{-1}\,(L'_{x_0})_e^{-1} \operatorname{grad} f(x_0)\|. \]
With the above notation the main results of Section 3 can be rewritten as follows.
Theorem 4.2. Suppose that
\[ h_3 = \ell_3\,\eta_f \le \frac{1}{2} \]
and that $(H_{x_0} f)^{-1}(H_{(\cdot)} f)$ satisfies the $L$-Lipschitz condition and the $L_0$-center Lipschitz condition at $x_0 \in G$ on $U(x_0, \varrho_3)$. Then the sequence generated by Algorithm 4.1 starting at $x_0$ is well defined and converges to a critical point $x^\star$ of $f$, that is, $\operatorname{grad} f(x^\star) = 0$. Moreover, if $H_{x_0} f$ is positive definite,
\[ \|(H_{x_0} f)^{-1}\|\,\|H_{x \cdot \exp z} f - H_x f\| \le L\,\|z\| \]
and
\[ \|(H_{x_0} f)^{-1}\|\,\|H_{x_0 \cdot \exp z} f - H_{x_0} f\| \le L_0\,\|z\| \]
for all $x \in G$ and $z \in Q$ with $m(x_0, x) + \|z\| < \varrho_3$, then $x^\star$ is a local solution of (4.1).
Corollary 4.3. Let $G$ be an Abelian group. Let $x^\star \in G$ be a local optimal solution of (4.1) such that $(H_{x^\star} f)^{-1}$ exists. Suppose that $(H_{x^\star} f)^{-1}(H_{(\cdot)} f)$ satisfies the $L$-Lipschitz condition and the $L_\star$-center Lipschitz condition at $x^\star$ on $U(x^\star, 1/L_\star)$. Let also $x_0 \in U(x^\star, \frac{\alpha}{4L})$. Then the sequence generated by Algorithm 4.1 starting at $x_0$ is well defined, remains in $U(x^\star, \frac{\alpha}{4L})$ for all $n \ge 0$ and converges to some critical point $y^\star$ of $f$ such that $m(x^\star, y^\star) < 1/L_\star$.
Next, we provide a numerical example, where a tighter analysis than in [55] is given.
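Example 4.4. For a $C^2$-map $m : G \to Q$, consider the second differential $d^2 m_x : Q^2 \to Q$ given by [55]
\[ d^2 m_x\,u_1 u_2 = \left( \frac{\partial^2}{\partial t_2\,\partial t_1}\, m(x \cdot \exp t_2 u_2 \cdot \exp t_1 u_1) \right)_{t_2 = t_1 = 0}. \tag{4.8} \]
We can write that
\[ dm_{x \cdot \exp t u} - dm_x = \int_0^t d^2 m_{x \cdot \exp s u}\,u\,ds \quad \text{for all } u \in Q \text{ and } t \in \mathbb{R}. \tag{4.9} \]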
Let $N \in \mathbb{N}^\star$ and let $e = I_N$ be the identity matrix of order $N$. Consider the linear group $G = SL(N, \mathbb{R}) = \{x \in \mathbb{R}^{N \times N} : \det x = 1\}$ endowed with standard matrix multiplication. Then $G$ is a connected Lie group and $Q$ is defined by $Q = T_e G = sl(N, \mathbb{R}) = \{x \in \mathbb{R}^{N \times N} : \operatorname{tr} x = 0\}$. $Q$ is endowed with the inner product $\langle x, y \rangle = \operatorname{tr}(x^T y)$ for each $x$ and $y$ in $Q$. The corresponding norm is the Frobenius norm. $I_Q$ denotes the identity on $Q$. The exponential map $\exp : Q \to G$ is given by $\exp(u) = \sum_{n \in \mathbb{N}} u^n / n!$ for each $u \in Q$ and its inverse is $\exp^{-1}(x) = \sum_{n \in \mathbb{N}^\star} (-1)^{n-1} (x - e)^n / n$ for each $x \in G$ with $\|x - e\| < 1$. Let $g : G \times \mathbb{R} \to Q$ be a differentiable map and let $x_0$ be a random initial guess. Consider the initial value problem on $G$ as follows [42]:
\[ x' = x \cdot g(x), \quad x(0) = x_0, \tag{4.10} \]
where $g(x) = (\sin x)\,(2x - 5x^2) - \big((\sin x)\,(2x - 5x^2)\big)^T$ for each $x \in G$. The function $\sin$ is defined as
\[ \sin x = \sum_{k \in \mathbb{N}^\star} (-1)^{k-1}\,\frac{x^{2k-1}}{(2k-1)!} \quad \text{for each } x \in G. \]
Then, we can write $g$ as
\[ g(x) = 2 \sum_{k \in \mathbb{N}^\star} (-1)^{k-1}\,\frac{x^{2k} - (x^{2k})^T}{(2k-1)!} - 5 \sum_{k \in \mathbb{N}^\star} (-1)^{k-1}\,\frac{x^{2k+1} - (x^{2k+1})^T}{(2k-1)!} \quad \text{for all } x \in G. \tag{4.11} \]
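Let $\Delta$ be the step size of the Euler discretization of (4.10). Then, (4.10) leads to the fixed-point problem
\[ x = x_0 \cdot \exp(\Delta\,g(x)). \tag{4.12} \]
Let $F : G \to Q$ be defined by
\[ F(x) = \exp^{-1}(x_0^{-1} \cdot x) - \Delta\,g(x) \quad \text{for each } x \in G. \]
Then solving (4.12) is equivalent to finding a zero of $F$. Consider the special case $\Delta = 1$ and $x_0 = \exp v_0$ with $v_0 \in Q$ satisfying $\|v_0\| \le 1/11$. Let $\varphi(\cdot) = \exp^{-1}(x_0^{-1} \cdot (\cdot))$ and $x = \exp v_1 \cdot \exp v_2 \cdots \exp v_k$ for some $v_i \in Q$ ($1 \le i \le k$) with $\sum_{i=0}^{k} \|v_i\| < 1/4$. By [42,55], we have that
\[ \|x_0^{-1} x - e\| \le \frac{5}{4} \sum_{i=0}^{k} \|v_i\| < \frac{5}{16} < 1. \tag{4.13} \]
By the definition of $\exp^{-1}$ and (4.13), we have that
\[ d\varphi_x(u) = \sum_{j \in \mathbb{N}^\star} (-1)^{j-1}\,\frac{\sum_{i=0}^{j-1} (x_0^{-1} x - e)^i\,x_0^{-1} x\,u\,(x_0^{-1} x - e)^{j-1-i}}{j} \quad \text{for each } u \in Q \tag{4.14} \]
and
\[ \|d\varphi_x - I_Q\| \le \frac{2\,\|x_0^{-1} x - e\|}{1 - \|x_0^{-1} x - e\|} \le \frac{10 \sum_{i=0}^{k} \|v_i\|}{4 - 5 \sum_{i=0}^{k} \|v_i\|}. \tag{4.15} \]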
We also have that
\[ \|d^2 \varphi_x\| \le \frac{3 + \|x_0^{-1} x - e\|}{(1 - \|x_0^{-1} x - e\|)^2} \le \frac{3 + \frac{5}{4} \sum_{i=0}^{k} \|v_i\|}{\left(1 - \frac{5}{4} \sum_{i=0}^{k} \|v_i\|\right)^2}. \tag{4.16} \]
Let $v \in Q$ and $x \in U(x_0, 1/50)$. Then there exist $k \ge 1$ and $v_1, \ldots, v_k \in Q$ such that $x = \exp v_1 \cdot \exp v_2 \cdots \exp v_k$ and $\|v\| + \sum_{i=1}^{k} \|v_i\| < 1/50$. Let $s \in [0, 1]$ and $y = x \cdot \exp s v$. By (4.15) and since $\|v_0\| < 1/10$, we have that
\[ \|d\varphi_{x_0} - I_Q\| \le \frac{10\,\|v_0\|}{4 - 5\,\|v_0\|} \le \frac{2}{7}. \tag{4.17} \]
By (4.16) and since $\sum_{i=0}^{k} \|v_i\| + \|v\| < (1/10) + \sum_{i=1}^{k} \|v_i\| + \|v\| < 3/25$, we have that
\[ \|d^2 \varphi_y\| \le \frac{3 + \frac{3}{20}}{\left(1 - \frac{5}{4}\left(\frac{1}{10} + \sum_{i=1}^{k} \|v_i\| + s\,\|v\|\right)\right)^2} \le \frac{144}{35 \left(1 - \frac{10}{7}\left(\sum_{i=1}^{k} \|v_i\| + s\,\|v\|\right)\right)^2}. \tag{4.18} \]
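We also have that $dg_{x_0} = (-6 \cos 1 - 16 \sin 1)\,I_Q$. Hence, we get that
\[ dg_{x_0}^{-1} = \frac{1}{-6 \cos 1 - 16 \sin 1}\,I_Q \quad \text{and} \quad (I_Q - dg_{x_0})^{-1} = \frac{1}{1 + 6 \cos 1 + 16 \sin 1}\,I_Q. \tag{4.19} \]
By (4.17), we obtain that
\[ \|(I_Q - dg_{x_0})^{-1}\|\,\|dF_{x_0} - (I_Q - dg_{x_0})\| \le \frac{2}{7} \cdot \frac{1}{1 + 6 \cos 1 + 16 \sin 1} < 1. \]
By the Banach lemma on invertible operators [7,31], we have that
\[ \|dF_{x_0}^{-1}\| \le \frac{\dfrac{1}{1 + 6 \cos 1 + 16 \sin 1}}{1 - \dfrac{2}{7} \cdot \dfrac{1}{1 + 6 \cos 1 + 16 \sin 1}} \le \frac{1}{10}. \tag{4.20} \]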
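The numerical constants in (4.17) and (4.20) are easy to confirm. The short Python check below evaluates both bounds; the variable names are illustrative, and $\|v_0\| = 1/10$ is used as the extreme value allowed in (4.17).

```python
import math

# Constant appearing in (4.19) and (4.20).
c = 1.0 + 6.0 * math.cos(1.0) + 16.0 * math.sin(1.0)

# Bound (4.17) evaluated at the extreme value ||v_0|| = 1/10.
v0_norm = 0.1
bound_4_17 = 10.0 * v0_norm / (4.0 - 5.0 * v0_norm)

# Contraction factor used in the Banach lemma, and the bound (4.20).
q = (2.0 / 7.0) / c
bound_4_20 = (1.0 / c) / (1.0 - q)

print("c          =", c)
print("bound 4.17 =", bound_4_17, "vs 2/7 =", 2.0 / 7.0)
print("q          =", q)
print("bound 4.20 =", bound_4_20, "<= 1/10")
```

The computed bound in (4.20) is roughly $0.057$, so the value $1/10$ used in the sequel is a comfortable over-estimate.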
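By (4.11), we have that
\[ \|d^2 g_y\| \le 2 \sum_{k \in \mathbb{N}^\star} \frac{2\,(2k)^2\,\|y^{2k}\|}{(2k-1)!} + 5 \sum_{k \in \mathbb{N}^\star} \frac{2\,(2k+1)^2\,\|y^{2k+1}\|}{(2k-1)!}. \tag{4.21} \]
We also have that $\|y^i\| \le \frac{5\,i}{4} \left(\sum_{m=1}^{k} \|v_m\| + s\,\|v\|\right) + 1$ for each $i \ge 1$. By (4.21), we deduce that
\[ \|d^2 g_y\| \le \frac{5}{4} \left(\sum_{i=1}^{k} \|v_i\| + s\,\|v\|\right) \sum_{j \in \mathbb{N}^\star} \frac{4\,(2j)^3 + 10\,(2j+1)^3}{(2j-1)!} + \sum_{j \in \mathbb{N}^\star} \frac{4\,(2j)^2 + 10\,(2j+1)^2}{(2j-1)!}. \tag{4.22} \]
Using elementary analysis and (4.22), we obtain that
\[ \|d^2 g_y\| \le 1320\,e \left(\sum_{i=1}^{k} \|v_i\| + s\,\|v\|\right) + 136\,e, \tag{4.23} \]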
where $e$ is the exponential number. Combining (4.18), (4.20) and (4.23), we obtain the following estimate
\[ \|dF_{x_0}^{-1}\,d^2 F_y\| \le \|dF_{x_0}^{-1}\|\,\|d^2 \varphi_y\| + \|dF_{x_0}^{-1}\,d^2 g_y\| \le \frac{72}{175 \left(1 - \frac{10}{7}\left(\sum_{i=1}^{k} \|v_i\| + s\,\|v\|\right)\right)^2} + \frac{e}{10} \left(1320 \left(\sum_{i=1}^{k} \|v_i\| + s\,\|v\|\right) + 136\right). \tag{4.24} \]
Since $y = x \cdot \exp s v$ and by (4.9), we can write
\[ \begin{aligned} \|dF_{x_0}^{-1}(dF_{x \cdot \exp v} - dF_x)\| &\le \int_0^1 \|dF_{x_0}^{-1}\,d^2 F_{x \cdot \exp s v}\|\,\|v\|\,ds \\ &\le \int_0^1 \left( \frac{72}{175 \left(1 - \frac{10}{7}\left(\sum_{i=1}^{k} \|v_i\| + s\,\|v\|\right)\right)^2} + \frac{e}{10} \left(1320 \left(\sum_{i=1}^{k} \|v_i\| + s\,\|v\|\right) + 136\right) \right) \|v\|\,ds \\ &\le 44.58088305\,\|v\| \le 51\,\|v\|. \end{aligned} \tag{4.25} \]
We conclude that $dF_{x_0}^{-1}\,dF$ is $L$-Lipschitz at $x_0$ on $U(x_0, 1/50)$ with $L = 51$. Since $r_0 < 1/L < 1/50$, $dF_{x_0}^{-1}\,dF$ is $L$-Lipschitz at $x_0$ on $U(x_0, r_0)$. Note that $F(x_0) = v_0$. Using (4.20), we have that
\[ h_0 = L\,\eta = L\,\|dF_{x_0}^{-1} F(x_0)\| \le L\,\|dF_{x_0}^{-1}\|\,\|v_0\| \le \frac{51}{10}\,\|v_0\| < \frac{1}{2}. \]
We conclude by Theorem 3.4 that (NM) starting at $x_0$ converges to a zero $x^\star$ of $F$. However, if $\|v_0\| < 100/1015$, then all the computations remain the same, but we can only deduce that
\[ h_0 = L\,\eta \le \frac{51}{10} \cdot \frac{100}{1015} = 0.5024630542 > \frac{1}{2}. \]
Hence, since the Kantorovich hypothesis (1.2) is not satisfied, Theorem 3.1 in [55, p. 328] cannot guarantee the convergence of (NM) to $x^\star$. If we choose $v, v_1$ such that $\|v\| + \|v_1\| \le 1/51$ and $\|v\| + \sum_{i=1}^{k} \|v_i\| < 1/50$, then we get that
\[ \begin{aligned} \|dF_{x_0}^{-1}(dF_{x_0 \cdot \exp v} - dF_{x_0})\| &\le \int_0^1 \|dF_{x_0}^{-1}\,d^2 F_{x_0 \cdot \exp s v}\|\,\|v\|\,ds \\ &\le \int_0^1 \left( \frac{72}{175 \left(1 - \frac{10}{7}\,(\|v_1\| + s\,\|v\|)\right)^2} + \frac{e}{10}\,\big(1320\,(\|v_1\| + s\,\|v\|) + 136\big) \right) \|v\|\,ds \\ &\le 44.43966955\,\|v\|. \end{aligned} \]
Hence, we can choose $K = 44.43966955$. Moreover, according to the discussion above Lemma 3.2, we can certainly choose $L_0 = 50 < L = 51$. Our strongest condition (1.6) is now satisfied, since
\[ h_1 = \ell_1\,\eta \le \frac{50 + 51}{2} \cdot \frac{1}{10} \cdot \frac{100}{1015} = 0.497536946 < \frac{1}{2}. \]
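The constants quoted in this example can be reproduced with a few lines of Python. The sketch below assumes that the integrand in (4.25) is bounded by its value at the largest admissible argument (it increases with $\sum_i \|v_i\| + s\,\|v\|$), and that the series entering (4.22) are those with numerators $4\,(2j)^p + 10\,(2j+1)^p$ over $(2j-1)!$; the function names are illustrative only.

```python
import math

def series(power, terms=30):
    """Sum over j >= 1 of (4 (2j)^p + 10 (2j+1)^p) / (2j-1)!."""
    return sum((4 * (2 * j) ** power + 10 * (2 * j + 1) ** power)
               / math.factorial(2 * j - 1) for j in range(1, terms + 1))

def integrand_bound(T):
    """Integrand of (4.25) as a function of T = sum ||v_i|| + s ||v||."""
    return (72.0 / (175.0 * (1.0 - 10.0 * T / 7.0) ** 2)
            + (math.e / 10.0) * (1320.0 * T + 136.0))

# Series constants versus the bounds used in (4.23).
S3, S2 = series(3), series(2)
print("5/4 * S3 =", 1.25 * S3, "<= 1320 e =", 1320.0 * math.e)
print("S2       =", S2, "<= 136 e  =", 136.0 * math.e)

# Lipschitz-type constants: T <= 1/50 in (4.25), T <= 1/51 for K.
K50 = integrand_bound(1.0 / 50.0)
K51 = integrand_bound(1.0 / 51.0)
print("bound at 1/50:", K50)
print("bound at 1/51:", K51)

# Kantorovich quantity (1.2) versus condition (1.6) for L = 51, L0 = 50.
L, L0 = 51.0, 50.0
eta_bound = (1.0 / 10.0) * (100.0 / 1015.0)
h0 = L * eta_bound
h1 = 0.5 * (L + L0) * eta_bound
print("h0 =", h0, " h1 =", h1)
```

With these data $h_0$ exceeds $1/2$ while $h_1$ stays below it, which is exactly the situation the example is meant to exhibit.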
Hence, our Theorem 3.4 guarantees the convergence of (NM) starting at $x_0$ to a zero $x^\star$ of $F$ in $U(x_0, s^\star)$, where $s^\star$ is given in (1.11). Finally, note that if $\|v_0\| < 1/11$, then clearly our Theorem 3.4 provides tighter error estimates and more precise information on the location of the solution.
References
[1] P.A. Absil, C.G. Baker, K.A. Gallivan, Trust-region methods on Riemannian manifolds, Found. Comput. Math. 7 (2007) 303–330.
[2] R.L. Adler, J.P. Dedieu, J.Y. Margulies, M. Martens, M. Shub, Newton’s method on Riemannian manifolds and a geometric model for the human spine, IMA J. Numer. Anal. 22 (2002) 359–390.
[3] F. Alvarez, J. Bolte, J. Munier, A unifying local convergence result for Newton’s method in Riemannian manifolds, Found. Comput. Math. 8 (2008) 197–226.
[4] I.K. Argyros, An improved unifying convergence analysis of Newton’s method in Riemannian manifolds, J. Appl. Math. Comput. 25 (2007) 345–351.
[5] I.K. Argyros, On the local convergence of Newton’s method on Lie groups, PanAm. Math. J. 17 (2007) 101–109.
[6] I.K. Argyros, A Kantorovich analysis of Newton’s method on Lie groups, J. Concr. Appl. Anal. 6 (2008) 21–32.
[7] I.K. Argyros, Convergence and Applications of Newton-type Iterations, Springer Verlag Publ., New York, 2008.
[8] I.K. Argyros, Newton’s method in Riemannian manifolds, Rev. Anal. Numér. Théor. Approx. 37 (2008) 119–125.
[9] I.K. Argyros, Newton’s method on Lie groups, J. Appl. Math. Comput. 31 (2009) 217–228.
[10] I.K. Argyros, A semilocal convergence analysis for directional Newton methods, Math. Comput. AMS 80 (2011) 327–343.
[11] I.K. Argyros, Y.J. Cho, S. Hilout, Numerical Methods for Equations and Its Applications, CRC Press/Taylor and Francis Group, New-York, 2012.
[12] I.K. Argyros, S. Hilout, Newton’s method for approximating zeros of vector fields on Riemannian manifolds, J. Appl. Math. Comput. 29 (2009) 417–427.
[13] I.K. Argyros, S. Hilout, Extending the Newton–Kantorovich hypothesis for solving equations, J. Comput. Appl. Math. 234 (2010) 2993–3006.
[14] I.K. Argyros, S. Hilout, Weaker conditions for the convergence of Newton’s method, J. Complexity 28 (2012) 364–387.
[15] I.K. Argyros, S. Hilout, Numerical Methods in Nonlinear Analysis, World Scientific Publ., ISBN-13: 978-9814405829, in press.
[16] D.A. Bayer, J.C. Lagarias, The nonlinear geometry of linear programming I. Affine and projective scaling trajectories, Trans. Am. Math. Soc. 314 (1989) 499–526.
[17] R.W. Brockett, Differential geometry and the design of gradient algorithms. Differential geometry: partial differential equations on manifolds, Los Angeles, CA, 1990, pp. 69–92, in: Proc. Sympos. Pure Math., vol. 54, Part 1, The American Mathematical Society, Providence, RI, 1993.
[18] M.T. Chu, K.R. Driessel, The projected gradient method for least squares matrix approximations with spectral constraints, SIAM J. Numer. Anal. 27 (1990) 1050–1060.
[19] J.P. Dedieu, P. Priouret, G. Malajovich, Newton’s method on Riemannian manifolds: covariant alpha theory, IMA J. Numer. Anal. 23 (2003) 395–419.
[20] M.P. Do Carmo, Riemannian Geometry, Birkhauser, Boston, 1992.
[21] A. Edelman, T.A. Arias, S.T. Smith, The geometry of algorithms with orthogonality constraints, SIAM J. Matrix Anal. Appl. 20 (1999) 303–353.
[22] J.A. Ezquerro, M.A. Hernández, Generalized differentiability conditions for Newton’s method, IMA J. Numer. Anal. 22 (2002) 187–205.
[23] J.A. Ezquerro, M.A. Hernández, On an application of Newton’s method to nonlinear operators with ω-conditioned second derivative, BIT 42 (2002) 519–530.
[24] O.P. Ferreira, B.F. Svaiter, Kantorovich’s theorem on Newton’s method in Riemannian manifolds, J. Complexity 18 (2002) 304–329.
[25] D. Gabay, Minimizing a differentiable function over a differential manifold, J. Optim. Theory Appl. 37 (1982) 177–219.
[26] J.M. Gutiérrez, M.A. Hernández, Newton’s method under weak Kantorovich conditions, IMA J. Numer. Anal. 20 (2000) 521–532.
[27] B.C. Hall, Lie Algebras and Representations. An Elementary Introduction, Graduate Texts in Mathematics, vol. 222, Springer-Verlag, New York, 2003.
[28] S. Helgason, Differential Geometry, Lie Groups and Symmetric Spaces, Pure and Applied Mathematics, vol. 80, Academic Press, Inc., Harcourt Brace Jovanovich, Publishers, New York, London, 1978.
[29] U. Helmke, J.B. Moore, Singular-value decomposition via gradient and self-equivalent flows, Linear Algebra Appl. 169 (1992) 223–248.
[30] U. Helmke, J.B. Moore, Optimization and dynamical systems, in: R. Brockett (Ed.), Communications and Control Engineering Series, Springer-Verlag London, Ltd., London, 1994.
[31] L.V. Kantorovich, G.P. Akilov, Functional Analysis in Normed Spaces, Pergamon Press, New York, 1982.
[32] C. Li, G. López, V. Martín-Márquez, Monotone vector fields and the proximal point algorithm on Hadamard manifolds, J. Lond. Math. Soc. 79 (2) (2009) 663–683.
[33] C. Li, J.H. Wang, Convergence of the Newton method and uniqueness of zeros of vector fields on Riemannian manifolds, Sci. Chin. Ser. A 48 (2005) 1465–1478.
[34] C. Li, J.H. Wang, Newton’s method on Riemannian manifolds: Smale’s point estimate theory under the γ-condition, IMA J. Numer. Anal. 26 (2006) 228–251.
[35] C. Li, J.H. Wang, J.P. Dedieu, Smale’s point estimate theory for Newton’s method on Lie groups, J. Complexity 25 (2009) 128–151.
[36] P.D. Proinov, General local convergence theory for a class of iterative processes and its applications to Newton’s method, J. Complexity 25 (2009) 38–62.
[37] D.G. Luenberger, The gradient projection method along geodesics, Manage. Sci. 18 (1972) 620–631.
[38] R.E. Mahony, Optimization algorithms on homogeneous spaces: with applications in linear systems theory, Ph.D. Thesis, Department of Systems Engineering, University of Canberra, Australia, 1994.
[39] R.E. Mahony, The constrained Newton method on a Lie group and the symmetric eigenvalue problem, Linear Algebra Appl. 248 (1996) 67–89.
[40] R.E. Mahony, U. Helmke, J.B. Moore, Pole placement algorithms for symmetric realisations, in: Proceedings of 32nd IEEE Conference on Decision and Control, San Antonio, TX, 1993, pp. 1355–1358.
[41] J.B. Moore, R.E. Mahony, U. Helmke, Numerical gradient algorithms for eigenvalue and singular value calculations, SIAM J. Matrix Anal. Appl. 15 (1994) 881–902.
[42] B. Owren, B. Welfert, The Newton iteration on Lie groups, BIT 40 (2000) 121–145.
[43] P.D. Proinov, New general convergence theory for iterative processes and its applications to Newton–Kantorovich type theorems, J. Complexity 26 (2010) 3–42.
[44] W.C. Rheinboldt, An Adaptive Continuation Process for Solving System of Nonlinear Equations, Banach Center Publ., vol. 3, 1977, pp. 129–142.
[45] M. Shub, S. Smale, Complexity of Bézout’s theorem I. Geometric aspects, J. Am. Math. Soc. 6 (1993) 459–501.
[46] S. Smale, Newton’s method estimates from data at a point, in: R. Ewing, K. Gross, C. Martin (Eds.), The Merging of Disciplines: New Directions in Pure, Applied and Computational Mathematics, Springer Verlag Publ., New-York, 1986, pp. 185–196.
[47] S.T. Smith, Dynamical systems that perform the singular value decomposition, Syst. Control Lett. 16 (1991) 319–327.
[48] S.T. Smith, Geometric optimization methods for adaptive filtering, Ph.D. Thesis, Harvard University, ProQuest LLC, Ann Arbor, MI, 1993.
[49] S.T. Smith, Optimization techniques on Riemannian manifolds, Hamiltonian and Gradient Flows, Algorithms and Control, Fields Institute Communications, vol. 3, The American Mathematical Society, Providence, RI, 1994.
[50] J.F. Traub, H. Wozniakowski, Convergence and complexity of Newton iteration for operator equations, J. Assoc. Comput. Mach. 26 (1979) 250–258.
[51] C. Udrişte, Convex Functions and Optimization Methods on Riemannian Manifolds, Mathematics and Its Applications, vol. 297, Kluwer Academic Publishers Group, Dordrecht, 1994.
[52] V.S. Varadarajan, Lie groups, Lie algebras and their representations, Graduate Texts in Mathematics, vol. 102, Springer-Verlag, New York, 1984. Reprint of the 1974 edition.
[53] J.H. Wang, C. Li, Uniqueness of the singular points of vector fields on Riemannian manifolds under the γ-condition, J. Complexity 22 (2006) 533–548.
[54] J.H. Wang, C. Li, Kantorovich’s theorem for Newton’s method on Lie groups, J. Zhejiang Univ. Sci. A 8 (2007) 978–986.
[55] J.H. Wang, C. Li, Kantorovich’s theorems for Newton’s method for mappings and optimization problems on Lie groups, IMA J. Numer. Anal. 31 (2011) 322–347.
[56] X.H. Wang, Convergence of Newton’s method and uniqueness of the solution of equations in Banach space, IMA J. Numer. Anal. 20 (2000) 123–134.
[57] P.P. Zabrejko, D.F. Nguen, The majorant method in the theory of Newton–Kantorovich approximations and the Pták error estimates, Numer. Funct. Anal. Optim. 9 (1987) 671–684.