Gauss–Newton-based BFGS method with filter for unconstrained minimization

Applied Mathematics and Computation 211 (2009) 354–362

Nataša Krejić a,*, Zorana Lužanin a, Irena Stojkovska b

a Department of Mathematics and Informatics, Faculty of Science, University of Novi Sad, Trg Dositeja Obradovića 4, 21000 Novi Sad, Serbia
b Department of Mathematics, Faculty of Natural Sciences and Mathematics, St. Cyril and Methodius University, Gazi Baba b.b., 1000 Skopje, Macedonia


Keywords: BFGS method; Filter methods; Global convergence; Unconstrained minimization

Abstract: One class of recently developed methods for solving optimization problems is the class of filter methods. In this paper we attach a multidimensional filter to the Gauss–Newton-based BFGS method given by Li and Fukushima [D. Li, M. Fukushima, A globally and superlinearly convergent Gauss–Newton-based BFGS method for symmetric nonlinear equations, SIAM Journal on Numerical Analysis 37 (1) (1999) 152–172] in order to reduce the number of backtracking steps. The proposed filter method for unconstrained minimization problems converges globally under the standard assumptions. It can also be successfully used for solving systems of symmetric nonlinear equations. Numerical results show reasonably good performance of the proposed algorithm. © 2009 Elsevier Inc. All rights reserved.

1. Introduction

Consider the following problem:

\[
\min_{x \in \mathbb{R}^n} f(x), \tag{1}
\]

where f : R^n → R is a twice continuously differentiable function. There are various methods for solving the unconstrained minimization problem (1), see [8,17]. One of the most exploited is the class of quasi-Newton methods, first proposed by Broyden [3] in 1965. They are defined by the iterative formula

\[
x_{k+1} = x_k + \lambda_k p_k,
\]

where λ_k > 0 is the step length, updated by a line search and backtracking procedure, and p_k = −B_k^{-1} ∇f(x_k) is the quasi-Newton search direction, where ∇f(x) is the gradient mapping of f(x). In quasi-Newton methods B_k is an approximation of the Hessian ∇²f(x_k), updated at every iteration by a low-rank matrix. One of the best known update formulas is the BFGS formula, given by

\[
B_{k+1} = B_k - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \frac{y_k y_k^T}{y_k^T s_k},
\]

where s_k = x_{k+1} − x_k and y_k = ∇f(x_{k+1}) − ∇f(x_k). The BFGS update formula possesses nice properties that are very useful for establishing convergence results, see [4]. For a better understanding of quasi-Newton methods we refer to [2,7,16].
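The update formula translates directly into code. The following is a minimal sketch (our own illustration, not an implementation from the paper); the function name and the curvature safeguard are our choices:

```python
import numpy as np

def bfgs_update(B, s, y):
    """BFGS update: B - (B s s^T B)/(s^T B s) + (y y^T)/(y^T s),
    with s = x_{k+1} - x_k and y a gradient difference."""
    if y @ s <= 0:   # curvature safeguard, as used later in Section 2
        return B     # keep the old approximation
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)
```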


In [15], Li and Fukushima proposed a new modified BFGS method, called the Gauss–Newton-based BFGS method, for symmetric nonlinear equations

\[
g(x) = 0, \tag{2}
\]

where the mapping g : R^n → R^n is differentiable and the Jacobian ∇g(x) is symmetric for all x ∈ R^n. By introducing a new line search technique and modifying the BFGS update formula, they proved global and superlinear convergence of their method under suitable conditions.

If we denote by g : R^n → R^n the gradient mapping of the function f(x) from (1), then g(x) is differentiable and the Hessian ∇²f(x) is symmetric for all x ∈ R^n. The first order optimality conditions for (1) are given by (2), and therefore the method from [15] is applicable to solving (1).

Like many other quasi-Newton methods, the Gauss–Newton-based BFGS method [15] uses a backtracking procedure to determine the step length. A line search procedure can result in very small step lengths. In order to avoid these small steps, as well as to reduce the number of backtracking procedures, in this paper we associate a filter with the Gauss–Newton-based BFGS method.

Filter methods, one of the latest developments in global optimization algorithms, were first proposed by Fletcher in 1996, [9]. The first filter methods were a kind of alternative to the penalty functions used in constrained nonlinear programming algorithms, see [5,11,14,18]. In these methods, filter functions penalize constraint violations less restrictively than penalty functions do. The main purpose of filters is to allow the full Newton step to be taken more often, thus inducing global convergence of the method. Filter methods have been extended to solving nonlinear equations, see [6,12], and to unconstrained optimization, [13]. For detailed reading about the development of filter methods we refer to [10].

In this paper we add a filter to the Gauss–Newton-based BFGS method [15] in order to solve the unconstrained minimization problem (1). Our filter method has the global convergence property and on some test problems shows better performance than the method without a filter. The filter in our paper is inspired by the multidimensional filter in [6,12], although there are minor differences.

In Section 2 we present the Gauss–Newton-based BFGS method [15], introduce the multidimensional filter and state our Gauss–Newton-based BFGS method with filter. In Section 3 we prove the global convergence of our method under a set of standard assumptions. Some numerical results are reported in Section 4.

2. Algorithm

Let us first briefly explain the main properties of the Gauss–Newton-based BFGS method [15] for solving problem (1). That method will be used in combination with a filter in this paper. Given the gradient mapping g and the current iterate x_k = x_{k-1} + λ_{k-1} p_{k-1}, the search direction p_k is obtained from the following linear system:

\[
B_k p + \lambda_{k-1}^{-1}\bigl(g(x_k + \lambda_{k-1} g(x_k)) - g(x_k)\bigr) = 0. \tag{3}
\]

If the inequality

\[
\|g(x_k + p_k)\| \le \varrho\, \|g(x_k)\| \tag{4}
\]

is not satisfied for some fixed ϱ ∈ [0, 1), then a line search procedure is applied. The line search rule is governed by a positive sequence {ε_k} satisfying Σ_k ε_k < ∞. For the chosen sequence {ε_k} and fixed parameters σ_1, σ_2 > 0 we take x_{k+1} = x_k + λ_k p_k as a new iterate if

\[
\|g(x_k + \lambda_k p_k)\|^2 - \|g(x_k)\|^2 \le -\sigma_1 \|\lambda_k g(x_k)\|^2 - \sigma_2 \|\lambda_k p_k\|^2 + \varepsilon_k \|g(x_k)\|^2. \tag{5}
\]

Otherwise the step size is decreased, λ_k := rλ_k, 0 < r < 1, until (5) is satisfied. The acceptance rule (5) starts with λ_k = 1 and is well defined, since the inequality has to be satisfied for λ_k > 0 small enough due to the presence of ε_k > 0 on the right-hand side of (5).
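A minimal sketch of this backtracking loop, assuming g returns the gradient as a NumPy array (function and parameter names are ours; the parameter values follow the choices reported in Section 4):

```python
import numpy as np

def line_search(g, x, p, eps_k, sigma1=1e-5, sigma2=1e-5, r=0.1):
    """Backtracking under acceptance rule (5); terminates because the
    eps_k term keeps the right-hand side positive for small steps."""
    gx2 = g(x) @ g(x)                    # ||g(x_k)||^2
    lam = 1.0
    while True:
        gnew = g(x + lam * p)
        rhs = (-sigma1 * lam**2 * gx2 - sigma2 * lam**2 * (p @ p)
               + eps_k * gx2)
        if gnew @ gnew - gx2 <= rhs:     # condition (5)
            return lam
        lam *= r                         # decrease the step size
```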

After determining x_{k+1}, the new matrix B_{k+1} is obtained by (6), where

\[
s_k = x_{k+1} - x_k = \lambda_k p_k, \qquad \delta_k = g(x_{k+1}) - g(x_k), \qquad y_k = g(x_k + \delta_k) - g(x_k),
\]

\[
B_{k+1} = B_k - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \frac{y_k y_k^T}{y_k^T s_k}. \tag{6}
\]

The usual safeguard condition y_k^T s_k > 0 is used, i.e. if y_k^T s_k ≤ 0 then B_{k+1} = B_k. In the computational implementation of the algorithm presented in Section 4 we used the BFGS approximation of the inverse Jacobian, H_k = B_k^{-1}, and therefore determine the search direction as

\[
p_k = -H_k \lambda_{k-1}^{-1}\bigl(g(x_k + \lambda_{k-1} g(x_k)) - g(x_k)\bigr) \tag{7}
\]

with

\[
H_{k+1} = \Bigl(I - \frac{s_k y_k^T}{y_k^T s_k}\Bigr) H_k \Bigl(I - \frac{y_k s_k^T}{y_k^T s_k}\Bigr) + \frac{s_k s_k^T}{y_k^T s_k}. \tag{8}
\]
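In code, the direction (7) and the inverse update (8) might look as follows (a sketch under our naming; the same safeguard y_k^T s_k > 0 is applied to (8)):

```python
import numpy as np

def search_direction(H, g, x, lam_prev):
    """Direction (7): p = -H * lam^{-1} (g(x + lam g(x)) - g(x))."""
    gx = g(x)
    return -H @ ((g(x + lam_prev * gx) - gx) / lam_prev)

def inverse_bfgs_update(H, s, y):
    """Inverse BFGS update (8); skipped when y^T s <= 0."""
    ys = y @ s
    if ys <= 0:
        return H
    V = np.eye(len(s)) - np.outer(y, s) / ys
    return V.T @ H @ V + np.outer(s, s) / ys
```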

The Gauss–Newton-based BFGS method is globally convergent under a set of standard assumptions, [15]. The same set of assumptions is used in this paper and is listed as A1–A3 in Section 3.

Let us now introduce the multidimensional filter we will use in combination with the Gauss–Newton-based BFGS method from [15]. Our filter is defined similarly as in [6,12]. Eq. (2) is partitioned into m sets {g_i(x)}_{i∈I_j}, j = 1, ..., m, with the property {1, ..., n} = I_1 ∪ ... ∪ I_m, I_j ∩ I_k = ∅ for j ≠ k, and the filter functions are defined as

\[
\phi_j(x) \overset{\text{def}}{=} \|g_{I_j}(x)\| \quad \text{for } j = 1, \ldots, m, \tag{9}
\]

where ||·|| is the Euclidean norm and g_{I_j} is the vector whose components are the components of g indexed by I_j. With this notation, x* satisfies the optimality conditions of (1) if and only if φ_j(x*) = 0 for all j = 1, ..., m. The following abbreviations will be used:

\[
\phi(x) \overset{\text{def}}{=} (\phi_1(x), \ldots, \phi_m(x))^T, \qquad \phi_k \overset{\text{def}}{=} \phi(x_k), \qquad \phi_{j,k} \overset{\text{def}}{=} \phi_j(x_k).
\]

A filter is a list F of m-tuples of the form (φ_{1,k}, φ_{2,k}, ..., φ_{m,k}) such that

\[
\phi_{j,k} < \phi_{j,l} \quad \text{for at least one } j \in \{1,\ldots,m\} \text{ and for all } k \ne l.
\]

To explain the meaning and usage of the filter, the concept of domination is introduced. A point x_1 dominates a point x_2 whenever

\[
\phi_j(x_1) \le \phi_j(x_2) \quad \text{for all } j = 1, \ldots, m.
\]

Therefore we say that the filter keeps all iterates that are not dominated by other iterates in it. In this work, we use the filter to construct an additional acceptability condition for a new trial iterate x_k^+ = x_k + s_k. This condition is slightly different from the one in [12]. The first difference is the function d_1 in (10), where we use the max function, while in [12] the min is considered. The procedure for removing points from the filter is also different. The reason for these changes is empirical: we found that the algorithm is more efficient with the rule proposed in this paper.

We say that a new trial point x_k^+ is acceptable for the filter F if

\[
\forall \phi_l \in F \ \ \exists j \in \{1,\ldots,m\}: \quad \phi_j(x_k^+) < \phi_{j,l} - \gamma_\phi\, d_1(\|\phi_l\|, \|\phi_k^+\|), \tag{10}
\]

where γ_φ ∈ (0, 1) is a small positive constant and

\[
d_1(\|\phi_l\|, \|\phi_k^+\|) = \max\{\|\phi_l\|, \|\phi_k^+\|\}.
\]

When a trial point is acceptable for the filter, we add it to the filter immediately (again different from [12]). In other words, we add the m-tuple φ_k^+ = φ(x_k^+) = (φ_1(x_k^+), ..., φ_m(x_k^+))^T to the filter F.

When we add a new trial point x_k^+ to the filter, we remove all points from the filter that are dominated by the trial point x_k^+, which means that we remove the m-tuples (φ_{1,k_i}, ..., φ_{m,k_i})^T ∈ F \ {φ_k^+} from the filter if

\[
\phi_j(x_k^+) \le \phi_{j,k_i}, \quad j = 1, \ldots, m.
\]
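A compact sketch of the acceptability test (10) and the removal rule, assuming the filter is stored as a list of NumPy vectors (names and data layout are our choices):

```python
import numpy as np

def is_acceptable(phi_new, filter_list, gamma_phi=0.5):
    """Condition (10): for every phi_l in F some component of phi_new
    must improve by a margin gamma_phi * max(||phi_l||, ||phi_new||)."""
    margin = lambda phi_l: gamma_phi * max(np.linalg.norm(phi_l),
                                           np.linalg.norm(phi_new))
    return all(np.any(phi_new < phi_l - margin(phi_l))
               for phi_l in filter_list)

def add_to_filter(phi_new, filter_list):
    """Add phi_new, then drop every entry it dominates componentwise."""
    kept = [phi_l for phi_l in filter_list
            if not np.all(phi_new <= phi_l)]
    return kept + [phi_new]
```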

Every trial point which is acceptable for the filter is taken as a new iterate. Our implementation of the multidimensional filter in the Gauss–Newton-based BFGS with filter method is done in such a way that the backtracking λ_k := rλ_k is enforced only if a trial point x_k^+ is not acceptable for the filter. The proposed filter algorithm gives its best performance in the case when m = n and I_j = {j}, similarly as discussed in [10]. In this case we have φ_j(x) = |g_j(x)|, φ(x) = g(x), φ_k = g(x_k) and φ_{j,k} = |g_j(x_k)|. Now we are ready to state the new algorithm.

ALGORITHM GNbBFGSf. Gauss–Newton-based BFGS method with filter

Step 0. Choose an initial point x_0 ∈ R^n, an initial symmetric positive definite matrix B_0 ∈ R^{n×n}, a positive sequence {ε_k} satisfying Σ_{k=0}^∞ ε_k < ∞, and constants r, ϱ, γ_φ ∈ (0, 1), σ_1, σ_2 > 0, λ_{-1} > 0. Let k := 0.

Step 1. If g(x_k) = 0 then stop. Otherwise, solve the following linear system to get p_k:

\[
B_k p + \lambda_{k-1}^{-1}\bigl(g(x_k + \lambda_{k-1} g(x_k)) - g(x_k)\bigr) = 0. \tag{11}
\]

Take λ_k = 1 and go to Step 2.

Step 2. Let the trial point be x_k^+ = x_k + λ_k p_k. If

\[
\|g(x_k^+)\|^2 - \|g(x_k)\|^2 \le -\sigma_1 \|\lambda_k g(x_k)\|^2 - \sigma_2 \|\lambda_k p_k\|^2 + \varepsilon_k \|g(x_k)\|^2 \tag{12}
\]

then go to Step 3; else, if x_k^+ is acceptable for the filter, then add x_k^+ to the filter, remove all points from the filter that are dominated by x_k^+ and go to Step 3. Otherwise, put λ_k := rλ_k and repeat Step 2.


Step 3. Take the next iterate x_{k+1} = x_k^+.

Step 4. Put

\[
s_k = x_{k+1} - x_k = \lambda_k p_k, \qquad \delta_k = g(x_{k+1}) - g(x_k), \qquad y_k = g(x_k + \delta_k) - g(x_k).
\]

If y_k^T s_k ≤ 0, then B_{k+1} = B_k and go to Step 5. Otherwise, update B_k by

\[
B_{k+1} = B_k - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \frac{y_k y_k^T}{y_k^T s_k} \tag{13}
\]

and go to Step 5.

Step 5. Let k := k + 1. Go to Step 1.

At the beginning of Algorithm GNbBFGSf, the filter is initialized to be empty, F = ∅, or to be F = {φ(x) : φ_j(x) ≥ φ_{j,max}, j = 1, ..., m} for any φ_{j,max} > φ_j(x_0), j = 1, ..., m (similarly as in [18]). In the practical implementation of this method, Section 4, the filter is initialized to be empty, F = ∅, or to be F = {φ(x_0)}. One should notice that if there are finitely many values of φ_k added to the filter, then for all k large enough (for all k ≥ k_0, where φ_{k_0} is the last m-tuple added to the filter), Algorithm GNbBFGSf coincides with the algorithm considered in [15].
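Putting the pieces together, here is a minimal end-to-end sketch of Algorithm GNbBFGSf for the case m = n, I_j = {j}, reusing the helpers sketched above and the inverse form (7)–(8) used in Section 4. This is our reading of the algorithm, not the authors' code; parameter defaults follow Section 4:

```python
import numpy as np

def gnb_bfgs_filter(g, x0, lam_init=0.01, r=0.1, sigma1=1e-5, sigma2=1e-5,
                    gamma_phi=0.5, tol=(2.0**-52)**(1/3), max_iter=10_000):
    """Sketch of GNbBFGSf with phi_j(x) = |g_j(x)| and F initialized empty."""
    x = np.asarray(x0, dtype=float)
    H = np.eye(len(x))                       # H_0 = I, as in Section 4
    lam_prev, filt = lam_init, []
    for k in range(max_iter):
        gx = g(x)
        if np.linalg.norm(gx) <= tol:        # Step 1 stopping test
            break
        p = search_direction(H, g, x, lam_prev)          # (7)
        eps_k = 1.0 / (k + 1)**2             # summable sequence (shifted)
        lam, gx2 = 1.0, gx @ gx
        while True:                          # Step 2
            x_plus = x + lam * p
            g_plus = g(x_plus)
            dec = (g_plus @ g_plus - gx2
                   <= -sigma1 * lam**2 * gx2
                   - sigma2 * lam**2 * (p @ p) + eps_k * gx2)  # (12)
            if dec:
                break
            phi_plus = np.abs(g_plus)
            if is_acceptable(phi_plus, filt, gamma_phi):
                filt = add_to_filter(phi_plus, filt)
                break
            lam *= r                         # backtrack only if rejected
        delta = g_plus - gx                  # Step 4 quantities
        s = x_plus - x
        y = g(x + delta) - gx
        H = inverse_bfgs_update(H, s, y)     # (8) with safeguard
        x, lam_prev = x_plus, lam            # Steps 3 and 5
    return x
```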

3. Convergence result

In this section we establish the global convergence of Algorithm GNbBFGSf. Some convergence results from [15] will be used. To do that we need the following assumption, see [13], on the values φ_k that are added to the filter by Algorithm GNbBFGSf.

Assumption A0. There exists a constant C > 0 such that ||φ_k|| ≤ C for all values φ_k which are added to the filter by GNbBFGSf.

Let Ω be the level set defined by

\[
\Omega = \{x : \|g(x)\| \le e^{\varepsilon/2} \max\{C, \|g(x_0)\|\}\}, \tag{14}
\]

where ε is a positive constant such that

\[
\sum_{k=0}^{\infty} \varepsilon_k < \varepsilon. \tag{15}
\]

Then we have the following lemma.

Lemma 3.1. Let Assumption A0 hold and let {x_k} be generated by Algorithm GNbBFGSf. Then {x_k} ⊂ Ω.

Proof. Let x_k be the current iterate. If there are no values φ_{k'}, k' ≤ k, before it that are added to the filter, then

\[
\|g(x_k)\| \le e^{\varepsilon/2}\, \|g(x_0)\|, \tag{16}
\]

as shown in [15], where ε is a constant satisfying (15). In fact, for each two consecutive iterates x_{k-1} and x_k which were not added to the filter, the inequality (12) holds, which means that

\[
\|g(x_k)\| \le (1 + \varepsilon_{k-1})^{1/2}\, \|g(x_{k-1})\|, \tag{17}
\]

so (16) is obtained. Now let x_{i(k)} be the last iterate before x_k such that φ_{i(k)} was added to the filter. Then, using (17) and the arithmetic–geometric mean inequality, we have

\[
\begin{aligned}
\|g(x_k)\| &\le (1+\varepsilon_{k-1})^{1/2}\,\|g(x_{k-1})\| \\
&\le (1+\varepsilon_{k-1})^{1/2}(1+\varepsilon_{k-2})^{1/2}\,\|g(x_{k-2})\| \\
&\;\;\vdots \\
&\le \Bigl(\,\prod_{j=i(k)}^{k-1} (1+\varepsilon_j)^{1/2}\Bigr)\,\|g(x_{i(k)})\|
\end{aligned}
\]

\[
\begin{aligned}
&\le \|g(x_{i(k)})\| \Bigl(\frac{1}{k-i(k)} \sum_{j=i(k)}^{k-1} (1+\varepsilon_j)\Bigr)^{\!\frac{k-i(k)}{2}} \\
&= \|g(x_{i(k)})\| \Bigl(1 + \frac{1}{k-i(k)} \sum_{j=i(k)}^{k-1} \varepsilon_j\Bigr)^{\!\frac{k-i(k)}{2}} \tag{18} \\
&\le \|g(x_{i(k)})\| \Bigl(1 + \frac{\varepsilon}{k-i(k)}\Bigr)^{\!\frac{k-i(k)}{2}} \\
&\le e^{\varepsilon/2}\,\|g(x_{i(k)})\| \\
&\le e^{\varepsilon/2}\, C, \tag{19}
\end{aligned}
\]

where ε is a constant satisfying (15) and C > 0 is the constant from Assumption A0. From (16) and (19) it follows that {x_k} ⊂ Ω. □

The following set of assumptions [15] is also necessary for the global convergence analysis of Algorithm GNbBFGSf.

Assumption A1. g is continuously differentiable on an open set Ω_1 containing Ω.

Assumption A2. ∇g is symmetric and bounded on Ω_1, i.e. ∇g(x)^T = ∇g(x) for every x ∈ Ω_1, and there exists a positive constant M such that

\[
\|\nabla g(x)\| \le M \quad \forall x \in \Omega_1.
\]

Assumption A3. ∇g is uniformly nonsingular on Ω_1, i.e. there is a constant m > 0 such that

\[
m\|p\| \le \|\nabla g(x)\,p\| \quad \forall x \in \Omega_1,\ p \in \mathbb{R}^n.
\]

As the mapping g is the gradient of f and f is twice continuously differentiable, A1 and A2 are clearly satisfied, while A3 is a standard assumption for the global convergence of optimization algorithms.

In order to prove the global convergence of Algorithm GNbBFGSf we distinguish two cases. The first one is when finitely many values of φ_k are added to the filter by Algorithm GNbBFGSf; in that case our algorithm eventually coincides with the Gauss–Newton-based BFGS method from [15], so we do not consider it further here. The second case occurs when infinitely many values of φ_k are added to the filter by Algorithm GNbBFGSf. For that case we state the following lemma.

Lemma 3.2. Let Assumptions A0–A3 hold and assume that infinitely many values of φ_k are added to the filter by Algorithm GNbBFGSf. Then

\[
\lim_{k \to \infty} \|g(x_k)\| = 0. \tag{20}
\]

Proof. The first part of the proof is based on [6,12]. Let {k_i} be the subsequence of iterations such that φ_{k_i} = φ^+_{k_i - 1} is added to the filter. Now assume that there exists a subsequence {k_j} ⊆ {k_i} such that

\[
\|\phi_{k_j}\| \ge \epsilon_1 \tag{21}
\]

for some constant ε_1 > 0. Since by assumption {||φ_{k_j}||} is bounded, there exists a subsequence {k_l} ⊆ {k_j} such that

\[
\lim_{l \to \infty} \phi_{k_l} = \phi_{\infty} \quad \text{with} \quad \|\phi_{\infty}\| \ge \epsilon_1.
\]

Moreover, by the definition of {k_l}, φ_{k_l} is acceptable for every l, which implies in particular that for each l there exists an index j ∈ {1, ..., m} such that

\[
\phi_{j,k_l} - \phi_{j,k_{l-1}} < -\gamma_\phi \|\phi_{k_l}\|, \tag{22}
\]

as we will show now. Let us consider the following two cases:

(i) If φ_{k_{l-1}} is still in the filter, then there exists an index j ∈ {1, ..., m} such that

\[
\phi_{j,k_l} - \phi_{j,k_{l-1}} < -\gamma_\phi\, d_1(\|\phi_{k_l}\|, \|\phi_{k_{l-1}}\|) \le -\gamma_\phi \|\phi_{k_l}\|.
\]

(ii) If φ_{k_{l-1}} was removed by a trial point, say φ_{k_{l'}} with k_{l-1} < k_{l'} < k_l, and φ_{k_{l'}} is still in the filter, then there exists a j ∈ {1, ..., m} such that

\[
\phi_{j,k_l} - \phi_{j,k_{l'}} < -\gamma_\phi\, d_1(\|\phi_{k_l}\|, \|\phi_{k_{l'}}\|) \le -\gamma_\phi \|\phi_{k_l}\|
\]


and

\[
\phi_{j,k_{l'}} \le \phi_{j,k_{l-1}},
\]

so (22) holds in this case too.

From (22), (21) and {k_l} ⊆ {k_j}, we deduce that

\[
\phi_{j,k_l} - \phi_{j,k_{l-1}} < -\gamma_\phi\, \epsilon_1,
\]

but the left-hand side tends to zero as l tends to infinity. This is a contradiction. Hence

\[
\lim_{i \to \infty} \|\phi_{k_i}\| = 0. \tag{23}
\]

Consider now any l ∉ {k_i} and let k_{i(l)} be the last iteration before l such that φ_{k_{i(l)}} was added to the filter. The following inequality can be derived like (18):

\[
\|g(x_l)\| \le \|g(x_{k_{i(l)}})\| \Bigl(1 + \frac{1}{l - k_{i(l)}} \sum_{j=k_{i(l)}}^{l-1} \varepsilon_j\Bigr)^{\!\frac{l-k_{i(l)}}{2}}. \tag{24}
\]

By the definition of {ε_k}, we know that

\[
\lim_{l \to \infty} \sum_{j=k_{i(l)}}^{l-1} \varepsilon_j = 0,
\]

and consequently

\[
\lim_{l \to \infty} \Bigl(1 + \frac{1}{l - k_{i(l)}} \sum_{j=k_{i(l)}}^{l-1} \varepsilon_j\Bigr)^{\!\frac{l-k_{i(l)}}{2}} = 1. \tag{25}
\]

From the previously proved (23) we have that

\[
\lim_{l \to \infty} \|\phi_{k_{i(l)}}\| = 0. \tag{26}
\]

Equalities (25) and (26) imply that the right-hand side of (24) tends to zero as l tends to infinity, and therefore

\[
\lim_{l \to \infty} \|g(x_l)\| = 0,
\]

which completes the proof. □

Lemma 3.2 and the global convergence of the Gauss–Newton-based BFGS algorithm from [15] clearly imply the next global convergence theorem.

Theorem 3.3. Let Assumptions A0–A3 hold. Then the sequence {x_k} generated by Algorithm GNbBFGSf converges to the unique solution x* of problem (2).

Proof. If finitely many values of φ_k are added to the filter by Algorithm GNbBFGSf, then after a finite number of iterative steps Algorithm GNbBFGSf is the same as the Gauss–Newton-based BFGS algorithm from [15], and in [15] it is proved that under Assumptions A1–A3 {||g(x_k)||} converges and

\[
\liminf_{k \to \infty} \|g(x_k)\| = 0. \tag{27}
\]

The case when infinitely many values of φ_k are added to the filter by Algorithm GNbBFGSf is considered in Lemma 3.2. So, from (27) and (20), every accumulation point of {x_k} is a solution of (2). Since ∇g is uniformly nonsingular on Ω_1, (2) has only one solution. Since Ω is bounded, {x_k} ⊂ Ω has at least one accumulation point. Therefore {x_k} converges to the unique solution of (2). □

4. Numerical results

In this section we report the results of some numerical experiments with the proposed method and give a comparison with the method considered in [15]. Problems 1, 2 and 3 are of fixed dimension, while Problems 4 and 5 can have various dimensions.

Problem 1 [6].

\[
\min_{x \in \mathbb{R}^2} f(x),
\]

where

\[
f(x) := \frac{1}{2}\Bigl((x_1^2 - x_2 - 1)^2 + \bigl((x_1 - 2)^2 + (x_2 - 0.5)^2 - 1\bigr)^2\Bigr).
\]

Problem 2 [6].

\[
\min_{x \in \mathbb{R}^3} f(x),
\]

where

\[
f(x) = \frac{1}{2}\Bigl((12x_1 - x_2^2 - 4x_3 - 7)^2 + (x_1^2 + 10x_2 - x_3 - 11)^2 + (x_2^2 + 10x_3 - 8)^2\Bigr).
\]

Problem 3 [1]. Wood's Function (WF)

\[
\min_{x \in \mathbb{R}^4} f(x),
\]

where

\[
f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2 + 90(x_4 - x_3^2)^2 + (1 - x_3)^2 + 10.1\bigl((x_2 - 1)^2 + (x_4 - 1)^2\bigr) + 19.8(x_2 - 1)(x_4 - 1).
\]

Problem 4 [1]. Cosine Mixture Problem (CM)

\[
\min_{x \in \mathbb{R}^n} f(x),
\]

where

\[
f(x) = \sum_{i=1}^{n} x_i^2 - 0.1 \sum_{i=1}^{n} \cos(5\pi x_i).
\]

Problem 5 [1]. Rosenbrock Problem (RB)

\[
\min_{x \in \mathbb{R}^n} f(x),
\]

where

\[
f(x) = \sum_{i=1}^{n-1} \bigl(100(x_{i+1} - x_i^2)^2 + (x_i - 1)^2\bigr).
\]
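Since the method operates on the gradient mapping g = ∇f, each test problem supplies its gradient. As an illustration (our own code, not from the paper), the mapping for Problem 5 can be written as:

```python
import numpy as np

def rosenbrock_grad(x):
    """Gradient mapping g = grad f for Problem 5 (Rosenbrock)."""
    g = np.zeros_like(x)
    # d/dx_i of sum_i 100(x_{i+1} - x_i^2)^2 + (x_i - 1)^2
    g[:-1] += -400.0 * x[:-1] * (x[1:] - x[:-1]**2) + 2.0 * (x[:-1] - 1.0)
    g[1:]  += 200.0 * (x[1:] - x[:-1]**2)
    return g
```

A call such as gnb_bfgs_filter(rosenbrock_grad, np.full(4, 0.5)) would then correspond to the last rows of Table 1, up to implementation details.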

Table 1
Comparison of GNbBFGS methods.

                                  Algorithm 1      Algorithm 2(I)          Algorithm 2(II)
prb  n  x0                        iter   gcalc     iter   gcalc    ftr     iter   gcalc    ftr
1    2  (1, 1)                    45     173       33     119      1       41     164      2
        (5, 5)                    46     177       107    393      3       47     188      2
2    3  (0, 0, 0)                 27     103       32     127      1       35     133      3
        (1, 1, 1)                 45     184       32     119      2       32     119      2
3    4  (0.5, 0.5, 0.5, 0.5)      659    3126      335    1468     3       361    1640     2
        (1.5, 0.5, 1.5, 0.5)      619    2661      852    3754     5       2123   10,992   3
4    2  (1, 1)                    104    738       98     650      2       52     299      2
        (5, 5)                    145    1010      336    2727     3       336    2727     3
4    4  (1, 1, 1, 1)              104    738       104    738      0       42     245      1
        (5, 5, 5, 5)              188    1378      188    1378     0       22     109      1
5    2  (0.5, 0.5)                239    988       511    2091     5       511    2091     5
        (1.2, 1.2)                242    1141      178    755      3       253    1129     4
5    4  (0.5, 0.5, 0.5, 0.5)      266    1065      1712   7713     2       1712   7713     2
        (1.2, 1.2, 1.2, 1.2)      288    1211      265    1107     3       285    1246     3


Experiments were done skipping Step 2 in the Gauss–Newton-based BFGS method from [15], i.e. skipping the monotone decrease acceptance condition (4), and with updating the approximation of the inverse Jacobian H_k. The parameters were chosen to be r = 0.1, σ_1 = σ_2 = 10^{-5}, λ_{-1} = 0.01, ε_k = k^{-2} and γ_φ = 0.5, and the initial matrix H_0 was set to be the identity matrix. Tests were performed in Matlab. The iterative procedure was stopped when the condition ||g(x_k)|| ≤ eps^{1/3} with eps = 2^{-52} was satisfied.

Three methods were tested. In Tables 1 and 2 the following abbreviations are used: Algorithm 1 is the Gauss–Newton-based BFGS method from [15]; Algorithm 2(I) is the Gauss–Newton-based BFGS method with filter described in Algorithm GNbBFGSf with the filter initialized as F = {φ(x_0)}; and Algorithm 2(II) is the Gauss–Newton-based BFGS method with filter described in Algorithm GNbBFGSf with the initialization F = ∅. Table 1 shows the initial point x_0, the total number of iterations (iter), the number of evaluations of the function g (gcalc) and the number of steps where the filter was used (ftr). Table 2 shows the values of the last iterative points.

The overall performance of all algorithms was good. It is clear that the comparison of Algorithms 1 and 2 depends on the initial point. From Table 1 we can see that in some cases Algorithm 1 still has better performance: Problem 1 – the second initial point, Problem 2 – the first initial point, Problem 3 – the second initial point, Problem 4 – the second initial point, Problem 5 – the first and third initial points. But in all remaining cases one of the Algorithms 2, or both of them, are superior to Algorithm 1. This shows that the usage of the filter is justified and that it can reduce the number of iterations and consequently the number of evaluations of the function g. In some cases the decrease was quite significant: Problem 3 – the first initial point, Problem 4 – the first initial point.
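These settings translate into a small configuration block; a sketch with our own naming:

```python
import numpy as np

# Parameter choices reported above (names are ours)
params = dict(
    r=0.1,                       # backtracking factor
    sigma1=1e-5, sigma2=1e-5,    # line search constants in (12)
    lam_init=0.01,               # lambda_{-1}
    gamma_phi=0.5,               # filter margin in (10)
    tol=np.finfo(float).eps ** (1 / 3),  # stop when ||g(x_k)|| <= eps^{1/3}
)
# eps_k = k^{-2}: a summable sequence, as required in Step 0
```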

Table 2
Final points reached by the tested algorithms. (Negative exponents in the E-notation, stripped in extraction, have been restored.)

prb  n  x0                      coord  Algorithm 1     Algorithm 2(I)  Algorithm 2(II)
1    2  (1, 1)                  x(1)   1.257618339     1.546342884     1.067346967
                                x(2)   0.795151479     1.391176313     0.139230758
        (5, 5)                  x(1)   1.257619385     1.25761907      1.067346082
                                x(2)   0.795153222     0.795152509     0.139227653
2    3  (0, 0, 0)               x(1)   0.908926356     0.908926369     0.908926367
                                x(2)   1.085600025     1.085600012     1.085600018
                                x(3)   0.682147237     0.68214726      0.682147259
        (1, 1, 1)               x(1)   0.908926364     0.908926365     0.908926365
                                x(2)   1.085600016     1.085600015     1.085600015
                                x(3)   0.682147253     0.682147254     0.682147254
3    4  (0.5, 0.5, 0.5, 0.5)    x(1)   1.000082461     1.000051394     0.999999082
                                x(2)   1.000161982     1.000102612     0.999998153
                                x(3)   0.999929022     0.999949906     1.00000093
                                x(4)   0.999855112     0.999899011     1.000001867
        (1.5, 0.5, 1.5, 0.5)    x(1)   1.000000103     0.999998594     1.00000006
                                x(2)   1.000000204     0.999997189     1.00000012
                                x(3)   0.999999899     1.000001296     0.999999941
                                x(4)   0.999999796     1.000002592     0.999999881
4    2  (1, 1)                  x(1)   3.40169E-09     9.37398E-09     2.33E-12
                                x(2)   3.40169E-09     9.37398E-09     2.563E-12
        (5, 5)                  x(1)   1.9795E-11      5.99218E-09     5.99218E-09
                                x(2)   1.5949E-11      2.15731E-07     2.15731E-07
4    4  (1, 1, 1, 1)            x(1)   3.40169E-09     3.40169E-09     1.3359E-07
                                x(2)   3.40169E-09     3.40169E-09     1.3359E-07
                                x(3)   3.40169E-09     3.40169E-09     1.3359E-07
                                x(4)   3.40169E-09     3.40169E-09     1.3359E-07
        (5, 5, 5, 5)            x(1)   2.47E-13        2.47E-13        4.42473E-07
                                x(2)   2.47E-13        2.47E-13        4.42473E-07
                                x(3)   2.47E-13        2.47E-13        4.42473E-07
                                x(4)   2.47E-13        2.47E-13        4.42473E-07
5    2  (0.5, 0.5)              x(1)   0.9999993       0.999996589     0.999996589
                                x(2)   0.999998595     0.999993156     0.999993156
        (1.2, 1.2)              x(1)   1.000001017     0.999997873     0.999999952
                                x(2)   1.000002031     0.999995733     0.999999903
5    4  (0.5, 0.5, 0.5, 0.5)    x(1)   1.000000433     1.000000062     1.000000062
                                x(2)   1.000000863     1.000000124     1.000000124
                                x(3)   1.00000172      1.000000249     1.000000249
                                x(4)   1.000003444     1.000000498     1.000000498
        (1.2, 1.2, 1.2, 1.2)    x(1)   0.999999591     1.000000078     1.000000008
                                x(2)   0.999999187     1.000000154     1.000000007
                                x(3)   0.99999837      1.000000306     1.000000003
                                x(4)   0.999996727     1.000000614     1.000000003


From the same Table 1 we can also see that the performance of Algorithm 2 depends on the initialization of the filter, and so far we could not conclude which initialization is better. In all cases except Problem 1 the global minimizers are reached. For Problem 1 with both initial points, Algorithm 1 does not converge to the global minimizer, while Algorithm 2(I) – for the first initial point – and Algorithm 2(II) – for the first and second initial points – reach the global minimizer.

Acknowledgement

We are grateful to the anonymous referee whose comments helped us to improve the presentation of this paper. Research supported by the Ministry of Science, Republic of Serbia, Grant No. 144006.

References

[1] M.M. Ali, Ch. Khompatraporn, Z.B. Zabinsky, A numerical evaluation of several stochastic algorithms on selected continuous global optimization test problems, Journal of Global Optimization 31 (2005) 635–672.
[2] C. Brezinski, A classification of quasi-Newton methods, Numerical Algorithms 33 (2003) 123–135.
[3] C.G. Broyden, A class of methods for solving nonlinear simultaneous equations, Mathematics of Computation 19 (1965) 577–593.
[4] R.H. Byrd, J. Nocedal, A tool for the analysis of quasi-Newton methods with application to unconstrained minimization, SIAM Journal on Numerical Analysis 26 (3) (1989) 727–739.
[5] C.M. Chin, A global convergence theory of a filter line search method for nonlinear programming, Numerical Optimization Report, Department of Statistics, University of Oxford, England, UK, 2002.
[6] Y. Deng, Z. Liu, Two derivative-free algorithms for nonlinear equations, Optimization Methods and Software 23 (3) (2008) 395–410.
[7] J.E. Dennis Jr., J.J. Moré, Quasi-Newton methods, motivation and theory, SIAM Review 19 (1977) 46–89.
[8] J.E. Dennis Jr., R.B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice-Hall, Englewood Cliffs, 1983.
[9] R. Fletcher, S. Leyffer, Nonlinear programming without a penalty function, Mathematical Programming 91 (2002) 239–270.
[10] R. Fletcher, S. Leyffer, Ph. Toint, A brief history of filter methods, SIAG/OPT Views-and-News, A Forum for the SIAM Activity Group on Optimization 18 (1) (2007) 2–12.
[11] C.C. Gonzaga, E. Karas, M. Vanti, A globally convergent filter method for nonlinear programming, SIAM Journal on Optimization 14 (3) (2003) 646–669.
[12] N.I.M. Gould, S. Leyffer, Ph.L. Toint, A multidimensional filter algorithm for nonlinear equations and nonlinear least-squares, SIAM Journal on Optimization 15 (1) (2004) 17–38.
[13] N.I.M. Gould, Ph.L. Toint, C. Sainvitu, A filter-trust-region method for unconstrained optimization, SIAM Journal on Optimization 16 (2006) 341–357.
[14] E.W. Karas, A.P. Oening, A.A. Ribeiro, Global convergence of slanting filter methods for nonlinear programming, Applied Mathematics and Computation 200 (2008) 486–500.
[15] D. Li, M. Fukushima, A globally and superlinearly convergent Gauss–Newton-based BFGS method for symmetric nonlinear equations, SIAM Journal on Numerical Analysis 37 (1) (1999) 152–172.
[16] J.M. Martínez, Practical quasi-Newton methods for solving nonlinear systems, Journal of Computational and Applied Mathematics 124 (2000) 97–122.
[17] J. Nocedal, S.J. Wright, Numerical Optimization, Springer-Verlag, New York, 1999.
[18] A. Wächter, L.T. Biegler, Line search filter methods for nonlinear programming: motivation and global convergence, SIAM Journal on Optimization 16 (1) (2005) 1–31.