Gauss–Newton-based BFGS method with filter for unconstrained minimization

Applied Mathematics and Computation 211 (2009) 354–362

Nataša Krejić a,*, Zorana Lužanin a, Irena Stojkovska b

a Department of Mathematics and Informatics, Faculty of Science, University of Novi Sad, Trg Dositeja Obradovića 4, 21000 Novi Sad, Serbia
b Department of Mathematics, Faculty of Natural Sciences and Mathematics, St. Cyril and Methodius University, Gazi Baba b.b., 1000 Skopje, Macedonia


Keywords: BFGS method; Filter methods; Global convergence; Unconstrained minimization

Abstract: One class of recently developed methods for solving optimization problems is the class of filter methods. In this paper we attach a multidimensional filter to the Gauss–Newton-based BFGS method given by Li and Fukushima [D. Li, M. Fukushima, A globally and superlinearly convergent Gauss–Newton-based BFGS method for symmetric nonlinear equations, SIAM Journal on Numerical Analysis 37 (1) (1999) 152–172] in order to reduce the number of backtracking steps. The proposed filter method for unconstrained minimization problems converges globally under the standard assumptions. It can also be successfully used for solving systems of symmetric nonlinear equations. Numerical results show reasonably good performance of the proposed algorithm. © 2009 Elsevier Inc. All rights reserved.

1. Introduction

Consider the following problem:

\[
\min_{x \in \mathbb{R}^n} f(x), \tag{1}
\]

where f : R^n → R is a twice continuously differentiable function. There are various methods for solving the unconstrained minimization problem (1), see [8,17]. One of the most exploited is the class of quasi-Newton methods, first proposed by Broyden [3] in 1965. They are defined by the iterative formula

\[
x_{k+1} = x_k + \lambda_k p_k,
\]

where λ_k > 0 is the step length, updated by a line search and backtracking procedure, and p_k = −B_k^{-1} ∇f(x_k) is the quasi-Newton search direction, where ∇f(x) is the gradient mapping of f(x). In quasi-Newton methods B_k is an approximation of the Hessian ∇²f(x_k), updated at every iteration by a low-rank matrix. One of the best known update formulas is the BFGS formula, given by

\[
B_{k+1} = B_k - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \frac{y_k y_k^T}{y_k^T s_k},
\]

where s_k = x_{k+1} − x_k and y_k = ∇f(x_{k+1}) − ∇f(x_k). The BFGS update formula possesses nice properties that are very useful for establishing convergence results, see [4]. For a better understanding of quasi-Newton methods we refer to [2,7,16].
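The update formula translates directly into code. The following is a minimal sketch (our own illustration, not an implementation from the paper); the function name and the curvature safeguard are our choices:

```python
import numpy as np

def bfgs_update(B, s, y):
    """BFGS update: B - (B s s^T B)/(s^T B s) + (y y^T)/(y^T s),
    with s = x_{k+1} - x_k and y a gradient difference."""
    if y @ s <= 0:   # curvature safeguard, as used later in Section 2
        return B     # keep the old approximation
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)
```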


In [15], Li and Fukushima proposed a new modified BFGS method, called the Gauss–Newton-based BFGS method, for symmetric nonlinear equations

\[
g(x) = 0, \tag{2}
\]

where the mapping g : R^n → R^n is differentiable and the Jacobian ∇g(x) is symmetric for all x ∈ R^n. By introducing a new line search technique and modifying the BFGS update formula, they proved global and superlinear convergence of their method under suitable conditions.

If we denote by g : R^n → R^n the gradient mapping of the function f(x) from (1), then g(x) is differentiable and the Hessian ∇²f(x) is symmetric for all x ∈ R^n. The first order optimality conditions for (1) are given by (2), and therefore the method from [15] is applicable to solving (1).

Like many other quasi-Newton methods, the Gauss–Newton-based BFGS method [15] uses a backtracking procedure to determine the step length. A line search procedure can result in very small step lengths. In order to avoid these small steps, as well as to reduce the number of backtracking procedures, in this paper we associate a filter with the Gauss–Newton-based BFGS method.

Filter methods, one of the latest developments in global optimization algorithms, were first proposed by Fletcher in 1996, [9]. The first filter methods were a kind of alternative to the penalty functions used in constrained nonlinear programming algorithms, see [5,11,14,18]. In these methods, filter functions penalize constraint violations less restrictively than penalty functions do. The main purpose of filters is to allow the full Newton step to be taken more often, thus inducing global convergence of the method. Filter methods have been extended to solving nonlinear equations, see [6,12], and to unconstrained optimization, [13]. For detailed reading about the development of filter methods we refer to [10].

In this paper we add a filter to the Gauss–Newton-based BFGS method [15] in order to solve the unconstrained minimization problem (1). Our filter method has the global convergence property and on some test problems shows better performance than the method without a filter. The filter in our paper is inspired by the multidimensional filter in [6,12], although there are minor differences.

In Section 2 we present the Gauss–Newton-based BFGS method [15], introduce the multidimensional filter and state our Gauss–Newton-based BFGS method with filter. In Section 3 we prove the global convergence of our method under a set of standard assumptions. Some numerical results are reported in Section 4.

2. Algorithm

Let us first briefly explain the main properties of the Gauss–Newton-based BFGS method [15] for solving problem (1). That method will be used in combination with a filter in this paper. Given the gradient mapping g and the current iterate x_k = x_{k-1} + λ_{k-1} p_{k-1}, the search direction p_k is obtained from the following linear system:

\[
B_k p + \lambda_{k-1}^{-1}\bigl(g(x_k + \lambda_{k-1} g(x_k)) - g(x_k)\bigr) = 0. \tag{3}
\]

If the inequality

\[
\|g(x_k + p_k)\| \le \varrho\, \|g(x_k)\| \tag{4}
\]

is not satisfied for some fixed ϱ ∈ [0, 1), then a line search procedure is applied. The line search rule is governed by a positive sequence {ε_k} satisfying Σ_k ε_k < ∞. For the chosen sequence {ε_k} and fixed parameters σ_1, σ_2 > 0 we take x_{k+1} = x_k + λ_k p_k as a new iterate if

\[
\|g(x_k + \lambda_k p_k)\|^2 - \|g(x_k)\|^2 \le -\sigma_1 \|\lambda_k g(x_k)\|^2 - \sigma_2 \|\lambda_k p_k\|^2 + \varepsilon_k \|g(x_k)\|^2. \tag{5}
\]

Otherwise the step size is decreased, λ_k := rλ_k, 0 < r < 1, until (5) is satisfied. The acceptance rule (5) starts with λ_k = 1 and is well defined, since the inequality has to be satisfied for λ_k > 0 small enough due to the presence of ε_k > 0 on the right-hand side of (5).
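A minimal sketch of this backtracking loop, assuming g returns the gradient as a NumPy array (function and parameter names are ours; the parameter values follow the choices reported in Section 4):

```python
import numpy as np

def line_search(g, x, p, eps_k, sigma1=1e-5, sigma2=1e-5, r=0.1):
    """Backtracking under acceptance rule (5); terminates because the
    eps_k term keeps the right-hand side positive for small steps."""
    gx2 = g(x) @ g(x)                    # ||g(x_k)||^2
    lam = 1.0
    while True:
        gnew = g(x + lam * p)
        rhs = (-sigma1 * lam**2 * gx2 - sigma2 * lam**2 * (p @ p)
               + eps_k * gx2)
        if gnew @ gnew - gx2 <= rhs:     # condition (5)
            return lam
        lam *= r                         # decrease the step size
```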

After determining x_{k+1}, the new matrix B_{k+1} is obtained by (6), where

\[
s_k = x_{k+1} - x_k = \lambda_k p_k, \qquad \delta_k = g(x_{k+1}) - g(x_k), \qquad y_k = g(x_k + \delta_k) - g(x_k),
\]

\[
B_{k+1} = B_k - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \frac{y_k y_k^T}{y_k^T s_k}. \tag{6}
\]

The usual safeguard condition y_k^T s_k > 0 is used, i.e. if y_k^T s_k ≤ 0 then B_{k+1} = B_k. In the computational implementation of the algorithm presented in Section 4 we used the BFGS approximation of the inverse Jacobian, H_k = B_k^{-1}, and therefore determine the search direction as

\[
p_k = -H_k \lambda_{k-1}^{-1}\bigl(g(x_k + \lambda_{k-1} g(x_k)) - g(x_k)\bigr) \tag{7}
\]

with

\[
H_{k+1} = \Bigl(I - \frac{s_k y_k^T}{y_k^T s_k}\Bigr) H_k \Bigl(I - \frac{y_k s_k^T}{y_k^T s_k}\Bigr) + \frac{s_k s_k^T}{y_k^T s_k}. \tag{8}
\]
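In code, the direction (7) and the inverse update (8) might look as follows (a sketch under our naming; the same safeguard y_k^T s_k > 0 is applied to (8)):

```python
import numpy as np

def search_direction(H, g, x, lam_prev):
    """Direction (7): p = -H * lam^{-1} (g(x + lam g(x)) - g(x))."""
    gx = g(x)
    return -H @ ((g(x + lam_prev * gx) - gx) / lam_prev)

def inverse_bfgs_update(H, s, y):
    """Inverse BFGS update (8); skipped when y^T s <= 0."""
    ys = y @ s
    if ys <= 0:
        return H
    V = np.eye(len(s)) - np.outer(y, s) / ys
    return V.T @ H @ V + np.outer(s, s) / ys
```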

The Gauss–Newton-based BFGS method is globally convergent under a set of standard assumptions, [15]. The same set of assumptions is used in this paper and is listed as A1–A3 in Section 3.

Let us now introduce the multidimensional filter we will use in combination with the Gauss–Newton-based BFGS method from [15]. Our filter is defined similarly as in [6,12]. Eq. (2) is partitioned into m sets {g_i(x)}_{i∈I_j}, j = 1, ..., m, with the property {1, ..., n} = I_1 ∪ ... ∪ I_m, I_j ∩ I_k = ∅ for j ≠ k, and the filter functions are defined as

\[
\phi_j(x) \overset{\text{def}}{=} \|g_{I_j}(x)\| \quad \text{for } j = 1, \ldots, m, \tag{9}
\]

where ||·|| is the Euclidean norm and g_{I_j} is the vector whose components are the components of g indexed by I_j. With this notation, x* satisfies the optimality conditions of (1) if and only if φ_j(x*) = 0 for all j = 1, ..., m. The following abbreviations will be used:

\[
\phi(x) \overset{\text{def}}{=} (\phi_1(x), \ldots, \phi_m(x))^T, \qquad \phi_k \overset{\text{def}}{=} \phi(x_k), \qquad \phi_{j,k} \overset{\text{def}}{=} \phi_j(x_k).
\]

A filter is a list F of m-tuples of the form (φ_{1,k}, φ_{2,k}, ..., φ_{m,k}) such that

\[
\phi_{j,k} < \phi_{j,l} \quad \text{for at least one } j \in \{1,\ldots,m\} \text{ and for all } k \ne l.
\]

To explain the meaning and usage of the filter, the concept of domination is introduced. A point x_1 dominates a point x_2 whenever

\[
\phi_j(x_1) \le \phi_j(x_2) \quad \text{for all } j = 1, \ldots, m.
\]

Therefore we say that the filter keeps all iterates that are not dominated by other iterates in it. In this work, we use the filter to construct an additional acceptability condition for a new trial iterate x_k^+ = x_k + s_k. This condition is slightly different from the one in [12]. The first difference is the function d_1 in (10), where we use the max function, while in [12] the min is considered. The procedure for removing points from the filter is also different. The reason for these changes is empirical: we found that the algorithm is more efficient with the rule proposed in this paper.

We say that a new trial point x_k^+ is acceptable for the filter F if

\[
\forall \phi_l \in F \ \ \exists j \in \{1,\ldots,m\}: \quad \phi_j(x_k^+) < \phi_{j,l} - \gamma_\phi\, d_1(\|\phi_l\|, \|\phi_k^+\|), \tag{10}
\]

where γ_φ ∈ (0, 1) is a small positive constant and

\[
d_1(\|\phi_l\|, \|\phi_k^+\|) = \max\{\|\phi_l\|, \|\phi_k^+\|\}.
\]

When a trial point is acceptable for the filter, we add it to the filter immediately (again different from [12]). In other words, we add the m-tuple φ_k^+ = φ(x_k^+) = (φ_1(x_k^+), ..., φ_m(x_k^+))^T to the filter F.

When we add a new trial point x_k^+ to the filter, we remove all points from the filter that are dominated by the trial point x_k^+, which means that we remove the m-tuples (φ_{1,k_i}, ..., φ_{m,k_i})^T ∈ F \ {φ_k^+} from the filter if

\[
\phi_j(x_k^+) \le \phi_{j,k_i}, \quad j = 1, \ldots, m.
\]
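A compact sketch of the acceptability test (10) and the removal rule, assuming the filter is stored as a list of NumPy vectors (names and data layout are our choices):

```python
import numpy as np

def is_acceptable(phi_new, filter_list, gamma_phi=0.5):
    """Condition (10): for every phi_l in F some component of phi_new
    must improve by a margin gamma_phi * max(||phi_l||, ||phi_new||)."""
    margin = lambda phi_l: gamma_phi * max(np.linalg.norm(phi_l),
                                           np.linalg.norm(phi_new))
    return all(np.any(phi_new < phi_l - margin(phi_l))
               for phi_l in filter_list)

def add_to_filter(phi_new, filter_list):
    """Add phi_new, then drop every entry it dominates componentwise."""
    kept = [phi_l for phi_l in filter_list
            if not np.all(phi_new <= phi_l)]
    return kept + [phi_new]
```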

Every trial point which is acceptable for the filter is taken as a new iterate. Our implementation of the multidimensional filter in the Gauss–Newton-based BFGS with filter method is done in such a way that the backtracking λ_k := rλ_k is enforced only if a trial point x_k^+ is not acceptable for the filter. The proposed filter algorithm gives its best performance in the case when m = n and I_j = {j}, similarly as discussed in [10]. In this case we have φ_j(x) = |g_j(x)|, φ(x) = g(x), φ_k = g(x_k) and φ_{j,k} = |g_j(x_k)|. Now we are ready to state the new algorithm.

ALGORITHM GNbBFGSf. Gauss–Newton-based BFGS method with filter

Step 0. Choose an initial point x_0 ∈ R^n, an initial symmetric positive definite matrix B_0 ∈ R^{n×n}, a positive sequence {ε_k} satisfying Σ_{k=0}^∞ ε_k < ∞, and constants r, ϱ, γ_φ ∈ (0, 1), σ_1, σ_2 > 0, λ_{-1} > 0. Let k := 0.

Step 1. If g(x_k) = 0 then stop. Otherwise, solve the following linear system to get p_k:

\[
B_k p + \lambda_{k-1}^{-1}\bigl(g(x_k + \lambda_{k-1} g(x_k)) - g(x_k)\bigr) = 0. \tag{11}
\]

Take λ_k = 1 and go to Step 2.

Step 2. Let the trial point be x_k^+ = x_k + λ_k p_k. If

\[
\|g(x_k^+)\|^2 - \|g(x_k)\|^2 \le -\sigma_1 \|\lambda_k g(x_k)\|^2 - \sigma_2 \|\lambda_k p_k\|^2 + \varepsilon_k \|g(x_k)\|^2 \tag{12}
\]

then go to Step 3; else, if x_k^+ is acceptable for the filter, then add x_k^+ to the filter, remove all points from the filter that are dominated by x_k^+ and go to Step 3. Otherwise, put λ_k := rλ_k and repeat Step 2.


Step 3. Take the next iterate x_{k+1} = x_k^+.

Step 4. Put

\[
s_k = x_{k+1} - x_k = \lambda_k p_k, \qquad \delta_k = g(x_{k+1}) - g(x_k), \qquad y_k = g(x_k + \delta_k) - g(x_k).
\]

If y_k^T s_k ≤ 0, then B_{k+1} = B_k and go to Step 5. Otherwise, update B_k by

\[
B_{k+1} = B_k - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \frac{y_k y_k^T}{y_k^T s_k} \tag{13}
\]

and go to Step 5.

Step 5. Let k := k + 1. Go to Step 1.

At the beginning of Algorithm GNbBFGSf, the filter is initialized to be empty, F = ∅, or to be F = {φ(x) : φ_j(x) ≥ φ_{j,max}, j = 1, ..., m} for any φ_{j,max} > φ_j(x_0), j = 1, ..., m (similarly as in [18]). In the practical implementation of this method, Section 4, the filter is initialized to be empty, F = ∅, or to be F = {φ(x_0)}. One should notice that if there are finitely many values of φ_k added to the filter, then for all k large enough (for all k ≥ k_0, where φ_{k_0} is the last m-tuple added to the filter), Algorithm GNbBFGSf coincides with the algorithm considered in [15].
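Putting the pieces together, here is a minimal end-to-end sketch of Algorithm GNbBFGSf for the case m = n, I_j = {j}, reusing the helpers sketched above and the inverse form (7)–(8) used in Section 4. This is our reading of the algorithm, not the authors' code; parameter defaults follow Section 4:

```python
import numpy as np

def gnb_bfgs_filter(g, x0, lam_init=0.01, r=0.1, sigma1=1e-5, sigma2=1e-5,
                    gamma_phi=0.5, tol=(2.0**-52)**(1/3), max_iter=10_000):
    """Sketch of GNbBFGSf with phi_j(x) = |g_j(x)| and F initialized empty."""
    x = np.asarray(x0, dtype=float)
    H = np.eye(len(x))                       # H_0 = I, as in Section 4
    lam_prev, filt = lam_init, []
    for k in range(max_iter):
        gx = g(x)
        if np.linalg.norm(gx) <= tol:        # Step 1 stopping test
            break
        p = search_direction(H, g, x, lam_prev)          # (7)
        eps_k = 1.0 / (k + 1)**2             # summable sequence (shifted)
        lam, gx2 = 1.0, gx @ gx
        while True:                          # Step 2
            x_plus = x + lam * p
            g_plus = g(x_plus)
            dec = (g_plus @ g_plus - gx2
                   <= -sigma1 * lam**2 * gx2
                   - sigma2 * lam**2 * (p @ p) + eps_k * gx2)  # (12)
            if dec:
                break
            phi_plus = np.abs(g_plus)
            if is_acceptable(phi_plus, filt, gamma_phi):
                filt = add_to_filter(phi_plus, filt)
                break
            lam *= r                         # backtrack only if rejected
        delta = g_plus - gx                  # Step 4 quantities
        s = x_plus - x
        y = g(x + delta) - gx
        H = inverse_bfgs_update(H, s, y)     # (8) with safeguard
        x, lam_prev = x_plus, lam            # Steps 3 and 5
    return x
```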

3. Convergence result

In this section we establish the global convergence of Algorithm GNbBFGSf. Some convergence results from [15] will be used. To do that we need the following assumption, see [13], on the values φ_k that are added to the filter by Algorithm GNbBFGSf.

Assumption A0. There exists a constant C > 0 such that ||φ_k|| ≤ C for all values φ_k which are added to the filter by GNbBFGSf.

Let Ω be the level set defined by

\[
\Omega = \{x : \|g(x)\| \le e^{\varepsilon/2} \max\{C, \|g(x_0)\|\}\}, \tag{14}
\]

where ε is a positive constant such that

\[
\sum_{k=0}^{\infty} \varepsilon_k < \varepsilon. \tag{15}
\]

Then we have the following lemma.

Lemma 3.1. Let Assumption A0 hold and let {x_k} be generated by Algorithm GNbBFGSf. Then {x_k} ⊂ Ω.

Proof. Let x_k be the current iterate. If there are no values φ_{k'}, k' ≤ k, before it that are added to the filter, then

\[
\|g(x_k)\| \le e^{\varepsilon/2}\, \|g(x_0)\|, \tag{16}
\]

as shown in [15], where ε is a constant satisfying (15). In fact, for each two consecutive iterates x_{k-1} and x_k which were not added to the filter, the inequality (12) holds, which means that

\[
\|g(x_k)\| \le (1 + \varepsilon_{k-1})^{1/2}\, \|g(x_{k-1})\|, \tag{17}
\]

so (16) is obtained. Now let x_{i(k)} be the last iterate before x_k such that φ_{i(k)} was added to the filter. Then, using (17) and the arithmetic–geometric mean inequality, we have

\[
\begin{aligned}
\|g(x_k)\| &\le (1+\varepsilon_{k-1})^{1/2}\,\|g(x_{k-1})\| \\
&\le (1+\varepsilon_{k-1})^{1/2}(1+\varepsilon_{k-2})^{1/2}\,\|g(x_{k-2})\| \\
&\;\;\vdots \\
&\le \Bigl(\,\prod_{j=i(k)}^{k-1} (1+\varepsilon_j)^{1/2}\Bigr)\,\|g(x_{i(k)})\|
\end{aligned}
\]

\[
\begin{aligned}
&\le \|g(x_{i(k)})\| \Bigl(\frac{1}{k-i(k)} \sum_{j=i(k)}^{k-1} (1+\varepsilon_j)\Bigr)^{\!\frac{k-i(k)}{2}} \\
&= \|g(x_{i(k)})\| \Bigl(1 + \frac{1}{k-i(k)} \sum_{j=i(k)}^{k-1} \varepsilon_j\Bigr)^{\!\frac{k-i(k)}{2}} \tag{18} \\
&\le \|g(x_{i(k)})\| \Bigl(1 + \frac{\varepsilon}{k-i(k)}\Bigr)^{\!\frac{k-i(k)}{2}} \\
&\le e^{\varepsilon/2}\,\|g(x_{i(k)})\| \\
&\le e^{\varepsilon/2}\, C, \tag{19}
\end{aligned}
\]

where ε is a constant satisfying (15) and C > 0 is the constant from Assumption A0. From (16) and (19) it follows that {x_k} ⊂ Ω. □

The following set of assumptions [15] is also necessary for the global convergence analysis of Algorithm GNbBFGSf.

Assumption A1. g is continuously differentiable on an open set Ω_1 containing Ω.

Assumption A2. ∇g is symmetric and bounded on Ω_1, i.e. ∇g(x)^T = ∇g(x) for every x ∈ Ω_1, and there exists a positive constant M such that

\[
\|\nabla g(x)\| \le M \quad \forall x \in \Omega_1.
\]

Assumption A3. ∇g is uniformly nonsingular on Ω_1, i.e. there is a constant m > 0 such that

\[
m\|p\| \le \|\nabla g(x)\,p\| \quad \forall x \in \Omega_1,\ p \in \mathbb{R}^n.
\]

As the mapping g is the gradient of f and f is twice continuously differentiable, A1 and A2 are clearly satisfied, while A3 is a standard assumption for the global convergence of optimization algorithms.

In order to prove the global convergence of Algorithm GNbBFGSf we distinguish two cases. The first one is when finitely many values of φ_k are added to the filter by Algorithm GNbBFGSf; in that case our algorithm eventually coincides with the Gauss–Newton-based BFGS method from [15], so we do not consider it further here. The second case occurs when infinitely many values of φ_k are added to the filter by Algorithm GNbBFGSf. For that case we state the following lemma.

Lemma 3.2. Let Assumptions A0–A3 hold and assume that infinitely many values of φ_k are added to the filter by Algorithm GNbBFGSf. Then

\[
\lim_{k \to \infty} \|g(x_k)\| = 0. \tag{20}
\]

Proof. The first part of the proof is based on [6,12]. Let {k_i} be the subsequence of iterations such that φ_{k_i} = φ^+_{k_i - 1} is added to the filter. Now assume that there exists a subsequence {k_j} ⊆ {k_i} such that

\[
\|\phi_{k_j}\| \ge \epsilon_1 \tag{21}
\]

for some constant ε_1 > 0. Since by assumption {||φ_{k_j}||} is bounded, there exists a subsequence {k_l} ⊆ {k_j} such that

\[
\lim_{l \to \infty} \phi_{k_l} = \phi_{\infty} \quad \text{with} \quad \|\phi_{\infty}\| \ge \epsilon_1.
\]

Moreover, by the definition of {k_l}, φ_{k_l} is acceptable for every l, which implies in particular that for each l there exists an index j ∈ {1, ..., m} such that

\[
\phi_{j,k_l} - \phi_{j,k_{l-1}} < -\gamma_\phi \|\phi_{k_l}\|, \tag{22}
\]

as we will show now. Let us consider the following two cases:

(i) If φ_{k_{l-1}} is still in the filter, then there exists an index j ∈ {1, ..., m} such that

\[
\phi_{j,k_l} - \phi_{j,k_{l-1}} < -\gamma_\phi\, d_1(\|\phi_{k_l}\|, \|\phi_{k_{l-1}}\|) \le -\gamma_\phi \|\phi_{k_l}\|.
\]

(ii) If φ_{k_{l-1}} was removed by a trial point, say φ_{k_{l'}} with k_{l-1} < k_{l'} < k_l, and φ_{k_{l'}} is still in the filter, then there exists a j ∈ {1, ..., m} such that

\[
\phi_{j,k_l} - \phi_{j,k_{l'}} < -\gamma_\phi\, d_1(\|\phi_{k_l}\|, \|\phi_{k_{l'}}\|) \le -\gamma_\phi \|\phi_{k_l}\|
\]


and

\[
\phi_{j,k_{l'}} \le \phi_{j,k_{l-1}},
\]

so (22) holds in this case too.

From (22), (21) and {k_l} ⊆ {k_j}, we deduce that

\[
\phi_{j,k_l} - \phi_{j,k_{l-1}} < -\gamma_\phi\, \epsilon_1,
\]

but the left-hand side tends to zero as l tends to infinity. This is a contradiction. Hence

\[
\lim_{i \to \infty} \|\phi_{k_i}\| = 0. \tag{23}
\]

Consider now any l ∉ {k_i} and let k_{i(l)} be the last iteration before l such that φ_{k_{i(l)}} was added to the filter. The following inequality can be derived like (18):

\[
\|g(x_l)\| \le \|g(x_{k_{i(l)}})\| \Bigl(1 + \frac{1}{l - k_{i(l)}} \sum_{j=k_{i(l)}}^{l-1} \varepsilon_j\Bigr)^{\!\frac{l-k_{i(l)}}{2}}. \tag{24}
\]

By the definition of {ε_k}, we know that

\[
\lim_{l \to \infty} \sum_{j=k_{i(l)}}^{l-1} \varepsilon_j = 0,
\]

and consequently

\[
\lim_{l \to \infty} \Bigl(1 + \frac{1}{l - k_{i(l)}} \sum_{j=k_{i(l)}}^{l-1} \varepsilon_j\Bigr)^{\!\frac{l-k_{i(l)}}{2}} = 1. \tag{25}
\]

From the previously proved (23) we have that

\[
\lim_{l \to \infty} \|\phi_{k_{i(l)}}\| = 0. \tag{26}
\]

Equalities (25) and (26) imply that the right-hand side of (24) tends to zero as l tends to infinity, and therefore

\[
\lim_{l \to \infty} \|g(x_l)\| = 0,
\]

which completes the proof. □

Lemma 3.2 and the global convergence of the Gauss–Newton-based BFGS algorithm from [15] clearly imply the next global convergence theorem.

Theorem 3.3. Let Assumptions A0–A3 hold. Then the sequence {x_k} generated by Algorithm GNbBFGSf converges to the unique solution x* of problem (2).

Proof. If finitely many values of φ_k are added to the filter by Algorithm GNbBFGSf, then after a finite number of iterative steps Algorithm GNbBFGSf is the same as the Gauss–Newton-based BFGS algorithm from [15], and in [15] it is proved that under Assumptions A1–A3 {||g(x_k)||} converges and

\[
\liminf_{k \to \infty} \|g(x_k)\| = 0. \tag{27}
\]

The case when infinitely many values of φ_k are added to the filter by Algorithm GNbBFGSf is considered in Lemma 3.2. So, from (27) and (20), every accumulation point of {x_k} is a solution of (2). Since ∇g is uniformly nonsingular on Ω_1, (2) has only one solution. Since Ω is bounded, {x_k} ⊂ Ω has at least one accumulation point. Therefore {x_k} converges to the unique solution of (2). □

4. Numerical results

In this section we report the results of some numerical experiments with the proposed method and give a comparison with the method considered in [15]. Problems 1, 2 and 3 are of fixed dimension, while Problems 4 and 5 can have various dimensions.

Problem 1 [6].

\[
\min_{x \in \mathbb{R}^2} f(x),
\]

where

\[
f(x) := \frac{1}{2}\Bigl((x_1^2 - x_2 - 1)^2 + \bigl((x_1 - 2)^2 + (x_2 - 0.5)^2 - 1\bigr)^2\Bigr).
\]

Problem 2 [6].

\[
\min_{x \in \mathbb{R}^3} f(x),
\]

where

\[
f(x) = \frac{1}{2}\Bigl((12x_1 - x_2^2 - 4x_3 - 7)^2 + (x_1^2 + 10x_2 - x_3 - 11)^2 + (x_2^2 + 10x_3 - 8)^2\Bigr).
\]

Problem 3 [1]. Wood's Function (WF)

\[
\min_{x \in \mathbb{R}^4} f(x),
\]

where

\[
f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2 + 90(x_4 - x_3^2)^2 + (1 - x_3)^2 + 10.1\bigl((x_2 - 1)^2 + (x_4 - 1)^2\bigr) + 19.8(x_2 - 1)(x_4 - 1).
\]

Problem 4 [1]. Cosine Mixture Problem (CM)

\[
\min_{x \in \mathbb{R}^n} f(x),
\]

where

\[
f(x) = \sum_{i=1}^{n} x_i^2 - 0.1 \sum_{i=1}^{n} \cos(5\pi x_i).
\]

Problem 5 [1]. Rosenbrock Problem (RB)

\[
\min_{x \in \mathbb{R}^n} f(x),
\]

where

\[
f(x) = \sum_{i=1}^{n-1} \bigl(100(x_{i+1} - x_i^2)^2 + (x_i - 1)^2\bigr).
\]
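Since the method operates on the gradient mapping g = ∇f, each test problem supplies its gradient. As an illustration (our own code, not from the paper), the mapping for Problem 5 can be written as:

```python
import numpy as np

def rosenbrock_grad(x):
    """Gradient mapping g = grad f for Problem 5 (Rosenbrock)."""
    g = np.zeros_like(x)
    # d/dx_i of sum_i 100(x_{i+1} - x_i^2)^2 + (x_i - 1)^2
    g[:-1] += -400.0 * x[:-1] * (x[1:] - x[:-1]**2) + 2.0 * (x[:-1] - 1.0)
    g[1:]  += 200.0 * (x[1:] - x[:-1]**2)
    return g
```

A call such as gnb_bfgs_filter(rosenbrock_grad, np.full(4, 0.5)) would then correspond to the last rows of Table 1, up to implementation details.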

Table 1
Comparison of GNbBFGS methods.

                                  Algorithm 1      Algorithm 2(I)          Algorithm 2(II)
prb  n  x0                        iter   gcalc     iter   gcalc    ftr     iter   gcalc    ftr
1    2  (1, 1)                    45     173       33     119      1       41     164      2
        (5, 5)                    46     177       107    393      3       47     188      2
2    3  (0, 0, 0)                 27     103       32     127      1       35     133      3
        (1, 1, 1)                 45     184       32     119      2       32     119      2
3    4  (0.5, 0.5, 0.5, 0.5)      659    3126      335    1468     3       361    1640     2
        (1.5, 0.5, 1.5, 0.5)      619    2661      852    3754     5       2123   10,992   3
4    2  (1, 1)                    104    738       98     650      2       52     299      2
        (5, 5)                    145    1010      336    2727     3       336    2727     3
4    4  (1, 1, 1, 1)              104    738       104    738      0       42     245      1
        (5, 5, 5, 5)              188    1378      188    1378     0       22     109      1
5    2  (0.5, 0.5)                239    988       511    2091     5       511    2091     5
        (1.2, 1.2)                242    1141      178    755      3       253    1129     4
5    4  (0.5, 0.5, 0.5, 0.5)      266    1065      1712   7713     2       1712   7713     2
        (1.2, 1.2, 1.2, 1.2)      288    1211      265    1107     3       285    1246     3


Experiments were done skipping Step 2 in the Gauss–Newton-based BFGS method from [15], i.e. skipping the monotone decrease acceptance condition (4), and with updating the approximation of the inverse Jacobian H_k. The parameters were chosen to be r = 0.1, σ_1 = σ_2 = 10^{-5}, λ_{-1} = 0.01, ε_k = k^{-2} and γ_φ = 0.5, and the initial matrix H_0 was set to be the identity matrix. Tests were performed in Matlab. The iterative procedure was stopped when the condition ||g(x_k)|| ≤ eps^{1/3} with eps = 2^{-52} was satisfied.

Three methods were tested. In Tables 1 and 2 the following abbreviations are used: Algorithm 1 is the Gauss–Newton-based BFGS method from [15]; Algorithm 2(I) is the Gauss–Newton-based BFGS method with filter described in Algorithm GNbBFGSf with the filter initialized as F = {φ(x_0)}; and Algorithm 2(II) is the Gauss–Newton-based BFGS method with filter described in Algorithm GNbBFGSf with the initialization F = ∅. Table 1 shows the initial point x_0, the total number of iterations (iter), the number of evaluations of the function g (gcalc) and the number of steps where the filter was used (ftr). Table 2 shows the values of the last iterative points.

The overall performance of all algorithms was good. It is clear that the comparison of Algorithms 1 and 2 depends on the initial point. From Table 1 we can see that in some cases Algorithm 1 still has better performance: Problem 1 – the second initial point, Problem 2 – the first initial point, Problem 3 – the second initial point, Problem 4 – the second initial point, Problem 5 – the first and third initial points. But in all remaining cases one of the Algorithms 2, or both of them, are superior to Algorithm 1. This shows that the usage of the filter is justified and that it can reduce the number of iterations and consequently the number of evaluations of the function g. In some cases the decrease was quite significant: Problem 3 – the first initial point, Problem 4 – the first initial point.
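These settings translate into a small configuration block; a sketch with our own naming:

```python
import numpy as np

# Parameter choices reported above (names are ours)
params = dict(
    r=0.1,                       # backtracking factor
    sigma1=1e-5, sigma2=1e-5,    # line search constants in (12)
    lam_init=0.01,               # lambda_{-1}
    gamma_phi=0.5,               # filter margin in (10)
    tol=np.finfo(float).eps ** (1 / 3),  # stop when ||g(x_k)|| <= eps^{1/3}
)
# eps_k = k^{-2}: a summable sequence, as required in Step 0
```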

Table 2
Final points reached by the tested algorithms. (Negative exponents in the E-notation, stripped in extraction, have been restored.)

prb  n  x0                      coord  Algorithm 1     Algorithm 2(I)  Algorithm 2(II)
1    2  (1, 1)                  x(1)   1.257618339     1.546342884     1.067346967
                                x(2)   0.795151479     1.391176313     0.139230758
        (5, 5)                  x(1)   1.257619385     1.25761907      1.067346082
                                x(2)   0.795153222     0.795152509     0.139227653
2    3  (0, 0, 0)               x(1)   0.908926356     0.908926369     0.908926367
                                x(2)   1.085600025     1.085600012     1.085600018
                                x(3)   0.682147237     0.68214726      0.682147259
        (1, 1, 1)               x(1)   0.908926364     0.908926365     0.908926365
                                x(2)   1.085600016     1.085600015     1.085600015
                                x(3)   0.682147253     0.682147254     0.682147254
3    4  (0.5, 0.5, 0.5, 0.5)    x(1)   1.000082461     1.000051394     0.999999082
                                x(2)   1.000161982     1.000102612     0.999998153
                                x(3)   0.999929022     0.999949906     1.00000093
                                x(4)   0.999855112     0.999899011     1.000001867
        (1.5, 0.5, 1.5, 0.5)    x(1)   1.000000103     0.999998594     1.00000006
                                x(2)   1.000000204     0.999997189     1.00000012
                                x(3)   0.999999899     1.000001296     0.999999941
                                x(4)   0.999999796     1.000002592     0.999999881
4    2  (1, 1)                  x(1)   3.40169E-09     9.37398E-09     2.33E-12
                                x(2)   3.40169E-09     9.37398E-09     2.563E-12
        (5, 5)                  x(1)   1.9795E-11      5.99218E-09     5.99218E-09
                                x(2)   1.5949E-11      2.15731E-07     2.15731E-07
4    4  (1, 1, 1, 1)            x(1)   3.40169E-09     3.40169E-09     1.3359E-07
                                x(2)   3.40169E-09     3.40169E-09     1.3359E-07
                                x(3)   3.40169E-09     3.40169E-09     1.3359E-07
                                x(4)   3.40169E-09     3.40169E-09     1.3359E-07
        (5, 5, 5, 5)            x(1)   2.47E-13        2.47E-13        4.42473E-07
                                x(2)   2.47E-13        2.47E-13        4.42473E-07
                                x(3)   2.47E-13        2.47E-13        4.42473E-07
                                x(4)   2.47E-13        2.47E-13        4.42473E-07
5    2  (0.5, 0.5)              x(1)   0.9999993       0.999996589     0.999996589
                                x(2)   0.999998595     0.999993156     0.999993156
        (1.2, 1.2)              x(1)   1.000001017     0.999997873     0.999999952
                                x(2)   1.000002031     0.999995733     0.999999903
5    4  (0.5, 0.5, 0.5, 0.5)    x(1)   1.000000433     1.000000062     1.000000062
                                x(2)   1.000000863     1.000000124     1.000000124
                                x(3)   1.00000172      1.000000249     1.000000249
                                x(4)   1.000003444     1.000000498     1.000000498
        (1.2, 1.2, 1.2, 1.2)    x(1)   0.999999591     1.000000078     1.000000008
                                x(2)   0.999999187     1.000000154     1.000000007
                                x(3)   0.99999837      1.000000306     1.000000003
                                x(4)   0.999996727     1.000000614     1.000000003


From the same Table 1 we can also see that the performance of Algorithm 2 depends on the initialization of the filter, and so far we could not conclude which initialization is better. In all cases except Problem 1 the global minimizers are reached. For Problem 1 with both initial points, Algorithm 1 does not converge to the global minimizer, while Algorithm 2(I) – for the first initial point – and Algorithm 2(II) – for the first and second initial points – reach the global minimizer.

Acknowledgement

We are grateful to the anonymous referee whose comments helped us to improve the presentation of this paper. Research supported by the Ministry of Science, Republic of Serbia, Grant No. 144006.

References

[1] M.M. Ali, Ch. Khompatraporn, Z.B. Zabinsky, A numerical evaluation of several stochastic algorithms on selected continuous global optimization test problems, Journal of Global Optimization 31 (2005) 635–672.
[2] C. Brezinski, A classification of quasi-Newton methods, Numerical Algorithms 33 (2003) 123–135.
[3] C.G. Broyden, A class of methods for solving nonlinear simultaneous equations, Mathematics of Computation 19 (1965) 577–593.
[4] R.H. Byrd, J. Nocedal, A tool for the analysis of quasi-Newton methods with application to unconstrained minimization, SIAM Journal on Numerical Analysis 26 (3) (1989) 727–739.
[5] C.M. Chin, A global convergence theory of a filter line search method for nonlinear programming, Numerical Optimization Report, Department of Statistics, University of Oxford, England, UK, 2002.
[6] Y. Deng, Z. Liu, Two derivative-free algorithms for nonlinear equations, Optimization Methods and Software 23 (3) (2008) 395–410.
[7] J.E. Dennis Jr., J.J. Moré, Quasi-Newton methods, motivation and theory, SIAM Review 19 (1977) 46–89.
[8] J.E. Dennis Jr., R.B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice-Hall, Englewood Cliffs, 1983.
[9] R. Fletcher, S. Leyffer, Nonlinear programming without a penalty function, Mathematical Programming 91 (2002) 239–270.
[10] R. Fletcher, S. Leyffer, Ph. Toint, A brief history of filter methods, SIAG/OPT Views-and-News, A Forum for the SIAM Activity Group on Optimization 18 (1) (2007) 2–12.
[11] C.C. Gonzaga, E. Karas, M. Vanti, A globally convergent filter method for nonlinear programming, SIAM Journal on Optimization 14 (3) (2003) 646–669.
[12] N.I.M. Gould, S. Leyffer, Ph.L. Toint, A multidimensional filter algorithm for nonlinear equations and nonlinear least-squares, SIAM Journal on Optimization 15 (1) (2004) 17–38.
[13] N.I.M. Gould, Ph.L. Toint, C. Sainvitu, A filter-trust-region method for unconstrained optimization, SIAM Journal on Optimization 16 (2006) 341–357.
[14] E.W. Karas, A.P. Oening, A.A. Ribeiro, Global convergence of slanting filter methods for nonlinear programming, Applied Mathematics and Computation 200 (2008) 486–500.
[15] D. Li, M. Fukushima, A globally and superlinearly convergent Gauss–Newton-based BFGS method for symmetric nonlinear equations, SIAM Journal on Numerical Analysis 37 (1) (1999) 152–172.
[16] J.M. Martínez, Practical quasi-Newton methods for solving nonlinear systems, Journal of Computational and Applied Mathematics 124 (2000) 97–122.
[17] J. Nocedal, S.J. Wright, Numerical Optimization, Springer-Verlag, New York, 1999.
[18] A. Wächter, L.T. Biegler, Line search filter methods for nonlinear programming: motivation and global convergence, SIAM Journal on Optimization 16 (1) (2005) 1–31.