A nonmonotone trust region method for unconstrained optimization


Applied Mathematics and Computation 171 (2005) 371–384 www.elsevier.com/locate/amc

A nonmonotone trust region method for unconstrained optimization

Jiangtao Mo a,b,*, Kecun Zhang a, Zengxin Wei b

a College of Science, Xi'an Jiaotong University, 710049, China
b College of Mathematics and Information Science, Guangxi University, 530004, China

Abstract

In this paper, we propose a nonmonotone trust region method for unconstrained optimization. Our method can be regarded as a combination of a nonmonotone technique, a fixed steplength and the trust region method. When a trial step is not accepted, the method does not solve the subproblem again but generates a new iterate whose steplength is defined by a formula. We only allow an increase in the function value when trial steps are rejected in close succession of iterations. Under mild conditions, we prove that the algorithm is globally and superlinearly convergent. Preliminary numerical results are reported.
© 2005 Elsevier Inc. All rights reserved.

Keywords: Trust region method; Nonmonotone method; Fixed steplength; Unconstrained optimization

* Corresponding author. Present address: College of Mathematics and Information Science, Guangxi University, 530004, China. E-mail addresses: [email protected] (J. Mo), [email protected] (Z. Wei).

0096-3003/$ - see front matter © 2005 Elsevier Inc. All rights reserved. doi:10.1016/j.amc.2005.01.048


1. Introduction

In this paper, we consider the following unconstrained optimization problem:

\[
\min_{x \in \mathbb{R}^n} f(x), \tag{1.1}
\]

where $f: \mathbb{R}^n \to \mathbb{R}$ is twice continuously differentiable. It is well known that trust region methods are an important and efficient class of methods for nonlinear optimization. Since they were proposed by Levenberg [7] and Marquardt [8] for nonlinear least-squares problems and by Goldfeld et al. [5] for unconstrained optimization, trust region methods have been studied by many researchers; see [1,2,4,10,14,15,17] and the papers cited therein. Trust region methods are iterative. At each iterate $x_k$, a trial step $d_k$ is generated by solving the following subproblem:

\[
\min_{d \in \mathbb{R}^n} \ \phi_k(d) \equiv g_k^T d + \frac{1}{2} d^T B_k d \tag{1.2}
\]
\[
\text{s.t. } \|d\| \le \Delta_k, \tag{1.3}
\]

where $g_k = \nabla f(x_k)$, $B_k \in \mathbb{R}^{n \times n}$ is an approximate Hessian matrix of $f$ at $x_k$, and $\Delta_k > 0$ is a trust region radius. A criterion is used to decide whether the trial step $d_k$ is accepted. If the trial step is not accepted, the subproblem (1.2)–(1.3) is solved again with a reduced trust region radius until an acceptable step is found. Hence, the subproblem may be solved several times within one iteration, and the total computational cost of one iteration can be expensive for large-scale problems.

In recent years, a variety of trust region methods have been proposed in the literature. For example, Nocedal and Yuan [12] and Gertz [9] presented methods which combine a line search technique with the trust region method. When the trial step is not successful, their methods perform a line search to find the next iterate instead of resolving the subproblem. Therefore, their methods require less computation than classical trust region methods. Deng et al. [3], Zhang et al. [17] and Sun [13] proposed various nonmonotone trust region methods for unconstrained optimization. These papers indicate that nonmonotone algorithms are efficient, especially for ill-conditioned problems. On the other hand, Sun and Zhang [6] and Chen and Sun [15] proposed fixed steplength methods for unconstrained optimization. In their approaches, the steplength is computed by a formula at each iteration, without any line search. Thus their methods can be practical in cases where the line search is expensive or difficult, and they allow a considerable saving in the number of function evaluations.

In this paper, we consider a method which combines the nonmonotone technique, a fixed steplength and the trust region method.


Our aim is to improve the algorithm proposed in [12] and make it more effective in practical implementation. The main difference between the method in [12] and ours is that in the former a steplength is computed by a line search when the trial step is not successful, whereas in our method the steplength is defined by a formula. We use the formula suggested by Sun and Zhang [6] and Chen and Sun [15] to obtain the steplength. Moreover, most nonmonotone methods allow an increase in the function value at every iteration, but our method only allows an increase when trial steps are rejected in close succession of iterations.

The paper is organized as follows. In Section 2, we describe our algorithm for unconstrained optimization problems, which combines the techniques of fixed steplength, nonmonotonicity and trust region. In Section 3, under suitable conditions, the global convergence and superlinear convergence of the proposed algorithm are established. Preliminary numerical results are presented in Section 4.

2. Algorithm

In this section, we describe a method which combines the nonmonotone technique, a fixed steplength and the trust region method. Throughout this paper, we use $\|\cdot\|$ to denote the Euclidean norm and write $f(x_k)$ as $f_k$, $g(x_k)$ as $g_k$, etc. Vectors are column vectors unless a transpose is used.

At each iteration, a trial step $d_k$ is generated by solving the trust region subproblem (1.2)–(1.3). As in [12], we solve (1.2)–(1.3) inexactly such that $\|d_k\| \le \Delta_k$ and

\[
\phi_k(0) - \phi_k(d_k) \ge \tau \|g_k\| \min\{\Delta_k, \|g_k\|/\|B_k\|\} \tag{2.1}
\]

and

\[
d_k^T g_k \le -\tau \|g_k\| \min\{\Delta_k, \|g_k\|/\|B_k\|\}, \tag{2.2}
\]

where $\tau \in (0, 1)$ is a constant. To determine whether the trial step will be accepted, we compute $q_k$, the ratio between the actual reduction $f_{l(k)} - f(x_k + d_k)$ and the predicted reduction $\phi_k(0) - \phi_k(d_k)$, i.e.,

\[
q_k = \frac{f_{l(k)} - f(x_k + d_k)}{\phi_k(0) - \phi_k(d_k)}, \tag{2.3}
\]

where

\[
f_{l(k)} = \max\{f_{k-j} : 0 \le j \le m(k)\} \tag{2.4}
\]

and $m(k)$ is an integer defined by


\[
m(k) = \begin{cases} 0 & \text{if } q_{k-1} \ge c_2, \\ \min\{m(k-1) + 1, M\} & \text{otherwise}, \end{cases} \tag{2.5}
\]

where $M \ge 1$ is an integer constant. If $q_k \ge c_2$, we accept $d_k$ and let $x_{k+1} = x_k + d_k$. Otherwise, we generate a new iterate by $x_{k+1} = x_k + \alpha_k d_k$, where $\alpha_k$ is a steplength computed by

\[
\alpha_k = -\frac{\delta g_k^T d_k}{d_k^T B_k d_k} \tag{2.6}
\]

and $\delta \in (0, 1)$ is a constant.

Note that in our method, if the last trial step was successful, the function value is reduced at the current iteration; but if recent trial steps were rejected in succession, the method may allow an increase in the function value at the current iteration. Hence, our method is a nonmonotone method. Now, we describe the complete algorithm.

Algorithm 1 (Nonmonotone trust region method)

Step 1. Given $x_1 \in \mathbb{R}^n$, $\Delta_1 > 0$, constants $c_1$, $c_2$, $c_3$ and $c_4$ such that $0 < c_2 < 1$, $0 < c_3 < c_4 < 1 < c_1$, an integer constant $M \ge 1$ and a symmetric positive definite matrix $B_1 \in \mathbb{R}^{n \times n}$; set $m(1) := 0$ and $k := 1$.
Step 2. Solve (1.2)–(1.3) inexactly so that $\|d_k\| \le \Delta_k$ and (2.1) and (2.2) are satisfied.
Step 3. Compute $q_k$ by (2.3). If $q_k \ge c_2$, set
\[
x_{k+1} = x_k + d_k \tag{2.7}
\]
and go to Step 5.
Step 4. Compute $\alpha_k$ by (2.6) and set
\[
x_{k+1} = x_k + \alpha_k d_k. \tag{2.8}
\]
Step 5. Compute $m(k+1)$ by (2.5) and compute $\Delta_{k+1}$ by
\[
\Delta_{k+1} \begin{cases} \in [c_3 \|d_k\|, c_4 \Delta_k] & \text{if } q_k < c_2, \\ = \Delta_k & \text{if } q_k \ge c_2 \text{ and } \|d_k\| < \Delta_k, \\ \in [\Delta_k, c_1 \Delta_k] & \text{if } q_k \ge c_2 \text{ and } \|d_k\| = \Delta_k. \end{cases} \tag{2.9}
\]
Generate $B_{k+1}$, set $k := k + 1$ and go to Step 2.

Remark 1. Applying Algorithm 2.6 in [12], we can find an inexact solution of (1.2)–(1.3) that satisfies $\|d_k\| \le \Delta_k$, (2.1) and (2.2).

Remark 2. The matrix $B_k$ can be updated by the BFGS formula [12].
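To make the mechanics of Algorithm 1 concrete, the following is a minimal Python sketch of one possible implementation. It is an illustration under stated assumptions, not the authors' code: the subproblem solver is a simple Cauchy-point step standing in for Algorithm 2.6 of [12] (the Cauchy point satisfies (2.1)–(2.2) for any $\tau \le 1/2$), $B_k$ is held fixed instead of being updated, and all parameter values are hypothetical.

```python
import numpy as np

def cauchy_step(g, B, delta):
    # Inexact solution of subproblem (1.2)-(1.3): the Cauchy point, a simple
    # stand-in for Algorithm 2.6 of [12]; it meets (2.1)-(2.2) with tau <= 1/2.
    gnorm = np.linalg.norm(g)
    gBg = g @ B @ g
    t = delta / gnorm if gBg <= 0 else min(delta / gnorm, gnorm ** 2 / gBg)
    return -t * g

def nonmonotone_tr(f, grad, x, B, delta=1.0, M=5, c1=2.0, c2=0.1,
                   c3=0.25, c4=0.5, step_delta=0.5, tol=1e-6, max_iter=1000):
    # Minimal sketch of Algorithm 1; parameter values are hypothetical.
    # step_delta plays the role of the constant delta in formula (2.6).
    fvals = [f(x)]                         # history of f_k, used in (2.4)
    m = 0                                  # the integer m(k) of (2.5)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            break
        d = cauchy_step(g, B, delta)
        fl = max(fvals[-(m + 1):])                 # f_{l(k)} from (2.4)
        pred = -(g @ d) - 0.5 * (d @ B @ d)        # phi_k(0) - phi_k(d_k) > 0
        q = (fl - f(x + d)) / pred                 # nonmonotone ratio (2.3)
        if q >= c2:                                # Step 3: accept, (2.7)
            x = x + d
            if np.isclose(np.linalg.norm(d), delta):
                delta *= c1                        # (2.9): step on the boundary
        else:                                      # Step 4: fixed steplength
            alpha = -step_delta * (g @ d) / (d @ B @ d)   # formula (2.6)
            x = x + alpha * d
            # (2.9): c4*delta lies in [c3*||d||, c4*delta] since ||d|| <= delta
            delta *= c4
        m = 0 if q >= c2 else min(m + 1, M)        # (2.5) for m(k+1)
        fvals.append(f(x))
        # B would be updated here, e.g. by a BFGS formula (Remark 2)
    return x
```

On a strictly convex quadratic with $B$ equal to the true Hessian, every trial step is accepted and the loop reduces to a standard trust region iteration; the nonmonotone reference value $f_{l(k)}$ only comes into play after a rejected step, exactly as described above.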


3. Global convergence

From now on, we turn to the analysis of the behavior of Algorithm 1 when it is applied to problem (1.1). To this end, the following assumption is required.

Assumption 1

(1) The level set $L(x_1) = \{x \in \mathbb{R}^n \,|\, f(x) \le f(x_1)\}$ is bounded.
(2) The function $f(x)$ is $LC^1$ in $\mathbb{R}^n$, i.e., there exists $\mu > 0$ such that
\[
\|g(x) - g(y)\| \le \mu \|x - y\| \quad \forall x, y \in \mathbb{R}^n, \tag{3.1}
\]
where $g(x) = \nabla f(x)$.
(3) The matrices $\{B_k\}$ are positive definite and there exists $\omega > 0$ such that
\[
d^T B_k d \ge \omega d^T d \quad \forall d \in \mathbb{R}^n \text{ and } k = 1, 2, \ldots \tag{3.2}
\]

For simplicity, we define two index sets as follows:
\[
I = \{k : q_k \ge c_2\} \quad \text{and} \quad J = \{k : q_k < c_2\}.
\]

Lemma 1. Let $\{x_k\}$ be the sequence generated by Algorithm 1. Suppose that Assumption 1 holds and
\[
\delta \in (0, \omega/\mu). \tag{3.3}
\]
Then
\[
f(x_{k+1}) - f_{l(k)} \le \{(1 - \mu\delta/\omega)\delta/2\} g_k^T d_k \quad \forall k \in J. \tag{3.4}
\]

Proof. The definition of $f_{l(k)}$ implies that $f_k \le f_{l(k)}$ for all $k$. Using the mean-value theorem, we have
\[
f_{k+1} - f_{l(k)} \le f(x_{k+1}) - f_k = g(\xi)^T (x_{k+1} - x_k), \tag{3.5}
\]
where $\xi \in [x_k, x_{k+1}]$. For $k \in J$, from (2.8), (3.1), (2.6) and (3.2), we obtain
\[
\begin{aligned}
g(\xi)^T (x_{k+1} - x_k) &= g_k^T (x_{k+1} - x_k) + (g(\xi) - g_k)^T (x_{k+1} - x_k) \\
&\le g_k^T (x_{k+1} - x_k) + \|g(\xi) - g_k\| \, \|x_{k+1} - x_k\| \\
&\le g_k^T (x_{k+1} - x_k) + \mu \|x_{k+1} - x_k\|^2 \\
&= \alpha_k g_k^T d_k + \mu \alpha_k^2 \|d_k\|^2 \\
&= \bigl(1 - \mu\delta \|d_k\|^2 / d_k^T B_k d_k\bigr) \alpha_k g_k^T d_k \\
&\le (1 - \mu\delta/\omega) \alpha_k g_k^T d_k. \tag{3.6}
\end{aligned}
\]
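The lower bound on the steplength invoked next follows directly from the definitions; as a worked step (expanding the appeal to (2.6), (2.1) and (1.2)): condition (2.1) gives $\phi_k(0) - \phi_k(d_k) \ge 0$, so by the definition (1.2) of $\phi_k$ we get $-g_k^T d_k \ge \frac{1}{2} d_k^T B_k d_k$, and therefore by (2.6)

\[
\alpha_k = \frac{\delta\,(-g_k^T d_k)}{d_k^T B_k d_k} \ge \frac{\delta}{2}.
\]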


Note that (2.6), (2.1) and (1.2) imply that $\alpha_k \ge \delta/2$ for all $k$ (as verified above), and (3.3) implies that $1 - \mu\delta/\omega > 0$. Thus, from (3.5), (3.6) and (2.2), it follows that (3.4) holds for all $k \in J$. □

Lemma 2. Let $\{x_k\}$ be the sequence generated by Algorithm 1. If Assumption 1 and (3.3) hold, then $\{f_{l(k)}\}$ is monotonically non-increasing. Furthermore, $x_k \in L(x_1)$ for all $k$.

Proof. We first show that
\[
f_{l(k+1)} \le f_{l(k)} \tag{3.7}
\]
for all $k$. For $k \in I$, it follows from $q_k \ge c_2$, (2.1), (2.3) and (2.7) that $f(x_{k+1}) \le f_{l(k)}$. On the other hand, by $q_k \ge c_2$ and Step 5 of Algorithm 1, we have $m(k+1) = 0$. Hence $f_{l(k+1)} = f(x_{k+1})$, and (3.7) holds for $k \in I$. Suppose now that $k \in J$. It follows from (3.4) and (2.2) that
\[
f_{k+1} \le f_{l(k)}. \tag{3.8}
\]
If $m(k+1) = 0$, then $f_{l(k+1)} = f_{k+1}$ and (3.8) implies that (3.7) holds. If $m(k+1) > 0$, then $m(k+1) \le m(k) + 1$, and (3.7) follows from the definition of $f_{l(k)}$ and (3.8). Hence (3.7) holds for all $k$.

Now we prove the last conclusion. The definition of $f_{l(k)}$ and Step 5 of Algorithm 1 imply that $f_k \le f_{l(k)}$ and $f_{l(1)} = f_1$. By (3.7), we have $f_{l(k)} \le f_{l(1)}$. Hence $x_k \in L(x_1)$ for all $k$. □

Lemma 3. Suppose that Assumption 1 and (3.3) hold, and that there exists $\varepsilon > 0$ such that $\|g_k\| \ge \varepsilon$ for all $k$. Then there exists a constant $c > 0$ such that
\[
f_{l(k)} - f_{k+1} \ge c\varepsilon \min\{\Delta_k, \varepsilon/\|B_k\|\} \tag{3.9}
\]
holds for all $k$.

Proof. If $k \in I$, or equivalently $q_k \ge c_2$, it follows from (2.1) and (2.3) that
\[
f_{l(k)} - f_{k+1} \ge c_2 (\phi_k(0) - \phi_k(d_k)) \ge c_2 \tau \varepsilon \min\{\Delta_k, \varepsilon/\|B_k\|\}. \tag{3.10}
\]
If $k \in J$, from (3.4) and (2.2), it follows that
\[
f_{l(k)} - f_{k+1} \ge \{(1 - \mu\delta/\omega)\delta/2\}(-g_k^T d_k) \ge \{(1 - \mu\delta/\omega)\delta/2\}\, \tau \varepsilon \min\{\Delta_k, \varepsilon/\|B_k\|\}. \tag{3.11}
\]
Thus, (3.10) and (3.11) imply that (3.9) holds for all $k$ with $c = \min\{c_2 \tau, (1 - \mu\delta/\omega)\delta\tau/2\}$. □


Lemma 4. Suppose that Assumption 1 and (3.3) hold, and that there exists $\varepsilon > 0$ such that $\|g_k\| \ge \varepsilon$ for all $k$. Then there exists $\nu \in (0, 1)$ such that
\[
\lim_{k \to \infty} \min\{\nu \Delta_k, \varepsilon/Z_k\} = 0, \tag{3.12}
\]
where $Z_k = 1 + \max_{1 \le i \le k} \|B_i\|$.

Proof. We define $S$ to be the set of integers $k$ such that $m(k) = 0$. Let $\{i_j : j = 1, 2, \ldots\}$ be an infinite set of integers which contains $S$ and satisfies
\[
1 \le i_{j+1} - i_j \le M + 1 \tag{3.13}
\]
for all $j$, and
\[
i_{j+1} - i_j = M + 1 \tag{3.14}
\]
if $i_{j+1} \notin S$. Note that $i_1 = 1$ since $m(1) = 0$. Next, we show that the inequality
\[
f_{l(i_j)} - f_{l(i_{j+1})} \ge c\varepsilon \min\{\Delta_{i_{j+1}}/c_1, \varepsilon/Z_{i_{j+1}}\} \tag{3.15}
\]
holds for all $j$. We consider two cases separately.

Case 1. $i_{j+1} - i_j = 1$. By (3.14) and $M \ge 1$, we have $i_{j+1} \in S$. This implies $m(i_{j+1}) = 0$. Then from (2.4), we obtain $f_{l(i_{j+1})} = f_{i_{j+1}}$. On the other hand, $m(i_{j+1}) = 0$, (2.5) and (2.9) imply that
\[
\Delta_{i_j} \ge \Delta_{i_{j+1}}/c_1. \tag{3.16}
\]
From (3.9), (3.16) and the monotonicity of $\{Z_k\}$, it follows that (3.15) holds.

Case 2. $i_{j+1} - i_j > 1$. For $s = 1, 2, \ldots, i_{j+1} - i_j - 1$, by the definition of $\{i_j\}$, we have $m(i_j + s) > 0$. It follows from (2.5) that $q_{i_j+s} < c_2$. Then from (2.9), we have
\[
\Delta_{i_j} \ge \Delta_{i_j+1} \ge \Delta_{i_j+2} \ge \cdots \ge \Delta_{i_{j+1}-1} \ge \Delta_{i_{j+1}} \tag{3.17}
\]
if $q_{i_j} < c_2$, or
\[
c_1 \Delta_{i_j} \ge \Delta_{i_j+1} \ge \Delta_{i_j+2} \ge \cdots \ge \Delta_{i_{j+1}-1} \ge \Delta_{i_{j+1}} \tag{3.18}
\]
if $q_{i_j} \ge c_2$. If $i_{j+1} \in S$, then $m(i_{j+1}) = 0$ and $f_{l(i_{j+1})} = f_{i_{j+1}}$. From (3.9), we have
\[
f_{l(i_{j+1}-1)} - f_{i_{j+1}} \ge c\varepsilon \min\{\Delta_{i_{j+1}-1}, \varepsilon/Z_{i_{j+1}-1}\}. \tag{3.19}
\]
From Lemma 2, it follows that $f_{l(i_j)} \ge f_{l(i_{j+1}-1)}$. Thus, from (3.17)–(3.19) and the monotonicity of $\{Z_k\}$ we obtain
\[
f_{l(i_j)} - f_{i_{j+1}} \ge c\varepsilon \min\{\Delta_{i_{j+1}}, \varepsilon/Z_{i_{j+1}}\}. \tag{3.20}
\]
Now, we assume that $i_{j+1} \notin S$. For $s = 0, 1, \ldots, i_{j+1} - i_j - 1$, by $c_1 > 1$, (3.9), (3.17), (3.18) and the definition of $Z_k$, we have
\[
f_{l(i_j+s)} \ge f_{i_j+s+1} + c\varepsilon \min\{\Delta_{i_j+s}, \varepsilon/\|B_{i_j+s}\|\} \ge f_{i_j+s+1} + c\varepsilon \min\{\Delta_{i_{j+1}}/c_1, \varepsilon/Z_{i_{j+1}}\}.
\]


By Lemma 2 and (3.14), it follows that
\[
f_{l(i_j)} \ge \max\{f_{i_j+s+1} : 0 \le s \le M\} + c\varepsilon \min\{\Delta_{i_{j+1}}/c_1, \varepsilon/Z_{i_{j+1}}\} \ge f_{l(i_{j+1})} + c\varepsilon \min\{\Delta_{i_{j+1}}/c_1, \varepsilon/Z_{i_{j+1}}\}, \tag{3.21}
\]
where the last inequality follows from the definition of $f_{l(k)}$ and $m(i_{j+1}) \le M$. Since $c_1 > 1$, from (3.20) and (3.21) we see that (3.15) holds.

Now we prove that (3.12) holds. By Lemma 2 and Assumption 1, the sequence $\{f_{l(i_j)}\}$ is convergent. Summing the inequalities (3.15), we obtain
\[
c\varepsilon \sum_{j=1}^{\infty} \min\{\Delta_{i_{j+1}}/c_1, \varepsilon/Z_{i_{j+1}}\} \le f_{l(i_1)} - \lim_{j \to \infty} f_{l(i_{j+1})} < +\infty.
\]

From (3.13), (3.17), (3.18) and the monotonicity of $\{Z_k\}$, we get
\[
\begin{aligned}
\sum_{k=1}^{\infty} \min\{\Delta_k/c_1^2, \varepsilon/Z_k\} &= \sum_{j=1}^{\infty} \sum_{s=i_j}^{i_{j+1}-1} \min\{\Delta_s/c_1^2, \varepsilon/Z_s\} \\
&\le \sum_{j=1}^{\infty} (i_{j+1} - i_j) \min\{\Delta_{i_j}/c_1, \varepsilon/Z_{i_j}\} \\
&\le (M + 1) \sum_{j=1}^{\infty} \min\{\Delta_{i_j}/c_1, \varepsilon/Z_{i_j}\} < +\infty.
\end{aligned}
\]
This implies that (3.12) holds with $\nu = 1/c_1^2$. □

Lemma 5. Suppose that Assumption 1 and (3.3) hold, and that there exists $\varepsilon > 0$ such that $\|g_k\| \ge \varepsilon$ for all $k$. Then the inequality
\[
\|d_k\| \ge \min\{1, \tau(1 - c_2)\}\, \varepsilon/Z_k \tag{3.22}
\]
holds for $k \in J$ sufficiently large.

Proof. Notice that Algorithm 1 ensures that $\|d_k\| \le \Delta_k$. If the inequality $\|d_k\| > \tau(1 - c_2)\varepsilon/(2\mu)$ holds for sufficiently large $k \in J$, it follows from (3.12) that $\varepsilon/Z_k \le \tau(1 - c_2)\varepsilon/(2\mu)$ for sufficiently large $k \in J$. Then
\[
\|d_k\| > \varepsilon/Z_k \tag{3.23}
\]
holds for sufficiently large $k \in J$. Now assume that $\|d_k\| \le \tau(1 - c_2)\varepsilon/(2\mu)$ for $k \in J$. Since $q_k < c_2$ for $k \in J$, by (2.3) we have
\[
f_k - f(x_k + d_k) < c_2 (-g_k^T d_k - d_k^T B_k d_k/2). \tag{3.24}
\]


Using the mean-value theorem and (3.1), we obtain
\[
\begin{aligned}
f_k - f(x_k + d_k) &= -g(\eta)^T d_k = -g_k^T d_k + (g_k - g(\eta))^T d_k \\
&\ge -g_k^T d_k - \mu \|d_k\|^2 \ge -g_k^T d_k - \tau(1 - c_2)\varepsilon \|d_k\|/2, \tag{3.25}
\end{aligned}
\]
where $\eta \in [x_k, x_k + d_k]$. It follows from (3.24) and (3.25) that
\[
(1 - c_2)(g_k^T d_k + \tau\varepsilon \|d_k\|/2) > c_2\, d_k^T B_k d_k/2. \tag{3.26}
\]
From $\|g_k\| \ge \varepsilon$ and (2.1), we have
\[
-g_k^T d_k - d_k^T B_k d_k/2 \ge \tau\varepsilon \min\{\|d_k\|, \varepsilon/\|B_k\|\}. \tag{3.27}
\]
Then it follows from $d_k^T B_k d_k \le \|d_k\|^2 \|B_k\|$, (3.26) and (3.27) that
\[
\|d_k\|^2 \|B_k\| \ge \tau(1 - c_2)\varepsilon \min\{\|d_k\|, 2\varepsilon/\|B_k\| - \|d_k\|\}.
\]
If $\|d_k\| > 2\varepsilon/\|B_k\| - \|d_k\|$, it holds that
\[
\|d_k\| > \varepsilon/\|B_k\|. \tag{3.28}
\]
Otherwise, we have
\[
\|d_k\| \|B_k\| \ge \tau(1 - c_2)\varepsilon. \tag{3.29}
\]

From the definition of $Z_k$, (3.23), (3.28) and (3.29), it follows that (3.22) holds for all $k \in J$. □

Lemma 6. Suppose that Assumption 1 and (3.3) hold, and that there exists $\varepsilon > 0$ such that $\|g_k\| \ge \varepsilon$ for all $k$. Then the inequality
\[
\Delta_k \ge c_3 \min\{1, \tau(1 - c_2)\}\, \varepsilon/Z_k \tag{3.30}
\]
holds for all sufficiently large $k$.

Proof. If $J$ is a finite set, there exists a positive constant $a$ such that $\Delta_k \ge a$ for all $k$; then (3.12) implies that $\lim_{k \to \infty} \varepsilon/Z_k = 0$, and hence (3.30) holds for all large $k$. Now we assume that $J$ is an infinite set. By Lemma 5, there exists $\bar{k} \in J$ such that (3.22) holds for $k \in J$ and $k \ge \bar{k}$. For any $k \in I$ with $k \ge \bar{k}$, let $\hat{k} = \max\{j : j \in J \text{ and } j \le k\}$. The definition of $\hat{k}$ implies that
\[
\|d_{\hat{k}}\| \ge \min\{1, \tau(1 - c_2)\}\, \varepsilon/Z_{\hat{k}} \tag{3.31}
\]
and
\[
\hat{k} + s \in I \quad \text{for all } s = 1, 2, \ldots, k - \hat{k}. \tag{3.32}
\]
Moreover, (3.32) implies that $q_{\hat{k}+s} \ge c_2$ for all $s = 1, 2, \ldots, k - \hat{k}$. From this and (2.9), we have
\[
\Delta_{\hat{k}+1} \le \Delta_{\hat{k}+2} \le \cdots \le \Delta_k. \tag{3.33}
\]


On the other hand, from (2.9), we have $\Delta_{\hat{k}} \le \Delta_{\hat{k}+1}$ if $q_{\hat{k}} \ge c_2$, or $c_3 \|d_{\hat{k}}\| \le \Delta_{\hat{k}+1}$ if $q_{\hat{k}} < c_2$. Since $c_3 \in (0, 1)$ and Algorithm 1 ensures $\|d_k\| \le \Delta_k$ for all $k$, we have
\[
\Delta_{\hat{k}+1} \ge c_3 \|d_{\hat{k}}\|. \tag{3.34}
\]
By the monotonicity of $\{Z_k\}$, (3.31), (3.33) and (3.34), we see that (3.30) holds for $k \in I$ and $k \ge \bar{k}$. □

Now we prove the global convergence of Algorithm 1.

Theorem 7. Suppose that Assumption 1 and (3.3) hold, and that $\{B_k\}$ satisfies
\[
\sum_{k=1}^{\infty} 1/Z_k = \infty, \tag{3.35}
\]
where $Z_k = 1 + \max_{1 \le j \le k} \|B_j\|$. Then the sequence $\{x_k\}$ generated by Algorithm 1 satisfies
\[
\liminf_{k \to \infty} \|g_k\| = 0. \tag{3.36}
\]
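Condition (3.35) is rather mild. As an illustration (ours, not part of the original argument), it already holds when the norms of the approximate Hessians grow at most linearly, since $Z_k$ is then dominated by a harmonic-type series:

\[
\|B_k\| \le c_0 k \ \Longrightarrow\ Z_k \le 1 + c_0 k \ \Longrightarrow\ \sum_{k=1}^{\infty} \frac{1}{Z_k} \ge \sum_{k=1}^{\infty} \frac{1}{1 + c_0 k} = \infty.
\]

In particular, (3.35) holds whenever $\{\|B_k\|\}$ is uniformly bounded.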

Proof. If (3.36) is not true, there is a constant $\varepsilon > 0$ such that $\|g_k\| \ge \varepsilon$ for all $k$. From Lemma 6, there exists an integer $\bar{k}$ such that
\[
\min\{\Delta_k, \varepsilon/Z_k\} \ge r\varepsilon/Z_k \tag{3.37}
\]
holds for all $k \ge \bar{k}$, where $r = c_3 \min\{1, \tau(1 - c_2)\}$. Let $k$ be any integer such that $k \ge \bar{k}$. From (3.37) and (3.9), we have that
\[
f_{l(k+s)} \ge f_{k+s+1} + c\varepsilon \min\{\Delta_{k+s}, \varepsilon/\|B_{k+s}\|\} \ge f_{k+s+1} + cr\varepsilon^2/Z_{k+s} \tag{3.38}
\]
holds for $s = 0, 1, \ldots, M$. From Lemma 2 and the monotonicity of $\{Z_k\}$, it follows that
\[
f_{l(k)} \ge \max\{f_{k+s+1} : 0 \le s \le M\} + cr\varepsilon^2/Z_{k+M+1} \ge f_{l(k+M+1)} + cr\varepsilon^2/Z_{k+M+1}, \tag{3.39}
\]
where the last inequality follows from the definition of $f_{l(k)}$. From Assumption 1 and Lemma 2, $\{f_{l(k)}\}$ is monotonically non-increasing and convergent. Combining this with (3.39), we obtain
\[
\begin{aligned}
\sum_{k \ge \bar{k}} 1/Z_{k+M+1} &\le (1/cr\varepsilon^2) \sum_{k \ge \bar{k}} (f_{l(k)} - f_{l(k+M+1)}) \\
&= (1/cr\varepsilon^2) \sum_{k \ge \bar{k}} \sum_{s=0}^{M} (f_{l(k+s)} - f_{l(k+s+1)}) \\
&\le ((M+1)/cr\varepsilon^2) \sum_{k \ge 1} (f_{l(k)} - f_{l(k+1)}) < \infty,
\end{aligned}
\]
since each nonnegative difference $f_{l(j)} - f_{l(j+1)}$ occurs at most $M + 1$ times in the double sum. This contradicts (3.35). □


Based on this theorem, we can derive the following superlinear convergence result for Algorithm 1.

Theorem 8. Suppose that $\|B_k^{-1} g_k\| \le \Delta_k$ for all $k$ and that the subproblem (1.2) is solved exactly, i.e., $d_k = -B_k^{-1} g_k$. Suppose also that Assumption 1 holds and that
\[
\lim_{k \to \infty} \frac{\|(B_k - \nabla^2 f(x^*)) d_k\|}{\|d_k\|} = 0. \tag{3.40}
\]
Then, if $\{x_k\}$ converges to a point $x^*$, the rate of convergence is superlinear, i.e.,
\[
\|x_{k+1} - x^*\| = o(\|x_k - x^*\|). \tag{3.41}
\]

Proof. Let $x_{k+1} = x_k + \lambda_k d_k$, where
\[
\lambda_k = \begin{cases} 1 & \text{if } q_k \ge c_2, \\ \alpha_k & \text{if } q_k < c_2 \end{cases} \tag{3.42}
\]
and $\alpha_k$ is defined by (2.6). From (2.6), (2.1) and (1.2), it follows that $\alpha_k \ge \delta/2$ for all $k$. Then $\lambda_k \ge \delta/2$ and $\|x_{k+1} - x_k\| \ge (\delta/2)\|d_k\|$ for all $k$. Since $\{x_k\}$ is convergent, we have $d_k \to 0$ as $k \to \infty$. Using a Taylor expansion, we have
\[
f_k - f(x_k + d_k) = -g_k^T d_k - \frac{1}{2} d_k^T \nabla^2 f(x_k) d_k + o(\|d_k\|^2) = -g_k^T d_k - \frac{1}{2} d_k^T \nabla^2 f(x^*) d_k + o(\|d_k\|^2).
\]
Since $d_k = -B_k^{-1} g_k$, we obtain $\phi_k(0) - \phi_k(d_k) = d_k^T B_k d_k/2$. Thus,
\[
q_k - 1 \ge \frac{f_k - f(x_k + d_k)}{\phi_k(0) - \phi_k(d_k)} - 1 = \frac{d_k^T B_k d_k - d_k^T \nabla^2 f(x^*) d_k}{d_k^T B_k d_k} + \frac{o(\|d_k\|^2)}{d_k^T B_k d_k}. \tag{3.43}
\]
From (3.2), i.e., $d^T B_k d \ge \omega \|d\|^2$ for all $k$, it follows that
\[
\left| \frac{d_k^T B_k d_k - d_k^T \nabla^2 f(x^*) d_k}{d_k^T B_k d_k} \right| = \frac{\|d_k\|^2}{d_k^T B_k d_k} \cdot \frac{|d_k^T B_k d_k - d_k^T \nabla^2 f(x^*) d_k|}{\|d_k\|^2} \le \frac{1}{\omega} \cdot \frac{\|B_k d_k - \nabla^2 f(x^*) d_k\|}{\|d_k\|}. \tag{3.44}
\]


From (3.2), (3.43), (3.44) and (3.40), we see that $q_k \ge c_2$ for sufficiently large $k$. Therefore, Algorithm 1 reduces to the Newton method or a quasi-Newton method for sufficiently large $k$, and the superlinear convergence result can be established as in Theorem 5.5.2 of [16]. □

4. Numerical results

We have implemented the new algorithm and compared it both with the trust region algorithm combining line search (TRACLS) given by Nocedal and Yuan [12] and with the nonmonotone trust region algorithm (NTRA) given by Sun [13]. We have tested the algorithms on the following problems with different values of $n$ ranging from $n = 32$ to $n = 512$.

Problem 1. Extended Powell singular function [11].
\[
f(x) = \sum_{i=1}^{n/4} \{(x_{4i-3} + 10x_{4i-2})^2 + 5(x_{4i-1} - x_{4i})^2 + (x_{4i-2} - 2x_{4i-1})^4 + 10(x_{4i-3} - x_{4i})^4\}, \quad x \in \mathbb{R}^n.
\]

The function has a minimizer $x^* = (0, 0, \ldots, 0)^T$ with $f(x^*) = 0$. The initial point has been taken as $x_1 = (3, -1, 0, 1, \ldots, 3, -1, 0, 1)^T$.

Problem 2. Extended Rosenbrock function [11].
\[
f(x) = \sum_{i=1}^{n/2} \{100(x_{2i} - x_{2i-1}^2)^2 + (1 - x_{2i-1})^2\}, \quad x \in \mathbb{R}^n.
\]

Here, $x^* = (1, 1, \ldots, 1)^T$ with $f(x^*) = 0$ and $x_1 = (-1.2, 1, -1.2, 1, \ldots, -1.2, 1)^T$.

Problem 3. Extended Beale function [11].

\[
f(x) = \sum_{i=1}^{n/2} \{[1.5 - x_{2i-1}(1 - x_{2i})]^2 + [2.25 - x_{2i-1}(1 - x_{2i}^2)]^2 + [2.625 - x_{2i-1}(1 - x_{2i}^3)]^2\}, \quad x \in \mathbb{R}^n.
\]

Here $x^* = (3, 0.5, 3, 0.5, \ldots, 3, 0.5)^T$ with $f(x^*) = 0$, and $x_1 = (1, 1, \ldots, 1)^T$.

The algorithms are coded in Matlab 6.1. The initial trust region radius is chosen as $\Delta_1 = 0.5$. The trial step is computed by Algorithm 2.6 of [12] with $c = 1 + 10^{-6}$, and $B_k$ is updated by the BFGS formula; however, we do not update $B_k$ if $s_k^T y_k \le 0$, where $s_k = x_{k+1} - x_k$ and $y_k = g_{k+1} - g_k$.
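The experiments were run in Matlab; purely for illustration, the following is a NumPy transcription of the three test objectives and the standard starting points quoted above (a sketch assuming only `numpy`; the gradients, which the algorithms also need, are omitted).

```python
import numpy as np

def ext_powell(x):
    # Extended Powell singular function (Problem 1); n must be a multiple of 4.
    a, b, c, d = x[0::4], x[1::4], x[2::4], x[3::4]   # x_{4i-3}..x_{4i}
    return np.sum((a + 10 * b) ** 2 + 5 * (c - d) ** 2
                  + (b - 2 * c) ** 4 + 10 * (a - d) ** 4)

def ext_rosenbrock(x):
    # Extended Rosenbrock function (Problem 2); n must be even.
    odd, even = x[0::2], x[1::2]                      # x_{2i-1}, x_{2i}
    return np.sum(100 * (even - odd ** 2) ** 2 + (1 - odd) ** 2)

def ext_beale(x):
    # Extended Beale function (Problem 3); n must be even.
    u, v = x[0::2], x[1::2]                           # x_{2i-1}, x_{2i}
    return np.sum((1.5 - u * (1 - v)) ** 2
                  + (2.25 - u * (1 - v ** 2)) ** 2
                  + (2.625 - u * (1 - v ** 3)) ** 2)

# Starting points used in the paper (cf. [11]), for a given dimension n:
n = 32
x1_powell = np.tile([3.0, -1.0, 0.0, 1.0], n // 4)
x1_rosen = np.tile([-1.2, 1.0], n // 2)
x1_beale = np.ones(n)
```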


Table 1
Numerical comparisons

Problem                   n     TRACLS         NTRA        Algorithm 1
1. Extended Rosenbrock    32    37/46/38(3)    47/48/48    38/43/43(4)
                          64    38/52/39(6)    48/49/49    278/305/305(26)
                          128   42/43/43(0)    54/55/55    40/43/43(2)
                          256   41/54/42(5)    52/53/53    42/46/46(3)
                          512   47/55/48(3)    55/56/56    45/49/49(3)
2. Extended Powell        32    67/70/68(1)    59/60/60    66/69/69(2)
                          64    69/76/70(3)    66/67/67    69/72/72(2)
                          128   71/74/72(1)    74/75/75    65/68/68(2)
                          256   69/75/70(2)    89/90/90    70/73/73(2)
                          512   73/77/74(1)    80/81/81    83/87/87(3)
3. Extended Beale         32    Failed         21/22/22    23/27/27(3)
                          64    Failed         22/23/23    24/29/29(4)
                          128   Failed         27/28/28    54/62/62(7)
                          256   25/26/26(0)    25/26/26    25/26/26(0)
                          512   27/30/28(1)    31/32/32    27/29/29(1)

In all tests, the initial matrix $B_1$ was chosen as $|f_1| I$, where $I$ is the identity matrix. The stopping condition is
\[
\max\{\|g(x_k)\|, |f(x_k) - f(x^*)|\} \le 10^{-6}.
\]
The iteration is also terminated if the number of function evaluations exceeds 300. The numerical results of our tests are reported in Table 1, in the form I/F/G(S), where I, F and G denote the numbers of iterations, function evaluations and gradient evaluations, respectively, and S denotes the number of line searches for TRACLS, or of steplength computations for Algorithm 1. For NTRA, we omit (S). Comparing Algorithm 1 with TRACLS, we find that there are quite a number of test problems for which Algorithm 1 outperforms TRACLS in terms of function evaluations, whereas Algorithm 1 and NTRA perform very similarly. Therefore, we may say that Algorithm 1 is an efficient method for unconstrained optimization.
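The safeguarded update and the stopping tests described above take only a few lines. The following sketch uses the standard BFGS formula for the Hessian approximation (the paper cites [12] for the precise update) together with the $s_k^T y_k \le 0$ skipping rule; it is an illustration, not the authors' Matlab code.

```python
import numpy as np

def bfgs_update(B, s, y):
    # Standard BFGS update of the Hessian approximation, with the safeguard
    # used in the experiments: skip the update whenever s^T y <= 0.
    sy = s @ y
    if sy <= 0:
        return B                        # keep B_{k+1} = B_k
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / sy

def stopped(g_k, f_k, f_star, n_feval, tol=1e-6, max_feval=300):
    # Stopping tests of Section 4: max{||g_k||, |f_k - f*|} <= 1e-6,
    # or more than 300 function evaluations.
    return (max(np.linalg.norm(g_k), abs(f_k - f_star)) <= tol
            or n_feval > max_feval)
```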

References

[1] Z.W. Chen, J.Y. Han, D.C. Xu, A nonmonotone trust region method for nonlinear programming with simple bound constraints, Appl. Math. Optim. 43 (2001) 63–85.
[2] A.R. Conn, N.I.M. Gould, P.L. Toint, Trust-Region Methods, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2000.
[3] N.Y. Deng, Y. Xiao, F.J. Zhou, Nonmonotone trust region algorithm, J. Optim. Theory Appl. 76 (1993) 259–285.


[4] D. Zhu, Nonmonotonic projected algorithm with both trust region and line search for constrained optimization, J. Comput. Appl. Math. 117 (2000) 35–60.
[5] S. Goldfeld, R. Quandt, H. Trotter, Maximization by quadratic hill climbing, Econometrica 34 (1966) 541–551.
[6] J. Sun, J. Zhang, Global convergence of conjugate gradient methods without line search, Ann. Oper. Res. 103 (2001) 161–173.
[7] K. Levenberg, A method for the solution of certain nonlinear problems in least squares, Quart. Appl. Math. 2 (1944) 164–168.
[8] D.W. Marquardt, An algorithm for least squares estimation with nonlinear parameters, SIAM J. Appl. Math. 11 (1963) 431–441.
[9] E.M. Gertz, Combination trust-region line search methods for unconstrained optimization, Ph.D. thesis, University of California, San Diego, 1999.
[10] E.M. Gertz, A quasi-Newton trust-region method, Math. Program., Ser. A 100 (2004) 447–470.
[11] J.J. Moré, B.S. Garbow, K.E. Hillstrom, Testing unconstrained optimization software, ACM Trans. Math. Software 7 (1981) 17–41.
[12] J. Nocedal, Y. Yuan, Combining trust region and line search techniques, in: Y. Yuan (Ed.), Advances in Nonlinear Programming, Kluwer Academic Publishers, Dordrecht, 1996, pp. 153–175.
[13] W. Sun, Nonmonotone trust region method for solving optimization problems, Appl. Math. Comput. 156 (1) (2004) 159–174.
[14] P.L. Toint, Non-monotone trust-region algorithms for nonlinear optimization subject to convex constraints, Math. Program. 77 (1) (1997) 69–94.
[15] X. Chen, J. Sun, Global convergence of a two-parameter family of conjugate gradient methods without line search, J. Comput. Appl. Math. 146 (2002) 37–45.
[16] Y. Yuan, W. Sun, Optimization Theory and Methods, Science Press, Beijing, 1997.
[17] J. Zhang, X. Zhang, A nonmonotone adaptive trust region method and its convergence, Comput. Math. Appl. 45 (10–11) (2003) 1469–1477.