Applied Mathematics and Computation 171 (2005) 371–384
www.elsevier.com/locate/amc

A nonmonotone trust region method for unconstrained optimization

Jiangtao Mo^{a,b,*}, Kecun Zhang^a, Zengxin Wei^b

^a College of Science, Xi'an Jiaotong University, 710049, China
^b College of Mathematics and Information Science, Guangxi University, 530004, China

Abstract

In this paper, we propose a nonmonotone trust region method for unconstrained optimization. Our method can be regarded as a combination of the nonmonotone technique, a fixed steplength, and the trust region method. When a trial step is not accepted, the method does not resolve the subproblem but generates a new iterate whose steplength is defined by a formula. We allow an increase in the function value only when trial steps are rejected in close succession. Under mild conditions, we prove that the algorithm is globally and superlinearly convergent. Preliminary numerical results are reported.

© 2005 Elsevier Inc. All rights reserved.

Keywords: Trust region method; Nonmonotone method; Fixed steplength; Unconstrained optimization

* Corresponding author. Present address: College of Mathematics and Information Science, Guangxi University, 530004, China.
E-mail addresses: [email protected] (J. Mo), [email protected] (Z. Wei).

0096-3003/$ - see front matter © 2005 Elsevier Inc. All rights reserved.
doi:10.1016/j.amc.2005.01.048
1. Introduction

In this paper, we consider the following unconstrained optimization problem:

$$\min_{x \in \mathbb{R}^n} f(x), \qquad (1.1)$$

where $f: \mathbb{R}^n \to \mathbb{R}$ is twice continuously differentiable. It is well known that the trust region method is an important and efficient class of methods for nonlinear optimization. Since it was proposed by Levenberg [7] and Marquardt [8] for nonlinear least-squares problems and by Goldfeld et al. [5] for unconstrained optimization, the trust region method has been studied by many researchers; see [1,2,4,10,14,15,17] and the papers cited therein. Trust region methods are iterative. At each iterate $x_k$, a trial step $d_k$ is generated by solving the subproblem

$$\min_d \ \phi_k(d) = g_k^T d + \frac{1}{2} d^T B_k d \qquad (1.2)$$
$$\text{s.t.} \ \|d\| \le \Delta_k, \qquad (1.3)$$

where $g_k = \nabla f(x_k)$, $B_k \in \mathbb{R}^{n \times n}$ is an approximate Hessian of $f$ at $x_k$, and $\Delta_k > 0$ is the trust region radius. Some criterion is used to decide whether the trial step $d_k$ is accepted. If it is not, the subproblem (1.2) and (1.3) must be resolved with a reduced trust region radius until an acceptable step is found. Hence, the subproblem may be solved several times in one iteration, and the total cost of an iteration can be expensive for large-scale problems.

In recent years, a variety of trust region methods have been proposed in the literature. For example, Nocedal and Yuan [12] and Gertz [9] presented methods that combine a line search with the trust region method. When the trial step is unsuccessful, their methods perform a line search to obtain the next iterate instead of resolving the subproblem; they therefore require less computation than classical trust region methods. Deng et al. [3], Zhang and Zhang [17] and Sun [13] proposed various nonmonotone trust region methods for unconstrained optimization. These papers indicate that nonmonotone algorithms are efficient, especially for ill-conditioned problems. On the other hand, Sun and Zhang [6] and Chen and Sun [15] proposed fixed-steplength methods for unconstrained optimization. In their approaches, the steplength is computed by a formula at each iteration, without a line search. Their methods can therefore be practical when the line search is expensive or difficult, and they allow a considerable saving in the number of function evaluations.

In this paper, we consider a method that combines the nonmonotone technique, a fixed steplength and the trust region method. Our aim is to improve the algorithm proposed in [12] and make it more effective in practical implementations. The main difference between the method in [12] and ours is that in the former a steplength is computed by a line search when the trial step is unsuccessful, whereas in our method the steplength is given by a formula; we use the formula suggested by Sun and Zhang [6] and Chen and Sun [15]. Moreover, most nonmonotone methods allow an increase in the function value at every iteration, but our method allows an increase only when trial steps are rejected in close succession.

The paper is organized as follows. In Section 2, we describe our algorithm, which combines the techniques of fixed steplength, nonmonotonicity and the trust region method. In Section 3, global convergence and superlinear convergence of the proposed algorithm are established under suitable conditions. Preliminary numerical results are presented in Section 4.
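The inexact subproblem solutions discussed above are classically obtained from the Cauchy point: the minimizer of the model $\phi_k$ along the steepest-descent direction inside the trust region. The following sketch (our own illustration in Python; the paper's experiments were coded in Matlab, and all names here are ours) shows the computation:

```python
import numpy as np

def cauchy_point(g, B, delta):
    """Minimizer of phi(d) = g^T d + 0.5 d^T B d along -g subject to ||d|| <= delta."""
    gnorm = np.linalg.norm(g)
    if gnorm == 0.0:
        return np.zeros_like(g)
    gBg = g @ B @ g
    if gBg <= 0.0:
        # Non-positive curvature along -g: the model decreases all the way,
        # so step to the trust region boundary.
        t = delta / gnorm
    else:
        # Unconstrained minimizer along -g, clipped to the boundary.
        t = min(gnorm**2 / gBg, delta / gnorm)
    return -t * g
```

This step always produces a model decrease when $g \ne 0$, which is the kind of sufficient-reduction guarantee the conditions (2.1) and (2.2) of Section 2 formalize.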
2. Algorithm

In this section, we describe a method that combines the nonmonotone technique, a fixed steplength and the trust region method. Throughout this paper, $\|\cdot\|$ denotes the Euclidean norm, and we write $f_k$ for $f(x_k)$, $g_k$ for $g(x_k)$, etc. Vectors are column vectors unless a transpose is used.

In each iteration, a trial step $d_k$ is generated by solving the trust region subproblem (1.2) and (1.3). As in [12], we solve (1.2) and (1.3) inexactly so that $\|d_k\| \le \Delta_k$,

$$\phi_k(0) - \phi_k(d_k) \ge \tau \|g_k\| \min\{\Delta_k, \|g_k\|/\|B_k\|\} \qquad (2.1)$$

and

$$d_k^T g_k \le -\tau \|g_k\| \min\{\Delta_k, \|g_k\|/\|B_k\|\}, \qquad (2.2)$$

where $\tau \in (0,1)$ is a constant. To determine whether the trial step will be accepted, we compute $\rho_k$, the ratio of the actual reduction $f_{l(k)} - f(x_k + d_k)$ to the predicted reduction $\phi_k(0) - \phi_k(d_k)$, i.e.,

$$\rho_k = \frac{f_{l(k)} - f(x_k + d_k)}{\phi_k(0) - \phi_k(d_k)}, \qquad (2.3)$$

where

$$f_{l(k)} = \max\{f_{k-j} : 0 \le j \le m(k)\} \qquad (2.4)$$

and $m(k)$ is an integer defined by

$$m(k) = \begin{cases} 0 & \text{if } \rho_{k-1} \ge c_2, \\ \min\{m(k-1)+1,\, M\} & \text{otherwise}, \end{cases} \qquad (2.5)$$

where $M \ge 1$ is an integer constant. If $\rho_k \ge c_2$, we accept $d_k$ and set $x_{k+1} = x_k + d_k$. Otherwise, we generate the new iterate $x_{k+1} = x_k + \alpha_k d_k$, where the steplength $\alpha_k$ is computed by

$$\alpha_k = \frac{-\delta\, g_k^T d_k}{d_k^T B_k d_k} \qquad (2.6)$$

and $\delta \in (0,1)$ is a constant. Note that if the last trial step was successful, the function value is reduced at the current iteration; but if the preceding trial steps were rejected in succession, our method may allow an increase in the function value at the current iteration. Hence, our method is a nonmonotone method. We now state the complete algorithm.

Algorithm 1 (Nonmonotone trust region method)

Step 1. Given $x_1 \in \mathbb{R}^n$, $\Delta_1 > 0$, constants $c_1, c_2, c_3, c_4$ with $0 < c_2 < 1$ and $0 < c_3 < c_4 < 1 < c_1$, an integer constant $M \ge 1$ and a symmetric positive definite matrix $B_1 \in \mathbb{R}^{n \times n}$, set $m(1) := 0$ and $k := 1$.

Step 2. Solve (1.2) and (1.3) inexactly so that $\|d_k\| \le \Delta_k$ and (2.1) and (2.2) are satisfied.

Step 3. Compute $\rho_k$ by (2.3). If $\rho_k \ge c_2$, set

$$x_{k+1} = x_k + d_k \qquad (2.7)$$

and go to Step 5.

Step 4. Compute $\alpha_k$ by (2.6) and set

$$x_{k+1} = x_k + \alpha_k d_k. \qquad (2.8)$$

Step 5. Compute $m(k+1)$ by (2.5) and choose $\Delta_{k+1}$ by

$$\Delta_{k+1} \begin{cases} \in [c_3\|d_k\|,\, c_4 \Delta_k] & \text{if } \rho_k < c_2, \\ = \Delta_k & \text{if } \rho_k \ge c_2 \text{ and } \|d_k\| < \Delta_k, \\ \in [\Delta_k,\, c_1 \Delta_k] & \text{if } \rho_k \ge c_2 \text{ and } \|d_k\| = \Delta_k. \end{cases} \qquad (2.9)$$

Generate $B_{k+1}$, set $k := k+1$ and go to Step 2.

Remark 1. Applying Algorithm 2.6 in [12], we can find an inexact solution of (1.2) and (1.3) that satisfies $\|d_k\| \le \Delta_k$, (2.1) and (2.2).
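As a minimal executable sketch of Algorithm 1 (ours, not the authors'; the subproblem is solved by a Cauchy step, $B_k$ is frozen at the identity rather than BFGS-updated as in Remark 2, and the concrete choices inside the radius intervals of (2.9) are our own), the main loop can be written as:

```python
import numpy as np

def cauchy_step(g, B, delta):
    """Inexact solution of (1.2)-(1.3): minimize the model along -g in the region."""
    gnorm = np.linalg.norm(g)
    gBg = g @ B @ g
    t = delta / gnorm if gBg <= 0 else min(gnorm**2 / gBg, delta / gnorm)
    return -t * g

def nonmonotone_tr(f, grad, x, delta=0.5, M=5, c1=2.0, c2=0.1, c3=0.25,
                   c4=0.75, dlt=0.2, tol=1e-6, max_iter=2000):
    """Sketch of Algorithm 1 with B_k fixed to the identity (hypothetical parameters)."""
    B = np.eye(x.size)
    hist = [f(x)]              # recent function values
    m = 0                      # nonmonotone memory m(k), eq. (2.5)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            break
        d = cauchy_step(g, B, delta)
        fl = max(hist[-(m + 1):])                    # f_{l(k)}, eq. (2.4)
        pred = -(g @ d) - 0.5 * (d @ B @ d)          # phi_k(0) - phi_k(d_k)
        rho = (fl - f(x + d)) / pred                 # eq. (2.3)
        if rho >= c2:
            x = x + d                                # accept the trial step, (2.7)
        else:
            alpha = -dlt * (g @ d) / (d @ B @ d)     # fixed steplength, eq. (2.6)
            x = x + alpha * d                        # eq. (2.8)
        # Radius update (2.9): one admissible choice in each case.
        if rho < c2:
            delta = c4 * delta
        elif np.linalg.norm(d) >= delta * (1 - 1e-12):
            delta = c1 * delta
        m = 0 if rho >= c2 else min(m + 1, M)        # eq. (2.5) for m(k+1)
        hist.append(f(x))
    return x
```

Note that when a trial step is rejected, no subproblem is re-solved: the fallback iterate costs only the steplength formula, which is the computational saving the paper emphasizes.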
Remark 2. The matrix $B_k$ can be updated by the BFGS formula [12].

3. Global convergence

We now analyze the behavior of Algorithm 1 when it is applied to problem (1.1). To this end, the following assumption is required.

Assumption 1

(1) The level set $L(x_1) = \{x \in \mathbb{R}^n : f(x) \le f(x_1)\}$ is bounded.

(2) The function $f$ is $LC^1$ in $\mathbb{R}^n$, i.e., there exists $\mu > 0$ such that

$$\|g(x) - g(y)\| \le \mu \|x - y\| \quad \forall x, y \in \mathbb{R}^n, \qquad (3.1)$$

where $g(x) = \nabla f(x)$.

(3) The matrices $\{B_k\}$ are positive definite and there exists $\omega > 0$ such that

$$d^T B_k d \ge \omega\, d^T d \quad \forall d \in \mathbb{R}^n \text{ and } k = 1, 2, \ldots \qquad (3.2)$$

For simplicity, we define two index sets:

$$I = \{k : \rho_k \ge c_2\} \quad \text{and} \quad J = \{k : \rho_k < c_2\}.$$

Lemma 1. Let $\{x_k\}$ be the sequence generated by Algorithm 1. If Assumption 1 holds and

$$\delta \in (0, \omega/\mu), \qquad (3.3)$$

then

$$f(x_{k+1}) - f_{l(k)} \le \{(1 - \mu\delta/\omega)\delta/2\}\, g_k^T d_k \quad \forall k \in J. \qquad (3.4)$$

Proof. The definition of $f_{l(k)}$ implies that $f_k \le f_{l(k)}$ for all $k$. By the mean-value theorem,

$$f_{k+1} - f_{l(k)} \le f(x_{k+1}) - f_k = g(\xi)^T (x_{k+1} - x_k), \qquad (3.5)$$

where $\xi \in [x_k, x_{k+1}]$. For $k \in J$, from (2.8), (3.1), (2.6) and (3.2), we obtain

$$\begin{aligned} g(\xi)^T (x_{k+1} - x_k) &= g_k^T (x_{k+1} - x_k) + (g(\xi) - g_k)^T (x_{k+1} - x_k) \\ &\le g_k^T (x_{k+1} - x_k) + \|g(\xi) - g_k\|\, \|x_{k+1} - x_k\| \\ &\le g_k^T (x_{k+1} - x_k) + \mu \|x_{k+1} - x_k\|^2 \\ &= \alpha_k g_k^T d_k + \mu \alpha_k^2 \|d_k\|^2 \\ &= \big(1 - \mu\delta \|d_k\|^2 / d_k^T B_k d_k\big)\, \alpha_k g_k^T d_k \\ &\le (1 - \mu\delta/\omega)\, \alpha_k g_k^T d_k. \end{aligned} \qquad (3.6)$$

Note that (2.6) and (2.1) imply $\alpha_k \ge \delta/2$ for all $k$ [indeed, $\phi_k(0) - \phi_k(d_k) \ge 0$ gives $-g_k^T d_k \ge d_k^T B_k d_k/2$], and (3.3) implies $1 - \mu\delta/\omega > 0$. Thus, from (3.5), (3.6) and (2.2), it follows that (3.4) holds for all $k \in J$. □

Lemma 2. Let $\{x_k\}$ be the sequence generated by Algorithm 1. If Assumption 1 and (3.3) hold, then $\{f_{l(k)}\}$ is monotonically nonincreasing. Furthermore, $x_k \in L(x_1)$ for all $k$.

Proof. We first show that $\{f_{l(k)}\}$ is nonincreasing, i.e.,

$$f_{l(k+1)} \le f_{l(k)} \qquad (3.7)$$

for all $k$. For $k \in I$, it follows from $\rho_k \ge c_2$, (2.1), (2.3) and (2.7) that $f(x_{k+1}) \le f_{l(k)}$. On the other hand, $\rho_k \ge c_2$ and Step 5 of Algorithm 1 give $m(k+1) = 0$. Hence $f_{l(k+1)} = f(x_{k+1})$, and (3.7) holds for $k \in I$. Suppose now $k \in J$. It follows from (3.4) and (2.2) that

$$f_{k+1} \le f_{l(k)}. \qquad (3.8)$$

If $m(k+1) = 0$, then $f_{k+1} = f_{l(k+1)}$ and (3.8) implies (3.7). If $m(k+1) > 0$, then $m(k+1) \le m(k) + 1$; by the definition of $f_{l(k)}$ and (3.8), (3.7) again holds. Hence (3.7) holds for all $k$.

We now prove the last conclusion. The definition of $f_{l(k)}$ and Step 1 of Algorithm 1 imply that $f_k \le f_{l(k)}$ and $f_{l(1)} = f_1$. By (3.7), $f_{l(k)} \le f_{l(1)}$. Hence $x_k \in L(x_1)$ for all $k$. □

Lemma 3. Suppose that Assumption 1 and (3.3) hold, and that there exists $\varepsilon > 0$ such that $\|g_k\| \ge \varepsilon$ for all $k$. Then there exists a constant $c > 0$ such that

$$f_{l(k)} - f_{k+1} \ge c\varepsilon \min\{\Delta_k, \varepsilon/\|B_k\|\} \qquad (3.9)$$

holds for all $k$.

Proof. If $k \in I$, or equivalently $\rho_k \ge c_2$, it follows from (2.1) and (2.3) that

$$f_{l(k)} - f_{k+1} \ge c_2 (\phi_k(0) - \phi_k(d_k)) \ge c_2 \tau \varepsilon \min\{\Delta_k, \varepsilon/\|B_k\|\}. \qquad (3.10)$$

If $k \in J$, from (3.4) and (2.2) it follows that

$$f_{l(k)} - f_{k+1} \ge \{(1 - \mu\delta/\omega)\delta/2\}(-g_k^T d_k) \ge \{(1 - \mu\delta/\omega)\delta/2\}\, \tau \varepsilon \min\{\Delta_k, \varepsilon/\|B_k\|\}. \qquad (3.11)$$

Thus (3.10) and (3.11) imply that (3.9) holds for all $k$ with $c = \min\{c_2\tau,\, (1 - \mu\delta/\omega)\delta\tau/2\}$. □
Lemma 4. Suppose that Assumption 1 and (3.3) hold, and that there exists $\varepsilon > 0$ such that $\|g_k\| \ge \varepsilon$ for all $k$. Then there exists $\nu \in (0,1)$ such that

$$\lim_{k\to\infty} \min\{\nu \Delta_k, \varepsilon/Z_k\} = 0, \qquad (3.12)$$

where $Z_k = 1 + \max_{1 \le i \le k} \|B_i\|$.

Proof. Define $S$ to be the set of integers $k$ with $m(k) = 0$. Let $\{i_j : j = 1, 2, \ldots\}$ be an infinite set of integers which contains $S$ and satisfies

$$1 \le i_{j+1} - i_j \le M + 1 \qquad (3.13)$$

for all $j$, and

$$i_{j+1} - i_j = M + 1 \qquad (3.14)$$

if $i_{j+1} \notin S$. Note that $i_1 = 1$ since $m(1) = 0$. We first show that

$$f_{l(i_j)} - f_{l(i_{j+1})} \ge c\varepsilon \min\{\Delta_{i_{j+1}}/c_1,\ \varepsilon/Z_{i_{j+1}}\} \qquad (3.15)$$

for all $j$. We consider two cases separately.

Case 1: $i_{j+1} - i_j = 1$. By (3.14) and $M \ge 1$, we have $i_{j+1} \in S$. This implies $m(i_{j+1}) = 0$, so from (2.4) we obtain $f_{l(i_{j+1})} = f_{i_{j+1}}$. On the other hand, $m(i_{j+1}) = 0$, (2.5) and (2.9) imply that

$$\Delta_{i_j} \ge \Delta_{i_{j+1}}/c_1. \qquad (3.16)$$

From (3.9), (3.16) and the monotonicity of $\{Z_k\}$, it follows that (3.15) holds.

Case 2: $i_{j+1} - i_j > 1$. For $s = 1, 2, \ldots, i_{j+1} - i_j - 1$, the definition of $\{i_j\}$ gives $m(i_j + s) > 0$, and hence, by (2.5), $\rho_{i_j+s-1} < c_2$. Then from (2.9) we have

$$\Delta_{i_j} \ge \Delta_{i_j+1} \ge \Delta_{i_j+2} \ge \cdots \ge \Delta_{i_{j+1}-1} \ge \Delta_{i_{j+1}} \qquad (3.17)$$

if $\rho_{i_j} < c_2$, or

$$c_1 \Delta_{i_j} \ge \Delta_{i_j+1} \ge \Delta_{i_j+2} \ge \cdots \ge \Delta_{i_{j+1}-1} \ge \Delta_{i_{j+1}} \qquad (3.18)$$

if $\rho_{i_j} \ge c_2$. If $i_{j+1} \in S$, then $m(i_{j+1}) = 0$ and $f_{l(i_{j+1})} = f_{i_{j+1}}$. From (3.9), we have

$$f_{l(i_{j+1}-1)} - f_{i_{j+1}} \ge c\varepsilon \min\{\Delta_{i_{j+1}-1},\ \varepsilon/Z_{i_{j+1}-1}\}. \qquad (3.19)$$

From Lemma 2, it follows that $f_{l(i_j)} \ge f_{l(i_{j+1}-1)}$. Thus, from (3.17)–(3.19) and the monotonicity of $\{Z_k\}$, we obtain

$$f_{l(i_j)} - f_{i_{j+1}} \ge c\varepsilon \min\{\Delta_{i_{j+1}},\ \varepsilon/Z_{i_{j+1}}\}. \qquad (3.20)$$

Now assume that $i_{j+1} \notin S$. For $s = 0, 1, \ldots, i_{j+1} - i_j - 1$, by $c_1 > 1$, (3.9), (3.17), (3.18) and the definition of $Z_k$, we have

$$f_{l(i_j+s)} \ge f_{i_j+s+1} + c\varepsilon \min\{\Delta_{i_j+s},\ \varepsilon/\|B_{i_j+s}\|\} \ge f_{i_j+s+1} + c\varepsilon \min\{\Delta_{i_{j+1}}/c_1,\ \varepsilon/Z_{i_{j+1}}\}.$$

By Lemma 2 and (3.14), it follows that

$$f_{l(i_j)} \ge \max\{f_{i_j+s+1} : 0 \le s \le M\} + c\varepsilon \min\{\Delta_{i_{j+1}}/c_1,\ \varepsilon/Z_{i_{j+1}}\} \ge f_{l(i_{j+1})} + c\varepsilon \min\{\Delta_{i_{j+1}}/c_1,\ \varepsilon/Z_{i_{j+1}}\}, \qquad (3.21)$$

where the last inequality follows from the definition of $f_{l(k)}$ and $m(i_{j+1}) \le M$. Since $c_1 > 1$, (3.20) and (3.21) show that (3.15) holds.

We now prove (3.12). By Lemma 2 and Assumption 1, the sequence $\{f_{l(i_j)}\}$ is convergent. Summing the inequalities (3.15), we obtain

$$\sum_{j=1}^{\infty} \min\{\Delta_{i_{j+1}}/c_1,\ \varepsilon/Z_{i_{j+1}}\} \le \frac{1}{c\varepsilon}\Big(f_{l(i_1)} - \lim_{j\to\infty} f_{l(i_{j+1})}\Big) < +\infty.$$

From (3.13), (3.17), (3.18) and the monotonicity of $\{Z_k\}$, we get

$$\sum_{k=1}^{\infty} \min\{\Delta_k/c_1^2,\ \varepsilon/Z_k\} = \sum_{j=1}^{\infty} \sum_{s=i_j}^{i_{j+1}-1} \min\{\Delta_s/c_1^2,\ \varepsilon/Z_s\} \le \sum_{j=1}^{\infty} (i_{j+1} - i_j) \min\{\Delta_{i_j}/c_1,\ \varepsilon/Z_{i_j}\} \le (M+1) \sum_{j=1}^{\infty} \min\{\Delta_{i_j}/c_1,\ \varepsilon/Z_{i_j}\} < +\infty.$$

This implies that (3.12) holds with $\nu = 1/c_1^2$. □
Lemma 5. Suppose that Assumption 1 and (3.3) hold, and that there exists $\varepsilon > 0$ such that $\|g_k\| \ge \varepsilon$ for all $k$. Then the inequality

$$\|d_k\| \ge \min\{1, \tau(1-c_2)\}\varepsilon/Z_k \qquad (3.22)$$

holds for all sufficiently large $k \in J$.

Proof. Recall that Algorithm 1 ensures $\|d_k\| \le \Delta_k$. If the inequality $\|d_k\| > \tau(1-c_2)\varepsilon/(2\mu)$ holds for sufficiently large $k \in J$, it follows from (3.12) that $\varepsilon/Z_k \le \tau(1-c_2)\varepsilon/(2\mu)$ for sufficiently large $k \in J$. Then

$$\|d_k\| > \varepsilon/Z_k \qquad (3.23)$$

holds for sufficiently large $k \in J$. Now assume that $\|d_k\| \le \tau(1-c_2)\varepsilon/(2\mu)$ for $k \in J$. Since $\rho_k < c_2$ for $k \in J$, by (2.3) we have

$$f_k - f(x_k + d_k) < c_2\big({-g_k^T d_k} - d_k^T B_k d_k/2\big). \qquad (3.24)$$

By the mean-value theorem and (3.1), we obtain

$$f_k - f(x_k + d_k) = -g(\eta)^T d_k = -g_k^T d_k + (g_k - g(\eta))^T d_k \ge -g_k^T d_k - \mu\|d_k\|^2 \ge -g_k^T d_k - \tau(1-c_2)\varepsilon\|d_k\|/2, \qquad (3.25)$$

where $\eta \in [x_k, x_k + d_k]$. It follows from (3.24) and (3.25) that

$$(1-c_2)\big(g_k^T d_k + \tau\varepsilon\|d_k\|/2\big) > c_2\, d_k^T B_k d_k/2. \qquad (3.26)$$

From $\|g_k\| \ge \varepsilon$ and (2.1), we have

$$-g_k^T d_k - d_k^T B_k d_k/2 \ge \tau\varepsilon \min\{\|d_k\|,\ \varepsilon/\|B_k\|\}. \qquad (3.27)$$

Then it follows from $d_k^T B_k d_k \le \|d_k\|^2 \|B_k\|$, (3.26) and (3.27) that

$$\|d_k\|^2 \|B_k\| \ge \tau(1-c_2)\varepsilon \min\{\|d_k\|,\ 2\varepsilon/\|B_k\| - \|d_k\|\}.$$

If $\|d_k\| > 2\varepsilon/\|B_k\| - \|d_k\|$, then

$$\|d_k\| > \varepsilon/\|B_k\|. \qquad (3.28)$$

Otherwise, we have

$$\|d_k\|\,\|B_k\| \ge \tau(1-c_2)\varepsilon. \qquad (3.29)$$

From the definition of $Z_k$, (3.23), (3.28) and (3.29), it follows that (3.22) holds for all sufficiently large $k \in J$. □

Lemma 6. Suppose that Assumption 1 and (3.3) hold, and that there exists $\varepsilon > 0$ such that $\|g_k\| \ge \varepsilon$ for all $k$. Then the inequality

$$\Delta_k \ge c_3 \min\{1, \tau(1-c_2)\}\varepsilon/Z_k \qquad (3.30)$$

holds for all sufficiently large $k$.

Proof. If $J$ is a finite set, there exists a positive constant $a$ such that $\Delta_k \ge a$ for all $k$; then (3.12) implies $\lim_{k\to\infty} 1/Z_k = 0$, and hence (3.30) holds for all large $k$. Now assume that $J$ is infinite. By Lemma 5, there exists $\bar{k} \in J$ such that (3.22) holds for all $k \in J$ with $k \ge \bar{k}$. For any $k \in I$ with $k \ge \bar{k}$, let $\hat{k} = \max\{j : j \in J \text{ and } j \le k\}$. The definition of $\hat{k}$ implies that

$$\|d_{\hat{k}}\| \ge \min\{1, \tau(1-c_2)\}\varepsilon/Z_{\hat{k}} \qquad (3.31)$$

and

$$\hat{k} + s \in I \quad \text{for all } s = 1, 2, \ldots, k - \hat{k}. \qquad (3.32)$$

Moreover, (3.32) implies $\rho_{\hat{k}+s} \ge c_2$ for all $s = 1, 2, \ldots, k - \hat{k}$. From this and (2.9), we have

$$\Delta_{\hat{k}+1} \le \Delta_{\hat{k}+2} \le \cdots \le \Delta_k. \qquad (3.33)$$

On the other hand, from (2.9) we have $\Delta_{\hat{k}} \le \Delta_{\hat{k}+1}$ if $\rho_{\hat{k}} \ge c_2$, or $c_3\|d_{\hat{k}}\| \le \Delta_{\hat{k}+1}$ if $\rho_{\hat{k}} < c_2$. Since $c_3 \in (0,1)$ and Algorithm 1 ensures $\|d_k\| \le \Delta_k$ for all $k$, in either case

$$\Delta_{\hat{k}+1} \ge c_3\|d_{\hat{k}}\|. \qquad (3.34)$$

By the monotonicity of $\{Z_k\}$, (3.31), (3.33) and (3.34), we see that (3.30) holds for all $k \in I$ with $k \ge \bar{k}$; for $k \in J$ with $k \ge \bar{k}$, (3.30) follows directly from (3.22), $\Delta_k \ge \|d_k\|$ and $c_3 < 1$. □

Now we prove the global convergence of Algorithm 1.

Theorem 7. Suppose that Assumption 1 and (3.3) hold, and that $\{B_k\}$ satisfies

$$\sum_{k=1}^{\infty} 1/Z_k = \infty, \qquad (3.35)$$

where $Z_k = 1 + \max_{1 \le i \le k} \|B_i\|$. Then the sequence $\{x_k\}$ generated by Algorithm 1 satisfies

$$\liminf_{k\to\infty} \|g_k\| = 0. \qquad (3.36)$$
Proof. If (3.36) is not true, there is a constant $\varepsilon > 0$ such that $\|g_k\| \ge \varepsilon$ for all $k$. From Lemma 6, there exists an integer $\bar{k}$ such that

$$\min\{\Delta_k,\ \varepsilon/Z_k\} \ge r\varepsilon/Z_k \qquad (3.37)$$

holds for all $k \ge \bar{k}$, where $r = c_3 \min\{1, \tau(1-c_2)\}$. Let $k$ be any integer with $k \ge \bar{k}$. From (3.37) and (3.9), we have that

$$f_{l(k+s)} \ge f_{k+s+1} + c\varepsilon \min\{\Delta_{k+s},\ \varepsilon/\|B_{k+s}\|\} \ge f_{k+s+1} + cr\varepsilon^2/Z_{k+s} \qquad (3.38)$$

holds for $s = 0, 1, \ldots, M$. From Lemma 2 and the monotonicity of $\{Z_k\}$, it follows that

$$f_{l(k)} \ge \max\{f_{k+s+1} : 0 \le s \le M\} + cr\varepsilon^2/Z_{k+M+1} \ge f_{l(k+M+1)} + cr\varepsilon^2/Z_{k+M+1}, \qquad (3.39)$$

where the last inequality follows from the definition of $f_{l(k)}$. By Assumption 1 and Lemma 2, $\{f_{l(k)}\}$ is nonincreasing and convergent. Combining this with (3.39), we obtain

$$\sum_{k \ge \bar{k}} 1/Z_{k+M+1} \le \frac{1}{cr\varepsilon^2} \sum_{k \ge \bar{k}} \big(f_{l(k)} - f_{l(k+M+1)}\big) = \frac{1}{cr\varepsilon^2} \sum_{k \ge \bar{k}} \sum_{s=0}^{M} \big(f_{l(k+s)} - f_{l(k+s+1)}\big) \le \frac{M+1}{cr\varepsilon^2} \sum_{k \ge 1} \big(f_{l(k)} - f_{l(k+1)}\big) < \infty.$$

This contradicts (3.35). □
Based on this theorem, we can derive the following superlinear convergence result for Algorithm 1.

Theorem 8. Suppose that $\|B_k^{-1} g_k\| \le \Delta_k$ for all $k$ and the subproblem (1.2) is solved exactly, i.e., $d_k = -B_k^{-1} g_k$. Suppose also that Assumption 1 holds and that

$$\lim_{k\to\infty} \frac{\|(B_k - \nabla^2 f(x^*)) d_k\|}{\|d_k\|} = 0. \qquad (3.40)$$

Then if $\{x_k\}$ converges to a point $x^*$, the rate of convergence is superlinear, i.e.,

$$\|x_{k+1} - x^*\| = o(\|x_k - x^*\|). \qquad (3.41)$$

Proof. Write $x_{k+1} = x_k + \lambda_k d_k$, where

$$\lambda_k = \begin{cases} 1 & \text{if } \rho_k \ge c_2, \\ \alpha_k & \text{if } \rho_k < c_2 \end{cases} \qquad (3.42)$$

and $\alpha_k$ is defined by (2.6). From (2.6) and (2.1), it follows that $\alpha_k \ge \delta/2$ for all $k$. Then $\lambda_k \ge \delta/2$ and $\|x_{k+1} - x_k\| \ge (\delta/2)\|d_k\|$ for all $k$. Since $\{x_k\}$ is convergent, $d_k \to 0$ as $k \to \infty$. By Taylor expansion,

$$f_k - f(x_k + d_k) = -g_k^T d_k - \frac{1}{2} d_k^T \nabla^2 f(x_k) d_k + o(\|d_k\|^2) = -g_k^T d_k - \frac{1}{2} d_k^T \nabla^2 f(x^*) d_k + o(\|d_k\|^2).$$

Since $d_k = -B_k^{-1} g_k$, we obtain

$$\phi_k(0) - \phi_k(d_k) = d_k^T B_k d_k/2.$$

Thus,

$$\rho_k - 1 \ge \frac{f_k - f(x_k + d_k)}{\phi_k(0) - \phi_k(d_k)} - 1 = \frac{d_k^T B_k d_k - d_k^T \nabla^2 f(x^*) d_k}{d_k^T B_k d_k} + \frac{o(\|d_k\|^2)}{d_k^T B_k d_k}. \qquad (3.43)$$

From (3.2), i.e., $d^T B_k d \ge \omega\|d\|^2$ for all $k$, it follows that

$$\frac{|d_k^T B_k d_k - d_k^T \nabla^2 f(x^*) d_k|}{d_k^T B_k d_k} = \frac{\|d_k\|^2}{d_k^T B_k d_k} \cdot \frac{|d_k^T B_k d_k - d_k^T \nabla^2 f(x^*) d_k|}{\|d_k\|^2} \le \frac{1}{\omega} \cdot \frac{\|B_k d_k - \nabla^2 f(x^*) d_k\|}{\|d_k\|}. \qquad (3.44)$$

From (3.2), (3.43), (3.44) and (3.40), we see that $\rho_k \ge c_2$ for all sufficiently large $k$. Therefore, Algorithm 1 eventually reduces to the Newton or quasi-Newton method, and the superlinear convergence result can be established as in Theorem 5.5.2 of [16]. □
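The mechanism behind Theorem 8 is that the method eventually takes pure (quasi-)Newton steps, whose errors contract superlinearly. This behavior is easy to observe numerically; the toy example below (our own one-dimensional illustration, not the algorithm itself) tracks successive error ratios of the Newton iteration:

```python
def newton_errors(x0, iters=5):
    """Newton's method for minimizing f(x) = x**2/2 + x**4/4 (minimizer x* = 0).

    The Newton step x - f'(x)/f''(x) = x - (x + x**3)/(1 + 3*x**2)
    simplifies to 2*x**3/(1 + 3*x**2).  Returns the successive error
    ratios |x_{k+1}|/|x_k|; superlinear convergence means they shrink to 0.
    """
    xs = [x0]
    for _ in range(iters):
        x = xs[-1]
        xs.append(2 * x**3 / (1 + 3 * x**2))
    return [abs(b) / abs(a) for a, b in zip(xs, xs[1:])]
```

Starting from $x_0 = 1$, the ratios decrease rapidly toward zero, i.e., $|x_{k+1} - x^*| = o(|x_k - x^*|)$ as in (3.41).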
4. Numerical results

We have implemented the new algorithm and compared it with both the trust region algorithm combined with line search (TRACLS) of Nocedal and Yuan [12] and the nonmonotone trust region algorithm (NTRA) of Sun [13]. We have tested the algorithms on the following problems, with values of $n$ ranging from $n = 32$ to $n = 512$.

Problem 1. Extended Powell singular function [11].

$$f(x) = \sum_{i=1}^{n/4} \big\{(x_{4i-3} + 10x_{4i-2})^2 + 5(x_{4i-1} - x_{4i})^2 + (x_{4i-2} - 2x_{4i-1})^4 + 10(x_{4i-3} - x_{4i})^4\big\}, \quad x \in \mathbb{R}^n.$$

The function has the minimizer $x^* = (0, 0, \ldots, 0)^T$ with $f(x^*) = 0$. The initial point is $x_1 = (3, -1, 0, 1, \ldots, 3, -1, 0, 1)^T$.

Problem 2. Extended Rosenbrock function [11].

$$f(x) = \sum_{i=1}^{n/2} \big\{100(x_{2i} - x_{2i-1}^2)^2 + (1 - x_{2i-1})^2\big\}, \quad x \in \mathbb{R}^n.$$

Here $x^* = (1, 1, \ldots, 1)^T$ with $f(x^*) = 0$, and $x_1 = (-1.2, 1, -1.2, 1, \ldots, -1.2, 1)^T$.

Problem 3. Extended Beale function [11].

$$f(x) = \sum_{i=1}^{n/2} \big\{[1.5 - x_{2i-1}(1 - x_{2i})]^2 + [2.25 - x_{2i-1}(1 - x_{2i}^2)]^2 + [2.625 - x_{2i-1}(1 - x_{2i}^3)]^2\big\}, \quad x \in \mathbb{R}^n.$$

Here $x^* = (3, 0.5, 3, 0.5, \ldots, 3, 0.5)^T$ with $f(x^*) = 0$, and $x_1 = (1, 1, \ldots, 1)^T$.

The algorithms are coded in Matlab 6.1. The initial trust region radius is chosen as $\Delta_1 = 0.5$. The trial step is computed by Algorithm 2.6 of [12] with $c = 1 + 10^{-6}$, and $B_k$ is updated by the BFGS formula. However, we do not update $B_k$
Table 1
Numerical comparisons

Problem              n     TRACLS         NTRA       Algorithm 1
Extended Rosenbrock  32    37/46/38(3)    47/48/48   38/43/43(4)
                     64    38/52/39(6)    48/49/49   278/305/305(26)
                     128   42/43/43(0)    54/55/55   40/43/43(2)
                     256   41/54/42(5)    52/53/53   42/46/46(3)
                     512   47/55/48(3)    55/56/56   45/49/49(3)
Extended Powell      32    67/70/68(1)    59/60/60   66/69/69(2)
                     64    69/76/70(3)    66/67/67   69/72/72(2)
                     128   71/74/72(1)    74/75/75   65/68/68(2)
                     256   69/75/70(2)    89/90/90   70/73/73(2)
                     512   73/77/74(1)    80/81/81   83/87/87(3)
Extended Beale       32    Failed         21/22/22   23/27/27(3)
                     64    Failed         22/23/23   24/29/29(4)
                     128   Failed         27/28/28   54/62/62(7)
                     256   25/26/26(0)    25/26/26   25/26/26(0)
                     512   27/30/28(1)    31/32/32   27/29/29(1)
if $s_k^T y_k \le 0$, where $s_k = x_{k+1} - x_k$ and $y_k = g_{k+1} - g_k$. In all tests, the initial matrix $B_1$ was chosen as $|f_1| I$, where $I$ is the identity matrix. The stopping condition is

$$\max\{\|g(x_k)\|,\ |f(x_k) - f(x^*)|\} \le 10^{-6}.$$

The iteration is also terminated if the number of function evaluations exceeds 300. The results of our tests are reported in Table 1, in the form I/F/G(S), where I, F and G denote the numbers of iterations, function evaluations and gradient evaluations, respectively, and S denotes the number of line searches for TRACLS or of steplength computations for Algorithm 1; for NTRA, (S) is omitted. Comparing Algorithm 1 with TRACLS, we find quite a number of test problems on which Algorithm 1 outperforms TRACLS in terms of function evaluations, whereas Algorithm 1 and NTRA perform very similarly. We may therefore say that Algorithm 1 is an efficient method for unconstrained optimization.
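The three test functions and the safeguarded BFGS update described above can be sketched as follows (vectorized Python versions; the original experiments were coded in Matlab 6.1, so these are our own illustrative implementations):

```python
import numpy as np

def ext_rosenbrock(x):
    """Extended Rosenbrock function (Problem 2); n must be even."""
    u, v = x[0::2], x[1::2]            # u = x_{2i-1}, v = x_{2i}
    return float(np.sum(100.0 * (v - u**2)**2 + (1.0 - u)**2))

def ext_powell(x):
    """Extended Powell singular function (Problem 1); n must be a multiple of 4."""
    a, b, c, d = x[0::4], x[1::4], x[2::4], x[3::4]
    return float(np.sum((a + 10 * b)**2 + 5 * (c - d)**2
                        + (b - 2 * c)**4 + 10 * (a - d)**4))

def ext_beale(x):
    """Extended Beale function (Problem 3); n must be even."""
    u, v = x[0::2], x[1::2]
    return float(np.sum((1.5 - u * (1 - v))**2 + (2.25 - u * (1 - v**2))**2
                        + (2.625 - u * (1 - v**3))**2))

def bfgs_update(B, s, y):
    """BFGS update of B, skipped when s^T y <= 0 as in the experiments above."""
    sy = s @ y
    if sy <= 0:
        return B                        # keep B symmetric positive definite
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / sy
```

Each function vanishes at the stated minimizer, which gives a quick sanity check of an implementation before running the full comparison.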
References

[1] Z.W. Chen, J.Y. Han, D.C. Xu, A nonmonotone trust region method for nonlinear programming with simple bound constraints, Appl. Math. Opt. 43 (2001) 63–85.
[2] A.R. Conn, N.I.M. Gould, P.L. Toint, Trust-Region Methods, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2000.
[3] N.Y. Deng, Y. Xiao, F.J. Zhou, Nonmonotonic trust region algorithm, J. Optim. Theory Appl. 76 (1993) 259–285.
[4] D. Zhu, Nonmonotonic projected algorithm with both trust region and line search for constrained optimization, J. Comput. Appl. Math. 117 (2000) 35–60.
[5] S. Goldfeld, R. Quandt, H. Trotter, Maximization by quadratic hill climbing, Econometrica 34 (1966) 541–551.
[6] J. Sun, J. Zhang, Global convergence of conjugate gradient methods without line search, Ann. Oper. Res. 103 (2001) 161–173.
[7] K. Levenberg, A method for the solution of certain nonlinear problems in least squares, Quart. Appl. Math. 2 (1944) 164–168.
[8] D.W. Marquardt, An algorithm for least squares estimation with nonlinear parameters, SIAM J. Appl. Math. 11 (1963) 431–441.
[9] E.M. Gertz, Combination Trust-Region Line Search Methods for Unconstrained Optimization, University of California San Diego, 1999.
[10] E.M. Gertz, A quasi-Newton trust-region method, Math. Program. Ser. A 100 (2004) 447–470.
[11] J.J. Moré, B.S. Garbow, K.E. Hillstrom, Testing unconstrained optimization software, ACM Trans. Math. Software 7 (1981) 17–41.
[12] J. Nocedal, Y. Yuan, Combining trust region and line search techniques, in: Y. Yuan (Ed.), Advances in Nonlinear Programming, Kluwer Academic Publishers, Dordrecht, 1996, pp. 153–175.
[13] W. Sun, Nonmonotone trust region method for solving optimization problems, Appl. Math. Comput. 156 (1) (2004) 159–174.
[14] P.L. Toint, Non-monotone trust-region algorithms for nonlinear optimization subject to convex constraints, Math. Program. 77 (1) (1997) 69–94.
[15] X. Chen, J. Sun, Global convergence of a two-parameter family of conjugate gradient methods without line search, J. Comput. Appl. Math. 146 (2002) 37–45.
[16] Y. Yuan, W. Sun, Optimization Theory and Methods, Science Press, Beijing, 1997.
[17] J. Zhang, X. Zhang, A nonmonotone adaptive trust region method and its convergence, Comput. Math. Appl. 45 (10–11) (2003) 1469–1477.