Applied Mathematics and Computation 202 (2008) 766–770
Global convergence of a modified spectral FR conjugate gradient method

Shou-qiang Du a,b,*, Yuan-yuan Chen b

a School of Management, University of Shanghai for Science and Technology, Shanghai 200093, China
b College of Mathematics, Qingdao University, Qingdao 266071, China
Keywords: Unconstrained optimization; Conjugate gradient method; Line search; Global convergence

Abstract. This paper deals with a new nonlinear modified spectral FR conjugate gradient method for solving large scale unconstrained optimization problems. The direction generated by the method is a descent direction for the objective function. Under mild conditions, we prove that the modified spectral FR conjugate gradient method with a Wolfe type line search is globally convergent. Preliminary numerical results show that the proposed method is promising.
1. Introduction

Consider the following unconstrained optimization problem:

$$\min_{x \in \mathbb{R}^n} f(x), \qquad (1.1)$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable and its gradient $g(x) \equiv \nabla f(x)$ is available. Iterative methods are widely used for solving (1.1); the iterative formula is given by

$$x_{k+1} = x_k + \alpha_k d_k, \qquad (1.2)$$

where $x_k \in \mathbb{R}^n$ is the $k$th approximation to the solution, $\alpha_k$ is a steplength obtained by carrying out a line search, and $d_k$ is a search direction. There are many kinds of iterative methods, including Newton's method, the steepest descent method and nonlinear conjugate gradient methods. Conjugate gradient methods are among the best known methods for solving the unconstrained optimization problem (1.1), especially for large scale problems in scientific and engineering computation, owing to the simplicity of their iteration and their low memory requirements. The search direction $d_k$ is defined by

$$d_k = \begin{cases} -g_k, & k = 1, \\ -g_k + \beta_k d_{k-1}, & k \ge 2, \end{cases}$$

where $\beta_k$ is a scalar and $g_k \equiv g(x_k)$. The original nonlinear conjugate gradient method was proposed by Fletcher and Reeves (the FR conjugate gradient method) [1], in which $\beta_k$ is defined by
This work was supported by the National Science Foundation of China (Grants 10671126 and 40771095), the Shanghai Leading Academic Discipline Project (Grant T0502) and the Key Project for Fundamental Research of STCSM (Project 06JC14057).
* Corresponding author. Address: School of Management, University of Shanghai for Science and Technology, Shanghai 200093, China. E-mail address: [email protected] (S.-q. Du).
$$\beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}. \qquad (1.3)$$
When applied to a strictly convex quadratic objective function, this method reduces to the linear conjugate gradient method provided that $\alpha_k$ is the exact minimizer. Zoutendijk [2] proved that the FR method with exact line search is globally convergent, and Al-Baali [3] proved the global convergence of the FR method under the strong Wolfe line search

$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k,$$

$$|d_k^T g(x_k + \alpha_k d_k)| \le \sigma |g_k^T d_k|,$$

where $0 < \delta < \sigma < \frac{1}{2}$. Further good results on this method have been reported in recent years [4,5]. For other nonlinear conjugate gradient methods and global convergence results on the Polak–Ribière–Polyak (PRP), Hestenes–Stiefel (HS) and Dai–Yuan (DY) methods, see [6–10]. Quite recently, Birgin and Martinez [11] proposed a spectral conjugate gradient method by combining the conjugate gradient method with the spectral gradient method [10]. The direction $d_k$ is given by

$$d_k = -\theta_k g_k + \beta_k d_{k-1},$$

where $\theta_k$ is a parameter and

$$\beta_k = \frac{(\theta_k y_{k-1} - s_{k-1})^T g_k}{d_{k-1}^T y_{k-1}},$$

where $y_{k-1} = g_k - g_{k-1}$, $s_{k-1} = x_k - x_{k-1}$, and $\theta_k$ is taken to be the spectral gradient

$$\theta_k = \frac{s_{k-1}^T s_{k-1}}{s_{k-1}^T y_{k-1}}.$$
Unfortunately, the spectral conjugate gradient method [11] is not guaranteed to generate descent directions. For this reason, the authors of [12] modified the FR formula so that the generated direction is always a descent direction. The direction $d_k$ is defined by

$$d_k = \begin{cases} -g_k, & k = 0, \\ -\theta_k g_k + \beta_k^{FR} d_{k-1}, & k > 0, \end{cases} \qquad (1.4)$$

where $\beta_k^{FR}$ is specified by (1.3) and

$$\theta_k = \frac{d_{k-1}^T y_{k-1}}{\|g_{k-1}\|^2}, \qquad (1.5)$$

where $y_k = g_{k+1} - g_k$ (a short code transcription of this update is given at the end of this section). In this paper, based on the modified FR conjugate gradient method of [12], we prove, under some mild conditions, the global convergence of the modified spectral FR method with a Wolfe type line search. The rest of the paper is organized as follows. In Section 2, we present the Wolfe type line search and the modified spectral FR conjugate gradient method. The global convergence of the method is established in Section 3. Finally, numerical experiments in Section 4 indicate that the method can efficiently solve unconstrained optimization problems.
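To make the update (1.3)-(1.5) concrete, the following Python fragment transcribes it directly. This is our illustrative sketch, not code from the paper; the function and argument names are ours.

```python
import numpy as np

def spectral_fr_direction(g_new, g_old, d_old):
    """Modified spectral FR direction (1.4) for k > 0.

    g_new, g_old, d_old stand for g_k, g_{k-1}, d_{k-1}; returns d_k.
    """
    gg_old = g_old @ g_old                        # ||g_{k-1}||^2
    beta = (g_new @ g_new) / gg_old               # beta_k^{FR}, (1.3)
    theta = (d_old @ (g_new - g_old)) / gg_old    # theta_k, (1.5)
    return -theta * g_new + beta * d_old          # (1.4)
```

For $k = 0$ the direction is simply $-g_0$, as in (1.4).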
2. Algorithm and lemmas

In this section, we give the following assumptions on the objective function $f(x)$, which have often been used in the literature to analyze the global convergence of nonlinear conjugate gradient methods with inexact line searches.

Assumption 2.1

(i) The level set $L_0 = \{x \in \mathbb{R}^n \mid f(x) \le f(x_0)\}$ is bounded.
(ii) In some neighborhood $U$ of $L_0$, $f(x)$ is continuously differentiable and its gradient is Lipschitz continuous; namely, there exists a constant $L > 0$ such that

$$\|g(x) - g(y)\| \le L\|x - y\| \quad \forall x, y \in U.$$
Wolfe type line search (proposed in our earlier work [13]). The line search chooses $\alpha_k > 0$ such that

$$f(x_k) - f(x_k + \alpha_k d_k) \ge \rho \alpha_k^2 \|d_k\|^2, \qquad (2.1)$$

$$g(x_k + \alpha_k d_k)^T d_k \ge -2\sigma \alpha_k \|d_k\|^2, \qquad (2.2)$$

where $0 < \rho < \sigma < 1$.
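Conditions (2.1) and (2.2) bracket an interval of acceptable step sizes: (2.1) excludes overly long steps, while (2.2) excludes overly short ones. The bisection-style sketch below illustrates one way such a step could be located; it is an assumption-laden illustration of ours (reusing the numpy import from the sketch in Section 1), not the subroutine of [13] or the one used in Section 4.

```python
def wolfe_type_search(f, grad, x, d, rho=0.001, sigma=0.01,
                      alpha0=1.0, max_iter=60):
    """Find alpha > 0 satisfying (2.1) and (2.2) by bisection (sketch)."""
    fx = f(x)
    dd = d @ d                           # ||d_k||^2
    lo, hi, alpha = 0.0, np.inf, alpha0
    for _ in range(max_iter):
        if fx - f(x + alpha * d) < rho * alpha**2 * dd:
            hi = alpha                   # (2.1) fails: step too long
        elif grad(x + alpha * d) @ d < -2.0 * sigma * alpha * dd:
            lo = alpha                   # (2.2) fails: step too short
        else:
            return alpha                 # both conditions hold
        alpha = 0.5 * (lo + hi) if np.isfinite(hi) else 2.0 * alpha
    return alpha                         # best effort after max_iter
```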
Now we present the new modified spectral FR conjugate gradient method as follows.
Algorithm 2.1

Step 0. Given $x_0 \in \mathbb{R}^n$, set $d_0 = -g_0$, $k := 0$; if $g_0 = 0$, then stop.
Step 1. Find $\alpha_k > 0$ satisfying the Wolfe type line search (2.1) and (2.2); $x_{k+1}$ is then given by (1.2).
Step 2. If $g_{k+1} = 0$, then stop.
Step 3. Compute $d_{k+1}$ by (1.4), set $k := k + 1$, and go to Step 1.
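Putting the two sketches together, Algorithm 2.1 can be transcribed as follows. The gradient tolerance replacing the exact tests $g_k = 0$ of Steps 0 and 2, and the iteration cap, are our practical assumptions, not part of the algorithm as stated.

```python
def modified_spectral_fr(f, grad, x0, tol=1e-6, max_iter=10_000):
    """Sketch of Algorithm 2.1 built on the two fragments above."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                        # Step 0: d_0 = -g_0
    k = 0
    while np.linalg.norm(g) > tol and k < max_iter:
        alpha = wolfe_type_search(f, grad, x, d)  # Step 1: (2.1)-(2.2)
        x = x + alpha * d                         # x_{k+1} by (1.2)
        g_new = grad(x)                           # Step 2 checks g_{k+1}
        d = spectral_fr_direction(g_new, g, d)    # Step 3: (1.4)
        g = g_new
        k += 1
    return x, g, k
```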
In order to establish the global convergence of Algorithm 2.1, we give the following results.

Lemma 2.1. Suppose that Assumption 2.1 holds. Then the Wolfe type line search (2.1) and (2.2) is feasible.

The proof is essentially the same as that of Lemma 1 in [13], hence we omit it.

Lemma 2.2. Suppose that Assumption 2.1 holds. Then there is a constant $c_1 > 0$ such that

$$\|g(x)\| \le c_1 \quad \forall x \in U.$$

This follows immediately from Assumption 2.1.

Lemma 2.3. Suppose the direction $d_k$ is given by (1.4). Then

$$d_k^T g_k = -\|g_k\|^2 \qquad (2.3)$$

holds for any $k \ge 0$.

Proof. We show by induction that (2.3) holds for all $k$. For $k = 0$, (2.3) is clearly true since $d_0 = -g_0$. Now assume that, for some $k \ge 1$,
$$d_{k-1}^T g_{k-1} = -\|g_{k-1}\|^2.$$

Then, by (1.3)-(1.5), we have

$$g_k^T d_k = -\theta_k \|g_k\|^2 + \frac{\|g_k\|^2}{\|g_{k-1}\|^2}\, d_{k-1}^T g_k = -\frac{d_{k-1}^T (g_k - g_{k-1})}{\|g_{k-1}\|^2}\,\|g_k\|^2 + \frac{\|g_k\|^2}{\|g_{k-1}\|^2}\, d_{k-1}^T g_k$$

$$= \frac{\|g_k\|^2}{\|g_{k-1}\|^2}\left(-d_{k-1}^T g_k + d_{k-1}^T g_{k-1} + d_{k-1}^T g_k\right) = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}\, d_{k-1}^T g_{k-1} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}\left(-\|g_{k-1}\|^2\right) = -\|g_k\|^2.$$

Therefore, (2.3) is still true with $k-1$ replaced by $k$, and the proof is completed by induction. □
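Since the identity (2.3) is purely algebraic, it is easy to check numerically. The fragment below (our illustration, on a random strongly convex quadratic) asserts it along a few iterations of the sketches above.

```python
# Check d_k^T g_k = -||g_k||^2 on a random strongly convex quadratic.
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5.0 * np.eye(5)            # symmetric positive definite
b = rng.standard_normal(5)
f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b

x = rng.standard_normal(5)
g = grad(x)
d = -g                                    # d_0 = -g_0
for k in range(5):
    assert abs(d @ g + g @ g) <= 1e-9 * (1.0 + g @ g)   # identity (2.3)
    x = x + wolfe_type_search(f, grad, x, d) * d
    g_new = grad(x)
    d = spectral_fr_direction(g_new, g, d)
    g = g_new
```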
Remark 2.1. From Lemma 2.3 we know that $d_k^T g_k < 0$ holds for all $k \ge 0$ (whenever $g_k \ne 0$); that is, $d_k$ is always a descent direction.

Lemma 2.4. Suppose that Assumption 2.1 holds. Then

$$\sum_{k \ge 0} \frac{\|g_k\|^4}{\|d_k\|^2} < +\infty. \qquad (2.4)$$
Proof. From the line search conditions (2.1), (2.2) and Assumption 2.1, we obtain

$$(2\sigma + L)\alpha_k \|d_k\|^2 \ge -g_k^T d_k.$$

Indeed, (2.2) gives $g(x_k + \alpha_k d_k)^T d_k \ge -2\sigma\alpha_k\|d_k\|^2$, while the Lipschitz condition gives $(g(x_k + \alpha_k d_k) - g_k)^T d_k \le L\alpha_k\|d_k\|^2$; combining the two yields the bound. Then we have

$$\alpha_k \|d_k\| \ge \frac{1}{2\sigma + L}\cdot\frac{-g_k^T d_k}{\|d_k\|}.$$

Squaring both sides of the above formula, we have

$$\alpha_k^2 \|d_k\|^2 \ge \left(\frac{1}{2\sigma + L}\right)^2 \frac{(g_k^T d_k)^2}{\|d_k\|^2}.$$

By (2.1), we know

$$\sum_{k=1}^{\infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2} \le (2\sigma + L)^2 \sum_{k=1}^{\infty} \alpha_k^2 \|d_k\|^2 \le \frac{(2\sigma + L)^2}{\rho} \sum_{k=1}^{\infty} \{f(x_k) - f(x_{k+1})\} < +\infty.$$

Then, by Lemma 2.3, (2.4) holds. This completes the proof. □
3. Convergence analysis for Algorithm 2.1

From the above lemmas and Assumption 2.1, we now give the global convergence theorem for the modified spectral FR conjugate gradient method. The general idea of the proof is to assume the contrary and then derive contradictory relations.

Theorem 3.1. Consider the modified spectral FR conjugate gradient method of Algorithm 2.1 and suppose that Assumption 2.1 holds. Then

$$\liminf_{k \to \infty} \|g_k\| = 0.$$

Proof. Suppose by contradiction that there exists a positive constant $\varepsilon > 0$ such that

$$\|g_k\| \ge \varepsilon \qquad (3.1)$$

holds for all $k \ge 0$. From (1.4), we have

$$\|d_k\|^2 = (\beta_k^{FR})^2 \|d_{k-1}\|^2 - 2\theta_k g_k^T d_k - \theta_k^2 \|g_k\|^2. \qquad (3.2)$$

Dividing both sides of (3.2) by $(g_k^T d_k)^2 = \|g_k\|^4$ and using (1.3), (2.3) and (3.1), we have

$$\frac{\|d_k\|^2}{\|g_k\|^4} = \frac{\|d_{k-1}\|^2}{\|g_{k-1}\|^4} + \frac{2\theta_k}{\|g_k\|^2} - \frac{\theta_k^2}{\|g_k\|^2} = \frac{\|d_{k-1}\|^2}{\|g_{k-1}\|^4} - \frac{(\theta_k - 1)^2}{\|g_k\|^2} + \frac{1}{\|g_k\|^2} \le \frac{\|d_{k-1}\|^2}{\|g_{k-1}\|^4} + \frac{1}{\|g_k\|^2}.$$

Applying this bound recursively, and noting $\|d_0\|^2/\|g_0\|^4 = 1/\|g_0\|^2$, we obtain

$$\frac{\|d_k\|^2}{\|g_k\|^4} \le \sum_{i=0}^{k} \frac{1}{\|g_i\|^2} \le \frac{k+1}{\varepsilon^2}.$$

So we obtain

$$\sum_{k \ge 0} \frac{\|g_k\|^4}{\|d_k\|^2} \ge \varepsilon^2 \sum_{k \ge 0} \frac{1}{k+1} = +\infty,$$

which contradicts (2.4). Therefore, we have

$$\liminf_{k \to \infty} \|g_k\| = 0. \qquad \square$$
Remark 3.1. We have proved the global convergence of a modified spectral FR conjugate gradient method. It is also clear that if the exact line search is used, this method reduces to the standard FR conjugate gradient method, since exact line search gives $d_{k-1}^T g_k = 0$ and hence, by (1.5) and (2.3), $\theta_k = -d_{k-1}^T g_{k-1}/\|g_{k-1}\|^2 = 1$.
4. Numerical experiments

In this section, in order to investigate the efficiency of our method, we performed some numerical experiments. The problems we tested are from [14]. Our line search subroutine computes $\alpha_k$ such that the Wolfe type line search conditions (2.1) and (2.2) hold with $\rho = 0.001$ and $\sigma = 0.01$. We use the condition $\|g_{k+1}\| \le 10^{-6}$ as the stopping criterion, and MATLAB 7.0 to test the chosen problems. The numerical results are listed in Table 1, where the items in each column have the following meanings: Name: the name of the test problem; Dim: the dimension of the problem; NI: the number of iterations; NF: the number of function evaluations; NG: the number of gradient evaluations.
Table 1
Test results for Algorithm 2.1

Name      Dim    NI    NF    NG
ROSE        2   107  1243   237
GULF        3     2    52     3
ROSEX       8   141  1433   252
VARDIM      2     4    59     8
LIN      1000     1     3     3
TRIG       50   337  1325   460
TRIG      100   369  1773   543
LIN1        2     1     3     3
LIN1       10     1     3     3
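As an illustration of how the sketches above could reproduce a run in the spirit of Table 1, the fragment below applies them to the ROSE (Rosenbrock) problem of [14], using the standard starting point $(-1.2, 1)$ and the stopping tolerance of this section. It is our Python sketch, not the authors' MATLAB code, so the iteration counts need not match the table.

```python
def rose(x):
    return 100.0 * (x[1] - x[0]**2)**2 + (1.0 - x[0])**2

def rose_grad(x):
    return np.array([
        -400.0 * x[0] * (x[1] - x[0]**2) - 2.0 * (1.0 - x[0]),
         200.0 * (x[1] - x[0]**2),
    ])

x_end, g_end, ni = modified_spectral_fr(rose, rose_grad,
                                        x0=[-1.2, 1.0], tol=1e-6)
print(ni, np.linalg.norm(g_end), x_end)   # iterations, ||g||, minimizer
```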
Acknowledgement

The authors wish to thank the editor and the anonymous referee for their extensive and most helpful comments on this paper.

References

[1] R. Fletcher, C. Reeves, Function minimization by conjugate gradients, Comput. J. 7 (1964) 149–154.
[2] G. Zoutendijk, Nonlinear programming, computational methods, in: J. Abadie (Ed.), Integer and Nonlinear Programming, North-Holland, Amsterdam, 1970, pp. 37–86.
[3] M. Al-Baali, Descent property and global convergence of the Fletcher–Reeves method with inexact line search, IMA J. Numer. Anal. 5 (1985) 121–124.
[4] Y. Dai, Y. Yuan, A nonlinear conjugate gradient method with a strong global convergence property, SIAM J. Optim. 10 (1999) 177–182.
[5] G. Liu, J. Han, H. Yin, Global convergence of the Fletcher–Reeves method with inexact line search, Appl. Math. J. Chin. Univ. Ser. B 10 (1995) 75–82.
[6] J.C. Gilbert, J. Nocedal, Global convergence properties of conjugate gradient methods for optimization, SIAM J. Optim. 2 (1992) 21–42.
[7] Y. Liu, C. Storey, Efficient generalized conjugate gradient algorithms, Part 1: Theory, J. Optim. Theory Appl. 69 (1991) 129–137.
[8] C. Pan, L. Chen, A class of efficient new descent methods, Acta Math. Appl. Sin. 30 (2007) 88–98 (in Chinese).
[9] J. Sun, J. Zhang, Convergence of conjugate gradient methods without line search, Ann. Oper. Res. 103 (2001) 161–173.
[10] M. Raydan, The Barzilai and Borwein gradient method for the large scale unconstrained minimization problem, SIAM J. Optim. 7 (1997) 26–33.
[11] E.G. Birgin, J.M. Martinez, A spectral conjugate gradient method for unconstrained optimization, Appl. Math. Optim. 43 (2001) 117–128.
[12] L. Zhang, W. Zhou, D. Li, Global convergence of a modified Fletcher–Reeves conjugate gradient method with Armijo-type line search, Numer. Math. 104 (2006) 561–572.
[13] C. Wang, Y. Chen, S. Du, Further insight into the Shamanskii modification of Newton method, Appl. Math. Comput. 180 (2006) 46–52.
[14] J.J. More, B.S. Garbow, K.E. Hillstrom, Testing unconstrained optimization software, ACM Trans. Math. Softw. 7 (1981) 17–41.