Applied Mathematics and Computation 207 (2009) 124–134
The global convergence of augmented Lagrangian methods based on NCP function in constrained nonconvex optimization

H.X. Wu a, H.Z. Luo b,*, S.L. Li b

a Department of Mathematics, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, PR China
b Department of Applied Mathematics, Zhejiang University of Technology, Hangzhou, Zhejiang 310032, PR China

Abstract

In this paper, we present the global convergence properties of the primal–dual method using a class of augmented Lagrangian functions based on NCP function for inequality constrained nonconvex optimization problems. We construct four modified augmented Lagrangian methods based on different algorithmic strategies. We show that the convergence to a KKT point or a degenerate point of the original problem can be ensured without requiring the boundedness condition of the multiplier sequence. © 2008 Elsevier Inc. All rights reserved.

Keywords: Nonconvex optimization; Constrained optimization; Augmented Lagrangian methods; Convergence to KKT point; Degenerate point
1. Introduction

Consider the following inequality constrained nonlinear optimization problem:

$$(P)\qquad \min\ f(x) \quad \text{s.t.}\quad g_i(x) \ge 0,\ \ i = 1,\dots,m, \quad x \in X,$$

where $f$ and each $g_i : \mathbb{R}^n \to \mathbb{R}$ are continuously differentiable functions, and $X$ is a nonempty closed convex set in $\mathbb{R}^n$. Notice that $f$ and each $g_i$ are not necessarily convex.

The first augmented Lagrangian method was proposed by Hestenes [14] and Powell [32] in order to eliminate the duality gap between an equality constrained problem and its Lagrangian dual problem. Later, Rockafellar [34,35] extended this method to deal with inequality constraints. Since then, various modified augmented Lagrangian methods have been proposed. The strong duality properties and exact penalization of different types of augmented Lagrangians or nonlinear Lagrangians have been studied by many researchers (see e.g., [15–17,20,28,30,37–39]). The existence of a saddle point of certain augmented or nonlinear Lagrangian functions has been investigated in [20–22,36,39,41]. Local convergence results of augmented Lagrangian methods were studied in [10,12,26,27,30]. The global convergence of Rockafellar's augmented Lagrangian methods for nonconvex constrained problems was established in [3,13,29,42] under the restrictive assumption that the sequence of multipliers generated by the algorithms is bounded. Modified augmented Lagrangian methods for nonconvex constrained problems have been proposed to ensure global convergence without appealing to the boundedness assumption on the multiplier sequence (see [1,2,5,8,9,19]). Global convergence of augmented Lagrangian methods for convex programming has also been studied in [4,18,31,34,40]. Convergence to a global optimal solution of modified augmented Lagrangian methods was established in [23–25] for nonconvex constrained global optimization problems without requiring the boundedness of the multiplier sequence.
* Corresponding author. E-mail address: [email protected] (H.Z. Luo). 0096-3003/$ - see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.amc.2008.10.015
Recently, an augmented Lagrangian function for $(P)$ was proposed in [33] based on the Fischer–Burmeister NCP function. Good numerical behavior of this augmented Lagrangian function was shown in [33] by comparison with other known ones. We now consider a general class of augmented Lagrangian functions based on NCP functions. Let

$$L(x,\lambda,c) = f(x) - \frac{1}{c}\sum_{i=1}^{m} W(c\,g_i(x), \lambda_i), \tag{1}$$

where $c > 0$, $x \in X$, $\lambda = (\lambda_1,\dots,\lambda_m)^T \ge 0$, and the function $W : \mathbb{R}^2 \to \mathbb{R}$ is given by

$$W(s,t) = st + \int_0^s \phi(u,t)\,du. \tag{2}$$
Here, the function $\phi : \mathbb{R}^2 \to \mathbb{R}$ is assumed to satisfy the following conditions:

(A1) $\phi(s,t) = 0 \iff s \ge 0,\ t \ge 0,\ st = 0$;
(A2) $\phi(s,t) + t \ge 0$, $\forall s \in \mathbb{R}$, $t \in \mathbb{R}_+$, where $\mathbb{R}_+ = [0,\infty)$;
(A3) $\lim_{s\to+\infty} \phi(s,t) = -t$, $\lim_{s\to-\infty} \phi(s,t) = +\infty$, $\forall t \in \mathbb{R}_+$.

A function $\phi$ satisfying condition (A1) is called an NCP function. Examples of $\phi$ that satisfy conditions (A1)–(A3) include the min-function

$$\phi_{\min}(s,t) = \tfrac{1}{2}\Big(\sqrt{(s-t)^2} - s - t\Big) = -\min\{s,t\}$$

and the Fischer–Burmeister function

$$\phi_{FB}(s,t) = \sqrt{s^2+t^2} - s - t.$$

We note that, since

$$st + \int_0^s \phi_{\min}(u,t)\,du = -\tfrac{1}{2}\big[\min\{0, s-t\}\big]^2 + \tfrac{1}{2}t^2,$$
the augmented Lagrangian of Rockafellar [34,35] is a special case of $L(x,\lambda,c)$, obtained by setting $\phi(s,t) = \phi_{\min}(s,t)$ in (2). When taking $\phi(s,t) = \phi_{FB}(s,t)$ in (2), $L(x,\lambda,c)$ reduces to the augmented Lagrangian function given in [33]. Local convergence results of augmented Lagrangian methods using $L(x,\lambda,c)$ associated with $\phi(s,t) = \phi_{FB}(s,t)$ were studied in [33].

One purpose of this paper is to study global convergence properties of augmented Lagrangian methods based on $L(x,\lambda,c)$. Four different algorithmic strategies are considered to circumvent the boundedness condition on the multipliers in the convergence analysis of the basic primal–dual method. We first show that, under weaker conditions, the augmented Lagrangian method using a safeguarding strategy converges to a KKT point or a degenerate point of the original problem. The convergence properties of the augmented Lagrangian method using a conditional multiplier updating rule are then presented. Finally, we investigate the use of penalty parameter updating criteria and normalization of the multipliers in augmented Lagrangian methods.

The paper is organized as follows. In Section 2, we present the convergence properties of the basic primal–dual method under the standard assumptions. The modified augmented Lagrangian method with safeguarding is investigated in Section 3. In Section 4, we establish the convergence results of the augmented Lagrangian method with conditional multiplier updating. The use of penalty parameter updating criteria and normalization of multipliers are discussed in Section 5. Some preliminary numerical results are reported in Section 6. Finally, some concluding remarks are given in Section 7.

2. Basic primal–dual scheme

In this section, we present the basic primal–dual scheme based on the above class of augmented Lagrangians $L(x,\lambda,c)$ and discuss its convergence to a KKT point or a degenerate point under standard conditions. Define $h(x,\lambda,c) = (h_1(x,\lambda,c),\dots,h_m(x,\lambda,c))^T$ with

$$h_i(x,\lambda,c) = (1/c)\,\phi(c\,g_i(x), \lambda_i), \quad i = 1,\dots,m. \tag{3}$$
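The two NCP functions and the vector $h$ in (3) can be written down directly. The following is a minimal sketch, not the authors' Matlab code: the function names `phi_min`, `phi_fb` and `h` are illustrative assumptions, and the sign convention follows conditions (A1)–(A3), under which $\phi_{\min}(s,t) = -\min\{s,t\}$.

```python
import math

def phi_min(s, t):
    # Min-function in the sign convention of (A1)-(A3): phi_min(s, t) = -min{s, t}.
    return -min(s, t)

def phi_fb(s, t):
    # Fischer-Burmeister function: sqrt(s^2 + t^2) - s - t.
    return math.hypot(s, t) - s - t

def h(phi, g_values, lam, c):
    # Componentwise h_i = (1/c) * phi(c * g_i(x), lam_i), as in (3).
    # g_values holds the constraint values g_i(x) at the current point.
    return [phi(c * gi, li) / c for gi, li in zip(g_values, lam)]
```

For a strictly satisfied constraint with a small multiplier, for example `h(phi_min, [0.5], [1.0], 4.0)`, the update $\lambda + c\,h$ drives the multiplier to zero, which is the behavior that condition (A3) encodes for $s \to +\infty$.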
Algorithm 1. Basic primal–dual method

Step 0. (Initialization) Select two positive sequences $\{c_k\}_{k=0}^{\infty}$ and $\{\epsilon_k\}_{k=0}^{\infty}$ with $\epsilon_k \to 0$ as $k \to \infty$. Choose $\lambda^0 \ge 0$. Set $k = 0$.

Step 1. (Relaxation problem) Compute an $x^k \in X$ such that

$$\|P[x^k - \nabla_x L(x^k,\lambda^k,c_k)] - x^k\| \le \epsilon_k, \tag{4}$$

where $P$ is the Euclidean projection operator onto $X$.

Step 2. (Multiplier updating) Compute

$$\lambda^{k+1} = \lambda^k + c_k\,h(x^k,\lambda^k,c_k). \tag{5}$$

Step 3. Set $k = k + 1$, go to Step 1.
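Steps 0–3 can be illustrated on a one-dimensional toy instance. The sketch below is an assumption-laden illustration rather than the paper's implementation: the test problem ($\min x^2$ subject to $x - 1 \ge 0$, $X = \mathbb{R}$, so $P$ is the identity), the constant penalty sequence $c_k = 10$, the tolerance sequence `eps_k`, and the plain gradient-descent inner solver are all hypothetical choices. The KKT pair of the toy problem is $x^* = 1$, $\lambda^* = 2$.

```python
import math

def phi_min(s, t):
    return -min(s, t)

def phi_fb(s, t):
    return math.hypot(s, t) - s - t

def basic_primal_dual(phi, c=10.0, outer_iters=30):
    # Toy instance (illustrative assumption): min x^2  s.t.  g(x) = x - 1 >= 0, X = R.
    grad_f = lambda x: 2.0 * x
    g = lambda x: x - 1.0
    grad_g = lambda x: 1.0

    x, lam = 0.0, 0.0                    # Step 0: lam^0 >= 0
    step = 1.0 / (2.0 + 2.0 * c)         # safe step: the toy's curvature is at most 2 + 2c
    for k in range(outer_iters):
        eps_k = max(10.0 ** (-k), 1e-9)  # positive tolerances decreasing toward a small floor
        # Step 1: approximate stationarity of L(., lam, c) via gradient descent, using
        # grad_x L = grad f - (lam + phi(c g, lam)) grad g.
        for _ in range(100000):
            grad_L = grad_f(x) - (lam + phi(c * g(x), lam)) * grad_g(x)
            if abs(grad_L) <= eps_k:
                break
            x -= step * grad_L
        # Step 2: lam^{k+1} = lam^k + c_k h(x^k, lam^k, c_k), with c_k h = phi(c_k g, lam).
        lam = lam + phi(c * g(x), lam)
    return x, lam
```

Calling `basic_primal_dual(phi_fb)` or `basic_primal_dual(phi_min)` should return a pair close to $(1, 2)$. A constant $c_k$ is used only to keep the sketch short; Theorem 1 below assumes $c_k \to \infty$.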
Remark 1. The multiplier updating formula (5) is derived by noticing the following fact:

$$\nabla_x L(x^k,\lambda^k,c_k) = 0 \ \Longrightarrow\ \nabla_x F(x^k,\lambda^{k+1}) = 0,$$

where $F$ is the classical Lagrangian function given by $F(x,\lambda) = f(x) - \sum_{i=1}^{m} \lambda_i g_i(x)$. We note from condition (A2), $\lambda^0 \ge 0$ and (5) that $\lambda^k \ge 0$ for all $k$.
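The implication in Remark 1 can be checked in one line. The derivation below is a sketch added for the reader; it assumes that $L = f - (1/c)\sum_i W$ as in (1) and that $\phi(\cdot,t)$ is continuous, so that (2) gives $\partial W/\partial s\,(s,t) = t + \phi(s,t)$.

$$\nabla_x L(x,\lambda,c) = \nabla f(x) - \frac{1}{c}\sum_{i=1}^{m} \frac{\partial W}{\partial s}\big(c\,g_i(x),\lambda_i\big)\, c\,\nabla g_i(x) = \nabla f(x) - \sum_{i=1}^{m} \big[\lambda_i + \phi(c\,g_i(x),\lambda_i)\big]\nabla g_i(x) = \nabla f(x) - \sum_{i=1}^{m} \big[\lambda_i + c\,h_i(x,\lambda,c)\big]\nabla g_i(x).$$

Evaluated at $(x^k,\lambda^k,c_k)$, the bracket is exactly $\lambda_i^{k+1}$ from (5), so $\nabla_x L(x^k,\lambda^k,c_k) = \nabla_x F(x^k,\lambda^{k+1})$.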
Definition 1 [5]. A point $x^* \in X$ is said to be degenerate if there exists $\lambda^* \in \mathbb{R}^m$ with $\lambda^* \ge 0$ such that

$$\sum_{i \in I(x^*)} \lambda_i^* > 0, \qquad P\Big[x^* + \sum_{i \in I(x^*)} \lambda_i^* \nabla g_i(x^*)\Big] - x^* = 0,$$

where $I(x^*) = \{i : g_i(x^*) \le 0,\ i = 1,\dots,m\}$.

The following convergence result for Algorithm 1 can be shown by arguments similar to those in the proof of Proposition 2.2 in [4].

Theorem 1. Assume that conditions (A1)–(A3) for $\phi$ are satisfied. Let $x^*$ be a limit point of the sequence $\{x^k\}$ generated by Algorithm 1. Suppose that $\{\lambda^k\}$ is bounded. If $c_k \to \infty$ as $k \to \infty$, then either $x^*$ is degenerate or $x^*$ is a KKT point of $(P)$.

Remark 2. The multiplier update in Step 2 of Algorithm 1 is not essential for the convergence analysis of Algorithm 1. Indeed, no matter how $\lambda^k$ is updated, Theorem 1 holds true as long as $\{\lambda^k\}$ is bounded.

In the subsequent sections, we will present different approaches to modify the basic primal–dual scheme so that the convergence to a degenerate point or a KKT point of $(P)$ can still be achieved without assuming the boundedness of $\{\lambda^k\}$.

3. Modified augmented Lagrangian method using safeguarding

In this section, we use the safeguarding technique to modify the basic primal–dual algorithm (see [1,2,5]). The purpose of the safeguarding technique is to ensure the boundedness of the multiplier sequence by projecting the multipliers in Step 2 of Algorithm 1 onto suitable bounded intervals. We first describe the modified algorithm using the safeguarding technique.

Algorithm 2. Modified primal–dual method using safeguarding

Step 0. (Initialization) Choose a positive sequence $\{\epsilon_k\}_{k=1}^{\infty}$ satisfying $\epsilon_k \to 0$ as $k \to \infty$. Choose $\gamma > 1$, $\tau \in (0,1)$, $c_1 > 0$, $\lambda_i^{\max}$ and $\lambda^1 \in \mathbb{R}^m$ such that $0 \le \lambda_i^1 \le \lambda_i^{\max}$ for $i = 1,\dots,m$. Let $\sigma_0 > 0$. Set $k = 1$.

Step 1. (Relaxation problem) Compute an $x^k \in X$ such that

$$\|P[x^k - \nabla_x L(x^k,\lambda^k,c_k)] - x^k\| \le \epsilon_k. \tag{6}$$

Step 2. (Multiplier updating) Compute

$$\bar\lambda^{k+1} = \lambda^k + c_k\,h(x^k,\lambda^k,c_k), \tag{7}$$

where $h$ is defined in (3).

Step 3. (Safeguarding projection) Compute

$$\lambda^{k+1} = \mathrm{Proj}_T(\bar\lambda^{k+1}), \tag{8}$$

where $\mathrm{Proj}_T(\bar\lambda^{k+1})$ denotes the Euclidean projection of $\bar\lambda^{k+1}$ on $T = \{\lambda \in \mathbb{R}^m \mid 0 \le \lambda_i \le \lambda_i^{\max},\ i = 1,\dots,m\}$.

Step 4. (Parameter updating) Let $\sigma_k = \|h(x^k,\lambda^k,c_k)\|$. If

$$\sigma_k \le \tau\,\sigma_{k-1}, \tag{9}$$

set $c_{k+1} = c_k$; otherwise, set $c_{k+1} = \gamma c_k$. Set $k := k + 1$ and go to Step 1.

Similar to Theorem 8 in [24], we have the following global convergence result for Algorithm 2.

Theorem 2. Assume that conditions (A1)–(A3) for $\phi$ are satisfied. Let $x^*$ be a limit point of the sequence $\{x^k\}$ generated by Algorithm 2. Then, either $x^*$ is degenerate or $x^*$ is a KKT point of $(P)$.

Proof. By (6) and (7), we have:

$$\lim_{k\to\infty} \Big\| P\Big[x^k - \nabla f(x^k) + \sum_{i=1}^{m} \bar\lambda_i^{k+1} \nabla g_i(x^k)\Big] - x^k \Big\| = 0. \tag{10}$$
By condition (A2) and (7) and (8) of Algorithm 2, it holds that $\bar\lambda^k \ge 0$ and $\lambda^k \ge 0$ for all $k$. Let $K \subseteq \{1,2,\dots\}$ be such that $\{x^k\}_K \to x^*$. Since $X$ is closed, we have $x^* \in X$. We consider the following two cases.

Case (i): $\{\bar\lambda^{k+1}\}_K$ is unbounded. There exists an infinite subsequence $K_1 \subseteq K$ such that:

$$\Lambda_k = \sum_{i=1}^{m} \bar\lambda_i^{k+1} \to \infty, \quad k \to \infty,\ k \in K_1.$$

Since $0 \le \bar\lambda_i^{k+1}/\Lambda_k \le 1$, we may assume that:

$$\frac{\bar\lambda_i^{k+1}}{\Lambda_k} \to \lambda_i^*, \quad k \to \infty,\ k \in K_1 \tag{11}$$

for $i = 1,\dots,m$. On the other hand, since $\Lambda_k \to \infty$, we have $0 < 1/\Lambda_k < 1$ for sufficiently large $k \in K_1$. By using the following property of the Euclidean projection:

$$\|P[z + td] - z\| \le \|P[z + d] - z\| \quad \forall z \in X,\ d \in \mathbb{R}^n,\ t \in [0,1],$$

we deduce from (10) that:

$$\lim_{k\to\infty,\,k\in K_1} \Big\| P\Big[x^k + \frac{1}{\Lambda_k}\Big(-\nabla f(x^k) + \sum_{i=1}^{m} \bar\lambda_i^{k+1} \nabla g_i(x^k)\Big)\Big] - x^k \Big\| = 0. \tag{12}$$

Since $x^k \to x^*$ and $\Lambda_k \to \infty$ as $k \to \infty$, $k \in K_1$, we deduce from (12) and (11) that:

$$P\Big[x^* + \sum_{i=1}^{m} \lambda_i^* \nabla g_i(x^*)\Big] - x^* = 0. \tag{13}$$
We now prove $\lambda_i^* = 0$ for $i \notin I(x^*)$. We consider two cases:

(a) $c_k \to \infty$ as $k \to \infty$. For $i \notin I(x^*)$, since $g_i(x^*) > 0$, we have $c_k g_i(x^k) \to +\infty$ as $k \to \infty$, $k \in K$. Note that $\{\lambda^k\} \subset T$ is bounded by Step 3 of the algorithm. Also, by condition (A3), $\lim_{s\to+\infty} \phi(s,t) = -t$, $\forall t \in \mathbb{R}_+$. Thus, from (7), we obtain:

$$\lim_{k\to\infty,\,k\in K} \bar\lambda_i^{k+1} = \lim_{k\to\infty,\,k\in K} \big[\phi(c_k g_i(x^k), \lambda_i^k) + \lambda_i^k\big] = 0, \quad i \notin I(x^*). \tag{14}$$

Thus, by (11), $\lambda_i^* = 0$ for $i \notin I(x^*)$.

(b) $\{c_k\}$ is bounded as $k \to \infty$. In this case, (9) in Step 4 is satisfied at each iteration for sufficiently large $k$. This, together with $\tau \in (0,1)$, implies that $\sigma_k \to 0$ $(k \to \infty)$ and there exists $k_1 > 0$ such that $c_k = c_{k_1}$ for all $k \ge k_1$. Since $\sigma_k = \|h(x^k,\lambda^k,c_k)\|$, it then follows from the definition of $h$ (cf. (3)) that:

$$\lim_{k\to\infty} (1/c_k)\,\phi(c_k g_i(x^k), \lambda_i^k) = 0, \quad i = 1,\dots,m. \tag{15}$$

Since $\{\lambda^k\} \subset T$ is bounded, we can assume, without loss of generality, that $\lambda^k \to \hat\lambda$ $(k \to \infty,\ k \in K)$. From (15) and $c_k = c_{k_1}$ for all $k \ge k_1$, we obtain:

$$\phi(c_{k_1} g_i(x^*), \hat\lambda_i) = 0, \quad i = 1,\dots,m, \tag{16}$$

which in turn implies by condition (A1) that:

$$g_i(x^*) \ge 0, \quad \hat\lambda_i \ge 0, \quad \hat\lambda_i g_i(x^*) = 0, \quad i = 1,\dots,m. \tag{17}$$

So, $\hat\lambda_i = 0$, $i \notin I(x^*)$. This together with (16) gives (14), and hence by (11), $\lambda_i^* = 0$ for $i \notin I(x^*)$.

Therefore, we obtain from (13) that:

$$P\Big[x^* + \sum_{i \in I(x^*)} \lambda_i^* \nabla g_i(x^*)\Big] - x^* = 0.$$
Moreover, (11) and the definition of $\Lambda_k$ give $\sum_{i=1}^{m} \lambda_i^* = 1$, which together with $\lambda_i^* = 0$ for $i \notin I(x^*)$ yields $\sum_{i \in I(x^*)} \lambda_i^* > 0$. So, $x^*$ is degenerate.

Case (ii): $\{\bar\lambda^{k+1}\}_K$ is bounded. In this case, there exists an infinite subsequence $K_2 \subseteq K$ such that $\lim_{k\to\infty,\,k\in K_2} \bar\lambda^{k+1} = \lambda^* \ge 0$. So, taking the limit in (10) gives rise to

$$P\Big[x^* - \nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla g_i(x^*)\Big] - x^* = 0. \tag{18}$$

We consider the following two cases:

(a) $c_k \to \infty$ when the algorithm is executed. We claim that $g_i(x^*) \ge 0$ for $i = 1,\dots,m$. Otherwise, if $g_{i_0}(x^*) < 0$ for some $i_0$, then $c_k g_{i_0}(x^k) \to -\infty$ as $k \to \infty$, $k \in K$. Note from condition (A3) that $\lim_{s\to-\infty} \phi(s,t) = +\infty$ for all $t \in \mathbb{R}_+$. By the boundedness of $\{\lambda^k\}$, it follows from (7) that:

$$\lim_{k\to\infty,\,k\in K} \bar\lambda_{i_0}^{k+1} = \lim_{k\to\infty,\,k\in K} \big[\phi(c_k g_{i_0}(x^k), \lambda_{i_0}^k) + \lambda_{i_0}^k\big] = +\infty,$$

contradicting the boundedness of $\{\bar\lambda_{i_0}^{k+1}\}$. Thus, $x^*$ is a feasible solution to $(P)$. If $g_i(x^*) > 0$ for some $i$, then, as in Case (i)(a), we can show that $\lambda_i^* = 0$. Thus, we have:

$$\lambda_i^* \ge 0, \quad g_i(x^*) \ge 0, \quad \lambda_i^* g_i(x^*) = 0, \quad i = 1,\dots,m. \tag{19}$$

(b) $\{c_k\}$ is bounded when the algorithm is executed. In this case, as shown in Case (i)(b), we also get (19).

Therefore, the combination of (18) and (19) implies that $x^*$ is a KKT point of $(P)$ and $\lambda^*$ is the corresponding optimal multiplier vector. □
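The safeguarded scheme of Algorithm 2 can be sketched on the same kind of toy instance. Everything problem-specific here is an illustrative assumption rather than the paper's setup: the toy problem ($\min x^2$ subject to $x - 1 \ge 0$, $X = \mathbb{R}$), the gradient-descent inner solver `inner_solve`, the tolerance sequence, and the stopping rule on $\sigma_k$, which is borrowed from the stopping criterion described in Section 6.

```python
import math

def phi_min(s, t):
    return -min(s, t)

def phi_fb(s, t):
    return math.hypot(s, t) - s - t

def inner_solve(phi, x, lam, c, eps):
    # Step 1 on the toy problem min x^2 s.t. x - 1 >= 0 (hypothetical example):
    # gradient descent until |grad_x L| <= eps, where grad_x L = 2x - (lam + phi(c g, lam)).
    step = 1.0 / (2.0 + 2.0 * c)
    for _ in range(100000):
        grad_L = 2.0 * x - (lam + phi(c * (x - 1.0), lam))
        if abs(grad_L) <= eps:
            break
        x -= step * grad_L
    return x

def safeguarded_al(phi, lam_max=1e7, gamma=2.0, tau=0.25, c=1.0,
                   tol=1e-6, max_outer=60):
    x, lam, sigma_prev = 0.0, 1.0, 1.0          # Step 0 with sigma_0 = 1 > 0
    for k in range(1, max_outer + 1):
        x = inner_solve(phi, x, lam, c, eps=max(10.0 ** (-k), 1e-10))
        h = phi(c * (x - 1.0), lam) / c         # h from (3)
        lam_bar = lam + c * h                   # Step 2, formula (7)
        lam = min(max(lam_bar, 0.0), lam_max)   # Step 3, projection (8) onto T
        sigma = abs(h)
        if sigma <= tol:                        # added practical stopping rule (assumption)
            break
        if sigma > tau * sigma_prev:            # Step 4: test (9) failed, enlarge penalty
            c *= gamma
        sigma_prev = sigma
    return x, lam, c
```

On this instance `safeguarded_al(phi_fb)` should return $x \approx 1$ and $\lambda \approx 2$ with a moderate final penalty, since test (9) starts to hold once $c_k$ is large enough for the multiplier iteration to contract by the factor $\tau$.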
4. Modified augmented Lagrangian method with conditional multiplier updating

In this section, we investigate an alternative strategy to modify the basic primal–dual algorithm for solving $(P)$. The idea is to modify Step 2 of Algorithm 1 so that the multipliers remain unchanged unless certain progress in feasibility–complementarity is achieved. A similar idea for multiplier updating has been used in [9] for equality constrained problems.

Algorithm 3. Modified primal–dual method with conditional multiplier updating

Step 0. Choose an initial multiplier vector $\lambda^0 \ge 0$ and the constants $c_0 > 1$, $u_0 > 0$, $v_0 > 0$, $\tau > 1$, $\gamma_1 \in (0,1)$, $0 \le \epsilon_* \ll 1$, $\alpha_\eta > 0.5$, $\beta_\eta > 0$, $\alpha_\omega > 0$, $\beta_\omega > 0$. Set $\alpha_0 = \min\{1/c_0, \gamma_1\}$, $\epsilon_0 = v_0(\alpha_0)^{\alpha_\omega}$, $\eta_0 = u_0(\alpha_0)^{\alpha_\eta}$, and $k = 0$.

Step 1. Find an $x^k \in X$ satisfying

$$\|P[x^k - \nabla_x L(x^k,\lambda^k,c_k)] - x^k\| \le \epsilon_k. \tag{20}$$

Let $\sigma_k = \|h(x^k,\lambda^k,c_k)\|$, where $h$ is defined by (3). If

$$\sigma_k \le \eta_k, \tag{21}$$

go to Step 2. Otherwise, go to Step 3.

Step 2. If $\sigma_k \le \epsilon_*$, stop. Otherwise, set

$$\lambda^{k+1} = \lambda^k + c_k h(x^k,\lambda^k,c_k),\quad c_{k+1} = c_k,\quad \alpha_{k+1} = \min\{1/c_{k+1}, \gamma_1\},\quad \epsilon_{k+1} = \epsilon_k(\alpha_{k+1})^{\beta_\omega},\quad \eta_{k+1} = \eta_k(\alpha_{k+1})^{\beta_\eta}. \tag{22}$$

Set $k := k + 1$, go to Step 1.

Step 3. Set

$$\lambda^{k+1} = \lambda^k,\quad c_{k+1} = \tau c_k,\quad \alpha_{k+1} = \min\{1/c_{k+1}, \gamma_1\},\quad \epsilon_{k+1} = v_0(\alpha_{k+1})^{\alpha_\omega},\quad \eta_{k+1} = u_0(\alpha_{k+1})^{\alpha_\eta}. \tag{23}$$

Set $k := k + 1$, go to Step 1.

Let $\epsilon_* = 0$ in Algorithm 3. We first give the following lemmas, which can be proved using arguments similar to those in the proofs of Lemma 4.1 in [9] and Lemma 2 in [24].

Lemma 1. If $c_k \to \infty$ when Algorithm 3 is executed, then $\lim_{k\to\infty} \lambda^k/\sqrt{c_k} = 0$.

Lemma 2. If $\{c_k\}$ is bounded when Algorithm 3 is executed, then $\{\lambda^k\}$ is convergent.
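The branching between Step 2 and Step 3 of Algorithm 3 can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the toy problem ($\min x^2$ subject to $x - 1 \ge 0$, $X = \mathbb{R}$), the inner solver `inner_solve`, the parameter values, the positive stopping tolerance `eps_star`, and the floor of `1e-12` on the inner tolerance (a purely numerical safeguard) are all hypothetical.

```python
import math

def phi_min(s, t):
    return -min(s, t)

def phi_fb(s, t):
    return math.hypot(s, t) - s - t

def inner_solve(phi, x, lam, c, eps):
    # Gradient descent on L(., lam, c) for the toy problem min x^2 s.t. x - 1 >= 0.
    step = 1.0 / (2.0 + 2.0 * c)
    for _ in range(20000):
        grad_L = 2.0 * x - (lam + phi(c * (x - 1.0), lam))
        if abs(grad_L) <= eps:
            break
        x -= step * grad_L
    return x

def conditional_update_al(phi, c=2.0, u0=1.0, v0=1.0, tau=2.0, gamma1=0.1,
                          a_eta=0.6, b_eta=0.9, a_om=1.0, b_om=1.0,
                          eps_star=1e-6, max_outer=100):
    x, lam = 0.0, 0.0
    alpha = min(1.0 / c, gamma1)                      # Step 0
    eps, eta = v0 * alpha ** a_om, u0 * alpha ** a_eta
    for _ in range(max_outer):
        x = inner_solve(phi, x, lam, c, max(eps, 1e-12))   # Step 1, condition (20)
        h = phi(c * (x - 1.0), lam) / c
        sigma = abs(h)
        if sigma <= eta:                              # test (21): enough progress
            if sigma <= eps_star:                     # Step 2 stopping test
                break
            lam = lam + c * h                         # (22): update the multiplier, keep c
            alpha = min(1.0 / c, gamma1)
            eps, eta = eps * alpha ** b_om, eta * alpha ** b_eta
        else:                                         # (23): keep multiplier, enlarge c
            c *= tau
            alpha = min(1.0 / c, gamma1)
            eps, eta = v0 * alpha ** a_om, u0 * alpha ** a_eta
    return x, lam, c
```

On this instance the multiplier stays at its initial value while the penalty is increased a few times, after which test (21) holds and the iteration proceeds mostly through Step 2, ending near $x = 1$, $\lambda = 2$.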
We now present the convergence result for Algorithm 3 based on the above lemmas. In addition to (A1)–(A3), let $\phi(s,t)$ in the definition of $L$ satisfy the following condition:

(A4) For any sequence $\{s_k\} \subseteq \mathbb{R}$, nonnegative sequence $\{t_k\} \subseteq \mathbb{R}$ and positive sequence $\{c_k\} \subseteq \mathbb{R}$ with $c_k \to \infty$ $(k \to \infty)$, one has

$$\lim_{k\to\infty} \frac{t_k}{\sqrt{c_k}} = 0,\ \ \lim_{k\to\infty} s_k = \bar s > 0 \ \Longrightarrow\ \lim_{k\to\infty} \big[\phi(c_k s_k, t_k) + t_k\big] = 0, \tag{24}$$

$$\lim_{k\to\infty} \frac{t_k}{\sqrt{c_k}} = 0,\ \ \lim_{k\to\infty} s_k = \bar s < 0 \ \Longrightarrow\ \lim_{k\to\infty} \big[\phi(c_k s_k, t_k) + t_k\big] = +\infty. \tag{25}$$

It can be verified that $\phi_{\min}(s,t)$ and $\phi_{FB}(s,t)$ satisfy condition (A4). We note that using condition (A3), if $\{t_k\}$ is bounded, then condition (A4) is clearly satisfied.

Theorem 3. Suppose that conditions (A1)–(A4) for $\phi$ are satisfied. Let $x^*$ be a limit point of the sequence $\{x^k\}$ generated by Algorithm 3. Then, either $x^*$ is degenerate or $x^*$ is a KKT point of $(P)$.

Proof. Let $K \subseteq \{1,2,\dots\}$ be such that $\{x^k\}_K \to x^* \in X$. By (20), we have:

$$\lim_{k\to\infty} \Big\| P\Big[x^k - \nabla f(x^k) + \sum_{i=1}^{m} \bar\lambda_i^{k+1} \nabla g_i(x^k)\Big] - x^k \Big\| = 0, \tag{26}$$

where $\bar\lambda^{k+1}$ is defined by

$$\bar\lambda_i^{k+1} = \phi(c_k g_i(x^k), \lambda_i^k) + \lambda_i^k, \quad i = 1,\dots,m. \tag{27}$$

Suppose that $c_k \to \infty$ when the algorithm is executed. By Lemma 1, it holds that

$$\lim_{k\to\infty} \frac{\lambda^k}{\sqrt{c_k}} = 0. \tag{28}$$

Thus, by (24) in condition (A4), we have:

$$\lim_{k\to\infty,\,k\in K} \bar\lambda_i^{k+1} = \lim_{k\to\infty,\,k\in K} \big[\phi(c_k g_i(x^k), \lambda_i^k) + \lambda_i^k\big] = 0 \tag{29}$$

for each $i$ with $g_i(x^*) > 0$. Moreover, from (28) and (25) in (A4), we have that:

$$\lim_{k\to\infty,\,k\in K} \bar\lambda_i^{k+1} = \lim_{k\to\infty,\,k\in K} \big[\phi(c_k g_i(x^k), \lambda_i^k) + \lambda_i^k\big] = +\infty, \tag{30}$$

when $g_i(x^*) < 0$.

We now consider two cases: (i) $\{\bar\lambda^{k+1}\}_K$ is unbounded, and (ii) $\{\bar\lambda^{k+1}\}_K$ is bounded. For the first case, using (29) and Lemma 2, similar to Case (i) in the proof of Theorem 2, we can infer from (26) that $x^*$ is degenerate. In the second case, using (30), Lemma 2 and arguments similar to Case (ii) in the proof of Theorem 2, we can infer from (26) that $x^*$ is a KKT point of $(P)$. □

5. Penalty parameter updating and normalization of multipliers

In this section, two further strategies are presented for modifying the basic augmented Lagrangian algorithm. We first investigate the strategy of updating the penalty parameter $c_k$ using the information of the multipliers (see [12,31,40]).

Theorem 4. Suppose that conditions (A1)–(A4) for $\phi$ are satisfied. Let $c_k$ in Algorithm 1 be updated by the following formula:

$$c_{k+1} = c_k \max\Big\{\gamma,\ \max_{i=1,\dots,m} \big(\lambda_i^{k+1}\big)^2\Big\}, \tag{31}$$

where $\gamma > 1$. Let $x^*$ be a limit point of the sequence $\{x^k\}$ generated by the modified Algorithm 1. Then, either $x^*$ is degenerate or $x^*$ is a KKT point of $(P)$.

Proof. By (4) and (5), we have:

$$\lim_{k\to\infty} \Big\| P\Big[x^k - \nabla f(x^k) + \sum_{i=1}^{m} \lambda_i^{k+1} \nabla g_i(x^k)\Big] - x^k \Big\| = 0. \tag{32}$$
Since $\gamma > 1$, we see from (31) that $c_k \to \infty$ as $k \to \infty$. Again, from (31), we have:

$$\frac{1}{\sqrt{c_k}} \ \ge\ \frac{\frac{1}{m}\sum_{i=1}^{m} |\lambda_i^{k+1}|}{\sqrt{c_{k+1}}} \ \ge\ 0.$$

Thus,

$$\lim_{k\to\infty} \frac{\lambda^k}{\sqrt{c_k}} = 0. \tag{33}$$

Let $K \subseteq \{1,2,\dots\}$ be such that $\{x^k\}_K \to x^* \in X$. Using arguments similar to those in the proof of Theorem 3, by (33) and condition (A4), we can infer from (32) that either $x^*$ is degenerate or $x^*$ is a KKT point of $(P)$. □

Next, we consider another approach to guarantee the boundedness of the multipliers in the basic augmented Lagrangian method. The idea is to normalize the multipliers in the augmented Lagrangian $L$. The resulting augmented Lagrangian is as follows:

$$\tilde L(x,\lambda,c) = f(x) - \frac{1}{c}\sum_{i=1}^{m} W\Big(c\,g_i(x),\ \frac{\lambda_i}{1 + \|\lambda\|}\Big), \tag{34}$$

where $c > 0$, $x \in X$, $\lambda \ge 0$. Notice that the factor $\frac{1}{1+\|\lambda\|}$ plays the role of normalizing the multiplier vector $\lambda$. A similar idea was used in [28] for constructing another type of augmented Lagrangian function.
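The two devices of this section, the penalty update (31) and the normalization used in (34), are simple enough to state as code. The sketch below uses hypothetical helper names `update_penalty` and `normalize`, and it assumes the Euclidean norm for $\|\lambda\|$, which the text does not specify.

```python
import math

def update_penalty(c_k, lam_next, gamma=2.0):
    # Rule (31): c_{k+1} = c_k * max{gamma, max_i (lam_i^{k+1})^2}, with gamma > 1.
    return c_k * max(gamma, max(l * l for l in lam_next))

def normalize(lam):
    # Normalized multipliers lam / (1 + ||lam||); the Euclidean norm is an assumption.
    norm = math.sqrt(sum(l * l for l in lam))
    return [l / (1.0 + norm) for l in lam]
```

For instance, `update_penalty(1.0, [3.0, 0.5])` multiplies the penalty by the squared largest multiplier, which is what makes $\lambda^k/\sqrt{c_k}$ controllable in (33), while `normalize` always returns a vector of norm strictly less than 1, so the multipliers entering $\tilde L$ are bounded by construction.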
Theorem 5. Let $\bar\lambda^k = \frac{\lambda^k}{1 + \|\lambda^k\|}$. Let $L$ in Step 1 and $\lambda^k$ in Step 2 of Algorithm 1 be replaced by $\tilde L$ and $\bar\lambda^k$, respectively. Suppose that conditions (A1)–(A3) for $\phi$ are satisfied. Let $x^*$ be a limit point of the sequence $\{x^k\}$ generated by the modified Algorithm 1 associated with $\tilde L$. If $c_k \to \infty$ as $k \to \infty$, then either $x^*$ is degenerate or $x^*$ is a KKT point of $(P)$.

Proof. By definition (34), we have:

$$\tilde L(x,\lambda^k,c) = f(x) - \frac{1}{c}\sum_{i=1}^{m} W(c\,g_i(x), \bar\lambda_i^k) = L(x,\bar\lambda^k,c).$$

By Step 1 of Algorithm 1, we have:

$$\|P[x^k - \nabla_x L(x^k,\bar\lambda^k,c_k)] - x^k\| \le \epsilon_k. \tag{35}$$

Since $\{\bar\lambda^k\}$ is bounded and $c_k \to \infty$, using arguments similar to those in the proof of Theorem 2, we can deduce from (35) that either $x^*$ is degenerate or $x^*$ is a KKT point of $(P)$. □

6. Numerical examples

In this section, we report some preliminary numerical results of the four modified augmented Lagrangian methods discussed in Sections 3–5. The purpose of our numerical experiment is to provide some insights into the numerical behavior of the methods and to make numerical comparisons among the four modified augmented Lagrangian algorithms. For convenience, we denote by Algorithm 4 the modified augmented Lagrangian method with penalty parameter updating in Theorem 4. Also, we denote by Algorithm 5 the modified augmented Lagrangian method with normalization of multipliers in Theorem 5. In our implementation of Algorithms 2–5, the NCP function $\phi(s,t)$ in $L(x,\lambda,c)$ (cf. (1) and (2)) takes the following forms, respectively:

$$\phi_{\min}(s,t) = \tfrac{1}{2}\Big(\sqrt{(s-t)^2} - s - t\Big), \tag{36}$$

$$\phi_{FB}(s,t) = \sqrt{s^2+t^2} - s - t. \tag{37}$$

It is easy to see that the above functions $\phi_{\min}$ and $\phi_{FB}$ satisfy conditions (A1)–(A4). The algorithms were coded in Matlab 7.1 and run on a Pentium IV PC. In our implementation, the stopping criterion for the algorithms is $\|h(x^k,\lambda^k,c_k)\|_\infty \le 10^{-5}$. The augmented Lagrangian relaxation subproblems in Algorithms 2–5 are solved by the Matlab subroutine fmincon with a given initial point. The subroutine fmincon finds a constrained minimum of a scalar function of several variables starting at an initial estimate. The default large-scale algorithm of the subroutine is chosen in our implementation; it is a subspace trust region method based on the interior-reflective Newton method described in [6,7]. Four test problems from the literature are considered in our numerical experiment. The parameters in the algorithms are set as follows:
– Algorithm 2: $\tau = 0.25$, $\gamma = 2$, $c_1 = 1$, $\lambda_i^{\max} = 10^7$, $\lambda_i^1 = 1$, $i = 1,\dots,m$.
– Algorithm 3: $\gamma_1 = 0.1$, $\tau = 2$, $\alpha_\eta = 0.1$, $\beta_\eta = 0.9$, $u_0 = v_0 = \alpha_\omega = \beta_\omega = 1$, $c_1 = 1$, $\lambda_i^1 = 1$, $i = 1,\dots,m$.
– Algorithm 4: $\tau = 0.25$, $\gamma = 2$, $c_1 = 1$, $\lambda_i^1 = 1$, $i = 1,\dots,m$.
– Algorithm 5: $\tau = 0.1$, $\gamma = 2$, $c_1 = 1$, $\lambda_i^1 = 1$, $i = 1,\dots,m$.
– The parameter $\gamma$ is set to 10 when Algorithms 2–4 are applied to the fourth test problem (Example 4).
Example 1 [11]. Consider the following concave quadratic problem:

$$\min\ f(x) = c^T x - 0.5\sum_{i=1}^{5} x_i^2$$
$$\text{s.t.}\quad 6x_1 + 3x_2 + 3x_3 + 2x_4 + x_5 \le 6.5,$$
$$10x_1 + 10x_2 + x_6 \le 20,$$
$$x \in X = \{x \in \mathbb{R}^6 \mid 0 \le x_i \le 1,\ i = 1,\dots,5;\ x_6 \ge 0\},$$

where $c = (-10.5, -7.5, -3.5, -2.5, -1.5, -10)^T$. The global solution of this example is $x^* = (0, 1, 0, 1, 1, 20)^T$ with optimal value $f(x^*) = -213$. Table 1 summarizes the numerical results for Example 1, where $x^0$ is the given initial point of the algorithm, $k$ is the number of iterations, $c_k$ is the penalty parameter, $\lambda^k$ is the multiplier, $x^k$ is the optimal solution found by the algorithm and $f(x^k)$ is the optimal objective value.

Example 2 [11]. Consider the following nonconvex problem:

$$\min\ f(x) = -x_1 x_2 x_3$$
$$\text{s.t.}\quad x_1 + 2x_2 + 2x_3 \le 72,$$
$$x_1 + 2x_2 + 2x_3 \ge 0,$$
$$x \in X = \{x \in \mathbb{R}^3 \mid 0 \le x_i \le 42,\ i = 1, 2, 3\}.$$

The global solution to this example is $x^* = (24, 12, 12)^T$ with optimal value $f(x^*) = -3456$. Numerical results of Algorithms 2–5 for Example 2 are reported in Table 2.

Example 3 [11]. Consider the following nonconvex problem:

$$\min\ f(x) = -x_1 - x_2$$
$$\text{s.t.}\quad -2x_1^4 + 8x_1^3 - 8x_1^2 + x_2 - 2 \le 0,$$
$$-4x_1^4 + 32x_1^3 - 88x_1^2 + 96x_1 + x_2 - 36 \le 0,$$
$$x \in X = \{x \mid 0 \le x_1 \le 3,\ 0 \le x_2 \le 4\}.$$
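Table 1. Numerical results for Example 1 with $x^0 = (0, 0, 0, 0, 0, 19)$.

| Algorithm | $\phi$ | $k$ | $c_k$ | $\lambda^k$ | $x^k$ | $f(x^k)$ |
|---|---|---|---|---|---|---|
| 2 | $\phi_{\min}$ | 3 | 4 | (0, 1.0503) | (0, 1, 0, 1, 1, 20) | −213.0 |
| 2 | $\phi_{FB}$ | 5 | 4 | (0, 1.0000) | (0, 1, 0, 1, 1, 20) | −213.0 |
| 3 | $\phi_{\min}$ | 4 | 2 | (0, 1.0503) | (0, 1, 0, 1, 1, 20) | −213.0 |
| 3 | $\phi_{FB}$ | 7 | 2 | (0, 1.0000) | (0, 1, 0, 1, 1, 20) | −213.0 |
| 4 | $\phi_{\min}$ | 3 | 4 | (0, 1.0503) | (0, 1, 0, 1, 1, 20) | −213.0 |
| 4 | $\phi_{FB}$ | 5 | 4 | (0, 1.0000) | (0, 1, 0, 1, 1, 20) | −213.0 |
| 5 | $\phi_{\min}$ | 17 | $2^{16}$ | (0, 1.0523) | (0, 1, 0, 1, 1, 20) | −213.0 |
| 5 | $\phi_{FB}$ | 9 | 16 | (0, 0.3846) | (0, 1, 0, 1, 1, 20) | −213.0 |

Table 2. Numerical results for Example 2 with $x^0 = (20, 10, 10)$.

| Algorithm | $\phi$ | $k$ | $c_k$ | $\lambda^k$ | $x^k$ | $f(x^k)$ |
|---|---|---|---|---|---|---|
| 2 | $\phi_{\min}$ | 13 | 32 | (143.9999, 0) | (24, 12, 12) | −3456.0 |
| 2 | $\phi_{FB}$ | 10 | 8 | (18.0000, 0) | (24, 12, 12) | −3456.0 |
| 3 | $\phi_{\min}$ | 13 | 512 | (143.9998, 0) | (24, 12, 12) | −3456.0 |
| 3 | $\phi_{FB}$ | 8 | 16 | (9.0000, 0) | (24, 12, 12) | −3456.0 |
| 4 | $\phi_{\min}$ | 5 | 139 | (144.0011, 0) | (24, 12, 12) | −3456.0 |
| 4 | $\phi_{FB}$ | 4 | 276 | (0.5219, 0) | (24, 12, 12) | −3456.0 |
| 5 | $\phi_{\min}$ | 25 | $2^{24}$ | (142.7017, 0) | (24, 12, 12) | −3456.0 |
| 5 | $\phi_{FB}$ | 15 | $2^{12}$ | (0.0702, 0) | (24, 12, 12) | −3456.0 |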
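Table 3. Numerical results for Example 3 with $x^0 = (2, 2)$.

| Algorithm | $\phi$ | $k$ | $c_k$ | $\lambda^k$ | $x^k$ | $f(x^k)$ |
|---|---|---|---|---|---|---|
| 2 | $\phi_{\min}$ | 4 | 2 | (0.2876, 0.7126) | (2.3295, 3.1785) | −5.5080 |
| 2 | $\phi_{FB}$ | 4 | 4 | (0.0721, 0.1780) | (2.3295, 3.1785) | −5.5080 |
| 3 | $\phi_{\min}$ | 4 | 1 | (0.2877, 0.7128) | (2.3295, 3.1785) | −5.5080 |
| 3 | $\phi_{FB}$ | 5 | 1 | (0.2876, 0.7124) | (2.3295, 3.1785) | −5.5080 |
| 4 | $\phi_{\min}$ | 4 | 2 | (0.2876, 0.7126) | (2.3295, 3.1785) | −5.5080 |
| 4 | $\phi_{FB}$ | 4 | 4 | (0.0721, 0.1780) | (2.3295, 3.1785) | −5.5080 |
| 5 | $\phi_{\min}$ | 13 | $2^{12}$ | (0.2976, 0.7128) | (2.3295, 3.1785) | −5.5080 |
| 5 | $\phi_{FB}$ | 9 | 128 | (0.0040, 0.0109) | (2.3295, 3.1785) | −5.5080 |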
Table 4 Numerical results for Example 4 with x0 ¼ ð5; 1; 5; 0; 0; 5Þ. Algorithm
kk
xk
f ðxk Þ
10 100
(1, 0, 37.9971, 0, 111.9991, 0) (0.0792, 0, 0.3805, 0, 1.1208, 0)
(5, 1, 5, 0, 5, 10) (5, 1, 5, 0, 5, 10)
310 310
7 5
104 100
(1.0, 0, 37.9962, 0, 111.9879, 0) (0.0788, 0.0, 0.3799, 0.0, 1.1199, 0)
(5, 1, 5, 0, 5, 10) (5, 1, 5, 0, 5, 10)
310 310
/min /FB
5 5
19,876 131.6
(1.0, 0, 37.9913, 0, 111.9529, 0) (0.0004, 0, 0.2891, 0, 0.8518, 0.0)
(5, 1, 5, 0, 5, 10) (5, 1, 5, 0, 5, 10)
310 310
/min /FB
22 11
221 512
(0.0, 0, 37.9601, 0, 111.9617, 0) (0.0, 0, 0.07420, 0.2189, 0.0002)
(5, 1, 5, 0, 5, 10) (5, 1, 5, 0, 5, 10)
310 310
k
/
2
min
/ /FB
7 5
3
/min /FB
4 5
ck 3
The best known global solution to this example is $x^* = (2.3295, 3.1783)$ with function value $f(x^*) = -5.5079$. Table 3 summarizes the numerical results for Example 3.

Example 4 [11]. Consider the following nonconvex quadratic problem:

$$\min\ f(x) = -25(x_1 - 2)^2 - (x_2 - 2)^2 - (x_3 - 1)^2 - (x_4 - 4)^2 - (x_5 - 1)^2 - (x_6 - 4)^2$$
$$\text{s.t.}\quad -(x_3 - 3)^2 - x_4 + 4 \le 0,$$
$$-(x_5 - 3)^2 - x_6 + 4 \le 0,$$
$$x_1 - 3x_2 - 2 \le 0,$$
$$-x_1 + x_2 - 2 \le 0,$$
$$x_1 + x_2 - 6 \le 0,$$
$$-x_1 - x_2 + 2 \le 0,$$
$$x \in X = \{x \mid 0 \le x_1 \le 6,\ 0 \le x_2 \le 8,\ 1 \le x_3 \le 5,\ 0 \le x_4 \le 6,\ 1 \le x_5 \le 5,\ 0 \le x_6 \le 10\}.$$

The best known global solution to this example is $x^* = (5, 1, 5, 0, 5, 10)$ with function value $f(x^*) = -310$. Table 4 summarizes the numerical results for Example 4.

Table 5 illustrates numerical results of Algorithms 2–4 using the exponential augmented Lagrangian $L(x,\lambda,c)$ defined in (1) associated with $W(s,t) = t(e^{-s} - 1)$ for Examples 1–4.

From Tables 1–4, we observe that among the four proposed augmented Lagrangian methods using different strategies, Algorithms 2 and 3 are the more effective methods in terms of the number of iterations. We also see that Algorithms 2–5 using the Fischer–Burmeister function $\phi_{FB}(s,t)$ are more efficient than those using the min-function $\phi_{\min}(s,t)$. In particular, we observe from Tables 1–4 that Algorithm 5 using the min-function $\phi_{\min}(s,t)$ requires more iterations and a larger penalty coefficient $c_k$ than that using the Fischer–Burmeister function $\phi_{FB}(s,t)$ for Examples 1–4. From Table 5, we notice that Algorithms 2–4 using the Fischer–Burmeister function $\phi_{FB}(s,t)$ are more efficient than those using the exponential function $W(s,t) = t(e^{-s} - 1)$. Therefore, our preliminary numerical results suggest that the Fischer–Burmeister function is preferable to the min-function and the exponential function for the four modified augmented Lagrangian methods.

Table 5. Numerical results of Algorithms 2–4 using the exponential augmented Lagrangian for Examples 1–4.

| Example | Algorithm | $k$ | $c_k$ | $\lambda^k$ | $x^k$ | $f(x^k)$ |
|---|---|---|---|---|---|---|
| 1 | 2 | 6 | 8 | (0, 1.0498) | (0, 1, 0, 1, 1, 20) | −213.0 |
| 1 | 3 | 10 | 4 | (0, 1.0461) | (0, 1, 0, 1, 1, 20) | −213.0 |
| 1 | 4 | 6 | 8 | (0, 1.0498) | (0, 1, 0, 1, 1, 20) | −213.0 |
| 2 | 2 | 7 | 4 | (143.9999, 0) | (24, 12, 12) | −3456.0 |
| 2 | 3 | 13 | 512 | (143.9887, 0) | (24, 12, 12) | −3456.0 |
| 2 | 4 | 4 | 165.15 | (143.9966, 0) | (24, 12, 12) | −3456.0 |
| 3 | 2 | 4 | 2 | (0.2875, 0.7125) | (2.3295, 3.1785) | −5.5080 |
| 3 | 3 | 5 | 1 | (0.2876, 0.7123) | (2.3295, 3.1785) | −5.5080 |
| 3 | 4 | 4 | 2 | (0.2875, 0.7125) | (2.3295, 3.1785) | −5.5080 |
| 4 | 2 | 8 | 8 | (1, 0, 38.0001, 0, 112.0000, 0) | (5, 1, 5, 0, 5, 10) | −310 |
| 4 | 3 | 13 | 512 | (1, 0, 38.0522, 0, 112.0317, 0) | (5, 1, 5, 0, 5, 10) | −310 |
| 4 | 4 | 9 | 32,373 | (1, 0, 38.1098, 0, 112.0102, 0) | (5, 1, 5, 0, 5, 10) | −310 |

7. Concluding remarks

We have presented some global convergence properties for modified augmented Lagrangian methods using a class of augmented Lagrangian functions based on NCP function for inequality constrained optimization problems. The main contribution of the paper is an investigation of different strategies for modifying the basic primal–dual method so that the
convergence to a KKT point or a degenerate point can be achieved without the boundedness condition of the multipliers. We have also presented some preliminary numerical results and have made comparisons among the four modified augmented Lagrangian algorithms. One of the future research topics is to further investigate the computational aspects of the proposed modified augmented Lagrangian methods for large-scale nonconvex problems. Another research topic is to study the characterization of the boundedness of the penalty parameter.

Acknowledgements

This work was supported by the Province Natural Science Foundation of Zhejiang grant Y7080184, the National Natural Science Foundation of China grants 70671064 and 60673177, and the Education Department Foundation of Zhejiang Province grant 20070306.

References

[1] R. Andreani, E.G. Birgin, J.M. Martínez, M.L. Schuverdt, On augmented Lagrangian methods with general lower-level constraints, SIAM J. Optim. 18 (2007) 1286–1309. [2] R. Andreani, E.G. Birgin, J.M. Martínez, M.L. Schuverdt, Augmented Lagrangian methods under the constant positive linear dependence constraint qualification, Math. Program. 111 (2008) 5–32. [3] M.C. Bartholomew-Biggs, Recursive quadratic programming methods based on the augmented Lagrangian function, Math. Program. Study 31 (1987) 21–41. [4] D.P. Bertsekas, Constrained Optimization and Lagrangian Multiplier Methods, Academic Press, New York, 1982. [5] E.G. Birgin, R.A. Castillo, J.M. Martínez, Numerical comparison of augmented Lagrangian algorithms for nonconvex problems, Comput. Optim. Appl. 31 (2005) 31–55. [6] T.F. Coleman, Y. Li, On the convergence of reflective Newton methods for large-scale nonlinear minimization subject to bounds, Math. Program. 67 (1994) 189–224. [7] T.F. Coleman, Y. Li, An interior trust region approach for nonlinear minimization subject to bounds, SIAM J. Optim. 6 (1996) 418–445. [8] A.R. Conn, N.I.M. Gould, A. Sartenaer, P.L.
Toint, Convergence properties of an augmented Lagrangian algorithm for optimization with a combination of general equality and linear constraints, SIAM J. Optim. 6 (1996) 674–703. [9] A.R. Conn, N.I.M. Gould, P.L. Toint, A globally convergent augmented Lagrangian algorithm for optimization with general constraints and simple bounds, SIAM J. Numer. Anal. 28 (1991) 545–572. [10] L. Contesse-Becker, Extended convergence results for the method of multipliers for nonstrictly binding inequality constraints, J. Optim. Theory Appl. 79 (1993) 273–310. [11] C.A. Floudas, P.M. Pardalos, C.S. Adjiman, W.R. Esposito, Z.H. Gumus, S.T. Harding, J.L. Klepeis, C.A. Meyer, C.A. Schweiger, Handbook of Test Problems in Local and Global Optimization, Kluwer Academic Publishers, Dordrecht, 1999. [12] I. Griva, R. Polyak, A primal–dual nonlinear rescaling method with dynamic scaling parameter update, Math. Program. 106 (2006) 237–259. [13] W.W. Hager, Dual techniques for constrained optimization, J. Optim. Theory Appl. 55 (1987) 37–71. [14] M.R. Hestenes, Multiplier and gradient methods, J. Optim. Theory Appl. 4 (1969) 303–320. [15] X.X. Huang, X.Q. Yang, A unified augmented Lagrangian approach to duality and exact penalization, Math. Oper. Res. 28 (2003) 524–532. [16] K.C. Kiwiel, On the twice differentiable cubic augmented Lagrangian, J. Optim. Theory Appl. 88 (1996) 233–236. [17] B.W. Kort, D.P. Bertsekas, A new penalty method for constrained minimization, in: Proceedings of the 1972 IEEE Conference on Decision and Control, New Orleans, 1972, pp. 162–166. [18] B.W. Kort, D.P. Bertsekas, Combined primal–dual and penalty methods for convex programming, SIAM J. Control Optim. 14 (1976) 268–294. [19] R.M. Lewis, V. Torczon, A globally convergent augmented Lagrangian pattern search algorithm for optimization with general constraints and simple bounds, SIAM J. Optim. 12 (2002) 1075–1089. [20] D. Li, Zero duality gap for a class of nonconvex optimization problems, J. Optim. Theory Appl. 
85 (1995) 309–324. [21] D. Li, X.L. Sun, Convexification and existence of saddle point in a pth-power reformulation for nonconvex constrained optimization, Nonlinear Anal. 47 (2001) 5611–5622. [22] D. Li, X.L. Sun, Existence of a saddle point in nonconvex constrained optimization, J. Global Optim. 21 (2001) 39–50. [23] H.Z. Luo, Studies on the augmented Lagrangian methods and convexification methods for constrained global optimization, Ph.D. Thesis, Department of Mathematics, Shanghai University, 2007. [24] H.Z. Luo, X.L. Sun, D. Li, On the convergence of augmented Lagrangian methods for constrained global optimization, SIAM J. Optim. 18 (2007) 1209– 1230.
[25] H.Z. Luo, X.L. Sun, H.X. Wu, Convergence properties of augmented Lagrangian methods for constrained global optimization, Optim. Methods Softw. 23 (2008) 763–778. [26] O.L. Mangasarian, Unconstrained Lagrangians in nonlinear programming, SIAM J. Control Optim. 13 (1975) 772–791. [27] V.H. Nguyen, J.J. Strodiot, On the convergence rate of a penalty function method of exponential type, J. Optim. Theory Appl. 27 (1979) 495–508. [28] G. Di Pillo, S. Lucidi, An augmented Lagrangian function with improved exactness properties, SIAM J. Optim. 12 (2001) 376–406. [29] E. Polak, A.L. Tits, A globally convergent implementable multiplier method with automatic penalty limitation, Appl. Math. Optim. 6 (1980) 335–360. [30] R. Polyak, Modified barrier functions: theory and methods, Math. Program. 54 (1992) 177–222. [31] R. Polyak, Nonlinear rescaling vs. smoothing technique in convex optimization, Math. Program. 92 (2002) 197–235. [32] M.J.D. Powell, A method for nonlinear constraints in minimization problems, in: R. Fletcher (Ed.), Optimization, Academic Press, New York, 1969, pp. 283–298. [33] Y.H. Ren, L.W. Zhang, X.T. Xiao, A nonlinear Lagrangian based on Fischer–Burmeister NCP function, Appl. Math. Comput. 188 (2007) 1344–1363. [34] R.T. Rockafellar, The multiplier method of Hestenes and Powell applied to convex programming, J. Optim. Theory Appl. 12 (1973) 555–562. [35] R.T. Rockafellar, Augmented Lagrange multiplier functions and duality in nonconvex programming, SIAM J. Control Optim. 12 (1974) 268–285. [36] R.T. Rockafellar, Lagrange multipliers and optimality, SIAM Rev. 35 (1993) 183–238. [37] A.M. Rubinov, X.X. Huang, X.Q. Yang, The zero duality gap property and lower semicontinuity of the perturbation function, Math. Oper. Res. 27 (2002) 775–791. [38] A.M. Rubinov, X.Q. Yang, Lagrange-type Functions in Constrained Non-Convex Optimization, Kluwer Academic Publishers, Dordrecht, 2003. [39] X.L. Sun, D. Li, K.I.M. 
McKinnon, On saddle points of augmented Lagrangians for constrained nonconvex optimization, SIAM J. Optim. 15 (2005) 1128– 1146. [40] P. Tseng, D.P. Bertsekas, On the convergence of the exponential multiplier method for convex programming, Math. Program. 60 (1993) 1–19. [41] Z.K. Xu, Local saddle points and convexification for nonconvex optimization problems, J. Optim. Theory Appl. 94 (1997) 739–746. [42] H. Yamashita, A globally convergent constrained quasi-Newton method with an augmented Lagrangian type penalty function, Math. Program. 23 (1982) 75–86.