Applied Mathematics and Computation 215 (2010) 3589–3598
A type of efficient feasible SQP algorithms for inequality constrained optimization

Zhong Jin (a,*), Yuqing Wang (b)
(a) Department of Mathematics, Tongji University, Shanghai 200092, PR China
(b) Department of Mathematics, Jiaxing College, Jiaxing 314001, PR China
Abstract: In this paper, motivated by the methods of Zhu et al. [Z.B. Zhu, K.C. Zhang, J.B. Jian, An improved SQP algorithm for inequality constrained optimization, Math. Meth. Oper. Res. 58 (2003) 271–282] and of Zhu and Jian [Zhibin Zhu, Jinbao Jian, An efficient feasible SQP algorithm for inequality constrained optimization, Nonlinear Anal. Real World Appl. 10 (2) (2009) 1220–1228], we propose a type of efficient feasible SQP algorithms for solving nonlinear inequality constrained optimization problems. By solving only one QP subproblem, posed over a subset of the constraints estimated as active, a class of revised feasible descent directions is generated per iteration. These methods are implementable and globally convergent, and we prove that the algorithms have a superlinear convergence rate under some mild conditions.

Keywords: Inequality constrained optimization; FSQP; KKT point; Global convergence; Superlinear convergence
1. Introduction

In this paper, we consider the following nonlinear inequality constrained optimization problem:
$$(P)\qquad \min\; f(x)\quad \text{s.t.}\quad g_j(x)\le 0,\; j\in I=\{1,2,\ldots,m\},\qquad(1)$$
where $x\in R^n$, and $f:R^n\to R$ and $g_j\;(j\in I):R^n\to R$ are assumed to be continuously differentiable. It is well known that the sequential quadratic programming (SQP) method is one of the most efficient methods for solving problem (P); because of its superlinear convergence rate, it has been widely studied [1–6]. At each iterate $x^k$, SQP solves the quadratic programming subproblem
$$\min\; \nabla f(x^k)^T d+\frac12 d^T B_k d\quad \text{s.t.}\quad g_j(x^k)+\nabla g_j(x^k)^T d\le 0,\; j\in I=\{1,2,\ldots,m\},\qquad(2)$$
where $B_k\in R^{n\times n}$ is a symmetric positive definite matrix. However, one drawback of this quadratic programming subproblem is that it may be inconsistent, i.e., the feasible region of (2) may be empty. Another drawback is the Maratos effect [7]: the unit stepsize may be rejected even when the iterates are close to the solution of (1). Furthermore, many practical problems arising from engineering design and real-time applications strictly require feasibility of the iterates, i.e., the iteration points must satisfy all or part of the constraints.
(This research is supported by the National Natural Science Foundation of China (No. 10771162). Corresponding author: Z. Jin, e-mail: [email protected].)
For these reasons, several interesting strategies have been proposed to overcome these shortcomings, generating a class of SQP algorithms known as feasible sequential quadratic programming (FSQP) methods [10,11,13]. The method in [10] uses a first-order feasible descent condition to ensure global convergence, and it may converge slowly if a poor initial point is chosen. From the viewpoint of computational cost, the main drawback of the FSQP algorithm pointed out in [13] is the need to solve three QPs (or two QPs and a linear least squares problem) at each iteration; for many problems it would therefore be desirable to reduce the number of QPs per iteration while preserving the generation of feasible iterates as well as the global and local convergence properties. Recently, in [8,15], the following QP subproblem of smaller scale was considered:
$$\min\; \nabla f(x^k)^T d+\frac12 d^T B_k d\quad \text{s.t.}\quad g_j(x^k)+\nabla g_j(x^k)^T d\le 0,\; j\in L_k,\qquad(3)$$
where $L_k$ is a subset of the constraints estimated as active, and whose solution is denoted by $d_0^k$. In these two papers, different feasible descent directions are generated per iteration by taking full advantage of the properties of $d_0^k$, and the resulting methods have good convergence properties.

In this paper, we generalize the methods of [8,15] to obtain a type of efficient feasible SQP algorithms in which a class of different feasible descent directions is generated; the two feasible descent directions of [8,15] are special cases of ours. Our methods are implementable and globally convergent. We also prove that the algorithms overcome the Maratos effect and have a superlinear convergence rate under some mild conditions. Finally, numerical experiments show that the methods in this paper are effective.

This paper is organized as follows. In the next section, we propose a type of generalized feasible SQP algorithms and prove their global convergence. In Section 3, we discuss suitable conditions under which the generalized algorithms are superlinearly convergent. In Section 4, we present a brief discussion and practical methods, whose global and superlinear convergence are analyzed. In Section 5, some numerical examples are given.

2. A type of efficient feasible SQP algorithms

For convenience, the number of elements of any set $L$ is denoted by $|L|$, and the feasible set of problem (P) is denoted by $X=\{x\in R^n\;|\;g_j(x)\le 0,\ j\in I\}$.

Assumptions:

A1 The feasible set is nonempty, i.e., $X=\{x\in R^n\;|\;g_j(x)\le 0,\ j\in I\}\ne\emptyset$.

A2 For any $x\in X$, the vectors $\{\nabla g_j(x),\ j\in I(x)\}$ are linearly independent, where $I(x)=\{j\in I\;|\;g_j(x)=0\}$.

Let $x^k\in X$ be a given iterate. Similarly to [8,15,14], we use the following pivoting operation to generate an approximate active set $L_k\subseteq I$ (an illustrative code sketch follows the statement of the operation).

Pivoting operation

Step 1. Let $i=0$, $\epsilon_{k,i}=\epsilon_0>0$.

Step 2. Set
$$L_{k,i}=\{j\in I\;|\;-\epsilon_{k,i}\le g_j(x^k)\le 0\},\qquad A_{k,i}=\nabla g_{L_{k,i}}(x^k).$$
If $\det(A_{k,i}^T A_{k,i})\ge \epsilon_{k,i}$, let $L_k=L_{k,i}$, $A_k=A_{k,i}$, $i_k=i$, and STOP.

Step 3. Set $i=i+1$, $\epsilon_{k,i}=\epsilon_{k,i-1}/2$, and go to Step 2.
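As an illustration, the following minimal NumPy sketch implements the pivoting operation; the function name and the `grad_g` callback are our own choices, not the paper's.

```python
import numpy as np

def pivoting_operation(x, g_vals, grad_g, eps0=0.2):
    """Estimate the approximate active set L_k at a feasible point x.

    g_vals : 1-D array with g_j(x) for j = 0..m-1 (feasibility: all <= 0).
    grad_g : callable, grad_g(j) returns the gradient of g_j at x.
    Returns (L, A, eps): the index list L_k, the matrix A_k whose columns
    are the gradients for j in L_k, and the accepted tolerance eps_{k,i_k}.
    """
    n = len(x)
    eps = eps0
    while True:
        # Step 2: constraints inside the working band [-eps, 0].
        L = [j for j, gj in enumerate(g_vals) if -eps <= gj <= 0.0]
        A = (np.column_stack([grad_g(j) for j in L])
             if L else np.zeros((n, 0)))
        # Accept once det(A^T A) >= eps, certifying (numerical) linear
        # independence of the selected gradients.  (NumPy's det of the
        # empty 0x0 matrix is 1, so an empty L_k is accepted trivially.)
        if np.linalg.det(A.T @ A) >= eps:
            return L, A, eps
        eps /= 2.0  # Step 3: shrink the band and try again
```

By Lemma 1 below, this loop terminates after finitely many halvings of `eps`.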
Lemma 1. For any iterate $k$, the index $i_k$ defined in Step 2 above is finite, i.e., the pivoting operation terminates in a finite number of steps. Moreover, if $\{x^k\}_{k\in K}\to x^*$, then there exists a constant $\hat\epsilon>0$ such that $\epsilon_{k,i_k}\ge\hat\epsilon$ for $k\in K$, $k$ large enough.

Proof. See Lemma 1 in [8], Theorem 2.1 in [15], or Lemma 2.8 in [14]. □

From the pivoting operation and Lemma 1, we know that the matrix $A_k$ has full column rank. Let $A_k^1$ be the matrix whose rows are $|L_k|$ linearly independent rows of $A_k$, and let $A_k^2$ be the matrix whose rows are the remaining $n-|L_k|$ rows of $A_k$, so that we may write
$$A_k=\begin{pmatrix}A_k^1\\ A_k^2\end{pmatrix}.$$
Partitioning $\nabla f(x^k)$ conformally with $A_k$, we likewise write
$$\nabla f(x^k)=\begin{pmatrix}\nabla f_1(x^k)\\ \nabla f_2(x^k)\end{pmatrix}.$$
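Since every iteration hinges on solving subproblem (3), here is a minimal sketch of one way to do it, assuming the cvxpy modeling package (our choice, not the paper's); the dual values supply the multipliers $u^k$ needed in Step 2 of Algorithm 2.1 below.

```python
import numpy as np
import cvxpy as cp

def solve_qp_subproblem(grad_f, B, g_L, A):
    """Solve QP subproblem (3):
         min  grad_f^T d + 0.5 d^T B_k d
         s.t. g_j(x^k) + grad g_j(x^k)^T d <= 0,  j in L_k.

    grad_f : gradient of f at x^k;  B : the SPD matrix B_k.
    g_L    : vector (g_j(x^k), j in L_k);  A : A_k, gradients as columns.
    Returns the solution d_0^k and the KKT multipliers u^k.
    """
    d = cp.Variable(len(grad_f))
    prob = cp.Problem(cp.Minimize(grad_f @ d + 0.5 * cp.quad_form(d, B)),
                      [g_L + A.T @ d <= 0])
    prob.solve()
    return d.value, prob.constraints[0].dual_value
```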
3591
Z. Jin, Y. Wang / Applied Mathematics and Computation 215 (2010) 3589–3598 k
It is well known that, in general, the solution $d_0^k$ of problem (3) is a descent direction but may not be a feasible direction for problem (P). In order to find a feasible descent direction, taking full advantage of the properties of $d_0^k$, we define a class of variants as follows:
$$s^k=-M_1(x^k)\big((A_k^1)^{-1}\big)^T e,\qquad(4)$$
$$d_1^k=\begin{pmatrix}s^k\\ 0\end{pmatrix},\qquad(5)$$
$$d^k=M_2(x^k)\,d_0^k+M_3(x^k)\,d_1^k,\qquad(6)$$
where $e=(1,1,\ldots,1)^T\in R^{|L_k|}$, and $M_1(x^k)$, $M_2(x^k)$ and $M_3(x^k)$ are three expressions in $x^k$ to be discussed later.

The next Lemma 2 tells us that, if $M_1(x^k)$, $M_2(x^k)$ and $M_3(x^k)$ satisfy the following two conditions, then $d^k$ is a feasible descent direction.

Condition 1: If $d_0^k\ne 0$, then $M_2(x^k)\ge 0$ and $M_1(x^k)M_3(x^k)>0$.

Condition 2: If $d_0^k\ne 0$, then $M_2(x^k)\nabla f(x^k)^T d_0^k-M_1(x^k)M_3(x^k)\big((A_k^1)^{-1}\nabla f_1(x^k)\big)^T e<0$.
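As an illustrative sketch (ours, not the paper's): the block $A_k^1$ can be selected by QR factorization with column pivoting of $A_k^T$, one standard way to pick $|L_k|$ linearly independent rows, after which the directions (4)–(6) are assembled directly.

```python
import numpy as np
from scipy.linalg import qr

def partition_rows(A):
    """Pick |L_k| linearly independent rows of A_k (the block A_k^1).

    Rows of A_k are the columns of A_k^T, so column-pivoted QR of A_k^T
    selects p = |L_k| independent rows; such rows exist because A_k has
    full column rank by Lemma 1.
    """
    p = A.shape[1]
    _, _, piv = qr(A.T, pivoting=True)
    return np.sort(piv[:p])              # indices of the rows of A_k^1

def feasible_descent_direction(d0, A, rows1, m1, m2, m3):
    """Assemble d^k from (4)-(6); m1, m2, m3 are the scalars M_i(x^k),
    which must satisfy Conditions 1 and 2 for d^k to be feasible descent."""
    n = A.shape[0]
    A1 = A[rows1, :]                     # p x p and invertible
    e = np.ones(len(rows1))
    s = -m1 * np.linalg.solve(A1.T, e)   # (4): s^k = -M_1 ((A_k^1)^{-1})^T e
    d1 = np.zeros(n)
    d1[rows1] = s                        # (5): d_1^k = (s^k; 0), written in
                                         # the partitioned coordinates
    return m2 * d0 + m3 * d1             # (6)
```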
Lemma 2. (I) If $d_0^k=0$, then $x^k$ is a KKT point of problem (1). (II) If $d_0^k\ne 0$ and $M_1(x^k)$, $M_2(x^k)$, $M_3(x^k)$ satisfy Conditions 1 and 2, then it holds that
$$\nabla f(x^k)^T d^k<0,\qquad \nabla g_j(x^k)^T d^k<0,\quad j\in I(x^k),\qquad(7)$$
i.e., $d^k$ is a feasible descent direction of (1) at $x^k$.

Proof. (I) This is evident from the definition of a KKT point.

(II) Suppose $d_0^k\ne 0$.

(i) Because $A_k^T d_1^k=(A_k^1)^T s^k=-M_1(x^k)e$ and $\nabla g_j(x^k)^T d_0^k\le 0$ for $j\in I(x^k)\subseteq L_k$, in view of Condition 1 we have
$$\nabla g_j(x^k)^T d^k=M_2(x^k)\nabla g_j(x^k)^T d_0^k+M_3(x^k)\nabla g_j(x^k)^T d_1^k\le M_3(x^k)\nabla g_j(x^k)^T d_1^k=-M_1(x^k)M_3(x^k)<0,\quad j\in I(x^k)\subseteq L_k,\qquad(8)$$
so $d^k$ is a feasible direction of (1) at $x^k$.

(ii) From Condition 2, we have
$$\nabla f(x^k)^T d^k=M_2(x^k)\nabla f(x^k)^T d_0^k+M_3(x^k)\nabla f(x^k)^T d_1^k=M_2(x^k)\nabla f(x^k)^T d_0^k+M_3(x^k)\nabla f_1(x^k)^T s^k=M_2(x^k)\nabla f(x^k)^T d_0^k-M_1(x^k)M_3(x^k)\big((A_k^1)^{-1}\nabla f_1(x^k)\big)^T e<0,\qquad(9)$$
so $d^k$ is also a descent direction of (1) at $x^k$. The lemma holds. □

We can see that the two feasible descent directions in [8,15] are special cases of ours. In [8],
$$M_1(x^k)=\frac{-\nabla f(x^k)^T d_0^k}{1+2\big|\big((A_k^1)^{-1}\nabla f_1(x^k)\big)^T e\big|},\qquad M_2(x^k)=M_3(x^k)=-\nabla f(x^k)^T d_0^k.$$
In [15],
$$M_1(x^k)=\frac{-\nabla f(x^k)^T d_0^k\,\|d_0^k\|^2}{1+2\big|e^T\big(\lambda^k+(A_k^1)^{-1}p_1^k\big)\big|\,\|d_0^k\|^2},\qquad M_2(x^k)=M_3(x^k)=1,$$
where $\lambda^k$ and $p_1^k$ are as defined in [15]. It is easy to verify that these two cases satisfy Conditions 1 and 2; a quick check for the choice from [8] is sketched below.
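As a worked verification (our own, for the choice from [8]): since $d_0^k\ne 0$ implies $\nabla f(x^k)^T d_0^k<0$, we have $M_2(x^k)=-\nabla f(x^k)^T d_0^k\ge 0$ and
$$M_1(x^k)M_3(x^k)=\frac{\big(\nabla f(x^k)^T d_0^k\big)^2}{1+2\big|\big((A_k^1)^{-1}\nabla f_1(x^k)\big)^T e\big|}>0,$$
so Condition 1 holds. Writing $t=\big((A_k^1)^{-1}\nabla f_1(x^k)\big)^T e$,
$$M_2(x^k)\nabla f(x^k)^T d_0^k-M_1(x^k)M_3(x^k)\,t=-\big(\nabla f(x^k)^T d_0^k\big)^2\left(1+\frac{t}{1+2|t|}\right)<0,$$
since $1+\frac{t}{1+2|t|}\ge 1-\frac{|t|}{1+2|t|}=\frac{1+|t|}{1+2|t|}>0$, so Condition 2 holds as well.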
3592
Z. Jin, Y. Wang / Applied Mathematics and Computation 215 (2010) 3589–3598
In [8,15], the Maratos effect is avoided by using a high-order corrected direction. In this paper, we obtain the high-order corrected direction by the following Sub-algorithm A.

Sub-algorithm A

Obtain $q^k$ by solving the following $|L_k|\times|L_k|$ system of linear equations:
$$(A_k^1)^T q=-\|d^k\|^s e-\hat g(x^k+d^k),\qquad(10)$$
where $\hat g(x^k+d^k)=\big(g_j(x^k+d^k)-g_j(x^k)-\nabla g_j(x^k)^T d^k,\ j\in L_k\big)$ and $e=(1,\ldots,1)^T\in R^{|L_k|}$. Define
$$\hat d^k=\begin{pmatrix}q^k\\ 0\end{pmatrix},$$
which satisfies
$$A_k^T\hat d^k=(A_k^1)^T q^k+(A_k^2)^T 0=(A_k^1)^T q^k.\qquad(11)$$
If $\|\hat d^k\|>\|d^k\|$, set $\hat d^k=0$.
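Under the same illustrative interfaces as above (names are ours), Sub-algorithm A can be sketched as follows.

```python
import numpy as np

def high_order_correction(x, d, A, rows1, g_L, s=2.5):
    """Sub-algorithm A: compute the high-order corrected direction.

    x, d  : current iterate x^k and feasible descent direction d^k.
    A     : A_k (gradients as columns); rows1 indexes the block A_k^1.
    g_L   : callable, g_L(y) = vector of (g_j(y), j in L_k).
    s     : the parameter s in (2, 3) of Algorithm 2.1.
    """
    n = len(x)
    e = np.ones(A.shape[1])
    # \hat g(x^k + d^k) = g(x^k + d^k) - g(x^k) - A_k^T d^k
    g_hat = g_L(x + d) - g_L(x) - A.T @ d
    A1 = A[rows1, :]
    # (10): (A_k^1)^T q = -||d^k||^s e - \hat g(x^k + d^k)
    q = np.linalg.solve(A1.T, -np.linalg.norm(d) ** s * e - g_hat)
    d_hat = np.zeros(n)
    d_hat[rows1] = q                      # \hat d^k = (q^k; 0)
    if np.linalg.norm(d_hat) > np.linalg.norm(d):
        d_hat[:] = 0.0                    # discard an overly large correction
    return d_hat
```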
Now the generalized algorithm for the solution of problem (P) can be stated as follows.

Algorithm 2.1

Step 0 (Initialization). Choose an initial feasible point $x^0\in R^n$. Given are constants $\epsilon_0>0$, $\delta>2$, $s\in(2,3)$, $a\in(0,\frac12)$, and a symmetric positive definite matrix $B_0\in R^{n\times n}$. Set $k=0$.

Step 1. Obtain an "active" constraint set $L_k$ by the pivoting operation.

Step 2. Obtain the KKT point pair $(d_0^k,u^k)$ by solving the QP subproblem (3). If $d_0^k\ne 0$, continue; otherwise, stop.

Step 3. Choose suitable $M_1(x^k)$, $M_2(x^k)$ and $M_3(x^k)$, and obtain the feasible descent direction $d^k$ by (6).

Step 4. Obtain the high-order corrected direction $\hat d^k$ by Sub-algorithm A.

Step 5 (Line search). Compute the largest $t_k\in\{1,\frac12,\frac1{2^2},\frac1{2^3},\ldots\}$ satisfying
$$f(x^k+t_k d^k+t_k^2\hat d^k)\le f(x^k)+a\,t_k\nabla f(x^k)^T d^k,\qquad(12)$$
$$g_j(x^k+t_k d^k+t_k^2\hat d^k)\le 0,\quad j\in I.\qquad(13)$$

Step 6 (Update). Obtain $B_{k+1}$ by updating the positive definite matrix $B_k$ using some quasi-Newton formula. Set $x^{k+1}=x^k+t_k d^k+t_k^2\hat d^k$, let $k=k+1$, and go to Step 1.

For the global convergence of the above generalized algorithm, we always assume that the following conditions hold.

A3 The functions $f(x)$ and $g_j(x)$, $j=1,2,\ldots,m$, are twice continuously differentiable.

A4 The sequence $\{x^k\}$ generated by Algorithm 2.1 is bounded, and there exist two constants $b\ge a>0$ such that the matrix sequence $\{B_k\}$ satisfies $a\|d\|^2\le d^T B_k d\le b\|d\|^2$ for all $k$ and all $d\in R^n$.

Before showing global convergence, we must ensure that it is possible to execute all the steps defined in Algorithm 2.1.

Lemma 3. The line search in Step 5 yields a stepsize $t_k=(\frac12)^i$ for some finite $i=i(k)$.

Proof. Using Lemma 2, the argument is similar to the proof of Proposition 3.3 in [12]. □

Lemma 1 shows that Step 1 terminates in a finite number of steps, and Lemma 3 shows the same for Step 5, so Algorithm 2.1 is well defined.
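Assembled from the sketches above, an illustrative driver for Algorithm 2.1 might read as follows; the callbacks `f`, `grad_f`, `g_all`, `grad_g`, the quasi-Newton update `update_B`, and the Step 3 rule `choose_M` are all assumptions of this sketch, to be supplied by the user.

```python
import numpy as np

def algorithm_2_1(x0, f, grad_f, g_all, grad_g, update_B, choose_M,
                  eps0=0.2, s=2.5, a=0.25, tol=1e-6, max_iter=500):
    """Driver loop for Algorithm 2.1, built on the sketches above.

    g_all(x) returns the array (g_1(x), ..., g_m(x)); grad_g(x, j) the
    gradient of g_j at x; update_B(B, x_old, x_new) a quasi-Newton update
    keeping B positive definite; choose_M(x, d0, A, rows1) the scalars
    (M_1, M_2, M_3) for Step 3.
    """
    x = np.asarray(x0, dtype=float)
    B = np.eye(len(x))                   # B_0 = I, as in the experiments
    for _ in range(max_iter):
        gx = np.asarray(g_all(x))
        # Step 1: approximate active set via the pivoting operation.
        L, A, _ = pivoting_operation(x, gx, lambda j: grad_g(x, j), eps0)
        # Step 2: solve QP subproblem (3); stop at an (approximate) KKT point.
        d0, u = solve_qp_subproblem(grad_f(x), B, gx[L], A)
        if np.linalg.norm(d0) < tol:
            return x
        # Steps 3-4: feasible descent direction and high-order correction.
        rows1 = partition_rows(A)
        m1, m2, m3 = choose_M(x, d0, A, rows1)
        d = feasible_descent_direction(d0, A, rows1, m1, m2, m3)
        d_hat = high_order_correction(
            x, d, A, rows1, lambda y: np.asarray(g_all(y))[L], s)
        # Step 5: feasible Armijo line search, conditions (12)-(13).
        t = 1.0
        while (f(x + t * d + t**2 * d_hat)
               > f(x) + a * t * (grad_f(x) @ d)
               or np.any(np.asarray(g_all(x + t * d + t**2 * d_hat)) > 0)):
            t *= 0.5
        # Step 6: update the iterate and the quasi-Newton matrix.
        x_new = x + t * d + t**2 * d_hat
        B = update_B(B, x, x_new)
        x = x_new
    return x
```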
In what follows, if Algorithm 2.1 generates an infinite sequence $\{x^k\}$, then by A4 we may assume that there exists a subsequence $K$ such that
$$x^k\to x^*,\quad B_k\to B^*,\quad d_0^k\to d_0^*,\quad d_1^k\to d_1^*,\quad d^k\to d^*,\quad L_k\equiv L,\quad k\in K.\qquad(14)$$
Theorem 1. The algorithm either stops at a KKT point $x^k$ of problem (1) in a finite number of steps, or generates an infinite sequence $\{x^k\}$ any accumulation point $x^*$ of which is a KKT point of problem (1).

Proof. By Lemma 2, the first statement is easy to show, since the only stopping point is in Step 2 when $d_0^k=0$ for some $k$. Thus, assume that Algorithm 2.1 generates an infinite sequence $\{x^k\}$ and that (14) holds. Obviously, it is only necessary to prove that $d_0^*=0$. Suppose by contradiction that $d_0^*\ne 0$. Then Conditions 1 and 2 also hold at $x^*$, and by imitating the analysis of Lemma 2 we have
$$\nabla f(x^*)^T d^*<0$$
and
$$\nabla g_j(x^*)^T d^*<0,\quad j\in I(x^*).$$
Thereby, it is easy to see that the stepsizes $t_k$ obtained in Step 5 are bounded away from zero on $K$, i.e.,
$$t_k\ge t^*=\inf\{t_k,\ k\in K\}>0,\quad k\in K.\qquad(15)$$
In view of (12) and Lemma 2, we know that $\{f(x^k)\}$ is monotonically decreasing, and it holds that
$$f(x^k)\to f(x^*),\quad k\to\infty.\qquad(16)$$
So we have
$$0=\lim_{k\in K,\,k\to\infty}\big(f(x^{k+1})-f(x^k)\big)\le \lim_{k\in K,\,k\to\infty} a\,t_k\nabla f(x^k)^T d^k\le \frac12\,a\,t^*\nabla f(x^*)^T d^*<0.\qquad(17)$$
This is a contradiction, which shows that $d_0^*=0$. Thus $x^*$ is a KKT point of problem (1). □

3. The convergence rate of the generalized algorithm

In this section we prove that, under mild conditions, the generalized algorithm described in the preceding section has a superlinear convergence rate. For this purpose, we strengthen Conditions 1 and 2, replacing them by the following.

Condition 1′: If $d_0^k\ne 0$, then
$$M_2(x^k)\ge 0,\qquad M_1(x^k)M_3(x^k)>0,$$
and
$$F(x^k)=1-M_2(x^k)=O(\|d_0^k\|^2),\qquad M_1(x^k)M_3(x^k)=O(\|d_0^k\|^3).$$

Condition 2′: If $d_0^k\ne 0$, then
$$M_2(x^k)\nabla f(x^k)^T d_0^k-M_1(x^k)M_3(x^k)\big((A_k^1)^{-1}\nabla f_1(x^k)\big)^T e\le m\,\nabla f(x^k)^T d_0^k<0,$$
where $m$ is a positive constant.

In addition, we add the following assumptions.

A5 The strong second-order sufficiency conditions are satisfied at the KKT point $x^*$ with the corresponding multiplier vector $u^*$.

A6 $B_k\to B^*$, $k\to\infty$. Moreover, $B^*$ is positive definite on the subspace $H(x^*)=\{d\in R^n\;|\;\nabla g_j(x^*)^T d=0,\ j\in I_+=\{j\in I\;|\;u_j^*>0\}\}$.

The first task is to show that the entire sequence $\{x^k\}$ converges to $x^*$; this is the object of Theorem 2.

Lemma 4. Under Condition 1′, there exists some $v^k\in R^n$ such that
$$d^k=d_0^k+v^k\quad\text{and}\quad \|v^k\|=O(\|d_0^k\|^3).\qquad(18)$$
Proof. By the definition of $s^k$, we know $M_3(x^k)s^k=-M_1(x^k)M_3(x^k)\big((A_k^1)^{-1}\big)^T e$, so from Condition 1′ and (5) it holds that
$$\|M_3(x^k)d_1^k\|=\left\|\begin{pmatrix}M_3(x^k)s^k\\ 0\end{pmatrix}\right\|=O(\|d_0^k\|^3).\qquad(19)$$
So, by the definition of $d^k$, we have
$$d^k=M_2(x^k)d_0^k+M_3(x^k)d_1^k=d_0^k-F(x^k)d_0^k+M_3(x^k)d_1^k.\qquad(20)$$
We set $v^k=-F(x^k)d_0^k+M_3(x^k)d_1^k$; then the lemma holds. □
Theorem 2. The entire sequence $\{x^k\}$ converges to $x^*$, i.e., $x^k\to x^*$, $k\to\infty$.

Proof. In view of (12) and Lemma 2, we know that $\{f(x^k)\}$ is monotonically decreasing and $f(x^k)\to f(x^*)$, $k\to\infty$. From Condition 2′, we have
$$0=\lim_{k\to\infty}\big(f(x^{k+1})-f(x^k)\big)\le \lim_{k\to\infty} a\,t_k\nabla f(x^k)^T d^k\le \lim_{k\to\infty} a\,m\,t_k\nabla f(x^k)^T d_0^k\le 0,\qquad(21)$$
so (by A4 and $t_k\le 1$)
$$\lim_{k\to\infty} t_k\|d_0^k\|=0.\qquad(22)$$
From Lemma 4, we also have
$$\lim_{k\to\infty} t_k\|d^k\|=0.\qquad(23)$$
Since
$$0\le \lim_{k\to\infty}\|x^{k+1}-x^k\|=\lim_{k\to\infty}\|t_k d^k+t_k^2\hat d^k\|\le 2\lim_{k\to\infty} t_k\|d^k\|=0,$$
it follows that
$$\lim_{k\to\infty}\|x^{k+1}-x^k\|=0.$$
According to A5 and Proposition 4.1 in [10], we get $\lim_{k\to\infty}x^k=x^*$. □

In order to establish the main results, we now need four lemmas.

Lemma 5. For $k$ large enough, it holds that
$$\lim_{k\to\infty}d_0^k=0,\qquad \lim_{k\to\infty}d^k=0,\qquad I_+\subseteq J_k\subseteq I(x^*)\subseteq L_k,$$
where $J_k=\{j\in L_k\;|\;g_j(x^k)+\nabla g_j(x^k)^T d_0^k=0\}$.
Proof. Since $\lim_{k\to\infty}x^k=x^*$, according to the proof of Theorem 1 it is clear that $\lim_{k\to\infty}d_0^k=0$, which with Lemma 4 implies that $\lim_{k\to\infty}d^k=0$. As $\lim_{k\to\infty}d_0^k=0$, by the definitions of $I_+$, $J_k$, $I(x^*)$ and $L_k$, for $k$ large enough it is easy to see that $I_+\subseteq J_k\subseteq I(x^*)\subseteq L_k$. □

Lemma 6. The direction $\hat d^k$ obtained in Step 4 satisfies $\|\hat d^k\|=O(\|d^k\|^2)$.

Proof. By definition, we have
$$\hat g_j(x^k+d^k)=g_j(x^k+d^k)-g_j(x^k)-\nabla g_j(x^k)^T d^k=O(\|d^k\|^2),\quad j\in L_k,$$
i.e., $\|\hat g(x^k+d^k)\|=O(\|d^k\|^2)$. It follows from (11) that
$$A_k^T\hat d^k=(A_k^1)^T q^k=-\|d^k\|^s e-\hat g(x^k+d^k),\qquad(24)$$
and since $A_k^1$ is invertible and $s\in(2,3)$, it is true that $\|\hat d^k\|=O(\|d^k\|^2)$. □
X
ukj g j ðxk Þ
6 r
j2Iþ
X
!12 g 2j ðxk Þ
:
j2Iþ
Proof. For k large enough we have, from the convergence of the multipliers,
X j2Iþ
ukj g j ðxk Þ 6
X X 1 1 g j ðxk Þ 6 minfuj s:t: j 2 Iþ g g 2j ðxk Þ minfuj s:t: j 2 Iþ g 2 2 j2I j2I þ
!12 :
ð25Þ
þ
We set r ¼ 12 minfuj s:t: j 2 Iþ g, which is positive due to the definition of Iþ . h A crucial requirement for achieving superlinear convergence is that a unit stepsize be used in a neighbourhood of the solution.The next proposition shows that the generalized algorithm does achieve this goal if the following assumption is satisfied. A7 For the symmetric matrix sequence fBk g, it satisfies that k
k
kPk ðBk r2xx Lðxk ; uk ÞÞd k ¼ oðkd kÞ;
3595
Z. Jin, Y. Wang / Applied Mathematics and Computation 215 (2010) 3589–3598
where
Pk ¼ En Rk ðRTk Rk Þ1 RTk ; Rk ¼ ðrg j ðxk Þ; j 2 Iþ Þ; and
r2xx Lðxk ; uk Þ ¼ r2 f ðxk Þ þ
X
ukj r2xx g j ðxk Þ:
j2Iþ
Lemma 8. Under assumption A7, for k large enough, it holds that
0 !12 1 X 1 k T 2 k k k k 2 k g j ðx Þ A þ oðkd k2 Þ: ðd Þ ðrxx Lðx ; u Þ Bk Þd ¼ o@ 2 j2I þ
Proof. Denote R ¼ ðrg j ðx Þ; j 2 Iþ Þ and P ¼ En R ðRT R Þ1 RT , then Rk ! R and Pk ! P . Let k
k
k
yk ¼ R ðRT R Þ1 RT d ;
d ¼ P d þ yk ; so we have
k
k
yk ¼ R ðRT R Þ1 ðR Rk ÞT d þ R ðRT R Þ1 RTk d : From Lemma 4, it holds that k
k
k
k
g j ðxk Þ þ rg j ðxk ÞT d ¼ g j ðxk Þ þ rg j ðxk ÞT d0 þ Oðkd k3 Þ ¼ Oðkd k3 Þ;
j 2 Jk ;
by Iþ # J k , it implies that
0
X
k
kRTk d k ¼ O@
!12 1 k g 2 ðxk Þ A þ Oðkd k3 Þ: j
j2Iþ
With Rk ! R , so we get
0
k
kyk k ¼ oðkd kÞ þ O@
X
!12 1 g 2 ðxk Þ A: j
j2Iþ
Then by A7, we have
0 !12 1 X 1 k T 2 1 k k k k T 2 2 g 2j ðxk Þ A: ðd Þ ðrxx Lðxk ; uk Þ Bk Þd ¼ ððd Þ P þ yTk Þðrxx Lðxk ; uk Þ Bk Þd ¼ oðkd k Þ þ o@ 2 2 j2I
ð26Þ
þ
Proposition 1. For $k$ large enough, $t_k\equiv 1$.

Proof. It is only necessary to prove that, for $k$ large enough,
$$f(x^k+d^k+\hat d^k)\le f(x^k)+a\nabla f(x^k)^T d^k,\qquad(27)$$
$$g_j(x^k+d^k+\hat d^k)\le 0,\quad j\in I.\qquad(28)$$

(i) Firstly, we prove that (28) holds for $k$ large enough. For $j\in I\setminus I(x^*)$, from the facts that $g_j(x^*)<0$, $x^k\to x^*$, $d^k\to 0$ and $\hat d^k\to 0$ as $k\to\infty$, it is clear that $g_j(x^k+d^k+\hat d^k)\le 0$ for $k$ large enough. For $j\in I(x^*)\subseteq L_k$, it follows from (24) that
$$\nabla g_j(x^k)^T\hat d^k=-\|d^k\|^s-g_j(x^k+d^k)+g_j(x^k)+\nabla g_j(x^k)^T d^k,\quad j\in L_k.\qquad(29)$$
Lemma 4 implies that
$$g_j(x^k)+\nabla g_j(x^k)^T d^k=g_j(x^k)+\nabla g_j(x^k)^T d_0^k+O(\|d^k\|^3).\qquad(30)$$
Expanding $g_j(x^k+d^k+\hat d^k)$ around $x^k+d^k$, by Lemma 6, (29) and (30), we get
$$g_j(x^k+d^k+\hat d^k)=g_j(x^k+d^k)+\nabla g_j(x^k+d^k)^T\hat d^k+O(\|\hat d^k\|^2)=g_j(x^k+d^k)+\nabla g_j(x^k)^T\hat d^k+O(\|d^k\|^3)$$
$$=-\|d^k\|^s+g_j(x^k)+\nabla g_j(x^k)^T d^k+O(\|d^k\|^3)=-\|d^k\|^s+g_j(x^k)+\nabla g_j(x^k)^T d_0^k+O(\|d^k\|^3)\qquad(31)$$
$$\le -\|d^k\|^s+O(\|d^k\|^3),\quad j\in I(x^*)\subseteq L_k.\qquad(32)$$
In view of $s\in(2,3)$, the term $-\|d^k\|^s$ dominates $O(\|d^k\|^3)$ as $d^k\to 0$, so (28) holds for all $j\in I$.
(ii) Secondly, we prove that (27) holds for $k$ large enough. Denote
$$S_k=f(x^k+d^k+\hat d^k)-f(x^k)-a\nabla f(x^k)^T d^k=\nabla f(x^k)^T(d^k+\hat d^k)+\frac12(d^k)^T\nabla^2 f(x^k)d^k-a\nabla f(x^k)^T d^k+o(\|d^k\|^2).\qquad(33)$$
From the KKT conditions of (3), Lemmas 4 and 6, and $u_j^k=0$ for $j\in L_k\setminus J_k$, we have
$$\nabla f(x^k)^T d^k=-(d^k)^T B_k d^k-\sum_{j\in J_k}u_j^k\nabla g_j(x^k)^T d^k+o(\|d^k\|^2)=-(d^k)^T B_k d^k+\sum_{j\in J_k}u_j^k g_j(x^k)+o(\|d^k\|^2),\qquad(34)$$
$$\nabla f(x^k)^T(d^k+\hat d^k)=-(d^k)^T B_k d^k-\sum_{j\in J_k}u_j^k\nabla g_j(x^k)^T(d^k+\hat d^k)+o(\|d^k\|^2).\qquad(35)$$
From $s\in(2,3)$, (31) and $g_j(x^k)+\nabla g_j(x^k)^T d_0^k=0$, $j\in J_k$, we get $g_j(x^k+d^k+\hat d^k)=o(\|d^k\|^2)$, $j\in J_k$; thus
$$g_j(x^k)+\nabla g_j(x^k)^T d^k+\nabla g_j(x^k)^T\hat d^k+\frac12(d^k)^T\nabla_{xx}^2 g_j(x^k)d^k=o(\|d^k\|^2),\quad j\in J_k,\qquad(36)$$
and therefore
$$\sum_{j\in J_k}u_j^k\nabla g_j(x^k)^T(d^k+\hat d^k)=-\sum_{j\in J_k}u_j^k g_j(x^k)-\frac12(d^k)^T\left(\sum_{j\in J_k}u_j^k\nabla_{xx}^2 g_j(x^k)\right)d^k+o(\|d^k\|^2).\qquad(37)$$
By substituting (34), (35) and (37) into (33), together with Lemma 8, it is easy to get
$$S_k=\Big(a-\frac12\Big)(d^k)^T B_k d^k+\frac12(d^k)^T\big(\nabla_{xx}^2 L(x^k,u^k)-B_k\big)d^k+\frac12(d^k)^T\left(\sum_{j\in J_k\setminus I_+}u_j^k\nabla_{xx}^2 g_j(x^k)\right)d^k$$
$$\qquad+(1-a)\sum_{j\in I_+}u_j^k g_j(x^k)+(1-a)\sum_{j\in J_k\setminus I_+}u_j^k g_j(x^k)+o(\|d^k\|^2)$$
$$\le\Big(a-\frac12\Big)a\|d^k\|^2+o(\|d^k\|^2)+o\left(\left(\sum_{j\in I_+}g_j^2(x^k)\right)^{\frac12}\right)+\frac12(d^k)^T\left(\sum_{j\in J_k\setminus I_+}u_j^k\nabla_{xx}^2 g_j(x^k)\right)d^k$$
$$\qquad+(1-a)\sum_{j\in I_+}u_j^k g_j(x^k)+(1-a)\sum_{j\in J_k\setminus I_+}u_j^k g_j(x^k).\qquad(38)$$
Since $a\in(0,\frac12)$:

(1) From Lemma 7, we get
$$(1-a)\sum_{j\in I_+}u_j^k g_j(x^k)+o\left(\left(\sum_{j\in I_+}g_j^2(x^k)\right)^{\frac12}\right)\le 0.$$

(2) By $u_j^k\to 0$, $j\in J_k\setminus I_+$, we have
$$\frac12(d^k)^T\left(\sum_{j\in J_k\setminus I_+}u_j^k\nabla_{xx}^2 g_j(x^k)\right)d^k=o(\|d^k\|^2).$$

(3) As $u_j^k\ge 0$ and $g_j(x^k)\le 0$, $j\in I$, we obtain
$$(1-a)\sum_{j\in J_k\setminus I_+}u_j^k g_j(x^k)\le 0.$$

So, for $k$ large enough, it holds that
$$S_k\le\Big(a-\frac12\Big)a\|d^k\|^2+o(\|d^k\|^2)\le 0,$$
i.e., (27) is satisfied, which shows that the unit stepsize brings a sufficient decrease in $f$. □

Moreover, following the approach of Theorem 5.2 in [9], we may obtain the following theorem.

Theorem 3. If Conditions 1′ and 2′ hold true, then under all the stated assumptions the generalized algorithm is superlinearly convergent, i.e., the sequence $\{x^k\}$ generated by Algorithm 2.1 satisfies $\|x^{k+1}-x^*\|=o(\|x^k-x^*\|)$.
Remark. Our analysis avoids the drawback of requiring the strict complementarity condition; the same result can also be obtained, after some suitable changes to the assumptions of this section, if the second-order sufficiency conditions with strict complementary slackness are satisfied at the KKT point $x^*$ with the corresponding multiplier vector $u^*$.
4. A brief discussion and practical methods

We now discuss the reasons why the methods in [8,15] have good convergence properties. In [15], it is easy to see that Conditions 1′ and 2′ hold true, where
$$M_1(x^k)=\frac{-\nabla f(x^k)^T d_0^k\,\|d_0^k\|^2}{1+2\big|e^T\big(\lambda^k+(A_k^1)^{-1}p_1^k\big)\big|\,\|d_0^k\|^2},\qquad M_2(x^k)=M_3(x^k)=1.$$
So the theoretical analysis of the preceding sections shows that the method in [15] has global and superlinear convergence. In [8],
$$M_1(x^k)=\frac{-\nabla f(x^k)^T d_0^k}{1+2\big|\big((A_k^1)^{-1}\nabla f_1(x^k)\big)^T e\big|},\qquad M_2(x^k)=M_3(x^k)=-\nabla f(x^k)^T d_0^k;$$
we can see that Conditions 1 and 2 hold, while Conditions 1′ and 2′ do not. But in the method of [8], with some suitable strategies, it is verified that for $k$ large enough it is not necessary to compute the feasible descent direction $d^k=M_2(x^k)d_0^k+M_3(x^k)d_1^k$, because $x^{k+1}=x^k+d_0^k+\hat d^k$ already satisfies the requirements of the next iteration. Thereby, for $k$ large enough, the unit stepsize is used naturally, and the method in [8] also has good convergence properties.

From the above analysis, the easier way to obtain a practical method is to choose suitable $M_1(x^k)$, $M_2(x^k)$ and $M_3(x^k)$ satisfying Conditions 1′ and 2′. Thus a type of practical methods can be proposed; for example, we set:
$$M_1(x^k)=\frac{\|d_0^k\|^2}{1+3\big|\big((A_k^1)^{-1}\nabla f_1(x^k)\big)^T e\big|},\qquad M_2(x^k)=1+\|d_0^k\|^2,\qquad M_3(x^k)=-\nabla f(x^k)^T d_0^k.$$
In this case, it follows that:
(1)
$$M_2(x^k)\ge 0,\qquad F(x^k)=1-M_2(x^k)=-\|d_0^k\|^2=O(\|d_0^k\|^2),$$
and
$$M_1(x^k)M_3(x^k)>0,\qquad M_1(x^k)M_3(x^k)=O(\|d_0^k\|^3);$$
(2)
$$M_2(x^k)\nabla f(x^k)^T d_0^k-M_1(x^k)M_3(x^k)\big((A_k^1)^{-1}\nabla f_1(x^k)\big)^T e=\nabla f(x^k)^T d_0^k+\left(1+\frac{\big((A_k^1)^{-1}\nabla f_1(x^k)\big)^T e}{1+3\big|\big((A_k^1)^{-1}\nabla f_1(x^k)\big)^T e\big|}\right)\|d_0^k\|^2\,\nabla f(x^k)^T d_0^k\le\nabla f(x^k)^T d_0^k.$$
So Conditions 1′ and 2′ are satisfied, which shows that global and superlinear convergence follow. A code sketch of this choice is given below.
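A minimal sketch of the practical choice above, in the illustrative interfaces of Section 2 (names are ours):

```python
import numpy as np

def practical_M(grad_f_x, d0, A, rows1):
    """The practical choice of M_1(x^k), M_2(x^k), M_3(x^k) proposed above.

    grad_f_x : gradient of f at x^k; its rows1 components play the role
               of grad f_1(x^k) in the partition of Section 2.
    d0       : solution d_0^k of QP (3); A : A_k; rows1 indexes A_k^1.
    """
    A1 = A[rows1, :]
    e = np.ones(A1.shape[0])
    # t = ((A_k^1)^{-1} grad f_1(x^k))^T e
    t = np.linalg.solve(A1, grad_f_x[rows1]) @ e
    nd2 = np.linalg.norm(d0) ** 2
    M1 = nd2 / (1.0 + 3.0 * abs(t))
    M2 = 1.0 + nd2
    M3 = -(grad_f_x @ d0)            # > 0 whenever d_0^k != 0
    return M1, M2, M3
```

This can be handed to the driver sketched in Section 2 as, e.g., `choose_M = lambda x, d0, A, rows1: practical_M(grad_f(x), d0, A, rows1)`.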
5. Numerical experiments

In this section, we report numerical experiments with the practical method described in the previous section on a set of problems from [16], where a feasible initial point is provided for each problem.
Table 1. Details of the numerical results.

No.  | NIT | n | m | $\|d_0^k\|$             | FV
-----|-----|---|---|-------------------------|-------------------------
 12  | 18  | 2 | 1 | 2.354932418798342e-007  | -2.999999999999995e+001
 24  | 12  | 2 | 5 | 1.627076093917421e-007  | -9.999997438604772e-001
 29  | 16  | 3 | 1 | 4.842626518282314e-007  | -2.262741699794893e+001
 30  | 14  | 3 | 7 | 8.092551071935648e-007  |  1.000000000002633e+000
 31  | 12  | 3 | 7 | 7.558956505926029e-008  |  6.000000006712304e+000
 33  | 44  | 3 | 6 | 2.689843527261163e-009  | -4.585786409126588e+000
 43  | 24  | 4 | 3 | 1.638297470240726e-007  | -4.399999999999994e+001
 76  |  9  | 4 | 7 | 7.123859887358703e-009  | -4.681818181818160e+000
 100 | 42  | 7 | 4 | 4.747608437048903e-007  |  6.806300573744028e+002
(1) During the numerical experiments, $B_k$ was updated by the BFGS formula. The stopping criterion of Step 2 was changed to $\|d_0^k\|<10^{-6}$. The algorithm parameters were set as follows: $B_0=I$, the $n\times n$ identity matrix, $\epsilon_0=0.2$, $\delta=3$, $s=2.5$ and $a=0.25$.

(2) In Table 1, for each test problem, No. is the problem number in [16], n the number of variables, m the number of inequality constraints, NIT the number of iterations, and FV the final value of the objective function.

From the above examples and the results summarized in Table 1, we can see that this particular instance of our generalized algorithm performs well, which suggests that the whole class of methods generated in this way is effective.

Acknowledgments

The authors would like to thank the anonymous referees for their careful reading and helpful comments and suggestions, which led to an improved version of the original paper.

References

[1] P.T. Boggs, J.W. Tolle, P. Wang, On the local convergence of quasi-Newton methods for constrained optimization, SIAM Journal on Control and Optimization 20 (1982) 161–171.
[2] P.T. Boggs, J.W. Tolle, Sequential quadratic programming, Acta Numerica (1995) 1–51.
[3] S.P. Han, Superlinearly convergent variable metric algorithms for general nonlinear programming problems, Mathematical Programming 11 (1976) 263–282.
[4] M.J.D. Powell, A fast algorithm for nonlinearly constrained optimization calculations, in: G.A. Watson (Ed.), Numerical Analysis, Springer-Verlag, Berlin, 1978, pp. 144–157.
[5] M.J.D. Powell, Variable metric methods for constrained optimization, in: A. Bachem et al. (Eds.), Mathematical Programming – The State of the Art, Springer-Verlag, Berlin, 1982.
[6] M.J.D. Powell, Y. Yuan, A recursive quadratic programming algorithm that uses differentiable exact penalty functions, Mathematical Programming 35 (1986) 265–278.
[7] N. Maratos, Exact penalty function algorithms for finite dimensional and control optimization problems, Ph.D. Thesis, Imperial College of Science and Technology, University of London, 1978.
[8] Z.B. Zhu, K.C. Zhang, J.B. Jian, An improved SQP algorithm for inequality constrained optimization, Mathematical Methods of Operations Research 58 (2003) 271–282.
[9] F. Facchinei, S. Lucidi, Quadratically and superlinearly convergent algorithms for the solution of inequality constrained minimization problems, Journal of Optimization Theory and Applications 85 (1995) 265–289.
[10] E.R. Panier, A.L. Tits, A superlinearly convergent feasible method for the solution of inequality constrained optimization problems, SIAM Journal on Control and Optimization 25 (1987) 934–950.
[11] E.R. Panier, A.L. Tits, On combining feasibility, descent and superlinear convergence in inequality constrained optimization, Mathematical Programming 59 (1993) 261–276.
[12] E.R. Panier, A.L. Tits, J.N. Herskovits, A QP-free, globally convergent, locally superlinearly convergent algorithm for inequality constrained optimization, SIAM Journal on Control and Optimization 26 (1988) 788–811.
[13] C.T. Lawrence, A.L. Tits, A computationally efficient feasible sequential quadratic programming algorithm, SIAM Journal on Optimization 11 (4) (2001) 1092–1118.
[14] Z.-Y. Gao, G.-P. He, F. Wu, A method of sequential systems of linear equations with arbitrary initial point, Science in China (Series A) 27 (1) (1997) 24–33.
[15] Zhibin Zhu, Jinbao Jian, An efficient feasible SQP algorithm for inequality constrained optimization, Nonlinear Analysis: Real World Applications 10 (2) (2009) 1220–1228.
[16] W. Hock, K. Schittkowski, Test Examples for Nonlinear Programming Codes, Lecture Notes in Economics and Mathematical Systems, vol. 187, Springer-Verlag, Berlin, 1981.