A modified SQP method and its global convergence

Jianren Che a,*, Ke Su b

a Department of Surveying and Geoinformatics, Tongji University, Shanghai 200092, PR China
b Department of Mathematics, Tongji University, Shanghai 200092, PR China

Applied Mathematics and Computation 186 (2007) 945–951
www.elsevier.com/locate/amc

Abstract

In this paper, a modified SQP method is presented. The algorithm starts from an arbitrary initial point and can overcome the Maratos effect. Moreover, it avoids choosing penalty parameters. Under some reasonable conditions, global convergence is shown.
© 2006 Elsevier Inc. All rights reserved.

Keywords: Constrained optimization; KKT point; Sequential quadratic programming; Global convergence

1. Introduction

In this paper, we are concerned with the minimization of a nonlinear function subject to nonlinear constraints, which can be stated as

(P)   min  f(x)
      s.t. g_j(x) ≤ 0,  j ∈ I = {1, 2, ..., m},    (1)

where x ∈ R^n, f : R^n → R and g_j : R^n → R (j ∈ I) are assumed to be twice continuously differentiable. The nonlinear programming problem (P), which arises frequently in science, engineering and many areas of society, is extremely important, and a large body of literature exists on it. Among the various methods for attacking problem (P), the sequential quadratic programming (SQP) method is one of the most efficient. Because of its superlinear convergence rate, it has been widely studied [1–5]. At each iteration, SQP solves the quadratic subproblem

      min  ∇f(x_k)^T d + (1/2) d^T H_k d
      s.t. g_j(x_k) + ∇g_j(x_k)^T d ≤ 0,  j ∈ I,    (2)

where H_k ∈ R^{n×n} is a symmetric positive definite matrix.

Corresponding author. E-mail address: [email protected] (J. Che).

0096-3003/$ - see front matter © 2006 Elsevier Inc. All rights reserved. doi:10.1016/j.amc.2006.08.034

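As a concrete illustration, subproblem (2) can be solved with off-the-shelf tools. The sketch below is our own illustration, not code from the paper; it feeds the quadratic objective and the linearized constraints to SciPy's general nonlinear solver, whereas in practice a dedicated QP solver would normally be used.

```python
import numpy as np
from scipy.optimize import minimize

def sqp_subproblem(grad_f, H, g_vals, G):
    """Solve subproblem (2): min grad_f^T d + 0.5 d^T H d
       s.t. g_j(x_k) + grad g_j(x_k)^T d <= 0,
       where row j of G holds grad g_j(x_k)^T and g_vals[j] = g_j(x_k)."""
    n = len(grad_f)
    obj = lambda d: grad_f @ d + 0.5 * d @ H @ d
    # SciPy expects inequality constraints in the form c(d) >= 0
    cons = [{"type": "ineq", "fun": lambda d: -(g_vals + G @ d)}]
    res = minimize(obj, np.zeros(n), method="SLSQP", constraints=cons)
    return res.x
```

For instance, with n = 1, ∇f(x_k) = (−1), H_k = (1) and the single linearized constraint −0.5 + d ≤ 0, the unconstrained minimizer d = 1 is cut back to d = 0.5 by the constraint.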


However, many theoretical and practical difficulties remain of active interest. In particular, if subproblem (2) is infeasible, or the solutions of the sequential quadratic subproblems are unbounded, the SQP method fails or generates a divergent sequence. For these reasons, Burke and Han [6], Zhou [7], Zhang and Zhang [8] and Zhu and Zhang [9] each modified the quadratic subproblem to ensure that their methods are globally convergent. However, Burke and Han's method is only conceptual and cannot be implemented practically. Zhou's method is based on exact line search. Zhang's method focuses on inexact line search, but it is difficult to choose the penalty parameter of the penalty function used as a merit function. In this paper, a modified SQP method is proposed based on the subproblem of [7]. The proposed algorithm places no requirement on the initial point and needs to solve only one QP subproblem per iteration; it overcomes the Maratos effect by solving an additional system of linear equations. Moreover, it avoids using a penalty function. Under some reasonable conditions, we prove its global convergence. The notation of this paper is standard:

(1) f′(x; d) = lim_{λ↓0} (f(x + λd) − f(x))/λ;
(2) g′(x) is the Fréchet derivative of g at x;
(3) ‖x‖_∞ = max{|x_j| : j = 1, 2, ..., n};
(4) I = {1, 2, ..., m}.

Furthermore, for convenience, we list some notation and lemmas as follows. Define the functions Φ(x), Ψ(x) by

Φ(x) = max{0, g_j(x) : j ∈ I},    (3)
Ψ(x) = max{g_j(x) : j ∈ I}.       (4)

For any x, d ∈ R^n, let Ψ*(x; d) be the first-order approximation of Ψ(x + d), namely

Ψ*(x; d) = max{g_j(x) + ∇g_j(x)^T d : j ∈ I}.    (5)

For any r > 0, the functions W(x, r), W0(x, r) : R^n × R_+ → R are defined as follows:

W(x, r) = min{Ψ*(x; d) : ‖d‖ ≤ r},    (6)
W0(x, r) = max{W(x, r), 0}.           (7)

Remark 1. Problem (6) is equivalent to the following linear program:

LP(x, r): min{z : g_j(x) + ∇g_j(x)^T d ≤ z, j ∈ I, ‖d‖ ≤ r}.    (8)

Denote

θ(x, r) = W(x, r) − Ψ(x),      (9)
θ0(x, r) = W0(x, r) − Ψ(x),    (10)
F = {x : g_j(x) ≤ 0, j ∈ I} = {x : Ψ(x) ≤ 0},    (11)
F^c = {x : Ψ(x) > 0}.          (12)
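To make definitions (3)–(8) concrete, the following sketch (our own illustration; it takes the norm in (6) to be the ∞-norm, so that the trust region becomes simple box bounds) evaluates Ψ and solves the linear program (8) with SciPy:

```python
import numpy as np
from scipy.optimize import linprog

def Psi(x, gs):
    # Psi(x) = max{g_j(x) : j in I}, eq. (4)
    return max(g(x) for g in gs)

def W(x, r, gs, grads):
    # W(x, r) = min{Psi*(x; d) : ||d|| <= r}, eq. (6), via LP (8):
    # minimize z subject to g_j(x) + grad g_j(x)^T d <= z, -r <= d_i <= r
    n, m = len(x), len(gs)
    c = np.zeros(n + 1)
    c[-1] = 1.0                                   # objective: z
    A = np.hstack([np.array([gr(x) for gr in grads]), -np.ones((m, 1))])
    b = -np.array([g(x) for g in gs])
    bounds = [(-r, r)] * n + [(None, None)]       # box for d, z free
    return linprog(c, A_ub=A, b_ub=b, bounds=bounds).fun

def W0(x, r, gs, grads):
    return max(W(x, r, gs, grads), 0.0)           # eq. (7)
```

For example, for the single constraint g_1(x) = x_1² + x_2² − 1 at the infeasible point x = (2, 0) with r = 1, one gets Ψ(x) = 3 while W(x, r) = −1, so θ(x, r) = −4 < 0: the linearization predicts that the constraint violation can be reduced.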

Definition 1 [6]. The Mangasarian–Fromovitz constraint qualification (MFCQ) is said to be satisfied by g(x) ≤ 0 at x if there exists z ∈ R^n such that

∇g_j(x)^T z < 0  for all j ∈ {j ∈ I : g_j(x) ≥ 0}.

Lemma 1 [7]. For every x ∈ F^c, if MFCQ is satisfied at x, then θ(x, r) < 0 for every r > 0.

Lemma 2 [7]. W(x, r), W0(x, r), θ(x, r) and θ0(x, r) are all continuous on R^n × R_+.

Lemma 3 [7]. For every x ∈ F^c, if θ(x, r) < 0, then θ0(x, r) < 0.


This paper is organized as follows. In Section 2 we describe the algorithm. The global convergence theory for the method is presented in Section 3, and some numerical examples are given in the last section.

2. Description of the algorithm

Given x ∈ R^n and r > 0, define the set D(x, r) by

D(x, r) = {d : g_j(x) + ∇g_j(x)^T d ≤ W0(x, r), j ∈ I}.

If d* is the solution of LP(x, r), then d* ∈ D(x, r); hence D(x, r) is nonempty. The quadratic subproblem (2) is replaced by the following convex programming problem:

Q(x_k, H_k, r_k):  min  ∇f(x_k)^T d + (1/2) d^T H_k d
                   s.t. g_j(x_k) + ∇g_j(x_k)^T d ≤ W0(x_k, r_k),  j ∈ L_k,    (13)

where L_k is the set of approximately active indices at the point x_k. Clearly, by the above statement, the convex programming problem Q(x_k, H_k, r_k) is feasible. If H_k is positive definite, then the solution of Q(x_k, H_k, r_k) is unique. The convex programming problem has the following properties.

Theorem 1 [7]. Suppose that x_k ∈ R^n and H_k ∈ R^{n×n} is a symmetric positive definite matrix. If MFCQ is satisfied at x_k, then:

(1) The convex programming problem Q(x_k, H_k, r_k) has a unique solution d_k which satisfies the KKT conditions, i.e. there exists a vector λ_k = (λ_{kj}, j ∈ L_k) such that
  (a) g_j(x_k) + ∇g_j(x_k)^T d_k ≤ W0(x_k, r_k), j ∈ L_k;
  (b) λ_{kj} ≥ 0, j ∈ L_k;
  (c) ∇f(x_k) + H_k d_k + A_k λ_k = 0, where A_k = (∇g_j(x_k), j ∈ L_k);
  (d) λ_{kj}(g_j(x_k) + ∇g_j(x_k)^T d_k) = 0, j ∈ L_k.
(2) If d_k = 0 is the solution of Q(x_k, H_k, r_k), then x_k is a KKT point of problem (P).

Lemma 4. For every x ∈ F and d ∈ D(x, r), Ψ*(x; d) ≤ 0.

Now, the algorithm for the solution of problem (P) can be stated as follows.

Algorithm A

Step 0 (Initialization): Given x_0 ∈ R^n, let R be a compact set consisting of symmetric positive definite matrices, H_0 ∈ R, k = 0, ε_0 > 0, r_r > r_l > 0, r_0 ∈ [r_l, r_r].
Step 1 (Computation of an 'active' constraint set L_k):
  S1.1 Let i = 0, ε_{k,i} = ε_0.
  S1.2 Set L_{k,i} = {j ∈ I : −ε_{k,i} ≤ g_j(x_k) − Φ(x_k) ≤ 0}, A_{k,i} = (∇g_j(x_k), j ∈ L_{k,i}). If det(A_{k,i}^T A_{k,i}) ≥ ε_{k,i}, let L_k = L_{k,i}, A_k = A_{k,i}, i_k = i, and go to Step 2.
  S1.3 Set i = i + 1, ε_{k,i} = ε_{k,i−1}/2, and go to S1.2 (inner loop).
Step 2 (Computation of the direction d_k): Compute W(x_k, r_k) and W0(x_k, r_k), and let d_k be the solution of the convex programming problem Q(x_k, H_k, r_k). If d_k = 0, then x_k is a KKT point of problem (P) and the algorithm stops.
Step 3: Let d̂_k be the least-norm solution of the following linear equation system:

  g_j(x_k + d_k) + ∇g_j(x_k)^T d = 0,  j ∈ L_k;

if this system is inconsistent or ‖d̂_k‖ > ‖d_k‖, then let d̂_k = 0.
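The least-norm solution in Step 3 can be computed with a minimum-norm least-squares solve. A minimal sketch of this correction step follows (our own illustration; the residual-based consistency test and its tolerance are our assumptions, not specified in the paper):

```python
import numpy as np

def correction(x_k, d_k, gs, grads, L_k, tol=1e-10):
    # Step 3: least-norm solution d_hat of
    #   g_j(x_k + d_k) + grad g_j(x_k)^T d = 0,  j in L_k;
    # fall back to 0 if the system is inconsistent or ||d_hat|| > ||d_k||.
    A = np.array([grads[j](x_k) for j in L_k])     # rows: grad g_j(x_k)^T
    b = -np.array([gs[j](x_k + d_k) for j in L_k])
    d_hat = np.linalg.lstsq(A, b, rcond=None)[0]   # minimum-norm solution
    if np.linalg.norm(A @ d_hat - b) > tol * (1.0 + np.linalg.norm(b)):
        return np.zeros_like(d_k)                  # inconsistent system
    if np.linalg.norm(d_hat) > np.linalg.norm(d_k):
        return np.zeros_like(d_k)
    return d_hat
```

For example, with g(x) = x_1² + x_2² − 1 active at x_k = (1, 0) and d_k = (0, 0.1), the correction is d̂_k = (−0.005, 0), which pulls x_k + d_k back toward the constraint surface; this second-order correction is what lets the algorithm escape the Maratos effect.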


Step 4: If f(x_k + d_k + d̂_k) ≤ f(x_k) + α∇f(x_k)^T(d_k + d̂_k) and g_j(x_k + d_k + d̂_k) ≤ 0 for all j ∈ I hold, then let t_k = 1, d_k := d_k + d̂_k, and go to Step 7.
Step 5 (Computation of the direction q_k): Let A_k^1 be the matrix whose rows are |L_k| linearly independent rows of A_k, and let A_k^2 be the matrix whose rows are the remaining n − |L_k| rows of A_k, so that, after a suitable row permutation, A_k = (A_k^1; A_k^2); partition ∇f(x_k) = (∇f_1(x_k); ∇f_2(x_k)) in the same way. Compute

  ρ_k = ∇f(x_k)^T d_k,  p_k = (A_k^1)^{-1} ∇f_1(x_k),
  d̃_k = ρ_k ((A_k^1)^{-1})^T e / (1 + 2|e^T p_k|),  q_k = −ρ_k (d_k + δ_k),    (14)

where δ_k = (d̃_k, 0)^T and e = (1, 1, ..., 1)^T ∈ R^{|L_k|}.
Step 6: Compute t_k such that

  f(x_k + t_k q_k) ≤ f(x_k) + α t_k ∇f(x_k)^T q_k,  g_j(x_k + t_k q_k) ≤ 0,  j ∈ I.    (15)

Set d_k = t_k q_k.
Step 7 (Update): Choose H_{k+1} ∈ R and r_{k+1} ∈ [r_l, r_r], set x_{k+1} = x_k + d_k, k := k + 1, and go to Step 1.

Remark 2.
(1) H_{k+1} can be obtained by an iterative formula.
(2) In both Zhou's and Zhang's algorithms a penalty function is needed; in this paper the penalty function is avoided.
(3) When the solution of Q(x_k, H_k, r_k) is unacceptable, the algorithm generates a revised direction by solving a system of linear equations, which takes full advantage of the good properties of d_k.

3. Global convergence of the algorithm

In the subsequent analysis, we always assume that the following conditions hold.

Assumptions

A1. The objective function f and the constraint functions g_j (j ∈ I) are twice continuously differentiable.
A2. For any x ∈ R^n, the vectors {∇g_j(x), j ∈ I(x)} are linearly independent, where I(x) = {j ∈ I : g_j(x) = Φ(x)}.
A3. The iterates {x_k} and the multipliers λ_k remain in a closed, bounded, convex subset S ⊂ R^n.
A4. There exist two constants 0 < a ≤ b such that a‖d‖² ≤ d^T H_k d ≤ b‖d‖² for all k and all d ∈ R^n.
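Returning briefly to Step 6: the paper does not specify how the step size t_k satisfying (15) is located. A standard backtracking loop that enforces both the sufficient-decrease condition and feasibility could look as follows (a sketch; the starting step t = 1, the halving factor and α = 0.25 are our assumptions):

```python
import numpy as np

def step6_linesearch(f, gs, x_k, q_k, grad_fk, alpha=0.25, t=1.0):
    # Backtrack until (15) holds: Armijo decrease plus feasibility of all g_j.
    slope = grad_fk @ q_k              # grad f(x_k)^T q_k, negative by Lemma 6
    while True:
        x_new = x_k + t * q_k
        if (f(x_new) <= f(x_k) + alpha * t * slope
                and all(g(x_new) <= 0.0 for g in gs)):
            return t
        t *= 0.5                       # assumed halving factor
```

Lemma 6 guarantees that the slope is negative and that the active constraints decrease strictly along q_k, so this loop terminates after finitely many halvings.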

Without loss of generality, we may assume that ‖x_k‖ ≤ M and ‖λ_k‖ ≤ M. Before showing the global convergence of Algorithm A, we must ensure that the algorithm is implementable.

Lemma 5. For any iteration k, the index i_k defined in Step 1 is finite, which means that the inner loop of Algorithm A terminates after a finite number of steps.

Proof. Suppose by contradiction that Algorithm A runs infinitely between S1.2 and S1.3, so that

det(A_{k,i}^T A_{k,i}) < (1/2^i) ε_0.    (16)

By the definition of L_{k,i}, we can see that L_{k,i+1} ⊆ L_{k,i}. Since there are only finitely many possible subsets of I, we have L_{k,i+1} = L_{k,i} for i large enough; denote this common set by L̄_k. Letting i → ∞, we obtain

det(A_{L̄_k}^T A_{L̄_k}) = 0  and  L̄_k = I(x_k),    (17)

which contradicts Assumption A2. □
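The inner loop analyzed in Lemma 5 is straightforward to implement. A sketch of Step 1's ε-halving loop follows (our own illustration; as in Lemma 5, termination relies on Assumption A2 holding for the supplied constraints):

```python
import numpy as np

def active_set(x_k, gs, grads, eps0=0.01):
    # Step 1: shrink eps until the near-active gradients pass the
    # det(A^T A) >= eps test; Lemma 5 guarantees termination under A2.
    Phi = max(0.0, max(g(x_k) for g in gs))        # Phi(x_k), eq. (3)
    eps = eps0
    while True:
        L = [j for j, g in enumerate(gs) if -eps <= g(x_k) - Phi <= 0.0]
        if L:
            A = np.array([grads[j](x_k) for j in L]).T   # columns: grad g_j
            if np.linalg.det(A.T @ A) >= eps:
                return L, A
        eps *= 0.5
```

For two constraints g_1(x) = x_1 + x_2 − 1 and g_2(x) = x_1 − x_2, both active at x = (0.5, 0.5) with independent gradients, the loop accepts the full index set on the first pass.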


Lemma 6. If d_k ≠ 0, then

∇f(x_k)^T d_k < 0,    ∇f(x_k)^T q_k ≤ −(1/2) ρ_k² < 0,
∇g_j(x_k)^T d_k = 0,  ∇g_j(x_k)^T q_k ≤ −ρ_k²/(1 + 2|e^T p_k|) < 0,  j ∈ L_k.    (18)

The proof is similar to that of Lemma 2 in [12]. From Lemma 6 we know that the line search in Step 6 will be successful. By the above statements, we see that Algorithm A is implementable. Now, we turn to proving the global convergence of Algorithm A.

Theorem 2 [7]. Assume that MFCQ is satisfied at x_0 ∈ R^n. Let r_l > 0 and F = {x : g(x) ≤ 0}. Then there exists a neighborhood N(x_0) such that:

(1) MFCQ is satisfied at every point of N(x_0).
(2) If x_0 ∈ F, then W0(x, r) = 0 for all x ∈ N(x_0) and r ≥ r_l.
(3) If x_0 ∈ F, then

    sup{ Σ_{j=1}^m λ_j : H ∈ R, x ∈ N(x_0), r ∈ [r_l, r_r] } < ∞,

where R ⊂ R^{n×n} is a compact set consisting of symmetric positive definite matrices and 0 < r_l < r_r.

Theorem 3. Suppose Algorithm A is applied to problem (P), and Assumptions A1–A4 hold. Let {x_k} be the sequence of iterates produced by the algorithm. Then one of the following two cases occurs:

(A) The iteration terminates at a KKT point.
(B) Every accumulation point of {x_k} is a KKT point of problem (P).

Proof.
(A) This is evident from the statements above.
(B) By the construction of Algorithm A, there are two cycles between Step 1 and Step 7: one generates iterates of the form x_{k+1} = x_k + d_k + d̂_k, the other of the form x_{k+1} = x_k + t_k q_k. We prove the claim for these two cases separately.

Case I: Suppose infinitely many points are generated by the relation x_{k+1} = x_k + d_k + d̂_k. By Assumption A3, there exists a point x* such that x_k → x* (k ∈ K), where K is an infinite index set. Also, by the algorithm, x* is a feasible point and W0(x_k, r_k) → 0 (k ∈ K). Suppose x* is not a KKT point, and let

K_1 = {k ∈ K : ∇f(x_k)^T d_k > −(1/2) d_k^T H_k d_k} ⊆ K.

(i) K_1 is an infinite index set. If lim_{k∈K_1, k→∞} ‖d_k‖ = 0, then it is easy to see that x* is a KKT point, a contradiction. So, without loss of generality, we may suppose that ‖d_k‖ > ε for k ∈ K_1. By the algorithm and Assumption A4, there exists k_0 such that for k > k_0, k ∈ K_1,

g(x_k) ≤ aε²/(2M) ≤ a‖d_k‖²/(2M) ≤ d_k^T H_k d_k/(2M).    (19)

Meanwhile, by the KKT conditions of problem Q(x_k, H_k, r_k), we have ∇f(x_k) + H_k d_k + A_k λ_k = 0 with A_k = (∇g_j(x_k), j ∈ L_k). Together with (19), we obtain that for all k ∈ K_1, k > k_0,

∇f(x_k)^T d_k = −d_k^T H_k d_k − d_k^T A_k λ_k = −d_k^T H_k d_k + (λ_k)^T g(x_k)
            ≤ −d_k^T H_k d_k + ‖λ_k‖_∞ g(x_k) ≤ M g(x_k) − d_k^T H_k d_k ≤ −(1/2) d_k^T H_k d_k,    (20)

which contradicts the definition of K_1.


(ii) K_1 is a finite index set. Then ∇f(x_k)^T d_k ≤ −(1/2) d_k^T H_k d_k holds for all sufficiently large k. We have

f(x_k) − f(x_{k+1}) = −∇f(x_k)^T d_k + O(‖d_k‖²) ≥ −∇f(x_k)^T d_k ≥ (1/2) d_k^T H_k d_k ≥ (a/2) ‖d_k‖².

Because f is bounded below, for some integer i_0 we have

∞ > Σ_{k=i_0}^∞ (f(x_k) − f(x_{k+1})) ≥ Σ_{k=i_0}^∞ (a/2) ‖d_k‖².

Then

Σ_{k=i_0}^∞ ‖d_k‖² < +∞,

which means ‖d_k‖ → 0. Hence x* is a KKT point.

Case II: Suppose infinitely many points are obtained from the relation x_{k+1} = x_k + t_k q_k. Suppose also, by contradiction, that ‖d_k‖ > ε, k ∈ K. By Lemma 3, we have t_k ≥ t̄ > 0. Without loss of generality, we can assume that q_k → q*. Since ‖d_k‖ > ε, it is easy to see that ∇f(x*)^T q* < 0. By Lemma 2, then

0 = lim_{k∈K} (f(x_{k+1}) − f(x_k)) = lim_{k∈K} (t_k ∇f(x_k)^T q_k + O(‖t_k q_k‖²)) ≤ lim_{k∈K} (t_k ∇f(x_k)^T q_k) ≤ t̄ ∇f(x*)^T q* < 0,    (21)

which is a contradiction. Combining Case I and Case II, we see that the claim holds. □

4. Numerical experiments

In this section, we carry out some numerical experiments with the algorithm. The results show that the algorithm is effective. Throughout the experiments, ε_0 = 0.01, r_l = 1, r_r = 2, and H_0 = I, the n × n identity matrix; H_k is updated by the BFGS formula.

Example 1.
  min  f(x) = x − 1/2 + (1/2) cos²(x)
  s.t. x ≥ 0.
x_0 = 2, x* = 0, f(x*) = 0, iterations = 2.

Example 2.
  min  f(x) = x_1² + x_2² + x_3² + x_4²
  s.t. 6 − x_1² − x_2² − x_3² − x_4² ≤ 0.
x_0 = (2, 2, 2, 2)^T, x* = (1.2248, 1.2248, 1.2248, 1.2248)^T, f(x*) = 6, iterations = 4.
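Example 2 is easy to cross-check against a standard SQP implementation. The following sketch (our own illustration, using SciPy's SLSQP solver rather than the paper's Algorithm A) recovers the reported optimal value f(x*) = 6 from the same starting point:

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: float(np.sum(x**2))
# constraint 6 - sum x_i^2 <= 0, written in SciPy's c(x) >= 0 form
con = {"type": "ineq", "fun": lambda x: np.sum(x**2) - 6.0}
res = minimize(f, np.array([2.0, 2.0, 2.0, 2.0]), method="SLSQP",
               constraints=[con])
print(res.x, res.fun)
```

By symmetry the solver moves the start point radially onto the sphere Σ x_i² = 6, matching the components x_i* = √1.5 ≈ 1.2247 reported above.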

Example 3.
  min  f(x) = 100(x_2 − x_1²)² + (1 − x_1)²
  s.t. x_2 ≥ 1.5.
x_0 = (2, 1)^T, x* = 1.0e−010 · (0.0551, 0)^T, f(x*) = 2.0709e−026, iterations = 5.


Acknowledgements

The authors thank the referees, who gave us valuable suggestions and comments for improving the paper. This work is supported in part by the National Science Foundation of China (No. 10571137).

References

[1] P.T. Boggs, J.W. Tolle, P. Wang, On the local convergence of quasi-Newton methods for constrained optimization, SIAM J. Control Optim. 20 (1982) 161–171.
[2] J.F. Bonnans, E.R. Panier, A.L. Tits, J.L. Zhou, Avoiding the Maratos effect by means of a nonmonotone line search II: inequality constrained problems - feasible iterates, SIAM J. Numer. Anal. 29 (1992) 1187–1202.
[3] S.P. Han, Superlinearly convergent variable metric algorithms for general nonlinear programming problems, Math. Program. 11 (1976) 263–282.
[4] M.J.D. Powell, A fast algorithm for nonlinearly constrained optimization calculations, in: G.A. Watson (Ed.), Numerical Analysis, Springer-Verlag, Berlin, 1978, pp. 144–157.
[5] M.J.D. Powell, Variable metric methods for constrained optimization, in: A. Bachem et al. (Eds.), Mathematical Programming: The State of the Art, Springer-Verlag, Berlin, 1982.
[6] J.V. Burke, S.P. Han, A robust sequential quadratic programming method, Math. Program. 43 (1989) 277–303.
[7] G.L. Zhou, A modified SQP method and its global convergence, J. Global Optim. 11 (1997) 193–205.
[8] J.L. Zhang, X.S. Zhang, A modified SQP method with nonmonotone linesearch technique, J. Global Optim. 21 (2001) 201–218.
[9] Z.B. Zhu, K.C. Zhang, J.B. Jian, An improved SQP algorithm for inequality constrained optimization, Math. Meth. Oper. Res. 58 (2003) 271–282.