
Applied Mathematics and Computation 216 (2010) 1868–1879

A truncated aggregate smoothing Newton method for minimax problems

Yu Xiao, Bo Yu (corresponding author)
School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning 116025, PR China

The research was supported by the National Natural Science Foundation of China (10671029) and the Research Fund for the Doctoral Programme of Higher Education (20060141029).

Keywords: Minimax problem; Truncated aggregate function; Stabilized Newton method

Abstract. The aggregate function is a useful smoothing approximation to the max-function of finitely many smooth functions and has been used to solve minimax problems, linear and nonlinear programming, generalized complementarity problems, etc. Although the aggregate function is a single smooth function, it is complex: its gradient and Hessian calculations are time-consuming. In this paper, a truncated aggregate smoothing stabilized Newton method for solving minimax problems is presented. At each iteration, only a small subset of the components in the max-function is aggregated, so the number of gradient and Hessian calculations is reduced dramatically. The subset is adaptively updated with truncating criteria that involve only function values, not gradients or Hessians, so as to guarantee global convergence and, for the inner iteration, locally quadratic convergence at as little computational cost as possible. Numerical results show the efficiency of the proposed algorithm.

1. Introduction

Consider the following minimax problem:

$$\min_{x\in\mathbb{R}^n}\Big\{F(x)=\max_{j\in\bar q}f^j(x)\Big\}, \qquad (1.1)$$

where $\bar q=\{1,\dots,q\}$. It is a typical nonsmooth problem and appears in many application fields, such as vehicle routing (see [2,3]), structural optimization (see [5,8]), location (see [6,9,10,14]), resource allocation (see [25,30]), portfolio selection (see [41]), data fitting and data mining (see [1,16]), optimal control (see [15,29]), etc. Different algorithms have been proposed to solve the minimax problem (1.1), such as subgradient methods (see [37] for details), SQP methods (see [27,42,47]), the SELQP method (see [36]), bundle-type methods (see [11,12,17,48]), smooth approximation methods (see [4,7,13,26,34,35,43,46]), etc. In this paper, we concentrate on the aggregate function smooth approximation method. The aggregate function (also known as the exponential penalty function, see [18]), with smoothing parameter $p>0$,

$$F_p(x)=\frac{1}{p}\ln\Big(\sum_{j\in\bar q}\exp\big(p f^j(x)\big)\Big), \qquad (1.2)$$



was derived from Jaynes' maximum entropy principle by Li (see [19]). It is a smooth, uniform and monotonic approximation to $F(x)$ and has been used to solve minimax problems (see [21–24,34,43]), linear and nonlinear programming (see [20,39,40,45]), generalized complementarity problems (see [31,32]), etc. In [21], Li converted a nonsmooth minimax problem into an unconstrained minimization of a differentiable aggregate function. In [43], a sequence of approximation problems was solved with geometrically increasing $p$, and a line search procedure based on the merit function $F_p(x)$ was used to ensure global convergence under a rather restrictive convexity assumption. In [45], an aggregate homotopy method was proposed to compute a KKT point of nonconvex programming problems; in that method, $t=1/p$ was taken as the homotopy parameter, and the existence of, and convergence along, a smooth interior path to a KKT point were proven under the weak normal cone condition. However, although $F_p(x)$ is a single smooth function, some difficulties still exist, such as overflow, ill-conditioning for large $p$, and complicated gradient and Hessian calculations for large $q$, as can be seen from the following expressions:

$$\nabla F_p(x)=\sum_{j\in\bar q}\lambda_p^j(x)\,\nabla f^j(x), \qquad (1.3)$$

$$\nabla^2 F_p(x)=\sum_{j\in\bar q}\lambda_p^j(x)\,\nabla^2 f^j(x)+p\Bigg(\sum_{j\in\bar q}\lambda_p^j(x)\,\nabla f^j(x)\nabla f^j(x)^T-\sum_{j\in\bar q}\lambda_p^j(x)\,\nabla f^j(x)\Big(\sum_{j\in\bar q}\lambda_p^j(x)\,\nabla f^j(x)\Big)^{T}\Bigg), \qquad (1.4)$$

where

$$\lambda_p^j(x)=\frac{\exp\big(p f^j(x)\big)}{\sum_{j\in\bar q}\exp\big(p f^j(x)\big)}\in(0,1], \qquad \sum_{j\in\bar q}\lambda_p^j(x)=1. \qquad (1.5)$$
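As a small illustration of (1.3)–(1.5), the following Python/NumPy sketch assembles the aggregate gradient and Hessian from user-supplied component values, gradients and Hessians. The function names are ours, not part of the original presentation, and the exponents are shifted by the maximum (as in (1.6) below) so that the weights can be formed without overflow.

```python
import numpy as np

def aggregate_weights(fvals, p):
    """Weights lambda_p^j(x) of (1.5), computed with the maximum subtracted."""
    w = np.exp(p * (fvals - fvals.max()))
    return w / w.sum()

def aggregate_grad_hess(fvals, grads, hesses, p):
    """Gradient (1.3) and Hessian (1.4) of F_p.

    fvals: (q,) values f^j(x); grads: (q, n) gradients; hesses: (q, n, n) Hessians.
    """
    lam = aggregate_weights(fvals, p)
    g = grads.T @ lam                                  # sum_j lam_j grad f^j(x)
    H = np.einsum('j,jkl->kl', lam, hesses)            # sum_j lam_j Hess f^j(x)
    H += p * (np.einsum('j,jk,jl->kl', lam, grads, grads) - np.outer(g, g))
    return g, H
```

Every component contributes a gradient and a Hessian here; the truncation developed below keeps only the components whose weights are not negligible.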

Hence, more effort is needed to achieve efficient performance of the aggregate smoothing method. Overflow can be avoided by rewriting (1.2) as

$$F_p(x)=\frac{1}{p}\ln\Big(\sum_{j\in\bar q}\exp\big(p(f^j(x)-F(x))\big)\Big)+F(x). \qquad (1.6)$$
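In code, (1.6) is the familiar log-sum-exp shift; a minimal sketch (function name ours):

```python
import numpy as np

def aggregate_value(fvals, p):
    """F_p(x) evaluated via the shifted form (1.6); no exponential overflows."""
    fmax = fvals.max()
    return np.log(np.exp(p * (fvals - fmax)).sum()) / p + fmax
```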

To overcome the ill-conditioning, Yang proposed a modified maximum entropy method in [44] by introducing adjusting factors $u_j$ into (1.2), i.e.,

$$F_p(x,u)=\frac{1}{p}\ln\Big(\sum_{j\in\bar q}u_j\exp\big(p f^j(x)\big)\Big).$$

In [34], the effect of ill-conditioning was reduced by adopting a stabilized Newton–Armijo algorithm, and a feedback precision-adjustment rule that adaptively updates $p$ provided the sequence of approximation problems with good initial points.

In this paper, we make an effort to reduce the computational cost of the aggregate function method. We start with a basic observation. It can be seen from (1.6) that, for $f^j(x)\ne F(x)$, if $f^j(x)$ is much smaller than $F(x)$ or $p$ is sufficiently large, then the term $\exp(p(f^j(x)-F(x)))$ is approximately equal to zero; hence $f^j(x)$ contributes little to $F_p(x)$ and can be ignored in (1.2). The same holds for $\nabla F_p(x)$ and $\nabla^2 F_p(x)$. Based on this observation, a truncated aggregate smoothing stabilized Newton method for solving (1.1) is presented. At each iteration, only a small subset of the components in the max-function is aggregated, hence the number of gradient and Hessian calculations is dramatically reduced, and the computational cost is greatly reduced, especially for problems with large $q$ and complicated gradients or Hessians. We give truncating criteria, involving only the computation of function values and not their gradients or Hessians, for adaptively updating the subset so as to guarantee global convergence and, for the inner iteration, locally quadratic convergence at as little computational cost as possible.

The following assumption and results will be used in this paper.

Assumption 1.1. The functions $f^j(x)$, $j\in\bar q$, are twice continuously differentiable.

Theorem 1.2 (Theorem 4.2.8, [33]). Suppose that the functions $f^j(x)$, $j\in\bar q$, are continuously differentiable. If $x^*\in\mathbb{R}^n$ is a local minimizer for (1.1), then

$$0\in\partial F(x^*):=\mathrm{conv}\{\nabla f^j(x^*)\}_{j\in\bar q(x^*)},$$

where

$$\bar q(x^*)=\{j\in\bar q \mid f^j(x^*)=F(x^*)\},$$

and $\mathrm{conv}\{A\}$ denotes the convex hull of $A$.


Proposition 1.3 ([22]). $F_p(x)$ decreases monotonically as $p$ increases and

$$F(x)\le F_p(x)\le F(x)+\frac{1}{p}\ln q, \qquad \forall x\in\mathbb{R}^n, \qquad (1.7)$$

i.e., $F_p(x)\to F(x)$ uniformly and monotonically as $p\to\infty$.

Proposition 1.4 (Proposition 2.4, [45]). For any sequence $\{(x^i,p_i)\}_{i=0}^{\infty}$ satisfying $(x^i,p_i)\to(\hat x,+\infty)$,

$$\lim_{i\to+\infty}\lambda_{p_i}^j(x^i)=0 \quad \text{if } j\notin\bar q(\hat x),$$

where $\lambda_{p_i}^j(x^i)$ was defined in (1.5).

The paper is organized as follows. The truncated aggregate function is described in Section 2. In Section 3, the truncated aggregate smoothing stabilized Newton method for solving (1.1) is formulated and its global and locally quadratic convergence are proven. Numerical tests comparing the performance of the proposed method with several other algorithms are given in Section 4. Throughout, $\#(\cdot)$ denotes the cardinality of a set.

2. The truncated aggregate function

For any given $\bar x\in\mathbb{R}^n$ and constant $\mu>0$, denote

$$\tilde q=\{\,j\in\bar q \mid F(\bar x)-f^j(\bar x)\le\mu\,\}. \qquad (2.1)$$

The truncated aggregate function with respect to $\tilde q$ is defined as

$$F_p^{\tilde q}(x)=\frac{1}{p}\ln\Big(\sum_{j\in\tilde q}\exp\big(p f^j(x)\big)\Big). \qquad (2.2)$$

Lemma 2.1. Under Assumption 1.1, $F_p^{\tilde q}(x)$ is twice continuously differentiable for arbitrary $p>0$ and $\tilde q\subseteq\bar q$.

In the following, some estimates of the differences between $F_p^{\tilde q}(x)$ and $F_p(x)$, as well as between their gradients and Hessians, are given.

Proposition 2.2. Suppose that Assumption 1.1 holds. For any given $\bar x\in\mathbb{R}^n$, $p>0$, $\mu>0$, let $F_p(x)$, $\tilde q$ and $F_p^{\tilde q}(x)$ be defined as in (1.2), (2.1) and (2.2). Then

(i) $0\le F_p(\bar x)-F_p^{\tilde q}(\bar x)\le(q-1)\exp(-p\mu)/p$;
(ii) $\|\nabla F_p^{\tilde q}(\bar x)-\nabla F_p(\bar x)\|\le 2\gamma(\bar x)(q-1)/(\exp(p\mu)+q-1)$;
(iii) $\|\nabla^2 F_p^{\tilde q}(\bar x)-\nabla^2 F_p(\bar x)\|\le(2\omega(\bar x)+6p\gamma^2(\bar x))(q-1)/(\exp(p\mu)+q-1)$,

where $\gamma(x)=\max\{\|\nabla f^j(x)\| \mid j\in\bar q\}$ and $\omega(x)=\max\{\|\nabla^2 f^j(x)\| \mid j\in\bar q\}$.

Proof. (i) From

$$F_p(\bar x)-F_p^{\tilde q}(\bar x)=\frac{1}{p}\ln\Bigg(1+\frac{\sum_{j\notin\tilde q}\exp\big(p(f^j(\bar x)-F(\bar x))\big)}{\sum_{j\in\tilde q}\exp\big(p(f^j(\bar x)-F(\bar x))\big)}\Bigg),$$

it is obvious that $F_p(\bar x)-F_p^{\tilde q}(\bar x)\ge 0$. Since

$$\sum_{j\in\tilde q}\exp\big(p(f^j(\bar x)-F(\bar x))\big)\ge 1$$

and

$$\sum_{j\notin\tilde q}\exp\big(p(f^j(\bar x)-F(\bar x))\big)\le\big(q-\#(\tilde q)\big)\exp(-p\mu)\le(q-1)\exp(-p\mu),$$

together with the inequality $\ln(1+x)<x$ $(x>0)$, it follows that

$$F_p(\bar x)-F_p^{\tilde q}(\bar x)\le\frac{1}{p}\sum_{j\notin\tilde q}\exp\big(p(f^j(\bar x)-F(\bar x))\big)\le(q-1)\exp(-p\mu)/p.$$


(ii) Since

$$\nabla F_p^{\tilde q}(x)-\nabla F_p(x)=\sum_{j\in\tilde q}\lambda_p^{\tilde q,j}(x)\,\nabla f^j(x)-\sum_{j\in\bar q}\lambda_p^{j}(x)\,\nabla f^j(x)=\sum_{j\in\tilde q}\big(\lambda_p^{\tilde q,j}(x)-\lambda_p^{j}(x)\big)\nabla f^j(x)-\sum_{j\notin\tilde q}\lambda_p^{j}(x)\,\nabla f^j(x), \qquad (2.3)$$

where

$$\lambda_p^{\tilde q,j}(x)=\frac{\exp\big(p f^j(x)\big)}{\sum_{j\in\tilde q}\exp\big(p f^j(x)\big)}, \qquad j\in\tilde q,$$

we should first estimate the bounds of $\sum_{j\in\tilde q}|\lambda_p^{\tilde q,j}(x)-\lambda_p^{j}(x)|$ and $\sum_{j\notin\tilde q}\lambda_p^{j}(x)$. From the fact that, for all $j\in\tilde q$,

$$0\le\lambda_p^{j}(x)\le\lambda_p^{\tilde q,j}(x), \qquad \sum_{j\in\bar q}\lambda_p^{j}(x)=\sum_{j\in\tilde q}\lambda_p^{\tilde q,j}(x)=1,$$

we have

$$\sum_{j\in\tilde q}\big|\lambda_p^{\tilde q,j}(x)-\lambda_p^{j}(x)\big|=\sum_{j\in\tilde q}\big(\lambda_p^{\tilde q,j}(x)-\lambda_p^{j}(x)\big)=\sum_{j\notin\tilde q}\lambda_p^{j}(x).$$

By the definition of $\tilde q$, it follows that

$$\sum_{j\notin\tilde q}\lambda_p^{j}(x)=\frac{\sum_{j\notin\tilde q}\exp\big(p(f^j(x)-F(x))\big)}{\sum_{j\in\bar q}\exp\big(p(f^j(x)-F(x))\big)}=\Bigg(1+\frac{\sum_{j\in\tilde q}\exp\big(p(f^j(x)-F(x))\big)}{\sum_{j\notin\tilde q}\exp\big(p(f^j(x)-F(x))\big)}\Bigg)^{-1}\le\Bigg(1+\frac{1+(\#(\tilde q)-1)\exp(-p\mu)}{(q-\#(\tilde q))\exp(-p\mu)}\Bigg)^{-1}.$$

Then, together with $\#(\tilde q)\ge 1$, we have

$$\sum_{j\in\tilde q}\big|\lambda_p^{\tilde q,j}(x)-\lambda_p^{j}(x)\big|=\sum_{j\notin\tilde q}\lambda_p^{j}(x)\le(q-1)/(\exp(p\mu)+q-1). \qquad (2.4)$$

From (2.3) and (2.4), it is easy to see that

$$\|\nabla F_p^{\tilde q}(x)-\nabla F_p(x)\|\le\gamma(x)\Bigg(\sum_{j\in\tilde q}\big|\lambda_p^{\tilde q,j}(x)-\lambda_p^{j}(x)\big|+\sum_{j\notin\tilde q}\lambda_p^{j}(x)\Bigg)\le 2\gamma(x)(q-1)/(\exp(p\mu)+q-1).$$

(iii) As for the Hessian,

$$\begin{aligned}\nabla^2 F_p(x)-\nabla^2 F_p^{\tilde q}(x)={}&\sum_{j\in\bar q}\lambda_p^{j}(x)\,\nabla^2 f^j(x)-\sum_{j\in\tilde q}\lambda_p^{\tilde q,j}(x)\,\nabla^2 f^j(x)\\ &+p\Bigg(\sum_{j\in\bar q}\lambda_p^{j}(x)\,\nabla f^j(x)\nabla f^j(x)^T-\sum_{j\in\tilde q}\lambda_p^{\tilde q,j}(x)\,\nabla f^j(x)\nabla f^j(x)^T\Bigg)\\ &-p\Bigg(\sum_{j\in\bar q}\lambda_p^{j}(x)\,\nabla f^j(x)\sum_{j\in\bar q}\lambda_p^{j}(x)\,\nabla f^j(x)^T-\sum_{j\in\tilde q}\lambda_p^{\tilde q,j}(x)\,\nabla f^j(x)\sum_{j\in\tilde q}\lambda_p^{\tilde q,j}(x)\,\nabla f^j(x)^T\Bigg).\end{aligned} \qquad (2.5)$$

The first two terms on the right-hand side of (2.5) can be bounded by means of (2.4), as in (ii), while the last term has the upper bound

$$\begin{aligned}&\Bigg\|\sum_{j\in\bar q}\lambda_p^{j}(x)\,\nabla f^j(x)\sum_{j\in\bar q}\lambda_p^{j}(x)\,\nabla f^j(x)^T-\sum_{j\in\tilde q}\lambda_p^{\tilde q,j}(x)\,\nabla f^j(x)\sum_{j\in\tilde q}\lambda_p^{\tilde q,j}(x)\,\nabla f^j(x)^T\Bigg\|\\ &\quad\le\Bigg(\Bigg\|\sum_{j\in\bar q}\lambda_p^{j}(x)\,\nabla f^j(x)\Bigg\|+\Bigg\|\sum_{j\in\tilde q}\lambda_p^{\tilde q,j}(x)\,\nabla f^j(x)\Bigg\|\Bigg)\,\Bigg\|\sum_{j\in\bar q}\lambda_p^{j}(x)\,\nabla f^j(x)-\sum_{j\in\tilde q}\lambda_p^{\tilde q,j}(x)\,\nabla f^j(x)\Bigg\|\\ &\quad\le 2\gamma(x)\Bigg\|\sum_{j\in\bar q}\lambda_p^{j}(x)\,\nabla f^j(x)-\sum_{j\in\tilde q}\lambda_p^{\tilde q,j}(x)\,\nabla f^j(x)\Bigg\|\le 4\gamma^2(x)(q-1)/(\exp(p\mu)+q-1);\end{aligned}$$

then it follows that

$$\|\nabla^2 F_p(x)-\nabla^2 F_p^{\tilde q}(x)\|\le\big(2\omega(x)+6p\gamma^2(x)\big)(q-1)/(\exp(p\mu)+q-1). \qquad \Box$$

For any $x^0\in\mathbb{R}^n$ and $p_0>0$, denote $\Omega=\{x \mid F(x)\le F_{p_0}(x^0)\}$. From the above proposition, we have the following corollary.

Corollary 2.3. Suppose that $\gamma(x)$ and $\omega(x)$ are bounded above on $\Omega$, and let $\bar\gamma\ge\max_{x\in\Omega}\gamma(x)$ and $\bar\omega\ge\max_{x\in\Omega}\omega(x)$. For any $x\in\Omega$, $p>0$, $\varepsilon_1>0$ and $\varepsilon_2>0$, if $\mu$ in (2.1) is chosen to be

$$\mu=\frac{1}{p}\ln\Big(\max\big\{1,\;(2\bar\gamma-\varepsilon_1)(q-1)/\varepsilon_1,\;(2\bar\omega+6p\bar\gamma^2-\varepsilon_2)(q-1)/\varepsilon_2\big\}\Big), \qquad (2.6)$$

then

$$\|\nabla F_p(x)-\nabla F_p^{\tilde q}(x)\|\le\varepsilon_1$$

and

$$\|\nabla^2 F_p(x)-\nabla^2 F_p^{\tilde q}(x)\|\le\varepsilon_2.$$

Proof. From

$$\mu\ge\ln\big(\max\{1,(2\bar\gamma-\varepsilon_1)(q-1)/\varepsilon_1\}\big)/p,$$

it can be deduced that

$$2\bar\gamma(q-1)/(\exp(p\mu)+q-1)\le\varepsilon_1,$$

and hence, from (ii) of Proposition 2.2, we have

$$\|\nabla F_p(x)-\nabla F_p^{\tilde q}(x)\|\le\varepsilon_1.$$

Similarly, the bound on the Hessians can be proven. $\Box$
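The choice (2.6) can be carried out with function values only. The sketch below (the helper names and the random toy data are purely illustrative assumptions of ours) forms $\mu$ and the working set $\tilde q$, and checks numerically that the truncated gradient stays within $\varepsilon_1$ of the full aggregate gradient, as Corollary 2.3 guarantees:

```python
import numpy as np

def truncation_level(p, q, gamma_bar, omega_bar, eps1, eps2):
    """Truncation level mu chosen as in (2.6)."""
    m = max(1.0,
            (2 * gamma_bar - eps1) * (q - 1) / eps1,
            (2 * omega_bar + 6 * p * gamma_bar ** 2 - eps2) * (q - 1) / eps2)
    return np.log(m) / p

def working_set(fvals, mu):
    """Index set q_tilde of (2.1), built from function values only."""
    return np.where(fvals.max() - fvals <= mu)[0]

def soft_weights(v, p):
    w = np.exp(p * (v - v.max()))
    return w / w.sum()

# Toy check of Proposition 2.2(ii) / Corollary 2.3 on random data.
rng = np.random.default_rng(0)
q, n, p = 500, 4, 50.0
fvals = rng.normal(size=q)
grads = rng.normal(size=(q, n))
gamma_bar = np.linalg.norm(grads, axis=1).max()     # valid upper bound at this point

mu = truncation_level(p, q, gamma_bar, omega_bar=1.0, eps1=1e-6, eps2=1e-3)
idx = working_set(fvals, mu)
g_full = grads.T @ soft_weights(fvals, p)
g_trunc = grads[idx].T @ soft_weights(fvals[idx], p)
print(len(idx), np.linalg.norm(g_full - g_trunc))   # typically few indices; error below eps1
```

In the algorithm of Section 3, exactly this construction is repeated at every inner iteration before any gradient or Hessian of a component is evaluated.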

3. Truncated aggregate smoothing Newton method and its convergence

Combining the truncated aggregate function with a stabilized Newton–Armijo algorithm, we now give the truncated aggregate smoothing stabilized Newton–Armijo algorithm for solving (1.1). To stabilize the Newton method, the Hessian $\nabla^2 F_p^{\tilde q}(x)$ of the truncated aggregate function $F_p^{\tilde q}(x)$ can be modified in the ways introduced in [28], such as eigenvalue modification, adding a multiple of the identity, modified Cholesky factorization, Gershgorin modification, etc. Here, we use the first method, i.e.,

$$B(x)=\theta(x)I+\nabla^2 F_p^{\tilde q}(x), \qquad (3.1)$$

where $\theta(x)=\max\{0,\delta-e(x)\}$, with $e(x)$ denoting the minimum eigenvalue of $\nabla^2 F_p^{\tilde q}(x)$ and $\delta>0$. The adaptive smoothing parameter adjustment subroutine given in [34] is called to adjust the smoothing parameter $p$ adaptively. Based on the estimates in Section 2, we give truncating criteria, involving only the computation of function values and not their gradients or Hessians, for adaptively updating the subset. Global convergence, as well as locally quadratic convergence of the inner iteration for fixed $p$, of the truncated aggregate smoothing stabilized Newton–Armijo algorithm is established.

Algorithm 1 (Truncated aggregate smoothing Newton algorithm).

Data. $x^0\in\mathbb{R}^n$.

Parameters. $p_0>0$; $\hat p\gg 1$; $\alpha,\beta,\kappa_1\in(0,1)$; $\eta\in(0,(1-\alpha)\kappa_1^2/32)$; $\delta>0$; $\bar\gamma$, $\bar\omega$ sufficiently large such that $\bar\gamma\ge\max_{x\in\Omega}\gamma(x)$ and $\bar\omega\ge\max_{x\in\Omega}\omega(x)$; functions $a^*(p)$, $b^*(p)$, $\sigma(p)$, $\varepsilon(p):(0,\infty)\to(0,\infty)$ satisfying $b^*(p)\ge a^*(p)>\sigma(p)$ for all $p>0$, $\lim_{p\to+\infty}\sigma(p)=0$, $\varepsilon_1(p)=\eta\sigma(p)$ and $\varepsilon_2(p)>0$.

Step 1. Set $i=0$, $k=0$, $s=1$, $x^{k,i}=x^0$.

Step 2. Compute $\mu$, $\tilde q$ according to (2.6) and (2.1). If $\|\nabla F_{p_k}^{\tilde q}(x^{k,i})\|>\sigma(p_k)$, go to Step 3, else go to Step 8.

Step 3. Compute $B(x^{k,i})$ according to (3.1), then compute the Cholesky factor $R$ such that $B(x^{k,i})=RR^T$ and the reciprocal condition number $c(R)$ of $R$. If $c(R)\ge\kappa_1$, go to Step 4, else go to Step 5.

Step 4. Compute $h^{k,i}=-B(x^{k,i})^{-1}\nabla F_{p_k}^{\tilde q}(x^{k,i})$, go to Step 6.

Step 5. Set $h^{k,i}=-\nabla F_{p_k}^{\tilde q}(x^{k,i})$.

Step 6. Compute the step length $\lambda_{k,i}=\beta^l$, where $l\ge 0$ is the smallest integer satisfying

$$F_{p_k}(x^{k,i}+\lambda_{k,i}h^{k,i})-F_{p_k}(x^{k,i})\le\alpha\lambda_{k,i}\big\langle\nabla F_{p_k}^{\tilde q}(x^{k,i}),h^{k,i}\big\rangle. \qquad (3.2)$$

Step 7. Set $x^{k,i+1}=x^{k,i}+\lambda_{k,i}h^{k,i}$, $i=i+1$. Compute $\mu$, $\tilde q$ according to (2.6) and (2.1). If

$$\big\|\nabla F_{p_k}^{\tilde q}(x^{k,i})\big\|\le\sigma(p_k), \qquad (3.3)$$

go to Step 8, else go to Step 3.

Step 8. If $s=1$, compute $p^*$ such that

$$a^*(p_k)\le\big\|\nabla F_{p^*}^{\tilde q}(x^{k,i})\big\|\le b^*(p_k), \qquad (3.4)$$

and go to Step 9; else set $p_{k+1}=s(k+2)$, $k=k+1$, $i=0$, and go to Step 2.

Step 9. If $p^*\le\hat p$, set $p_{k+1}=\max\{p^*,p_k+1\}$, $k=k+1$, $i=0$, and go to Step 2; else set $s=\max\{2,(\hat p+1)/(k+1)\}$, $p_{k+1}=\max\{p_k+1,s(k+2)\}$, $k=k+1$, $i=0$, and go to Step 2.

Denote the following unconstrained optimization problem by $(P_p)$:

$$\min_{x\in\mathbb{R}^n}F_p(x).$$
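For concreteness, the following Python/NumPy sketch implements the inner loop of Algorithm 1 (Steps 2–7) for a fixed $p$. It is only an illustrative rendition under the stated assumptions: the interface (callbacks returning component values, and derivatives of the selected components only), the helper names and the backtracking safeguard are ours, and the authors' reported experiments use MATLAB rather than Python.

```python
import numpy as np

def inner_newton(x, p, fvals, grads, hesses, gamma_bar, omega_bar,
                 sigma=1e-1, eps1=1e-2, eps2=1e-1,
                 alpha=0.5, beta=0.8, kappa1=1e-7, delta=0.1, max_iter=200):
    """Inner loop of Algorithm 1 (Steps 2-7) for a fixed smoothing parameter p.

    fvals(x)       -> (q,) array of f^j(x)
    grads(x, idx)  -> (len(idx), n) gradients of the selected components
    hesses(x, idx) -> (len(idx), n, n) Hessians of the selected components
    """
    q = len(fvals(x))
    for _ in range(max_iter):
        f = fvals(x)
        # Step 2: truncation level mu from (2.6) and working set q_tilde from (2.1);
        # only function values are needed here.
        mu = np.log(max(1.0,
                        (2 * gamma_bar - eps1) * (q - 1) / eps1,
                        (2 * omega_bar + 6 * p * gamma_bar ** 2 - eps2) * (q - 1) / eps2)) / p
        idx = np.where(f.max() - f <= mu)[0]
        w = np.exp(p * (f[idx] - f[idx].max()))
        w /= w.sum()
        G = grads(x, idx)                       # derivatives of selected components only
        g = G.T @ w                             # truncated gradient
        if np.linalg.norm(g) <= sigma:          # stopping test (3.3)
            return x
        H = np.einsum('j,jkl->kl', w, hesses(x, idx))
        H += p * (np.einsum('j,jk,jl->kl', w, G, G) - np.outer(g, g))
        # Step 3: stabilization (3.1) by eigenvalue modification.
        emin = np.linalg.eigvalsh(H).min()
        B = H + max(0.0, delta - emin) * np.eye(len(x))
        # Steps 3-5: Newton direction if B is well conditioned, otherwise steepest descent.
        try:
            R = np.linalg.cholesky(B)
            well_conditioned = 1.0 / np.linalg.cond(R) >= kappa1
        except np.linalg.LinAlgError:
            well_conditioned = False
        h = -np.linalg.solve(B, g) if well_conditioned else -g
        # Step 6: Armijo backtracking (3.2) on the full aggregate merit F_p, via (1.6).
        def Fp(y):
            v = fvals(y)
            return np.log(np.exp(p * (v - v.max())).sum()) / p + v.max()
        F0, lam = Fp(x), 1.0
        while Fp(x + lam * h) - F0 > alpha * lam * (g @ h) and lam > 1e-12:
            lam *= beta
        x = x + lam * h                          # Step 7
    return x
```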

In Algorithm 1, $p_k$ is increased to $p_{k+1}$ when the current iterate is sufficiently close to a stationary point of $(P_{p_k})$, and the last iterate serves as a warm start for the next problem $(P_{p_{k+1}})$. Thus, Algorithm 1 solves a sequence of approximating problems $\{(P_{p_k})\}_{k=0}^{\infty}$, where the sequence $\{p_k\}_{k=0}^{\infty}$ is monotonically increasing and diverges to infinity. If we take $\varepsilon_1(p)$ and $\varepsilon_2(p)$ as $O(p^{-r})$ $(r>0)$, then $\mu\to 0$ as $p\to\infty$, which means that $\tilde q$ tends to $\bar q(x)$ as $p\to\infty$. The global convergence analysis is as follows.

Assumption 3.1. The level set $\Omega$ is bounded.

Lemma 3.2. Under Assumptions 1.1 and 3.1, suppose that the sequences $\{p_k\}$ and $\{x^{1,i}\},\{x^{2,i}\},\dots,\{x^{k,i}\},\dots$ are generated by Algorithm 1. Then, for any $x^{k,i}\in\Omega$ such that $\|\nabla F_{p_k}^{\tilde q}(x^{k,i})\|>\sigma(p_k)$, it holds that

(i) $(1-\eta)\|\nabla F_{p_k}^{\tilde q}(x^{k,i})\|\le\|\nabla F_{p_k}(x^{k,i})\|\le(1+\eta)\|\nabla F_{p_k}^{\tilde q}(x^{k,i})\|$;
(ii) $\lambda_{k,i}$ is computed using a finite number of function evaluations;
(iii) $x^{k,i+1}\in\Omega$.

Proof. (i) It follows directly from Corollary 2.3.

(ii) Since $h^{k,i}=-B(x^{k,i})^{-1}\nabla F_{p_k}^{\tilde q}(x^{k,i})$ (with $B(x^{k,i})=I$ if $h^{k,i}=-\nabla F_{p_k}^{\tilde q}(x^{k,i})$), we have

$$\|h^{k,i}\|\le\|B(x^{k,i})^{-1}\|\,\|\nabla F_{p_k}^{\tilde q}(x^{k,i})\|=\|\nabla F_{p_k}^{\tilde q}(x^{k,i})\|/\sigma_{\min}(x^{k,i})\le\|\nabla F_{p_k}^{\tilde q}(x^{k,i})\|/\kappa_0,$$

where $\kappa_0=\min\{\delta,1\}$, and $\sigma_{\min}(x^{k,i})$ and $\sigma_{\max}(x^{k,i})$ are the smallest and largest eigenvalues of $B(x^{k,i})$, respectively. Let $L<\infty$ be a constant such that

$$\langle y,\nabla^2 F_{p_k}(x)y\rangle\le L\|y\|^2$$

for all $x\in B(x^{k,i},\|\nabla F_{p_k}^{\tilde q}(x^{k,i})\|/\kappa_0)$ and $y\in\mathbb{R}^n$. Now we have that, for some $s\in[0,1]$,

$$\begin{aligned}&F_{p_k}(x^{k,i}+\lambda h^{k,i})-F_{p_k}(x^{k,i})-\alpha\lambda\big\langle\nabla F_{p_k}^{\tilde q}(x^{k,i}),h^{k,i}\big\rangle\\ &\quad=\lambda\big\langle\nabla F_{p_k}(x^{k,i})-\nabla F_{p_k}^{\tilde q}(x^{k,i}),h^{k,i}\big\rangle+\lambda(1-\alpha)\big\langle\nabla F_{p_k}^{\tilde q}(x^{k,i}),h^{k,i}\big\rangle+\lambda^2\big\langle h^{k,i},\nabla^2 F_{p_k}(x^{k,i}+s\lambda h^{k,i})h^{k,i}\big\rangle/2\\ &\quad\le\lambda\eta\|\nabla F_{p_k}^{\tilde q}(x^{k,i})\|^2/\sigma_{\min}(x^{k,i})-\lambda(1-\alpha)\|\nabla F_{p_k}^{\tilde q}(x^{k,i})\|^2/\sigma_{\max}(x^{k,i})+L\lambda^2\|\nabla F_{p_k}^{\tilde q}(x^{k,i})\|^2/\big(2\sigma_{\min}^2(x^{k,i})\big)\\ &\quad\le\lambda\|\nabla F_{p_k}^{\tilde q}(x^{k,i})\|^2/\big(\kappa_1^2\sigma_{\max}(x^{k,i})\big)\,\big(\eta-(1-\alpha)\kappa_1^2+L\lambda/(2\kappa_0)\big).\end{aligned}$$

Let $\lambda^*=\min\{2\kappa_0((1-\alpha)\kappa_1^2-\eta)/L,\,1\}$; then for any $l$ such that $\beta^l\le\lambda^*$, $\beta^l$ satisfies inequality (3.2). Hence the step length $\lambda_{k,i}$ is computed in a finite number of operations.

(iii) Since $\langle\nabla F_{p_k}^{\tilde q}(x^{k,i}),h^{k,i}\rangle\le-\|\nabla F_{p_k}^{\tilde q}(x^{k,i})\|^2/\sigma_{\max}(x^{k,i})<0$, from (ii), (1.7) and Lemma 2.1 it follows that

$$F(x^{k,i+1})\le F_{p_k}(x^{k,i+1})<F_{p_k}(x^{k,i})<\dots<F_{p_k}(x^{k,0})<F_{p_{k-1}}(x^{k,0})<\dots<F_{p_0}(x^0).$$

Hence $x^{k,i+1}\in\Omega$. $\Box$

Lemma 3.3. Under Assumptions 1.1 and 3.1, suppose that the sequences $\{p_k\}$ and $\{x^{1,i}\},\{x^{2,i}\},\dots,\{x^{k,i}\},\dots$ are generated by Algorithm 1. Then,

(i) for any $k$, the sequence $\{x^{k,i}\}$ is finite, i.e., there exists an $i_k\in\mathbb{N}$ such that (3.3) holds for $i=i_k$;
(ii) the sequence $\{p_k\}$ is strictly monotonically increasing and $p_k\to\infty$ as $k\to\infty$.


Proof. (i) Suppose not, and let $k'$ be the smallest integer such that $\|\nabla F_{p_{k'}}^{\tilde q}(x^{k',i})\|>\sigma(p_{k'})$ for all $i\in\mathbb{N}$. Since $x^0\in\Omega$, it follows from Lemma 3.2 that $x^{k',i}\in\Omega$ for all $i\in\mathbb{N}$ and

$$(1-\eta)\|\nabla F_{p_{k'}}^{\tilde q}(x^{k',i})\|\le\|\nabla F_{p_{k'}}(x^{k',i})\|.$$

Then, by the boundedness of $\Omega$, there exists an infinite subsequence $N'\subseteq\mathbb{N}$ such that $x^{k',i}\to_{N'}\hat x$ with $\|\nabla F_{p_{k'}}(\hat x)\|\ge(1-\eta)\sigma(p_{k'})$. By the continuity of $\|\nabla F_{p_{k'}}(x)\|$, there exists a $\rho_0>0$ such that, for all $x\in B(\hat x,\rho_0)$,

$$\|\nabla F_{p_{k'}}(\hat x)\|/2\le\|\nabla F_{p_{k'}}(x)\|\le 4\|\nabla F_{p_{k'}}(\hat x)\|/3.$$

It then follows from Lemma 3.2 that, for all $x^{k',i}\in B(\hat x,\rho_0)$,

$$\|\nabla F_{p_{k'}}^{\tilde q}(x^{k',i})\|\le(1+\eta)\|\nabla F_{p_{k'}}(x^{k',i})\|\le 2\|\nabla F_{p_{k'}}(\hat x)\|$$

and

$$\|\nabla F_{p_{k'}}^{\tilde q}(x^{k',i})\|\ge(1-\eta)\|\nabla F_{p_{k'}}(x^{k',i})\|\ge\|\nabla F_{p_{k'}}(\hat x)\|/4.$$

Furthermore, because $\bar q$ is finite and $B(\hat x,\rho_0)$ is bounded, there exists a $\kappa_2<\infty$ such that $\sigma_{\max}(x)\le\kappa_2$ for any $\tilde q\subseteq\bar q$ and $x\in B(\hat x,\rho_0)$. Hence, for any $x^{k',i}\in B(\hat x,\rho_0)$, we have

$$\|h^{k',i}\|\le\|\nabla F_{p_{k'}}^{\tilde q}(x^{k',i})\|/\kappa_0\le 2\|\nabla F_{p_{k'}}(\hat x)\|/\kappa_0$$

and

$$\big\langle\nabla F_{p_{k'}}^{\tilde q}(x^{k',i}),h^{k',i}\big\rangle\le-\|\nabla F_{p_{k'}}^{\tilde q}(x^{k',i})\|^2/\kappa_2\le-\|\nabla F_{p_{k'}}(\hat x)\|^2/(16\kappa_2),$$

where $\kappa_0=\min\{\delta,1\}$. Let $L<\infty$ be a constant such that

$$\langle y,\nabla^2 F_{p_{k'}}(x)y\rangle\le L\|y\|^2$$

for all $x\in B(\hat x,2\|\nabla F_{p_{k'}}(\hat x)\|/\kappa_0)$ and $y\in\mathbb{R}^n$. Similarly to the proof of (ii) in Lemma 3.2, there exists $\lambda^*=\min\{\kappa_0((1-\alpha)\kappa_1^2-32\eta)/(32L),\,1\}$ such that, for any $x^{k',i}\in B(\hat x,\rho_0)$ and $\lambda\in(0,\lambda^*]$,

$$F_{p_{k'}}(x^{k',i}+\lambda h^{k',i})-F_{p_{k'}}(x^{k',i})\le\alpha\lambda\big\langle\nabla F_{p_{k'}}^{\tilde q}(x^{k',i}),h^{k',i}\big\rangle\le-\alpha\lambda\|\nabla F_{p_{k'}}(\hat x)\|^2/(16\kappa_2).$$

In view of Step 6 of Algorithm 1, it follows that the step length $\lambda_{k',i}\ge\lambda^*\beta$, and then

$$F_{p_{k'}}(x^{k',i}+\lambda_{k',i}h^{k',i})-F_{p_{k'}}(x^{k',i})\le-\alpha\beta\lambda^*\|\nabla F_{p_{k'}}(\hat x)\|^2/(16\kappa_2). \qquad (3.5)$$

Since $x^{k',i}\to_{N'}\hat x$, there exists an infinite subset $N''\subseteq N'$ such that $x^{k',i}\in B(\hat x,\rho_0)$ for all $i\in N''$ and $x^{k',i}\to_{N''}\hat x$; hence, by continuity, $F_{p_{k'}}(x^{k',i})\to_{N''}F_{p_{k'}}(\hat x)$. Since, for all $i\in\mathbb{N}$,

$$F_{p_{k'}}(x^{k',i+1})-F_{p_{k'}}(x^{k',i})<0,$$

it follows that $F_{p_{k'}}(x^{k',i})\to F_{p_{k'}}(\hat x)$ as $i\to_{\mathbb{N}}\infty$. This contradicts (3.5), which means that for any $k$ the sequence $\{x^{k,i}\}$ is finite, i.e., there exists an $i_k\in\mathbb{N}$ such that (3.3) holds for $i=i_k$.

(ii) The conclusion is obvious from Step 8. $\Box$

Theorem 3.4. Under Assumptions 1.1 and 3.1, suppose that the sequences $\{p_k\}_{k=0}^{\infty}$ and $\{x^{k,i_k}\}_{k=0}^{\infty}$ are generated by Algorithm 1. Then there exists an infinite subsequence $N'\subseteq\mathbb{N}$ such that $x^{k,i_k}\to_{N'}x^*$ as $k\to\infty$ and $0\in\partial F(x^*)$.

Proof. From Lemma 3.2, $x^{k,i_k}\in\Omega$ for any $k\in\mathbb{N}$, and hence there exists an infinite subsequence $N'\subseteq\mathbb{N}$ such that $x^{k,i_k}\to_{N'}x^*$. Since $\|\nabla F_{p_k}^{\tilde q}(x^{k,i_k})\|\le\sigma(p_k)$ and $\sigma(p_k)\to 0$ as $k\to\infty$, we have

$$\|\nabla F_{p_k}^{\tilde q}(x^{k,i_k})\|\to 0 \quad\text{as } k\to\infty.$$

Together with

$$\|\nabla F_{p_k}(x^{k,i_k})-\nabla F_{p_k}^{\tilde q}(x^{k,i_k})\|\le\eta\sigma(p_k),$$

it follows that

$$\lim_{k\to_{N'}\infty}\|\nabla F_{p_k}(x^{k,i_k})\|=0. \qquad (3.6)$$

Now, since

$$\nabla F_{p_k}(x^{k,i_k})=\sum_{j\in\bar q}\lambda_{p_k}^{j}(x^{k,i_k})\,\nabla f^j(x^{k,i_k}),$$

from Proposition 1.4 and (3.6) it follows that

$$\lim_{k\to_{N'}\infty}\nabla F_{p_k}(x^{k,i_k})=\sum_{j\in\bar q(x^*)}\hat\lambda_j\,\nabla f^j(x^*)=0,$$

where $\hat\lambda_j\ge 0$ for all $j\in\bar q(x^*)$ and $\sum_{j\in\bar q(x^*)}\hat\lambda_j=1$; hence $0\in\partial F(x^*)$. $\Box$

In the following, we analyse the local convergence of the inner iteration of Algorithm 1 for the fixed problem $(P_p)$. Let $x^*$ be a solution of $(P_p)$.

Theorem 3.5. Suppose $x^*$ is a solution of $(P_p)$, $\nabla^2 F_p(x^*)$ is sufficiently positive definite in the sense that $\sigma_{\min}(\nabla^2 F_p(x^*))>a_1\delta$ $(a_1>1)$, and the condition number of $\nabla^2 F_p(x^*)$ is bounded above such that $\mathrm{cond}(\nabla^2 F_p(x^*))<a_2/\kappa_1^2$ $(a_2<1)$. Under Assumptions 1.1 and 3.1, for sufficiently small $\sigma(p)>0$, let $\{x^i\}$ be the sequence generated by Algorithm 1 for fixed $p$. If $\varepsilon_1=\min\{\eta\sigma(p),O(\sigma^2(p))\}$ and $\varepsilon_2=O(\sigma(p))$, then for all $i$ greater than a certain index $i_0$,

(i) the search direction is $h^i=-\nabla^2 F_p^{\tilde q}(x^i)^{-1}\nabla F_p^{\tilde q}(x^i)$ and the step length is $\lambda_i=1$; and
(ii) $\|x^{i+1}-x^*\|=O(\|x^i-x^*\|^2)$.

Proof. (i) From (i) of Lemma 3.3, the sequence $\{x^i\}$ is finite, i.e., (3.3) is satisfied after finitely many iterations; denote its length by $i'$. It is obvious that $h^i=-\nabla^2 F_p^{\tilde q}(x^i)^{-1}\nabla F_p^{\tilde q}(x^i)$ for all $i$ greater than a certain index $i_0$. Then,

$$\begin{aligned}&F_p(x^i+h^i)-F_p(x^i)-\alpha\big\langle\nabla F_p^{\tilde q}(x^i),h^i\big\rangle\\ &\quad=\big\langle\nabla F_p(x^i)-\nabla F_p^{\tilde q}(x^i),h^i\big\rangle+(1-\alpha)\big\langle\nabla F_p^{\tilde q}(x^i),h^i\big\rangle+\big\langle h^i,\nabla^2 F_p(x^i)h^i\big\rangle/2+o(\|h^i\|^2)\\ &\quad=\big\langle\nabla F_p(x^i)-\nabla F_p^{\tilde q}(x^i),h^i\big\rangle+\big\langle h^i,\nabla^2 F_p^{\tilde q}(x^i)^{-1}\big(\nabla^2 F_p(x^i)-\nabla^2 F_p^{\tilde q}(x^i)\big)h^i\big\rangle/2\\ &\qquad+(\alpha-3/2)\big\langle\nabla F_p^{\tilde q}(x^i),\nabla^2 F_p^{\tilde q}(x^i)^{-1}\nabla F_p^{\tilde q}(x^i)\big\rangle+o(\|h^i\|^2).\end{aligned}$$

Since

$$\|\nabla F_p(x^i)-\nabla F_p^{\tilde q}(x^i)\|\le\varepsilon_1\le O(\sigma^2(p)), \qquad (3.7)$$

$$\|\nabla^2 F_p(x)-\nabla^2 F_p^{\tilde q}(x)\|\le\varepsilon_2=O(\sigma(p)), \qquad (3.8)$$

$$\|\nabla F_p^{\tilde q}(x^i)\|\ge\sigma(p), \qquad \forall\, i<i', \qquad (3.9)$$

and $\|\nabla F_p^{\tilde q}(x^i)\|=O(\|\nabla F_p(x^i)\|)$, which can be deduced from (3.7) and (3.8), we have

$$F_p(x^i+h^i)-F_p(x^i)-\alpha\big\langle\nabla F_p^{\tilde q}(x^i),h^i\big\rangle=(\alpha-3/2)\big\langle\nabla F_p^{\tilde q}(x^i),\nabla^2 F_p^{\tilde q}(x^i)^{-1}\nabla F_p^{\tilde q}(x^i)\big\rangle+o(\|\nabla F_p(x^i)\|^2),$$

which means that the step length $\lambda_i=1$ for all $i$ greater than a certain index $i_0$, since $\alpha<1$.

(ii) For $i<i'$ such that $x^i$ is sufficiently close to $x^*$, we have

$$\|\nabla F_p(x^i)-\nabla F_p^{\tilde q}(x^i)\|\le O(\|\nabla F_p(x^i)\|^2)=O(\|x^i-x^*\|^2)$$

and

$$\|\nabla^2 F_p(x^i)-\nabla^2 F_p^{\tilde q}(x^i)\|\le O(\|\nabla F_p(x^i)\|)=O(\|x^i-x^*\|).$$

Then,

$$\begin{aligned}\|x^{i+1}-x^*\|\le{}&\|\nabla^2 F_p^{\tilde q}(x^i)^{-1}\|\Big(\|\nabla F_p(x^i)-\nabla F_p(x^*)-\nabla^2 F_p(x^*)(x^i-x^*)\|+\|\nabla F_p^{\tilde q}(x^i)-\nabla F_p(x^i)\|\\ &+\|\big(\nabla^2 F_p^{\tilde q}(x^i)-\nabla^2 F_p(x^i)\big)(x^i-x^*)\|+\|\big(\nabla^2 F_p(x^i)-\nabla^2 F_p(x^*)\big)(x^i-x^*)\|\Big)=O(\|x^i-x^*\|^2). \qquad\Box\end{aligned}$$

4. Numerical experiment

In this section, we give some numerical results, comparing Algorithm 1 with several other algorithms, to show the efficiency of our algorithm. Algorithms PRW1 and PRW2 were proposed by Polak et al. in [34] and have been introduced


in Section 1. Both of these algorithms are based on the stabilized Newton–Armijo algorithm, and in PRW2 the Hessian is replaced by a BFGS approximation. Fminimax is the MATLAB function in the MATLAB Optimization Toolbox that uses an SQP algorithm. Three test problems and the numerical results are listed below. Problems 1–2 are artificial and Problem 3 is taken from [38]. Either the number of functions or the dimension of the variables can be altered arbitrarily in these examples.

The parameters for PRW1 and PRW2 are set as $\alpha=0.5$, $\beta=0.8$, $\hat p=10^5\ln q$, $\kappa_1=10^{-7}$, $\kappa_2=10^{30}$, $\kappa_3=1000\hat p$, $\sigma(p)=10^{-3}$, $p_0=1$, $(a^*,b^*)=(0.01,0.2)$, and $0.1$ for the remaining tolerance parameter. In Algorithm 1, $(a^*,b^*)=(0.1,0.45)$, $\bar\gamma=10^2$, $\bar\omega=10$, $\sigma(p)=\min\{10^{-1},1000/p\}$, $\varepsilon_1=10^{-2}$, $\varepsilon_2=10^{-1}$, $\delta=2$ for Example 2 and $\delta=0.1$ for the other examples; the other parameters are the same as in PRW1 and PRW2. As in [34], the algorithm is stopped when

$$|F_{p_k}(x^{k,i})-F(\hat x)|\le t,$$

where $F(\hat x)$ is the known optimal value. As noted in [34], although this stopping criterion cannot be used in practical problems, it is the most useful one when comparing the computational efficiency of different algorithms; we also adopt it in the numerical experiments.

All the computations are done by running MATLAB 7.6.0 on a laptop with an AMD Turion 64 X2 CPU at 1.9 GHz and 896 MB memory. Only the MATLAB functions chol, rcond and eigs(A,1,'SA') are used to compute the Cholesky decomposition, the reciprocal condition number and the smallest eigenvalue. Fminimax is run by calling the MATLAB function fminimax directly. The results are listed in the following tables, where $x^*$ denotes the final approximate solution point, $F^*$ is the value of the objective function at $x^*$, and Time is the CPU time in seconds.

Example 1. Let $x=(x_1,x_2,x_3,x_4,x_5,x_6)\in\mathbb{R}^6$,

$$F(x)=\max_{1\le j\le q}|f^j(x)|,$$

$$f^j(x)=\frac{x_3}{x_2}\exp(-t^jx_1)\sin(t^jx_2)+x_1\exp(-x_2t^j)\cos(x_3t^j+x_4)+x_5\exp(-x_6t^j)-y^j,$$

$$\begin{aligned}y^j={}&\frac{3}{20}\exp(-t^j)+\frac{1}{52}\exp(-5t^j)-\frac{1}{65}\exp(-2t^j)\big(3\sin(2t^j)+11\cos(2t^j)\big)+\frac{1}{2}\exp(-t^j)\\ &-\frac{1}{2}\exp(-2t^j)+\frac{3}{2}\exp(-3t^j)+\frac{3}{2}\exp(-t^j)\sin(7t^j)+\frac{5}{2}\exp(-t^j)\sin(5t^j),\end{aligned}$$

$$t^j=\frac{10(j-1)}{q-1},\qquad j=1,\dots,q.$$

Example 2. Let $x=(x_1,\dots,x_n)\in\mathbb{R}^n$,

$$F(x)=\max_{1\le j\le q}|f^j(x)|,$$

$$f^j(x)=\Bigg(\sum_{1\le k\le n/2}\big(\cos(2x_kt^j)+\sin(2x_{k+n/2}t^j)\big)\Bigg)\Bigg(\sum_{1\le k\le n/2}\big(x_k\cos(2kt^j)+x_{k+n/2}\sin(2kt^j)\big)\Bigg)-y^j,$$

$$t^j=\frac{2\pi(j-1)}{q},\qquad y^j=f^j(s)+r^j,\qquad j=1,\dots,q,$$

where $s=(0.5,\dots,0.5)$ and $r\sim N(0,0.3)$.

Example 3. For fitting curves or surfaces to observed or measured data, a common criterion is to minimize the $l_\infty$ norm of the residual errors (see [1]). Consider rotated paraboloid data (see [38]), which appear in computational metrology when a parabolic reflector is measured. Data sets $\{(z_1^j,z_2^j,z_3^j)\}_{j=1}^q$ are generated similarly to the prescription in [38]: produce points $(\tilde z_1^j,\tilde z_2^j,\tilde z_3^j)$ on the unrotated paraboloid $z_3=5(z_1^2+z_2^2)$, add an error term $r^j$ to $\tilde z_3^j$, where $r\sim N(0,0.2)$, and then apply rotations and a translation such that

$$\big(z_1^j,z_2^j,z_3^j,1\big)=\big(\tilde z_1^j,\tilde z_2^j,\tilde z_3^j,1\big)\,C(\pi/25)\,B(\pi/20)\,A(2.1,1.4,1.3),$$

where

$$A(x_3,x_4,x_5)=\begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\x_3&x_4&x_5&1\end{pmatrix},\quad B(x_1)=\begin{pmatrix}1&0&0&0\\0&\cos(x_1)&\sin(x_1)&0\\0&-\sin(x_1)&\cos(x_1)&0\\0&0&0&1\end{pmatrix},\quad C(x_2)=\begin{pmatrix}\cos(x_2)&0&-\sin(x_2)&0\\0&1&0&0\\\sin(x_2)&0&\cos(x_2)&0\\0&0&0&1\end{pmatrix}.$$

Hence

$$F(x)=\max_{1\le j\le q}\Big|\hat z_3^j-x_6\big((\hat z_1^j)^2+(\hat z_2^j)^2\big)\Big|,$$

where $\big(\hat z_1^j,\hat z_2^j,\hat z_3^j,1\big)=\big(z_1^j,z_2^j,z_3^j,1\big)\,A(x_3,x_4,x_5)\,B(x_1)\,C(x_2)$.
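For readers who want to reproduce such a data set, the following sketch generates rotated-paraboloid data and evaluates the objective of Example 3. The sampling range of $(\tilde z_1,\tilde z_2)$, the function names and the NumPy implementation are our own illustrative assumptions (the paper itself uses MATLAB).

```python
import numpy as np

def homog_translation(x3, x4, x5):
    A = np.eye(4)
    A[3, :3] = [x3, x4, x5]           # matrix A of Example 3 (row-vector convention)
    return A

def homog_rot_first_axis(x1):
    B = np.eye(4)                      # matrix B of Example 3
    B[1, 1] = B[2, 2] = np.cos(x1); B[1, 2] = np.sin(x1); B[2, 1] = -np.sin(x1)
    return B

def homog_rot_second_axis(x2):
    C = np.eye(4)                      # matrix C of Example 3
    C[0, 0] = C[2, 2] = np.cos(x2); C[0, 2] = -np.sin(x2); C[2, 0] = np.sin(x2)
    return C

def make_data(q, rng):
    """Points on z3 = 5(z1^2 + z2^2), perturbed in z3, then rotated and translated."""
    z12 = rng.uniform(-1.0, 1.0, size=(q, 2))          # sampling range is an assumption
    z3 = 5.0 * (z12 ** 2).sum(axis=1) + rng.normal(0.0, 0.2, size=q)
    Z = np.column_stack([z12, z3, np.ones(q)])
    T = homog_rot_second_axis(np.pi / 25) @ homog_rot_first_axis(np.pi / 20) \
        @ homog_translation(2.1, 1.4, 1.3)
    return Z @ T

def F(x, Z):
    """Objective of Example 3: worst residual after undoing the fitted motion."""
    W = Z @ homog_translation(x[2], x[3], x[4]) @ homog_rot_first_axis(x[0]) \
          @ homog_rot_second_axis(x[1])
    return np.abs(W[:, 2] - x[5] * (W[:, 0] ** 2 + W[:, 1] ** 2)).max()

rng = np.random.default_rng(0)
Z = make_data(10000, rng)
x0 = np.array([0.0, 0.0, 2.0, 2.0, 2.0, 5.0])          # starting point used in Table 4
print(F(x0, Z))
```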

Table 1
The numerical results of Example 1, $x^0=(2,2,7,0,2,1)$, $t=10^{-5}$.

Method        q = 2000            q = 4000            q = 10000           q = 20000
              Time    F*          Time     F*         Time     F*         Time     F*
Algorithm 1   2.043   0.162825    3.532    0.162827   9.503    0.162827   8.691    0.162827
PRW1          4.193   0.162827    10.621   0.162831   38.101   0.162828   46.869   0.162827
PRW2          4.640   0.162827    7.704    0.162831   18.455   0.162828   15.928   0.162827
Fminimax      3.468   0.162824    5.969    0.162828   19.344   0.162835   58.109   0.162832

Solution obtained by Algorithm 1:
q = 2000:   x* = (2.576076, 1.998258, 6.406992, 1.129789, 1.262420, 2.558387)
q = 4000:   x* = (2.579942, 2.009336, 6.404138, 1.129936, 1.263729, 2.571348)
q = 10000:  x* = (2.579116, 2.006984, 6.404779, 1.129916, 1.263426, 2.568508)
q = 20000:  x* = (2.579010, 2.006741, 6.404895, 1.129937, 1.263332, 2.568006)

Table 2
The numerical results of Example 2 with changeable $n$, $t=10^{-5}$; $q=1000$ in all cases.

Method        n = 20               n = 50               n = 70               n = 100
              x0 = (0.7,...,0.7)   x0 = (0.6,...,0.6)   x0 = (0.6,...,0.6)   x0 = (0.5,...,0.5)
              Time     F*          Time      F*         Time      F*         Time       F*
Algorithm 1   4.119    0.827575    35.600    0.637887   41.934    0.616779   44.894     0.605597
PRW1          11.916   0.827573    159.889   0.637887   160.069   0.616779   277.318    0.605597
PRW2          16.375   0.827573    63.953    0.638864   252.515   0.617766   260.125    0.605752
Fminimax      31.297   0.828838    253.203   0.637887   627.485   0.616769   3058.569   23.998441

Table 3
The numerical results of Example 2 with changeable $q$, $t=10^{-4}$; $n=50$ and $x^0=(0.6,\dots,0.6)$ in all cases.

Method        q = 2000             q = 3000             q = 4000             q = 5000
              Time      F*         Time      F*         Time      F*         Time       F*
Algorithm 1   20.905    0.797176   43.824    0.895389   78.750    0.970461   67.840     0.910058
PRW1          65.886    0.797184   113.687   0.895389   351.884   0.970461   184.323    0.910056
PRW2          132.507   0.797184   214.901   0.895389   267.891   0.970553   356.485    0.910149
Fminimax      118.835   0.797087   244.817   0.895293   422.121   0.970454   2998.065   0.910050

We see from the results shown in Tables 1–4 that the truncated aggregate smoothing Newton method is faster than the other three methods, especially for problems with large $q$. In the following, a test on Example 3 ($q=10000$) with different values of the parameters $\bar\gamma$ and $\bar\omega$ is given; the results are listed in Table 5. From Table 5, we find that the performance of our algorithm depends only moderately on the values of the parameters $\bar\gamma$ and $\bar\omega$. In our experience, these parameters can generally take values in a considerably wide range, from one to thousands. This seems reasonable, observing that $\mu$ in (2.6) depends mainly on $p$ and only moderately on $\bar\gamma$ and $\bar\omega$ because of the ln operation.

Table 4
The numerical results of Example 3, $x^0=(0,0,2,2,2,5)$, $t=10^{-5}$.

Method        q = 1000           q = 5000            q = 10000           q = 20000
              Time    F*         Time     F*         Time     F*         Time     F*
Algorithm 1   0.594   0.563854   2.911    0.729804   2.932    0.748167   8.129    0.748834
PRW1          0.779   0.563854   9.132    0.729805   14.243   0.748167   24.671   0.748834
PRW2          1.172   0.563854   14.313   0.729805   13.569   0.748166   26.578   0.748832
Fminimax      1.672   1.916302   5.110    1.978806   9.094    1.990350   21.891   1.990973

Solution obtained by Algorithm 1:
q = 1000:   x* = (0.156835, 0.126266, 2.083052, 1.400898, 1.273032, 4.977858)
q = 5000:   x* = (0.157433, 0.127799, 2.061212, 1.384125, 1.231683, 5.008953)
q = 10000:  x* = (0.156252, 0.126171, 2.095124, 1.405886, 1.294852, 4.984403)
q = 20000:  x* = (0.156598, 0.126365, 2.094939, 1.403342, 1.298032, 4.989953)


Table 5
The CPU time (seconds) of Example 3 ($q=10000$) with changeable $\bar\gamma$ and $\bar\omega$.

                     omega_bar = 1   1e1     1e2     1e3     1e4
gamma_bar = 1        2.865           2.910   3.005   3.108   3.064
gamma_bar = 1e1      2.970           2.964   2.965   2.867   2.937
gamma_bar = 1e2      3.030           2.932   2.962   3.020   2.960
gamma_bar = 1e3      3.029           3.098   2.932   3.010   3.038
gamma_bar = 1e4      3.069           3.086   3.057   3.048   3.053

5. Conclusion

We have developed a truncated aggregate technique and combined it with a stabilized Newton method for the solution of finite minimax problems. At each iteration, only a small subset of the components in the max-function is aggregated, hence the number of gradient and Hessian calculations is dramatically reduced, and the computational cost is greatly reduced, especially for problems with large $q$ and complicated gradients or Hessians. The truncating criteria, which involve only the computation of function values and not their gradients or Hessians, adaptively update the subset so as to guarantee global convergence and locally quadratic convergence at as little computational cost as possible. The numerical results show that our algorithm is competitive with other aggregate smoothing algorithms.

References

[1] I. Al-Subaihi, G.A. Watson, Fitting parametric curves and surfaces by $l_\infty$ distance regression, BIT 45 (2005) 443–461.
[2] D. Applegate, W. Cook, S. Dash, A. Rohe, Solution of a min–max vehicle routing problem, INFORMS J. Comput. 14 (2002) 132–143.
[3] E.M. Arkin, R. Hassin, A. Levin, Approximations for minimum and min–max vehicle routing problems, J. Algorithms 59 (2006) 1–18.
[4] J.W. Bandler, C. Charalambous, Practical least pth optimization of networks, IEEE Trans. Microwave Theory Tech. 20 (1972) 834–840.
[5] N.V. Banichuk, Minimax approach to structural optimization problems, J. Optim. Theory Appl. 20 (1976) 111–127.
[6] O. Berman, Z. Drezner, R.M. Wang, G.O. Wesolowsky, The minimax and maximin location problems on a network with uniform distributed weights, IIE Trans. 35 (2003) 1017–1025.
[7] C. Charalambous, Acceleration of the least pth algorithm for minimax optimization with engineering applications, Math. Program. 17 (1979) 270–297.
[8] E. Cherkaev, A. Cherkaev, Minimax optimization problem of structural design, Comput. Struct. 86 (2008) 1426–1435.
[9] Z. Drezner, G.O. Wesolowsky, Single facility $l_p$-distance minimax location, SIAM J. Alg. Disc. Meth. 1 (1980) 315–321.
[10] E. Erkut, R.L. Francis, A. Tamir, Distance-constrained multifacility minimax location problems on tree networks, Networks 22 (1992) 37–54.
[11] A. Frangioni, Generalized bundle methods, SIAM J. Optim. 13 (2002) 117–156.
[12] M. Gaudioso, M.F. Monaco, A bundle type approach to the unconstrained minimization of convex nonsmooth functions, Math. Program. 23 (1982) 216–226.
[13] C. Gígola, S. Gomez, A regularization method for solving the finite convex min–max problem, SIAM J. Numer. Anal. 27 (1990) 1621–1634.
[14] P. Hansen, D. Peeters, D. Richard, J.F. Thisse, The minimum and minimax location problems revisited, Oper. Res. 33 (1985) 1251–1265.
[15] K. Holmaker, Minimax optimal-control problem, J. Optim. Theory Appl. 28 (1979) 391–410.
[16] T. Kitahara, S. Mizuno, K. Nakata, Quadratic and convex minimax classification problems, J. Oper. Res. Soc. Jpn. 51 (2008) 191–201.
[17] K.C. Kiwiel, Methods of Descent for Nondifferentiable Optimization, Lecture Notes in Mathematics, vol. 1133, Springer, Berlin, 1985.
[18] B.W. Kort, D.P. Bertsekas, A new penalty function algorithm for constrained minimization, in: Proceedings of the 1972 IEEE Conference on Decision and Control, New Orleans, Louisiana, 1972.
[19] X.S. Li, An aggregate function method for nonlinear programming, Sci. China Ser. A 34 (1991) 1467–1473.
[20] X.S. Li, An aggregate function constraint method for nonlinear programming, J. Oper. Res. Soc. 42 (1991) 1003–1010.
[21] X.S. Li, An entropy-based aggregate method for minimax optimization, Eng. Optim. 18 (1992) 277–285.
[22] X.S. Li, S.C. Fang, On the entropic regularization method for solving min–max problems with applications, Math. Methods Oper. Res. 46 (1997) 119–130.
[23] X.S. Li, S. Pan, Solving the finite min–max problem via an exponential penalty method, Comput. Tech. 8 (2003) 3–15.
[24] G. Liuzzi, S. Lucidi, M. Sciandrone, A derivative-free algorithm for linearly constrained finite minimax problems, SIAM J. Optim. 16 (2006) 1054–1075.
[25] H. Luss, Minimax resource-allocation problems: optimization and parametric analysis, European J. Oper. Res. 60 (1992) 76–86.
[26] D.Q. Mayne, E. Polak, Nondifferentiable optimization via adaptive smoothing, J. Optim. Theory Appl. 43 (1984) 601–614.
[27] W. Murray, M.L. Overton, A projected Lagrangian algorithm for nonlinear minimax optimization, SIAM J. Sci. Statist. Comput. 1 (1980) 345–370.
[28] J. Nocedal, S.J. Wright, Numerical Optimization, in: P. Glynn, S.M. Robinson (Eds.), Springer Series in Operations Research, Springer-Verlag, New York, 1999.
[29] H.J. Oberle, Numerical solution of minimax optimal-control problems by multiple shooting technique, J. Optim. Theory Appl. 50 (1986) 331–357.
[30] J.S. Pang, C.S. Yu, A min–max resource-allocation problem with substitutions, European J. Oper. Res. 41 (1989) 218–223.
[31] J.M. Peng, A smoothing function and its applications, in: M. Fukushima, L. Qi (Eds.), Reformulation: Nonsmooth, Piecewise Smooth, Semismooth and Smoothing Methods, Kluwer Academic Publishers, 1998, pp. 293–321.
[32] J.M. Peng, Z.H. Lin, A non-interior continuation method for generalized linear complementarity problems, Math. Program. 86 (1999) 533–563.
[33] E. Polak, Optimization: Algorithms and Consistent Approximations, Springer-Verlag, New York, NY, 1997.
[34] E. Polak, J.O. Royset, R.S. Womersley, Algorithms with adaptive smoothing for finite minimax problems, J. Optim. Theory Appl. 119 (2003) 459–484.
[35] R.A. Polyak, Smooth optimization methods for minimax problems, SIAM J. Control Optim. 26 (1988) 1274–1286.
[36] L. Qi, W. Sun, An iterative method for the minimax problem, in: P.M. Pardalos, D.Z. Du (Eds.), Minimax and Applications, Kluwer, Boston, USA, 1995, pp. 55–67.
[37] N.Z. Shor, Minimization Methods for Non-differentiable Functions, Springer, Berlin, New York, 1985.
[38] H. Späth, Least squares fitting with rotated paraboloids, Math. Commun. 6 (2001) 173–179.
[39] J. Sun, L.W. Zhang, On the log-exponential trajectory of linear programming, J. Global Optim. 25 (2003) 75–90.
[40] H.W. Tang, L.W. Zhang, A maximum entropy method for convex programming, Chinese Sci. Bull. 39 (1994) 682–684.
[41] K.L. Teo, X.Q. Yang, Portfolio selection problem with minimax type risk function, Ann. Oper. Res. 101 (2001) 333–349.
[42] R.S. Womersley, R. Fletcher, An algorithm for composite nonsmooth optimization problems, J. Optim. Theory Appl. 48 (1986) 493–523.


[43] S. Xu, Smoothing method for minimax problems, Comput. Optim. Appl. 20 (2001) 267–279.
[44] Q.Z. Yang, D.Z. Yang, M.H. Zhang, Adjustable entropy function method, Math. Numer. Sin. 23 (1998) 81–86.
[45] B. Yu, G.C. Feng, S.L. Zhang, The aggregate constraint homotopy method for nonconvex nonlinear programming, Nonlinear Anal. TMA 45 (2001) 839–847.
[46] I. Zang, A smoothing technique for min–max optimization, Math. Program. 19 (1980) 61–77.
[47] J.L. Zhou, A.L. Tits, An SQP algorithm for finely discretized continuous minimax problems and other minimax problems with many objective functions, SIAM J. Optim. 6 (1996) 461–487.
[48] J. Zowe, Nondifferentiable optimization: a motivation and a short introduction into the subgradient and the bundle concept, in: K. Schittkowski (Ed.), Computational Mathematical Programming, NATO ASI Series, vol. 15, Springer, New York, 1985, pp. 323–356.