
Trust Region Algorithm for Nonsmooth Optimization

R. J. B. de Sampaio and Jin-Yun Yuan

Departamento de Matemática, Universidade Federal do Paraná, Centro Politécnico, Cx. P.: 19.081, CEP: 81531-990, Curitiba, PR, Brazil

Wen-Yu Sun

Department of Mathematics, Nanjing University, Nanjing, 210093, People's Republic of China

ABSTRACT

Minimization of a composite function $h(f(x))$ is considered here, where $f: \mathbb{R}^n \to \mathbb{R}^m$ is a locally Lipschitzian function and $h: \mathbb{R}^m \to \mathbb{R}$ is a continuously differentiable convex function. The theory of trust region algorithms for nonsmooth optimization given by Fletcher, Powell, and Yuan is extended to this case. A trust region algorithm and its global convergence are studied. Finally, some applications to nonlinear and nonsmooth least squares problems are given.

1. INTRODUCTION

The problem of trust region algorithms for nonsmooth optimization and nonsmooth equations has been considered by Fletcher [1], Powell [2], Yuan [3], Qi and Sun [4], Martinez and Qi [5], and Sun and Yuan [6]. Fletcher [1] and Yuan [3] consider trust region methods for the composite NDO (nondifferentiable optimization) problem

$$\min_{x \in \mathbb{R}^n} F(x) = g(x) + h(f(x)) \tag{1.1}$$


where $g: \mathbb{R}^n \to \mathbb{R}$ and $f: \mathbb{R}^n \to \mathbb{R}^m$ are continuously differentiable functions, and $h: \mathbb{R}^m \to \mathbb{R}$ is a convex but nonsmooth function bounded below. In Yuan [3], $g(x)$ is identically zero for all $x \in \mathbb{R}^n$. However, several practical problems from engineering and statistics fall into the following minimization problem of a nonsmooth composite function

$$\min_{x \in \mathbb{R}^n} F(x) = h(f(x)) \tag{1.2}$$

where $f: \mathbb{R}^n \to \mathbb{R}^m$ is a locally Lipschitzian function and $h: \mathbb{R}^m \to \mathbb{R}$ is a continuously differentiable convex function bounded below. For instance, the problem of solving a system of nonsmooth equations and the least squares problem with nonsmooth data are special cases of (1.2). Therefore, in this paper we use trust region methods to deal with (1.2) and extend the results of Fletcher [1] and Powell [2] to this case. The trust region algorithm is iterative, and at each iteration we need to solve a constrained subproblem. At the $k$th iteration $x_k$, a step bound $\Delta_k > 0$ and an $n \times n$ symmetric matrix $B_k$ are given, and the subproblem is defined as

$$\min_{\|d\| \le \Delta_k} \phi_k(d) = h(f(x_k) + Z^T d) + \tfrac{1}{2} d^T B_k d \tag{1.3}$$

where $Z \in \partial f(x_k)$, the so-called generalized Jacobian of $f$ at $x_k$, is fixed. The vector $x_{k+1}$ is given the value

$$x_{k+1} = \begin{cases} x_k + d_k, & \text{if } F(x_k + d_k) < F(x_k), \\ x_k, & \text{if } F(x_k + d_k) \ge F(x_k), \end{cases}$$

where $d_k$ is the solution of (1.3), and then $\Delta_{k+1}$ is defined for the next iteration. If $F(x_k) - F(x_k + d_k) \ge c_2 [F(x_k) - \phi_k(d_k)]$, then

$$\|d_k\| \le \Delta_{k+1} \le \min\{c_1 \Delta_k, \Delta\}; \quad \text{otherwise, let } c_3 \|d_k\| \le \Delta_{k+1} \le c_4 \Delta_k, \tag{1.4}$$


where $c_1 \ge 1$, $c_2 < 1$, $c_3 < c_4 < 1$, and $\Delta$ is a positive constant which may be taken equal to the diameter of a set $B$ that will be defined below. Then $B_k$ is updated according to some rule. Fletcher [1] proved that if the sequence $\{x_k\}$ lies entirely in a bounded set $B$, and if $B_k$ is the Hessian of the Lagrangian function at the $k$th iteration, then there exists an accumulation point $x^*$ of the sequence $\{x_k\}$ at which the first order condition holds, that is, the generalized directional derivative of $F$ at $x^*$ is nonnegative for all $d \in \mathbb{R}^n$, which means

$$\max_{Z \in \partial f(x^*)} \langle d, Z^T \nabla h(f(x^*)) \rangle \ge 0. \tag{1.5}$$

The condition (1.5) is called Fletcher's condition, and it holds in particular if $\nabla h(f(x^*)) = 0$. Following Yuan [3], we are able to prove that if each matrix $B_k$ satisfies

$$\|B_k\| \le c_5 + c_6 \sum_{i=1}^{k} \Delta_i,$$

then Fletcher's necessary condition (1.5) holds.
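As a concrete, deliberately schematic illustration of the iteration described in this section, the following Python sketch wires the acceptance test for $x_{k+1}$ together with the radius update (1.4). The function `solve_subproblem`, the specific radius values chosen inside the intervals of (1.4), and the frozen `B` are our assumptions for illustration; the paper leaves all three open.

```python
import numpy as np

def trust_region_skeleton(F, solve_subproblem, x0, B0, delta0, delta_max,
                          c1=2.0, c2=0.1, c3=0.25, c4=0.5, max_iter=100):
    # F: the composite objective x -> h(f(x)).
    # solve_subproblem(x, B, delta): assumed to return (d, phi_d), a minimizer
    # of the model (1.3) over ||d|| <= delta together with its model value.
    # Constants as in (1.4): c1 >= 1, c2 < 1, c3 < c4 < 1.
    x = np.asarray(x0, dtype=float)
    B = np.asarray(B0, dtype=float)
    delta = float(delta0)
    for _ in range(max_iter):
        d, phi_d = solve_subproblem(x, B, delta)
        actual = F(x) - F(x + d)          # actual reduction of F
        predicted = F(x) - phi_d          # reduction predicted by the model
        if actual > 0:                    # accept x_{k+1} = x_k + d_k on strict decrease
            x = x + d
        if actual >= c2 * predicted:      # the test of (1.4): enlarge the radius,
            delta = min(c1 * delta, delta_max)   # within [||d_k||, min{c1*Delta_k, Delta}]
        else:                             # shrink within [c3*||d_k||, c4*Delta_k];
            delta = c4 * delta            # c3 is respected automatically since
                                          # ||d_k|| <= Delta_k and c3 < c4.
        # B would be updated here "according to some rule"; we keep it fixed.
    return x
```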

2. PRELIMINARIES

Let $f: \mathbb{R}^n \to \mathbb{R}^m$ be a locally Lipschitzian function. According to a theorem of Rademacher [7], $f$ is Fréchet differentiable almost everywhere. If we denote by $\Omega_f$ the subset of $\mathbb{R}^n$ where $f$ fails to be differentiable, we define the generalized Jacobian of $f$ at $x$, denoted $\partial f(x)$, as the convex hull of all $m \times n$ matrices $Z$ obtained as the limit of a sequence of the form $\{J(x_i)\}$, where $x_i \to x$, $x_i \notin \Omega_f$, and $J(x_i)$ stands for the Jacobian of $f$ at $x_i$. Symbolically, one has

$$\partial f(x) = \mathrm{co}\{\lim J(x_i) : x_i \to x,\ x_i \notin \Omega_f\}.$$
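For instance, for $f(x) = |x|$ on $\mathbb{R}$, the set where $f$ fails to be differentiable is $\Omega_f = \{0\}$ and $J(x_i) = \operatorname{sign}(x_i)$ for $x_i \neq 0$, so the definition gives $\partial f(0) = \mathrm{co}\{-1, 1\} = [-1, 1]$.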

Following Clarke [7], the next two propositions, the first about a locally Lipschitzian function at $x$ and the second about composite functions, are true.

PROPOSITION 1. If $f: \mathbb{R}^n \to \mathbb{R}^m$ is a locally Lipschitzian function at $x$, then:

1. $\partial f(x)$ is a nonempty convex compact subset of $\mathbb{R}^{m \times n}$;


2. $\partial f(x)$ is closed at $x$, that is, if $x_i \to x$, $Z_i \in \partial f(x_i)$, $Z_i \to Z$, then $Z \in \partial f(x)$;
3. $\partial f(x)$ is upper semicontinuous at $x$, that is, for any $\varepsilon > 0$ there is a $\delta > 0$ such that $y \in x + \delta B_n$ implies $\partial f(y) \subset \partial f(x) + \varepsilon B_{m \times n}$, where $B_n$ stands for the unit ball in $\mathbb{R}^n$ and $B_{m \times n}$ stands for the unit ball in $\mathbb{R}^{m \times n}$;
4. if each component function $f_i$ of $f$ is Lipschitzian of rank $k_i$ at $x$, then $f$ is Lipschitzian at $x$ with rank $k = \|(k_1, k_2, \ldots, k_m)\|$, and $\partial f(x) \subset k B_{m \times n}$.

PROPOSITION 2. Let $F = h \circ f$, where $f: \mathbb{R}^n \to \mathbb{R}^m$ is locally Lipschitzian at $x$, and $h: \mathbb{R}^m \to \mathbb{R}$ is a continuously differentiable convex function. Then $F$ is a locally Lipschitzian function at $x$ and one has $\partial F(x) = \{Z^T \nabla h(f(x)) : Z \in \partial f(x)\}$.

For simplicity of notation, we denote

$$\Phi(x; d; Z) = h(f(x)) - h(f(x) + Z^T d),$$
$$\Psi_r(x; Z) = \max_d \{\Phi(x; d; Z) : \|d\| \le r\}, \tag{2.1}$$
$$F^{\circ}(x; d) = \max_{Z \in \partial f(x)} \langle d, Z^T \nabla h(f(x)) \rangle.$$

If $F^{\circ}(x^*; d) \ge 0$ for all $d \in \mathbb{R}^n$, then $x^*$ is a critical point of $F(x) = h(f(x))$. The following elementary results are obtained immediately from Yuan [3]:

PROPOSITION 3. Let $F$, $\Phi$, and $\Psi$ be as above; then:

1. $F^{\circ}(x; d)$ exists for all $x$ and $d$ in $\mathbb{R}^n$;
2. $\Phi(x; \cdot; Z)$ is a concave function on $\mathbb{R}^n$, and given $d \in \mathbb{R}^n$, its directional derivative in the direction $d$ evaluated at $d = 0$ is $F^{\circ}(x; d)$;
3. $\Phi(x; d; \cdot)$ is a concave function on $\partial f(x)$;
4. $\Psi_r(x; Z) \ge 0$ for any $r > 0$; $\Psi_r(x; Z) = 0$ if and only if $x$ is a stationary point of $h(f(x))$;
5. $\Psi_r(x; Z)$ is a concave function in $r$;
6. $\Psi_r(\cdot; Z)$ is continuous for any given $r \ge 0$.
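To make the quantity $\Psi_r(x; Z)$ of (2.1) and item 4 above tangible, here is a rough numerical sketch (our illustration, not part of the paper) that estimates $\Psi_r$ by sampling directions in the ball of radius $r$. The matrix `Z` is applied as `Z @ d`, playing the role of $Z^T d$ in the paper's notation, and the sampled maximum only lower-bounds the true $\Psi_r$.

```python
import numpy as np

def psi_r_estimate(h, fx, Z, r, n_samples=4000, seed=0):
    # Monte Carlo illustration of (2.1):
    #   Psi_r(x; Z) = max_{||d|| <= r} [h(f(x)) - h(f(x) + Z d)],
    # with fx = f(x).  The theory uses the exact maximum; sampling is
    # only meant to make the definition tangible.
    rng = np.random.default_rng(seed)
    n = Z.shape[1]
    best = 0.0                          # d = 0 gives Phi = 0, so Psi_r >= 0
    for _ in range(n_samples):
        d = rng.standard_normal(n)
        d *= r * rng.random() ** (1.0 / n) / np.linalg.norm(d)  # uniform in the r-ball
        best = max(best, h(fx) - h(fx + Z @ d))
    return best

# h = (1/2)||.||^2 at a nonstationary point: the estimate is strictly positive.
h = lambda v: 0.5 * float(v @ v)
print(psi_r_estimate(h, fx=np.array([1.0, -2.0]), Z=np.eye(2), r=1.0))
```

Consistent with item 4, the estimate is nonnegative by construction ($d = 0$ gives $\Phi = 0$) and strictly positive at nonstationary points.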

3. THE MAIN RESULT

Next we shall prove that, under the conditions of Section 1 on $h$, $f$, and $B_k$, Fletcher's condition (1.5) holds.

PROPOSITION 4. The following lower bound for the predicted reduction at each iteration holds:

$$h(f(x_k)) - \phi_k(d_k) \ge \tfrac{1}{2} \Psi_{\Delta_k}(x_k; Z) \min\{1,\ \Psi_{\Delta_k}(x_k; Z) / (\|B_k\| \Delta_k^2)\}.$$

PROOF. Since $d_k$ minimizes $\phi_k(d)$,

$$h(f(x_k)) - \phi_k(d_k) \ge h(f(x_k)) - \phi_k(d), \quad \forall d \in \mathbb{R}^n \text{ with } \|d\| \le \Delta_k.$$

Let $\bar{d}_k$ be such that the maximum in (2.1) is achieved; thus $\Psi_{\Delta_k}(x_k; Z) = h(f(x_k)) - h(f(x_k) + Z^T \bar{d}_k)$. Then, using the convexity of $h(f(x_k) + Z^T(\cdot))$, we have for all $\alpha \in (0, 1]$,

$$\begin{aligned}
h(f(x_k)) - \phi_k(d_k) &\ge h(f(x_k)) - \phi_k(\alpha \bar{d}_k) \\
&= h(f(x_k)) - h(f(x_k) + \alpha Z^T \bar{d}_k) - \tfrac{1}{2} \alpha^2 \bar{d}_k^T B_k \bar{d}_k \\
&\ge h(f(x_k)) - \alpha h(f(x_k) + Z^T \bar{d}_k) - (1 - \alpha) h(f(x_k)) - \tfrac{1}{2} \alpha^2 \bar{d}_k^T B_k \bar{d}_k \\
&= \alpha [h(f(x_k)) - h(f(x_k) + Z^T \bar{d}_k)] - \tfrac{1}{2} \alpha^2 \bar{d}_k^T B_k \bar{d}_k \\
&= \alpha \Psi_{\Delta_k}(x_k; Z) - \tfrac{1}{2} \alpha^2 \bar{d}_k^T B_k \bar{d}_k \\
&\ge \alpha \Psi_{\Delta_k}(x_k; Z) - \tfrac{1}{2} \alpha^2 \|B_k\| \Delta_k^2.
\end{aligned}$$


The above inequality means

$$h(f(x_k)) - \phi_k(d_k) \ge \max_{0 < \alpha \le 1} \left\{ \alpha \Psi_{\Delta_k}(x_k; Z) - \tfrac{1}{2} \alpha^2 \|B_k\| \Delta_k^2 \right\} \ge \min\left\{ \tfrac{1}{2} \Psi_{\Delta_k}(x_k; Z),\ \tfrac{1}{2} \frac{\Psi_{\Delta_k}(x_k; Z)^2}{\|B_k\| \Delta_k^2} \right\},$$

where the last inequality follows by taking $\alpha = \min\{1, \Psi_{\Delta_k}(x_k; Z) / (\|B_k\| \Delta_k^2)\}$. The proof is complete. $\square$

Now we are able to prove, under certain conditions on $B_k$, that if the sequence $\{x_k\}$ generated by the algorithm of Section 1 lies in a bounded set $B$ whose diameter is not bigger than $\Delta$, then $\{x_k\}$ has an accumulation point that satisfies Fletcher's condition (1.5).

THEOREM 5. If $h(f(x))$ satisfies all the conditions stated in Section 1, if the sequence $\{x_k\}$ generated by the algorithm of Section 1 lies in a bounded set $B$ whose diameter is not bigger than $\Delta$, and if $B_k$ satisfies

$$\|B_k\| \le c_5 + c_6 \sum_{i=1}^{k} \Delta_i,$$

then the sequence $\{x_k\}$ has an accumulation point $x^*$ that satisfies condition (1.5).

PROOF. On the contrary, suppose that the sequence $\{x_k\}$ is bounded away from stationary points of $h(f(x))$. By part 4 of Proposition 3, there exists $\delta > 0$ such that $\Psi_1(x_k; Z) > \delta$ for all $k$. Proposition 4 and part 5 of Proposition 3 indicate that

$$h(f(x_k)) - \phi_k(d_k) \ge c_7 \min\left\{ \Delta_k, \frac{1}{\|B_k\|} \right\} \ge \frac{c_7 \Delta_k}{1 + \|B_k\| \Delta_k}$$

holds for all $k$. Using the fact that $h$ is bounded below, we have that $\sum_k' [h(f(x_k)) - h(f(x_{k+1}))]$ is finite, and so is $\sum_k' [h(f(x_k)) - \phi_k(d_k)]$, where $\sum'$ stands for the sum over the iterations on which (1.4) holds. If we choose $c_5$, $c_6$ such that

$$1 + \|B_k\| \Delta_k \le c_5 + c_6 \sum_{i=1}^{k} \Delta_i,$$


then $\sum_k' \Delta_k / (c_5 + c_6 \sum_{i=1}^{k} \Delta_i)$ is finite. Therefore, $\sum_k' \Delta_k$ is convergent. By the definition of $\Delta_{k+1}$, we have, due to Powell [2],

$$\sum_{i=1}^{\infty} \Delta_i \le c_1 \left[ 1 + (1 - c_4)^{-1} \right] \left( \Delta_1 + \sum_k{}' \Delta_k \right). \tag{3.2}$$

We notice by (3.2) that $\sum_{i=1}^{\infty} \Delta_i$ is finite, and then $B_k$ is bounded over all $k$, that is, $\{B_k\}$ is uniformly bounded. Thus, $\Psi_1(x_k; Z)$ cannot be bounded away from zero, by a result of Fletcher [1]. This contradiction shows that our theorem is true. $\square$

4. APPLICATIONS

There are two obvious applications of the above theory: least squares problems with nonsmooth data and nonsmooth equations. It is enough to identify the function $h$ with the squared Euclidean norm. Take $h(\cdot) = \frac{1}{2}\|\cdot\|^2$ and $f(\cdot) = F(\cdot) - b$, where $F(x)^T = (f_1(x), f_2(x), \ldots, f_m(x))$ is not necessarily smooth and $b^T = (b_1, b_2, \ldots, b_m)$. In this case, problem (1.2) is the nonsmooth least squares problem

$$\min_{x \in \mathbb{R}^n} \tfrac{1}{2} \|F(x) - b\|_2^2. \tag{4.1}$$

Since $h$ is a continuously differentiable convex function bounded below, and $F(\cdot) - b$ is locally Lipschitzian, we can apply our trust region method to solve (4.1).
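For instance (our illustration), residuals built from maxima or absolute values of smooth functions, such as $f_i(x) = \max\{u_i(x), v_i(x)\}$ with $u_i, v_i$ continuously differentiable, are locally Lipschitzian but fail to be differentiable where $u_i(x) = v_i(x)$, so (4.1) is a genuinely nonsmooth least squares problem to which the algorithm below applies.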

ALGORITHM.

1. Set initial values: given $x_0 \in \mathbb{R}^n$, $B_0 \in \mathbb{R}^{n \times n}$, and $\Delta > \Delta_0 > 0$; given positive constants $c_1 \ge 1$, $c_2 < 1$, $c_3 < c_4 \le 1$.
2. For $k = 0, 1, 2, \ldots$:

2.1 Solve the subproblem
$$\min_{\|d\| \le \Delta_k} \tfrac{1}{2} \|F(x_k) - b + Z^T d\|_2^2 + \tfrac{1}{2} d^T B_k d \tag{5.1}$$
where $Z \in \partial F(x_k)$ is conveniently fixed.

2.2 If $\|F(x_k + d_k) - b\|_2^2 < \|F(x_k) - b\|_2^2$, then $x_{k+1} = x_k + d_k$; else $x_{k+1} = x_k$, where $d_k$ is a solution of (5.1).


2.3 Check convergence: if some convergence rule is satisfied, then stop; else go to step 2.4.

2.4 Update $\Delta_k$: set
$$r_k = \frac{\tfrac{1}{2}\|F(x_k) - b\|_2^2 - \tfrac{1}{2}\|F(x_k + d_k) - b\|_2^2}{\tfrac{1}{2}\|F(x_k) - b\|_2^2 - \phi_k(d_k)},$$
where $\phi_k$ is the objective of (5.1), and take
$$\Delta_{k+1} \in \begin{cases} [\,\|d_k\|,\ \min\{c_1 \Delta_k, \Delta\}\,] & \text{if } r_k \ge c_2, \\ [\,c_3 \Delta_k,\ c_4 \Delta_k\,] & \text{otherwise.} \end{cases}$$
2.5 Update $B_k$ according to some update formula.

Similarly, we can solve the nonsmooth equation $F(x) = 0$ by using our trust region technique to solve the minimization problem
$$\min_{x \in \mathbb{R}^n} \tfrac{1}{2} \|F(x)\|_2^2.$$
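To make the steps above concrete, here is a minimal, self-contained Python sketch of the algorithm for problem (4.1). Everything not fixed by the paper is an assumption labeled as such in the comments: the subproblem (5.1) is solved inexactly by projected gradient descent (the paper prescribes no solver), the caller supplies one element `Z` of the generalized Jacobian per iterate via `jac_element`, the stopping rule is a small-step test, and $B_k$ is held at the identity since step 2.5 leaves its update open.

```python
import numpy as np

def solve_subproblem(r0, Z, B, delta, iters=200):
    # Inexact solver for (5.1): minimize 0.5*||r0 + Z d||^2 + 0.5*d^T B d over
    # ||d|| <= delta by projected gradient descent.  (Assumption: the paper
    # does not prescribe a subproblem solver.)  Z acts by Z @ d, playing the
    # role of Z^T d in the paper's notation.
    H = Z.T @ Z + B                                   # Hessian of the quadratic model
    step = 1.0 / max(np.linalg.norm(H, 2), 1e-12)     # 1/L gradient step
    d = np.zeros(Z.shape[1])
    for _ in range(iters):
        g = Z.T @ (r0 + Z @ d) + B @ d                # gradient of the model at d
        d = d - step * g
        nd = np.linalg.norm(d)
        if nd > delta:                                # project back onto the trust region
            d *= delta / nd
    phi = 0.5 * np.dot(r0 + Z @ d, r0 + Z @ d) + 0.5 * d @ B @ d
    return d, phi

def nonsmooth_least_squares(F, jac_element, b, x0, delta0=1.0, delta_max=10.0,
                            c1=2.0, c2=0.1, c3=0.25, c4=0.5, tol=1e-8, max_iter=200):
    # Steps 1-2.5 for problem (4.1).  jac_element(x) returns one fixed element
    # of the generalized Jacobian of F at x (an m-by-n matrix); B_k is kept at
    # the identity because step 2.5 leaves the update formula open.
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)
    delta = float(delta0)
    obj = lambda y: 0.5 * float(np.dot(F(y) - b, F(y) - b))
    for _ in range(max_iter):
        d, phi = solve_subproblem(F(x) - b, jac_element(x), B, delta)  # step 2.1
        actual = obj(x) - obj(x + d)
        predicted = obj(x) - phi
        if actual > 0:                        # step 2.2: accept on strict decrease
            x = x + d
        if np.linalg.norm(d) < tol:           # step 2.3: a simple convergence rule
            break
        if predicted > 0 and actual >= c2 * predicted:   # step 2.4: r_k >= c2
            delta = min(c1 * delta, delta_max)
        else:                                 # shrink; c3*||d|| is respected since
            delta = c4 * delta                # ||d|| <= delta and c3 < c4
    return x

# Toy usage: fit the nonsmooth map F(x) = (|x_1|, x_1 + x_2) to b = (1, 0).
F = lambda x: np.array([abs(x[0]), x[0] + x[1]])
jac = lambda x: np.array([[1.0 if x[0] >= 0 else -1.0, 0.0],
                          [1.0, 1.0]])
print(nonsmooth_least_squares(F, jac, b=np.array([1.0, 0.0]), x0=np.array([2.0, 2.0])))
```

On the toy run the residual of the piecewise-smooth map is driven toward zero; away from the kink at $x_1 = 0$ the chosen `Z` coincides with the classical Jacobian, which is why a single fixed element per iterate suffices here.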

The work was supported by CNPq/Brazil, the Foundation of the Federal University of Paraná, Brazil, and the National Natural Science Foundation of China. This work was done when the last author visited the Federal University of Paraná, Brazil.

REFERENCES

1. R. Fletcher, Practical Methods of Optimization, Vol. 2, Constrained Optimization, John Wiley and Sons, New York, 1981.
2. M. J. D. Powell, Convergence properties of a class of minimization algorithms, in: O. L. Mangasarian, R. R. Meyer, and S. M. Robinson, Eds., Nonlinear Programming 2, Academic Press, New York, 1975.
3. Y. Yuan, Conditions for convergence of trust region algorithms for nonsmooth optimization, Mathematical Programming 31:220-228 (1985).
4. L. Qi and J. Sun, A nonsmooth version of Newton's method, Mathematical Programming 58:353-367 (1993).
5. J. M. Martinez and L. Qi, Inexact Newton methods for solving nonsmooth equations, J. Comput. Appl. Math., to appear.
6. W. Sun and Y. Yuan, Optimization Theory and Methods, Academic Press, Beijing, 1995.
7. F. H. Clarke, Optimization and Nonsmooth Analysis, Wiley, New York, 1983.