
Copyright © IFAC 11th Triennial World Congress, Tallinn, Estonia, USSR, 1990

APPROXIMATIONS IN DECOMPOSITION OF LARGE-SCALE CONVEX PROGRAMS VIA A NONDIFFERENTIABLE OPTIMIZATION METHOD

K. C. Kiwiel

Systems Research Institute, Polish Academy of Sciences, Newelska 6, 01-447 Warsaw, Poland

Abstract. A proximal bundle method is presented for minimizing a nonsmooth convex function f. At each iteration it requires only one approximate evaluation of f and its ε-subgradient, and it finds a search direction via quadratic programming. When applied to Lagrangian decomposition of convex programs, it allows for inexact solutions of the decomposed subproblems; yet, by increasing their required accuracy automatically, it asymptotically finds both primal and dual solutions. Some encouraging numerical experience is reported.

Keywords. Convex programming; decomposition; mathematical programming; nonlinear programming; nondifferentiable optimization; large-scale programming; proximal bundle methods.

INTRODUCTION

This paper presents a new decomposition method for solving separable convex programming problems. The motivation for this algorithm comes from the work of Kiwiel (1989b) on proximal bundle methods for nondifferentiable optimization.

The convex separable optimization problem is to

maximize   φ_0(z) := Σ_{i=1}^n φ_{0i}(z_i)

subject to φ_j(z) := Σ_{i=1}^n φ_{ji}(z_i) ≥ 0,  j = 1,...,N,    (1)

           z = (z_1,...,z_n) ∈ Z := Z_1 × ... × Z_n,

where the φ_{ji} are (possibly nondifferentiable) closed proper concave functions on closed sets Z_i ⊂ dom φ_{ji} ⊂ ℝ^{L_i}, j = 0,...,N, i = 1,...,n. Suppose that (1) is feasible. We assume that the dual function

f(x) = sup{ φ_0(z) + <x, φ(z)> : z ∈ Z },   φ := (φ_1,...,φ_N),    (2)

is finite for all x ≥ 0, and that for each x ≥ 0 and ε > 0 we can evaluate

f_i(x) = sup{ φ_{0i}(z_i) + Σ_{j=1}^N x_j φ_{ji}(z_i) : z_i ∈ Z_i }    (3)

with accuracy ε/n by finding some z_i(x,ε) in

Z_i(x,ε) = { z_i ∈ Z_i : φ_{0i}(z_i) + Σ_{j=1}^N x_j φ_{ji}(z_i) ≥ f_i(x) − ε/n }    (4)

and setting z(x,ε) = (z_1(x,ε),...,z_n(x,ε)), so that

f(x) = Σ_{i=1}^n f_i(x)    (5)

can be computed with accuracy ε:

f(x) ≥ φ_0(z(x,ε)) + <x, φ(z(x,ε))> ≥ f(x) − ε.    (6)
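For illustration, a minimal sketch of the approximate dual evaluation (3)-(6) follows; the list of per-subsystem solvers and their return convention are hypothetical assumptions, not part of the paper.

```python
import numpy as np

def evaluate_dual(x, eps, subsystems):
    """Approximate evaluation of the dual function (2) via (3)-(6).

    `subsystems` is a hypothetical list of per-subsystem solvers:
    subsystems[i](x, tol) must return (phi0_i, phi_i, z_i), where z_i lies
    in Z_i and attains the supremum in (3) within `tol`, phi0_i equals
    phi_{0i}(z_i), and phi_i is the vector (phi_{1i}(z_i),...,phi_{Ni}(z_i)).
    Returns the value (9a), which satisfies f(x) - eps <= value <= f(x),
    the eps-subgradient (9b), and z(x, eps) of (4).
    """
    x = np.asarray(x, dtype=float)
    phi0, phi, z = 0.0, np.zeros_like(x), []
    for solve_i in subsystems:
        phi0_i, phi_i, z_i = solve_i(x, eps / len(subsystems))  # accuracy eps/n
        phi0 += phi0_i
        phi += np.asarray(phi_i, dtype=float)
        z.append(z_i)
    return phi0 + x @ phi, phi, z
```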

The above assumptions are realistic in many applications (see, e.g., Demyanov and Vasiliev, 1985; Shor, 1985). For example, it may be impossible to find z(x,0) in finite time, whereas calculating z(x,ε) for a prescribed ε > 0 may require much less work, e.g. when (3) involves solving a linear, quadratic or discrete problem by the methods of Gabasov and co-workers (1987).

Our extension of the proximal bundle method of Kiwiel (1989b) solves the dual problem to (1):

minimize f(x) over all x ∈ S,    (7)

where S = ℝ^N_+. It is a feasible point method of descent, in the sense that it generates a sequence {x^k} ⊂ S converging to some x̄ ∈ X, over-estimates f_x^k ≥ f(x^k) such that f_x^{k+1} < f_x^k if x^{k+1} ≠ x^k, tolerances ε^k → 0 and trial points y^k ∈ S for evaluating approximate linearizations of f. More specifically, we say that f(·; y, ε) is an ε-linearization of f at y if it is an affine function of the form

f(x; y, ε) = f(y; y, ε) + <g_f(y,ε), x − y>    (8a)

which supports the epigraph of f at y with tolerance ε in the sense that

f(x) ≥ f(x; y, ε)  for all x ∈ S,    (8b)

f(y; y, ε) ≥ f(y) − ε.    (8c)

Incidentally, g_f(y,ε) is an ε-subgradient of f at y. Of course, by (2) and (6), we may use

f(y; y, ε) = φ_0(z(y,ε)) + <y, φ(z(y,ε))>,    (9a)

g_f(y,ε) = φ(z(y,ε)).    (9b)

At the k-th iteration the method employs the following piecewise linear (polyhedral) lower approximation to f:

f̂^k(x) = max{ f(x; y^j, ε^j) : j ∈ J^k }    (10)

with J^k ⊂ {1,...,k}, |J^k| ≤ N+2. The next trial point is

y^{k+1} = argmin{ f̂^k(x) + u^k|x − x^k|²/2 : x ∈ S },    (11)

where u^k > 0 is chosen by safeguarded quadratic interpolation to estimate the curvature of f between y^{k+1} and x^k. A serious step from x^k to x^{k+1} = y^{k+1} occurs if y^{k+1} is significantly better than x^k in the sense that

f^{k+1} < f_x^k + m_L v^k,    (12)

where

f^{k+1} := f(y^{k+1}; y^{k+1}, ε^{k+1}) + ε^{k+1}    (13)

overestimates f(y^{k+1}) by (8c),

ε^{k+1} = −(m_R − m_L) v^k    (14)

is the next tolerance, 0 < m_L < m_R < 1 are fixed, and

v^k = f̂^k(y^{k+1}) − f_x^k    (15)

is the predicted descent (if v^k = 0, the algorithm may stop with an optimal x^k; see (25)); then x^{k+1} = y^{k+1} and f_x^{k+1} < f_x^k. Otherwise a null step x^{k+1} = x^k improves the next model f̂^{k+1} with f(·; y^{k+1}, ε^{k+1}), since the failure of (12) combined with (13)-(14) gives

f(y^{k+1}; y^{k+1}, ε^{k+1}) − f_x^k ≥ m_L v^k − ε^{k+1} = m_R v^k > v^k,    (16)

so that a better next trial point y^{k+2} ≠ y^{k+1} is found; cf. (15).

We show that x^k → x̄ ∈ X under no additional assumptions. Moreover, the method also produces (asymptotically) an optimal primal solution to (1) as follows. Solving (11) through quadratic programming (QP) yields Lagrange multipliers λ_j^k ≥ 0, j ∈ Ĵ^k, summing up to 1, that produce the aggregate solution

z^k = Σ_{j∈Ĵ^k} λ_j^k z(y^j, ε^j)    (17)

in Z (by convexity), such that

φ_0(z^k) ≥ sup{ φ_0(z) : φ(z) ≥ 0, z ∈ Z } − ε_z^k,    (18)

φ_i(z^k) ≥ −ε_F^k,  i = 1,...,N,    (19)

with easily computable tolerances ε_z^k → 0 and ε_F^k → 0. Hence every accumulation point of {z^k} (which exists, e.g., for compact Z) solves (1); in practice one may stop if ε_z^k and ε_F^k are small enough. (We may add that the method will find a solution in a finite number of iterations if f is polyhedral, ε^k = 0 and either m_L = 1 in (12) or certain technical conditions are satisfied; see Kiwiel (1989c) for details.)

There is extensive literature on decomposition in separable programming (see, e.g., the references of Sen and Sherali, 1986; Spingarn, 1985). Our method may be regarded as a regularized version of the classical cutting plane approach (Dantzig and Wolfe, 1960), which deletes the quadratic term in (11) and hence suffers from slow convergence and unbounded storage. The recently proposed augmented Lagrangian methods (see, e.g., Golshtein, 1987; Spingarn, 1985) are simpler and have strong convergence properties, but they involve additional nonlinearities in the objective of (3), and controlling their accuracy tolerances is not easy. Moreover, our method (just like the cutting plane ones) produces aggregate primal solutions (42) that have important "approximate discreteness" properties in the context of Lagrangian relaxation of discrete problems (see, e.g., Bertsekas, 1982; Lasdon, 1970).

In the context of nondifferentiable optimization methods that use approximate linearizations, we hope that our method is an improvement on those given in (Demyanov and Vasiliev, 1985; Kiwiel, 1985b; Rzhevskii and Kuncevich, 1985; Shor, 1985), since it generalizes one of the currently most efficient methods with exact linearizations (Kiwiel, 1989b).

The paper is organized as follows. The method is derived in Section 2. Its global convergence is studied in Section 3. Implications for decomposition are examined in Section 4. Some modifications are described in Section 5. Our preliminary numerical experience is reported in Section 6, and Section 7 concludes.

We use the following notation. We denote by <·,·> and |·|, respectively, the usual inner product and norm in ℝ^N. Both superscripts and subscripts are used to denote different vectors. For ε ≥ 0, the ε-subdifferential of f at x is defined by ∂_ε f(x) = { p ∈ ℝ^N : f(y) ≥ f(x) + <p, y − x> − ε for all y ∈ ℝ^N }; ∂f := ∂_0 f denotes the ordinary subdifferential. The mapping ∂f(·) is locally bounded and f is locally Lipschitz continuous on S (Kiwiel, 1985a).
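As a small illustration of (10)-(14), a sketch of the model value, the tolerance update and the descent test; the bundle layout (pairs of linearization errors and subgradients, cf. the quantities α_j^k and g^j defined before (20) below) and the default values m_L = 0.1, m_R = 0.2 are taken from Section 6.

```python
def model_value(d, bundle):
    """f-hat^k(x^k + d) - f_x^k for the model (10); the bundle stores
    pairs (alpha_j, g_j) with alpha_j = f_x^k - f(x^k; y^j, eps^j) >= 0."""
    return max(-alpha_j + g_j @ d for alpha_j, g_j in bundle)

def next_tolerance(v, m_L=0.1, m_R=0.2):
    """Accuracy update (14): eps^{k+1} = -(m_R - m_L) v^k >= 0."""
    return -(m_R - m_L) * v

def is_serious(f_new, f_x, v, m_L=0.1):
    """Descent test (12), with f_new the over-estimate f^{k+1} of (13)."""
    return f_new < f_x + m_L * v
```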

THE METHOD

We shall now describe our method for problem (7). We assume, for generality, that S = { x : h_i(x) ≤ 0, i ∈ I }, where the h_i are affine functions and I is finite, that f is convex and finite on an open neighborhood of S, and that at each y ∈ S we can find an ε-linearization of f for any ε > 0.

We start by specializing the results of Kiwiel (1989b) to subproblem (11). Let g^j = g_f(y^j, ε^j), f_j^k = f(x^k; y^j, ε^j) and α_j^k = f_x^k − f_j^k, so that α_j^k ≥ 0 by (8b) and f_x^k ≥ f(x^k), j ∈ J^k. Then (11) can be solved by finding (d^k, v^k) to

minimize  v + u^k|d|²/2,

subject to  −α_j^k + <g^j, d> ≤ v,  j ∈ J^k,    (20)

            h_i(x^k) + <∇h_i, d> ≤ 0,  i ∈ I,

since y^{k+1} = x^k + d^k. Denote the Lagrange multipliers of (20) by λ_j^k, j ∈ J^k, and ν_i^k, i ∈ I.
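A minimal computational sketch of (20) for the unconstrained case S = ℝ^N (I = ∅): the multipliers λ minimize |Σ_j λ_j g^j|²/(2u^k) + Σ_j λ_j α_j^k over the unit simplex, from which d^k and v^k are recovered. This is a generic illustration using scipy's SLSQP, not the dual QP method of Kiwiel (1989a) referenced below; the function name and setup are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def direction_finding(G, alpha, u):
    """Solve (20) with S = R^N through its dual over the unit simplex.
    Rows of G are the subgradients g^j, alpha holds the linearization
    errors alpha_j^k >= 0, and u is the proximal weight u^k.  Returns
    d^k, v^k, the multipliers lambda, and the aggregates p^k, alpha_p^k
    of (21)-(26)."""
    m = len(alpha)
    q = lambda lam: (G.T @ lam) @ (G.T @ lam) / (2.0 * u) + lam @ alpha
    cons = ({'type': 'eq', 'fun': lambda lam: lam.sum() - 1.0},)
    res = minimize(q, np.full(m, 1.0 / m), method='SLSQP',
                   bounds=[(0.0, 1.0)] * m, constraints=cons)
    lam = res.x
    p = G.T @ lam                    # aggregate subgradient, p^k = p_f^k here
    alpha_p = lam @ alpha            # aggregate error, cf. (23)
    d = -p / u                       # from (21) with p_h^k = 0
    v = -(p @ p / u + alpha_p)       # predicted descent, cf. (22)
    return d, v, lam, p, alpha_p
```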

Let δ_S denote the indicator function of S (δ_S(x) = 0 if x ∈ S, = +∞ otherwise) and φ^k = f̂^k + u^k|· − x^k|²/2 + δ_S. As in (Kiwiel, 1989b), from (10) and the optimality condition 0 ∈ ∂φ^k(y^{k+1}) for (11) we deduce the existence of p_f^k ∈ ∂f̂^k(y^{k+1}) and p_h^k ∈ ∂δ_S(y^{k+1}) such that the aggregate linearizations f̃^k(x) = f̂^k(y^{k+1}) + <p_f^k, x − y^{k+1}> and δ̃^k(x) = <p_h^k, x − y^{k+1}> minorize f and δ_S, respectively, and that

p_f^k + p_h^k + u^k d^k = 0,    (21)

−v^k = u^k|d^k|² + α̃_p^k = |p^k|²/u^k + α̃_p^k,    (22)

α̃_p^k = α̃_f^k + α̃_h^k ≥ 0,    (23)

f(x) ≥ f(x^k) + <p^k, x − x^k> − α̃_p^k  for all x ∈ S,    (24)

f(x) ≥ f(x^k) − |u^k v^k|^{1/2}|x − x^k| + v^k  for all x ∈ S,    (25)

where α̃_f^k := f_x^k − f̃^k(x^k) ≥ 0, α̃_h^k := −δ̃^k(x^k) ≥ 0 and

p^k = p_f^k + p_h^k = −u^k d^k.    (26)

(Hint: (24) can be derived by adding the inequalities f̃^k ≤ f and δ̃^k ≤ δ_S and using (21), (23) and (26); (22) follows from v^k = f̂^k(y^{k+1}) − f_x^k = f̃^k(x^k) − f_x^k + <p_f^k, d^k> = −α̃_f^k − α̃_h^k + <p^k, d^k>, whereas (25) results from (22)-(24) and the Cauchy-Schwarz inequality.) Relations (21)-(26) also follow from the Karush-Kuhn-Tucker conditions for (20), with p_f^k = Σ_j λ_j^k g^j; cf. (Kiwiel, 1987).
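Continuing the sketch above, a small usage example: at the optimum of (20) the epigraph variable equals the best active-cut value, so v^k from the dual formula can be checked against the primal maximum, up to solver tolerance.

```python
rng = np.random.default_rng(0)
G = rng.normal(size=(5, 3))             # five subgradients g^j in R^3
alpha = np.abs(rng.normal(size=5))      # linearization errors alpha_j^k >= 0
d, v, lam, p, alpha_p = direction_finding(G, alpha, u=1.0)
v_primal = max(-alpha[j] + G[j] @ d for j in range(5))   # cf. (20)
assert abs(v - v_primal) < 1e-4         # complementarity, up to solver tol.
```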

The choice of the weights u^k is crucial in practice, since "too large" u^k produce very short steps, whereas "too small" u^k may result in many null steps (Kiwiel, 1989b). Assuming temporarily that f behaves like a quadratic between x^k and y^{k+1}, with derivative v^k = −u^k|d^k|² at x^k along d^k, we have f(y^{k+1}) = f(x^k + d^k) = f(x^k) + v^k + ū|d^k|²/2, where ū equals 2u^k(1 − [f(y^{k+1}) − f(x^k)]/v^k). To prevent drastic changes of u^k, we shall use

u_int^{k+1} = 2u^k(1 − [f^{k+1} − f_x^k]/v^k),

u^{k+1} = min{ max{ u_int^{k+1}, u^k/10, u_min }, 10u^k },    (27)

where 0 < u_min ≤ u^1 is fixed.
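In code, the safeguarded update (27) is a one-liner; a sketch:

```python
def update_weight(u, f_new, f_x, v, u_min):
    """Safeguarded quadratic-interpolation update (27): u_int fits a
    quadratic to the observed vs. predicted descent, and the min/max
    clamp keeps u^{k+1} within [max(u/10, u_min), 10u]."""
    u_int = 2.0 * u * (1.0 - (f_new - f_x) / v)
    return min(max(u_int, u / 10.0, u_min), 10.0 * u)
```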

We may now state the method in detail.

Algorithm 1

Step 0 (Initialization). Select an initial point x^1 ∈ S, a final optimality tolerance ε_s ≥ 0, improvement parameters 0 < m_L < m_R < 1, an initial weight u^1 > 0, a lower bound for the weights u_min ∈ (0, u^1], the maximum number of stored subgradients M ≥ N+2 and an initial accuracy tolerance ε^1 > 0. Set y^1 = x^1, J^1 = {1}, g^1 = g_f(y^1, ε^1), f_1^1 = f(y^1; y^1, ε^1) and f_x^1 = f_1^1 + ε^1. Set the counters k = 1, l = 0 and k(0) = 1.

Step 1 (Direction finding). Find the solution (d^k, v^k) of (20) and its multipliers λ^k such that the set Ĵ^k = { j ∈ J^k : λ_j^k ≠ 0 } satisfies |Ĵ^k| ≤ M−1. Compute |p^k| and α̃_p^k from (22). Set ε^{k+1} by (14).

Step 2 (Stopping criterion). If v^k ≥ −ε_s, terminate; otherwise, continue.

Step 3 (Descent test). Set y^{k+1} = x^k + d^k and f^{k+1} by (13). If (12) holds, set t_L^k = 1, f_x^{k+1} = f^{k+1}, k(l+1) = k+1 and increase the counter l of serious steps by 1; otherwise, set t_L^k = 0 and f_x^{k+1} = f_x^k (null step). Set x^{k+1} = x^k + t_L^k d^k.

Step 4 (Linearization updating). Select J̃^k such that Ĵ^k ⊂ J̃^k ⊂ J^k and |J̃^k| ≤ M−1, and set J^{k+1} = J̃^k ∪ {k+1} with g^{k+1} = g_f(y^{k+1}, ε^{k+1}), f_{k+1}^{k+1} = f(x^{k+1}; y^{k+1}, ε^{k+1}) and f_j^{k+1} = f_j^k + <g^j, x^{k+1} − x^k> for j ∈ J̃^k.

Step 5 (Weight updating). If x^{k+1} ≠ x^k, select u^{k+1} ∈ [u_min, u^k] (e.g. by (27)); otherwise, either set u^{k+1} = u^k, or choose u^{k+1} ∈ [u^k, 10u^k] (e.g. by (27)) if

α_{k+1}^{k+1} > max{ |p^k| + α̃_p^k, −10v^k }.    (28)

Step 6. Increase k by 1 and go to Step 1.

A few comments on the method are in order. Step 1 may use the dual QP method of Kiwiel (1989a), which can solve efficiently sequences of related subproblems (20). Step 2 is justified by the optimality estimate (25). At Step 4 one may let J^{k+1} = Ĵ^k ∪ {k+1} and then, if necessary, drop from J^{k+1} an index j ∈ J^k \ Ĵ^k with the largest error α_j^{k+1}. Step 5 may use the weight updating procedure of Kiwiel (1989b), which has additional criteria for changing u^k. With a suitable choice of u^1 and u_min (e.g. u^1 = |g^1| and u_min = 10^{-20} u^1), the method can (at least in theory) be made invariant to the objective and constraint scaling; see (Kiwiel, 1989b). A minimal sketch of the whole iteration appears below.
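Assembling the snippets above into a complete iteration, under simplifying assumptions flagged in the comments (S = ℝ^N, no subgradient selection or aggregation, weight updates only after serious steps); `oracle(y, eps)` stands for any routine returning f(y; y, eps) and g_f(y, eps), e.g. an adapter around evaluate_dual above.

```python
def proximal_bundle(x1, oracle, u1=1.0, eps1=0.01, eps_s=1e-6,
                    m_L=0.1, m_R=0.2, u_min=1e-10, max_iter=200):
    """Simplified sketch of Algorithm 1 for S = R^N."""
    x = np.asarray(x1, dtype=float)
    f_y, g = oracle(x, eps1)
    f_x = f_y + eps1                          # over-estimate of f(x^1), Step 0
    ys, fs, gs = [x.copy()], [f_y], [g]       # bundle of linearizations at y^j
    u = u1
    for _ in range(max_iter):
        # alpha_j^k = f_x^k - f(x^k; y^j, eps^j), cf. the text before (20).
        alpha = np.array([f_x - (fj + gj @ (x - yj))
                          for yj, fj, gj in zip(ys, fs, gs)])
        d, v, lam, p, alpha_p = direction_finding(np.array(gs), alpha, u)
        if v >= -eps_s * (1.0 + abs(f_x)):    # Step 2, relative form of Sec. 6
            break
        eps = -(m_R - m_L) * v                # tolerance update (14)
        y = x + d
        f_y, g = oracle(y, eps)
        f_new = f_y + eps                     # over-estimate (13)
        if f_new < f_x + m_L * v:             # descent test (12): serious step
            u = update_weight(u, f_new, f_x, v, u_min)
            x, f_x = y, f_new                 # otherwise a null step: x stays
        ys.append(y); fs.append(f_y); gs.append(g)   # Step 4, no selection
    return x, f_x
```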

CONVERGENCE

In this section we show that {x^k} → x̄ ∈ X := Argmin{ f(x) : x ∈ S } if X ≠ ∅. We assume, of course, that the final tolerance ε_s = 0; then (25) implies that upon termination x^k ∈ X. Hence we may suppose that the algorithm does not terminate. For lack of space, we shall only indicate how to modify the corresponding results of Kiwiel (1989b).

By the rules of Step 3,

x^k = x^{k(l)}  if k(l) ≤ k < k(l+1),    (29)

where we may let k(l+1) = ∞ if the number l of serious steps stays bounded.

Consider the following condition:

f(x^k) ≥ f(x̃)  for some fixed x̃ ∈ S and all k,    (30)

which holds if X ≠ ∅ or x̃ is a cluster point of {x^k}, since f_x^{k+1} ≤ f_x^k and f(x^k) ≤ f_x^k for all k.

Lemma 1. If (30) holds, then Σ_{k∈K} |v^k| < ∞ and v^k →_K 0 if K = { k : t_L^k = 1 } is infinite. Moreover,

x^k → x̄  for some x̄ ∈ S.    (31)

Proof. Use (12) and (24) as in (Kiwiel, 1989b). □

Let f̂^k = max{ f(·; y^j, ε^j) : j ∈ J^k }, φ^k = f̂^k + u^k|· − x^k|²/2 + δ_S, and

η^k = min φ^k = f̂^k(y^{k+1}) + u^k|y^{k+1} − x^k|²/2.    (32)

Note that η^k ≤ f̂^k(x^k) ≤ f(x^k). As in (Kiwiel, 1989b), we have y^{k+1} = argmin φ̃^k, f̂^k(y^{k+1}) = f̃^k(y^{k+1}) and η^k = min φ̃^k for the aggregate function φ̃^k := f̃^k + δ̃^k + u^k|· − x^k|²/2, and

φ̃^k(x) ≥ η^k + u^k|x − y^{k+1}|²/2  for all x ∈ ℝ^N.    (33)

Setting x = x̃ with φ̃^k(x̃) ≤ f(x̃), we get

|x̃ − y^{k+1}|² ≤ 2[f(x̃) − η^k]/u^k    (34)

and

η^k + u^k|y^{k+2} − y^{k+1}|²/2 ≤ η^{k+1} ≤ f(x^k)  if x^{k+1} = x^k,    (35)

from (33). Letting w^k = f_x^k − η^k, we get from (15) and (22)

w^k = u^k|d^k|²/2 + α̃_p^k = |p^k|²/2u^k + α̃_p^k,    (36)

v^k ≤ −w^k  and  w^k ≥ |v^k|/2 ≥ 0.    (37)

Lemma 2. (i) If k = k(l), then

w^{k(l)} ≤ |g_f(y^{k(l)}, ε^{k(l)})|²/2u^{k(l)} + ε^{k(l)}.

(ii) If (30) holds and {ε^k} is bounded, then {y^k} is bounded. (iii) If (30) holds and {ε^k} is bounded, then there exists C < ∞ such that

w^{k+1} ≤ C²/2u^k + ε^{k+1} + f_x^{k(l)} − f(x^{k(l)})  if k(l) ≤ k < k(l+1) and t_L^k = 0.    (38)

Proof. (i) If k = k(l), then k ∈ J^k and

η^k ≥ min{ f(x; y^k, ε^k) + u^k|x − x^k|²/2 : x ∈ S } ≥ min{ f_k^k + <g^k, x − x^k> + u^k|x − x^k|²/2 : x ∈ ℝ^N } = f_k^k − |g^k|²/2u^k,

so w^k ≤ f_x^k − f_k^k + |g^k|²/2u^k = ε^k + |g^k|²/2u^k, and assertion (i) follows. (ii) Use Lemma 1, (14) with m_L < m_R, and (34). (iii) Invoke (22) with u^k ≥ u_min and (34)-(35) to bound |d^k| and the model values between consecutive null steps. □

Lemma 3. If x^k = x^{k(l)} = x̃ for some fixed l and all k ≥ k(l), then w^k ↓ 0 and v^k → 0.

Proof. By the rules of Step 5 and Lemma 2, u^{k+1} ≥ u^k and w^{k+1} ≤ w^k for all k ≥ k(l). By (35), η^k is nondecreasing with η^k ≤ f(x̃). Hence (34) shows that {y^k} is bounded, while (35) yields |y^{k+2} − y^{k+1}| → 0, and Lemma 2(i), (14) and (35)-(37) imply that {g^k} is bounded. Let v̄ = limsup v^k and choose K' ⊂ {1,2,...} such that v^k →_{K'} v̄. Since k+1 ∈ J^{k+1},

f̂^{k+1}(y^{k+2}) ≥ f(y^{k+2}; y^{k+1}, ε^{k+1}) ≥ f(y^{k+1}; y^{k+1}, ε^{k+1}) − |g^{k+1}| |y^{k+2} − y^{k+1}|,

while each null step satisfies f(y^{k+1}; y^{k+1}, ε^{k+1}) − f_x^k ≥ m_R v^k by (16). Passing to the limit over K' as in (Kiwiel, 1989b), these relations yield (1 − m_R)|v̄| ≤ 0 with m_R ∈ (0,1), so v̄ = 0 and v^k → 0. Then w^k ↓ 0 by (37). □

Lemma 4. If (30) holds and the set K = { k : t_L^k = 1 } of serious steps is infinite, then v^k →_K 0 and x^k → x̄ ∈ X.

Proof. Combine Lemma 1 with (24)-(25); by (29) we may use x^k → x̄, the local boundedness of ∂f and the local Lipschitz continuity of f to complete the proof. □

Lemma 5. If (30) holds, then liminf w^k = 0; in particular, by (36) a subsequence of max{ α̃_p^k, |p^k| } vanishes, even when {u^k} is unbounded (cf. (28)).

Proof. If the number of serious steps is finite, use Lemma 3 (together with Lemma 2(iii) when {u^k} is unbounded); otherwise use Lemma 4 and (37). □

We may now state our principal result.

Theorem 1. Either x^k → x̄ ∈ X, or X = ∅ and |x^k| → ∞. In both cases f_x^k ↓ inf{ f(x) : x ∈ S }, and if inf{ f(x) : x ∈ S } > −∞ then v^k → 0 on a subsequence.

Proof. If (30) holds, then the preceding results imply that x^k → x̄ ∈ X and f_x^k ↓ f(x̄), so the definition of inf{ f(x) : x ∈ S } yields the desired conclusion. If (30) fails for every x̃ ∈ S, then X = ∅, |x^k| → ∞ and f_x^k ↓ inf{ f(x) : x ∈ S }. □

Thus the method must terminate at Step 2 if inf{ f(x) : x ∈ S } > −∞ and ε_s > 0.

APPLICATION TO DECOMPOSITION

Suppose now that Algorithm 1, applied to problem (7) with S = ℝ^N_+ (so that the h_i(x) = −x_i, I = {1,...,N}), X ≠ ∅ and ε-linearizations given by (9), calculates aggregate solutions z^k by (17).

Lemma 6. With e = (1,...,1) ∈ ℝ^N we have

φ_0(z^k) ≥ f_x^k + v^k − <p^k, x^k>,    (39)

φ(z^k) ≥ p^k ≥ −|p^k| e,    (40)

f_x^k ≥ f(x^k) ≥ sup{ φ_0(z) : φ(z) ≥ 0, z ∈ Z }.    (41)

Proof. Since y^{k+1} ∈ S = ℝ^N_+, p_h^k ∈ ∂δ_S(y^{k+1}) implies p_h^k ≤ 0 and <p_h^k, y^{k+1}> = 0. Let z^j = z(y^j, ε^j), j ∈ Ĵ^k. By (8a) and (9)-(11), f̂^k(y^{k+1}) = Σ_j λ_j^k [ φ_0(z^j) + <y^{k+1}, φ(z^j)> ], so

Σ_j λ_j^k φ_0(z^j) = f̂^k(y^{k+1}) − <y^{k+1}, p_f^k> = f_x^k + v^k − <p^k, x^k> − <p^k, d^k>,

and (39) follows from the concavity of φ_0, (17), (22) and (26), since <p^k, d^k> = −|p^k|²/u^k ≤ 0. Similarly,

φ(z^k) ≥ Σ_j λ_j^k φ(z^j) = p_f^k = p^k − p_h^k ≥ p^k,

which with (22) implies (40). (41) follows from the weak duality theorem. □

Comparing (18)-(19) with (39)-(41), we can identify the tolerances ε_z^k = <p^k, x^k> − v^k and ε_F^k = |p^k|. If {u^k} is bounded, then x^k → x̄, v^k → 0 and (22) imply that ε_z^k → 0 and ε_F^k → 0, so that {z^k} is a generalized minimizing sequence for (1), and every accumulation point of {z^k} (which exists, e.g., for compact Z) solves (1). On the other hand, if {u^k} is unbounded, then the proof of Lemma 5 shows that a subsequence of max{ ε_z^k, ε_F^k } vanishes (cf. (28)). Of course, one may impose an upper bound on u^{k+1} at Step 5 to ensure the stronger convergence result. A small computational sketch of this primal recovery follows.
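A minimal sketch of (17) and of the tolerances just identified; since that identification is itself a reconstruction of garbled source text, treat the formulas below as assumptions rather than the paper's verbatim quantities.

```python
def aggregate_primal(lam, zs):
    """Aggregate solution (17): the convex combination, with the QP
    multipliers lambda_j^k, of the subproblem solutions z(y^j, eps^j)
    stacked as the rows of `zs`."""
    return np.asarray(lam) @ np.asarray(zs)

def primal_tolerances(v, p, x):
    """Assumed tolerances eps_z^k = <p^k, x^k> - v^k and eps_F^k = |p^k|,
    bounding the objective gap (18) and constraint violation (19) of z^k."""
    return np.asarray(p) @ np.asarray(x) - v, np.linalg.norm(p)
```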

MODIFICATIONS

By exploiting the additive structure (5) of f, one may increase the speed of convergence at the cost of more storage and work per iteration. To this end, use the approximations

f̂^k(x) = Σ_{i=1}^n f̂_i^k(x),   f̂_i^k(x) = max{ f_i(x; y^j, ε^j/n) : j ∈ J_i^k },

constructed from the ε-linearizations of f_i, defined via (8) with f replaced by f_i, where the sets J_i^k satisfying Σ_{i=1}^n |J_i^k| ≤ M with M ≥ N+2n are selected by finding at most N+n nonzero Lagrange multipliers λ_{ij}^k, j ∈ J_i^k, i = 1,...,n, of the corresponding extension of (20) (see Kiwiel, 1989b). In view of (3) and (4), we may use

f_i(x; y^j, ε^j/n) = φ_{0i}(z_i(y^j, ε^j)) + Σ_{l=1}^N x_l φ_{li}(z_i(y^j, ε^j))

and compute the separate aggregate primal solutions

z_i^k = Σ_{j∈J_i^k} λ_{ij}^k z_i(y^j, ε^j),  i = 1,...,n,    (42)

to form z^k = (z_1^k,...,z_n^k), for which the preceding convergence results hold (a sketch of this disaggregated model is given below). The fact that at most N+n multipliers λ_{ij}^k are nonzero has important implications in Lagrangian relaxation of discrete problems (Bertsekas, 1982; Kiwiel and Toczylowski, 1986).

To trade off storage and QP work per iteration for speed of convergence, one may replace subgradient selection with aggregation as in (Kiwiel, 1989b), so that z^k is generated recursively and M ≥ 2 suffices. The global convergence results are the same.
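A sketch of the disaggregated model above, with one bundle per subsystem; keeping the per-subsystem errors alpha_ij relative to component over-estimates is an assumed bookkeeping detail.

```python
def disaggregated_model(d, bundles):
    """Additive model f-hat^k = sum_i f-hat_i^k evaluated at x^k + d,
    relative to f_x^k: bundles[i] holds pairs (alpha_ij, g_ij) built
    from the eps^j/n-linearizations of f_i."""
    return sum(max(-a + g @ d for a, g in bundle) for bundle in bundles)
```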

NUMERICAL RESULTS

We shall now report on computational testing of the algorithm with a double-precision Fortran code on an IBM PC/XT microcomputer with relative accuracy of 2.2E-16. The parameters had the values m_L = 0.1, m_R = 0.2, ε^1 = 0.01 and ε_s = 1E-6 in the relative stopping criterion v^k ≥ −ε_s(1 + |f_x^k|). We used the collection of fourteen nondifferentiable problems of the form (7) from (Kiwiel, 1989b).

Table 1 in the Appendix contains results obtained for "maximally" inaccurate linearizations. This means that the problem subroutine evaluated an exact subgradient g^{k+1} ∈ ∂f(y^{k+1}) and set f(y^{k+1}; y^{k+1}, ε^{k+1}) = f(y^{k+1}) − ε^{k+1}, thus making g^{k+1} an ε^{k+1}-subgradient of f at y^{k+1}; the resulting linearizations satisfied (8) with equality in (8c). We also tested the opposite case, in which the subroutine returns exact function values (f(y^{k+1}; y^{k+1}, ε^{k+1}) = f(y^{k+1})) but they are perturbed by the algorithm according to (13). In this case the results were similar to those in Table 1, in which k denotes the final iteration number (and the total number of function and subgradient evaluations), f(x̄) denotes the optimal value, and the left-hand columns contain results for exact linearizations (ε^k = 0) from (Kiwiel, 1989b). The largest growth of computational effort occurs on polyhedral functions (tests 3-6). This is not surprising, since for such functions our method with exact linearizations finds minimizers in finitely many iterations (Kiwiel, 1989c), whereas ε-linearizations "soften" the edges of the epigraphs of these functions, thus preventing finite convergence.

We may add, with regret, that we have found no comparable results in the literature. Although our academic examples have no direct connection with decomposition, they suggest that our method is quite efficient, since some of them are considered difficult even for methods that use exact linearizations (Lemarechal, 1982; Shor, 1985). Our experience with decomposition of production scheduling problems (Kiwiel and Toczylowski, 1986) will be reported elsewhere.

CONCLUSIONS

We have presented an extension of the proximal bundle method for convex nondifferentiable optimization to the case where only approximate objective linearizations are available. Our limited computational experience suggests that the method is promising. We have also exhibited important implications of this algorithm for decomposition of convex separable programs.

Acknowledgment. This research was supported by the Polish Academy of Sciences under Project CPBP/02.15.

REFERENCES

Bertsekas, D.P. (1982). Constrained Optimization and Lagrange Multiplier Methods. Academic Press, New York.
Dantzig, G.B., and P. Wolfe (1960). Decomposition principle for linear programs. Operations Research, 8, 101-111.
Demyanov, V.F., and L.V. Vasiliev (1985). Nondifferentiable Optimization. Optimization Software Inc., New York.
Gabasov, R., F.M. Kirillova, O.I. Kostyukova and V.M. Raketskii (1987). Constructive Optimization Methods, Part 4: Convex Problems. Izdatelstvo "Universitetskoye", Minsk (Russian).
Golshtein, E.G. (1987). A general approach to decomposition of optimizing systems. Tekhnicheskaia Kibernetika, No. 1, 59-69 (Russian).
Kiwiel, K.C. (1985a). Methods of Descent for Nondifferentiable Optimization. Lecture Notes in Mathematics 1133. Springer, Berlin.
Kiwiel, K.C. (1985b). An algorithm for nonsmooth convex minimization with errors. Mathematics of Computation, 45, 173-180.
Kiwiel, K.C. (1987). A constraint linearization method for nondifferentiable convex minimization. Numerische Mathematik, 51, 395-414.
Kiwiel, K.C. (1989a). A dual method for solving certain positive semi-definite quadratic programming problems. SIAM Journal on Scientific and Statistical Computing, 10.
Kiwiel, K.C. (1989b). Proximity control in bundle methods for convex nondifferentiable minimization. Mathematical Programming (to appear).
Kiwiel, K.C. (1989c). Exact penalty functions in proximal bundle methods for constrained convex nondifferentiable minimization. Technical Report, Systems Research Institute, Warsaw.
Kiwiel, K.C., and E. Toczylowski (1986). Aggregate subgradients in Lagrangian relaxations of discrete optimization problems. Zeszyty Naukowe Politechniki Slaskiej, seria Automatyka z. 84, Wydawnictwo Naukowe Politechniki Slaskiej, Gliwice, 119-129 (Polish).
Lasdon, L.S. (1970). Optimization Theory for Large Systems. Macmillan, Toronto.
Lemarechal, C. (1982). Numerical experiments in nonsmooth optimization. In E.A. Nurminski (Ed.), Progress in Nondifferentiable Optimization. CP-82-S8, International Institute for Applied Systems Analysis, Laxenburg, Austria, pp. 61-84.
Rzhevskii, S.V., and A.V. Kuncevich (1985). Application of an ε-subgradient method to the solution of the dual and primal problems of mathematical programming. Kibernetika, No. 5, 51-54 (Russian).
Sen, S., and H.D. Sherali (1986). A class of convergent primal-dual subgradient algorithms for decomposable convex programs. Mathematical Programming, 35, 279-297.
Shor, N.Z. (1985). Minimization Methods for Non-differentiable Functions. Springer, Berlin.
Spingarn, J.E. (1985). Applications of the method of partial inverses to convex programming: decomposition. Mathematical Programming, 32, 199-223.

APPENDIX

TABLE 1  Results for Exact and Approximate Linearizations

                Exact linearizations     Optimal value    Approximate linearizations
Test    N       k      f(x^k)            f(x̄)             k      f(x^k)
 1      5       29     22.600162         22.60016         35     22.60016
 2      10      41     -0.8414074        -0.841408        45     -0.841407
 3      50      52     6.0E-13           0                89     5.9E-7
 4      48      180    -638565.00        -638565          435    -638564.51
 5      50      16     3.6E-7            0                50     1.7E-6
 6      30      7      3.5E-9            0                13     -2.3E-9
 7      4       23     -0.3681664        -0.3681664       29     -0.3681658
 8      4       14     0.7071074         0.7071068        25     0.7071069
 9      6       9      1.0142141         1.0142136        14     1.0142136
10      5       47     0.0147064         0.0147063        61     0.0147066
11      4       10     -32.348679        -32.348679       14     -32.348679
12      4       20     -43.999961        -44              20     -43.999971
13      6       15     23.886767         23.886767        19     23.886767
14      10      23     68.829581         68.82956         26     68.82956