    u^{k+1} = min{ max{ u^{k+1}_int, u^k/10, u_min }, 10 u^k },   u^{k+1}_int = 2 u^k ( 1 - [f^{k+1}_y - f^k_x] / v^k ),   (27)

where 0 < u_min. With a suitable choice of u^1 and u_min (e.g. u^1 = |g^1| and u_min = 10^{-20} u^1), the method can (at least in theory) be made invariant to the objective and constraint scaling; see (Kiwiel, 1989b).
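The safeguarded update (27) is easy to transcribe directly. The following sketch uses our own names (u, u_min, f_new, f_x, v — not the paper's notation) and assumes the predicted descent v^k is negative:

```python
def update_weight(u, u_min, f_new, f_x, v):
    # (27): u_int = 2u(1 - (f_new - f_x)/v), then clip to [max(u/10, u_min), 10u].
    # v < 0 is the predicted descent; f_new - f_x is the actual change in f.
    u_int = 2.0 * u * (1.0 - (f_new - f_x) / v)
    return min(max(u_int, u / 10.0, u_min), 10.0 * u)
```

When the actual decrease matches the prediction (f_new - f_x = v), u_int = 0 and the safeguard returns u/10, so the weight shrinks and longer steps become possible; when the model badly over-predicts the decrease, the weight grows, but by at most the factor 10.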
We may now state the method in detail.

Algorithm 1

Step 0 (Initialization). Select an initial point x^1 ∈ S, a final optimality tolerance ε_s ≥ 0, improvement parameters 0 < m_L < m_R < 1 and weights u^1 ≥ u_min > 0. Set y^1 = x^1, f^1_1 = f(y^1; y^1, ε^1), g^1 = g_f(y^1, ε^1) and J^1 = {1}. Set the counters k = 1, l = 0 and k(0) = 1.

Step 1 (Direction finding). Find the solution (d^k, v^k) of (20) and its multipliers λ^k_j such that the set Ĵ^k = {j ∈ J^k : λ^k_j ≠ 0} satisfies |Ĵ^k| ≤ M - 1. Compute |p^k| and α̃^k_p from (22). Set ε̂^k by (14).

Step 2 (Stopping criterion). If v^k ≥ -ε_s, terminate; otherwise, continue.

Step 3 (Descent test). Set y^{k+1} = x^k + d^k and f^{k+1}_y by (13). If (12) holds, set t^k_L = 1, f^{k+1}_x = f^{k+1}_y, k(l+1) = k + 1 and increase the counter of serious steps l by 1; otherwise, set t^k_L = 0 and f^{k+1}_x = f^k_x (null step). Set x^{k+1} = x^k + t^k_L d^k.

Step 4 (Linearization updating). Select a set J̄^k such that Ĵ^k ⊂ J̄^k ⊂ J^k and |J̄^k| ≤ M - 1, and set J^{k+1} = J̄^k ∪ {k+1}. Set f^{k+1}_j = f^k_j + ⟨g^j, x^{k+1} - x^k⟩ for j ∈ J̄^k, g^{k+1} = g_f(y^{k+1}, ε^{k+1}) and f^{k+1}_{k+1} = f(x^{k+1}; y^{k+1}, ε^{k+1}).

Step 5 (Weight updating). If x^{k+1} ≠ x^k, select u^{k+1} ∈ [u_min, u^k] (e.g. by (27)); otherwise, either set u^{k+1} = u^k, or choose u^{k+1} ∈ [u^k, 10u^k] (e.g. by (27)) if

    α^{k+1}_{k+1} > max{ |p^k| + α̃^k_p, -10 v^k }.   (28)

Step 6. Increase k by 1 and go to Step 1.

A few comments on the method are in order. Step 1 may use the dual QP method of Kiwiel (1989a), which can solve efficiently sequences of related subproblems (20). Step 2 is justified by the optimality estimate (25). At Step 4 one may let J^{k+1} = J^k ∪ {k+1} and then, if necessary, drop from J^{k+1} an index j ∈ J^k \ Ĵ^k with the largest error α^{k+1}_j. Step 5 may use the weight updating procedure of Kiwiel (1989b), which has additional criteria for changing u^k.

CONVERGENCE

In this section we show that {x^k} → x̄ ∈ X = Argmin{f | S} if X ≠ ∅. We assume, of course, that the tolerance ε_s = 0. Then (25) implies that upon termination x^k ∈ X. Hence we may suppose that the algorithm does not terminate. For lack of space, we shall only indicate how to modify the corresponding results of Kiwiel (1989b).

By the rules of Step 3,

    x^k = x^{k(l)}   if k(l) ≤ k < k(l+1),   (29)

where we may let k(l+1) = ∞ if the number l of serious steps stays bounded.

Consider the following condition:

    f(x^k) ≥ f(x̃) for some fixed x̃ ∈ S and all k,   (30)

which holds if X ≠ ∅ or x̃ is a cluster point of {x^k}, since f^{k+1}_x ≤ f^k_x and f(x^k) ≤ f^k_x for all k.

Lemma 1. If (30) holds then

    Σ_k t^k_L |v^k| < ∞.   (31)

Moreover, x^k → x̄ for some x̄ ∈ S, and v^k → 0 if K = {k : t^k_L = 1} is infinite.

Proof. Use (12) and (24) as in (Kiwiel, 1989b). □

Let f̂^k = max{ f̄(·; y^j, ε^j) : j ∈ Ĵ^k } and φ^k = f̂^k + u^k|· - x^k|²/2 + δ_S, where δ_S is the indicator function of S, and set

    η^k = min φ^k = f̂^k(y^{k+1}) + u^k |y^{k+1} - x^k|²/2.   (32)

Note that η^k ≤ φ^k(x^k) ≤ f(x^k). As in (Kiwiel, 1989b), we have y^{k+1} = argmin φ^k, f̂^k(y^{k+1}) = f^k(y^{k+1}) and

    φ^k(x) ≥ η^k + u^k |x - y^{k+1}|²/2   ∀ x ∈ ℝ^N.   (33)

In particular, at a null step (33) yields

    η^k + u^k |y^{k+2} - y^{k+1}|²/2 ≤ η^{k+1} ≤ f(x^k)   if x^{k+1} = x^k.   (34)

Letting w^k = f(x^k) - η^k, we get from (15) and (22)

    w^k = u^k |d^k|²/2 + α̃^k_p = |p^k|²/2u^k + α̃^k_p,   (35)

so that

    v^k ≤ -w^k ≤ v^k/2 ≤ 0.   (36)

Lemma 2. (i) If k = k(l) then

    w^k ≤ ε^k + |g^k|²/2u^k.   (37)

(ii) If (30) holds and {ε^k} is bounded then {η^k} is bounded. (iii) If (30) holds and {ε^k} is bounded then there exists C < ∞ such that

    w^{k+1} ≤ C/u^{k+1} + f^{k(l)}_x - f(x^{k(l)}) + α^{k+1}_{k+1}   if k(l) ≤ k < k(l+1) and t^k_L = 0.   (38)

Proof. (i) If k = k(l) then k ∈ J^k and

    η^k ≥ min{ f̄(x; y^k, ε^k) + u^k|x - x^k|²/2 : x ∈ S } ≥ min{ f^k_k + ⟨g^k, x - x^k⟩ + u^k|x - x^k|²/2 : x ∈ ℝ^N } = f^k_k - |g^k|²/2u^k,

so w^k ≤ f(x^k) - f^k_k + |g^k|²/2u^k = ε^k + |g^k|²/2u^k, and assertion (i) follows from (35). (ii), (iii) Use Lemma 1, (14) with m̃ < m_R and (22) with u^k ≥ u_min, as in (Kiwiel, 1989b). □
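To make the structure of Algorithm 1 concrete, here is a deliberately simplified one-dimensional sketch: exact linearizations (ε^k = 0), a fixed weight u, no bundle compression, and a ternary search standing in for the QP solver of Step 1. All names are ours, and the stopping and descent tests are simplified accordingly:

```python
def prox_bundle_1d(f, subgrad, x0, u=1.0, m_L=0.1, tol=1e-9, max_iter=100):
    x = x0
    bundle = [(x0, f(x0), subgrad(x0))]          # triples (y_j, f(y_j), g_j)

    def model(t):                                # cutting-plane model of f
        return max(fy + g * (t - y) for y, fy, g in bundle)

    for _ in range(max_iter):
        # Step 1 (direction finding): minimize model(t) + u/2 (t - x)^2.
        # The minimizer lies within max|g_j|/u + 1 of x, and the objective is
        # strongly convex, so ternary search replaces the QP solver.
        R = max(abs(g) for _, _, g in bundle) / u + 1.0
        lo, hi = x - R, x + R

        def obj(t):
            return model(t) + u * (t - x) ** 2 / 2

        for _ in range(200):
            m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
            if obj(m1) <= obj(m2):
                hi = m2
            else:
                lo = m1
        y_new = (lo + hi) / 2
        v = model(y_new) - f(x)                  # predicted descent (<= 0)
        if v >= -tol:                            # Step 2: stopping criterion
            break
        f_new = f(y_new)
        bundle.append((y_new, f_new, subgrad(y_new)))   # Step 4: add a cut
        if f_new <= f(x) + m_L * v:              # Step 3: descent test
            x = y_new                            # serious step; else null step
    return x

x_star = prox_bundle_1d(lambda t: abs(t - 1.0),
                        lambda t: 1.0 if t >= 1.0 else -1.0,
                        x0=5.0)
```

On f(x) = |x - 1| from x^1 = 5 this performs a short run of serious steps and a null step before the predicted descent vanishes and Step 2 fires near the minimizer.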
Lemma 3. If x^k = x^{k(l)} = x̄ for some fixed l and all k ≥ k(l), then w^k ↓ 0 and v^k → 0.

Proof. By the rules of Step 5 and Lemma 2, u^{k+1} ≥ u^k and w^{k+1} ≤ w^k for all k ≥ k(l), so w^k ↓ w̄ for some w̄ ≥ 0. Hence (34) shows that {y^k} is bounded and yields |y^{k+2} - y^{k+1}| → 0. Lemma 2(i), (14) and (35)-(37) imply that {g^k} is bounded. Let v̄ = limsup v^k and choose K' ⊂ {1,2,...} such that v^k → v̄ as K' ∋ k → ∞. Since k+1 ∈ J^{k+1},

    f̂^{k+1}(y^{k+2}) ≥ f^{k+1}_{k+1} + ⟨g^{k+1}, y^{k+2} - y^{k+1}⟩ ≥ f^{k+1}_{k+1} - |g^{k+1}| |y^{k+2} - y^{k+1}|,

so limsup{ f^{k+1}_{k+1} - f̂^k(y^{k+1}) : k ∈ K' } ≤ 0, which together with the null step inequality f^{k+1}_{k+1} - f̂^k(y^{k+1}) ≥ (1 - m_R)|v^k|, m_R ∈ (0,1), implies v̄ = 0. Then w^k ↓ 0 by (36). □

The remaining results of Kiwiel (1989b) (Lemmas 4 and 5, covering the cases of finitely and infinitely many serious steps) are modified similarly; e.g. when the set K of Lemma 1 is infinite we may use (29) with x^k → x̄, the local boundedness of ∂f and the local Lipschitz continuity of f to complete the proofs.

We may now state our principal result.

Theorem 1. Either x^k → x̄ ∈ X, or X = ∅ and |x^k| → ∞. In both cases f^k_x ↓ inf{ f(x) : x ∈ S }. If inf{ f(x) : x ∈ S } > -∞ then v^k → 0.

Proof. If (30) holds, then the preceding results imply that x^k → x̄ ∈ X and f^k_x ↓ f(x̄) = f(x̃), so x̃ ∈ X, and the definition of inf{ f(x) : x ∈ S } yields the desired conclusion. □

Thus the method must terminate if inf{ f(x) : x ∈ S } > -∞ and ε_s > 0.

MODIFICATIONS

To trade off storage and QP work per iteration for speed of convergence, one may replace subgradient selection with aggregation as in (Kiwiel, 1989b) (so that z^k is generated recursively and M ≥ 2 suffices). The global convergence results are the same.

By exploiting the additive structure (5) of f one may increase the speed of convergence at the cost of more storage and work per iteration. To this end, use the approximations

    f̂^k(x) = Σ_{i=1}^n f̂^k_i(x),   f̂^k_i(x) = max{ f̄_i(x; y^j, ε^j/n) : j ∈ J^k_i },   i = 1, ..., n,

constructed from the ε-linearizations of f_i defined via (8) with "f" replaced by "f_i", where the sets J^k_i satisfying Σ_{i=1}^n |J^k_i| ≤ M with M ≥ N + 2n are selected by finding the at most N + n nonzero Lagrange multipliers λ^k_{ij}, j ∈ J^k_i, i = 1, ..., n, of the corresponding extension of (20) (see Kiwiel, 1989b).

APPLICATION TO DECOMPOSITION

Suppose now that Algorithm 1, applied to problem (7) with S = ℝ^N_+, X ≠ ∅ and ε-linearizations given by (9), calculates aggregate solutions z̄^k by (17).

Lemma 6. With z̄^k = Σ_{j∈Ĵ^k} λ^k_j z^j we have

    f̂^k(y^{k+1}) = Σ_{j∈Ĵ^k} λ^k_j φ_0(z^j) + ⟨y^{k+1}, Σ_{j∈Ĵ^k} λ^k_j φ(z^j)⟩,   (39)

    φ(z̄^k) ≤ Σ_{j∈Ĵ^k} λ^k_j φ(z^j) ≤ |p^k|,   (40)

    φ_0(z̄^k) ≥ f(x^k) - (α̃^k - ⟨p^k, x^k⟩)   and   f(x^k) ≥ sup{ φ_0(z) : φ(z) ≤ 0, z ∈ Z }.   (41)

Proof. Since y^{k+1} ∈ S = ℝ^N_+, p^k_ν ∈ ∂δ_S(y^{k+1}) implies p^k_ν ≤ 0 and ⟨p^k_ν, y^{k+1}⟩ = 0. Let z^j = z(y^j, ε^j), j ∈ Ĵ^k. By (8a) and (9)-(11), the assertions follow as in (Kiwiel, 1989b). □

Comparing (18)-(19) with (39)-(41), we can identify the tolerances ε^k_σ = α̃^k - ⟨p^k, x^k⟩ and ε^k_F = |p^k|. If {u^k} is bounded, then x^k → x̄, v^k → 0 and (22) imply that ε^k_σ → 0 and ε^k_F → 0, so that {z̄^k} is a generalized minimizing sequence for (1), and every accumulation point of {z̄^k} (which exists, e.g. for compact Z) solves (1). On the other hand, if {u^k} is unbounded, then the proof of Lemma 5 shows that a subsequence of max{ε^k_σ, ε^k_F} vanishes (cf. (28)). Of course, one may impose an upper bound on u^{k+1} at Step 5 to ensure the stronger convergence result.

In view of (3) and (4), we may use f̄(x; y^j, ε^j) = φ_0(z(y^j, ε^j)) + Σ_{i=1}^N x_i φ_i(z(y^j, ε^j)) and compute separate aggregate primal solutions

    z̄^k_i = Σ_{j∈Ĵ^k_i} λ^k_{ij} z_i(y^j, ε^j),   i = 1, ..., n,   (42)

to form z̄^k = (z̄^k_1, ..., z̄^k_n), for which the preceding convergence results hold. The fact that at most N + n of the multipliers λ^k_{ij} are nonzero has important implications in Lagrangian relaxation of discrete problems (Bertsekas, 1982; Kiwiel and Toczylowski, 1986).
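The recovery of primal estimates in (42) is just a convex combination of the stored subproblem solutions, weighted by the QP multipliers. A minimal sketch (names are hypothetical; the multipliers are assumed nonnegative and summing to one):

```python
def aggregate_primal(lambdas, z_points):
    # z_bar = sum_j lambda_j * z^j, with lambdas from the direction-finding QP.
    assert all(l >= 0.0 for l in lambdas) and abs(sum(lambdas) - 1.0) < 1e-12
    dim = len(z_points[0])
    return [sum(l * z[i] for l, z in zip(lambdas, z_points)) for i in range(dim)]

z_bar = aggregate_primal([0.25, 0.75], [[0.0, 0.0], [4.0, 8.0]])  # -> [3.0, 6.0]
```

Since at most N + n multipliers are nonzero, only a few stored points z^j actually enter each aggregate z̄^k_i.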
NUMERICAL RESULTS
We shall now report on computational testing of the algorithm with a double precision Fortran code on an IBM PC/XT microcomputer with relative accuracy
of 2.2·10^{-16} (= 2.2E-16). The parameters had the values m_L = 0.1, m_R = 0.2, ε^1 = 0.01, and ε_s = 1E-6 in the stopping criterion v^k ≥ -ε_s(1 + |f^k_x|). We used the collection of fourteen nondifferentiable problems of the form (7) from (Kiwiel, 1989b). Table 1 in the Appendix contains results obtained for "maximally" inaccurate linearizations. This means that the problem subroutine evaluated an exact subgradient g^{k+1} ∈ ∂f(y^{k+1}) and set f(y^{k+1}; y^{k+1}, ε^{k+1}) = f(y^{k+1}) - ε^{k+1}, thus making g^{k+1} an ε^{k+1}-subgradient of f at y^{k+1}; the resulting linearizations satisfied (8) with equality in (8c). We also tested the opposite case, in which the subroutine returns exact function values (f(y^{k+1}; y^{k+1}, ε^{k+1}) = f(y^{k+1})) but they are perturbed by the algorithm according to (13). In this case the results were similar to those in Table 1, in which k denotes the final iteration number (and the total number of function and subgradient evaluations), f(x̄) denotes the optimal value, and the left-hand columns contain results for exact linearizations (ε^k = 0) from (Kiwiel, 1989b). The largest growth of computational effort occurs on polyhedral functions (tests 3-6). This is not surprising, since for such functions our method with exact linearizations finds minimizers in finitely many iterations (Kiwiel, 1989c), whereas ε-linearizations "soften" the edges of the epigraphs of these functions, thus preventing finite convergence.

We may add, with regret, that we have found no comparable results in the literature. Although our academic examples have no direct connection with decomposition, they suggest that our method is quite efficient, since some of them are considered to be difficult even for methods that use exact linearizations (Lemarechal, 1982; Shor, 1985). Our experience with decomposition of production scheduling problems (Kiwiel and Toczylowski, 1986) will be reported elsewhere.

CONCLUSIONS

We have presented an extension of the proximal bundle method for convex nondifferentiable optimization to the case where only approximate objective linearizations are available. Our limited computational experience suggests that the method is promising. We have also exhibited important implications of this algorithm for decomposition of convex separable programs.

Acknowledgment. This research was supported by the Polish Academy of Sciences under Project CPBP/0215.

REFERENCES

Bertsekas, D.P. (1982). Constrained Optimization and Lagrange Multiplier Methods. Academic Press, New York.
Dantzig, G.B., and P. Wolfe (1960). Decomposition principle for linear programs. Operations Research, 8, 101-111.
Demyanov, V.F., and L.V. Vasiliev (1985). Nondifferentiable Optimization. Optimization Software Inc., New York.
Gabasov, R., F.M. Kirillova, O.I. Kostyukova and V.M. Raketskii (1987). Constructive Optimization Methods, Part 4: Convex Problems. Izdatelstvo "Universitetskoye", Minsk (Russian).
Golshtein, E.G. (1987). A general approach to decomposition of optimizing systems. Tekhnicheskaia Kibernetika, No. 1, 59-69 (Russian).
Kiwiel, K.C. (1985a). Methods of Descent for Nondifferentiable Optimization. Lecture Notes in Mathematics 1133. Springer, Berlin.
Kiwiel, K.C. (1985b). An algorithm for nonsmooth convex minimization with errors. Mathematics of Computation, 45, 173-180.
Kiwiel, K.C. (1987). A constraint linearization method for nondifferentiable convex minimization. Numerische Mathematik, 51, 395-414.
Kiwiel, K.C. (1989a). A dual method for solving certain positive semi-definite quadratic programming problems. SIAM Journal on Scientific and Statistical Computing, 10.
Kiwiel, K.C. (1989b). Proximity control in bundle methods for convex nondifferentiable minimization. Mathematical Programming (to appear).
Kiwiel, K.C. (1989c). Exact penalty functions in proximal bundle methods for constrained convex nondifferentiable minimization. Technical Report, Systems Research Institute, Warsaw.
Kiwiel, K.C., and E. Toczylowski (1986). Aggregate subgradients in Lagrangian relaxations of discrete optimization problems. Zeszyty Naukowe Politechniki Slaskiej, seria Automatyka, z. 84, Wydawnictwo Politechniki Slaskiej, Gliwice, 119-129 (Polish).
Lasdon, L.S. (1970). Optimization Theory for Large Systems. Macmillan, Toronto.
Lemarechal, C. (1982). Numerical experiments in nonsmooth optimization. In E.A. Nurminski (Ed.), Progress in Nondifferentiable Optimization. CP-82-S8, International Institute for Applied Systems Analysis, Laxenburg, Austria, pp. 61-84.
Rzhevskii, S.V., and A.V. Kuncevich (1985). Application of an ε-subgradient method to the solution of the dual and primal problems of mathematical programming. Kibernetika, No. 5, 51-54 (Russian).
Sen, S., and H.D. Sherali (1986). A class of convergent primal-dual subgradient algorithms for decomposable convex programs. Mathematical Programming, 35, 279-297.
Shor, N.Z. (1985). Minimization Methods for Non-Differentiable Functions. Springer, Berlin.
Spingarn, J.E. (1985). Applications of the method of partial inverses to convex programming: decomposition. Mathematical Programming, 32, 199-223.
APPENDIX

TABLE 1   Results for Exact and Approximate Linearizations

                Exact linearizations    Optimal value    Approximate linearizations
Test    N         k     f(x^k)             f(x̄)             k     f(x^k)
  1     5        29     22.600162        22.60016           35    22.60016
  2    10        41     -0.8414074       -0.841408          45    -0.841407
  3    50        52     6.0E-13          0                  89    5.9E-7
  4    48       180     -638565.00       -638565           435    -638564.51
  5    50        16     3.6E-7           0                  50    1.7E-6
  6    30         7     3.5E-9           0                  13    -2.3E-9
  7     4        23     -0.3681664       -0.3681664         29    -0.3681658
  8     4        14     0.7071074        0.7071068          25    0.7071069
  9     6         9     1.0142141        1.0142136          14    1.0142136
 10     5        47     0.0147064        0.0147063          61    0.0147066
 11     4        10     -32.348679       -32.348679         14    -32.248679
 12     4        20     -43.999961       -44                20    -43.999971
 13     6        15     23.886767        23.886767          19    23.886767
 14    10        23     68.829581        68.82956           26    68.82956