European Journal of Operational Research 111 (1998) 598–616
Theory and Methodology
Iterative bundle-based decomposition for large-scale nonseparable convex optimization

Koohyun Park *, Yong-Sik Shin

Department of Industrial Engineering, Hong-Ik University, 72-1 Sangsu-dong, Mapo-ku, Seoul 121-791, South Korea

Received 7 September 1996; accepted 24 July 1997
Abstract

There has been considerable research in solving large-scale separable convex optimization problems. In this paper we present an algorithm for large-scale nonseparable smooth convex optimization problems with block-angular linear constraints. The solution of the problem is approximated by solving a sequence of structured separable quadratic programs. The bundle-based decomposition (BBD) method of Robinson (In: Prékopa, A., Szelezsán, J., Strazicky, B. (Eds.), System Modelling and Optimization, Springer, 1986, pp. 751–756; Annales de l'Institut Henri Poincaré: Analyse Non Linéaire 6 (1989) 435–447) is applied to each separable quadratic program. We implement the algorithm and present computational experience. © 1998 Elsevier Science B.V. All rights reserved.

Keywords: Convex programming; Mathematical programming; Nonlinear programming; Optimization; Quadratic programming
1. Introduction

We develop in this paper an algorithm for solving the following large-scale convex optimization problem (LCOP) with linear block-angular structure:

minimize  f(x_1, x_2, …, x_N)   (1)

subject to

B_1 x_1 = b_1,
B_2 x_2 = b_2,
  ⋮
B_N x_N = b_N,   (2)
A_1 x_1 + A_2 x_2 + ⋯ + A_N x_N ≤ a,
x_1 ≥ 0, x_2 ≥ 0, …, x_N ≥ 0,
* Corresponding author. Fax: +82-2-336-1130; e-mail: [email protected]
0377-2217/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S0377-2217(97)00350-0
where f is a closed proper convex function from R^{n_1+n_2+⋯+n_N} to R, but not necessarily separable. For each i = 1, 2, …, N, b_i ∈ R^{m_i}, B_i ∈ R^{m_i×n_i}, A_i ∈ R^{m×n_i}, and a ∈ R^m. The number of coupling constraints, m, is often much smaller than the number of separable constraints m_1 + m_2 + ⋯ + m_N or the number of variables of the problem n_1 + n_2 + ⋯ + n_N.

The large-scale optimization problem has wide applications such as production and inventory scheduling, production and distribution planning, input–output analysis of economic planning, and multicommodity network flow problems. There has been significant research on the case in which the objective function is separable and convex (see Ahuja et al., 1993 and Lasdon, 1970, for applications). This paper is stimulated by Park's research on the optimal reconfiguration of a virtual path network (Park, 1995), which finds a path and assigns the bandwidth of the path for each traffic class in an ATM (Asynchronous Transfer Mode) B-ISDN (Broadband Integrated Services Digital Network). The design of reconfigurable virtual paths can be formulated as a multicommodity network flow type problem with the structure of LCOP. The objective function of the reconfiguration problem is not separable, and hence existing decomposition methods are difficult to apply.

If the objective function of LCOP is linear, that is, f = Σ_{i=1}^N ⟨c_i, x_i⟩, LCOP can be solved by Dantzig–Wolfe's decomposition method or Rosen's partitioning method. If the objective function is separable, that is, f(x_1, x_2, …, x_N) = Σ_{i=1}^N f_i(x_i), the bundle-based decomposition (BBD) method of Robinson (1986, 1989) is very efficient. The computations of Medhi (1994) showed that BBD was superior to DECOMP (an implementation of the Dantzig–Wolfe decomposition method) for solving LCOP with linear objective function and equality coupling constraints. Bundle-based decomposition applies the bundle method of Lemaréchal et al. (1981) to the conjugate dual problem of LCOP. By conjugate duality theory (Rockafellar, 1970), the dual objective of LCOP is nonsmooth, but its function value and a subgradient can be obtained from the solutions of decomposed primal subproblems.

The constraints of LCOP have a special linear block-angular structure, but its objective function is not separable. Efficient solution methods for nonseparable large-scale optimization are not available. In this study we assume that the objective function is smooth, that is, at least twice differentiable. We then suggest an iterative quadratic programming method: in each iteration, the objective function is approximated by a separable quadratic function and the bundle-based decomposition of Robinson (1986) is applied to the approximated problem. Therefore, this study can be compared to the work of Medhi (1994), who implemented bundle-based decomposition for linear, and hence separable, LCOP.

After a brief introduction of notation in this section, Section 2 develops an iterative BBD algorithm for nonseparable LCOP. Section 3 implements the algorithm and Section 4 contains computational experience. Section 5 verifies the applicability of the algorithm to a network design problem, which is a motivation of this study. Section 6 gives the conclusion of this paper, and Appendix A investigates the convergence of the algorithm.

We briefly describe our notations and definitions here; they can be found in Rockafellar (1970). The scalar product of two vectors x and y in R^n is denoted by ⟨x, y⟩. Let f be a function from R^n to R. Then, for any x^1, x^0 in R^n and each λ ∈ (0, 1), if

f(λx^1 + (1 − λ)x^0) ≤ λf(x^1) + (1 − λ)f(x^0) − ½ρλ(1 − λ)‖x^1 − x^0‖²,   (3)

f is said to be strongly convex with modulus ρ. For ρ = 0, f is simply convex. Let M be an n × n matrix; then M is said to be positive definite with modulus ρ if, for any x ∈ R^n, ⟨Mx, x⟩ ≥ ρ‖x‖². The following lemma on the relationship between the strong convexity of a function and the positive definiteness of its Hessian matrix is well known (see Robinson, 1985, for the proof).
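As a quick numerical illustration of the definition (our example, not from the paper): f(x) = x² satisfies inequality (3) with modulus ρ = 2, which can be checked directly.

```python
# Numerically check the strong-convexity inequality
#   f(l*x1 + (1-l)*x0) <= l*f(x1) + (1-l)*f(x0) - 0.5*rho*l*(1-l)*|x1-x0|^2
# for f(x) = x^2, which is strongly convex with modulus rho = 2.

def f(x):
    return x * x

rho = 2.0
x1, x0 = 3.0, -1.0

for k in range(1, 10):
    lam = k / 10.0
    lhs = f(lam * x1 + (1 - lam) * x0)
    rhs = lam * f(x1) + (1 - lam) * f(x0) \
        - 0.5 * rho * lam * (1 - lam) * (x1 - x0) ** 2
    assert lhs <= rhs + 1e-12, (lam, lhs, rhs)
```

For this quadratic f the inequality in fact holds with equality at every λ, which is why ρ = 2 is the largest admissible modulus.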
Lemma 1.1. Let f be C² from a neighborhood of x_0 in R^n to R. Let A be an affine set with x_0 ∈ A and let S be the subspace parallel to A. Then f is strongly convex on a neighborhood of x_0 in A if and only if ∇²f(x_0) is positive definite on S.

The conjugate of a convex function f is defined as

f*(x) = max {⟨x, z⟩ − f(z) | z ∈ R^n}.   (4)

The subdifferential of a concave function g at y ∈ R^m, denoted by ∂g(y), is defined as the set

∂g(y) = {p ∈ R^m | ∀z, g(z) ≤ g(y) + ⟨p, z − y⟩}.   (5)
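For intuition about Eq. (4) (an illustration of ours, not from the paper): for f(z) = ½z² the conjugate is f*(x) = ½x², and a crude grid search over z recovers it.

```python
# Approximate the conjugate f*(x) = max_z { x*z - f(z) } by a grid search.
# For f(z) = 0.5*z^2 the exact conjugate is f*(x) = 0.5*x^2
# (the maximizing z is z = x).

def f(z):
    return 0.5 * z * z

def conjugate(x, lo=-10.0, hi=10.0, steps=20001):
    h = (hi - lo) / (steps - 1)
    return max(x * (lo + i * h) - f(lo + i * h) for i in range(steps))

for x in (-2.0, 0.0, 1.5, 3.0):
    assert abs(conjugate(x) - 0.5 * x * x) < 1e-3, x
```

The grid search is only a teaching device; in the algorithm below the analogous maximization is done exactly by solving a structured quadratic program.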
An element of the subdifferential is called a subgradient.

2. Iterative BBD algorithm

In this section, we develop an iterative BBD algorithm applying the BBD method of Robinson (1986) to LCOP. The conjugate dual of the problem (LCOP) is the problem

maximize  {g(y) | y ≥ 0},   (6)

where

g(y) = min { f(x_1, x_2, …, x_N) + ⟨Σ_{i=1}^N A_i x_i − a, y⟩ | B_i x_i = b_i, x_i ≥ 0, i = 1, 2, …, N }
     = min { f(x_1, x_2, …, x_N) + ⟨Σ_{i=1}^N A_i x_i, y⟩ | B_i x_i = b_i, x_i ≥ 0, i = 1, 2, …, N } − ⟨a, y⟩.   (7)
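For intuition (a toy instance of our own choosing, not from the paper): with one variable per block, no B_i x_i = b_i rows, and simple data, g(y) of Eq. (7) can be evaluated by brute force over a grid of feasible points, and its concavity in y can be observed.

```python
# Brute-force evaluation of the dual function
#   g(y) = min_x { f(x) + <sum_i A_i x_i - a, y> }  over feasible x >= 0
# for a toy instance (ours): N = 2 scalar blocks, no equality rows,
# f(x1, x2) = x1^2 + x2^2 + x1*x2 (nonseparable), A_1 = A_2 = [1], a = [1].

def f(x1, x2):
    return x1 * x1 + x2 * x2 + x1 * x2

def g(y, grid=101, hi=2.0):
    best = float("inf")
    for i in range(grid):
        for j in range(grid):
            x1, x2 = hi * i / (grid - 1), hi * j / (grid - 1)
            best = min(best, f(x1, x2) + (x1 + x2 - 1.0) * y)
    return best

# concavity along a segment: g((y1+y2)/2) >= (g(y1)+g(y2))/2
y1, y2 = 0.0, 2.0
assert g(0.5 * (y1 + y2)) >= 0.5 * (g(y1) + g(y2)) - 1e-9
```

Of course, when f is nonseparable this minimization does not decompose over the blocks — which is exactly the difficulty the iterative BBD algorithm below addresses.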
Since f(x_1, x_2, …, x_N) in g(y) is not separable, we cannot evaluate the value of the function g(y) by directly solving decomposed subproblems as in the BBD method. In the iterative BBD algorithm for smooth LCOP, we improve the current solution by approximating f by a separable quadratic function and applying the BBD method to the approximated problem. That is, the Hessian matrix of f at the current solution x^k = (x_1^k, x_2^k, …, x_N^k), denoted by ∇²f, is approximated by a block-diagonal matrix:

f(x) ≈ ½⟨∇²f(x^k)(x − x^k), x − x^k⟩ + ⟨∇f(x^k), x − x^k⟩ + f(x^k)
     ≈ ½ Σ_{i=1}^N ⟨M_i(x_i − x_i^k), x_i − x_i^k⟩ + Σ_{i=1}^N ⟨q_i, x_i − x_i^k⟩ + f(x^k)
     = ½ Σ_{i=1}^N ⟨M_i x_i, x_i⟩ + Σ_{i=1}^N ⟨q_i − M_i x_i^k, x_i⟩ + Σ_{i=1}^N n_i + f(x^k),   (8)

where

M_i = ∂²f(x^k)/∂x_i²,   q_i = ∂f(x^k)/∂x_i,   n_i = ½⟨M_i x_i^k, x_i^k⟩ − ⟨q_i, x_i^k⟩.   (9)
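To make Eqs. (8) and (9) concrete, the following sketch (our illustration; the particular nonseparable function is an assumption chosen for the example) builds the block-diagonal model of a two-variable function by finite differences and checks that the model interpolates f at x^k.

```python
# Separable quadratic model of a nonseparable f (Eqs. (8)-(9)):
# keep only the diagonal blocks M_i of the Hessian, plus the exact
# gradient blocks q_i and the constants n_i.
# Example f (our choice): f(x1, x2) = x1^2 + x2^2 + 0.5*x1*x2.

def f(x):
    return x[0] ** 2 + x[1] ** 2 + 0.5 * x[0] * x[1]

def grad(x, h=1e-6):          # central finite differences
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

def diag_hess(x, h=1e-4):     # only the diagonal entries d2f/dxi2
    d = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        d.append((f(xp) - 2 * f(x) + f(xm)) / (h * h))
    return d

def separable_model(xk):
    M, q, fk = diag_hess(xk), grad(xk), f(xk)
    n = [0.5 * M[i] * xk[i] ** 2 - q[i] * xk[i] for i in range(len(xk))]

    def model(x):
        s = fk
        for i in range(len(x)):
            s += 0.5 * M[i] * x[i] ** 2 + (q[i] - M[i] * xk[i]) * x[i] + n[i]
        return s

    return model

xk = [1.0, 2.0]
m = separable_model(xk)
assert abs(m(xk) - f(xk)) < 1e-6      # model interpolates f at x^k
```

The cross term 0.5·x1·x2 is discarded by the block-diagonal approximation, which is precisely what makes the model separable and hence decomposable.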
Note that all x_i's in Eq. (9) are vectors. Fig. 1 shows the gradient vector and the approximated Hessian matrix of f(x) at x^k. Given M_i, q_i and n_i as in Eq. (9), define f_i(x_i) at x_i^k as follows:

f_i(x_i) = ½⟨M_i x_i, x_i⟩ + ⟨q_i − M_i x_i^k, x_i⟩ + n_i   if B_i x_i = b_i and x_i ≥ 0,
         = +∞                                              otherwise.   (10)

Note that if f(x_1, x_2, …, x_N) is (strongly) convex, M_i is positive (semi-)definite and f_i(x_i) is (strongly) convex. To see this, let f(x_1, x_2, …, x_N) be strongly convex with modulus ρ (for the simple convexity case, ρ = 0). Then, from Lemma 1.1, the Hessian matrix of f is positive definite with modulus ρ. Hence for any x_i ∈ R^{n_i},

⟨∇²f(x)(0, …, 0, x_iᵀ, 0, …, 0)ᵀ, (0, …, 0, x_iᵀ, 0, …, 0)ᵀ⟩ = ⟨M_i x_i, x_i⟩ ≥ ρ‖x_i‖².   (11)

Therefore M_i is positive definite with modulus ρ, so f_i(x_i) is strongly convex with modulus ρ.

By substituting the approximated quadratic function of Eq. (8) at x^k = (x_1^k, x_2^k, …, x_N^k) into the dual objective function g(y) of Eq. (7), we have

g(y) = −⟨a, y⟩ + Σ_{i=1}^N min { f_i(x_i) − ⟨−A_iᵀy, x_i⟩ } = −⟨a, y⟩ − Σ_{i=1}^N f_i*(−A_iᵀy).   (12)

Under a relative-interior ('chain rule') condition on the domains of the f_i (see Sections 23 and 6, especially Theorems 23.8 and 23.9, of Rockafellar (1970) for the chain rule of Eq. (13) and relative interiors; 'ri' stands for relative interior), the subdifferential ∂g(y) is given by

∂g(y) = −a + Σ_{i=1}^N A_i ∂f_i*(−A_iᵀy),   (13)

where ∂f_i*(−A_iᵀy) = arg min { f_i(x_i) − ⟨−A_iᵀy, x_i⟩ }. Hence ∂f_i*(−A_iᵀy) is the solution set of the ith decomposed subproblem (QP(y))_i:

(QP(y))_i:  minimize  ½⟨M_i x_i, x_i⟩ + ⟨q_i − M_i x_i^k + A_iᵀy, x_i⟩
            subject to  B_i x_i = b_i,  x_i ≥ 0.
Fig. 1. The gradient vector and the approximated Hessian matrix of f(x) at x^k.
At any feasible point y^k we can evaluate the function value and a subgradient of the dual objective g(y). For each i = 1, …, N, let the optimal solution and optimal value of (QP(y^k))_i be denoted by x_i^k and z_i^k, respectively. Then the function value g(y^k) and a subgradient p^k of g at y^k are given by

g(y^k) = −⟨a, y^k⟩ + Σ_{i=1}^N z_i^k,   (14)

p^k = −a + Σ_{i=1}^N A_i x_i^k.   (15)
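Given the subproblem optimal values z_i^k and solutions x_i^k, Eqs. (14) and (15) are simple assemblies. A sketch with toy numbers of our own choosing:

```python
# Assemble the dual value and subgradient of Eqs. (14)-(15):
#   g(y^k) = -<a, y^k> + sum_i z_i^k,    p^k = -a + sum_i A_i x_i^k.

def dual_value_and_subgradient(a, y, z, A, x):
    g = -sum(ai * yi for ai, yi in zip(a, y)) + sum(z)
    p = [-ai for ai in a]
    for Ai, xi in zip(A, x):                 # accumulate A_i x_i^k
        for r in range(len(a)):
            p[r] += sum(Ai[r][c] * xi[c] for c in range(len(xi)))
    return g, p

# toy data (ours): m = 1 coupling row, N = 2 subproblems
a = [4.0]
y = [0.5]
z = [1.0, 2.5]                 # optimal values of the subproblems
A = [[[1.0, 2.0]], [[3.0, 0.0]]]
x = [[1.0, 0.5], [0.0, 7.0]]   # optimal subproblem solutions
g, p = dual_value_and_subgradient(a, y, z, A, x)
assert abs(g - 1.5) < 1e-12         # -4*0.5 + 1.0 + 2.5
assert abs(p[0] - (-2.0)) < 1e-12   # -4 + (1*1 + 2*0.5) + (3*0 + 0*7)
```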
Now we can apply the BBD algorithm to LCOP with the approximated separable quadratic objective function at x^k. Fig. 2 is the flowchart of the iterative BBD algorithm for LCOP.

At this point we comment on the difficulty of using a nonsmooth optimization method to solve LCOP by maximizing the dual function g. Even though we know an exact solution y* for g, we may not be able to find a primal optimal solution x_1*, …, x_N* for LCOP. That is, even though we know that 0 ∈ ∂g(y*), we can compute only one subgradient of ∂g(y*), namely −a + Σ_{i=1}^N A_i x_i(y*), which may be far from zero. But the BBD method circumvents this difficulty through the manner in which the approximate zero subgradient is constructed (see Robinson (1986, 1989) and Medhi (1994) for a detailed description of the construction and error estimate of the BBD method).

3. Implementation of the algorithm

We implement the iterative BBD algorithm in two parts. The first part is a modification of the work of Medhi (1994), who implemented the BBD method of Robinson (1986) for large-scale linear programs with block-angular structure; our modified implementation is for quadratic programs with LCOP structure. There is another difference between our implementation and his: we have inequality coupling constraints in the block-angular structure. Because of the inequality constraints, the dual problem has nonnegativity constraints, and we therefore also need to project a subgradient onto a tangent cone when computing a search direction.

The second part approximates the nonseparable objective function of LCOP so that the BBD method can be applied, and develops line search procedures for convergence. If an efficient and finite algorithm terminates for each primal convex quadratic subproblem, each iteration of the BBD method is well defined. For the convergence of the solutions of the approximated problems, we develop line search procedures for local and global convergence.

For precise description we distinguish between the inner iteration and the outer iteration of the algorithm. The inner iteration, the iteration of the BBD method applied within our algorithm, will be called a BBD iteration. The outer iteration will be called an SQA iteration, which stands for Sequential Quadratic Approximation. We present the steps of the implementation. The notations for parameters, variables, and bundles are as in Medhi (1994).
Step 0: Initialization
1. Select the initial point of the primal problem (LCOP), x̂ = (x̂_1, x̂_2, …, x̂_N).
2. Select the initial point of the dual problem (6), y^1.
3. Choose the allowable error of the primal problem (LCOP), τ.
4. Choose the allowable error of the dual problem (6), ε, and the parameters δ > 0, ε̂ ≥ ε.
5. Choose a number b for the maximum size of the bundle.
6. Choose parameters β, γ, μ with 0 < γ < β < 1 and μ ∈ (0, 1).

Fig. 2. The flowchart of the iterative BBD algorithm.

Step 1: Quadratic approximation
1. For i = 1, 2, …, N and at x̂, compute M_i = ∂²f(x̂)/∂x_i², q_i = ∂f(x̂)/∂x_i, and n_i = ½⟨M_i x̂_i, x̂_i⟩ − ⟨q_i, x̂_i⟩.

Step 2: Solving the decomposed subproblems
1. For i = 1, 2, …, N and at y^1, solve the decomposed subproblems (QP(y^1))_i:

(QP(y^1))_i:  minimize  ½⟨M_i x_i, x_i⟩ + ⟨q_i − M_i x̂_i + A_iᵀy^1, x_i⟩
              subject to  B_i x_i = b_i,  x_i ≥ 0.

2. Denote the optimal solution of (QP(y^1))_i by x_i^1 and its optimal value by z_i^1.

Step 3: Evaluation of function and subgradient of dual problem
1. Evaluate the function value g(y^1) = −⟨a, y^1⟩ + Σ_{i=1}^N z_i^1.
2. Evaluate the subgradient of g(y) at y^1: p^1 = −a + Σ_{i=1}^N A_i x_i^1.

Step 4: Setting the initial bundles
1. Let D = {p^1}, P = {(x_1^1, x_2^1, …, x_N^1)}, K = {1}, α_11 = 0.
2. Let k = 1, e_1 = ε̂.

Step 5: Computation of search direction of dual problem
1. Let e_k = max{ε, min{e_k, ε̂}}.
2. For each j ∈ K, project p^j onto the tangent cone of y^k and denote it by p_P^j, that is,

(p_P^j)_i = p_i^j  if y_i^k > 0,
          = 0      if y_i^k = 0.

3. To obtain the search direction, solve the dual quadratic programming problem (DQP), using Lemke's algorithm:

(DQP)  min  ½ ‖Σ_{j∈K} λ_j p_P^j‖²
       s.t.  Σ_{j∈K} λ_j = 1,
             Σ_{j∈K} α_{jk} λ_j ≤ e_k,
             λ_j ≥ 0,  j ∈ K.

4. Denote the optimal solution of (DQP) by λ_j^k (j ∈ K) and denote the multiplier associated with the constraint Σ_{j∈K} α_{jk} λ_j ≤ e_k by s_k. The multipliers and solutions are obtained simultaneously from (DQP) if we apply Lemke's algorithm to the transformed linear complementarity problem.
5. Compute d^k = Σ_{j∈K} λ_j^k p_P^j.
6. Let v_k = ‖d^k‖² + s_k e_k.

Step 6: Convergence test of inner BBD iteration
1. If ‖d^k‖ < δ and Σ_{j∈K} α_{jk} λ_j^k ≤ ε, let λ_j = λ_j^k and go to Step 9.
2. If ‖d^k‖ < δ and Σ_{j∈K} α_{jk} λ_j^k > ε, let ε̂ = με̂ and go to Step 5.
3. If ‖d^k‖ ≥ δ, go to Step 7.

Step 7: Update bundles
1. If the size |K| of the current bundle is less than b, then K = {j ∈ K | λ_j^k > 0}, D = {p^j | j ∈ K}, P = {(x_1^j, x_2^j, …, x_N^j) | j ∈ K}.
2. If the size |K| of the current bundle is equal to b, then aggregate: let p̄^k = Σ_{j∈K} λ_j^k p^j and set D = {p̄^k}, P = {Σ_{j∈K} λ_j^k (x_1^j, x_2^j, …, x_N^j)}, α_{kk} = Σ_{j∈K} α_{jk} λ_j^k, K = {k}.

Step 8: Line search of BBD iteration
1. Compute a trial point u^{k+1} = y^k + t_k d^k satisfying the following:
(a) t_k > 0;
(b) ⟨p^{k+1}, d^k⟩ ≥ βv_k;
(c) either g(u^{k+1}) ≥ g(y^k) + γt_k v_k (serious step), or α(y^k, u^{k+1}, p^{k+1}) ≤ μe_k (null step), where α(y^k, u^{k+1}, p^{k+1}) = g(u^{k+1}) − g(y^k) + ⟨p^{k+1}, y^k − u^{k+1}⟩. (Here g(u^{k+1}) and p^{k+1} ∈ ∂g(u^{k+1}) are obtained by solving the (QP(u^{k+1}))_i.)
2. If it is a serious step, update y^{k+1} = u^{k+1}; α_{j,k+1} = α_{jk} + g(y^k) − g(y^{k+1}) + ⟨p^j, y^{k+1} − y^k⟩, j ∈ K; α_{k+1,k+1} = 0; K = K ∪ {k+1}; D = D ∪ {p^{k+1}}; P = P ∪ {(x_1^{k+1}, x_2^{k+1}, …, x_N^{k+1})}; and go to Step 5.
3. If it is a null step, update y^{k+1} = y^k; α_{j,k+1} = α_{jk}, j ∈ K; α_{k+1,k+1} = α(y^{k+1}, u^{k+1}, p^{k+1}); K = K ∪ {k+1}; D = D ∪ {p^{k+1}}; P = P ∪ {(x_1^{k+1}, x_2^{k+1}, …, x_N^{k+1})}; and go to Step 5.

Step 9: Line search of outer SQA iteration
1. Compute x̄_i = Σ_{j∈K} λ_j x_i^j (i = 1, …, N).
2. Find t̃ satisfying either (a) or (b) and compute the next point of the outer iteration, x̃ = x̂ + t̃(x̄ − x̂):
(a) t̃ = 1, i.e., x̃ = x̄ (for local convergence);
(b) t̃ = (1/2)^{m_k}, where m_k is the first nonnegative integer m for which, for given σ ∈ (0, 1),

f(x̂) − f(x̂ + (1/2)^m (x̄ − x̂)) ≥ −σ(1/2)^m ⟨∇f(x̂), x̄ − x̂⟩   (for global convergence).
Step 10: Convergence test of outer SQA iteration
1. If |f(x̃) − f(x̂)| ≤ τ, let x* = x̃ and stop.
2. If |f(x̃) − f(x̂)| > τ, let x̂ = x̃ and go to Step 1.

Since the decomposed subproblems (QP(u^{k+1}))_i or (QP(y^1))_i are quadratic programs, we can use any method for convex quadratic programs. But, since the multipliers and the optimal solution should be obtained simultaneously from (DQP), we transform it into a linear complementarity problem and apply Lemke's algorithm to it (see Ch. 11 of Bazaraa et al. (1993) for the linear complementarity problem and Lemke's algorithm). Hence we implemented Lemke's algorithm for convex quadratic programs.

4. Computational experience

In this section we test the iterative BBD algorithm. We implemented it using ANSI C on an HP9000/700S. We used several types of test problems to present computational results and compared them with GAMS/MINOS.

The first test problem is a quadratic programming problem with a positive definite Hessian matrix, to which we add nonseparable quadratic terms. Though we developed the algorithm for LCOP with equality separable constraints and inequality coupling constraints, we tested it on problems with both equality and inequality constraints, as follows.

Test problem 1: Nonseparable quadratic programming problem.

minimize  ½⟨M1 x^1, x^1⟩ + ½⟨M2 x^2, x^2⟩ + x11 + 2x12 − 2x13 + x14 − x21 + 2x22 + 3x23 − 5x24 + {nonseparable quadratic terms},

where x^1 = (x11, x12, x13, x14)ᵀ, x^2 = (x21, x22, x23, x24)ᵀ and

M1 = [  5  -1   1   0 ]      M2 = [  5   1  -1   1 ]
     [ -1   5   2   1 ]           [  1   5   2   1 ]
     [  1   2   5   0 ]           [ -1   2   5  -1 ]
     [  0   1   0   5 ]           [  1   1  -1   5 ]

subject to

x11 − x12 + 3x13 + 4x14 = 5 (or ≥ 5),
5x11 + 2x12 + 2x14 = 14 (or ≥ 14),
3x21 + x23 + 2x24 = 6 (or ≥ 6),
−2x21 + x22 + x23 + 3x24 = 12 (or ≥ 12),
x11 + 2x12 + x13 + 5x14 + 2x21 − x22 + 4x23 + x24 ≤ 18 (or = 18),
3x11 + x12 + 2x13 − x14 + 3x21 + 2x22 + x23 + 5x24 ≤ 32 (or = 32),
x_ij ≥ 0, i = 1, 2, j = 1, 2, 3, 4.
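As a side check (ours, not part of the paper), a Cholesky factorization confirms that the two diagonal blocks of the quadratic part of Test problem 1 are positive definite, as claimed:

```python
# Check that the two diagonal blocks M1, M2 of Test problem 1 are
# positive definite by attempting a Cholesky factorization: it succeeds,
# with strictly positive pivots, iff the symmetric matrix is PD.

def cholesky(M):
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                piv = M[i][i] - s
                if piv <= 0.0:
                    return None            # not positive definite
                L[i][j] = piv ** 0.5
            else:
                L[i][j] = (M[i][j] - s) / L[j][j]
    return L

M1 = [[5, -1, 1, 0], [-1, 5, 2, 1], [1, 2, 5, 0], [0, 1, 0, 5]]
M2 = [[5, 1, -1, 1], [1, 5, 2, 1], [-1, 2, 5, -1], [1, 1, -1, 5]]

assert cholesky(M1) is not None
assert cholesky(M2) is not None
```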
Ten kinds of nonseparable quadratic terms of the problem are tested. For each, both equality and inequality constraints are considered. Table 1 shows computational results for the various nonseparable cases of Test problem 1; the optimal values are the same as those obtained by GAMS/MINOS.

For other types of nonseparable test problems, exponential and fractional functions are tested in Test problem 2 and Test problem 3, respectively. Test problem 4 has a polynomial of degree 4. The constraints of these problems are the same as those of Test problem 1.

Test problem 2: Exponential function.

minimize  e^{x11} + e^{−3x12} + e^{x13} + 4e^{x14} + e^{−x21} + 7e^{3x22} + e^{x23} + e^{−x24} + e^{x13 + x22} + e^{−x14 + x21}.

Test problem 3: Fractional function.

minimize  (x11 + x21)/(10 − x11 − x21) + (x12 + x22)/(15 − x12 − x22) + (x13 + x23)/(15 − x13 − x23) + (x14 + x24)/(10 − x14 − x24).
Table 1
Computational results for nonseparable quadratic programming (each problem is solved with both constraint variants of Test problem 1; the two numbers of SQA iterations correspond to the two variants)

Problem number | Nonseparable quadratic terms                                    | Number of SQA iterations
0              | (no nonseparable terms)                                         | 2 / 3
1              | 13x13x22                                                        | 3 / 3
2              | 8x13x22                                                         | 4 / 3
3              | 15x12x24                                                        | 4 / 3
4              | 13x11x24 + 4x14x21                                              | 4 / 3
5              | 4x14x21 + 10x14x24                                              | 4 / 3
6              | 3x11x24 + 4x14x21                                               | 4 / 3
7              | 15x12x24 + 8x13x23 + 4x14x21                                    | 3 / 3
8              | 13x13x22 + 8x13x23 + 7x14x23                                    | 6 / 5
9              | 15x12x24 + 13x13x22 + 8x13x23 + 4x14x21                         | 3 / 3
10             | 15x12x24 + 13x13x22 + 8x13x23 + 4x14x21 + 5x14x22               | 3 / 3
Test problem 4: Polynomial of degree 4.

minimize  2x11⁴ + x12⁴ + 3x13⁴ + 4x14⁴ + x21⁴ + 3x22⁴ + 2x23⁴ + 4x24⁴ + 4x11²x23² + 8x14²x24².

Table 2 shows that the iterative BBD algorithm also gives almost the same optimal solution as that obtained by GAMS/MINOS, except for Test problem 4, whose objective function Hessian is not positive semidefinite at some points; our algorithm is for convex problems.

We test the efficiency of the algorithm on large-scale random problems and compare it with GAMS/MINOS. Test problem 5 is a large-scale nonseparable convex quadratic problem. The coefficients of the objective function and the block-angular constraints are generated by the random function in ANSI C.

Table 2
Computational results for additional test problems (two rows per problem, one for each constraint variant)

Problem        | Number of outer SQA iterations | Optimal value (Iterative BBD) | Optimal value (GAMS/MINOS)
Test problem 2 | 4                              | 56 763.399                    | 56 763.399
               | 9                              | 28.044                        | 28.020
Test problem 3 | 7                              | 1.119                         | 1.119
               | 7                              | 1.296                         | 1.296
Test problem 4 | 5                              | 612.069                       | 612.069
               | NO                             | NO a                          | 176.800

a Optimal value not obtained.
Test problem 5: Large-scale nonseparable convex quadratic program.

minimize  ½⟨Mx, x⟩ + Σ_{i=1}^N ⟨c_i, x_i⟩,  where x = (x_1, x_2, …, x_N),

subject to

B_1 x_1 = b_1,
B_2 x_2 = b_2,
  ⋮
B_N x_N = b_N,
A_1 x_1 + A_2 x_2 + ⋯ + A_N x_N ≤ a,
x_1 ≥ 0, x_2 ≥ 0, …, x_N ≥ 0.

We provide a procedure to generate random problems that are feasible and solvable. The library function rand( ) gives a random number uniformly distributed in [0, 1].
(1) Generate a feasible solution: (x_i*)_j = rand( ), i = 1, …, N, j = 1, …, n_i.
(2) Generate an upper triangular matrix W such that W_ii = 10 + rand( ) and W_ij = 10(rand( ) − 1.0) for i < j.
(3) Compute M = WᵀW. Then M is positive semidefinite, and it is positive definite if W is nonsingular.
(4) For the separable constraints B_i x_i = b_i, generate B_i using rand( ) and compute the right-hand side b_i to satisfy the equality:

(B_i)_{jk} = rand( ) · 10,   (b_i)_j = Σ_{k=1}^{n_i} (B_i)_{jk} (x_i*)_k,   i = 1, …, N, j = 1, …, m_i, k = 1, …, n_i.

(5) For the coupling constraints Σ_{i=1}^N A_i x_i ≤ a, generate the A_i's using rand( ) and compute the right-hand side a to satisfy the inequality:

(A_i)_{jk} = rand( ) · 10,   a = Σ_{i=1}^N A_i x_i* + rand( ) · 10,   i = 1, …, N, j = 1, …, m, k = 1, …, n_i.
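The five-step generator can be sketched as follows (our rendering; the exact random scalings in the paper are partly garbled in our copy, so the constants here are illustrative). The key invariants — M = WᵀW positive semidefinite, B_i x_i* = b_i, and Σ A_i x_i* < a strictly — are checked at the end.

```python
import random

# Sketch of the random-problem generator of steps (1)-(5):
# a feasible point x*, M = W^T W (PSD), equality blocks B_i x_i = b_i
# built to hold at x*, and coupling rows A_i with strictly positive
# slack at x* (Slater-type condition). Sizes/scalings are illustrative.

random.seed(0)
N, m = 2, 3                  # subproblems, coupling rows
ni = [4, 3]                  # variables per subproblem
mi = [2, 2]                  # separable rows per subproblem
n = sum(ni)

# (1) feasible solution
xstar = [[random.random() for _ in range(ni[i])] for i in range(N)]

# (2)-(3) M = W^T W with W upper triangular and nonzero diagonal
W = [[0.0] * n for _ in range(n)]
for i in range(n):
    W[i][i] = 10.0 + random.random()
    for j in range(i + 1, n):
        W[i][j] = 10.0 * (random.random() - 1.0)
M = [[sum(W[k][i] * W[k][j] for k in range(n)) for j in range(n)]
     for i in range(n)]

# (4) separable constraints satisfied with equality at x*
B = [[[10.0 * random.random() for _ in range(ni[i])]
      for _ in range(mi[i])] for i in range(N)]
b = [[sum(B[i][j][k] * xstar[i][k] for k in range(ni[i]))
      for j in range(mi[i])] for i in range(N)]

# (5) coupling constraints with strictly positive slack at x*
A = [[[10.0 * random.random() for _ in range(ni[i])]
      for _ in range(m)] for i in range(N)]
Ax = [sum(A[i][r][k] * xstar[i][k] for i in range(N) for k in range(ni[i]))
      for r in range(m)]
a = [Ax[r] + 10.0 * random.random() + 0.1 for r in range(m)]

# checks: PSD quadratic form and strict coupling slack
v = [random.random() - 0.5 for _ in range(n)]
assert sum(v[i] * M[i][j] * v[j] for i in range(n) for j in range(n)) >= 0.0
assert all(Ax[r] < a[r] for r in range(m))
```

The small additive constant 0.1 in a is our addition, guaranteeing strict slack even if the random term is near zero.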
The choice of a ensures that Σ_{i=1}^N A_i x_i* < a for the generated feasible solution x_i*. This guarantees the Slater-type condition (A.9) of the convergence theorems in Appendix A.

Table 3 gives the specifications of the 18 randomly generated instances of Test problem 5 produced by the procedure above. For instance, for problem TP5-1(a) the whole problem has 105 constraint rows and 200 variables, and for i = 1, …, 20, B_i is a 5 × 10 matrix and A_i is a 5 × 10 matrix. For each row of the table, variants (a) and (b) have the same size of the whole problem and the same number of coupling constraints, but (a) has a smaller number of subproblems and hence a higher density than (b).

Table 3
Randomly generated large-scale quadratic programming problems

Problem name | Size of whole problem | Coupling constraints | Subproblems | Size of subproblems | Density (%)
TP5-1 a      | 105 × 200             | 5                    | 20          | 5 × 10              | 9.5
      b      | 105 × 200             | 5                    | 50          | 2 × 4               | 6.7
TP5-2 a      | 105 × 300             | 5                    | 20          | 5 × 15              | 9.5
      b      | 105 × 300             | 5                    | 50          | 2 × 6               | 6.6
TP5-3 a      | 205 × 400             | 5                    | 20          | 10 × 20             | 7.3
      b      | 205 × 400             | 5                    | 50          | 4 × 8               | 4.4
TP5-4 a      | 210 × 500             | 10                   | 20          | 10 × 25             | 9.5
      b      | 210 × 500             | 10                   | 50          | 4 × 10              | 6.6
TP5-5 a      | 310 × 600             | 10                   | 20          | 15 × 30             | 8.1
      b      | 310 × 600             | 10                   | 50          | 6 × 12              | 5.2
TP5-6 a      | 315 × 700             | 15                   | 20          | 15 × 35             | 9.6
      b      | 315 × 700             | 15                   | 50          | 6 × 14              | 6.7
TP5-7 a      | 415 × 800             | 15                   | 20          | 20 × 40             | 8.4
      b      | 415 × 800             | 15                   | 50          | 8 × 16              | 5.5
TP5-8 a      | 420 × 900             | 20                   | 20          | 20 × 45             | 9.5
      b      | 420 × 900             | 20                   | 50          | 8 × 18              | 6.6
TP5-9 a      | 520 × 1000            | 20                   | 20          | 25 × 50             | 8.7
      b      | 520 × 1000            | 20                   | 50          | 10 × 20             | 5.8

We present the optimal value and the solution time obtained by the iterative BBD algorithm and by GAMS/MINOS in Table 4, together with the convergence behavior and the number of iterations of our iterative BBD algorithm. X-norm equals ‖x^{k+1} − x^k‖ in the final iteration. GAMS/MINOS does not obtain an optimal solution for the problems from TP5-5 to TP5-9; it reports 'Ran Out of Memory'. Fig. 3 shows the growth of the execution time of the iterative BBD algorithm and of GAMS/MINOS in terms of problem size.

Table 4
Comparison of iterative BBD and GAMS/MINOS
Columns: optimal value (Iterative BBD | GAMS/MINOS); execution time in s (Iterative BBD | GAMS/MINOS); X-norm (10^-7); total BBD iterations; outer SQA iterations; average BBD iterations a

TP5-1 a | 167 856.25   | 167 856.25 |  51 |  170 | 0.02 |  41 |  4 | 10.25
      b | 285 017.31   | 285 017.31 |  26 |  144 | 0.19 |  36 |  3 | 12
TP5-2 a | 174 587.66   | 174 587.66 |  57 |  395 | 0.45 |  36 |  4 | 9
      b | 326 971.53   | 326 971.53 |  49 |  402 | 0.86 |  44 |  4 | 10
TP5-3 a | 335 254.56   | 335 254.56 | 139 |  758 | 0.05 |  46 |  6 | 7.67
      b | 775 069.80   | 775 069.80 |  83 |  758 | 0.40 |  42 |  5 | 8.4
TP5-4 a | 356 739.70   | 356 739.70 | 182 | 1219 | 0.11 |  75 | 10 | 7.5
      b | 689 349.03   | 689 349.03 | 137 | 1227 | 0.12 |  59 |  9 | 6.56
TP5-5 a | 319 828.14   | –          | 246 | –    | 0.53 |  76 |  8 | 9.5
      b | 567 163.88   | –          | 231 | –    | 0.38 |  71 |  9 | 7.89
TP5-6 a | 431 042.42   | –          | 377 | –    | 2.10 |  89 | 10 | 8.9
      b | 747 365.30   | –          | 294 | –    | 0.49 |  77 |  9 | 8.56
TP5-7 a | 2 135 073.87 | –          | 512 | –    | 2.98 |  99 | 12 | 8.25
      b | 1 088 720.56 | –          | 447 | –    | 0.16 |  93 |  9 | 10.33
TP5-8 a | 879 233.57   | –          | 698 | –    | 7.71 | 129 | 12 | 10.75
      b | 1 720 367.74 | –          | 580 | –    | 1.13 | 104 | 11 | 9.45
TP5-9 a | 943 126.48   | –          | 810 | –    | 0.73 | 148 | 14 | 10.57
      b | 638 473.44   | –          | 754 | –    | 0.89 | 124 | 13 | 9.53

a Average BBD iterations = (total number of BBD iterations)/(number of SQA iterations).
Fig. 3. Growth of execution time in terms of number of variables.
We analyse the growth of the execution time in terms of the number of coupling constraints. Given the size of the whole problem and the number of subproblems, the density of each problem increases with the number of coupling constraints. We report the results in Table 5 and Fig. 4.

For results on the application of the iterative BBD algorithm to the reconfiguration of virtual paths in an ATM network, which is the motivation for this study, refer to Park and Shin (1996, 1997). In this paper, we verify the applicability of the algorithm to the problem.

Table 5
Growth of execution time in terms of number of coupling constraints

Problem | Size of whole | Coupling    | Subproblems | Size of     | Density | Execution time (s)
name    | problem       | constraints |             | subproblems | (%)     | Iterative BBD | GAMS/MINOS
TP6-1   | 250 × 500     | 10          | 10          | 24 × 50     | 13.6    | 178           | 1235
TP6-2   | 250 × 500     | 20          | 10          | 23 × 50     | 17.2    | 199           | 1230
TP6-3   | 250 × 500     | 30          | 10          | 22 × 50     | 20.8    | 253           | 1239
TP6-4   | 250 × 500     | 40          | 10          | 21 × 50     | 24.4    | 313           | 1232
TP6-5   | 250 × 500     | 50          | 10          | 20 × 50     | 28.0    | 374           | 1232

Fig. 4. Growth of execution time in terms of number of coupling constraints.
5. Verification of applicability of the algorithm to a network design problem

In this section we introduce a network design problem and verify the applicability of the algorithm to it. The problem is to find an optimal design of a virtual path network in an ATM network. Without detailed explanation, we introduce the problem, which is slightly simpler than the original model in Park and Shin (1996, 1997); see those papers for the model development and a detailed explanation.

(P)  minimize  f = Σ_{i=1}^m [ Σ_{k=1}^N A_k^i x_k ] / [ a_i − Σ_{k=1}^N A_k^i x_k ]
     subject to  B_k x_k = b_k,  k = 1, …, N,
                 Σ_{k=1}^N A_k x_k ≤ a(1 − q),
                 x_k ≥ 0,  k = 1, …, N.

The subscript k of the problem (P) denotes the kth traffic class. For each traffic class
xk i , denotes the assigned tracs to the candidate path i for the trac class k. The objective function f denotes the average cell delay and it would be a cost function of optimal routing of Codex network for one trac class (see Bertsekas and Gallager (1987) for optimal routing). Note that f is not separable. Equality constraint is for demand of each trac class. The inequality constraints stand for the limit of transmission capacity of physical link. q is a reservation ratio for special trac control. We developed two convergences of the iterative BBD algorithm: local convergence and global convergence. We leave them for Appendix A. Now we show that the objective function f of the problem (P) is strongly convex. And we prove that the problem (P) satis®es the hypothesis of Theorem A.3. Proposition 5.1. Let D maxfai ji 1; . . . ; mg. Then the objective function f of (P) is strongly convex with modulus 2=D3 . P Proof. For each i 1; . . . ; m; let hi
x hi
x1 ; . . . ; xN Nk1 Aik xk and let gi
y y=
ai ÿ y be de®ned in R. Then, gi is strongly convex with modulus 2=D3 for y P 0 since gi00
y
2
ai ÿ y
3
P
2 2 P 3: 3 ai D
Hence the composite g_i ∘ h_i is strongly convex with modulus 2/Δ³. To show this, choose any x^1 = (x_1^1, …, x_N^1) and x^0 = (x_1^0, …, x_N^0) in R^{n_1+n_2+⋯+n_N} and λ in (0, 1). Then, from the linearity of h_i,

h_i(λx^1 + (1 − λ)x^0) = λh_i(x^1) + (1 − λ)h_i(x^0),

and from the strong convexity of g_i,

g_i(h_i(λx^1 + (1 − λ)x^0)) = g_i(λh_i(x^1) + (1 − λ)h_i(x^0))
  ≤ λg_i(h_i(x^1)) + (1 − λ)g_i(h_i(x^0)) − ½ (2/Δ³) λ(1 − λ)‖x^1 − x^0‖².

Since f = Σ_{i=1}^m g_i ∘ h_i(x), f is strongly convex with modulus 2/Δ³. □

We assume that (P) satisfies hypothesis (i) of Theorem A.3. In fact, hypothesis (i), the Slater-type condition, depends upon the instance of (P), but it would be satisfied for a real design problem in the
application. Hypothesis (ii), the compact level set condition, is satisfied since each A_k is a nonnegative matrix.

Theorem 5.2. Suppose that (P) satisfies the Slater-type condition, that the rows of the B_k's (k = 1, …, N) are linearly independent, and that d = min{a_i | i = 1, …, m} is positive. Then (P) satisfies the hypotheses of Theorem A.3.

Proof. Hypotheses (i) and (ii) of Theorem A.3 are clear, and hypothesis (i) implies hypothesis (iv) by the theorem of Gauvin (1977). For any feasible solution x^0, let

M_i(x^0) = ∂²f(x^0)/∂x_i²   and   M(x^0) = diag(M_1(x^0), M_2(x^0), …, M_N(x^0)),

as described in Section 2. Now we claim that, for any x,

(2/Δ³)‖x‖² ≤ ⟨M(x^0)x, x⟩ ≤ (2k̂m/(d²q³))‖x‖²,

where k̂ is the maximum absolute value of the eigenvalues of (A_k^i)ᵀ(A_k^i) over all k = 1, …, N and i = 1, …, m. From Proposition 5.1 we have the left inequality, since

⟨M(x^0)x, x⟩ = Σ_{k=1}^N ⟨M_k(x^0)x_k, x_k⟩ ≥ Σ_{k=1}^N (2/Δ³)‖x_k‖² = (2/Δ³)‖x‖²,

and we also have the right inequality, since

⟨M(x^0)x, x⟩ = Σ_{k=1}^N ⟨M_k(x^0)x_k, x_k⟩ = Σ_{k=1}^N Σ_{i=1}^m [ 2a_i x_kᵀ(A_k^i)ᵀA_k^i x_k ] / (a_i − Σ_{k=1}^N A_k^i x_k^0)³
  ≤ Σ_{k=1}^N Σ_{i=1}^m [ 2 x_kᵀ(A_k^i)ᵀA_k^i x_k ] / (a_i² q³)
  ≤ Σ_{i=1}^m (2/(d² q³)) Σ_{k=1}^N k̂‖x_k‖² ≤ (2k̂m/(d² q³))‖x‖².

We obtain the first inequality from the feasibility of x^0, and the second from the symmetry of (A_k^i)ᵀA_k^i. We have proved hypothesis (iii) of Theorem A.3, and this completes the proof of Theorem 5.2. □

6. Conclusion

In this paper we suggested an algorithm for the large-scale nonseparable convex optimization problem with linear block-angular structure of type LCOP. Under a smoothness condition on the objective function we developed an iterative BBD algorithm. In each iteration, the objective function is approximated by a separable quadratic function and the BBD method of Robinson (1986) is applied to the approximated separable problem. This study is an extension of the BBD method of Robinson (1986, 1989) to the nonseparable case.
We developed local and global convergence results for the algorithm and verified its applicability to a network design problem, which was a motivation of this study. We also tested the algorithm on several types of problems and compared it with GAMS/MINOS on randomly generated nonseparable quadratic programming problems. Based on the computational experience, the algorithm appears to be promising and efficient for nonseparable convex optimization problems.
Acknowledgements We are thankful to the referees for their remarks and in particular to one of them who provided us with many helpful suggestions which improved the content and the presentation of the paper considerably. This work was supported by grant No. KOSEF 941-1000-032-2 from the Korea Science and Engineering Foundation.
Appendix A. Convergence of the algorithm

In this appendix we establish convergence of the iterative BBD algorithm. As described in Section 3, the algorithm contains two kinds of iterations: inner BBD iterations and outer SQA iterations. We first investigate the convergence of the BBD method for the separable convex optimization problem (A.1) with inequality coupling constraints. When each f_i(x_i) is a quadratic function defined by Eq. (10), this ensures the convergence of the inner BBD iterations in our algorithm.

  minimize    Σ_{i=1}^{n} f_i(x_i)
  subject to  Σ_{i=1}^{n} A_i x_i ≤ a.                                   (A.1)

The conjugate dual of Eq. (A.1) is given by

  maximize    g(y) = −⟨y, a⟩ − Σ_{i=1}^{n} f_i*(−A_i^T y)
  subject to  y ≥ 0.                                                     (A.2)
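To make the primal–dual pair (A.1)–(A.2) concrete, the following sketch (our own illustration, not code from the paper) evaluates g(y) and a supergradient for strongly convex quadratic blocks f_i(x_i) = ½ x_i^T Q_i x_i + c_i^T x_i, using the identity −f_i*(−A_i^T y) = min_{x_i} [f_i(x_i) + ⟨A_i^T y, x_i⟩]; the function name and the two-block instance are hypothetical.

```python
import numpy as np

def dual_value_and_subgrad(y, Q, c, A, a):
    """Evaluate the dual g(y) = -<y,a> - sum_i f_i*(-A_i^T y) of (A.1) for
    quadratic blocks f_i(x_i) = 0.5 x_i^T Q_i x_i + c_i^T x_i, Q_i positive
    definite. Returns g(y), a supergradient p = -a + sum_i A_i x_i(y), and
    the block minimizers x_i(y)."""
    g = -y @ a
    p = -a.astype(float)
    xs = []
    for Qi, ci, Ai in zip(Q, c, A):
        # Stationarity of f_i(x) + <A_i^T y, x>:  Q_i x_i + c_i + A_i^T y = 0.
        xi = np.linalg.solve(Qi, -(ci + Ai.T @ y))
        g += 0.5 * xi @ Qi @ xi + ci @ xi + y @ (Ai @ xi)
        p += Ai @ xi
        xs.append(xi)
    return g, p, xs

# Hypothetical two-block instance with a single coupling row (m = 1).
Q = [np.eye(2), np.array([[2.0]])]
c = [np.array([1.0, -1.0]), np.array([0.5])]
A = [np.array([[1.0, 0.0]]), np.array([[1.0]])]
a = np.array([1.0])
g0, p0, _ = dual_value_and_subgrad(np.array([0.3]), Q, c, A, a)
```

Concavity of g gives g(y') ≤ g(y) + ⟨p, y' − y⟩ for every y', which is the inequality the bundle method exploits.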
We slightly modify Theorem 4.1 of Robinson (1989), which is the convergence result for problem (A.1) with equality coupling constraints.

Theorem A.1. For i = 1, …, n, let f_i : R^{n_i} → (−∞, +∞] be closed proper convex functions and let A_i : R^{n_i} → R^m be linear transformations, with a ∈ R^m. Assume the following:
(i) For each negative d near 0 in R^m, the system (A.3) is solvable:

  Σ_{i=1}^{n} A_i x_i ≤ a + d,  x_i ∈ dom f_i  (i = 1, …, n).           (A.3)

(ii) For each real c, the set (A.4) is bounded:

  { (x_1, …, x_n) | Σ_{i=1}^{n} A_i x_i ≤ a, Σ_{i=1}^{n} f_i(x_i) ≤ c }.  (A.4)
Then for each η > 0 there exist δ > 0 and ε > 0 such that if (x̂_1, …, x̂_n, ŷ) and d satisfy (a)–(d) below, stated in terms of the dual elements y^1, …, y^k and the associated primal elements {x_i^j | i = 1, …, n; j = 1, …, k} at the termination of the bundle algorithm:

(a) x_i^j minimizes f_i − ⟨−A_i^T y^j, ·⟩, that is, −A_i^T y^j ∈ ∂f_i(x_i^j) for i = 1, …, n and j = 1, …, k;
(b) p^j = −a + Σ_{i=1}^{n} A_i x_i^j satisfies p^j ∈ ∂g(y^j) for j = 1, …, k;
(c) there exist nonnegative λ_1, …, λ_k with Σ_{j=1}^{k} λ_j = 1 such that d = Σ_{j=1}^{k} λ_j p^j satisfies ‖d‖ ≤ δ and Σ_{j=1}^{k} λ_j e_j ≤ ε, where e_j = g(y^j) − g(y^k) − ⟨y^j − y^k, p^j⟩;
(d) ŷ = y^k and x̂_i = Σ_{j=1}^{k} λ_j x_i^j (i = 1, …, n);

then there exist (x̄_1, …, x̄_n, ȳ) satisfying (e) and (f) below, and such that ‖x̂_1 − x̄_1‖ < η, …, ‖x̂_n − x̄_n‖ < η and ‖ŷ − ȳ‖ < η:

(e) (x̄_1, …, x̄_n) minimizes Σ_{i=1}^{n} f_i(x_i) on the set {(x_1, …, x_n) | Σ_{i=1}^{n} A_i x_i ≤ a};
(f) ȳ maximizes g(y) = −⟨y, a⟩ − Σ_{i=1}^{n} f_i*(−A_i^T y) on the set {y ∈ R^m | y ≥ 0}.

Proof. This is borrowed directly from the proof of Theorem 4.1 of Robinson (1989). Let x = (x_1, …, x_n) ∈ R^N with N = Σ_{i=1}^{n} n_i, and let f(x) = Σ_{i=1}^{n} f_i(x_i). Setting A = [A_1 | A_2 | … | A_n], A : R^N → R^m is a linear transformation with Ax = Σ_{i=1}^{n} A_i x_i. Now define a multifunction M with arguments (ε, r, s) by

  M(ε, r, s) = { (x, y) : r ∈ ∂_ε f(x) + A^T y,  s = Ax − a },         (A.5)

so that for each s ≤ 0, M(0, 0, s) is the product of the primal and dual solution sets of problem (A.1). It suffices to show that the multifunction M is Hausdorff upper-semicontinuous at (0, 0, s) relative to R_+ × R^N × R^m_−; that is, when ε and r are sufficiently close to 0, each point of M(ε, r, s) lies within any preassigned positive distance of some point of the set M(0, 0, s). Apply Theorem 3.4 of Robinson (1989) with the given ε ≥ 0. Then M is Hausdorff upper-semicontinuous at any (ε, r, s) ∈ R_+ × R^N × R^m_− satisfying Eqs. (A.6) and (A.7):

  r ∈ int(dom f* + im A^T),                                            (A.6)
  s ∈ int(A(dom f) − a).                                               (A.7)
As shown in Section 4 of Robinson (1989), Eq. (A.6) is equivalent to the compact-level-set condition (ii), and Eq. (A.7) is the same as the Slater-type condition (i). □

We now show convergence of the outer SQA iterations of the iterative BBD algorithm. We apply convergence results for the iterative quadratic programming (IQP) method to obtain both local and global convergence of our algorithm; see Garcia Palomares and Mangasarian (1976) and Han (1977) for the IQP method. To this end we introduce additional definitions and notation. The block-diagonal approximation of the Hessian matrix ∇²f(x) is denoted by M(x). We define local block-separability as follows: f is locally block-separable at x̄ if f has a second-order derivative which is Lipschitz continuous in an open neighborhood N(x̄) of x̄, and the Hessian ∇²f(x̄) is a block-diagonal matrix, that is, ∂²f(x̄)/∂x_i ∂x_j = 0 for all i ≠ j. The Lagrangian function L(z) is defined by

  L(z) = L(x_1, …, x_N, u_1, …, u_N, y)
       = f(x_1, …, x_N) + Σ_{i=1}^{N} ⟨u_i, B_i x_i − b_i⟩ + ⟨y, Σ_{i=1}^{N} A_i x_i − a⟩.   (A.8)
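The inner BBD iterations invoked by each outer SQA step terminate according to conditions (c) and (d) of Theorem A.1. As an illustration (our own sketch, not code from the paper), the aggregate subgradient d and the weighted linearization errors can be computed directly from the stored bundle; the helper name and the toy concave function g(y) = −y² in the usage are hypothetical.

```python
import numpy as np

def termination_quantities(ys, gs, ps, lam):
    """Aggregate subgradient d = sum_j lam_j p^j and the weighted error
    sum_j lam_j e_j with e_j = g(y^j) - g(y^k) - <y^j - y^k, p^j>, the
    quantities tested against delta and epsilon in Theorem A.1(c)."""
    yk, gk = ys[-1], gs[-1]                       # y^k: last bundle point
    d = sum(l * p for l, p in zip(lam, ps))       # aggregate subgradient
    errs = [g - gk - (y - yk) @ p for y, g, p in zip(ys, gs, ps)]
    return d, sum(l * e for l, e in zip(lam, errs))

# Toy concave g(y) = -y^2, so p^j = -2 y^j and e_j = (y^j - y^k)^2 >= 0.
ys = [np.array([0.0]), np.array([1.0]), np.array([2.0])]
gs = [0.0, -1.0, -4.0]
ps = [-2.0 * y for y in ys]
d, e = termination_quantities(ys, gs, ps, [0.25, 0.25, 0.5])
```

For concave g the errors e_j are nonnegative, so the weighted sum is a meaningful optimality gap estimate.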
Theorem A.2 (local convergence). Let (x̄, 0, ȳ) be a Kuhn–Tucker point of LCOP at which the second-order sufficiency condition, linear independence of the active constraints and strict complementarity are satisfied. Suppose that f is locally block-separable at x̄. Assume the following:
(i) The system (A.9) is solvable:

  B_i x_i = b_i,  i = 1, …, N,
  Σ_{i=1}^{N} A_i x_i < a.                                             (A.9)

(ii) For each real c, the set (A.10) is bounded:

  { (x_1, …, x_N) | Σ_{i=1}^{N} A_i x_i ≤ a, f(x_1, …, x_N) ≤ c }.     (A.10)

Then there exist positive numbers δ_1 and δ_2 such that if the iterative BBD algorithm is started at any point (x^0, y^0) with ‖(x^0, y^0) − (x̄, ȳ)‖ < δ_1 and ‖M(x^0) − ∇²f(x̄)‖ < δ_2, then the sequence {(x^k, y^k)} of outer SQA iterations is well defined and converges R-superlinearly to (x̄, ȳ).

Proof. Assumptions (i) and (ii) show that each f_i of Eq. (10) satisfies the assumptions of Theorem A.1. Hence each outer SQA iteration is well defined and produces a next point. Setting the multipliers (u_1^k, …, u_N^k) of the separable equality constraints to 0, apply Theorem 3.1 of Garcia Palomares and Mangasarian (1976) to the outer SQA iterations. From (4.1) of their study, a block-diagonal matrix M(x^k) satisfying the following condition ensures superlinear convergence of the algorithm:

  ‖∇²f(x̄) − M(x^k)‖ → 0  as x^k → x̄.                                  (A.11)

Now choose any sequence {x^k} converging to x̄ and let L be a Lipschitz constant of the second-order derivative of f. From the local block-separability of f we have

  ‖∇²f(x^k) − M(x^k)‖ ≤ ‖∇²f(x̄) − ∇²f(x^k)‖,                           (A.12)

since ∇²f(x̄) is block-diagonal and ∇²f(x^k) agrees with M(x^k) except in the off-diagonal blocks. Hence we get

  ‖∇²f(x̄) − M(x^k)‖ ≤ ‖∇²f(x̄) − ∇²f(x^k)‖ + ‖∇²f(x^k) − M(x^k)‖
                     ≤ 2 ‖∇²f(x^k) − ∇²f(x̄)‖ ≤ 2L ‖x^k − x̄‖.           (A.13)

This completes the proof of Theorem A.2. □
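The proof above rests on M(x) being the block-diagonal part of ∇²f(x). A minimal sketch of that extraction (our own illustration; the function name and the 4 × 4 test Hessian are hypothetical):

```python
import numpy as np

def block_diagonal_part(H, sizes):
    """Return M(x): the Hessian H = grad^2 f(x) with every off-diagonal
    block zeroed, keeping the diagonal blocks of sizes n_1, ..., n_N."""
    M = np.zeros_like(H)
    start = 0
    for n in sizes:
        M[start:start + n, start:start + n] = H[start:start + n, start:start + n]
        start += n
    return M

# Hypothetical 4x4 Hessian with two 2x2 blocks (n_1 = n_2 = 2).
H = np.array([[4.0, 1.0, 0.5, 0.0],
              [1.0, 3.0, 0.0, 0.2],
              [0.5, 0.0, 2.0, 0.7],
              [0.0, 0.2, 0.7, 5.0]])
M = block_diagonal_part(H, [2, 2])
```

If H is already block-diagonal, as local block-separability asserts at x̄, then M coincides with H, which is what makes ‖∇²f(x̄) − M(x^k)‖ controllable by ‖∇²f(x^k) − ∇²f(x̄)‖ in the argument above.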
If we have strong convexity of f on the whole feasible region instead of local block-separability at the Kuhn–Tucker point, we can guarantee global convergence of the algorithm.

Theorem A.3 (global convergence). Let f be continuously differentiable. Assume (i) and (ii) of Theorem A.2 and the following (iii) and (iv):
(iii) For any iteration k there exist b ≥ c > 0 such that

  c‖x‖² ≤ ⟨M(x^k)x, x⟩ ≤ b‖x‖²  for all x.                             (A.14)
(iv) For all k, there exists a dual solution y^k of Eq. (12) such that ‖y^k‖₁ ≤ r for some r > 0; that is, the dual problems have uniformly bounded solutions.

Then any sequence {(x^k, 0, y^k)} generated by the iterative BBD algorithm either terminates at a Kuhn–Tucker point of LCOP, or, if the rows of the B_i (i = 1, …, N) are linearly independent, every accumulation point (x̄, 0, ȳ) is a Kuhn–Tucker point.

Proof. Apply Theorem 3.2 of Han (1977). □
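Condition (iii) asks that the block-diagonal approximations M(x^k) be uniformly positive definite with quadratic form bounded between c‖x‖² and b‖x‖²; for symmetric matrices this is equivalent to all eigenvalues lying in [c, b]. A small check of this property (our own illustration, not part of the algorithm; the matrices are hypothetical):

```python
import numpy as np

def uniform_bounds_hold(Ms, c, b):
    """Condition (iii) of Theorem A.3: c*||x||^2 <= <M x, x> <= b*||x||^2
    for all x and every M in the sequence, i.e. for symmetric M all
    eigenvalues lie in the interval [c, b]."""
    for M in Ms:
        w = np.linalg.eigvalsh(M)   # eigenvalues of symmetric M, ascending
        if w.min() < c or w.max() > b:
            return False
    return True

# Two hypothetical block-diagonal approximations M(x^0), M(x^1).
Ms = [np.diag([1.0, 2.0]), np.diag([1.5, 3.0])]
```

With these matrices the bounds hold for c = 1 and b = 3 but fail for c = 2, since the first matrix has an eigenvalue equal to 1.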
References

Ahuja, R.K., Magnanti, T.L., Orlin, J.B., 1993. Network Flows: Theory, Algorithms, and Applications. Prentice-Hall, Englewood Cliffs, NJ.
Bazaraa, M.S., Sherali, H.D., Shetty, C.M., 1993. Nonlinear Programming: Theory and Algorithms. Wiley, New York.
Garcia Palomares, U.M., Mangasarian, O.L., 1976. Superlinearly convergent quasi-Newton algorithms for nonlinearly constrained optimization problems. Mathematical Programming 11, 1–13.
Gauvin, J., 1977. A necessary and sufficient regularity condition to have bounded multipliers in nonconvex programming. Mathematical Programming 12, 136–138.
Han, S.P., 1977. A globally convergent method for nonlinear programming. Journal of Optimization Theory and Applications 22, 297–309.
Lasdon, L.S., 1970. Optimization Theory for Large Systems. Macmillan, New York.
Lemaréchal, C., Strodiot, J.J., Bihain, A., 1981. On a bundle algorithm for nonsmooth optimization. In: Mangasarian, O.L., Meyer, R.R., Robinson, S.M. (Eds.), Nonlinear Programming, vol. 4. Academic Press, New York, pp. 245–282.
Medhi, D., 1994. Bundle-based decomposition for large-scale convex optimization: Error estimate and application to block-angular linear programs. Mathematical Programming 66, 79–101.
Park, K., 1995. Virtual path routing optimization in ATM network. Journal of the Korean Operations Research and Management Science Society 20, 35–54.
Park, K., Shin, Y., 1996. Virtual path design in ATM network. The Journal of the Korean Institute of Communication Sciences 21, 939–951.
Park, K., Shin, Y., 1997. A practical design method for virtual paths in ATM network. Technical Report, Department of Industrial Engineering, Hong-Ik University (submitted to Telecommunication Systems).
Robinson, S.M., 1985. Nonlinear Programming Theory and Applications. Lecture Notes, Department of Industrial Engineering, University of Wisconsin, Madison.
Robinson, S.M., 1986. Bundle-based decomposition: Description and preliminary results. In: Prékopa, A., Szelezsán, J., Strazicky, B. (Eds.), System Modelling and Optimization. Springer, Berlin, pp. 751–756.
Robinson, S.M., 1989. Bundle-based decomposition: Conditions for convergence. Annales de l'Institut Henri Poincaré: Analyse Non Linéaire 6, 435–447.
Rockafellar, R.T., 1970. Convex Analysis. Princeton University Press, Princeton, NJ.