A generalized design approach to solution of the non-convex quadratic programming problem J. F. Brotchie CSIRO Division of Building Research, PO Box 56, Highett, Melbourne, Victoria 31900, Australia (Received September 1985; revised November 1986)
This paper outlines a generalized, systematic design approach to solution of the non-convex quadratic programming problem. It is based on a previous formulation of utility of a general system in terms of efficiency and robustness of the system. The approach is to introduce a robustness term of arbitrary magnitude into the design utility function. Mathematically, this makes the problem convex. From a design approach, it yields a more general solution allowing specialization to proceed by decreasing robustness (on an initially convex utility surface in the feasible design space) until the region of the overall optimum is approached. The approach is mathematically related to the Metropolis technique of simulated annealing but a more systematic (less random) solution process is used. It is analogous also to the heuristic technique of Burkard and Bonniger. These two previous techniques are the most effective so far reported for the quadratic programming problem. The robustness approach provides an underpinning for each and opens up further solution options. Applications include layout of buildings and other constructed facilities and information technology layout problems. Keywords : knowledge-based design, tion technique, robustness, system
The quadratic programming problem In a previous study,’ a general planning problem was expressed in quadratic programming form. It defined the selection of allocations X = [xij] of activities i to options j to maximize utility U where:
ij
ijkl
in which uij is the utility of allocating a unit of resource or activity i to option j and cijkl is the utility of interaction between a unit of activity i in option j and a unit of activity k in option I subject to the constraints that the total quantity Ai of activity i is allocated, i.e. :
0307-904X/87/040291-05/$03.00 0 1987 Butterworth Publishers
and the capacity
1
xij
G
quadratic
programming,
solu-
Zj of option j is not exceeded, i.e.:
zj
Equations (lH3) are one form of the TOPAZ* model. The Koopmans-Beckmann quadratic assignment model is a special case (in which Ai = 1 = Zj). The model is generally non-convex with multiple local optima. Mathematical programming techniques normally find a local optimum and random moves are often introduced to allow escape from this. The formulation of equation (i) implies that utility uij is constant over all of a particular activity i in option j, i.e. that activities i and options j are homogeneous. It also assumes that the interactions between activities i in options j are relatively strong. However, if each option j
Appl.
Math. Modelling,
1987 Vol. 11, August
291
Solution of the non -convex quadratic programming problem: J. F. Brotchie is considered to be divided into a very large number of elements, and utilities vary between these elements, a more general formulation of the allocation problem may be obtained.
Introduction of diversity and robustness
u=
In a later study,4*5 the case of diversity of the distribution of utilities uij over elements of each option j was considered for a complex system with no or only weak interactions between activities, and the total utility U was found to be given by: U = R + S/A.,
(4)
in which R E R(X) = cijiiijxij is the total mean or base utility and is a measure of efficiency of the system; ii, is the mean or base value of utility distribution uij; l/L = l/Aij is the diversity of utility Uij over elements of option j (and is proportional to standard deviation gij of the distribution uij. It may also be interpreted as a weighting on the uncertainty assuming it is the same over all ij); S = S(X) is entropy or dispersal of the allocations X, (and is defined as the log of the number of ways the elements of activities i may be allocated to elements of the optionsj. It is a measure of uncertainty of individual decisions or allocations within the distributions xij); and S/n is a measure of robustness of the system. Equation (4) defines a fundamental relationship between utility, efficiency, diversity, dispersal, and robustness for the class of system considered (with analogy to weakly interacting statistical mechanics systems as later discussed). Utility U in equation (4) is convex and the solution subject to equations (2) and (3) may be expressed in closed form. Maximization of utility U for given diversity l/L, or maximization of entropy S subject to given level of base utility R, results in: Xij E YijZj = Zj/eXp(A(U,i
-
l&j)
+
p)
(5)
in which ugi is a constant which may be interpreted4s5 as a threshold level of utility uij for allocation to be made, p = 1 or 0 depending on constraint conditions on xii (and is 1 when equation (3) is included in the entropy expression and 0 when it is not)5 and yij lies in the range 0 d yij < 1 where zone capacities Zj are constrained. The constants ugi may be determined by activity levels Ai when these are binding. The overall quantities U and S were determined by aggregation from the level of (elements of) decisions X (by considering the elements to be ordered and sufticiently large in number to be treated as continuous), e.g. for the case where zone capacities Zj are constrained (Yij
d
l):
u=c ij
s xi,
Uij
dxij = l/‘LZj f(yij)
+ C UijYijZj
(6)
0
and the Fermi-Dirac giving :
formulation
of entropy4’5
applies,
s=C-CYijlogYij+P(l-Yijl ij x log( 1 - Yij)Zj = zj f( Yij)
which from expansion of log yij in the range 0 d Yij < 1 yields the approximation: s = C (l + P)Yij(l - Yijlzj ij
292
Appl. Math.
Modelling,
(7)
1987,
and for p = 1, l/1 = l/A,, and is found to be equal to 0.55 oij. Where there are strong interactions between decisions, introducing the interaction term from equation (1) into equation (4) gives :6
Vol. 11, August
C
ij
iiij
Xij
+
CCijkr
XijXkl
+
Sin
ijkl
which may also be derived by introducing diversity into equation (1). When either diversity l/L or dispersal S is zero, equation (8) reduces to equation (1). (A further disaggregated version of equations (l), (4) and (8) involving allocation at the option element level was also developed-as a quadratic assignment model with four subscripts’).
Solution technique Equation (1) was noted to be non-convex and able to have multiple local optima, e.g. at the vertices of constraint sets (equations (2) and (3)) in the vector design space X. On the other hand, equation (4) was noted to be convex. The convexity or otherwise of equation (8) depends on the magnitude of S/n and hence on diversity l/L Where l/n + 0, the problem is non-convex; where l/L + co, the problem is convex. The solution process proposed for equations (1x3) is that of mathematical programming by systematic linearization and iteration (e.g. iterative linear programming with adaptive move limits’ or alternative procedures later discussed) but with the ploy of initially introducing robustness S/L (equation (8)) with a sufftciently large initial value of l/n to make the problem convex. (The quadratic objective function of equation (8) is strictly convex when the determinant of its coefftcients is positive definite-which occurs if the diagonal terms (ciijj + (1 + p)/i&from equation (7)-are sufficiently large compared with the off-diagonal terms cijkl-and l/L may be selected accordingly to achieve this.) The value of l/n is then slowly and successively reduced to zero in a number of steps, in each of which a new optimum is obtained, thereby enabling the absolute optimum to be more closely approached. Since utility maximization is also a basis of human automatic utility maximizing can decision-making, simulate the human decision process. In planning and design, the individual allocation decisions made over elements of each option j may be aggregated as the distribution X. Entropy maximizing gives the most likely outcome X of that decision-making. Thus, in design, automated utility maximizing simulates the designer’s decision-making process, whereas entropy maximizing predicts the results. The design choices (X) which maximize total utility U for given diversity l/n (of utility of each option) under diverse design conditions also maximize entropy S for given design efficiency R (in equation (4)). S/L is again a measure of robustness of the design to match the diversity of the design conditions. Where the design conditions are constant, a single design option only is often required for maximum utility U. Where they are diverse (l/n finite), a dispersal (S) of design capacity between options is required to give the necessary level of robustness (S/n) for maximum utility U. Both optimal design and optimal planning may thus be expressed as the selection of X to maximize U in equations (1) or (8) (subject to equations (2) and (3).
Solution of the non-convex
Xl
5
Figure 7 Design trajectory from general and robust system (C) to specialized and efficient system (A) obtained by maximizing total utility lJ under decreasing diversity l/A (equation (8)) in twodimensional space X, X2
When diversity is zero (l/1, = 0), total utility U reduces to total base utility R in equation (4), and to equation (1) in the case of equation (8) where strong interactions are involved. The efficiency R and utility U are a maximum at points in this design space X where a minimum number of design options are utilized (at the vertices of the constraint set, e.g. 0, 1 solutions for yij), and U reduces-often sharply-away from these points. The effect of diversity l/3, and robustness S/n is to reduce the variability and the non-convexity of the utility surface U in this design space. The robustness S/n for finite l/1 is zero for zero-one solutions for yij (from equation (7)) and increases away from these points of peak efficiency as additional design options are utilized, and hence as 1/;1 is increased a less variable utility surface U is obtained. When l/n is sufficiently large, the surface U changes from non-convex to convex and as l/A approaches infinity, the optimum solution approaches the point X at which all yij are finite and equal (from equation (5)). Consider for example the allocation of one resource between two design or planning options 1 and 2 (Figure I), for which the utility U is given by equation (1). The unit net benefit u1 of option 1 is greater than that u2 of option 2. The interaction cost cijxixj is greatest when the resource is split equally between the two activities, U is non-convex where ciZ > cii, cZZ. The function (Figure 1) with local optimum at point B (X = 0, 1) and global optimum at point A (X = 1, 0). A conventional optimization technique starting in the region of B would find only the sub-optimum at B. The arbitrary introduction of large diversity and robustness (S/n large-using equation (8)) changes the function U to convex with a maximum between A and B. If l/n is sufficiently large the optimum approaches C (X = 0.5, 0.5). As l/n is reduced, the optimum moves towards the point A (X = 1, 0). Successive reduction of diversity l/n results in the optimization trajectory eventually reaching the absolute optimum (for 1/;1 = 0) at A. The point C (X = 0.5, 0.5) represents maximum entropy and hence maximum robustness but almost minimum efficiency. The point A (X = 1, 0) represents maximum efficiency but zero robustness. The full line AB in Figure I represents the feasible design utility surface on the feasible design space A,B, within the two-dimensional design space defined by the vectors X, and X,. Where there are three design parameters or vectors Xj (j = 1, 2, 3), the feasible design
quadratic programming
problem:
J. F. Brotchie
space is a triangle and where U is non-convex, the vertices may be sub-optima. For an n-dimensional design space, the corresponding feasible region is a simplex with n vertices all of which may be local optima. With a single activity i, a particular allocation is represented by a point in this space. For m activities, a particular allocation is represented by m of these points whose movements are linked through the constraint sets (or by one point in an m x n space). The effect of adding sufficient robustness in each case is to change the utility U from non-convex to convex (i.e. from U in equation (1) to U in equation (8) (l/1 9 0)). Iterative linear optimization with slowly reducing diversity l/n may be used as a solution technique. One such process consists of linearizing the utility function U about a point X0 and solving the resulting linear programming problem (subject to the constraint sets 2 and 3 and possibly move limits 1Axij 1< b,), giving a new point X, = X, + AX. The process is then repeated at X,, giving X,. (The move limit bij may be reduced each time Axij changes sign.) Linearizing U in equation (8) about X, by NewtonRaphson gives : U = (U),Y, + C (aUlaxij),o
Ax,
(9)
ij
in which, from equation
(7):
aU/axij = Uij + 2 1
Cijk[Xk[
+
(1 + P)(l - 2Yij)zj/~
kl
Solution gives [Axij] and hence Xi, X,, . . . X,. The gradient of U with l/n large can be opposite that for l/A = 0 so that robustness acts as a bridge over the non-convex surface allowing negative gradients in U (equation (1)) to be traversed (Figure I), and a global optimum to be sought. Other systematic moves, e.g. a directed process of pair swapping,’ may also be introduced once an optimum vertex is found, to test whether any other vertices offer improved solutions.
Discussion Thus introducing an initially large value of l/;l and slowly reducing it provides a means of seeking the optimal design when the actual design conditions are constant or only slightly diverse (l/1 --f 0). This may be considered as a rational design approach in which the design is successively specialized from an initially general one, e.g. a racing car design is developed from a conventional car design by increasing specialization of its components for the specialized conditions of racing. The design process in this case starts from the most robust but least efficient solution and proceeds by successive refinement to the most efficient solution with zero robustness. The assumption made above is that design conditions are known and constant. Where there is a given level l/1, of diversity or variability of design conditions, the process should terminate when l/J reduces to the value l/A,. The design will then have the optimal balance between efficiency R and robustness S/n to provide maximum overall effectiveness or utility U. Thus the automated design process above simulates a human decision process of successive design refinement until the optimum is achieved. It is also a process of utility maximization by successive refinement (and
Appl.
Math.
Modelling,
1987
Vol. 11, August
293
Solution
of the non-convex
quadratic
programming
problem:
utility may be related to motivation and human need in a more general sense). Hence the process is similar to the human process and the end result can be demonstrably better, because of the capacity of the machine to include more interactions, to carry out a far larger number of iterations and to more systematically explore the design decision space. The process is also related to the Metropolis technique lo of simulated annealing, and the heuristic technique of Burkard and Bonniger’ as discussed in the following sections. In fact the Metropolis and Burkard and Bonniger techniques may be considered as particular examples of the generalized solution process above. Metropolis technique The Metropolis technique takes a random path to the optimum of the non-convex utility surface within the constraint set. ii The Metropolis technique may be considered as a mountain climbing (optimum seeking) technique with the ability to make some random movements down (decreasing utility) as well as up in order to explore alternative peaks. Metropolis was developed as a statistical mechanics technique with application to simulated annealing of metals which with slow temperature reduction allows lower energy states representing better crystalline structures to be sought. It is basically analogous to the technique described previously, of slowly reducing diversity or robustness of a design until a peak performance is obtained for a given set of (constant) design conditions. In fact, equation (4) is mathematically identical to the classical equation relating free energy (- U), internal energy (--IX), entropy (S), and absolute temperature (l/n) in thermodynamics or statistical mechanics12 where temperature represents diversity of particle energies and energy is a negative measure of utility’-see appendix). Thus design optimization by progressive reduction of diversity l/n of design conditions is mathematically equivalent to energy minimization by progressive temperature reduction in thermodynamics or statistical mechanics. Metropolis provides one approach to iterative solution, a random (pair swapping) process which is less efIicient in computing time but more efficient in computer capacity needs than the approach considered here. (The random component again has some analogy with the random or associative aspect of human decision making.) It makes the assumption that integer solutions only are acceptable; and provides solutions which are found to be close to the best previously obtained.11s’3 Further details are given in the appendix.
Metropolis, giving results within 1% of the absolute optimum over the range of problems examined. It is the most effective technique so far reported, but less efficient in computing time than Metropolis.’ 3
Conclusions The robustness concept provides a generalization of the two techniques described, and facilitates further solution processes. The solution process outlined treats the variables yij as continuous-allowing non-integer solutions where appropriate. It utilizes the properties of linear programming to arrive at integer solutions where required. The two existing solution techniques discussed are the most effective so far developed for quadratic programming.13 Both are theoretically underpinned by the robustness approach and provide particular options for its solution process. They serve as particular examples of its effectiveness. The robustness approach also opens up a spectrum of solution options between these two and these are being explored in the quest for global optimality, and will be reported separately. These systematic design techniques provide a further basis for computer-aided design and problem solution. Combined with equations (l), (4) and (8), they provide a potentially useful means of designing high technology systems such as computer boards and computer system configurations, as well as solving urban, building and other layout planning problems.
References 1 2
3
I 8
Heuristic technique Another example of this general approach is the heuristic technique of Burkard and Bonniger’ for quadratic assignment. In their technique, the term cij Yij (1 - Yij) is introduced into the objective function and is shown here to be equivalent to the robustness term, from equation (7). The coeflicient cij corresponds to (1 + p)Zj/~ij so that the (augmented) objective functions are agam analogous. (The authors note that their solution is not sensitive to the values of cij, so that a single value l//2 may suffice for l/Lij.) The robustness term is again equal to zero for zero-one solutions of yij. The Burkard and Bonniger technique is slightly more effective than
294
Appl.
Math.
Modelling,
1987,
Vol. 11, August
J. F. Brotchie
9
10
11 12 13
14
Brotchie, J. F. A general planning model, Manage. Sci., 1969, 16 (3), 265-266 Brotchie, J. F., Sharpe, R. and Dickey, J. W. TOPAZ-General planning technique and its application at the regional, urban and facility planning levels, Springer-Verlag, Berlin, 1980 Koopmans, T. C. and Beckmann, M. Assignment problems and the location of economic activities, Econometrica, 1957, 25, 53-76 Brotchie, J. F. and Lesse, P. F. A unified approach to urban modelling, Manage. Sci., 1979, 25 (I), 112-113 Brotchie,-J. F., Lesse, P. F. and Roy, J. R. Entropy, utility and olannine models. Sistemi Urbani, 1979,3,33-T? Brotchie, J. F., Georgeff, M., Sharpe, R. and Marksjo, B. S. Introducing intelligence and knowledge into CAD, Proc. Conf. Applications of ArtlJicial Intelligence to Engineering Problems, Southampton, April 1986, pp 797-810 Brotchie, J. F. Technological change and urban form, Environ. Plan. A, 1984,16,583-596 Reinschmidt, K. F., Cornell, C. A. and Brotchie. J. F. Iterative design and structural optimization, .I. Struct. Di;., ASCE, 1966, 92 (ST6), Proc. Pap. 5015,281-318 Burkard, R. E. and Bonniger, T. A heuristic for quadratic boolean programs with applications to quadratic assignment problems, Euro. J. Oper. Res., 1984, 13 (4), 374-386 Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. and Teller, E. Equation of state calculations by fast computing machines, J. Chem. Phys., 1953,21, 1087-1092 Sharue. R. and Marksjo, B. M. Facility lavout ootimization csing Metropolis, En&n. Plan. B, 1985, 12, Gj>53‘ Tolman, R. C. The principles of statistical mechanics, Oxford University Press, London, I955 Burkard, R. E. and Rendl, F. A thermodynamically motivated simulation procedure for combinatorial optimization problems, Euro. J. Oper. Res., 1984, 11, 169-174 Kirkpatrick, S., Gelatt, C. D., Jr. and Vecchi, M. P. Optimization by annealing, Science., 2983, 220 (4598), 671680
Solution of the non-convex
Appendix Comparison of Metropolis, heuristic, and robustness approaches To further compare the Metropolis approach, the heuristic approach, and the (more generalized) robustness technique proposed herein, a distinction must be made between the mathematical objective function in each case and the mechanisms for seeking to optimize that function in the decision parameter space. It will be shown here that the objectives in each case are mathematically identical but the mechanisms proposed for movement towards the optimum differ. The Metropolis method attempts to find a minimum energy state. It makes a (randomly selected) parameter change dx with corresponding energy change dE only if dE is negative or if the probability P(dE) given by Boltzmann’s law P(dE) = exp(-dE/kT)(O < P < 1) is greater than the probability P’ randomly selected from a uniform distribution over the range 0 to 1, in which k is the Boltzmann constant and T is absolute (Kelvin) temperature. In the robustness approach, a design parameter change is made (systematically) on the basis of a (maximum) improvement in the total utility U, where U is given by equation (4) or (8). Note that energy = -utility: dE = -dR. Thus, in the robustness approach, the change dx must produce a positive change d V, i.e. :
where R is the value of U for l/L = 0. For Boltzmann entropy (corresponding to p = 0 in equation (5), where the constraint equation (3) is not incorporated into the entropy expression). (This is appropriate for the solution process although the difference is small, as indicated by equation (7)): s = c - yij log yij ij
giving: (S/L) = -(log
y)/L > -dR
for given change in y or: log y < dR3, Therefore: y -C exp(LdR) where l/L = kT,”
which corresponds
to :
P’ < exp( - dE/kT) In each case, diversity l/E, or temperature T is an arbitrary variable and is slowly reduced to allow the optimum to be approached. Thus, the Metropolis approach accepts only changes dx which reduce internal energy (-R), or, if not, are likely to reduce free energy (-U), whilst also slowly reducing temperature (l/n) so that these two quantities converge. On the other hand, the robustness approach as described here accepts only those changes dx which
problem:
J. F. Brotchie
increase total utility (U), whilst slowly reducing diversity l/L so that U approaches R. The Burkard and Bonniger technique’ introduces an additional term into the objective function, which is shown to be equivalent to S/L by equation (7). Its coeflicients cij are not reduced directly in the solution process but are made equal to or greater than the largest costs of accepted options only, and diversity again reduces as the higher cost options are rejected and lower cost options are explored. Their effect is zero for zero-one solutions. The technique, however, does not converge as rapidly as the other two in which temperature or diversity reduces to zero. The movement mechanisms are quite different. The Metropolis approach makes its search by randomly swapping pairs of units of allocations X. The heuristic technique uses a systematic pair (or triplet) swapping technique. The robustness approach makes its search initially by iterative linear solution, e.g. moving to the best vertex in a linearly constrained space about the previous solution point. It also has the option of switching to the heuristic or Metropolis pair swapping technique. The simulated annealing approach to optimization14 is able to utilize the phenomenon of specific heat, which accompanies phase change, as an indicator of large rates of change in the solution parameters. Specific heat, -R, entropy, S, and temSC, is related to energy, perature, l/2, by:
SC= -
dU=dR+d(S/A)>O
$
quadratic programming
dR d(l,;l)
=
(l/4
dS
W/4
Hence, in terms of the solution process proposed herein, high specific heat represents large rates of change of efticiency, and of entropy and robustness. At very high values of diversity, little change occurs as robustness dominates the utility function. As diversity and hence robustness reduce, some vertices with high efficiency begin to rival regions of high robustness for maximum overall utility. At these critical values of diversity, large compensating changes in efficiency and robustness occur, and hence specific heat is large. Diversity must be reduced very slowly at these points to ensure that the region of the current optimum is adequately explored. At still lower values of diversity, other vertices may emerge representing finer levels of parameter selection, and high specific heats would again result. As diversity is gradually reduced in this hierarchical case the major parameters are selected first (e.g. clusters of cities in the travelling salesman problem or connections between chips in the computer design problem of Kirkpatricki4) and minor parameters are selected later at a lower diversity. This again corresponds to human decisionmaking but is achieved automatically, through the properties of the complex systems involved. In each case, the process is analogous to the excitation of a ball moving over a non-convex surface with temperature, diversity, or cij equivalent to the amplitude of excitation or vibration of the ball. The more the ball is shaken to begin with and the slower the reduction in amplitude, the more likely it is to end at an optimum (low point) on the surface. In the heuristic technique, the amplitude is not reduced directly. The constraints on movement of the ball differ between the three techniques, and the final solutions may, therefore, also differ.
Appl.
Math.
Modelling,
1987
Vol. 11, August
295