Operations Research Letters 36 (2008) 67 – 70
Identifying the optimal partition in convex quadratic programming

Stephen E. Wright
Department of Mathematics and Statistics, Miami University, Oxford, OH 45056, USA

Received 18 February 2007; accepted 28 February 2007; available online 19 March 2007
doi:10.1016/j.orl.2007.02.010
Abstract

Given an optimal solution for a convex quadratic programming (QP) problem, the optimal partition of the QP can be computed by solving a pair of linear or QP problems for which nearly optimal solutions are known.

Keywords: Linear programming; Linear complementarity; Strict complementarity; Maximally complementary solutions
1. Introduction

This paper is concerned with determining which constraints in a given quadratic programming (QP) problem must hold as equality at some optimal solution. The specific form of QP problem that we shall consider is

minimize (1/2) x · Qx + c · x over all x ∈ R^n subject to Ax = b, x ≥ 0,   (1)

where Q ∈ R^{n×n}, A ∈ R^{m×n}, c ∈ R^n, and b ∈ R^m; we assume that Q is symmetric and positive semidefinite. The well-known Karush–Kuhn–Tucker (KKT) optimality conditions for (1) are

Ax = b,  x ≥ 0,   (2)
A^T y − Qx + s = c,  s ≥ 0,   (3)
x · s = 0.   (4)
These are both necessary and sufficient, owing to the linearity of the constraints and the convexity of the objective function. As will be shown in Section 2, the primal and dual solutions x and (s, y) are independent of each other, in the sense that a choice of multiplier vector (s, y) satisfying (2)–(4) for a given solution x of (1) also works for any other solution of (1). The complementary slackness condition (4) tells us that the common index set of the primal and dual variables x and s
in (2)–(3) can be partitioned at optimality into three sets, according to whether the corresponding inequality can be inactive at some solution. The existence of this optimal partition stems from the primal–dual independence mentioned above: if a primal constraint is inactive for just one of many optimal solutions, then the corresponding dual constraint must be active for every choice of KKT multipliers. The analogous statement, in which the primal and dual roles are reversed, is also valid. Furthermore, convexity guarantees the existence of a KKT triple (x, s, y) whose support indices precisely identify the optimal partition. We formalize these well-known statements in the following theorem.

Theorem 1.1. If the QP (1) admits an optimal solution, then the coordinates {1, . . . , n} can be partitioned as X ∪ S ∪ Z so that, for any KKT point (x, s, y) satisfying (2)–(4), one has
(a) j ∈ X whenever x_j > 0, and s_j = 0 whenever j ∈ X;
(b) j ∈ S whenever s_j > 0, and x_j = 0 whenever j ∈ S; and
(c) x_j = 0 = s_j if and only if j ∈ Z.
Moreover, there exists a KKT point (x̂, ŝ, ŷ) that completely characterizes the optimal partition as

X = {j : x̂_j > 0},  S = {j : ŝ_j > 0},  Z = {j : x̂_j = 0 = ŝ_j}.

The point (x̂, ŝ, ŷ) described in this theorem is called a maximally complementary solution for the problem (1). Of course, the primary interest in identifying the optimal partition is that it provides complete information regarding which constraints
must be active at an optimal solution to the given QP problem. Furthermore, knowledge of the optimal partition also plays a central role in recently developed approaches to the sensitivity analysis of optimal solutions and optimal values with respect to changes in the problem data [6]. The use of the optimal partition in the context of linear programming (LP) has been particularly emphasized in the textbook of Roos et al. [8].

Some interior point methods for QP problems (when formulated as linear complementarity problems, LCPs) compute maximally complementary solutions as a matter of course [5], and there has been some recent work [2,7] on strongly polynomial methods for calculating a maximally complementary solution, given an optimal or nearly optimal solution. However, practical implementations of interior point methods are based on "large-step" approaches, which do not guarantee that the computed optimum will be maximally complementary. Furthermore, while interior point methods are particularly well suited to large sparse problems, they are not widely used for small- to medium-sized problems. In particular, the statistics, scientific, and general engineering communities continue to rely primarily on familiar active-set codes, such as those built into popular mathematical software environments or available freely within the public domain.

The purpose of the present paper is to show that once an optimal solution to the quadratic program (1) has been calculated by any means, the computation of (x̂, ŝ, ŷ) can be formulated as a pair of linear (or quadratic) programming problems for which nearly optimal solutions are known. This has the practical advantage that the optimal partition can, in many cases, be computed rapidly with the same off-the-shelf software used to solve the QP itself. The same is true for the related problems of linearly constrained least squares and monotone linear complementarity.
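To make Theorem 1.1 concrete, the snippet below checks the KKT conditions (2)–(4) and reads off the partition for a small instance. The instance is an illustrative assumption, not taken from the text: minimize (1/2)x1^2 subject to x1 + x2 = 1, x ≥ 0, whose unique optimum is x = (0, 1) with (unique) multipliers y = 0 and s = (0, 0), so that one index falls in Z and strict complementarity fails.

```python
import numpy as np

# illustrative QP (an assumed instance): min (1/2)x1^2  s.t.  x1 + x2 = 1, x >= 0
Q = np.diag([1.0, 0.0])
c = np.zeros(2)
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

# its unique optimum and (unique) KKT multipliers, found by hand
x = np.array([0.0, 1.0])
y = np.array([0.0])
s = c + Q @ x - A.T @ y          # dual slacks, from condition (3)

# verify the KKT conditions (2)-(4)
assert np.allclose(A @ x, b) and (x >= 0).all()
assert (s >= 0).all() and np.isclose(x @ s, 0.0)

# read off the optimal partition (0-based indices)
X = {j for j in range(2) if x[j] > 0}
S = {j for j in range(2) if s[j] > 0}
Z = {0, 1} - X - S
print(X, S, Z)   # index 0 lies in Z: strict complementarity fails
```

Because the optimum and its multipliers are unique here, every KKT point exhibits the same supports, and the partition is visible directly from a single solution.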
In Section 2 we present the LP formulation of the QP partition identification problem, whereas in Section 3 we indicate specializations for the cases of least squares and monotone complementarity. In Section 4, we briefly discuss the practical implications of the formulation.

2. An LP formulation of the optimal partition for QP

A key concept in the following is that of strict complementarity. This refers to the special case in which the index set Z of the optimal partition, as defined in Theorem 1.1, is actually empty; in the current notation, this property can be expressed as the inequality x̂ + ŝ > 0. Strict complementarity may fail to hold for QP problems such as (1). A simple one-dimensional example is given by

min{(1/2) x^2 : x ≥ 0},

for which the KKT conditions are

x ≥ 0,  s ≥ 0,  x − s = 0,  x · s = 0.

Here we see that the unique solution x = 0 = s is not strictly complementary: we do not have x + s > 0.

The situation is much better in the setting of LP, which corresponds to the case of problem (1) in which Q = 0. The classical Goldman–Tucker theorem [4] states that if an LP admits an optimal solution, then it must also admit a strictly complementary KKT point. In this section, we reformulate the search for the optimal partition of a convex QP in terms of LP. Standard elementary results from LP theory [3] will be invoked without explicit mention. For the remainder of this section, we shall assume that the QP problem (1) admits an optimal solution, and we let x̄ denote some fixed optimum of (1). Our LP formulation hinges on the following observation.

Lemma 2.1. Suppose x is feasible for (1). Then x is optimal if and only if Qx = Qx̄ and c · x = c · x̄, in which case x · Qx = x̄ · Qx̄.

Proof. First observe that Qx = Qx̄ implies x · Qx = x̄ · Qx̄, because x · Qx = x · Qx̄ = (Qx) · x̄ = (Qx̄) · x̄. This clearly demonstrates the claimed sufficiency. Conversely, suppose x is optimal. By convexity, (x + x̄)/2 is also optimal for (1) and so

(1/2)[(1/2) x · Qx + c · x] + (1/2)[(1/2) x̄ · Qx̄ + c · x̄] = (1/2) ((x + x̄)/2) · Q((x + x̄)/2) + c · ((x + x̄)/2).

Expanding the right-hand side and subtracting it from both sides gives (x − x̄) · Q(x − x̄) = 0. By positive semidefiniteness, this forces Q(x − x̄) = 0, so that x · Qx = x̄ · Qx̄ once again. Because x and x̄ are both optimal, we have (1/2) x · Qx + c · x = (1/2) x̄ · Qx̄ + c · x̄, which therefore implies c · x = c · x̄. This proves the necessity. □

Lemma 2.1 shows more directly why the same KKT multipliers are valid for all solutions to (1). Using Ax = b and s = c + Qx − A^T y to rewrite (4) as

b · y = x · Qx + c · x,   (5)

we see that the dual solution (s, y) depends on quantities that are independent of the particular choice of optimum x. Now define

E := [A; Q; c^T] ∈ R^{(m+n+1)×n} (the matrix obtained by stacking A, Q, and c^T),  f := E x̄.

Note that E and f are both independent of the particular choice of x̄; furthermore, x is optimal for (1) if and only if Ex = f and x ≥ 0. Consequently, the feasible (hence optimal) solutions to the LP

min_x {0 · x : Ex = f, x ≥ 0}   (6)

are the same as those for the original QP problem (1). It is also clear that (λ, μ) = (0, 0) is an optimal solution for the LP

max_{λ,μ} {f · λ : E^T λ + μ = 0, μ ≥ 0},   (7)

which is the dual of (6). By the Goldman–Tucker theorem, the pair (6)–(7) admits a strictly complementary vector (x̂, λ̂, μ̂), which
corresponds to a solution of the system

Ex = f,  x ≥ 0,   (8)
E^T λ + μ = 0,  μ ≥ 0,   (9)
x · μ = 0,   (10)
x + μ > 0.   (11)

Note that (10) is equivalent to f · λ = 0, in view of (8)–(9). Because the solution sets of (1) and (6) agree, the support of x̂ coincides with the primal support set X of the optimal partition in Theorem 1.1.

An analogous derivation can be used to identify the dual set S of the optimal partition. Defining

G := [A, −b],  h := [I; −x̄^T] (Qx̄ + c),

we see that h is, by Lemma 2.1, independent of the choice of optimal x̄. Furthermore, the feasible (hence optimal) solutions to the LP

max_{θ,y} {0 · y : G^T y + θ = h, θ ≥ 0}   (12)

correspond via s = Qx̄ + c − A^T y to the KKT multipliers for the original quadratic program (1); this follows from the reformulation (5) of the complementary slackness condition (4). At the same time, ξ = 0 is an optimal solution for the LP problem

min_ξ {h · ξ : Gξ = 0, ξ ≥ 0},   (13)

which is dual to (12). Hence the pair (12)–(13) admits a strictly complementary vector (ξ̂, θ̂, ŷ), which corresponds to a solution of the system

Gξ = 0,  ξ ≥ 0,   (14)
G^T y + θ = h,  θ ≥ 0,   (15)
ξ · θ = 0,   (16)
ξ + θ > 0.   (17)

We note that, in view of (14)–(15), the complementary slackness condition (16) is equivalent to h · ξ = 0. It is clear that the support of ŝ := Qx̄ + c − A^T ŷ gives the dual support set S of the optimal partition in Theorem 1.1.

To summarize, the above analysis provides a two-step approach to computing the optimal partition for the QP problem (1), given a solution x̄ of (1):
(i) Solve the system (8)–(11) for (x̂, λ̂, μ̂).
(ii) Solve the system (14)–(17) for (ξ̂, θ̂, ŷ) and set ŝ := Qx̄ + c − A^T ŷ.

Observe that these steps can be carried out independently of each other. Merging their results yields a maximally complementary solution (x̂, ŝ, ŷ) for (1), the support indices of which precisely identify the optimal partition X ∪ S ∪ Z.
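As a concrete illustration, step (i) can be prototyped with an off-the-shelf LP solver. The sketch below is an assumed, minimal implementation, not code from the paper: it relaxes the strict inequality (11) to x + μ − tp ≥ 0 with t maximized, and additionally caps t at 1 to keep the LP bounded (μ can be scaled arbitrarily). The test instance, the cap, and the helper name primal_support are illustrative assumptions; scipy.optimize.linprog is assumed available.

```python
import numpy as np
from scipy.optimize import linprog

def primal_support(Q, A, c, b, xbar, tol=1e-8):
    """Step (i): solve system (8)-(11), posed as a bounded LP, and
    return X = supp(x_hat), the primal set of the optimal partition."""
    n = Q.shape[0]
    E = np.vstack([A, Q, c.reshape(1, n)])   # E stacks A, Q, c^T
    f = E @ xbar                             # f = E x_bar
    k = E.shape[0]                           # k = m + n + 1
    p = np.ones(n)                           # some fixed positive vector p
    nz = 2 * n + k + 1                       # variables z = (x, lam, mu, t)
    cost = np.zeros(nz)
    cost[-1] = -1.0                          # minimize -t, i.e. maximize t
    A_eq = np.zeros((k + n + 1, nz))
    A_eq[:k, :n] = E                         # E x = f
    A_eq[k:k + n, n:n + k] = E.T             # E^T lam + mu = 0
    A_eq[k:k + n, n + k:2 * n + k] = np.eye(n)
    A_eq[k + n, n:n + k] = f                 # f . lam = 0, replacing (10)
    b_eq = np.concatenate([f, np.zeros(n + 1)])
    A_ub = np.zeros((n, nz))                 # t p - x - mu <= 0, relaxing (11)
    A_ub[:, :n] = -np.eye(n)
    A_ub[:, n + k:2 * n + k] = -np.eye(n)
    A_ub[:, -1] = p
    bounds = [(0, None)] * n + [(None, None)] * k + [(0, None)] * n + [(0, 1)]
    res = linprog(cost, A_ub=A_ub, b_ub=np.zeros(n),
                  A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    xhat = res.x[:n]
    return {j for j in range(n) if xhat[j] > tol}

# assumed instance: min (1/2)x1^2  s.t.  x1 + x2 = 1, x >= 0
Q = np.diag([1.0, 0.0]); c = np.zeros(2)
A = np.array([[1.0, 1.0]]); b = np.array([1.0])
xbar = np.array([0.0, 1.0])                  # the unique optimum
print(primal_support(Q, A, c, b, xbar))      # X = {1} (0-based)
```

Any solution of the LP with t > 0 satisfies x + μ ≥ tp > 0, and complementarity x · μ = 0 holds automatically because f · λ = 0 forces both the primal and dual LP objectives to their common optimal value.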
3. Least squares and LCPs

In this section we briefly discuss two special cases of convex QP problems for which simplifications of the formulation in Section 2 are possible.

First, we consider the linearly constrained, linear least squares problem. This amounts to the case of problem (1) where the objective function is (1/2)‖Ux − v‖^2, which therefore corresponds to the choice Q = U^T U and c = −U^T v. Lemma 2.1 can now be restated as follows: a feasible point x is optimal if and only if Ux = Ux̄. In particular, the equality c · x = c · x̄ is now redundant. Consequently, the matrix E appearing in (6)–(9) can be taken as [A; U] rather than [A; U^T U; c^T]. This has the advantage that it improves the conditioning of the matrix E.

The other problem we consider is the monotone LCP. This is the problem of finding a vector x ≥ 0 satisfying both Mx + q ≥ 0 and x · (Mx + q) = 0, where M is a given (possibly nonsymmetric) positive semidefinite matrix and q is a given vector. A maximally complementary solution for an LCP is defined to be a solution for which the support sets of the vectors x and s = Mx + q have maximal cardinality. Note that the monotone LCP amounts to an inequality-constrained, convex QP problem, in which we minimize (1/2) x · (M + M^T)x + q · x subject to x ≥ 0 and Mx ≥ −q. Consequently, we could add slack variables to this QP and treat the LCP as a special case of (1), extracting a maximally complementary solution of the LCP from a maximally complementary primal–dual solution of the QP. It turns out that we can instead handle the LCP directly, with the advantage that only a single LP problem must be solved rather than two.

The first step is to note that there is a suitable analogue of Lemma 2.1 for the monotone LCP: given a solution x̄ of the LCP, a vector x ≥ 0 solves the LCP if and only if Mx + q ≥ 0, (M + M^T)x = (M + M^T)x̄, and q · x = q · x̄. Defining

P := [(M + M^T); q^T],
r = P x, ¯
we see that the solutions of the LCP correspond precisely to feasible (hence optimal) solutions of the LP problem min{0 · x: x 0, Mx + q 0, P x = r}.
Observe that these steps can be carried out independently of each other. Merging their results yields a maximally complementary solution (x, ˆ sˆ , y) ˆ for (1), the support indices of which precisely identify the optimal partition X ∪ S ∪ Z.
x
The dual LP for this problem is max{−q · w + r · y: w 0, M T w + P T y 0}, w,y
for which (w, y) = (0, 0) is a feasible point. As x̄ is feasible for the primal LP, the Goldman–Tucker theorem guarantees the existence of a strictly complementary solution satisfying

Mx + q ≥ 0,  Px = r,  x ≥ 0,
M^T w + P^T y ≤ 0,  w ≥ 0,
−q · w + r · y = 0,
x − (M^T w + P^T y) > 0,  (Mx + q) + w > 0.
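This single-LP computation can likewise be prototyped numerically. The sketch below is a minimal illustration under stated assumptions, not code from the paper: as before, the two strict inequalities are relaxed to "≥ t" with t maximized and capped at 1 (the pair (w, y) can be scaled), and the instance and helper name lcp_max_complementary are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def lcp_max_complementary(M, q, xbar):
    """One-LP sketch: maximize t subject to the strictly complementary
    system, with strict inequalities relaxed to '>= t' and t <= 1."""
    n = M.shape[0]
    P = np.vstack([M + M.T, q.reshape(1, n)])   # P stacks M + M^T and q^T
    r = P @ xbar
    nz = 3 * n + 2                               # variables z = (x, w, y, t)
    cost = np.zeros(nz); cost[-1] = -1.0         # maximize t
    A_eq = np.zeros((n + 2, nz)); b_eq = np.zeros(n + 2)
    A_eq[:n + 1, :n] = P; b_eq[:n + 1] = r       # P x = r
    A_eq[n + 1, n:2 * n] = -q                    # -q.w + r.y = 0
    A_eq[n + 1, 2 * n:3 * n + 1] = r
    def blank():
        return np.zeros((n, nz))
    R1 = blank(); R1[:, :n] = -M                 # -(M x) <= q
    R2 = blank(); R2[:, n:2 * n] = M.T           # M^T w + P^T y <= 0
    R2[:, 2 * n:3 * n + 1] = P.T
    R3 = blank(); R3[:, :n] = -np.eye(n)         # x - (M^T w + P^T y) >= t
    R3[:, n:2 * n] = M.T; R3[:, 2 * n:3 * n + 1] = P.T; R3[:, -1] = 1.0
    R4 = blank(); R4[:, :n] = -M                 # (M x + q) + w >= t
    R4[:, n:2 * n] = -np.eye(n); R4[:, -1] = 1.0
    A_ub = np.vstack([R1, R2, R3, R4])
    b_ub = np.concatenate([q, np.zeros(2 * n), q])
    bounds = [(0, None)] * (2 * n) + [(None, None)] * (n + 1) + [(0, 1)]
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds)
    xhat = res.x[:n]
    return xhat, M @ xhat + q                    # (x_hat, s_hat)

# hypothetical monotone LCP whose solutions are x = (a, 0) for all a >= 0:
M = np.array([[0.0, 0.0], [0.0, 1.0]])
q = np.zeros(2)
xhat, shat = lcp_max_complementary(M, q, np.zeros(2))
# the maximally complementary solution has xhat[0] > 0 and shat = 0
```

Starting from the trivial solution x̄ = 0, the LP selects a solution with maximal support, which no single hand-picked solution of the LCP is guaranteed to exhibit.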
A solution (x̂, ŵ, ŷ) of the above system yields a maximally complementary solution x̂ of the monotone LCP.

We conclude this section by noting that the KKT optimality conditions (2)–(4) for the convex QP problem (1) are easily reformulated as a monotone LCP. This raises the question of whether a maximally complementary solution for the QP should instead be computed by means of the single LP problem above, rather than the two LPs given in Section 2. In fact, the LPs in Section 2 have approximately the same combined dimensions as the corresponding LP problem for the LCP. The net effect is that, for a general convex QP problem, the LPs of Section 2 can be viewed as a decoupling of the larger LP of the preceding paragraph into two smaller problems.

4. Practical considerations

In Sections 2 and 3 we presented LP problems from whose strictly complementary solutions one can obtain maximally complementary solutions for QP, least squares, and monotone LCP problems. As noted in Section 1, some primal–dual interior point methods for QPs and LCPs compute a maximally complementary solution (x̂, ŝ, ŷ) directly. Similarly, some primal–dual interior point methods for solving LP problems actually compute strictly complementary points for LPs [5]. Therefore, it might be expected that practitioners with ready access to such codes would naturally employ them. However, as also noted in Section 1, practical implementations of interior point methods are not guaranteed to find maximally (or strictly) complementary solutions.

Another option would be to calculate the primal support set X by means of a Balinski–Tucker tableau, which is an extension of the usual simplex tableau, for the primal–dual LPs (6)–(7). In their article [1], Balinski and Tucker showed how to compute a strictly complementary solution for an LP, starting from an optimal Balinski–Tucker tableau for that LP.
Likewise, if a dual solution ȳ satisfying the KKT conditions (2)–(4) were also known, then the same procedure could be applied to the LPs (12)–(13) to obtain the dual support set S. On the other hand, it should be recognized that many practitioners will prefer to use their accustomed software to solve their quadratic programs, and that this will especially be true of many who routinely work with small- to medium-sized constrained least squares problems. For this reason, we now indicate yet another approach to solving the LP problems of Sections 2 and 3.
The main task is to rewrite systems of equations and inequalities such as (8)–(11) in a suitable form. Using the same trick by which we converted (4) into (5), we can rewrite the quadratic expression (10) in linear form as f · λ = 0. Letting p ∈ R^n denote some fixed positive vector, we see that the strict inequality (11) can be replaced by the inequalities x + μ − tp ≥ 0 and t > 0. In this way, a solution to (8)–(11) can be found by minimizing either −t or (t − 1)^2 over all (x, λ, μ, t) subject to the constraints

Ex = f,  x ≥ 0,  E^T λ + μ = 0,  μ ≥ 0,  f · λ = 0,  x + μ − tp ≥ 0.

A similar transformation, in which h · ξ = 0 replaces (16), works for (14)–(17); the LP problems of Section 3 can be handled likewise.

Here is our key observation: an advanced feasible solution for the optimization problem of the preceding paragraph is given by the choice (x, λ, μ, t) = (x̄, 0, 0, 0). Moreover, if x̄ ≥ 0 is a basic solution for Ex = f and all entries of f are nonzero, then a single primal simplex pivot for the linear objective −t forces t away from zero, thereby attaining the desired strictly complementary solution. For cases where x̄ is not basic, we note that most LP codes are capable of creating an advanced basis starting from x̄. Finally, we observe that, regardless of whether f has any zero entries, an active-set method for minimizing the least squares objective (t − 1)^2 will likewise require only a small number of iterations.

Acknowledgement

The author is grateful to an anonymous referee for several helpful suggestions.

References

[1] M.L. Balinski, A.W. Tucker, Duality theory of linear programs: a constructive approach with applications, SIAM Rev. 11 (1969) 347–377.
[2] A.B. Berkelaar, B. Jansen, K. Roos, T. Terlaky, Basis- and partition identification for quadratic programming and linear complementarity problems, Math. Programming 86 (2) (1999) 261–282.
[3] V. Chvátal, Linear Programming, Freeman, New York, 1983.
[4] A.J. Goldman, A.W. Tucker, Theory of linear programming, in: H.W. Kuhn, A.W. Tucker (Eds.), Linear Inequalities and Related Systems, Annals of Mathematics Studies, vol. 38, Princeton University Press, Princeton, NJ, 1956, pp. 53–97.
[5] O. Güler, Y. Ye, Convergence behavior of interior-point algorithms, Math. Programming 60 (2) (1993) 215–228.
[6] A.G. Hadigheh, T. Terlaky, Sensitivity analysis in convex quadratic optimization: invariant support set interval, Optimization 54 (1) (2005) 59–79.
[7] T. Illés, J. Peng, C. Roos, T. Terlaky, A strongly polynomial rounding procedure yielding a maximally complementary solution for P∗(κ) linear complementarity problems, SIAM J. Optim. 11 (2) (2000) 320–340.
[8] C. Roos, T. Terlaky, J.-Ph. Vial, Interior Point Methods for Linear Optimization, Springer, New York, 2006.