European Journal of Operational Research 112 (1999) 673–681
Theory and Methodology
Implied constraints and LP duals of general nonlinear programming problems

Pravin K. Johri

AT&T Laboratories, Crawford Corner Road, Room 1L-237, Holmdel, NJ 07733-3030, USA

Received 30 August 1995; accepted 10 September 1997
Abstract

A very surprising result is derived in this paper: there exists a family of LP duals for general NLP problems. A general dual problem is first derived from implied constraints via a simple bounding technique. It is shown that the Lagrangian dual is a special case of this general dual and that other special cases turn out to be LP problems. The LP duals provide a very powerful computational device but are derived using fairly strict conditions. Hence, they can often be infeasible even if the primal NLP problem is feasible and bounded. Many directions for relaxing these conditions are outlined for future research. A concept of local duality is also introduced for the first time, akin to the concept of local optimality. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Nonlinear programming; Linear programming; Duality; Local duality; Implied constraints; Lagrangian dual; Simplex method
1. Introduction

There are many formulations of duality in nonlinear programming (NLP). None of these results in an "LP-type" dual, that is, a dual problem which is similar in form to the primal problem. The Lagrangian dual (see, e.g., [1] and Section 2.4), for example, is defined in terms of supremum functions and not just simple nonlinear constraints.

A very surprising result is derived in this paper: there exists a family of LP duals of general NLP problems. A general dual problem is
derived from the primal problem using implied constraints and a simple bounding technique. A concept of local duality is introduced for the first time, akin to the concept of local optimality. All previously known duality results are for global duality. The Lagrangian dual is shown to be a special case of this general dual. Finally, it is shown that other special cases of this general dual turn out to be LP problems. The LP duals provide a very powerful computational device but are derived using fairly strict conditions. Hence, they can often be infeasible even if the primal NLP problem is feasible and bounded. Many directions for relaxing these conditions are outlined for future research.
The derivation of the general dual is based on implied constraints, which are simply constraints implied by the given constraints (and the objective!) of the problem. In this sense, implied constraints can be redundant, but there are two important distinctions between implied and redundant constraints. First, a given constraint is also considered an implied constraint and, thus, not all implied constraints are redundant. Second, the focus is not to identify and delete redundant constraints in an effort to reduce the dimensionality of the problem. Instead, it is to construct special types of implied constraints which will bound the objective of the problem.

The vast amount of literature on duality is beyond the scope of this paper. The extensive references in [1,2] will provide a comprehensive list of earlier work in this area. Recently, an alternate, unified development of NLP theory based on implied constraints [3] yielded the following:
(i) A weaker condition than the Guignard constraint qualification (see, e.g., [2]) for the existence of finite multipliers in the Karush–Kuhn–Tucker conditions (see, e.g., [1,2]);
(ii) A more general sufficiency condition for local and global optimality;
(iii) A single unified formulation of duality (see also [4]) which shows that duality is nothing more than an effort to generate the tightest implied constraint on the objective. Duality theorems hold in general for this formulation (convexity is not required) and the existence of a duality gap in prior formulations is easily explained;
(iv) Additional dual problems which can be derived directly from their primal problems using simple bounding methods (see also [5,6]). In fact, all major formulations of duality were derived in this manner in [5,6].

This paper is organized as follows: preliminary results are stated in Section 2; the concept of local duality, the general dual problem, and the LP duals are derived in Section 3; finally, Section 4 contains the conclusions.

2. Preliminaries

All vectors are column vectors in Euclidean space. The primal NLP problem considered in this paper is to find a vector x ∈ Eⁿ that solves
(P)   maximize   f(x)
      subject to g_i(x) ≤ b_i,   i = 1, 2, ..., m,
                 x ∈ X⁰ ⊆ Eⁿ,

where f : Eⁿ → E, g_i : Eⁿ → E, and b_i ∈ E, i = 1, 2, ..., m. For simplicity, and without loss of generality, we assume that the domain of all functions is the entire space Eⁿ. Let X ⊆ Eⁿ be the set of feasible solutions, that is, X = {x ∈ X⁰ | g_i(x) ≤ b_i, i = 1, 2, ..., m}.
The following is a well-known NLP problem (see, e.g., [2], p. 31).

Example 1.

      maximize   x₁
      subject to −(1 − x₁)³ + x₂ ≤ 0,
                 −x₂ ≤ 0.
The optimal point is (1, 0) with objective value 1. This example is particularly important because the Karush–Kuhn–Tucker first-order optimality conditions are not satisfied at this point.
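The failure is easy to verify directly (a standard computation, spelled out here for concreteness). At the point (1, 0) both constraints are active, and

$$\nabla f(1,0) = (1,\,0), \qquad \nabla g_1(1,0) = \bigl(3(1-x_1)^2,\,1\bigr)\big|_{(1,0)} = (0,\,1), \qquad \nabla g_2(1,0) = (0,\,-1),$$

so stationarity would require $(1,0) = y_1(0,1) + y_2(0,-1)$ for some $y_1, y_2 \ge 0$, which is impossible in the first component.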
2.1. Implied constraints

Let e : Eⁿ → E and let c be a scalar. The constraint e(x) ≤ c is an implied constraint of problem (P) if e(x) ≤ c for all x ∈ X. If the region X is empty then all constraints are defined to be implied constraints. An implied constraint of the form f(x) ≤ c is said to be an implied constraint on the objective of (P).
2.2. Generating implied constraints

Since implied constraints are defined in terms of the set of feasible solutions, they can be generated from the given constraints of the problem [3,4]. There are two types of methods for generating implied constraints. The first type are ad hoc rules which depend on the form of the given constraints of the problem. For example, the two constraints of Example 1 can be combined to yield

−(1 − x₁)³ ≤ −x₂ ≤ 0.   (1)
This way of generating implied constraints cannot be generalized to apply to all problems. The second type employs rules which can be generalized. Some of these rules are given in [3]. One such rule is that if e(x)³ ≤ c³ is a given constraint of the problem, then e(x) ≤ c is an implied constraint, and vice versa. For example, the (implied) constraint (1) further implies that

−(1 − x₁) ≤ 0.   (2)
Another rule is to take linear combinations of the constraints of problem (P) in such a way that the inequalities, if any, are preserved. Let L denote this set of linearly-generated implied constraints. That is, with y ∈ Eᵐ, y ≥ 0,

L = { e(x) ≤ c | e(x) = Σᵢ₌₁ᵐ yᵢ g_i(x), c = Σᵢ₌₁ᵐ yᵢ b_i }.   (3)
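For instance, taking y₁ = y₂ = 1 in (3) for Example 1 adds the two given constraints and recovers the implied constraint appearing in (1):

$$\bigl(-(1-x_1)^3 + x_2\bigr) + (-x_2) \;=\; -(1-x_1)^3 \;\le\; 1\cdot 0 + 1\cdot 0 \;=\; 0.$$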
2.3. Recent results [3,4]

Let C denote the set of all implied constraints of (P).
1. In LP, L = C, and all implied constraints in LP can be generated linearly.
2. In NLP, L ⊂ C, and only a subset of implied constraints can be generated linearly.
3. Johri's formulation of duality in NLP is to find the tightest implied constraint on the objective of (P).
4. Duality theorems hold in general for Johri's formulation.
5. Other duality formulations, such as the Lagrangian (see Section 2.4), generate only a subset of implied constraints and, thus, have a duality gap in general.

2.4. The Lagrangian formulation of duality

Let y ∈ Eᵐ, y ≥ 0. The Lagrangian dual (see, e.g., [1]) of the problem (P) is as follows:

(L)   minimize   θ(y)
      subject to y ≥ 0,

where

θ(y) = sup_{x ∈ X⁰} { f(x) − Σᵢ₌₁ᵐ yᵢ (g_i(x) − b_i) }.
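For Example 1, with X⁰ = E², this supremum is infinite for every y ≥ 0 (a computation added here to illustrate item 5 above):

$$\theta(y) = \sup_{x \in E^2}\bigl\{\, x_1 + y_1(1-x_1)^3 + (y_2 - y_1)\,x_2 \,\bigr\} = +\infty \quad \text{for all } y \ge 0,$$

since the x₂ term is unbounded above unless y₁ = y₂, and with y₁ = y₂ the remaining function x₁ + y₁(1 − x₁)³ tends to +∞ as x₁ → −∞ when y₁ > 0 (and as x₁ → +∞ when y₁ = 0). The primal optimum is 1, so the Lagrangian duality gap for Example 1 is infinite.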
2.5. Objective function domination

Two simple bounding methods were used in [3] to derive dual programs. We will combine the two and use this new definition in the following sections. For any e : Eⁿ → E and any set Y such that X ⊆ Y ⊆ Eⁿ, the function e(x) is said to dominate the objective function f(x) if e(x) ≥ f(x) for all x ∈ Y.

3. Derivation of LP duals of NLP problems

The duality results derived in this paper are based on several key steps.

Step 1: Augmenting the primal problem. The first step is to augment the primal problem by adding newly constructed implied constraints. This step is counterintuitive. Most of the traditional theory tries to reduce the size of the problem by deleting redundant constraints. We will show that some NLP problems are easier to solve if (implied) constraints are added to the problem. This is in direct contrast to LP, where no benefit is derived by adding more constraints. The additional constraints in LP just increase the complexity of the problem. The main reason for this difference between LP and NLP is that additional (implied or redundant) constraints in LP are always linear combinations of the given constraints of the problem. This result does not hold in NLP. Example 1 can be augmented by adding implied constraints (1) and (2) to yield:

Augmented Example 1.

      maximize   x₁
      subject to −(1 − x₁)³ + x₂ ≤ 0,
                 −x₂ ≤ 0,
                 −(1 − x₁)³ ≤ 0,
                 −(1 − x₁) ≤ 0.
Any number of implied constraints can be added to the primal problem to obtain an augmented primal problem. It is obvious that the solutions of the original and augmented problems are identical. For notational simplicity, and without loss of generality, we assume that problem (P) denotes an augmented primal problem.

Step 2: Using an LP-type notation. The second key step is to express NLP problems with an LP-type notation. The constraints of Example 1 contain the following terms: x₁, x₁², x₁³, and x₂. A single variable, for example x₁, can occur in multiple terms. Let v(x) = (v₁(x), v₂(x), ..., v_p(x))ᵀ denote the terms in the problem (P). For Example 1, p = 4 and v(x) = (x₁, x₁², x₁³, x₂)ᵀ. We can then write problem (P) in the following LP-type manner:

(P-LP)   maximize   cᵀv(x)
         subject to Av(x) ≤ b,
                    x ∈ X⁰ ⊆ Eⁿ,

where A is an m × p matrix, c ∈ Eᵖ, and b ∈ Eᵐ. The Augmented Example 1 can be written in this form with

v(x) = (x₁, x₁², x₁³, x₂)ᵀ,   c = (1, 0, 0, 0)ᵀ,   b = (1, 0, 1, 1)ᵀ,

A = [ 3  −3   1   1
      0   0   0  −1
      3  −3   1   0
      1   0   0   0 ].

If the last two rows of the matrix A and the last two entries of the vector b are deleted above, then we obtain the original Example 1.
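The matrix form is easy to sanity-check numerically; the following sketch (our own illustration, with variable names chosen here) verifies that Av(x) − b coincides with the four constraint functions of the Augmented Example 1:

```python
import numpy as np

# Augmented Example 1 in the LP-type notation (P-LP).
A = np.array([[3.0, -3.0, 1.0,  1.0],   # -(1 - x1)^3 + x2 <= 0
              [0.0,  0.0, 0.0, -1.0],   # -x2 <= 0
              [3.0, -3.0, 1.0,  0.0],   # -(1 - x1)^3 <= 0
              [1.0,  0.0, 0.0,  0.0]])  # -(1 - x1) <= 0
b = np.array([1.0, 0.0, 1.0, 1.0])
c = np.array([1.0, 0.0, 0.0, 0.0])      # objective x1 = c^T v(x)

def v(x1, x2):
    """Vector of terms v(x) = (x1, x1^2, x1^3, x2)^T."""
    return np.array([x1, x1**2, x1**3, x2])

def g(x1, x2):
    """Left sides of the four constraints in their original form."""
    return np.array([-(1 - x1)**3 + x2, -x2, -(1 - x1)**3, -(1 - x1)])

rng = np.random.default_rng(0)
for x1, x2 in rng.normal(size=(5, 2)):
    assert np.allclose(A @ v(x1, x2) - b, g(x1, x2))
print("A v(x) - b reproduces the original constraint functions.")
```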
With y ∈ Eᵐ, y ≥ 0, linearly-generated implied constraints can be written as

yᵀAv(x) ≤ yᵀb   (4)

or

yᵀAv(x) + c₀ ≤ yᵀb + c₀   (5)

for a scalar c₀ ∈ E. The reason for introducing c₀ in Eq. (5) will be explained later.

Step 3: Bounding the objective. Our third key step is to obtain a bound on the objective of the primal problem (P). The main idea can be stated very simply: if we can construct an implied constraint and show that its left side dominates the objective, then we have bounded the objective. More formally,

Theorem 1. Let e(x) ≤ c be an implied constraint of (P) and let e(x) dominate the objective of (P). Then c is an upper bound on the objective of (P).

Applying Theorem 1 to linearly-generated implied constraints, for any y ∈ Eᵐ, y ≥ 0, c₀ ∈ E, and Y such that X ⊆ Y ⊆ Eⁿ, if

yᵀAv(x) + c₀ − cᵀv(x) ≥ 0 for all x ∈ Y,   (6)

then

cᵀv(x) ≤ yᵀb + c₀.   (7)
Hence, we can obtain the best bound (with this way of bounding the primal objective) by solving the following dual problem:

(D)   minimize   yᵀb + c₀
      subject to yᵀAv(x) + c₀ − cᵀv(x) ≥ 0 for all x ∈ Y,
                 y ≥ 0,

where y ∈ Eᵐ, c₀ ∈ E, and X ⊆ Y ⊆ Eⁿ. Note that this dual problem (D) is defined in terms of both the primal (x) and dual (y) variables. The first constraint of (D) must be verified for all x ∈ Y. We could just use Y = X, but then the problem (D) is tantamount to solving the primal problem (P). With a suitable Y, X ⊆ Y, this constraint can often be verified more easily and the program (D) can often be simplified. In particular, the cases Y = Eⁿ and Y = X⁰ will be used in subsequent sections.

Adding c₀ in Eq. (5) seemed like an innocuous step earlier. Its importance can be seen in conditions (6) and (7). It will allow condition (6) to hold, with an appropriate value of c₀, even in cases where the linear combination yᵀAv(x) of the given constraints does not dominate the objective. This in turn can yield a lower objective value in (D) and provide a better bound on the objective of (P) than without the use of c₀.
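To make conditions (6) and (7) concrete, consider again the Augmented Example 1 (a numerical check added here, reusing the data of the previous snippet). The choice y = (0, 0, 0, 1)ᵀ and c₀ = 0 gives yᵀA = cᵀ, so the left side of (6) vanishes identically on Y = Eⁿ and (7) returns the bound cᵀv(x) ≤ yᵀb = 1, which equals the primal optimum:

```python
import numpy as np

A = np.array([[3.0, -3.0, 1.0,  1.0],
              [0.0,  0.0, 0.0, -1.0],
              [3.0, -3.0, 1.0,  0.0],
              [1.0,  0.0, 0.0,  0.0]])
b = np.array([1.0, 0.0, 1.0, 1.0])
c = np.array([1.0, 0.0, 0.0, 0.0])

y = np.array([0.0, 0.0, 0.0, 1.0])  # multipliers on the four constraints
c0 = 0.0                            # the scalar c0 of Eqs. (5)-(7)

# Condition (6): y^T A v(x) + c0 - c^T v(x) >= 0 for all x.
# Here y^T A = (1, 0, 0, 0) = c^T, so the left side is identically c0 = 0.
assert np.allclose(y @ A, c)

# Bound (7) on the primal objective:
print("c^T v(x) <= y^T b + c0 =", y @ b + c0)   # prints 1.0
```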
We will develop formulation (D) in six main directions by
1. deriving the concept of local duality akin to the concept of local optimality;
2. showing that the Lagrangian dual is equivalent to formulation (D) with Y = X⁰;
3. showing that in LP, (D) reduces to the standard LP dual;
4. showing that the optimality condition in the Simplex method in LP is equivalent to the first constraint of (D);
5. showing that formulation (D) can be reduced to LP duals for general NLP problems;
6. discussing the duality gap and how these results can be generalized.

3.1. Local duality

The concept of global and local implied constraints, analogous to the concept of global and local optimality, was developed in [3]. The dual formulation (D) allows us to introduce a similar concept of global and local duality. The concept of local duality has not been developed before. All previous formulations of duality typically refer to global duality, although none are explicitly stated as such.

The definition of dominance in Section 2.5 is also a definition of global dominance. It is modified first for local dominance. Let N_δ(x) denote a spherical neighborhood of the point x with radius δ. For some x₀ ∈ X and δ > 0, X ∩ N_δ(x₀) denotes the set of feasible points in a neighborhood of x₀. For e : Eⁿ → E and any set Y such that X ∩ N_δ(x₀) ⊆ Y ⊆ Eⁿ, the function e(x) is said to locally dominate the objective function f(x) if e(x) ≥ f(x) for all x ∈ Y. With this definition of Y, the program (D) is a local dual of the problem (P) at the point x₀ ∈ X.

3.2. Lagrangian duality

In our LP-type notation, the Lagrangian dual (L) can be written as

minimize   yᵀb + sup_{x ∈ X⁰} { cᵀv(x) − yᵀAv(x) }
subject to y ≥ 0.
For a fixed y and with Y = X⁰, if we minimize c₀ subject to the first constraint of (D), then the minimum value of c₀ is given by

c₀ = sup_{x ∈ X⁰} { cᵀv(x) − yᵀAv(x) }.
Hence, formulation (D) can be reduced to the Lagrangian dual with Y = X⁰. This result is significant for many reasons. First, we have shown our dual formulation (D) to be at least as powerful as the Lagrangian dual formulation. Second, we have found a way of deriving the Lagrangian dual directly from the primal problem using a simple bounding method. Third, as we will show later, the Lagrangian is only one way of simplifying formulation (D). There is another way by which formulation (D), very surprisingly, simplifies to an LP dual. Before we do this, we get some insights from results in LP.
3.3. Linear programming

In LP, p = n and v(x) = x. This allows us to simplify the first constraint of the formulation (D) for the case Y = Eⁿ. The following result can be proven easily.

Theorem 2. Given y ∈ Eᵐ, c ∈ Eⁿ, c₀ ∈ E, and an m × n matrix A,

yᵀAx + c₀ − cᵀx ≥ 0 for all x ∈ Eⁿ

if and only if

yᵀA = cᵀ and c₀ ≥ 0.
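A short justification, sketched here since the paper leaves the proof to the reader: let $d = A^{\mathsf{T}}y - c$, so the condition reads $d^{\mathsf{T}}x + c_0 \ge 0$ for all $x \in E^n$. Taking $x = -t\,d$ with $t \to \infty$ gives $-t\|d\|^2 + c_0 \ge 0$, which forces $d = 0$; then $x = 0$ gives $c_0 \ge 0$. Conversely, if $y^{\mathsf{T}}A = c^{\mathsf{T}}$ and $c_0 \ge 0$, the left side is identically $c_0 \ge 0$.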
In words, Theorem 2 says that if the function yᵀAx + c₀ dominates the objective function cᵀx for all x ∈ Eⁿ, then the coefficients of the two linear functions yᵀAx and cᵀx must be equal term for term (together with c₀ ≥ 0). This result obviously does not generalize to NLP. Thus, for Y = Eⁿ, formulation (D) in LP reduces to

minimize   yᵀb + c₀
subject to yᵀA = cᵀ,
           y ≥ 0, c₀ ≥ 0.

The minimum is obtained at c₀ = 0, and c₀ can be eliminated to get
(LPD)   minimize   yᵀb
        subject to yᵀA = cᵀ,
                   y ≥ 0,

which is just the standard LP dual problem. With Y = Eⁿ, the scalar c₀ played no role in the derivation of the standard LP dual. As explained earlier, c₀ allows the dual program (D) to achieve lower objective values, yielding better bounds on the objective of (P). In LP, it can be demonstrated based on the Farkas lemma (see, e.g., [1,2]) that the problem (LPD) has the same optimal objective value as problem (P) if both problems are feasible. This is, in fact, the strong duality theorem. Thus, the objective value of (LPD) cannot be reduced further and there is no benefit in using c₀ with Y = Eⁿ. Again, this result does not generalize to NLP. Not using c₀ in NLP can result in a larger duality gap, which is defined as the difference in the optimal objective values of the primal and dual problems.

Even though Theorem 2 does not generalize to NLP, there is a weaker result which does carry over to NLP.

Theorem 3. yᵀA = cᵀ and c₀ ≥ 0 are sufficient conditions for the first constraint of formulation (D) to be satisfied for all x ∈ Eⁿ for general NLP problems.

Corollary 1. Formulation (LPD) is also a dual of general NLP problems.

We have obtained our first LP dual of general NLP problems. We will develop this aspect further later on.
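The proof of Theorem 3 is one line (written out here; it mirrors the converse direction of Theorem 2): if $y^{\mathsf{T}}A = c^{\mathsf{T}}$ then

$$y^{\mathsf{T}}Av(x) + c_0 - c^{\mathsf{T}}v(x) \;=\; (A^{\mathsf{T}}y - c)^{\mathsf{T}}v(x) + c_0 \;=\; c_0 \;\ge\; 0 \qquad \text{for all } x \in E^n,$$

no matter how nonlinear the terms $v_i(x)$ are.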
3.4. Optimality condition in the Simplex method

It is very interesting to see that the optimality condition in the Simplex method is of the same form as the first constraint of the dual program (D) (see Appendix A for details). This observation points out the potential for developing computational methods from the dual (D). Theorem 3 allowed us to use the equality condition yᵀA = cᵀ to define an LP dual in NLP. The Simplex algorithm provides an additional insight: if the variables are all nonnegative, this condition can be relaxed to the inequality yᵀA ≥ cᵀ. Conversely, if the variables are all nonpositive, it can be relaxed to the inequality yᵀA ≤ cᵀ. We do not really need to cite the Simplex method to obtain these relaxations; they are fairly obvious by themselves.

3.5. Nonlinear programming

Theorem 3 and the insight from the Simplex algorithm will be used now to obtain a more general LP dual of NLP problems. In the formulation (D), let

I⁺ = {i | v_i(x) ≥ 0 for all x ∈ Y},
I⁻ = {i | v_i(x) ≤ 0 for all x ∈ Y},
I  = {i | i ∈ {1, 2, ..., p}, i ∉ I⁺, i ∉ I⁻}.

Further, let Aᵢᵀ denote the ith row of the matrix Aᵀ. The following result is very easy to prove.

Theorem 4. c₀ ≥ 0 and

Aᵢᵀy ≤ cᵢ, i ∈ I⁻;   Aᵢᵀy = cᵢ, i ∈ I;   Aᵢᵀy ≥ cᵢ, i ∈ I⁺,

are sufficient conditions for the first constraint of formulation (D) to be satisfied for general NLP problems.

The conditions of Theorem 4 are more general than the conditions of Theorem 3 and yield the following LP dual of NLP problems (the scalar c₀ is again optimally zero and has been eliminated):

(LD)   minimize   yᵀb
       subject to Aᵢᵀy ≤ cᵢ, i ∈ I⁻,
                  Aᵢᵀy = cᵢ, i ∈ I,
                  Aᵢᵀy ≥ cᵢ, i ∈ I⁺,
                  y ≥ 0.
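The proof of Theorem 4 is the same term-by-term sign argument (sketched here): writing

$$y^{\mathsf{T}}Av(x) + c_0 - c^{\mathsf{T}}v(x) \;=\; \sum_{i=1}^{p}\bigl(A_i^{\mathsf{T}}y - c_i\bigr)\,v_i(x) + c_0,$$

each summand is nonnegative on Y under the stated conditions (a nonnegative coefficient multiplies a nonnegative term for i ∈ I⁺, a nonpositive coefficient multiplies a nonpositive term for i ∈ I⁻, and the coefficient is zero for i ∈ I), so the whole expression is at least c₀ ≥ 0.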
The new LP dual (LD) is a more general version of the previously obtained LP dual (LPD) and will have a duality gap no larger than that of (LPD). This is because the set of feasible solutions of (LPD) is a subset of the set of feasible solutions of (LD).
We have shown that there exists an LP dual (LD) of general NLP problems. The primal NLP problem can be augmented, repeatedly, with implied constraints without changing the optimum solution of the problem. Each augmented problem results in a different, but related, LP dual. Thus, there exists a family of LP duals for general NLP problems. This result is very surprising. It also has great computational potential, as LPs can be solved very easily compared to NLP problems.

The LP dual of Example 1, for Y = Eⁿ, is

minimize   y₁
subject to 3y₁ = 1,
           −3y₁ ≥ 0,
           y₁ = 0,
           y₁ − y₂ = 0,
           y₁, y₂ ≥ 0.

It is clear that this dual is infeasible. The LP dual of the Augmented Example 1, for Y = Eⁿ, is

minimize   y₁ + y₃ + y₄
subject to 3y₁ + 3y₃ + y₄ = 1,
           −3y₁ − 3y₃ ≥ 0,
           y₁ + y₃ = 0,
           y₁ − y₂ = 0,
           y₁, y₂, y₃, y₄ ≥ 0.

The optimum solution of this LP occurs at the point y = (0, 0, 0, 1) with objective value 1. The Lagrangian dual of the Augmented Example 1 and the optimal solution of Example 1 have this same objective value.
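This LP is small enough to check mechanically. A sketch using scipy.optimize.linprog (our verification, not part of the original paper):

```python
import numpy as np
from scipy.optimize import linprog

# LP dual (LD) of the Augmented Example 1, Y = E^n.
# Variables y = (y1, y2, y3, y4) >= 0 (linprog's default bounds).
c = np.array([1.0, 0.0, 1.0, 1.0])        # minimize y1 + y3 + y4

A_eq = np.array([[3.0,  0.0, 3.0, 1.0],   # 3y1 + 3y3 + y4 = 1  (term x1)
                 [1.0,  0.0, 1.0, 0.0],   # y1 + y3 = 0         (term x1^3)
                 [1.0, -1.0, 0.0, 0.0]])  # y1 - y2 = 0         (term x2)
b_eq = np.array([1.0, 0.0, 0.0])

A_ub = np.array([[3.0, 0.0, 3.0, 0.0]])   # -3y1 - 3y3 >= 0  <=>  3y1 + 3y3 <= 0
b_ub = np.array([0.0])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq)
print(res.x, res.fun)                     # expect y = (0, 0, 0, 1), value 1.0
```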
3.6. Duality gap and future research

In general, it is quite likely that the LP dual (LD) is infeasible. In particular, this will often be the case whenever the objective of (P) contains a term which does not occur in the constraints. It is also possible, when the dual (LD) is feasible, that the duality gap between problems (P) and (LD) may be quite large. This is not surprising and is the price for obtaining an LP dual, as solving LP problems is a trivial task compared to solving NLP
problems. However, there are several ways to mitigate these shortcomings.

The first option is to try to augment problem (P) with implied constraints such that, for example, all terms in the objective are represented in the constraints. The more implied constraints are added, the smaller the duality gap is likely to be. This aspect was demonstrated with Example 1 and the Augmented Example 1: while the dual of Example 1 had no solution, the dual of the Augmented Example 1 had a solution with duality gap zero. Augmenting the primal problem increases the dimensions of the resulting LP dual. Still, the LP dual is much easier to solve than any other NLP dual formulation. It is challenging to come up with innovative ways of constructing implied constraints which are similar in form to the objective and hence yield better bounds on the objective.

The second option is to try to relax Theorem 4. Dominating the objective term for term is obviously a very strict requirement, as the same variable in NLP problems can occur in several terms. In Example 1, the variable x₁ occurs in the terms x₁, x₁², and x₁³. It may be possible to exploit the relationship between these different terms, just the way non-negativity and non-positivity of terms were used to derive Theorem 4. It is challenging to come up with conditions under which Theorem 4 can be relaxed. It certainly seems feasible and is left for future research.

The third option might be to try to replace non-polynomial functions with their Taylor series expansions. The fourth option is to try to involve the scalar c₀ in the simplification process; it was essentially ignored in the derivation of the LP duals. These issues are also left for future research.

3.7. Generalization – equality constraints

The results derived in this paper are for primal problems with only inequality constraints. These results can be easily generalized to include equality constraints. With a modified definition of linearly-generated implied constraints (see, e.g., [3,4]) which includes equality constraints, the remainder
of the paper generalizes in a straightforward manner.

4. Conclusions

This paper derives a very surprising result: there exists a family of LP duals for general NLP problems. In this derivation, a general dual problem, more powerful than the Lagrangian dual, is first obtained from implied constraints using a simple bounding technique. A concept of local duality is introduced for the first time, akin to the concept of local optimality. The computational potential is highlighted by showing that the optimality condition in the Simplex method is similar to the main constraint of this dual. Many opportunities for future research are also outlined in this paper.

Acknowledgements

I am grateful to Thorsten Nieuwenhuizen and an anonymous referee for comments which helped to improve the presentation of this paper.

Appendix A. Optimality condition in the Simplex method

Consider the LP problem in standard form: maximize cᵀx subject to Ax = b, x ≥ 0, where A is an m × n matrix, x, c ∈ Eⁿ, and b ∈ Eᵐ. Given a basic feasible solution, with the variables reordered if necessary so that the first m variables are in the basis, the Simplex tableau is
split into two parts and changed into the canonical form shown in Fig. 1, where c_Bᵀ denotes the basic objective coefficients, c_Nᵀ the nonbasic objective coefficients, B the basis matrix, N the nonbasis matrix, and I the identity matrix.

[Fig. 1: the Simplex tableau, partitioned into basic columns B and nonbasic columns N, in canonical form.]

The objective function row can be written as

cᵀ − c_Bᵀ B⁻¹ A.

The second term is a linear combination of the rows of matrix A. The optimality condition for the Simplex method is that

cᵀ − c_Bᵀ B⁻¹ A ≤ 0ᵀ   or   c_Bᵀ B⁻¹ A − cᵀ ≥ 0ᵀ.

Since the variables x are constrained to be nonnegative, the previous condition implies that

c_Bᵀ B⁻¹ A x − cᵀx ≥ 0 for all x ∈ Y,

with Y = {x ∈ Eⁿ | x ≥ 0}. Thus, with yᵀ = c_Bᵀ B⁻¹, the optimality condition in the Simplex method can be reduced to the same form as the first constraint in our dual program (D). Theorem 3 allowed us to use the equality condition yᵀA = cᵀ to define an LP dual in NLP. The Simplex algorithm provides the additional insight that if the variables are all nonnegative, this condition can be relaxed to the inequality yᵀA ≥ cᵀ.
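A tiny numerical illustration of this identification (the LP and the basis below are our own arbitrary choices):

```python
import numpy as np

# Standard-form LP: maximize c^T x subject to Ax = b, x >= 0.
# max 3x1 + 5x2 with slacks x3, x4: x1 + x2 + x3 = 4, x1 + 3x2 + x4 = 6.
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([3.0, 5.0, 0.0, 0.0])

basis = [0, 1]                     # the optimal basis {x1, x2}
B = A[:, basis]
cB = c[basis]

y = np.linalg.solve(B.T, cB)       # y^T = cB^T B^{-1}
print("y         =", y)            # (2, 1)
print("y^T A - c =", y @ A - c)    # (0, 0, 2, 1) >= 0: optimality condition holds
print("y^T b     =", y @ b)        # 14, the optimal objective value
```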
References

[1] M.S. Bazaraa, C.M. Shetty, Nonlinear Programming: Theory and Algorithms, Wiley, New York, 1979.
[2] M. Avriel, Nonlinear Programming: Analysis and Methods, Prentice-Hall, Englewood Cliffs, NJ, 1976.
[3] P.K. Johri, Implied constraints and an alternate, unified development of nonlinear programming theory, European J. Oper. Res. 88 (1996) 537–549.
[4] P.K. Johri, Implied constraints and a unified theory of duality in linear and nonlinear programming, European J. Oper. Res. 71 (1993) 61–69.
[5] P.K. Johri, Derivation of duality in mathematical programming and optimization theory, European J. Oper. Res. 73 (1994) 547–554.
[6] P.K. Johri, Duality via half spaces, ZOR – Methods Models Oper. Res. 39 (1994) 85–92.