A polynomial formulation for the stochastic maximum weight forest problem

A polynomial formulation for the stochastic maximum weight forest problem

Available online at www.sciencedirect.com Electronic Notes in Discrete Mathematics 41 (2013) 29–36 www.elsevier.com/locate/endm A polynomial formula...

212KB Sizes 0 Downloads 45 Views

Available online at www.sciencedirect.com

Electronic Notes in Discrete Mathematics 41 (2013) 29–36 www.elsevier.com/locate/endm

A polynomial formulation for the stochastic maximum weight forest problem Pablo Adasme 1 Departamento de Ingenier´ıa El´ectrica, Universidad de Santiago de Chile Avenida Ecuador 3519, Santiago, Chile.

Rafael Andrade 2 Departamento de Estat´ıstica e Matem´ atica Aplicada, Universidade Federal do Cear´ a, Campus do Pici - Bloco 910, 60455-760, Fortaleza, Cear´ a, Brasil.

Marc Letournel, Abdel Lisser 3 Laboratoire de Recherche en Informatique, Universit´e de Paris-Sud XI Bˆ at. 650, 91405, Orsay Cedex France. Abstract Let G = (V, ED ∪ES ) be a non directed graph with set of nodes V and set of weighted edges ED ∪ ES . The edges in ED and ES have deterministic and uncertain weights, respectively, with ED ∩ ES = ∅. Let S = {1, 2, · · · , P } be a given set of scenarios for the uncertain weights of the edges in ES . The stochastic maximum weight forest (SMWF) problem consists in determining a forest of G, one for each scenario s ∈ S, sharing the same deterministic edges and maximizing the sum of the deterministic weights plus the expected weight over all scenarios associated with the uncertain edges. In this work we present two formulations for this problem. The first model has an exponential number of constraints, while the second one is a new compact extended formulation based on a new theorem characterizing forests in graphs. We give a proof of the correctness of the new formulation. It generalizes existing related models from the literature for the spanning tree polytope. Preliminary results evidence that the SMWF problem can be NP-hard. Keywords: compact formulation, stochastic maximum weight forest problem.

1571-0653/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.endm.2013.05.072

30

1

P. Adasme et al. / Electronic Notes in Discrete Mathematics 41 (2013) 29–36

Introduction

Consider a non directed graph G = (V, ED ∪ ES ), with set of nodes V and set of weighted edges ED ∪ ES , with ED ∩ ES = ∅. The edges in ED and ES have deterministic and uncertain weights, respectively, with ED ∩ ES = ∅. The edges in ED and ES have deterministic and uncertain weights, respectively. Let S = {1, 2, · · · , P } be a given set of scenarios for the uncertain weights of the edges in ES . The stochastic maximum weight forest (SMWF) problem [3] consists in determining a forest of G, one for each scenario s ∈ S, sharing the same deterministic edges and maximizing the sum of the deterministic weights plus the expected weight over all scenarios associated with the uncertain edges. For |S| = 1, the problem is to find a forest of maximum weight in G and can be done in polynomial time [5]. There is no similar result concerning the complexity of the SMWF problem, but we present some instances that have not the totally dual integral (TDI) property. This evidences that this problem can be NP-hard. In fact, it has been practically impossible to generate randomly non TDI instances for this problem. One of the main difficulties in investigating the SMWF problem concerns the large dimensions of a linear model for solving the problem instances. We can transform accordingly SMWF instances to obtain a complete graph so that any scenario solution be a spanning tree and use existing spanning tree formulations [4,6,1] to model the SMWF problem. But we can overcome transforming forests into trees. Thus, saving memory space and execution time, by introducing a new characterization theorem for forests in graphs that allows to obtain a compact extended formulation for the problem. This result is novel for the SMWF problem and generalizes the ideas from the spanning tree polytope (see e.g. [4,6,1]), constituting the main contribution of this work. We prove the correctness of the new formulation. With this compact extended formulation we can now handle instances with up to 40 nodes.

2

Problem formulation

Let G = (V, ED ∪ ES ) be an undirected graph and S = {1, 2, · · · , P } be the set of scenarios as described in Section  1. Let πs be the probability associated with a given scenario s ∈ S, with Ps=1 πs = 1 and πs ≥ 0, for all s ∈ S. Let csuv ∈ R, with u, v ∈ V , represent the weight of an edge uv in scenario s ∈ S, 1 2 3

Email: [email protected] Email: [email protected] Email: {marc.letournel,abdel.lisser}@lri.fr

P. Adasme et al. / Electronic Notes in Discrete Mathematics 41 (2013) 29–36

31

with c1uv = c2uv = · · · = cPuv given for all uv ∈ ED . The uncertain weights csuv of each scenario s ∈ S, for all uv ∈ ES , are randomly generated. Let represent a forest of G in scenario s ∈ S by a vector xs = (xsD , xsS ) ∈ {0, 1}|ED ∪ES | , where xsuv = 1 if uv belongs to the forest of that scenario, and xsuv = 0, otherwise. We must have x1D = x2D = · · · = xPD , thus we replace these variables by a unique vector y ∈ {0, 1}|ED | . Let E(H) represent the set of edges with both extremities in the subset of nodes H ⊂ V , such that |H| ≥ 3. The SMWF problem is      (1) (P1 ) max c1uv yuv + πs csuv xsuv uv∈ED

(2) (3)

s.t.



s∈S

yuv +

uv∈E(H)∩ED xsuv ∈ {0, 1},

uv∈ES



xsuv ≤ |H| − 1, ∀ H ⊂ V, ∀ s ∈ S

uv∈E(H)∩ES

∀ uv ∈ ES , ∀ s ∈ S,

yuv ∈ {0, 1}, ∀ uv ∈ ED

Constraints (2) impose the non existence of cycles in any solution. Proposition 2.1 The maximum weight forest problem (i.e. the SMWF with |S| = 1) is polynomially solvable by a greedy algorithm [2]. Proposition 2.2 Formulation (P1 ) is not totally dual integral (TDI). In [3] the authors present a small example for a six nodes graph and three scenarios of edges weights where the linear relaxed solution is not integer. This proves Proposition 2.2. The fact that the problem may present fractional linear relaxed solution makes it difficult to deal with. Moreover, the exponential number of constraints in (P1 ) makes the problem intractable, thus we propose a new compact formulation for this problem.

3

Polynomial formulation

Before presenting the new formulation, we need a new characterization for forests in undirected graphs. Let, for a given node k ∈ V and a given scenario s ∈ S, Fks represent an abstract orientation of the edges of a general subgraph F of G such that the node k in the scenario s is taken as a referential in F . In this case, if an edge uv belongs to F , then the referential node k observes u preceding v = k or v = k preceding u, but not both, in Fks . Theorem 3.1 Let F s be a subgraph of G for scenario s ∈ S. For all k ∈ V , if there exist independent abstract orientations Fks of the edges of F s , for k = 1, · · · , |V |, verifying simultaneously the following conditions in each Fks

32

P. Adasme et al. / Electronic Notes in Discrete Mathematics 41 (2013) 29–36

(i) There is no arc entering each referential node k; (ii) There is at most one arc entering a node u = k; then F s is an acyclic forest. Proof. Suppose that F s contains a cycle C = (V (C), E(C)) for some scenario s, with V (C) and E(C) being the set of nodes and edges of C, respectively. We show that F s cannot satisfy both conditions above for all k ∈ V in scenario s ∈ S. To see this, first consider the nodes not in V (C) as referential nodes and assume without loss of generality that they do not induce a cycle in F s . For these nodes, if the arcs of the cycle C were symbolically oriented in the clockwise sense, both conditions (i) and (ii) should be satisfied. But when we consider the nodes in V (C) as referential in any abstract orientation of the edges of E(C), there are at least two arcs connected to any node of V (C) (two leaving or two entering or one leaving and the other entering each node). We distinguish two possible situations. First, if all arcs connected to any referential node of C leave that node, then at least some other node in this cycle presents two arcs entering it, which violates the condition (ii). Second, if the arcs of C were symbolically oriented in the clockwise sense, then the condition (i) should be violated for all referential node in C. Thus, if F s contain a cycle, there is no possibility of orienting the edges of Fks , k = 1, · · · , |V |, in order to satisfy the conditions (i) and (ii) simultaneously. Therefore F s must be a forest, thus concluding the proof. 2 See that when F s is a forest, then always exist abstract orientations of the edges of Fks , k = 1, · · · , |V |, satisfying the two conditions of Theorem 3.1. Corollary 3.2 If subgraph F s in the Theorem 3.1 contains |V (G)| − 1 edges, then F s is a spanning tree of G in scenario s. Now we present a new model for this problem. Consider the sets E := ES ∪ ED and E + := E ∪ {uv ∈ V × V | u = v, uv ∈ / E}. Note that E + + contains all possible edges for G. Let, for all scenario s ∈ S, xs ∈ {0, 1}|E | be the characteristic vector of a general solution (forest) F s in scenario s for the SMWF problem with coordinates xsuv for all uv ∈ E + . Define, for every s ∈ S, k ∈ V and uv ∈ E + , binary variables λskuv and λskvu , such that λskuv = 1 if edge uv belongs to the solution of the scenario s and, for the referential node k, node v precedes node u in an abstract orientation Fks of the edges of F s ; and λskuv = 0, otherwise. The SMWF problem can be modeled as   (P2 ) (4) max c1uv x1uv + πs csuv xsuv uv∈ED

uv∈ES , s∈S

P. Adasme et al. / Electronic Notes in Discrete Mathematics 41 (2013) 29–36

(5) (6) (7) (8) (9) (10)

33

s.t. λskuv + λskvu = xsuv , ∀ s ∈ S, ∀ k ∈ V, ∀ uv ∈ E  λskuv ≤ 1, ∀ s ∈ S, ∀ u, k ∈ V, u = k v∈V −{u} λskkv = 0, ∀ s ∈ S, xsuv = 0, ∀ s ∈ S, x1ED = x2ED = · · · =

λ ∈ {0, 1}|V ×E

+ ×S|

,

∀ v, k ∈ V, k = v ∀ uv ∈ E + − E xPED x ∈ {0, 1}|S×E

+|

Constraints (5) state that if an edge uv is in a solution of the scenario s (i.e. xsuv = 1), then for all referential node k ∈ V either v precedes u or u precedes v in Fks . Constraints (6) limit the number of predecessors of any node u in Fks to at most one, with u = k, in order to satisfy the condition (ii) of the Theorem 3.1. Constraints (7) state that no node v can precede the referential node k in Fks (the condition (i) of the Theorem 3.1). Constraints (8) fix at zero all the corresponding variables related to the extra edges we add to make G a complete graph. Note that the λ variables of each scenario (those associated with existing edges of E) induce an abstract orientation (one for each referential k) of the edges present in any feasible solution for problem (P2 ) and these orientations satisfy the two conditions of the Theorem 3.1. Equalities in (9) are the non anticipativity constraints. In this work the set of scenarios S has a constant number of realizations of the edges weights. Thus, as the number of constraints and variables of (P2 ) is polynomial, then the resulting model is polynomial. For a non constant |S|, the model (P2 ) is pseudo-polynomial. Proposition 3.3 The projection on the space of the x variables of an optimal ¯ x¯) for (P2 ), if it is non empty, is a set of forests, one for each solution (λ, scenario, maximizing (4). Proof. Suppose, by contradiction, that F s is feasible for (P2 ) and contains a cycle C s = (V (C s ), E(C s )) in some scenario s ∈ S induced by the variables x¯s . For all the edges uv in E(C s ) we have that x¯suv = 1. Thus, by (5), we must have λskuv + λskvu = 1, for all k ∈ V (C s ) and for all uv ∈ E(C s ). In particular, these λs must verify (6) and (7). But (see the proof of the Theorem 3.1) finding one evaluation for these variables means orienting the edges in E(C s ) such that every referential node k ∈ V (C s ) have no incoming arc and the remaining nodes in V (C s ) − {k} have at most one incoming arc. Therefore, this is not possible and thus the related constraints correspond to an infeasible system of linear inequalities, contradicting the assumption that F s is feasible for (P2 ). The reader may notice that when a solution F s is a

34

P. Adasme et al. / Electronic Notes in Discrete Mathematics 41 (2013) 29–36

¯ s and x¯s whose values correspond to such forest, then there exist variables λ solution. From the edges in F s we obtain the x¯s values and if we apply an algorithm to remove iteratively the leaves from every sub-tree of the forest F s we can identify which node can be considered as sub referential. If this deletion process reaches an edge, any extremity of this edge can be considered as the sub-referential node kˆ to obtain the required orientation with respect to the ˆ main referential node k. This orientation can be obtained from the k-rooted pending sub-tree structure. The solution optimality follows from (4). 2

4

Computational experiments

Numerical experiments have been carried out on a Pentium IV, 1 GHz with 2G-RAM under windows XP. The source codes are generated with Matlab and the optimization models are solved by CPLEX 12. We report preliminary results for randomly (1–21) and arbitrarily (22–25) generated instances. The dimensions of the instances are in the Table 1. (RP1 ) and (RP2 ) correspond to the linear relaxations of the models (P1 ) and (P2 ), respectively. The first column identifies the instance number. Columns 2-5 provide the parameters |V |, N = |ED |, M = |ES | and P = |S|. Columns 6-7 and 8-9 show the number of constraints and variables for (RP1 ) and (RP2 ), respectively. For the model (RP1 ), the number of variables is of the order of O(N + M P ) and the number of constraints, O(2|V | P ). For the model (P2 ), the number of variables and constraints are of the order of O(P |V |3 ). In the first part of Table 2 we report numerical results for random generated instances. Column Inst. identifies the instance from the Table 1. Columns 2-3, 4-5, 6-7 and 89 give the optimal solution value and cpu time in seconds for (P1 ), (RP1 ), (P2 ) and (RP2 ), respectively. We have that the optimal relaxed solutions are all integer. For the random generated instances, the data for columns 6 and 7 are equal to those for columns 8 and 9, respectively. This means that these random generated instances proved easy. The model (P1 ) is very limited in solving those instance with more than 13 nodes. However, generally solving their corresponding linear relaxation (RP1 ) takes less cpu time than by using the linear relaxation (RP2 ) for these instances. Note that CPLEX finds integer solutions for (RP1 ) in smaller cpu time than for (P1 ) for these instances. In the second part of Table 2 we report numerical results for the arbitrarily generated instances of the Table 1. These instances intend to show the problem difficulty. These results give some insights about the complexity of this problem (this is an open question). Note that despite the reduced dimensions of these instances, and the fact we do not have many instances to

P. Adasme et al. / Electronic Notes in Discrete Mathematics 41 (2013) 29–36

Inst. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

Instance parameters |V | N M P 5 5 5 5 5 5 5 50 5 5 5 100 8 14 14 5 8 14 14 50 8 14 14 100 10 22 23 5 10 22 23 50 10 22 23 100 12 33 33 5 12 33 33 50 12 33 33 100 15 52 53 5 15 52 53 50 15 52 53 100 20 95 95 5 20 95 95 10 20 95 95 25 30 217 218 5 30 217 218 10 40 390 390 5 6 3 6 3 8 4 10 4 8 4 11 5 14 6 30 4

(RP1 ) # Constr. 130 1300 2600 1235 12350 24700 5065 50650 101300 20415 204150 408300 163760 1637600 3275200 5242775 10485550 26213875 5368708965 10737417930 5497558138675 171 988 1235 65476

# Var. 30 255 505 84 714 1414 137 1172 2322 198 1683 3333 317 2702 5352 570 1045 2470 1307 2397 2340 21 44 59 126

35

(RP2 ) # Constr. # Var. 675 550 6750 5500 13500 11000 2730 2380 27300 23800 54600 47600 5285 4725 52850 47250 105700 94500 9075 8250 90750 82500 181500 165000 17585 16275 175850 162750 351700 325500 41325 38950 82650 77900 206625 194750 138110 132675 276220 265350 325650 315900 684 585 2144 1904 2680 2380 11308 10556

Table 1 Dimensions of the instances. Inst. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21

(P1 ) cpu(s) 163.0000 178.6012 176.4886 460.0961 395.5182 460.7503 585.3537 589.5790 612.3388 808.9535 809.6658 855.6202 1182.2429 -

22 41 94 23 24 118 25 202 “-”: No solution

(RP1 ) cpu(s) (P2 ) cpu(s) Random generated instances 0.34 163.0000 0.32 163.0000 0.36 0.36 178.6012 0.36 178.6012 1.50 0.41 176.4886 0.41 176.4886 2.94 0.42 460.0961 0.39 460.0961 0.53 0.81 395.5182 0.75 395.5182 3.98 3.38 460.7503 1.36 460.7503 12.91 0.73 585.3537 0.69 585.3537 0.78 5.22 589.5790 2.66 589.5790 6.81 14.06 612.3388 5.50 612.3388 27.20 2.97 808.9535 2.36 808.9535 0.92 44.58 809.6658 19.00 809.6658 28.12 107.17 855.6202 37.00 855.6202 57.48 44.23 1182.2429 27.05 1182.2429 2.83 - 1142.6433 161.94 - 1108.7019 1025.52 - 1616.3587 15.67 - 1586.4263 37.13 - 1579.3486 759.53 - 2537.2853 254.66 - 2649.1216 3425.19 - 3540.0685 4682.42 Arbitrarily generated instances 1.484 43.500 0.422 41 0.011 0.515 97.333 0.391 94 0.028 0.406 120.500 0.391 118 0.032 14.375 204.500 11.563 202 0.110 found due to CPLEX shortage memory.

(RP2 )

cpu(s)

163.0000 0.36 178.6012 1.50 176.4886 2.94 460.0961 0.53 395.5182 3.98 460.7503 12.91 585.3537 0.78 589.5790 6.81 612.3388 27.20 808.9535 0.92 809.6658 28.12 855.6202 57.48 1182.2429 2.83 1142.6433 161.94 1108.7019 1025.52 1616.3587 15.67 1586.4263 37.13 1579.3486 759.53 2537.2853 254.66 2649.1216 3425.19 3540.0685 4682.42 43.500 97.333 120.500 204.500

0.010 0.006 0.006 0.044

Table 2 Numerical results for random and arbitrarily generated instances.

perform exhaustive numerical experiments, the new compact model obtains all their optimal solutions in considerable fewer cpu time than the model (P1 ).

36

5

P. Adasme et al. / Electronic Notes in Discrete Mathematics 41 (2013) 29–36

Conclusion

In this paper we propose a polynomial size formulation for the stochastic maximum weight forest problem. This formulation is based on the one for the spanning tree polytope of complete undirected graphs [4]. We extend the model in [4] to deal with forests in non complete graphs. For this, we propose a new theorem characterizing forests in undirected graphs that is important to prove the correctness of the new model. The numerical experiments evidence that this problem can be NP-hard. Thus, further research will be dedicated to answer this open question concerning the complexity of the SMWF problem. Acknowledgments Thanks to the Chilean National Commission for Scientific and Technological Research (Conicyt grant 79100020) and to the Brazilian National Council for the Research and Development (CNPq grant 201347/2011-3) for supporting this work. We are grateful to the referees by their valuable comments to improve this paper.

References [1] Conforti, M., and G. Cornu´ejols, and G. Zambelli, Extended Formulations in Combinatorial Optimization, 4OR - A Quarterly Journal of Operations Research 8 (2010), 1–48. [2] Edmonds, J., Matroids and the greedy algorithm, Mathematical Programming 1 (1971), 127–136. [3] Letournel, M., and A. Lisser, and R. Schultz, Is the polytope associated with a two stage stochastic level problem TDI?, Universit´e de Paris Sud - LRI, Report RR-1546, 2010. [4] Martin, R. K., Using separation algorithms to generate mixed integer model reformulations, Operations Research Letters 10(3) (1991), 119–128. [5] Pulleyblank, W. R., “Polyhedral combinatorics,” In Nemhauser et al., editor, Optimization, volume 1 of Handbooks in Operations Research and Management Science, North-Holland, chapter 5, 371–446, 1989. [6] Yannakakis, M., Expressing Combinatorial Optimization Problems by Linear Programs, Journal of Computer and System Sciences, 43(3) (1991), 441–466.