Corrigendum to “A scenario decomposition algorithm for 0–1 stochastic programs” [Oper. Res. Lett. 41(6) (2013) 565–569]


Operations Research Letters 43 (2015) 215–217


Erratum

Shabbir Ahmed
School of Industrial & Systems Engineering, Georgia Institute of Technology, 765 Ferst Drive, Atlanta, GA 30332, United States

Article history: Received 25 November 2014; Accepted 12 January 2015; Available online 28 January 2015.

DOI of original article: http://dx.doi.org/10.1016/j.orl.2013.07.009

Table 1. Values of f(x, ξ) for an example.

Solution   ξ^1   ξ^2
x_1          0     0
x_2         −1     2
x_3          2    −1

Section 3 of the paper ‘‘A scenario decomposition algorithm for 0–1 stochastic programs’’, Operations Research Letters, 41 (6): 565–569, 2013, contains some errors. The result in Proposition 2 had to be stated with respect to ϵ-optimality as opposed to optimality. In this erratum the entire Section 3 is reproduced with the errors corrected.

3. Optimality of scenario solutions

The algorithm proposed in the previous section explores solutions to the scenario subproblems as candidates for solutions to the overall problem. This is a departure from other methods based on solving the dual problem (3), where candidate solutions are generated by aggregating solutions to the scenario subproblems. In this section we attempt to provide some rationale for why the solutions to the scenario subproblems themselves can be good solutions to the overall problem when the problem of interest is a sample average approximation problem. Consider a 0–1 stochastic program

(P): min { E[f(x, ξ)] : x ∈ X ⊆ {0, 1}^n },

and its sample average approximation

(SAA_N): min { (1/N) Σ_{i=1}^N f(x, ξ^i) : x ∈ X ⊆ {0, 1}^n },

corresponding to an iid sample {ξ^i}_{i=1}^N. The scenario subproblem corresponding to the i-th scenario/sample is

(P_i): min { f(x, ξ^i) : x ∈ X ⊆ {0, 1}^n }.

Let S_N^ϵ be the set of ϵ-optimal solutions of (SAA_N) for a given ϵ > 0, i.e.

S_N^ϵ = { x ∈ X : (1/N) Σ_{i=1}^N f(x, ξ^i) ≤ (1/N) Σ_{i=1}^N f(y, ξ^i) + ϵ ∀ y ∈ X },

and let S_i be the set of optimal solutions of (P_i) for i = 1, ..., N. Note that these are random sets since they depend on the sample drawn. We investigate the following question: What is the probability that the set of solutions to one of the scenario problems (P_i) contains an ϵ-optimal solution to (SAA_N)? More precisely, we would like to estimate Pr[S_N^ϵ ∩ (∪_{i=1}^N S_i) ≠ ∅] and understand its dependence on X and N. Trivially, if |X| ≤ 2 and N ≥ 2 then the above probability is 1. On the other hand, it is not hard to construct examples where this probability is arbitrarily small. For example, suppose X = {x_1, x_2, x_3}, ξ has two realizations ξ^1 and ξ^2 with equal probability, and the values of f(x, ξ) are as in Table 1. Then for N reasonably large and ϵ sufficiently small, S_N^ϵ = {x_1} with very high probability, while ∪_{i=1}^N S_i = {x_2, x_3} with probability one.
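To make the arithmetic behind the example explicit: E[f(x_1, ξ)] = 0 while E[f(x_2, ξ)] = E[f(x_3, ξ)] = 1/2, so the sample average objective singles out x_1; yet scenario ξ^1 is minimized by x_2 and scenario ξ^2 by x_3, so no scenario subproblem ever returns x_1. The following minimal simulation sketch (hypothetical code, not part of the paper) verifies this numerically:

```python
import numpy as np

rng = np.random.default_rng(0)

# Objective values f(x, xi) from Table 1: rows are x1, x2, x3,
# columns are the two equally likely realizations xi^1, xi^2.
f = np.array([[0, 0],
              [-1, 2],
              [2, -1]])

N = 1000
sample = rng.integers(0, 2, size=N)                 # iid scenario indices
f_N = f[:, sample].mean(axis=1)                     # SAA objective of x1, x2, x3

eps = 0.1
S_N_eps = np.flatnonzero(f_N <= f_N.min() + eps)    # eps-optimal SAA solutions
S_i_union = np.unique(f[:, sample].argmin(axis=0))  # union of scenario argmins

print("S_N^eps:", S_N_eps + 1)       # {x1} with very high probability
print("union S_i:", S_i_union + 1)   # {x2, x3}; never contains x1
```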

In the following we show that if the collection of random objective functions {f(x, ξ)}_{x∈X} of (P) is jointly normal with a nonnegative and strictly diagonally dominant covariance matrix, then

lim_{N→∞} Pr[ S_N^ϵ ∩ (∪_{i=1}^N S_i) ≠ ∅ ] = 1

exponentially fast. Before proceeding with a proof of the above result we make some remarks. The condition of a nonnegative covariance matrix implies that the objective functions at different solutions are nonnegatively correlated, and therefore it is perhaps not very surprising that a scenario solution is likely to be near optimal for the sample average problem. Recall that a matrix is strictly diagonally dominant if the diagonal element in each row is greater than the sum of the absolute values of all other elements in that row. This condition can be ensured by adding independent zero-mean normal noise to the objective function value of each solution, without changing the optimal solution or optimal value of (P).
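To illustrate this last remark, here is a minimal sketch (hypothetical code; the function name, margin parameter, and example matrix are ours, not the paper's) that computes a noise variance τ² for which Σ + τ²I, the covariance obtained after adding independent N(0, τ²) noise to each solution's objective, is strictly diagonally dominant. Since the noise has mean zero, every E[f(x, ξ)] and hence the optimal solution and value of (P) are unchanged.

```python
import numpy as np

def noise_variance_for_dominance(Sigma, margin=1e-6):
    """Return tau**2 so that Sigma + tau**2 * I is strictly diagonally
    dominant: each diagonal entry exceeds the sum of the absolute
    off-diagonal entries in its row (by at least `margin`)."""
    off = np.abs(Sigma).sum(axis=1) - np.abs(np.diag(Sigma))
    deficit = off - np.diag(Sigma)          # how far each row falls short
    return max(deficit.max() + margin, 0.0)

# Example: a nonnegative covariance matrix that is not diagonally dominant.
Sigma = np.array([[1.0, 0.9, 0.9],
                  [0.9, 1.0, 0.9],
                  [0.9, 0.9, 1.0]])
tau2 = noise_variance_for_dominance(Sigma)
Sigma_noisy = Sigma + tau2 * np.eye(3)      # covariance after adding noise
```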

We will need the following lemma.

Lemma 1. Let W be a (K+1)-dimensional Gaussian random vector, W ∼ N(µ, Σ), with the following properties:
1. for some ϵ ≥ 0, µ_0 ≤ µ_k + ϵ for all k = 1, ..., K, and
2. the covariance matrix Σ is nonnegative and strictly diagonally dominant.
Condition 2 implies that there exists δ such that 0 < δ² := min_{k≠l} { σ_k² − σ_{kl}, σ_l² − σ_{kl} }. Then

Pr[ W_0 ≤ W_k ∀ k = 1, ..., K ] ≥ (Φ(−√2 ϵ/δ))^K,

where Φ is the standard normal cdf.

Proof. Since µ_0 ≤ µ_k + ϵ for all k = 1, ..., K, we have

Pr[ W_0 ≤ W_k ∀ k = 1, ..., K ] ≥ Pr[ (W_0 − µ_0) − (W_k − µ_k) ≤ −ϵ ∀ k = 1, ..., K ].

Let U_k := (W_0 − µ_0) − (W_k − µ_k) for k = 1, ..., K, and note that U is a K-dimensional Gaussian random vector with E[U_k] = 0, E[U_k²] = σ_0² + σ_k² − 2σ_{0k} ≥ 2δ², and E[U_k U_l] = σ_0² − σ_{0l} − σ_{0k} + σ_{kl} ≥ 0 for all k, l = 1, ..., K. Consider another K-dimensional Gaussian random vector V with E[V_k] = 0, E[V_k²] = E[U_k²], and E[V_k V_l] = 0 for all k ≠ l. Since E[V_k V_l] ≤ E[U_k U_l] for all k, l = 1, ..., K, by Slepian's inequality [2]

Pr[ U_k ≤ −ϵ ∀ k ] ≥ Pr[ V_k ≤ −ϵ ∀ k ] = Π_{k=1}^K Pr[ V_k ≤ −ϵ ],

where the last identity follows from independence. Since E[V_k²] ≥ 2δ², for any k

Pr[ V_k ≤ −ϵ ] = Pr[ V_k/√(E[V_k²]) ≤ −ϵ/√(E[V_k²]) ] ≥ Pr[ V_k/√(E[V_k²]) ≤ −√2 ϵ/δ ] = Φ(−√2 ϵ/δ),

and the result follows. □
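As a numerical sanity check on Lemma 1, the following Monte Carlo sketch (hypothetical code with arbitrary parameter choices, not from the paper) compares the empirical value of Pr[W_0 ≤ W_k ∀ k] against the bound (Φ(−√2 ϵ/δ))^K for a mean vector and covariance matrix satisfying conditions 1 and 2:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

K, eps = 3, 0.2
mu = np.concatenate(([0.0], np.full(K, eps)))           # mu_0 <= mu_k + eps holds
Sigma = 0.2 * np.ones((K + 1, K + 1)) + np.eye(K + 1)   # nonneg., strictly diag. dominant

# Here sigma_k^2 = 1.2 and sigma_kl = 0.2, so delta^2 = 1.2 - 0.2 = 1.0.
delta = 1.0

W = rng.multivariate_normal(mu, Sigma, size=200_000)
empirical = np.mean(np.all(W[:, :1] <= W[:, 1:], axis=1))
bound = norm.cdf(-np.sqrt(2) * eps / delta) ** K

print(f"Pr[W0 <= Wk for all k] ~ {empirical:.3f} >= bound {bound:.3f}")
```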

Proposition 2. If the collection of random objective functions {f(x, ξ)}_{x∈X} of (P) is jointly normal with a nonnegative and strictly diagonally dominant covariance matrix, then for any ϵ > 0,

lim_{N→∞} Pr[ S_N^ϵ ∩ (∪_{i=1}^N S_i) ≠ ∅ ] = 1

exponentially fast.

Proof. Let α = ϵ/2 and let S_*^α be the set of α-optimal solutions to (P). Let f_N(x) = (1/N) Σ_{i=1}^N f(x, ξ^i) and f(x) = E[f(x, ξ)]. Note that

S_*^α ⊄ S_N^ϵ ⇒ ∃ x, y ∈ X s.t. f(x) ≤ f(y) + α, f_N(x) > f_N(y) + ϵ
⇒ ∃ x, y ∈ X s.t. f_N(x) − f(x) > f_N(y) − f(y) + ϵ/2
⇒ ∃ x, y ∈ X s.t. g_N(x, y) − g(x, y) > ϵ/2,

where g_N(x, y) = f_N(x) − f_N(y) and g(x, y) = f(x) − f(y). Note that E[g_N(x, y)] = g(x, y). Let g(x, y, ξ) = f(x, ξ) − f(y, ξ). Applying large deviations theory (cf. [1]) to g(x, y, ξ) we have that

Pr[ g_N(x, y) − E[g_N(x, y)] > ϵ/2 ] ≤ e^{−N ϵ²/(8σ_max²)},

where σ_max² = max_{x,y∈X} V[g(x, y, ξ)]. Thus

Pr[ S_*^α ⊄ S_N^ϵ ] ≤ Σ_{x∈X} Σ_{y∈X} Pr[ g_N(x, y) − E[g_N(x, y)] > ϵ/2 ] ≤ |X|² e^{−N ϵ²/(8σ_max²)},

and therefore

Pr[ S_*^α ⊆ S_N^ϵ ] ≥ 1 − |X|² e^{−N ϵ²/(8σ_max²)}.

Now consider x* ∈ S_*^α. We have

Pr[ S_N^ϵ ∩ (∪_{i=1}^N S_i) ≠ ∅ ] ≥ Pr[ [S_*^α ∩ (∪_{i=1}^N S_i) ≠ ∅] ∧ [S_*^α ⊆ S_N^ϵ] ]
≥ Pr[ S_*^α ∩ (∪_{i=1}^N S_i) ≠ ∅ ] + Pr[ S_*^α ⊆ S_N^ϵ ] − 1,

where the last inequality uses Boole's inequality. Moreover,

Pr[ S_*^α ∩ (∪_{i=1}^N S_i) ≠ ∅ ] ≥ Pr[ x* ∈ ∪_{i=1}^N S_i ]
= 1 − Π_{i=1}^N Pr[ x* ∉ S_i ]
= 1 − Π_{i=1}^N (1 − Pr[ x* ∈ S_i ])
= 1 − Π_{i=1}^N (1 − Pr[ f(x*, ξ) ≤ f(y, ξ) ∀ y ∈ X \ {x*} ]).

Applying Lemma 1 with K = |X \ {x*}|, W_0 = f(x*, ξ) and W_k = f(y, ξ) for y ∈ X \ {x*} (condition 1 of the lemma holds with ϵ = α since x* is α-optimal for (P)), we get

Pr[ f(x*, ξ) ≤ f(y, ξ) ∀ y ∈ X \ {x*} ] ≥ (Φ(−√2 α/δ))^{|X|}

for some δ > 0 depending on the covariance matrix of {f(x, ξ)}_{x∈X}. Thus

Pr[ S_N^ϵ ∩ (∪_{i=1}^N S_i) ≠ ∅ ] ≥ (1 − (1 − (Φ(−√2 ϵ/(2δ)))^{|X|})^N) − |X|² e^{−N ϵ²/(8σ_max²)}.  (1)

Since 0 < Φ(−√2 ϵ/(2δ)) < 1 for δ > 0, taking limits with respect to N on both sides of (1) we get the desired result. □
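The asymptotic behavior in Proposition 2 is easy to observe numerically. The sketch below (hypothetical code; the instance size and all parameter values are our own choices, not the paper's) draws jointly normal objectives with a nonnegative, strictly diagonally dominant covariance matrix and estimates Pr[S_N^ϵ ∩ (∪_{i=1}^N S_i) ≠ ∅] for increasing N:

```python
import numpy as np

rng = np.random.default_rng(2)

n_sol, eps, trials = 5, 0.1, 500
mu = rng.normal(size=n_sol)                            # E[f(x, xi)] for each x in X
Sigma = 0.1 * np.ones((n_sol, n_sol)) + np.eye(n_sol)  # nonneg., strictly diag. dominant

for N in (5, 20, 100):
    hits = 0
    for _ in range(trials):
        F = rng.multivariate_normal(mu, Sigma, size=N)  # F[i, x] = f(x, xi^i)
        f_N = F.mean(axis=0)                            # SAA objectives
        S_N_eps = set(np.flatnonzero(f_N <= f_N.min() + eps))
        scenario_sols = set(F.argmin(axis=1))           # union of the S_i
        hits += bool(S_N_eps & scenario_sols)
    print(f"N={N:4d}: Pr[S_N^eps intersects union S_i] ~ {hits / trials:.3f}")
```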

We close this section with some remarks on Proposition 2. Note that when ϵ is very small relative to δ, Φ(−√2 ϵ/(2δ)) ≈ 1/2, and so the right-hand side of (1) is approximately (1 − (1 − (1/2)^{|X|})^N) − |X|² e^{−NC}, where C is a small constant. This indicates that the smaller the solution set |X| is relative to the sample size N, the higher the probability that one of the scenario solutions is optimal. If |X| is large relative to N, then the bound in (1) is very loose, even negative. Consequently inequality (1) does not lead to any meaningful finite sample size estimates, but it implies asymptotic behavior with respect to N for fixed X. Proposition 2 can be extended to more general distributions under different assumptions. For example, if the random vector {f(x*, ξ) − f(y, ξ)}_{y∈X\{x*}} (where x* is an ϵ-optimal solution of (P)) has a density and its support contains a ball centered at 0 of radius at least ϵ, then there exists p > 0 such that Pr[ f(x*, ξ) ≤ f(y, ξ) ∀ y ∈ X \ {x*} ] ≥ p. We can then replace (Φ(−√2 ϵ/(2δ)))^{|X|} with p in (1) and the conclusion of Proposition 2 holds.
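To see quantitatively how the right-hand side of (1) degrades as |X| grows, one can evaluate the approximation above for a few values (a small illustration with an arbitrary stand-in constant, not a computation from the paper):

```python
import numpy as np

# Approximate RHS of (1) in the regime eps << delta, where
# Phi(-sqrt(2)*eps/(2*delta)) ~ 1/2. C = 0.1 is an arbitrary
# stand-in for eps^2 / (8 * sigma_max^2).
def rhs_approx(card_X, N, C=0.1):
    return (1 - (1 - 0.5 ** card_X) ** N) - card_X ** 2 * np.exp(-N * C)

for card_X in (3, 10, 20):
    # Near 1 for |X| = 3; loose for |X| = 10; negative for |X| = 20.
    print(f"|X| = {card_X:2d}, N = 100: RHS ~ {rhs_approx(card_X, 100):+.3f}")
```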


References

[1] Anton J. Kleywegt, Alexander Shapiro, Tito Homem-de-Mello, The sample average approximation method for stochastic discrete optimization, SIAM J. Optim. 12 (2) (2002) 479–502.
[2] David Slepian, The one-sided barrier problem for Gaussian noise, Bell Syst. Tech. J. 41 (2) (1962) 463–501.
