Concentration of Measure for Chance-Constrained Optimization

Proceedings, 6th IFAC Conference on Analysis and Design of Hybrid Systems, Oxford, UK, July 11-13, 2018
IFAC PapersOnLine 51-16 (2018) 277–282

Sadegh Soudjani ∗   Rupak Majumdar ∗∗

∗ School of Computing, Newcastle University, United Kingdom
∗∗ Max Planck Institute for Software Systems, Germany

Abstract: Chance-constrained optimization problems optimize a cost function in the presence of probabilistic constraints. They are convex in very special cases and, in practice, they are solved using approximation techniques. In this paper, we study approximation of chance constraints for the class of probability distributions that satisfy a concentration of measure property. We show that using concentration of measure, we can transform chance constraints to constraints on expectations, which can then be solved based on scenario optimization. Our approach depends solely on the concentration of measure property of the uncertainty and does not require the objective or constraint functions to be convex. We also give bounds on the required number of scenarios for achieving a certain confidence. We demonstrate our approach on a non-convex chance-constrained optimization, and benchmark our technique against alternative approaches in the literature on a chance-constrained LQG problem.

Keywords: Chance-Constrained Optimization, Non-Convex Scenario Program, Concentration of Measure, Stochastic Optimization, Randomized Optimization, Linear Quadratic Gaussian

© 2018, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved. Peer review under responsibility of International Federation of Automatic Control. DOI: 10.1016/j.ifacol.2018.08.047

1. INTRODUCTION

Chance-constrained programming (CCP) (Prékopa (1995)) is an important technique to optimize a cost function in the presence of random parameters. It arises in many problems in engineering and finance, where a full description of the system and the effect of all factors may not be available. For example, in controller synthesis problems it is desirable to pick the "best" control input among the set of possible valid inputs in the presence of disturbances for which we may only have a stochastic model. This problem is cast as an optimization question which minimizes an objective function modeling the system performance, while ensuring that the system constraints are met "as much as possible." That is, while constraints may be violated under rare, unexpected events, by modeling or approximating the distribution of the random parameter, it makes sense to call decisions feasible (in a stochastic sense) whenever they are feasible with high probability. CCP is in contrast to the robust approach, which bounds the worst-case range of disturbances and can consequently be extremely conservative.

Unfortunately, chance-constrained optimization problems have feasible domains which are in general non-convex. Thus, other than in a few restricted cases, numerical methods must be employed for obtaining a solution. A widely used numerical method is randomized optimization (Campi and Garatti (2008); Calafiore (2010)). In this approach, one constructs a scenario program (SP) by taking samples from the uncertain variables and requires that the (chance) constraints hold for all observed values of the samples. If the objective and constraint functions are convex, the required number of samples can be selected to guarantee that, with certain confidence, the solution of the SP is feasible for the original CCP. The main advantage of convex SP is that no knowledge of the properties of the uncertainty is required, but the guarantee holds under the special condition that the objective and constraint functions are convex with respect to the decision variables. Moreover, increasing the number of samples in general reduces the chance of getting a feasible SP and decreases the (sub)optimal performance. In recent work, Grammatico et al. (2016) address random non-convex programs by solving multiple SPs with different convex objective functions, but they restrict the constraint function to be separable non-convex.

In this paper, we show that if the uncertainty distribution satisfies a concentration of measure property, then one can significantly generalize the scenario-based approach. The concentration of measure phenomenon roughly states that, if a set in a probability space has measure at least one half, "most" of the points in the probability space are "close" to the set, and that if a function on the probability space is regular enough, the chance that this function deviates too much from its expectation (or median) is very small.

In particular, given a CCP, we construct a SP which takes samples from the uncertain parameters but, instead of forcing the constraint to hold for all individual samples, requires that the constraint is satisfied on average over the samples with a predefined margin from its boundary. With an appropriate choice for the margin, which depends on the concentration of measure property, we can relate the feasible solutions of the original CCP with those of the SP. Our approach does not assume convexity of the objective or constraint functions, but only Lipschitz continuity of the constraint function w.r.t. the uncertainty parameter, which is a reasonable and less restrictive assumption. Thus, we show that concentration of measure can be used to significantly expand the scope of randomized optimization.

Concentration of measure is a powerful property of distributions, and has been used in many problems in combinatorics (Alon and Spencer (2016)), the analysis of randomized algorithms (Barvinok (1997)), and discrete optimization (Dubhashi and Panconesi (2009)). These include "classical" Chernoff bounds for sums of independent Bernoulli variables and Gaussian concentration inequalities, up to the more advanced Poincaré, log-Sobolev, and Talagrand inequalities.


In the general form, the inequalities take the form

P(|f − E(f)| > t) ≤ c · e^{−c′t},

bounding the probability that f deviates from its expected value. While classical results show the existence of such constants c and c′, for practical applications we are also interested in optimizing them: the smaller c and the larger c′, the less conservative the solution obtained from the SP. Thus, we have revisited proofs of concentration of measure (Ledoux (1999)) with an attempt to improve the constants.

The rest of this paper is organized as follows. In Section 2 we define chance-constrained programs and the concentration of measure property for the uncertainty. In Section 3 we discuss the construction of a scenario program and the selection of the number of scenarios, and connect these to the original CCP through concentration of measure. Section 4 gives an overview of the concentration phenomenon together with improved bounds. We use a non-convex CCP as a running example, and in Section 5 we compare our approach with alternative techniques from the literature on an LQG problem.

2. PRELIMINARIES: CCP

Consider a random variable δ ∈ Ω ⊆ R^p defined on a probability space (Ω, F, P). With respect to this random variable, we define the Chance-Constrained Program

CCP:  min_{x∈X} J(x)  s.t. P({δ ∈ Ω | g(x, δ) ≤ 0}) ≥ 1 − ε,   (1)

where x ∈ X is the decision variable belonging to the compact admissible set X ⊂ R^n, J : R^n → R is a lower semi-continuous cost function, g : R^n × R^p → R is the constraint function, and ε ∈ (0, 1) is the constraint-violation tolerance. We assume that for any fixed x̄ ∈ X the mapping δ → g(x̄, δ) is measurable, and for any fixed δ̄ ∈ Ω the mapping x → g(x, δ̄) is lower semi-continuous.

Theorem 1. The CCP (1) is well-defined and attains a solution if it is feasible.

The proof of Theorem 1 relies on the reverse Fatou lemma (Royden (1988)) and is omitted here due to space limitations. Note that here we do not put any assumption on the convexity of the function g(·, δ) or on the cost function J(·), as is common practice in the scenario approach for optimization (cf. Campi and Garatti (2011); Grammatico et al. (2016)).

In this paper, we make the following assumption on the probability space (Ω, F, P).

Assumption 2. (Concentration of Measure). The probability space (Ω, F, P) satisfies the inequality

P(|f(δ) − E(f(δ))| ≤ t) ≥ 1 − h(t),  ∀t ≥ 0,   (2)

for any function f : Ω → R that belongs to a class of functions C such that g(x, ·) ∈ C for any x ∈ X, and where h : R≥0 → [0, 1] is a monotonically decreasing function.

We use Assumption 2 in the next section to construct a scenario program for finding a (possibly sub-optimal) solution of CCP (1). We discuss in Section 4 how this assumption holds for many well-known distributions.

Note that the monotonicity of h(·) is not restrictive, since we can always replace h(·) with a monotonically decreasing function h̄(·) satisfying h̄(·) ≥ h(·); inequality (2) remains valid with h̄(·).
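To illustrate Assumption 2 concretely, the following sketch empirically checks an inequality of the form (2) for a standard Gaussian uncertainty and a 1-Lipschitz test function, using the Gaussian tail h(t) = 2e^{−t²/2} derived later in Section 4.1; the test function and sample count are our illustrative choices, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
delta = rng.standard_normal(100_000)   # samples of the uncertainty δ
f = np.abs(delta - 0.5)                # a 1-Lipschitz test function (ρ = 1)
dev = np.abs(f - f.mean())             # |f(δ) − E f(δ)|, with E f estimated empirically

for t in [0.5, 1.0, 2.0, 3.0]:
    lhs = np.mean(dev <= t)                        # empirical P(|f − E f| ≤ t)
    rhs = max(1.0 - 2.0 * np.exp(-t**2 / 2), 0.0)  # 1 − h(t) from the Gaussian bound
    print(f"t={t}: empirical {lhs:.4f} >= bound {rhs:.4f}")
```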

The following non-convex CCP is used for demonstrating our approach.

Example 3. Consider a uniformly distributed random variable δ ∼ U[0, 1] and the following CCP:

min x_2  s.t. x_1 ∈ [0, 1], x_2 ∈ R,
P({δ ∈ Ω : x_2 ≥ 1 − |x_1 − δ|}) ≥ 1 − ε.   (3)

Since any (x_1, x_2) ∈ R² with x_2 ≥ 1 is feasible and any with x_2 < 0 is infeasible, we can put X = [0, 1] × [0, 1]. In this example g(x, δ) = 1 − x_2 − |x_1 − δ|, which is clearly non-convex in x on X for any δ ∈ [0, 1], and the feasible domain of this CCP is non-convex. The exact optimal value of CCP (3) is 1 − ε, obtained at x = (0, 1 − ε) or x = (1, 1 − ε).
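A quick Monte Carlo check of this claim (a sketch; ε and the sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
eps = 0.1
delta = rng.uniform(0.0, 1.0, 200_000)

# Candidate optimum x = (x1, x2) = (0, 1 − ε) from the text.
x1, x2 = 0.0, 1.0 - eps
p_hat = np.mean(x2 >= 1.0 - np.abs(x1 - delta))  # estimate P(x2 ≥ 1 − |x1 − δ|)
print(p_hat)  # ≈ 1 − ε = 0.9: the chance constraint holds with equality
```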

3. FROM CCP TO SP

We define the following Expected-Constrained Program (ECP), which is tightly connected to CCP (1) under Assumption 2, for each choice of the parameter β ≥ 0:

ECP(β):  min_{x∈X} J(x)  s.t. E g(x, δ) + β + h^{-1}(ε) ≤ 0.   (4)

Proposition 4. The feasible domain of CCP (1) includes the feasible domain of the Expected-Constrained Program (4) for all values of β ≥ 0.

The expectation operator in (4) still prevents us from efficiently computing a solution. Therefore, we define the following Scenario Program (SP) that replaces the expectation with its empirical mean:

SP(γ):  min_{x∈X, γ_i∈R} J(x)
        s.t. g(x, δ^i) ≤ γ_i,  i = 1, ..., N,
        (1/N) Σ_{i=1}^N γ_i + γ + h^{-1}(ε) ≤ 0,   (5)

where δ^i, i = 1, 2, ..., N, are independent samples, each defined on the probability space (Ω, F, P), and γ > 0 is a design parameter.

We make the following assumption to formally connect the two optimizations (4) and (5).

Assumption 5. The variance of g(x, δ) is bounded: there is a constant M_v such that

E((g(x, δ) − E(g(x, δ)))²) ≤ M_v,  ∀x ∈ X.

This assumption already holds if g(x, δ) is bounded.

Theorem 6. Under Assumption 5, any feasible solution of ECP(β) in (4) is a feasible solution of SP(γ) in (5) with probability at least

1 − M_v / (N(β − γ)²),

for any β > γ ≥ 0 and any number of samples N.

Corollary 7. Under Assumption 5, if ECP(β) in (4) is feasible, then SP(γ) in (5) is also feasible with confidence 1 − α ∈ (0, 1) when β > γ ≥ 0 and the number of samples N satisfies N ≥ M_v/[α(β − γ)²].
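Corollary 7 translates directly into a sample-size rule; a minimal helper (the function name and example numbers are ours):

```python
import math

def scenario_count(M_v: float, alpha: float, beta: float, gamma: float) -> int:
    """Smallest N with N >= M_v / (alpha * (beta - gamma)**2), per Corollary 7."""
    assert beta > gamma >= 0 and 0 < alpha < 1
    return math.ceil(M_v / (alpha * (beta - gamma) ** 2))

# e.g. M_v = 1/3, confidence 0.99 (alpha = 0.01), beta = 0.05, gamma = 0.025:
print(scenario_count(1 / 3, 0.01, 0.05, 0.025))  # 53334
```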

Theorem 6 relates the feasibility of ECP (4) to that of SP (5).


In practice it is desirable to have confidence that the solution of SP (5) is a (possibly sub-optimal) solution for CCP (1). In order to provide such a confidence, we require one of the following technical assumptions.

Assumption 8. ECP(β_0) in (4) is feasible for some β_0 > 0.

Assumption 9. SP (5) is feasible for any N ∈ N and any choice of samples δ^i ∈ Ω.

In the case of Assumption 8, we get a lower bound on the probability of having a feasible SP (5) according to Theorem 6. In the case of Assumption 9, SP (5) always has an optimal value since X is compact, and its optimal value can in principle be written as a function of the samples δ^i ∈ Ω. Note that these assumptions are less restrictive than the one posed in (Campi and Garatti (2008)), as we do not require a non-empty interior for the feasible domain of SP (5).

Proposition 10. Under Assumptions 5 and 8, if SP(γ) in (5) is feasible for samples δ^i, then ECP(0) in (4) is also feasible with confidence 1 − α, where the number of samples N satisfies

N ≥ (M_v/α) [1/γ² + 1/(β_0 − γ)²].   (6)

Moreover, given the concentration inequality (2), CCP (1) will also be feasible with confidence 1 − α.

Remark 11. This result gives a tradeoff between optimality and the number of samples for achieving a confidence. We solve SP(γ) with γ = β_0/2 and the associated number of samples in (6). If the optimal solution happens to be at the boundary of the optimization constraint, we can improve the solution, at the cost of larger computational effort, by reducing γ and increasing the number of samples.

Proposition 12. Under Assumptions 5 and 9, any solution of SP(γ) in (5) is a feasible solution of ECP(0) in (4) (and hence a feasible solution of CCP (1) under Assumption 2) with confidence 1 − α if γ > 0 and the number of samples N satisfy N ≥ M_v/(αγ²).

Example 13. The uniformly distributed random variable δ ∼ U[0, 1] satisfies the inequality

P(δ : |g(x, δ) − E(g(x, δ))| ≤ t) ≥ min{2t, 1},

for all t ≥ 0 and x ∈ R². This enables us to obtain the same results for the uniform distribution with h(t) = max{1 − 2t, 0}. The ECP will be

min x_2  s.t. x_1, x_2 ∈ [0, 1],
1 − x_2 − E|x_1 − δ| + β + (1 − ε)/2 ≤ 0.   (7)

The feasible domain of ECP (7) is x_2 ≥ 1 + x_1 − x_1² + β − ε/2. As theoretically shown, this domain is a subset of the feasible domain of CCP (3). This gives the sub-optimal value (1 − ε/2 + β) for CCP (3), attained at exactly the same optimal points as in CCP (3). Now let us examine the following SP, which does not require the explicit computation of E|x_1 − δ|:

min x_2
s.t. x_1, x_2 ∈ [0, 1], γ_i ∈ R,
1 − x_2 − |x_1 − δ^i| ≤ γ_i,  i = 1, ..., N,
(1/N) Σ_{i=1}^N γ_i + γ + (1 − ε)/2 ≤ 0.   (8)

The solution of this optimization is

x_2* = 1 − ε/2 + γ + |(1/N) Σ_{i=1}^N δ^i − 1/2|

if x_2* ≤ 1; otherwise the optimization is infeasible.


As we see, γ > 0 can be selected sufficiently small and N sufficiently large such that the optimization is feasible. The optimal value is attained at one of the two boundary points x_1 = 0 or x_1 = 1, depending on the term inside the absolute value.

In order to theoretically relate (8) and (3), we need to check the imposed assumptions. We verify Assumption 8 by trying a single point in (7). Taking x_1 = 0 reveals that (7) is feasible for β_0 ≤ ε/2. According to Proposition 10, we have

x_2* = 1 − ε/2 + β_0/2 + |(1/N) Σ_{i=1}^N δ^i − 1/2|

as a sub-optimal solution for CCP (3) with confidence (1 − α) if we take N ≥ 8/(3αβ_0²). This solution is very close to 1 − ε/2 when we take the limits β_0 → 0 and N → +∞. On the other hand, if we try x_1 = 0.5, we get that β_0 ≤ ε/2 − 1/4, which puts the constraint ε > 0.5 for drawing the same conclusion.

We can make the choice of x for verifying Assumption 8 more intelligent: intuitively, we take N_0 samples and minimize the empirical mean of g(x, δ). Algorithm 1 presents the combined approach of verifying Assumption 8 and finding a feasible solution for CCP (1).

Algorithm 1 Computing a (possibly sub-optimal) solution for CCP (1).
input: CCP (1) with J, g, ε; confidence 1 − α; number of samples N_0; constant M_v and function h
1: do:
2:   take N_0 independent samples δ^i
3:   compute (x̂, γ̂, γ̂_i) = arg max{γ | x ∈ X, γ, γ_i ∈ R} under the constraints in (5)
4:   compute β_0 = −E(g(x̂, δ)) − h^{-1}(ε)
5: until β_0 > 0
6: select the number of samples N ≥ 8M_v/(αβ_0²)
7: take N independent samples δ^i
8: solve SP (5) with γ = β_0/2 to get x*, if it is feasible
output: feasible point x* for CCP (1) with confidence 1 − α

In this algorithm, the do-until loop tries to find x̂ for verifying Assumption 8. In step 6 the number of samples is selected according to the outcome of this loop, and then SP (5) is solved. Note that the do-until loop terminates with probability one if Assumption 8 holds. The choice of N_0 can also be made adaptive w.r.t. the outcome of the optimization in step 3.
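A sketch of Algorithm 1 instantiated on Example 3, assuming the closed-form pieces derived above (the exact expectation E|x_1 − δ| = x_1² − x_1 + 1/2, the variance bound M_v = 1/3, and the closed-form SP solution); the grid search in step 3 and the pilot size N_0 are our choices.

```python
import numpy as np

rng = np.random.default_rng(2)
eps, alpha = 0.1, 0.01
h_inv = (1.0 - eps) / 2.0               # h^{-1}(ε) for h(t) = max{1 − 2t, 0}
M_v = 1.0 / 3.0                         # variance bound for g in Example 3

def g(x1, x2, d):                       # constraint function of Example 3
    return 1.0 - x2 - np.abs(x1 - d)

beta0 = -np.inf
while beta0 <= 0:                       # do-until loop, steps 1-5
    d0 = rng.uniform(0.0, 1.0, 1000)    # step 2: N0 = 1000 pilot samples
    grid = np.linspace(0.0, 1.0, 101)
    # step 3: minimize the empirical mean of g over a grid, fixing x2 = 1
    _, x1_hat = min((g(x1, 1.0, d0).mean(), x1) for x1 in grid)
    # step 4, using the exact expectation E g(x̂, δ) = −(x1² − x1 + 1/2)
    beta0 = (x1_hat**2 - x1_hat + 0.5) - h_inv

N = int(np.ceil(8 * M_v / (alpha * beta0**2)))           # step 6
d = rng.uniform(0.0, 1.0, N)                             # step 7
gamma = beta0 / 2.0
x2_star = 1.0 - eps / 2.0 + gamma + abs(d.mean() - 0.5)  # step 8 via closed form
print(beta0, N, x2_star)
```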


4. CONCENTRATION OF MEASURE

Assumption 2 has a central role in our approach, but establishing inequality (2) is generally difficult for multivariate probability distributions. In this section, we discuss concentration of measure, which enables us to identify classes of distributions that satisfy this assumption. The most important feature of this phenomenon is that it is dimension free, and thus extends results from one dimension to product probability spaces.

The concentration of measure phenomenon can be explained in terms of sets. It roughly states that, if a set A in a probability space Ω has measure at least one half, "most" of the points of Ω are "close" to A. However, we are more often interested in functions rather than sets in order to satisfy Assumption 2 with inequalities of the form (2). The concentration of measure phenomenon also has an interpretation in terms of functions: if a function f on a metric space (Y, d) equipped with a probability measure µ is sufficiently regular, it is very concentrated around its median (hence around its mean).

In the following we discuss concentration of measure inequalities based on Poincaré and log-Sobolev inequalities. We also discuss the modified log-Sobolev inequality for exponential distributions. In measure theory, the focus of these results (e.g., Ledoux (2005)) is mainly on proving the existence of the function h(·) in (2). We have improved the concentration of measure results founded on these inequalities and have classified well-known distributions that satisfy one of these inequalities.

Let (Y, B, µ) be a probability space. We denote by E integration w.r.t. µ, and by (L^p, ‖·‖_p) the Lebesgue spaces over (Y, B, µ). We further denote the variance of any function f ∈ L² by

Var(f) := E((f − E(f))²) = E(f²) − (E(f))².   (9)

If f is a non-negative function on Y, we define the entropy of f w.r.t. µ as

Ent(f) := E(f log f) − E(f) log E(f).   (10)

In order to have a bounded quantity for Ent(f), we restrict this definition to functions with E(f log f) < ∞, with the convention 0 log 0 = 0. Note that Ent(af) = a Ent(f) for any a ≥ 0, and that using Jensen's inequality one can easily show Ent(f) ≥ 0.

On some subset A ⊆ L² of measurable functions f : Y → R, consider now a map, or energy, E : A → R≥0.

Definition 14. We say that µ satisfies a spectral gap or Poincaré inequality w.r.t. the pair (A, E) if there exists C > 0 such that

Var(f) ≤ C E(f),  ∀f ∈ A.   (11)

Definition 15. We say that µ satisfies a logarithmic Sobolev inequality w.r.t. the pair (A, E) if there exists C > 0 such that

Ent(f²) ≤ C E(f),   (12)

for every f ∈ A with E(f² log f²) < ∞.

Having these abstract definitions, we assume Y is a metric space (Y, d) equipped with its Borel sigma algebra B. The energy functional E can be selected as

E(f) := E(|∇f|²) = ∫_Y |∇f|²(y) dµ(y),   (13)

where |∇f| is the abstract length of the gradient of f,

|∇f|(y) := limsup_{d(y′,y)→0} |f(y′) − f(y)| / d(y′, y).

The subset A ⊆ L² will also be

A := {f : Y → R, f Lipschitz},   (14)

the set of all Lipschitz functions on Y, which implies |∇f|(y) ≤ ‖f‖_Lip for any f ∈ A.

Theorem 16. Let (Y, d, µ) be a metric probability space, with energy functional (13) and class A in (14). If µ satisfies the Poincaré inequality in (11) with constant C w.r.t. (A, E), then we have the exponential concentration

µ(|f − E(f)| > t) ≤ 6.8 e^{−λ_0 t},  ∀f ∈ A, ∀t > 0,   (15)

with λ_0 = (1/ρ)√(2/C) and ρ = ‖f‖_Lip.

This inequality is proved in (Ledoux (1999)) with a focus on the existence of an exponential bound κe^{−λ_0 t}, and the bound 240 e^{−λ_0 t} is provided in (Naor (2008)). Theorem 16 improves this bound by reducing it to 6.8 e^{−λ_0 t}.

Theorem 17. (Herbst's Theorem). Let (Y, d, µ) be a metric probability space, with energy functional (13) and class A in (14). If µ satisfies the log-Sobolev inequality in (12) with constant C w.r.t. (A, E), then we have the concentration inequality

µ(|f − E(f)| > t) ≤ 2 e^{−λ_1 t²},  ∀f ∈ A, ∀t > 0,   (16)

where λ_1 = 1/(ρ²C) and ρ = ‖f‖_Lip.

One important feature of both the variance and the entropy defined in (9)-(10) is their product property (Ledoux (1999)). Assume we are given probability spaces (Y_i, B_i, µ_i), i = 1, 2, ..., n. Denote their product probability space by (Y, B, µ), where µ := µ_1 ⊗ ··· ⊗ µ_n, Y := Y_1 × ··· × Y_n, and B is the product sigma algebra. Given a function f : Y → R on the product space, we define functions f_i : Y_i → R with f_i(y_i) = f(y_1, ..., y_{i−1}, y_i, y_{i+1}, ..., y_n), where y_1, ..., y_{i−1}, y_{i+1}, ..., y_n are fixed. Under appropriate integrability conditions, we have the following inequalities for variance and entropy:

Var_µ(f) ≤ Σ_{i=1}^n E_µ(Var_{µ_i}(f_i)),
Ent_µ(f) ≤ Σ_{i=1}^n E_µ(Ent_{µ_i}(f_i)),

where Var, Ent, and E are computed w.r.t. the measure written as subscript. This product property tells us that, in order to establish a Poincaré or logarithmic Sobolev inequality on product spaces, it is enough to deal with dimension one (Talagrand (1995)). In the following we discuss some well-known probability measures that satisfy a concentration of measure inequality.

4.1 Gaussian concentration inequality

Suppose µ is the standard Gaussian measure on R. The logarithmic Sobolev inequality Ent_µ(φ²) ≤ 2E_µ(φ′²) holds for any smooth function φ : R → R. By the product property of entropy, the multivariate Gaussian measure on R^n satisfies (12) with C = 2. Then the concentration inequality (16) holds for the Gaussian measure with λ_1 = 1/(2ρ²).

4.2 Exponential concentration inequality

Consider the exponential measure dµ(y) = (1/2) e^{−|y|} dy w.r.t. the Lebesgue measure on R. Then Var_µ(φ) ≤ 4E_µ(φ′²) for any smooth φ. By the product property of variance, the product of exponential measures on R^n, denoted by ν_n, satisfies (11) with C = 4. Then the concentration inequality (15) holds for the multivariate exponential measure with λ_0 = 1/(ρ√2). This bound can be further improved for small values of t as

ν_n(|f − E(f)| > t) ≤ 2 exp(−min{κ_1 t/β, κ_2 t²/ρ²}),   (17)

for every t ≥ 0, with κ_1 = 1/4, κ_2 = 1/16, ρ = ‖f‖_Lip, and β ≥ 0 satisfying

|f(y) − f(y′)| ≤ β Σ_{i=1}^n |y_i − y′_i|,  ∀y, y′ ∈ R^n.
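Before refining the exponential bound further, here is a sketch of how the margins h^{-1}(ε) implied by (15) and (16) feed back into the scenario program; the helper names and the example values ρ = 6, ε = 0.05 are ours.

```python
import math

def h_inv_poincare(eps: float, rho: float, C: float = 4.0) -> float:
    """Invert h(t) = 6.8 e^{-lam0 t} from (15), with lam0 = (1/rho) sqrt(2/C)."""
    lam0 = math.sqrt(2.0 / C) / rho
    return math.log(6.8 / eps) / lam0

def h_inv_log_sobolev(eps: float, rho: float, C: float = 2.0) -> float:
    """Invert h(t) = 2 e^{-lam1 t^2} from (16), with lam1 = 1/(rho^2 C)."""
    lam1 = 1.0 / (rho**2 * C)
    return math.sqrt(math.log(2.0 / eps) / lam1)

rho, eps = 6.0, 0.05
print(h_inv_poincare(eps, rho))      # exponential measure: C = 4
print(h_inv_log_sobolev(eps, rho))   # Gaussian measure: C = 2
```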


[Fig. 1. Comparing ε = h(t) for the three concentration inequalities of the exponential measure (logarithmic scale): the black curve is the bound in (15), the blue curve is the bound (17) from the modified log-Sobolev inequality, and the red curve is the improved bound (18).]

The proof of (17) is presented in (Bobkov and Ledoux (1997)) and is based on a modified version of the log-Sobolev inequality (12). Since we are interested in the tightest possible bound in (17), we improve the above bound in the following theorem.

Theorem 18. The constants κ_1, κ_2 in (17) can be selected freely under the condition that κ_1, κ_2 ∈ (0, 1) and (1 − κ_1)² = 8κ_2. Bound (17) can also be improved to 2 exp(−u²(t)) with

u(t) := (√2 ρ/β) (√(1 + βt/(2ρ²)) − 1).   (18)

Figure 1 compares the three upper bounds (15), (17), and (18) (for one-sided concentration, i.e., without the factor 2) for the values ρ = 6 and β = 1. In this case (18) gives the smallest value of t = h^{-1}(ε) for all values of ε.

Remark 19. If a function f is almost everywhere differentiable, the constants ρ and β can be computed from

Σ_{i=1}^n |∂_i f|² ≤ ρ²  and  max_{1≤i≤n} |∂_i f| ≤ β  a.e.

4.3 Strong Log-Concave Measures

A measure µ on R^n is strongly log-concave if dµ(y) = µ(y)dy with µ(y) = h(y) c γ(cy), c > 0, where ln h(·) is concave and γ(·) is the density function of the standard Gaussian measure. Strong log-concavity is preserved under convolution and marginalization (Saumard and Wellner (2014)). A sufficient condition for being strongly log-concave is that ∇²(− log µ)(y) ≥ λI_n for some λ > 0 and all y ∈ R^n. Under this sufficient condition, the measure satisfies both (11) with C = 1/λ and (12) with C = λ. Thus the concentration inequalities (17) and (16) hold for this class of measures. Some examples of strongly log-concave densities are µ(y) = h(y)γ(y) with γ the Gaussian density and
• h the logistic density h(y) = e^y/(1 + e^y)²;
• h the Gumbel density h(y) = exp(y − e^y);
• or µ(y) = z h(y)h(−y) with h the Gumbel density and z a normalizing constant.
A review of log-concave and strongly log-concave measures can be found in (Saumard and Wellner (2014)).

4.4 Measures with Bounded Support

A function f on R^n is called separately convex if it is convex in each coordinate, i.e., convex in the directions of the coordinate axes. For instance, the function f(y_1, y_2) = y_1 y_2 is separately convex but not convex. Another example is f(y) = y^T Q y with a symmetric matrix Q: this function is convex only if Q is positive semi-definite, but for separate convexity we only need the diagonal elements of Q to be non-negative.

Let f be a separately convex Lipschitz function on R^n with Lipschitz constant ‖f‖_Lip ≤ 1. Then, for every product probability P on [0, 1]^n and every t ≥ 0,

P(f ≥ E(f) + t) ≤ e^{−t²/2}.

This result enables us to solve CCPs where the uncertainty has bounded support and g(x, δ) is separately convex w.r.t. δ.
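A numerical sanity check of this sub-Gaussian bound, with f(y) = y_1 y_2/√2 as our test function (separately convex, with ‖f‖_Lip ≤ 1 on [0, 1]²):

```python
import numpy as np

rng = np.random.default_rng(3)
y = rng.uniform(0.0, 1.0, size=(500_000, 2))
f = y[:, 0] * y[:, 1] / np.sqrt(2.0)   # separately convex, Lipschitz constant ≤ 1
mean = f.mean()

for t in [0.1, 0.2, 0.3]:
    print(np.mean(f >= mean + t), "<=", np.exp(-t**2 / 2))
```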

5. CASE STUDY: LQG PROBLEM

We demonstrate the effectiveness of our approach on the optimization required in the chance-constrained Linear Quadratic Gaussian (LQG) problem. Consider the linear time-invariant dynamical system

x_{k+1} = A x_k + B u_k + F δ_k,  k = 0, 1, 2, ...,   (19)

where x_k ∈ R^d is the state, u_k is the control input, and A, B, F are matrices with appropriate dimensions. {δ_k}_k is a sequence of independent standard Gaussian random vectors with δ_k ∈ R^p. The control input u_k takes values in a compact set U ⊂ R^m. The objective of the LQG problem is to optimize the cost function

J(x_0, u) := E[ Σ_{k=1}^{L−1} (x_k^T Q_k x_k + u_k^T R_k u_k) + x_L^T Q_L x_L ]

over the sequence of control inputs u = [u_0^T, ..., u_{L−1}^T]^T. At the same time, we would like to keep the sequence of states inside a safe region C with high probability:

P([x_1^T, ..., x_L^T]^T ∈ C | x_0) ≥ 1 − ε.   (20)

After expanding the linear dynamics (19) and substituting them in both the objective function J(x_0, u) and the constraint (20), we need to solve an optimization of the form

min_u (u^T Q̄ u + 2 x_0^T R̄ u + c)
s.t. u ∈ U^{mL},  P((Ā x_0 + B̄ u + F̄ δ) ∈ C | x_0) ≥ 1 − ε,   (21)

with δ = [δ_0^T, ..., δ_{L−1}^T]^T ∈ R^{pL} and appropriately defined matrices Q̄, R̄, Ā, B̄, F̄, c.
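One way to assemble the lifted matrices Ā, B̄, F̄ is the standard stacking of (19) over the horizon; the paper does not spell this construction out, so the following is our sketch (using the numerical A, B, F, L given below).

```python
import numpy as np

def lift(A, B, F, L):
    """Stack x_1, ..., x_L as x̄ = Ā x0 + B̄ u + F̄ δ for the dynamics (19).

    Ā has block rows A^k; B̄ and F̄ are block lower triangular with
    blocks A^(k-1-j) B and A^(k-1-j) F.
    """
    d, m = B.shape
    p = F.shape[1]
    A_bar = np.vstack([np.linalg.matrix_power(A, k) for k in range(1, L + 1)])
    B_bar = np.zeros((d * L, m * L))
    F_bar = np.zeros((d * L, p * L))
    for k in range(1, L + 1):
        for j in range(k):
            blk = np.linalg.matrix_power(A, k - 1 - j)
            B_bar[(k - 1) * d:k * d, j * m:(j + 1) * m] = blk @ B
            F_bar[(k - 1) * d:k * d, j * p:(j + 1) * p] = blk @ F
    return A_bar, B_bar, F_bar

A = np.array([[-1.0, 1.0], [0.0, -2.0]])
B = np.array([[0.0], [1.0]])
F = 0.1 * np.array([[1.0], [1.0]])
A_bar, B_bar, F_bar = lift(A, B, F, L=5)
print(A_bar.shape, B_bar.shape, F_bar.shape)  # (10, 2) (10, 5) (10, 5)
```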

This problem is studied in (Hokayem et al. (2013)), addressing closed-loop policies. To keep the presentation focused, we only consider open-loop policies, with the understanding that our approach is also applicable to closed-loop policies. We select the following numerical values for the system dynamics:

A = [−1 1; 0 −2],  B = [0; 1],  F = σ [1; 1],  x_0 = [0.3; 0.3],

with σ := 0.1. We also consider the objective function with horizon L = 5 and matrices Q_k = R_k = I_2 for all k. For the sake of comparison, we take C as in (Hokayem et al. (2013)) to be an ℓ_2-ball C = {ξ ∈ R^{dL}, ‖ξ‖_2 ≤ r}. In the following we apply our approach to this problem and compare it against (Hokayem et al. (2013)) and the scenario approach of (Grammatico et al. (2016)).


Our approach. We have the Lipschitz continuous function

g(u, δ) = ‖Ā x_0 + B̄ u + F̄ δ‖_2 − r,

with Lipschitz constant ρ := ‖F̄‖_2 w.r.t. δ. For the Gaussian measure h(t) = exp(−t²/(2ρ²)), thus h^{-1}(ε) = ρ √(2 ln(1/ε)). The variance of g is bounded by ℓ + 2√ℓ ‖Ā x_0 + B̄ u‖_2 with ℓ := Tr(F̄^T F̄), thus Assumption 5 holds. The resulting SP is

min_{u, γ_i} (u^T Q̄ u + 2 x_0^T R̄ u + c)
s.t. ‖Ā x_0 + B̄ u + F̄ δ^i‖_2 ≤ γ_i,  i = 1, 2, ..., N,
(1/N) Σ_{i=1}^N γ_i + γ + ρ √(2 ln(1/ε)) ≤ r.
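The Lipschitz constant and margin are directly computable from the lifted matrices (a sketch reusing lift() from the earlier block; ℓ is the trace term in the variance bound above):

```python
import numpy as np

rho = np.linalg.norm(F_bar, 2)                  # spectral norm ‖F̄‖₂
margin = rho * np.sqrt(2.0 * np.log(1 / 0.05))  # h^{-1}(ε) = ρ√(2 ln(1/ε)), ε = 0.05
ell = np.trace(F_bar.T @ F_bar)                 # ℓ := Tr(F̄ᵀ F̄)
print(rho, margin, ell)
```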

The safe set is considered to be the 10-dimensional ℓ_2-ball with radius r = 64 and threshold 0.95 (ε = 0.05). Note that the system without input goes out of this ball in expectation, so it is not a large ball for these dynamics. We also consider the input space U = [−5, 5]. By taking the input that minimizes ‖Ā x_0 + B̄ u‖_2, the constraint is feasible with β_0 = 2.32. We take γ = 1.16 and the number of samples N according to Theorem 6 for confidence 1 − α = 0.99.

Approach of (Hokayem et al. (2013)). This approach relies on inequalities that hold only for the Gaussian measure and are dimension dependent. The chance constraint of (1) is conservatively replaced by

‖Ā x_0 + B̄ u‖_2 + ‖F̄‖_2 √(Lp/(1 − η)) ≤ r,  η := 2 ln(1/ε)/(Lp),

where η has to be inside the open interval (0, 1). As also mentioned in (Hokayem et al. (2013)), this puts a lower bound on the probability threshold ε that can be achieved. In this case the constraint is infeasible for all values of the threshold ε ∈ (0, 0.7), which makes the approach impractical.

Scenario approach of (Grammatico et al. (2016)). Since the constraint of this case study can be transformed into a convex constraint and the objective function is already convex, the approach of (Grammatico et al. (2016)) is applicable, which gives the following optimization:

min_u (u^T Q̄ u + 2 x_0^T R̄ u + c)
s.t. ‖Ā x_0 + B̄ u + F̄ δ^i‖_2² ≤ r²,  i = 1, 2, ..., N.

The number of samples N = 345 is selected according to (Grammatico et al., 2016, Eqn. (5)), which depends on the required confidence α, the probability threshold ε, and the dimension of the decision variables.

Comparison. The approach of (Hokayem et al. (2013)) is infeasible for values of the threshold ε ∈ (0, 0.7). We run our scenario program and that of (Grammatico et al. (2016)) 500 times. Our SP is feasible in all runs, and the optimal value has mean 364.4 and standard deviation 9×10⁻¹⁰. The SP of (Grammatico et al. (2016)) is infeasible in 25 runs (5% of the cases), and the optimal values in the feasible runs have mean 368.6 and standard deviation 23.7. Note that here we have not used sampling and discarding (Campi and Garatti (2011)), which would improve the optimal values. The main weakness is getting an infeasible SP: this results in violation of recursive feasibility (Morari et al. (2014)), which is a standing assumption naturally made when applying scenario optimization in a model predictive control framework. Note that by increasing the number of samples, the probability of getting an infeasible SP will decrease in our approach, while this probability will increase in the SP of (Grammatico et al. (2016)) due to the addition of more constraints.
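For completeness, a sketch of how the SP above can be solved numerically with an off-the-shelf NLP solver (SLSQP); the solver choice, the crude variance stand-in M_v, and the simplified cost (which drops the constant noise contribution) are our assumptions, and A_bar, B_bar, F_bar come from the lift() sketch earlier.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
x0 = np.array([0.3, 0.3])
r, eps, alpha = 64.0, 0.05, 0.01
beta0, gamma = 2.32, 1.16                      # values reported in the text
rho = np.linalg.norm(F_bar, 2)                 # Lipschitz constant of g in δ
margin = rho * np.sqrt(2.0 * np.log(1 / eps))  # h^{-1}(ε) for the Gaussian measure
M_v = np.trace(F_bar.T @ F_bar)                # crude stand-in for the variance bound
N = int(np.ceil(M_v / (alpha * (beta0 - gamma) ** 2)))
deltas = rng.standard_normal((N, F_bar.shape[1]))

def cost(u):
    # LQG cost with Q_k = R_k = I, up to an additive constant from the noise
    x = A_bar @ x0 + B_bar @ u
    return x @ x + u @ u

def scenario_constraint(u):
    # mean_i γ_i + γ + h^{-1}(ε) ≤ r, with γ_i = ‖Ā x0 + B̄ u + F̄ δ^i‖₂
    base = A_bar @ x0 + B_bar @ u
    g = np.linalg.norm(base[None, :] + deltas @ F_bar.T, axis=1)
    return r - (g.mean() + gamma + margin)

res = minimize(cost, np.zeros(B_bar.shape[1]), method="SLSQP",
               bounds=[(-5.0, 5.0)] * B_bar.shape[1],
               constraints=[{"type": "ineq", "fun": scenario_constraint}])
print(res.x, res.fun)
```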

6. CONCLUSION

In this work we proposed a scenario program for solving chance-constrained optimizations. Our approach does not require convexity of the objective or constraint function, but relies on knowing that the uncertainty has the concentration of measure property. Instead of satisfying the constraints for all observed samples of the uncertainty, we allow violation of the constraints but require that, on average, the value of the constraints stays away from their boundaries. Concentration of measure enables us to specify how large this margin must be in order to guarantee a feasible solution for the original optimization with a certain confidence. We benchmarked our technique against approaches from the literature on an LQG control problem.

REFERENCES

Alon, N. and Spencer, J. (2016). The Probabilistic Method. Wiley, 4th edition.
Barvinok, A. (1997). Measure concentration in optimization. Mathematical Programming, 79(1), 33–53.
Bobkov, S. and Ledoux, M. (1997). Poincaré's inequalities and Talagrand's concentration phenomenon for the exponential distribution. Probability Theory and Related Fields, 107, 383–400.
Calafiore, G.C. (2010). Random convex programs. SIAM Journal on Optimization, 20(6), 3427–3464.
Campi, M.C. and Garatti, S. (2008). The exact feasibility of randomized solutions of uncertain convex programs. SIAM Journal on Optimization, 19(3), 1211–1230.
Campi, M.C. and Garatti, S. (2011). A sampling-and-discarding approach to chance-constrained optimization: Feasibility and optimality. Journal of Optimization Theory and Applications, 148(2), 257–280.
Dubhashi, D. and Panconesi, A. (2009). Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press, New York, NY, USA.
Grammatico, S., Zhang, X., Margellos, K., Goulart, P., and Lygeros, J. (2016). A scenario approach for non-convex control design. IEEE Transactions on Automatic Control, 61(2), 334–345.
Hokayem, P., Chatterjee, D., and Lygeros, J. (2013). Chance-constrained LQG with bounded control policies. In 52nd IEEE Conference on Decision and Control, 2471–2476.
Ledoux, M. (1999). Concentration of measure and logarithmic Sobolev inequalities. Séminaire de Probabilités de Strasbourg, 33, 120–216.
Ledoux, M. (2005). The Concentration of Measure Phenomenon. Mathematical Surveys and Monographs. American Mathematical Society.
Morari, M., Fagiano, L., Schildbach, G., and Frei, C. (2014). The scenario approach for stochastic model predictive control with bounds on closed-loop constraint violations. Automatica, 50(12), 3009–3018.
Naor, A. (2008). Lecture notes on concentration of measure.
Prékopa, A. (1995). Stochastic Programming. Mathematics and Its Applications. Springer Netherlands.
Royden, H. (1988). Real Analysis. Mathematics and Statistics. Macmillan.
Saumard, A. and Wellner, J.A. (2014). Log-concavity and strong log-concavity: A review. Statistics Surveys, 8, 45–114.
Talagrand, M. (1995). Concentration of measure and isoperimetric inequalities in product spaces. Publications Mathématiques de l'Institut des Hautes Études Scientifiques, 81(1), 73–205.