Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant

Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant

ARTICLE IN PRESS JID: CACE [m5G;October 19, 2018;4:12] Computers and Chemical Engineering xxx (xxxx) xxx Contents lists available at ScienceDirect...

5MB Sizes 0 Downloads 2 Views

ARTICLE IN PRESS

JID: CACE

[m5G;October 19, 2018;4:12]

Computers and Chemical Engineering xxx (xxxx) xxx

Contents lists available at ScienceDirect

Computers and Chemical Engineering journal homepage: www.elsevier.com/locate/compchemeng

Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plantR,RR Yi Zhang, Xuanzhi Jin, Yiping Feng, Gang Rong∗ State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, China

a r t i c l e

i n f o

Article history: Received 26 June 2017 Revised 24 September 2017 Accepted 22 October 2017 Available online xxx Keywords: Data-driven Robust optimization Correlated uncertainty Production scheduling Copula Fuel gas

a b s t r a c t To hedge against the fluctuations generated from continuous production processes, practical solutions can be obtained through robust optimization induced by the classical uncertainty sets. However, uncertainties are sometimes correlated in industrial scheduling problems because of the connected process and various random factors. To capture and enrich the valid information of uncertainties, copulas are introduced to estimate the joint probability distribution and simulate mutual scenarios for uncertainties. Cutting planes are generated to remove unnecessary uncertain scenarios in the uncertainty sets, and then robust formulations induced by the cut set are proposed to reduce conservatism and improve the robustness of scheduling solutions. A real-world process of ethylene plant is introduced as the numerical case, and high-dimensional data-driven uncertainty sets are illustrated in detail. The proposed models are proved to control the fluctuation of consumed fuel gas below a lower level of conservatism. © 2017 Elsevier Ltd. All rights reserved.

1. Introduction Uncertainties naturally appear from the continuous production process, causing fluctuations in the production performance, and this brings difficulties to the implementation of the traditional deterministic scheduling solutions (Sahinidis, 2004). To obtain robust and effective production plans, various kinds of uncertainties are considered in different forms (Sahinidis, 2004; Li et al., 2011; Lin et al., 2004; Janak et al., 2007). For example, external uncertainties (random price and demand) are included when the objective is to obtain the most economical solution, which is immune to the changing market. Internal uncertainties (unstable yield and capacity of equipment) are researched when decision-makers are trying to find the most operational and applicable solution. However, due to the lack of information, uncertainties are previously assumed as bounded intervals or with given probability distributions. Sometimes a series of discrete scenarios are generated to describe possible values of uncertainty, based on which stochastic programming (SP) (Prékopa, 2013; Zhou et al., 2013) was defined.

R A publisher’s error resulted in this article appearing in the wrong issue. The article is reprinted here for the reader’s convenience and for the continuity of the special issue. For citation purposes, please use the original publication details; Computers and Chemical Engineering, 109(2017), pp. 48–67 RR DOI of original item: 10.1016/j.compchemeng.2017.10.024 ∗ Corresponding author. E-mail address: [email protected] (G. Rong).

When the probability density and distribution of uncertainties are available, constraints are acceptable to be violated at a certain level, where the concept of confidence level for constraints is introduced (Charnes and Cooper, 1959; Miller and Wagner, 1965). Normally, stochastic programming could enumerate nearly all the uncertain scenarios, which are effective for dealing with most of the scheduling problems without the preparation of probability information. But chance-constrained approach is more helpful to reveal the extent of the distribution of uncertainties and their impact on the objective (Mesfin and Shuhaimi, 2010; Jiao et al., 2012). Feasibility and optimality are always regarded as two conflicting objectives in the optimization problems under uncertainty. When constraints are satisfied for most of the random values, the objective tends to become more conservative. Thus, robust optimization provides a reliable perspective to discuss the robustness and conservatism of solutions in algebraic models (Ben-Tal and Nemirovski, 2002; Beyer and Sendhoff, 2007; Bertsimas et al., 2011). As the uncertainty is considered as a combined term of a nominal part and an uncertain part, the uncertain part can be described by an uncertainty set, where uncertain scenarios scatter within the set and the nominal scenario is the center of the abstract space. The Euclidean norm-induced sets have been successfully applied to describe bounded uncertainties (Bertsimas et al., 2004), and corresponding robust formulations were derived and implemented to many industrial problems (Li et al., 2011). To reduce the conservatism and improve the robustness of solutions obtained from the above optimization approaches,

https://doi.org/10.1016/j.compchemeng.2017.10.039 0098-1354/© 2017 Elsevier Ltd. All rights reserved.

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

JID: CACE 2

ARTICLE IN PRESS

[m5G;October 19, 2018;4:12]

Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20

Indices Indices f r c t cpk Sets FUf FIf,r

cracking furnaces (BA105–115, BA1101–1104) raw materials (subset of c) commodities (energy resources) time periods (10 days) cutting planes the collection of furnaces with uncertain energy consumption rates the set of input material r for furnace f

Parameters a˜ f,r uncertain fuel gas consumption rate for furnace f when processing material r 1−γ a f,r nominal value of fuel gas consumption rate for fur-

aˆmax f,r bf,r cf,r ctf,r

β cpk,f,r dcpk

nace f when processing material r with the confidence level 1 − γ maximum width of the uncertain interval of consumption rate a˜ f,r coefficient of the processed dilution steam for furnace f when processing material r adjustable parameter of the uncertainty a˜ f,r in flexible uncertainty set constant for the equation of calculating consumed fuel gas for furnace f when processing material r normal vector for the cpkth cutting plane introduced to the flexible uncertainty set of a˜ f,r the constant in the equation of the cpkth cutting plane

Variables FCf,r,t the amount of material r consumed by furnace f in time period t DSf,r,t the amount of dilution steam consumed by furnace f when processing material r in time period t FPPfuel,t the amount of fuels produced in period t PC fuel,t the amount of fuels purchased in period t FGUf,r,t positive variable related to the perturbation of fuel gas consumption of furnace f when processing material r in time period t FGSt positive variable representing the maximum perturbation of fuel gas consumption for all the furnaces at time period t τ cpk positive variable introduced as weights for the cpkth cutting plane

industrial data is introduced to constraints, which results in the data-driven chance constrained programming (Jiang and Guan, 2016; Calfa et al., 2015; Zhang et al., 2016a), distributionally robust optimization (DRO) (Delage, 2009; Delage and Ye, 2010), etc. Motivated by the estimated probability density and distribution of uncertainties, the uncertainty set in robust optimization can be constructed based on the distributions with confidence level. In the data-driven uncertainty set, redundant and unreasonable scenarios could be eliminated, thus the data-driven robust optimization will indirectly improve the quality of solutions. In order to make sure the solutions are feasible in real-world problems, the priori and posterior probability bounds on the violation of constraints are derived for different kinds of set-induced formulations (Guzman et al., 2016; Li et al., 2012; Zhang et al., 2016b). Focusing on production scheduling problems in process industry, we have proposed a novel description to transform continuous uncertain-

ties into bounded forms with confidence levels (Zhang et al., 2016a,b). Then flexible uncertainty sets are defined to describe high-dimensional uncertain scenarios for independent uncertainties, and a series of novel robust formulations are derived. The new robust models have been proved less conservative and provide tight probability bounds on the violation of constraints, as compared to the classical norm-set-induced models. It is widely assumed that uncertainties are independent of each other. Unfortunately, it seems impossible that one uncertain parameter could not exert on other uncertainties. For instance, the change of the price of crude oil – naphtha (NAP) will cause fluctuation in the price of other kinds of crude oils. As a result, this fluctuation will cause further changes in the demand of crude oils and related commodities. These phenomena have been rarely researched carefully in the industrial field, but there have been various practical researches of optimization problems in financial fields, such as the classical portfolio optimization problems. Lauprete et al. (2003) and Ghaoui et al. (2003) firstly researched the robust formulation for portfolio problems, and the marginal distributions of historical returns were modeled by copulas. Later the copulas were utilized to generate scenarios for the calculation of portfolio’s variance and conditional value at risk (CVaR) (Bai and Sun, 2007; Kakouris and Rustem, 2014). Considering the presence of correlated financial returns, the copula-based approach was further improved by Boubaker’s work (Boubaker and Sghaier, 2013). For industrial problems, Papaefthymiou and Kurowicka (2009) have tried to model the dependence structure of uncertainties in the power system, which motivates us to investigate the influence of the correlation relationships between uncertainties in the scheduling model. Deterministic solutions are regarded as theoretical optimal at most time, and robust solutions provide references for decisionmakers, which may not be optimal but feasible and applicable. It is always neglected that stricter description of uncertainties could also create great profits. The full coverage of uncertain values usually leads to unpractical and conservative results, and improper simplification of uncertainty scenarios will cause infeasibility when the solutions are implemented to the volatile production process. Thus historical data should be introduced to correct the description of uncertainties and drive the reformulation of constraints with uncertainties. In this paper, the uncertainty set induced robust formulations are firstly reviewed in Section 3, where the robust counterparts induced by the flexible uncertainty sets are listed. To deal with the correlation relationship between uncertainties in industrial problems, the copula-based methods are introduced. The cut sets of flexible uncertainty sets are proposed in Section 4. Then in Section 5, the new robust formulations induced by cut sets are derived for linear programming (LP) and mixed-integer linear programming (MILP) problems. Through the real-world ethylene plant example in Section 6, the correlations between uncertain consumption rates of furnaces are clearly analyzed, and finally, the performance of the new robust model is discussed and concluded in detail. 2. Problem statement The diagram of the given production process of an ethylene plant is illustrated in Fig. 1. The detailed interpretation of processes and materials are presented in Section 6. Here the relationships between uncertainties of the production and utility system are firstly demonstrated. The balance of energy resources is the strongest connection between the production system, utility system, and markets. Normally uncertainties are classified according to the sources, which results in external uncertainties (such as price and demand uncertainty) and internal uncertainties (contains yield and capacity uncertainty, etc.) (Li and

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

JID: CACE

ARTICLE IN PRESS

[m5G;October 19, 2018;4:12]

Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20

3

Fig. 1. Illustration of correlated uncertainties generated from the industrial production process.

Ierapetritou, 2008). Apparently, the uncertainties generated from furnaces probably exert on each other because of their mutual links to the energy network, containing steam, electricity, water and fuels. In this paper, correlated uncertainties in production scheduling problems are mainly researched. These correlations usually arise from the following production conditions: the causal relationship between upstream and downstream equipment, the close connection between the same kind of equipment, the similar fluctuation from the mutual energy networks, etc. No matter how the uncertainties exert on the production, industrial data can always provide valuable information for the improvement of the robust scheduling model. By utilizing several months’ industrial data, the probability density of uncertainties could be estimated through kernel-based approaches. The correlation relationship will be deeply analyzed by the copula-based method. To reduce the conservatism and improve the robustness of the robust scheduling model, the scheduling problem under correlated uncertainties will be discussed by the following steps: •











STEP 1: Mine the correlation relationships between uncertainties generated from the continuous production process; STEP 2: Estimate the joint probability density of correlated uncertainties; STEP 3: Construct the cut set for correlated uncertainties to remove unnecessary uncertain scenarios; STEP 4: Formulate the robust models induced by the cut sets and solve them; STEP 5: Compare the conservatism and robustness between the newly-obtained robust solutions with the previous sets induced results; STEP 6: Analyze the gap between robust solutions and the realworld performance, and finally provide reference engineering strategies.

Before the numerical cases are discussed, the basic theories will be illustrated in details by the following sections.

3. Uncertainty sets induced robust optimization In previous optimization researches, uncertainties were usually assumed to be independent. However, in production scheduling problems, many uncertainties are associated with an equipment, a process or the entire production network, which makes them correlated and difficult to be separated. In this paper, the classical robust optimization approach (Bertsimas et al., 2004) is extended for correlated uncertain scenarios. At first, the flexible uncertainty sets and corresponding robust formulations are briefly reviewed (Zhang et al., 2016b), which is mainly for dealing with problems under continuous uncertainties.

3.1. Flexible uncertainty sets According to the classical definition, the robust linear programming (LP) problems induced by uncertainty sets can be defined as Eq. (1).

max

z = c x

s.t.

 ≤ b˜ Ax

A˜ , b˜ ∈ U

(1)

x∈X where x ∈ Rn×1 , c ∈ Rn×1 and X represents a computable bounded convex set in Rn ; A˜ ∈ Rm×n , b˜ ∈ Rm×1 represents the matrix uncertainty and right-hand side (RHS) uncertainty, respectively. All the uncertainties are subject to a known set U. Similarly, the robust

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

ARTICLE IN PRESS

JID: CACE 4

[m5G;October 19, 2018;4:12]

Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20

mixed-integer linear programming (MILP) problems are defined as Eq. (2).

z = cx + dy   a˜ij x j + b˜ ik yk ≤ p˜ i

max s.t.

j

a˜ij , b˜ ik , p˜ i ∈ U  , ∀i

k

(2)

∀ j, k

x j ∈ X, yk ∈ {0, 1}

For continuous uncertainties, empirical distributions can be estimated when the data sources are measurable. Then the continuous uncertainties can be transformed to bounded ones with confidence levels, which can be defined as Eq. (3). 1−γij

a˜ij = aij

+ ξij aˆmax , ij

  ξij ∈ −cij , cij

(3)

here ξ ij is an interval variable to control the range of uncertainties; the bounds of ξ ij is controlled by parameter cij , which equals 1−γij

to the ratio of the confidence interval’s width aˆij

and the maxi-

mum span aˆmax of the uncertain interval. Once the confidence level ij 1 − γ ij is chosen, the adjustable parameter cij will be determined and the bounded form for the continuous uncertainty will be ob1−γ 1−γ 1−γ 1−γ tained as [aij i − aˆij i , aij i + aˆij i ]. In order to control every uncertain parameter independently, flexible uncertainty sets are defined based on the classical norminduced uncertainty sets, which are shown in Table 1. The uncertainties in the flexible sets are controlled by independent adjustable parameters, which are related to the chosen confidence level.

3.2. Flexible sets induced robust formulations When uncertainties are considered as bounded ones, the equivalent formulations of the ith constraint in Eqs. (1) and (2) are shown as Eqs. (4) and (5), respectively.



 aij x j + max

j



ξ ∈U

j





aij x j +



 ξij aˆij x j − ξi0 bˆ i

≤ bi

(4)

j∈Ji

 bik yk + max



ξ ∈U

k

ξij aˆij x j +



 ξik bˆ ik yk − ξi0 pˆ i ≤ pi

k∈Ki

j∈Ji

cannot reflect the exact distribution of uncertain scenarios. Thus the uncertainty sets and formulations should be improved for correlated uncertainties.

4. Copula-based analysis of uncertainties To obtain the joint probability density function (PDF) from univariate probability distributions, the Sklar’s theorem proved that if F( · ) is a multivariate probability distribution function whose marginal distributions have distributions F1 (x1 ), . . . , Fn (xn ), then there exists a copula function C( · ) such that F (x1 , . . . , xn ) = C (F1 (x1 ), . . . , Fn (xn )) (Nelsen, 2015). The copula theory is mostly applied in financial fields, such as monitoring the market risk and measuring the credit risk of loans (Possolo, 2010). However, for random parameters in industrial problems, the correlations between uncertainties are still under studying.

4.1. Copula-based method Assuming that every two uncertainties are correlated at a different level, the correlation coefficient should be calculated and analyzed at first. Kendall, Pearson and Spearman coefficients are applied the most in the rank correlation analysis of random variables. In this paper, the Kendall’s correlation coefficient is chosen for the correlation analysis. In some special cases, we care more about the tail-dependence between uncertainties. It is important to note that the collection of massive industrial data has to be preprocessed before the correlation analysis. Beneficial from the copula theory, there are two most frequently-used copula families, which are elliptical copulas and Archimedean copulas. For problems under less than three uncertainties, the elliptical copula classes, proposed by Fang et al. (1990), are implemented the most with dispersion structures. Archimedean copulas can be constructed through various generators, which can be found in Nelsen’s work (Nelsen, 2015). High-dimensional uncertainties (dimension p > 3) always appear correlated in a single constraint of the industrial optimization problem, and some commonly-used one-parameter families, presented in Table 3, provide convenience for estimating the multivariate cumulative distribution function (CDF). As shown in Fig. 2, after the basic analysis of massive industrial data, four major steps need to be accomplished before we obtain the joint PDF of correlated uncertainties, which are: •

(5) Since the mathematical descriptions for LP and MILP problems are similar, the following parts are mainly illustrated for linear programming problems. The description of MILP problems can be found in the Supplementary material. If only left-hand-side (LHS) uncertainties are considered, the robust counterpart in Eq. (4) can be simplified as

 j





aij x j + max ξ ∈U



 ξij aˆij x j



≤ bi

(6)

j∈Ji

As proposed in our previous work, the flexible uncertainty sets induced formulations for the robust counterpart in Eq. (6) are concluded in Table 2. As mentioned in the definition of flexible uncertainty sets, the uncertainties are considered independently. If the uncertainties in production scheduling problems are correlated, the uncertain values generated from the original flexible uncertainty sets probably



STEP 1: Choose copulas and estimate the multivariate CDF: Through preprocessing the massive data, the dimension of uncertainties and the correlation between them should be clearly discussed. Considering various copula functions’ properties, such as sensitivity to the change on the upper/lower tail, kinds of copulas can be chosen for different uncertain scenarios. Normally, the Archimedean copulas are applied to estimate the joint PDF of high-dimensional uncertainties. Then the copula parameter can be estimated through the dependence measurements given by copulas, which are shown in Table 4. STEP 2: Generate correlated uncertain scenarios: By applying the Monte-Carlo method for the simulation of correlated uncertain scenarios, we can outline the uncertain space, which covers all the generated uncertain values. For correlated uncertainties, random scenarios always gather around the most possible values. This is worth to be focused and described by novel uncertainty sets. STEP 3: Hypothesis testing of the estimated CDF: There is always deviation between the estimated cumulative distribution function (CDF) and the exact distribution, especially when the estimation result is obtained from empirical data.

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

ARTICLE IN PRESS

JID: CACE

[m5G;October 19, 2018;4:12]

Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20

5

Table 1 Classical and flexible uncertainty sets. Type

Mathematical form

Classical box

U∞ = {ξ |ξ ∞ ≤  } = {ξ ||ξ j | ≤  , ∀ j ∈ Ji }



F = {ξ | ξc ∞ ≤ 1} = U∞ {ξ ||ξij | ≤ cij , ∀ j ∈ Ji }

cij =

Classical ellipsoidal

ξ |ξ 2 ≤ } = U

2 = { 2 ξ| j∈Ji ξ j ≤ 



Flexible ellipsoidal

U2F = {ξ | ξc 2 ≤  ξij 2

Flexible box

ξ|

Adjustable parameter



|Ji |} =   ≤ ( ) |Ji | j∈Ji cij

cij =

Classical polyhedral

U  1 = {ξ |ξ 1 ≤ } = ξ | j∈Ji |ξ j | ≤ 



Flexible polyhedral

U1F = {ξ | ξc 1 ≤ |Ji |} =



ξ | j∈Ji | ξcij | ≤ |Ji |

cij =

1−γi

aˆij

aˆmax ij

1−γi

aˆij

aˆmax ij

1−γi

aˆij

aˆmax ij

ij

Note: Ji represents the set of j-indexes of uncertain parameters in the ith constraint; |Ji | is the cardinality of the set Ji .

Table 2 Flexible uncertainty sets induced robust formulations. Type

Robust formulations   1−γi 1 −γ aˆij i u j ≤ bi , ∀i aij x j +

Flexible box

j

j∈Ji

−u j ≤ x j ≤ u j , ∀ j ∈ Ji

   1−γ 2  1 −γ aij i x j + |Ji | · aˆij i x2j ≤ bi , ∀i j j∈J

Flexible ellipsoidal

i

⎧ 1−γi ⎨ j aij x j + |Ji | · pi ≤ bi , ∀i 1 −γ pi ≥ aˆij i u j , ∀ j ∈ Ji ⎩ −u j ≤ x j ≤ u j , ∀ j ∈ Ji

Flexible polyhedral

Table 3 Commonly used one-parameter Archimedean copulas. Family Clayton (1978) Gumbel (1960) Frank GS

Copula C (u1 , . . . , un )

 n −1/θ −θ , θ ∈ ( 0, ∞ ) i=1  ui − n + 1 1/θ  n θ exp − ( − ln u ) , θ ∈ [1, ∞ ) i i=1   n −θ ui ( e −1 ) − θ1 ln 1 + i=1e−θ −1 , θ ∈ ( 0, ∞ )  1/θ !−1 θ n −1 − 1) 1+ , θ ∈ [1, ∞ ) i=1 (ui

Generator ϕ (t) 1 −θ − 1) θ (t

(− ln t )θ 

− ln

e−θ t −1 e−θ −1



θ

(t −1 − 1 )

Table 4 The relationships between Kendall coefficient and the Archimedean copula parameter. Copulas Kendall coefficients

Clayton θ θ +2

Gumbel θ −1 θ

Generally, the precision of the copula-based estimation could be increased when the sampling size of random scenarios is getting larger. In fact, the hypothesis testing of the estimated joint CDF is to compare the two statistical datasets. One of them is simulated in the last step, and another is sampled from the real-world uncertainties. The relevant null and alternative hypotheses are

Frank

"

1−4

0

θ



t dt − θ //θ 2 et − 1

GS 1 − 32θ

stated as follows:

H0 : F (x ) = F0 (x ) H1 : F (x ) = F0 (x ) where F0 (x) represents the exact cumulative distribution of correlated uncertainties; F(x) is the copula-based empirical estimation of the joint CDF obtained from the sampled uncertain

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

ARTICLE IN PRESS

JID: CACE 6

[m5G;October 19, 2018;4:12]

Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20 Table 5 Cut set of flexible uncertainty sets. Type

Cut set of flexible uncertainty set

Flexible box

CS U∞ =

Flexible ellipsoidal

U2CS =

Flexible polyhedral

U1CS =

 #  #ξ /c∞ ≤ 1 = ξ ## f cp (ξ ) + d ≥ 0  #  #ξij /cij ∞ ≤ 1, ∀ j ∈ Ji ξ ## f k (ξij ) + dk ≥ 0, ∀k, j ∈ Ji  #   #ξ /c2 ≤ |Ji | # ξ# = f (ξ ) + d ≥ 0  # cp   #ξij /cij 2 ≤ |Ji |, ∀ j ∈ Ji # ξ# f k (ξij ) + dk ≥ 0, ∀k, j ∈ Ji  #  #ξ /c1 ≤ |Ji | = ξ ## f cp (ξ ) + d ≥ 0  #  #ξij /cij 1 ≤ |Ji |, ∀ j ∈ Ji ξ ## f (ξ ) + d ≥ 0, ∀k, j ∈ J k

ij

k

i

Note: In this table, the cut sets are defined as the intersection of the flexible set and CS = U∞ ∩ UConSet , U2CS = U2 ∩ UConSet , U1CS = U1 ∩ UConSet . cutting planes, which are U∞

4.2.1. Definitions of the cut set

Fig. 2. Flowchart of the copula-based method.



scenarios. The non-parametric Kolmogorov–Smirnov (K–S) testing approach and Chi-square (χ 2 ) fitting test are widely used in this part. STEP 4: Decide if the estimation result is acceptable. If the hypothesis testing result shows that the observed value is not beyond the significance level (α ), we accept the estimated CDF for the further study of robust optimization under correlated uncertainties. Otherwise, if the observed value is in the critical region defined by α , the null hypothesis has to be rejected and we should turn back to step 1 to revise or reselect the copulas until we get a satisfied estimation result.

For complex uncertain scenarios, the copula-based method needs to be adjusted to the dimension of uncertainties, the distribution characteristics and the required precision. The details of the method will be illustrated through numerical examples in the following sections.

4.2. Cut set of flexible uncertainty sets Different from the independent uncertain scenarios in the previous uncertainty set, correlated scenarios presents aggregation features, and much redundant space is created in the flexible uncertainty set. Many extreme and isolated scenarios in the original flexible set become invalid as a result of the strong correlations among uncertainties, and these points will lead to the waste of computing resources. In some cases, the negligible scenarios would formulate the worst case for robust optimization, which will unfortunately increase the conservatism. Thus to eliminate unnecessary uncertain scenarios, the cut set of flexible uncertainty sets is constructed by introducing cutting planes.

Definition 1 (Cut set of flexible uncertainty set). The cut set of flexible uncertainty set is described by the intersection between the original flexible set and a convex set constructed by a series of cutting planes, which is defined as

UFCS

 #  #ξ /c ≤ # = UF ∩ UConSet = ξ #   fcp ξ + d ≥ 0  #  #ξij /cij  ≤ , ∀ j ∈ Ji # = ξ#   fk ξij + dk ≥ 0, ∀k, j ∈ Ji

(7)

where UF and UConSet represent flexible uncertainty set in Table 1 and the convex set defined by cutting planes, respectively. In Eq. (7), the adjustable parameter 

is decided by the type of norm ( = 1 for ∞ − norm, = |Ji | for 2 − norm, = |Ji | for 1 − norm); |Ji | is the cardinality of the set J i ; fcp (ξ ) + d ≥ 0 refers to the combination of spaces constructed by cutting planes fk (ξij ) + dk = 0, for k = 1, 2, …, q, and q represents the total number of cutting planes introduced to the flexible set. Normally, q is chosen according to the dimension of uncertainties (n), such as q = 2n . Here, when all the uncertainties’ confidence levels are equal to one, cij = 1 for any j ∈ Ji and the original flexible uncertainty set covers all the uncertain scenarios. Then for flexible box, ellipsoidal, and polyhedral uncertainty set, the corresponding cut set are described as shown in Table 5. Here, the cut set can also be regarded as a new kind of uncertainty set. When two uncertainties are considered as a˜i = ai + ξi aˆmax ,i = i 1, 2, the transformation of the uncertainty set is illustrated by Fig. 3 from the geometric view. In Fig. 3, the boundaries of classical sets remain regular shapes when the adjustable parameter changes. In flexible uncertainty sets, the adjustable parameter ( , ,  ) in the classical set is replaced by the independent cij , which is decided by the chosen confidence level of uncertainties. When the distribution of uncertainty varies, the bounds of the flexible set on each dimension will be quite different, such as c1 = c2 in Fig. 3, and this forms a more scalable uncertainty set. Since there is much redundant space in the flexible set for correlated uncertainties, some sets tend to be

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

ARTICLE IN PRESS

JID: CACE

[m5G;October 19, 2018;4:12]

Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20 •







7

Step 1: Initialize the threshold value ε as zero, and determine the percentage of uncertain points to be cut off, such as 1%; Step 2: Solve the model in Eq. (9) to generate the initial cutting plane, and select the furthest point aε among the 1% nearest scenarios from this plane; Step 3: Update the threshold variable ε with the distance between the initial cutting plane and the point aε ; Step 4: The updated threshold value could be used for generating the final cutting plane.

Obviously, Eq. (9) can be transformed to the equivalent model as following:

min

z=

M 

Dm

m=1

Dm =

s.t.

um , ∀m pk

$ % L % 2 pk = vk 2 = & vkj j=1

vTk asample + d k ij,m pk

(10)

≥ ε , ∀m

−um ≤ vTk asample + d  k ≤ um , ∀m ij,m

Fig. 3. Geometric view of the transformation of uncertainty sets (classical set–flexible set–cut set).

um ≥ 0, ∀m shrunk or cut for obtaining less conservative solutions through robust optimization. Thus cutting planes are generated and introduced to the flexible set to form stricter ones. When the dimension of uncertainties equals two, the cutting planes become secants in Fig. 3 as blue-colored solid lines, and the stricter set is shown as the shadow area. 4.2.2. Generation of cutting planes To maintain the convexity of the uncertainty set, linear cutting planes are introduced as

WT ξ + d =0 where

WT

=

(8) [wT1 ; . . . ; wTq ]

= [wkj ]q×L represents the matrix of lin-

ear coefficients of cutting planes; d = [d1 ; . . . ; dq ] = [dk ]q×1 refers to the vector of constants. Thus the cutting planes can be described as j∈Ji wkj ξ j + dk = 0, ∀i, k = 1, . . . , q. Here, q is the amount of cutting planes and L is the cardinality of the uncertainty set Ji for the ith constraint. Here, we propose an optimization model for generating the kth cutting plane (k = 1, …, q) as follows:

min

z=

M 

Dm =

# sample # #vT a + d k # k ij,m

vk 2

vTk asample + d k ij,m

vk 2

sample

where aij,m

Cut t ingP lane : fk (ξ ) = wkj ξij + dk = 0, ∀i, j ∈ Ji , k

⎧ max ⎪ ⎨wkj = aˆij · vkj , ∀i, j ∈ Ji , k ⎪ ⎩dk =



(11)

aij · vkj + d k , ∀i, k

j∈Ji

The model in Eq. (10) will be calculated for every independent space k ∈ UConSet of the uncertainty set, and the amount of cutting planes could be decreased in the following cases:

Dm •

m=1

s.t.

Since the dimension of uncertainties equals to L, there are L elements in the normal vector vk . 2L cases are constructed by setting the status of each element in vk to nonpositive or nonnegative. In other words, for each solve of the model in Eq. (10), we assign a new orientation to the normal vector. This idea is clearly illustrated through Fig. 4. For each combination of element status, the model in Eq. (10) will be resolved and a new cutting plane is generated, which means the model in Eq. (10) will be solved for 2L times. The cutting planes for the uncertainty set of ξ ij can be derived by following equations:

, ∀m

(9) •

≥ ε , ∀m

∈ Rn×1 refers to the sampling points of the uncer-

tainty a˜ij = aij + ξij aˆmax , ∀i, j ∈ Ji and the sample size is M; Dm ij represents the distance between the sample point and the cutting plane vTk aij + dk = 0. It is noteworthy that the threshold value ε should be determined by an inner iteration process, which is shown as the blue part in Fig. 14. The brief illustration for this inner iteration is as follows:



For the adjacent normal vector, where only one element is under different status, the corresponding cutting plane could be the same (for example, the only different element equals to zero). Thus these repetitious planes could be abandoned. Some cutting planes are constructed outside the closed convex set formed by the other planes, which makes the former planes become redundant for the cut set. No cutting planes can be found when there are no random values lying in the corresponding spaces.

Remark. The above procedure is a heuristic method, which is effective for cutting off most of the unnecessary uncertain scenarios, but it cannot guarantee the tight outer-approximation of the most uncertain scenarios. This is why we keep applying the closed flexible set to roughly cover the random values at first.

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

ARTICLE IN PRESS

JID: CACE 8

[m5G;October 19, 2018;4:12]

Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20

Fig. 4. Assigning status for the element in the normal vector.

Fig. 5. Simulated uncertain scenarios for the numerical cases.

Table 6 Cutting planes for the three-dimension case. Cutting planes: f k (ξ ) + dk = 0, k = 1, 2, . . . , 23 0.67ξ β + 0.099 = 0 0.125ξ 11 − 0.665ξ β + 0.066 = 0 −0.073ξ 12 + 0.67ξ β + 0.065 = 0 0.131ξ 11 − 0.005ξ 12 − 0.664ξ β + 0.067 = 0

k=1 k=2 k=3 k=4

−0.051ξ 11 − 0.038ξ 12 + 0.669ξ β + 0.057 = 0 0.106ξ 12 − 0.669ξ β + 0.093 = 0 −0.069ξ 11 + 0.668ξ β + 0.065 = 0 −0.67ξ β + 0.093 = 0

k=5 k=6 k=7 k=8

The above generation approach will be illustrated by twodimensional and three-dimensional numerical cases in the next part.

Similarly, if β˜ ∼ N (5, 0.05 ) is regarded as another correlated uncertainty with the correlation matrix as







1

Kendall a˜11 , a˜12 , β˜ = ⎣0.5

4.3. Motivating example

0.5 The first numerical example is introduced from Li’s work (Li et al., 2011), which contains two decision variables and two inequality constraints. There are two uncertainties in each constraint, defined as a˜i1 and a˜i2 in the ith constraint.

max

z = 8x1 + 12x2

s.t.

a˜11 x1 + a˜12 x2 ≤ 140 a˜21 x1 + a˜22 x2 ≤ 72

(12)

x1 , x2 ≥ 0 Assuming that a˜11 ∼ N (10, 0.11 ), a˜12 ∼ N (20, 0.44 ), a˜21 ∼ N (6, 0.04 ), a˜22 ∼ N (8, 0.071 ), and the spans of the uncertainties equal to three times the standard deviation, which is consistent with the range of original bounded parameter. For example, when a˜11 and a˜12 are correlated with the Kendall correlation coefficient τ = 0.8, the 20 0 0 simulated scenarios are presented as red points in Fig. 5(a), where the black points are simulated under the assumption that the uncertainties are independent of each other.



0.5

0.5

1

0.5⎦

0.5

1

Then 20 0 0 simulated scenarios are presented as red points in Fig. 5(b), and the black points still represent the independent scenarios. Apparently, the correlated uncertain points gather more intensively around high-frequency values. Through the generation approach of cutting planes proposed in the previous section, the parameters of planes are obtained as follows. For the two-dimensional case, the four secants for the uncerT tainty set of a˜11 and a˜12 can be obtained as V · [a˜11 , a˜12 ] + d = 0, where

⎡ V=

1

0

−1

0

⎢0.861   ⎢ vkj 4×2 = ⎢ ⎣−0.922







−9.372

⎢1.955 −0.508⎥ ⎥    ⎢ ⎥, d = d k 4×1 = ⎢ 0.386 ⎦ ⎣1.843

⎥ ⎥ ⎥ ⎦

10.653

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

JID: CACE

ARTICLE IN PRESS

[m5G;October 19, 2018;4:12]

Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20

Then the secants for the uncertainty set defined by ξ derived by Eq. (11) as

9

ij

can be

⎧ f1 (ξ ) : ξ11 + 0.628 = 0 ⎪ ⎪ ⎪ ⎨ f (ξ ) : 0.861ξ − 1.016ξ + 0.405 = 0 2 11 12 fk (ξ )k=1,...,4 = ⎪ f3 (ξ ) : −0.922ξ11 + 0.772ξ12 + 0.343 = 0 ⎪ ⎪ ⎩ f4 (ξ ) : −ξ11 + 0.653 = 0

Fig. 6. Geometric view of the two-dimensional ellipsoidal uncertainty set when secants are introduced.

The secants are presented in Fig. 6. Similarly, the cutting planes for the three-dimensional case are presented in Table 6, which are shown in Fig. 7. In Fig. 6, the ellipsoidal uncertainty set is constructed as the basic bounded space for the two uncertainties. The green-colored solid line represents the flexible ellipsoidal set with independent bounds on x-axis and y-axis, which is much tighter than the classical uncertainty set in the red-colored curve. Through the generation process of cutting planes, four secants are obtained and presented as blue-colored solid lines. In our research, a small percentage of peripheral points will be separated by the cutting planes, which reduces the influence of extreme values of

Fig. 7. Geometric view of the three-dimensional uncertainty set when cutting planes are introduced.

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

ARTICLE IN PRESS

JID: CACE 10

[m5G;October 19, 2018;4:12]

Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20

uncertainties on the constraints. Thus a novel uncertainty set (refers to the cut set of flexible uncertainty sets) is built with tight bounds and covers most of the effective uncertain values. Remark. If we only use cutting planes as the outer-approximation of uncertain scenarios, more invalid points could be included, such as (0.5,−1) and (0.6, 0.8) in Fig. 6. Fortunately, the flexible set is helpful for generating a tight initial boundary for the cut set.

CS = U ∩ Proof. For the cut set of flexible box uncertainty set U∞     ∞ ξij /cij ∞ ≤ 1, ∀ j ∈ Ji ξ /c∞ ≤ 1 UConSet = ξ | . = ξ| . , we fk (ξij ) + dk ≥ 0, ∀k, j ∈ Ji fcp (ξ ) + d ≥ 0 define P∞ = [CL−1 ; 01×L ], p∞ = [0L×1 ; 1], and K∞ = {[θL×1 ; t] ∈ ×L RL+1 |θ ∞ ≤ t.}, where L is the cardinality of the uncertainty set (i.e., L = |Ji |). The half-space fk (ξij ) + dk ≥ 0, k = 1, 2, . . . , q can be represented as Qξ + d ≥ 0, where Q = [βkj ]q×L and

For the three-dimensional case, multiple cutting planes are generated for surrounding most of the uncertain scenarios. To clearly illustrate the effect of the cutting planes, three views are presented in Fig. 7. As shown in Table 6, some planes are similar, such as f2 (ξ ) + d2 = 0 and f4 (ξ ) + d4 = 0, thus one of the planes can be omitted for the simplification of the cut set. Obviously, when the uncertainties are correlated, the redundant spaces in the classical and flexible uncertainty sets could be eliminated effectively by the cutting planes.

d = [d1 ; d2 ; . . . ; dq ]q×1 . Thus the cutting planes can be described as  j∈J βkj ξij + dk ≥ 0, ∀i, k = 1, 2, . . . , q.

Remark. In fact, the above approach is also effective for delimiting regions formed by uncertainties with joint multimodal distributions. The illustrative numerical case is added in the Supplementary material.

s.t.

5. Robust formulation induced by cut sets

Defining dual variable y = [zi ; wi ] ∈ RL+1 and using the dual ∗ = {[θ L+1 |θ  ≤ t.}, the conic dual of the cone of K ∞ : K∞ L×1 ; t] ∈ R 1 inner maximization problem can be formulated as

Since the size of uncertainty set directly influences the quality of robust solutions, the more uncertain values are considered, the more conservative the robust result. In engineering problems, decision-makers prefer to obtain robust solutions which are immune to most of the high-frequency uncertain scenarios. To reduce the conservatism of the previous robust formulation in the literature, the cut sets are constructed in Section 3 and the novel robust formulations for linear problems (LP) and mixed-integer linear problems (MILP) are derived in this section. To simply illustrate the formulations, only left-hand-side (LHS) uncertainties are considered, and the constraints can be easily extended and transformed for other kinds of uncertainties. Uncertainties in the formulations induced by cut sets are all de1−γ fined as Eq. (3), where the nominal value is represented as aij i 1−γi

and the perturbation amplitude is represented by aˆij

= ξij aˆmax . ij

Here the random variable ξ ij mostly fluctuates within the interval [−cij , cij ], and cij is a data-driven parameter which can be calculated from the random data. The definition of cij was firstly proposed in our previous work (Zhang et al., 2016b), and cij can reflect the percentage of considered uncertain scenarios for both bounded and unbounded uncertainties. Based on the above definitions of uncertainty, the novel robust formulations are illustrated in three parts, which are robust formulations induced by the cut set of flexible box, ellipsoidal and polyhedral uncertainty set.

i

Then the inner maximization problem in Eq. (6) can be rewritten as

max



ξij aˆij x j

j∈Ji

P∞ ξ + p∞ ∈ K∞ Qξ + d ≥ 0

ξ ∈ RL

min

pT∞ y + dT τ

s.t.

T P∞ y − Q T τ = aˆmax x i ∗ y ∈ K∞ = K1

τ ∈ Rq+ which equals to

min

wi +

q 

d k τk

k=1

zij = cij aˆmax x j + cij ij

s.t.

q 

βkj τk , ∀i, j ∈ Ji

k=1

zi 1 ≤ wi , ∀i τk ≥ 0 , ∀ k = 1 , 2 , . . . , q Since the above problem is a minimization problem, it can be further rewritten as the following equivalent formulation by re placing wi with zi 1 = j∈J |zij |, i

5.1. Robust formulation induced by the cut set of box uncertainty set Property 1. For the cut set of flexible box uncertainty set, representCS , the corresponding robust counterpart is equivalent to the ing by U∞ following constraints:



z,τ

q q #   #zij # + dk τk : zij = cij aˆmax x + c βkj τk , ∀ j ∈ Ji j ij ij k=1

j∈Ji



k=1

#  #  q # min x j + cij βkj τk # + d k τk #cij aˆmax τ # ij # k=1 j∈Ji k=1 

q

max

min

#

Then we can reformulate the dual of the inner maximization problem as follows:

⎧ q  1 −γ   ⎪ i ⎪ a x + u + d k τk ≤ b i , ∀ i j ij ⎪ ij ⎪ ⎪ j j∈Ji k=1 ⎪ ⎪ ⎨ −uij ≤ cij aˆij x j + cij βkj τk ≤ uij , ∀ j ∈ Ji ⎪ ⎪ ⎪ k=1 ⎪ ⎪ τ ≥ 0, ∀k ⎪ ⎪ ⎩ k uij ≥ 0, ∀i, j ∈ Ji



(13)

#

 ##

q 

Replacing the original inner maximization problem with the above conic dual, then the following constraint is obtained:

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

ARTICLE IN PRESS

JID: CACE

[m5G;October 19, 2018;4:12]

Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20

# # ⎧ #  q q  1 −γ  ##  ⎪ # ⎪ max i ⎪ ˆ a x + c a x + c β τ + d k τk ≤ b i , ∀ i # # ⎪ j j ij ij kj k ⎪ # ij # k=1 ⎨ j ij j∈Ji k=1



p2 = [0L×1 ; |Ji |], and K2 = {[θL×1 ; t ] ∈ RL+1 |θ 2 ≤ t .}, where L is the cardinality of the uncertainty set (i.e., L = |Ji |). According to the proof for property 1, the half-spaces can be represented as Qξ + d ≥ 0, where Q = [βkj ]q×L and d = [d1 ; d2 ; . . . ; dq ]q×1 .

1 −γ

cij = aˆij i /aˆmax , ∀i, j ∈ Ji ⎪ ij ⎪ ⎪ ⎪ ⎪ ⎩ τk ≥ 0 , ∀ k

Then the inner maximization problem in Eq. (6) can be rewritten as

Here the dual minimization problem can be directly introduced into the original constraint as the algebraic form, because the objective is to maximize the function of decision variable xj . When the positive variable uij is introduced to the constraints, the absolute value can be transformed into algebraic forms shown as:

⎧ q  1 −γ   ⎪ ⎪ uij + d k τk ≤ b i , ∀ i ⎪ aij i x j + ⎪ ⎪ ⎪ j j∈ J k =1 i ⎪ ⎪ ⎨ q  −uij ≤ cij aˆmax x + c βkj τk ≤ uij , ∀ j ∈ Ji j ij ij ⎪ ⎪ k=1 ⎪ ⎪ ⎪ ⎪ τk ≥ 0 , ∀ k ⎪ ⎪ ⎩ uij ≥ 0, ∀i, j ∈ Ji

max s.t.



5.2. Robust formulation induced by the cut set of ellipsoidal uncertainty set Property 3. For the cut set of flexible ellipsoidal uncertainty set, representing by U2CS , the corresponding robust counterpart is equivalent to the following constraints:

$ ⎧ . /2 % ⎪ q % ⎪    ⎪ 1 − γ & i max ⎪ aij x j + |Ji | · cij aˆij x j + cij βkj τk ⎪ ⎪ ⎪ ⎪ j j∈Ji k=1 ⎪ ⎪ ⎨ q  + d k τk ≤ b i , ∀ i ⎪ ⎪ ⎪ k=1 ⎪ ⎪ ⎪ ⎪ ⎪cij = aˆ1ij−γi /aˆmax , ∀i, j ∈ Ji ⎪ ij ⎪ ⎩ τk ≥ 0 , ∀ k

tainty

 ξ|

set

ξij /cij 2 ≤



set

of

flexible ellipsoidal

U2CS = U2 ∩ UConSet =

|Ji |, ∀ j ∈ Ji

fk (ξij ) + dk ≥ 0, ∀k, j ∈ Ji

ξ|

 . ,

we

P2 ξ + p2 ∈ K2

ξ ∈ RL

Proof. The proof can be seen in the Supplementary material.

cut

ξij aˆij x j

Qξ + d ≥ 0

⎧ q  1 −γ  1 −γ    ⎪ i i ⎪ a x + b y + ua + ub + d k τk ≤ p i , ∀ i L ⎪ j ij il ij il ⎪ ⎪ L j j∈Ji ⎪ l∈ L k =1 i ⎪ ⎪ q ⎪  ⎪ ⎪−uaij ≤ cij aˆmax x j + cij βkj τk ≤ uaij , ∀ j ∈ Ji ⎪ ij ⎪ ⎪ ⎪ k=1 ⎪ ⎨ q  (14) −ubil ≤ μil bˆ max y + μ βk,l+|Ji | τk ≤ ubil , ∀l ∈ Li L il il ⎪ ⎪ k=1 ⎪ ⎪ ⎪ τk ≥ 0 , ∀ k ⎪ ⎪ ⎪ ⎪ uaij ≥ 0, ∀i, j ∈ Ji ⎪ ⎪ ⎪ ⎪ ⎪ ubil ≥ 0, ∀i, l ∈ Li ⎪ ⎪ ⎩ x j ∈ X, yL ∈ {0, 1}, ∀ j, l

the

 j∈Ji

min

pT2 y + dT τ

s.t.

P2T y − Q T τ = aˆmax x i y ∈ K2

Property 2. For the cut set of flexible box uncertainty set, representCS , the robust counterpart of MILP problems is equivalent to ing by U∞ the following constraints:

Proof. For

11

define

ξ /c2 ≤

(15)

  uncer|Ji |

τ ∈ Rq+ which equals to

min



|Ji |wi +

q 

d k τk

k=1

s.t.

zij = cij aˆmax x j + cij ij

q 

βkj τk , ∀i, j ∈ Ji

k=1

zi 2 ≤ wi , ∀i τk ≥ 0 , ∀ k = 1 , 2 , . . . , q   min z,τ

|Ji | ·

 j∈Ji

zij2

+

q  k=1

dk τk : zij =

cij aˆmax xj ij

+ cij

q 

 βkj τk ,∀j ∈ Ji

k=1

⎧$ ⎫ . /2 ⎪ ⎪ ⎨% ⎬ q q %    & max min Ji | · cij aˆij x j + cij β + d k τk | kj τk τ ⎪ ⎪ ⎩ ⎭ j∈Ji k=1 k=1

$ ⎧ . /2 % q ⎪ %    ⎪ ⎪ 1 − γ & ⎪ aij i x j + |Ji | · cij aˆmax x j + cij βkj τk ⎪ ij ⎪ ⎪ ⎪ j j∈ J k =1 i ⎪ ⎨ q  + d k τk ≤ b i , ∀ i ⎪ ⎪ ⎪ k=1 ⎪ ⎪ ⎪ 1 −γ ⎪ c = aˆij i /aˆmax , ∀i, j ∈ Ji ⎪ ij ⎪ ⎩ ij τk ≥ 0 , ∀ k

=



P2 = [CL−1 ; 01×L ], ×L

Property 4. For the cut set of flexible ellipsoidal uncertainty set, representing by U2CS , the robust counterpart of MILP problems is equivalent to the following constraints:

fcp (ξ ) + d ≥ 0

.

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

ARTICLE IN PRESS

JID: CACE 12

 1 −γ ⎧ 1−γi aij x j + bil i yL + ⎪ ⎪ ⎪ L j ⎪ ⎪ $ ⎛ ⎪ ⎞ ⎪ . /2 % ⎪ ⎪ q ⎪   ⎪% % max ⎪ ⎟ ⎪ cij aˆij x j + cij βkj τk % ⎜ ⎪ ⎟ ⎪ % ⎜ ⎪ j∈ J k =1 ⎜ ⎟ i ⎪ %H ⎜ ⎪ ⎟ ⎪ . / % 2 ⎪ ⎜ ⎟ q ⎪ % ⎝  ⎪  ⎠ ⎪ max & ˆ ⎪ + μ b y + μ β τ ⎪ L il il k,l+ J k | i| il ⎨ l∈Li

⎪ q ⎪  ⎪ ⎪ ⎪ + d k τk ≤ p i , ∀ i ⎪ ⎪ ⎪ k=1 ⎪ ⎪ 1 −γ ⎪ c = aˆij i /aˆmax , ∀i, j ∈ Ji ⎪ ij ⎪ ⎪ ij ⎪ ⎪ ⎪ ⎪ ⎪ μ = bˆ 1ij−γi /bˆ max ⎪ il , ∀i, l ∈ Li ⎪ ⎪ il ⎪ τ ≥ 0 , ∀ k ⎪ k ⎪ ⎩ x j ∈ X, yL ∈ {0, 1}, ∀ j, l

k=1

Proof. The proof can be seen in the Supplementary material.

which equals to

min

q 

βkj τk , ∀i, j ∈ Ji

zi ∞ ≤ wi , ∀i τk ≥ 0 , ∀ k = 1 , 2 , . . . , q

(16)

 min z,τ

q # #  dk τk : zij = cij aˆmax xj |Ji | · max#zij # + ij j∈Ji



k=1

q

+ cij



βkj τk , ∀ j ∈ Ji

k=1

# #  # #  q q  # ˆmax # min |Ji | · max#cij aij x j + cij βkj τk # + d k τk τ j∈Ji # # k=1 k=1 



⎧ q  1 −γ  ⎪ i ⎪ a x + J · s + d k τk ≤ b i , ∀ i | | ⎪ j i i ij ⎪ ⎪ ⎪ j k=1 ⎪ ⎪ ⎪ ⎪uij ≤ si , ∀i, j ∈ Ji ⎪ ⎪ ⎨ q  −uij ≤ cij aˆmax x j + cij βkj τk ≤ uij , ∀i, j ∈ Ji ij ⎪ ⎪ k=1 ⎪ ⎪c = aˆ1−γi /aˆmax , ∀i, j ∈ J ⎪ ⎪ i ij ij ij ⎪ ⎪ ⎪ ⎪ ⎪ ⎪τk ≥ 0, ∀k ⎩ uij ≥ 0, ∀i, j ∈ Ji 

(17)

Then the inner maximization problem in Eq. (6) can be rewritten as

ξij aˆij x j

d k τk

k=1

Proof. For the cut set of flexible polyhedral uncertainty set U1CS = ξij /cij 1 ≤ |Ji |, ∀ j ∈ Ji ξ /c1 ≤ |Ji | U1 ∩ UConSet = {ξ | .} = {ξ | .}, fk (ξij ) + dk ≥ 0, ∀k, j ∈ Ji fcp (ξ ) + d ≥ 0 we define P1 = [CL−1 ; 01×L ], p1 = [0L×1 ; |Ji |], and K1 = {[θL×1 ; t] ∈ ×L RL+1 |θ 1 ≤ t.}, where L is the cardinality of the uncertainty set (i.e., L = |Ji |). The half-spaces can be represented as Qξ + d ≥ 0, where Q = [βkj ]q×L and d = [d1 ; d2 ; . . . ; dq ]q×1 .



q 

zij = cij aˆmax x j + cij ij

s.t.

Property 5. For the cut set of flexible polyhedral uncertainty set, representing by U1CS , the corresponding robust counterpart is equivalent to the following constraints:

⎧ q  1 −γ  ⎪ ⎪ aij i x j + |Ji | · si + d k τk ≤ b i , ∀ i ⎪ ⎪ ⎪ ⎪ j k=1 ⎪ ⎪ ⎪ ⎪ uij ≤ si , ∀i, j ∈ Ji ⎪ ⎪ ⎪ ⎨ q  −uij ≤ cij aˆmax x + c βkj τk ≤ uij , ∀i, j ∈ Ji j ij ij ⎪ ⎪ k=1 ⎪ ⎪ ⎪ 1 −γ ⎪ cij = aˆij i /aˆmax , ∀i, j ∈ Ji ⎪ ij ⎪ ⎪ ⎪ ⎪ τk ≥ 0 , ∀ k ⎪ ⎪ ⎩ uij ≥ 0, ∀i, j ∈ Ji

|Ji | · wi +

k=1

5.3. Robust formulation induced by the cut set of polyhedral uncertainty set

max

[m5G;October 19, 2018;4:12]

Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20

Property 6. For the cut set of flexible polyhedral uncertainty set, representing by U1CS , the robust counterpart of MILP problems is equivalent to the following constraints:

⎧ q   1 −γ  ⎪ ⎪ a1ij−γi x j + bil i yL + Hsai + Hsbi + d k τk ≤ p i , ∀ i ⎪ ⎪ ⎪ L ⎪ j k=1 ⎪ ⎪ q ⎪  ⎪ ⎪−uaij ≤ cij aˆmax x + c βkj τk ≤ uaij , ∀ j ∈ Ji ⎪ j ij ij ⎪ ⎪ ⎪ k=1 ⎪ ⎨ q  −ubil ≤ μil bˆ max y + μ βk,l+|Ji | τk ≤ ubil , ∀l ∈ Li L il il ⎪ ⎪ k=1 ⎪ ⎪ ⎪ 0 ≤ uaij ≤ sai , ∀i, j ∈ Ji ⎪ ⎪ ⎪ ⎪ 0 ≤ ubij ≤ sbi , ∀i, l ∈ Li ⎪ ⎪ ⎪ ⎪ ⎪ τk ≥ 0 , ∀ k ⎪ ⎪ ⎩ x j ∈ X, yL ∈ {0, 1}, ∀ j, l Proof. The proof can be seen in the Supplementary material.

(18)



j∈Ji

s.t.

P1 ξ + p1 ∈ K1 Qξ + d ≥ 0

ξ ∈ RL min

pT1 y + dT τ

s.t.

P1T y − Q T τ = aˆmax x i y ∈ K1∗ = K∞

τ ∈ Rq+

Motivating example (continued) As illustrated in Section 4.3, the flexible ellipsoidal uncertainty set of the two-dimensional case has been cut down to a tighter set by introducing four secants. Since the generation of secants is independent of the selection of flexible uncertainty sets, the four secants can also be introduced to the box and polyhedral uncertainty set. Thus, when we apply formulations induced by classical uncertainty sets, flexible uncertainty sets, cut sets sequentially, the optimization results of the problem in Eq. (12) are presented in Fig. 8. Here the cut set for the uncertainties (a˜21 , a˜22 ) in the second constraint in Eq. (12) has been constructed by the same approach (Section 4.2).

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

JID: CACE

ARTICLE IN PRESS Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20

[m5G;October 19, 2018;4:12] 13

Fig. 8. Optimization results of the motivating example when different kinds of uncertainty sets are applied.

In Fig. 8, the confidence level ranges from zero to 1, which forms 22 uncertain scenarios for each kind of uncertainty set induced formulation. No matter what kind of uncertainty set (classical, flexible, cut set) is chosen, the polyhedral set induced formulations are proved the most conservative, which is the most sensitive to the change of confidence levels. When we compare the solutions obtained from classical, flexible and cut set induced formulations, the following conclusions can be easily drawn: (1) The conservatism increases in the following order: cut set induced formulation, flexible set induced formulation, classical set induced formulation. (2) When the uncertainty set is chosen as classical, flexible or cut set, the conservatism increases in the following order: box, ellipsoidal, polyhedral. (3) All the cut set induced formulations appear less conservative than any other set-induce ones, except the flexible box uncertainty set. (4) When the confidence level is less than 0.9, the flexible box uncertainty set induced solutions are the same as solutions induced by its cut set; however, when the confidence level is greater than 0.9, the cut set induced solutions are the least conservative results. (5) For the cut set induced formulations, the objectives for the box, ellipsoidal and polyhedral set induced models are similar when the confidence level is determined. When the confidence level equals one, nearly all the uncertain scenarios are considered in the model, the solutions induced by classical and flexible sets are the same, while the objectives for the cut set induced model are less conservative. This is because some extreme values have been eliminated by the introduced secants, and the uncertainty sets have been shrunk to the outerapproximation of the most uncertain scenarios. Thus, through the analysis of the optimization results for the motivating example, the cut set induced formulations are proved with nearly the least conservatism. The new formulations are effective for dealing with problems under correlated uncertainties. In the next section, the above approaches will be illustrated through industrial problems under high-dimensional uncertainties.

6. Industrial case study In the process industry, equipment and devices are connected by a network of complex pipelines and processes. Because uncertainties are more likely to be correlated in a dynamic production process, then some interactions between uncertainties are invisible. Sometimes decision-makers prefer to accept the assumption that the random scenarios are independent of each other. However, from the historical data, some interesting cross impacts can be observed, which usually cause the associated fluctuation in the production process. In this section, the uncertainties in energy consumption rate of furnaces are considered, and the proposed approach will be illustrated through the integrated optimization of utility system and production system in a real-world ethylene plant. The deterministic model was firstly proposed by Zhao’s work (Zhao et al., 2016), which has been further extended to a chanceconstrained formulation (Zhang et al., 2016a) and a robust model (Zhang et al., 2016b) in our previous works. Particularly, the historical data has been collected and preprocessed for the construction of the classical and flexible uncertainty sets. To simplify the illustration, only necessary constraints will be presented and discussed in the following content. The production process in our model (Fig. 9) is to generate nine kinds of products, including hydrogen, ethylene, C4, butadiene, C5, benzene, propylene, cracked gasoline and cracked fuel oil, through the cracking of crude oils, cooling, compression and separation section. In total, four kinds of crude oils (NAP, HVGO, AGO and ETHA) are purchased and energy resources can be self-generated and purchased from the market, such as fuel, water, steam and electricity. To improve the use efficiency of materials, the self-generated fuel gas and natural gas should be recycled in time for the continuous running of cracking furnaces. Thus we focus on the uncertainties in the consumption rate of fuel gas and propose an improved robust model based on the cut set induced formulation in the previous section. 6.1. Mathematical model The objective of the mathematical model is to maximize the profit of delivering products, where the costs of purchasing energy resources and crude oils are considered. The penalty of the shifts in

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

ARTICLE IN PRESS

JID: CACE 14

[m5G;October 19, 2018;4:12]

Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20

Fig. 9. Simplified flowchart of the integrated production and utility system.

operation modes is also included in the objective. The deterministic model contains mass balance constraints for raw materials and energy resources, inventory constraints, supply and demand balance constraints. Moreover, the balance constraint of consuming and generating energies for furnaces, boilers, turbines, compressors and cooling units are defined in detail, which can be reviewed in the previous work (Zhao et al., 2016) or the supplementary material. Nomenclature for the robust constraints is attached at the end of this work. Concentrating on the constraint of fuel consumption and generation for furnaces as Eq. (19),

 f



a˜ f,r F C f,r,t + b f,r DS f,r,t + ct f,r ≤ F PPf uel ,t + PC f uel ,t ,

∀t

r

(19) the cut set-induced robust constraints can be built as follows: •

Cut set of flexible box uncertainty set induced formulation

⎧   1−γ   a f,r F C f,r,t + b f,r DS f,r,t + ct f,r + F GU f,r,t ⎪ ⎪ ⎪ ⎪ r f f ∈F U r ⎪ ⎪ ⎪ |FU | , ⎪ 2 ⎪  ⎪ ⎪ + d τ ≤ F P P + P C ⎪ cpk cpk f uel ,t f uel ,t ⎨ cpk

∀t (20)

The high-dimensional cutting planes calculated from the simulated uncertain scenarios are introduced to the robust constraint. In fact, the parameters in the normal vector of cutting planes are regarded as an additional perturbation for the consumption of fuel gas. The intercept terms are introduced as another nominal part for the consumption constraint. It is noteworthy that the new formulation remains linear, and only the number of decision variables and constraints will increase.

Cut set of flexible ellipsoidal uncertainty set induced formulation

⎧   1−γ  a f,r F C f,r,t + b f,r DS f,r,t + ct f,r ⎪ ⎪ ⎪ ⎪ r f ⎪ ⎪ $ ⎪ . /2 ⎪ % ⎪ |FU | ⎪ 2    ⎪ % ⎨ +&|F U | c f,r aˆmax F C f,r,t +c f,r βcpk, f,r τcpk , ∀t f,r (21) r f ∈ F U cpk ⎪ ⎪ ⎪ |FU | 2 ⎪  ⎪ ⎪ ⎪ + dcpk τcpk ≤ F PPf uel ,t + PC f uel ,t ⎪ ⎪ ⎪ ⎪ cpk ⎩ τcpk ≥ 0, ∀cpk In the ellipsoidal-type constraint, the introduced parameters and variables about cutting planes will increase the number of nonlinear terms, which will cause greater difficulty in the solving process. The solution time will be analyzed in the following parts. •

| | 2  ⎪ ⎪ ⎪ −F GU f,r,t ≤ c f,r aˆmax F C f,r,t + c f,r βcpk, f,r τcpk ≤ F GU f,r,t , ∀t ⎪ f,r ⎪ ⎪ ⎪ cpk ⎪ ⎪ ⎪τcpk ≥ 0, ∀cpk ⎪ ⎪ ⎩ F GU f,r,t ≥ 0, ∀ f ∈ F U , r ∈ F I f,r , t FU



Cut set of formulation

flexible

polyhedral

uncertainty

set

induced

⎧   1−γ  a f,r F C f,r,t + b f,r DS f,r,t + ct f,r ⎪ ⎪ ⎪ r ⎪ f ⎪ ⎪ ⎪ |FU | , ∀t 2 ⎪  ⎪ ⎪ ⎪ + F U F G S + d τ ≤ F P P + P C | | t cpk cpk f uel ,t f uel ,t ⎪ ⎪ ⎪ cpk ⎪ ⎪ ⎨F GU f,r,t ≤ F GSt , ∀ f ∈ F U , r ∈ F I f,r , t (22) ⎪ ⎪ FU | | ⎪ 2 ⎪  ⎪ ⎪ βcpk, f,r τcpk ≤ F GU f,r,t , ∀t ⎪−F GU f,r,t ≤ c f,r aˆmax f,r F C f,r,t + c f,r ⎪ ⎪ ⎪ cpk ⎪ ⎪ ⎪ ⎪ τcpk ≥ 0, ∀cpk ⎪ ⎪ ⎩ F GU f,r,t ≥ 0, ∀ f ∈ F U , r ∈ F I f,r , t The above formulations for the fuel gas consumption constraints are more complex than the flexible uncertainty sets induced ones, especially for the large-scale industrial problems. The more uncertain parameters are considered in the model, the more

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

ARTICLE IN PRESS

JID: CACE

[m5G;October 19, 2018;4:12]

Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20

15

Fig. 10. Correlation analysis results (Kendall’s correlation coefficients) for the consumption rates of furnaces when processing crude oils (NAP, HVGO).

Fig. 11. Correlation matrix (Kendall’s correlation coefficients) for fuel consumption rate of furnaces when processing NAP.

Fig. 12. Correlation matrix (Kendall’s correlation coefficients) for fuel consumption rate of furnaces when processing HVGO.

decision variables and parameters should be introduced. The conservatism, robustness and solution time of the cut set induced formulations need to be discussed and analyzed clearly in the numerical case study. 6.2. Analysis of correlated uncertainties In this case, 15 cracking furnaces are considered, which are all tubular furnaces. However, furnaces are usually running in groups divided by the kind of processing materials and the furnace types,

such as Lummus short residence time (SRT-I VI), KTI GK-I VI, CBL-I IV. The similar type of furnaces are more likely to reflect correlated fluctuations, and the coupled scenarios are always generated from the closely connected equipment in the continuous process. Normally, the correlation between furnaces can be analyzed from the data perspective and the process perspective. From the process perspective, the inherent uncertainties in the energy network, such as inevitable energy loss and unstable energy flowrate, will cause undetectable fluctuations in the supply and quality of energy resources. Similarly, the uncertain factors in

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

JID: CACE 16

ARTICLE IN PRESS

[m5G;October 19, 2018;4:12]

Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20

Fig. 13. Comparison between simulated and sampled scenarios for furnaces BA105, BA106.

the pipelines and upstream equipment will transfer and probably magnify the fluctuations, and finally, the uncertainties in furnaces are somehow correlated. However, this seems to be inconspicuous for the clear analysis of the coupled scenarios. The historical data can reveal the extent of the correlations. Eighty days of historical data are preprocessed and extracted for analysis, and the sampling frequency is one record per 10 min. As a result, 10,637 valuable records are obtained for the estimation of the marginal probability density function of uncertainties. The uncertain fuel gas consumption rate has been defined in Eq. (19). Then the real-world uncertain scenarios are utilized for the correlation analysis of every two uncertainties, and the results of Kendall’s correlation coefficients are presented in Fig. 10. It is noteworthy that the symbol “−−” in Fig. 10 represents unavailable correlation coefficients because of the following reasons: •





All the correlated scenarios are extracted simultaneously, and every furnace cannot process more than one kind of crude oil at the same time. Thus the cross correlation coefficients for every furnace is none. During the sampling period (80 days), some furnaces processed only one kind of crude oil or processed the same kind of oil at the most time, such as BA109 and BA110 only processed NAP. In all the combinations of furnaces’ operation modes, some mutual scenarios cannot be found in the sampled data, such as we didn’t find a scenario that BA106 and BA107 are processing HVGO at the same time.

Ignoring the unavailable scenarios, we can observe that the consumption rates for the two operation modes of different furnaces are correlated at various degrees. Because of the huge amount of industrial data, if the coefficient is bigger than 0.1, the

correlation between uncertainties should be noticed. Correspondingly, the coefficient (≥0.2) is regarded as nonnegligible correlation, and the coefficient (≥0.3) represents obvious correlation. It can be seen from Fig. 10 that BA105–BA107 are strongly correlated in many scenarios. When furnaces are processing the same kind of material, correlation coefficients are extracted from the results in Fig. 10, and two independent matrices are obtained as shown in Figs. 11 and 12. The estimated marginal probability density curves are attached to the uncertainties. According to the distribution of coefficients, the matrix can be divided into several overlapped submatrices, which are shown in Figs. 11 and 12. It can be concluded that the consumption rate normally exerts influence on the consecutive furnaces. If more mutual realizations of uncertainties can be obtained, the correlation analysis results will become more precise. As a result, the estimation of the joint probability density function and the simulation process in Fig. 2 will provide a richer collection of correlated scenarios for uncertainties. Thus a more accurate uncertainty set can be constructed to cover the most high-frequency random values. Furthermore, the generation of uncertain scenarios (referring to the simulation process) could fill the missing scenarios that are not found in the historical data but probably to occur in the future operation modes. 6.3. Construction of the cut set Taking the uncertain consumption rate of fuel gas in furnace BA105 and BA106 as an example, the simulation process of correlated scenarios are illustrated from the geometric view in Fig. 13. Through the copula-based methods proposed in Fig. 2, the

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

JID: CACE

ARTICLE IN PRESS Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20

[m5G;October 19, 2018;4:12] 17

Fig. 14. Generation approach of cutting planes for high-dimensional uncertainty set.

copula function for the two uncertainties is chosen as the elliptical copula with normal distribution and the copula parameter is estimated as 0.38. Then the same size of scenarios are generated, and the hypothesis result proves that the sampled points can be regarded from the same distribution of the real-world uncertainties. The joint probability density function obtained from the realworld scenarios are drawn in Fig. 13(a), and the density function of simulated points is presented in Fig. 13(b). To better illustrate the difference between the two density functions, an auxiliary density is presented in Fig. 13(c). It can be observed that there are still deviations between the simulated scenarios and the exact ones, but if more valuable information can be obtained from the real-world data, the simulation result could become more accurate. Since the maximum deviation presented is obviously less than a quarter of the peak frequency, the simulated results are acceptable for the following discussion. When the copula-based method is extended to highdimensional cases, a great number of correlated scenarios can be generated for the construction of uncertainty sets. To obtain a tight outer-approximation of the most uncertain scenarios, no more than 29 cutting planes need to be calculated by means of the model proposed in Section 4.2.2. The detailed generation

approach for a large-scale problem is shown in Fig. 14. In order to reduce the negative effect caused by extreme values, 1% of simulated scenarios are acceptable to be separated by each cutting plane. Thus the optimal solutions are immune to the most random scenarios, and those too conservative solutions caused by rare uncertain scenarios will not appear. Through the above approach, 512 linear cutting planes can be obtained. In fact, the optimization model for the generation of cutting planes is nonlinear, and the complexity and computational effort will increase significantly when the size of simulated points and the dimension of uncertainties increase. However, in some cases, the redundant cutting planes could be removed before they are introduced to the scheduling model (shown as the dashed box in Fig. 14), but the efficient solving method for this problem will not be discussed at present. We focus more on the effectiveness of the introduction of cutting planes, which will be analyzed in detail in the next part. 6.4. Comparison analysis The classical, flexible and cut set induced robust models and the nonlinear model for generating cutting planes are all solved

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

ARTICLE IN PRESS

JID: CACE 18

[m5G;October 19, 2018;4:12]

Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20

Fig. 15. Percentage of fuel gas consumption by the different group of furnaces (Group A – New furnaces assumed to be deterministic; Group B – Furnaces with uncertainties; Group C – Old furnaces assumed to be deterministic). Table 7 Computational statistics for different uncertainty sets induced formulations. Model induced by classical sets

Model induced by flexible sets

Model induced by cut sets

Deterministic model (MILP)

Type

Eqs

Vars

Bins

Eqs

Vars

Bins

Eqs

Vars

Bins

Eqs

Vars

Bins

Box (MILP) Ellipsoidal (MINLP) Polyhedral (MILP)

12,257 12,257 12,397

10,217 10,217 10,277

620 620 620

12,257 12,257 12,307

10,217 10,217 10,227

620 620 620

13,099 13,009 13,279

10,969 11,019 10,989

620 620 620

12,231 12,231 12,231

10,381 10,381 10,381

620 620 620

on a desktop computer with the following specifications: Dell Precision T5610 with Intel Xeon CPU E5-2609 v2 AT 2.350 GHz (total four threads), 16GB of RAM, and running Windows 10 professional. CPLEX 12.5 and BARON are chosen as the MIP and MINLP solver, respectively. The R ks 1.10.3 package and copula 0.999-15 package are installed for the kernel-based and copula-based estimation of uncertainties, respectively. The number of equations (eqs), variables (vars), binary variables (bins) are concluded in Table 7. Ellipsoidal set induced constraints always bring a number of nonlinear terms to the scheduling model, which cause the solution time of classical and flexible cases increases to around 34.5 min by the BARON solver. For all the kinds of formulations, the continuous uncertainties and cutting planes lead to no increase in the number of binary variables. However, a series of constraints and continuous variables are created because of introducing cutting planes, and the cut ellipsoidal set induced formulation has to be calculated over 1.5 h before the model reaches the global optimal solution. Fortunately, all the MILP robust models could be solved within several minutes, which provide great convenience for the analysis of uncertainties. As shown in Fig. 15, when the confidence levels of the uncertain consumption rate for BA105-BA109 are determined as 50%, the cut box set induced results are presented. According to the running condition of furnaces in the real-world plant, the furnaces are divided and monitored as three groups, where Group B contains the furnaces with uncertainties. The results show that Group B furnaces occupy nearly 38% of fuel gas consumption in the

ten-days scheduling period, and Group A furnaces and BA114 in Group C normally consume more than 10% of fuel gas. To further analyze the impact of uncertainties, the fuel gas consumptions of Group B furnaces when the confidence level varies are presented in Fig. 16. Here only cut box and polyhedral set induced results are presented, because the perturbation terms for each furnace are combined as one nonlinear term in the ellipsoidal set induced robust counterpart. Thus the unit fuel gas consumption of furnaces cannot be extracted from the ellipsoidal induced model. It is obvious that the percentage of the total consumption of fuel gas by Group B furnaces fluctuates about 0.88% and 2.87% from the cut box and polyhedral set induced results, respectively. This means the uncertainties could cause changes in the supplement of fuel gas for all the furnaces. When we focus on the bars in Fig. 16, the green bars show that the amount of consumed fuel gas in ten days could fluctuate nearly 150 tons, while the blue bars (polyhedral set induced results) appears to fluctuate over 500 tons. The polyhedral set induced solutions are apparently more conservative than the box set induced ones, and it is presumably that the conservatism of ellipsoidal set induced results situates between the polyhedral and box set induced ones. When the cut set induced solutions are compared with the other kinds of sets induced results, all the cut sets induced results are concluded to be less conservative. In Fig. 17, the largest fluctuation of consumed fuel gas in ten days is nearly 100 tons, which can be concluded from the cut polyhedral set induced results (presented as blue dashed lines). It can be found that the fluctuation

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

JID: CACE

ARTICLE IN PRESS

[m5G;October 19, 2018;4:12]

Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20

19

Fig. 16. Fuel gas consumption of Group B furnaces (BA105–BA110) induced by the cut box and polyhedral set.

Fig. 17. Comparison analysis of the total consumption of fuel gas between the classical, flexible and cut set induced formulations.

for the total consumption of fuel gas by all the furnaces is less than the consumption by only Group B furnaces, which means the scheduling model assigns the increase in consumed fuel gas to the other groups of furnaces. Thus the robustness of model could be increased and the assignment of fuel gas becomes more feasible and economical. Unfortunately, when the uncertainties are correlated, the classical and flexible set induced results are much more conservative, which will cause the over-purchasing of energy resources. Since the proposed robust optimization approach is datadriven, the formulation and scheduling solutions could be modified slightly when more historical data is considered or some missing values of uncertainties become measurable/available. If the uncertainties are regarded as independent of each other, the cut set induced formulation could provide similar solutions as flexible set induced models. In a word, utilizing the data-driven robust optimization model, the decision-makers have the ability to decide

how many uncertain scenarios are considered in the model and to provide effective, economical and robust scheduling plans. Finally, the fluctuations of the production performance will be captured and controlled below a lower level of conservatism. 7. Conclusion This paper systematically discusses the impact of correlated uncertainties on the production performance. The robust counterpart induced by the classical uncertainty set has been further improved by introducing confidence level to uncertainties, which results in the flexible set-induced robust optimization. Furthermore, when the correlation relationships between uncertainties are considered, cutting planes are generated to construct cut sets for the outerapproximation of most uncertain scenarios. Copula-based methods are utilized to overcome the lack of information for uncertainties, and the joint probability density function of correlated

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039

JID: CACE 20

ARTICLE IN PRESS

[m5G;October 19, 2018;4:12]

Y. Zhang et al. / Computers and Chemical Engineering 000 (2018) 1–20

uncertainties are estimated. Then the simulation scenarios are obtained for the better construction of cut uncertainty sets. Through the numerical analysis by scheduling the fuel gas in the ethylene plant, the new robust solutions are proved less conservative and less sensitive to the fluctuations. Efficient engineering strategies for implementing data-driven models will be studied in the future, and the scheduling solutions will be improved when more valuable historical data is included. Acknowledgements The authors gratefully acknowledge financial support from the National Key Research and Development Program of China (2016YFB0301800) and Science Fund for Creative Research Groups of NSFC (Grant No. 61621002). Appendix A. Supplementary data Supplementary data associated with this article can be found, in the online version, at https://doi.org/10.1016/j.compchemeng.2017. 10.024. References Bai, M., Sun, L., 2007. Application of copula and copula-cvar in the multivariate portfolio optimization. In: ESCAPE. Springer, pp. 231–242. Ben-Tal, A., Nemirovski, A., 2002. Robust optimization-methodology and applications. Math. Progr. 92 (3), 453–480. Bertsimas, D., Pachamanova, D., Sim, M., 2004. Robust linear optimization under general norms. Oper. Res. Lett. 32 (6), 510–516. Bertsimas, D., Brown, D.B., Caramanis, C., 2011. Theory and applications of robust optimization. SIAM Rev. 53 (3), 464–501. Beyer, H.-G., Sendhoff, B., 2007. Robust optimization – a comprehensive survey. Comput. Methods Appl. Mech. Eng. 196 (33), 3190–3218. Boubaker, H., Sghaier, N., 2013. Portfolio optimization in the presence of dependent financial returns with long memory: a copula based approach. J. Bank. Finance 37 (2), 361–377. Calfa, B.A., Grossmann, I.E., Agarwal, A., Bury, S.J., Wassick, J.M., 2015. Data-driven individual and joint chance-constrained optimization via kernel smoothing. Comput. Chem. Eng. 78, 51–69. Charnes, A., Cooper, W.W., 1959. Chance-constrained programming. Manage. Sci. 6 (1), 73–79. Delage, E., Ye, Y., 2010. Distributionally robust optimization under moment uncertainty with application to data-driven problems. Oper. Res. 58 (3), 595–612. Delage, E.H., 2009. Distributionally Robust Optimization in Context of Data-driven Problems. Stanford University. Fang, K.-T., Kotz, S., Ng, K.W., 1990. Symmetric Multivariate and Related Distributions. Chapman and Hall.

Ghaoui, L.E., Oks, M., Oustry, F., 2003. Worst-case value-at-risk and robust portfolio optimization: a conic programming approach. Oper. Res. 51 (4), 543–556. Guzman, Y.A., Matthews, L.R., Floudas, C.A., 2016. New a priori and a posteriori probabilistic bounds for robust counterpart optimization: I. Unknown probability distributions. Comput. Chem. Eng. 84, 568–598. Janak, S.L., Lin, X., Floudas, C.A., 2007. A new robust optimization approach for scheduling under uncertainty: II. Uncertainty with known probability distribution. Comput. Chem. Eng. 31 (3), 171–195. Jiang, R., Guan, Y., 2016. Data-driven chance constrained stochastic program. Math. Progr. 158 (1–2), 291–327. Jiao, Y., Su, H., Hou, W., Liao, Z., 2012. Optimization of refinery hydrogen network based on chance constrained programming. Chem. Eng. Res. Des. 90 (10), 1553–1567. Kakouris, I., Rustem, B., 2014. Robust portfolio optimization with copulas. Eur. J. Oper. Res. 235 (1), 28–37. Lauprete, G., Samarov, A., Welsch, R., 2003. Robust portfolio optimization. In: Developments in Robust Statistics. Springer, pp. 235–245. Li, Z., Ierapetritou, M., 2008. Process scheduling under uncertainty: review and challenges. Comput. Chem. Eng. 32 (4), 715–727. Li, Z., Ding, R., Floudas, C.A., 2011. A comparative theoretical and computational study on robust counterpart optimization: I. Robust linear optimization and robust mixed integer linear optimization. Ind. Eng. Chem. Res. 50 (18), 10567–10603. Li, Z., Tang, Q., Floudas, C.A., 2012. A comparative theoretical and computational study on robust counterpart optimization: II. Probabilistic guarantees on constraint satisfaction. Ind. Eng. Chem. Res. 51 (19), 6769–6788. Lin, X., Janak, S.L., Floudas, C.A., 2004. A new robust optimization approach for scheduling under uncertainty: I. Bounded uncertainty. Comput. Chem. Eng. 28 (6), 1069–1085. Mesfin, G., Shuhaimi, M., 2010. A chance constrained approach for a gas processing plant with uncertain feed conditions. Comput. Chem. Eng. 34 (8), 1256–1267. Miller, B.L., Wagner, H.M., 1965. Chance constrained programming with joint constraints. Oper. Res. 13 (6), 930–945. Nelsen, R.B., 2015. An Introduction to Copulas, Ser, Lecture Notes in Statistics. Springer, New York. Papaefthymiou, G., Kurowicka, D., 2009. Using copulas for modeling stochastic dependence in power system uncertainty analysis. IEEE Trans. Power Syst. 24 (1), 40–49. Possolo, A., 2010. Copulas for uncertainty analysis. Metrologia 47 (3), 262. Prékopa, A., 2013. In: Stochastic Programming, vol. 324. Springer Science & Business Media. Sahinidis, N.V., 2004. Optimization under uncertainty: state-of-the-art and opportunities. Comput. Chem. Eng. 28 (6), 971–983. Zhang, Y., Feng, Y., Rong, G., 2016. Data-driven chance constrained and robust optimization under matrix uncertainty. Ind. Eng. Chem. Res. 55 (21), 6145–6160. Zhang, Y., Feng, Y., Rong, G., 2016. New robust optimization approach induced by flexible uncertainty set: Optimization under continuous uncertainty. Ind. Eng. Chem. Res. 56 (1), 270–287. Zhao, H., Ierapetritou, M.G., Rong, G., 2016. Production planning optimization of an ethylene plant considering process operation and energy utilization. Comput. Chem. Eng. 87, 1–12. Zhou, Z., Zhang, J., Liu, P., Li, Z., Georgiadis, M.C., Pistikopoulos, E.N., 2013. A two-stage stochastic programming model for the optimal design of distributed energy systems. Appl. Energy 103, 135–144.

Please cite this article as: Y. Zhang et al., Reprint of: Data-driven robust optimization under correlated uncertainty: A case study of production scheduling in ethylene plant, Computers and Chemical Engineering (2018), https://doi.org/10.1016/j.compchemeng.2017.10.039