Computers and Chemical Engineering 28 (2004) 403–415

Global optimization of dynamic systems

I. Papamichail, C.S. Adjiman

Department of Chemical Engineering and Chemical Technology, Centre for Process Systems Engineering, Imperial College London, London SW7 2AZ, UK

Received 18 November 2002; received in revised form 30 July 2003; accepted 30 July 2003

Abstract

Many chemical engineering systems are described by differential equations. Their optimization is often complicated by the presence of nonconvexities. A deterministic spatial branch and bound global optimization algorithm is presented for problems with a set of first-order differential equations in the constraints. The global minimum is approached from above and below by generating converging sequences of upper and lower bounds. Local solutions, obtained using the sequential approach for the solution of the dynamic optimization problem, provide upper bounds. Lower bounds are produced from the solution of a convex relaxation of the original problem. Algebraic functions are relaxed using well-known convex underestimation techniques. The convex relaxation of the dynamic information is achieved using a new convex relaxation procedure. Parameter independent as well as parameter dependent bounds on the dynamic system are utilized. The global optimization algorithm is illustrated by applying it to case studies relevant to chemical engineering, where affine functions of the optimization variables are used as a relaxation of the dynamic system.

Keywords: Global optimization; Differential equations; Nonconvex dynamic optimization; Convex relaxation
1. Introduction

Dynamic modeling of most chemical engineering processes results in a set of differential equations. The dynamic optimization of these processes examines their performance under transient conditions and seeks to optimize their behavior by maximizing or minimizing a performance index subject to operating constraints. Applications of dynamic optimization include the determination of kinetic constants from time series data (parameter estimation), which is essential for the design and control of chemical engineering systems, the optimal control of batch and semi-batch chemical reactors, the optimal start-up and shut-down of continuous processes, the optimal switching of continuous systems from one steady-state operating point to another and the safety and environmental impact analysis of industrial processes. In most cases, numerous local solutions exist for problems of this type. Indeed, Luus and Cormack (1972) showed that this is true even for rather simple problems.
One class of approaches that can be applied for the numerical solution of the dynamic optimization problem uses variable discretization in order to transform the problem to a finite-dimensional nonlinear programming (NLP) problem. In complete discretization (known as the simultaneous approach) both the state variables and the controls are discretized (Oh & Luus, 1977; Biegler, 1984). The solution is carried out in the full space of variables. However, this method results in an NLP problem with a large number of variables and nonlinear equality constraints. In control parameterization (known as the sequential approach) only the controls are discretized (Goh & Teo, 1988; Vassiliadis, Sargent, & Pantelides, 1994). The problem is solved by applying an NLP strategy. The dynamic system is decoupled from the optimization stage and is integrated using well-established techniques in order to evaluate the objective function and the constraints. Due to the presence of nonconvexities of the functions participating in most chemical engineering models, current gradient-based numerical methods may fail to identify a solution for a feasible problem. Furthermore, if they succeed in finding a solution they can only guarantee that it is a local one. Local solutions can result in a poor economic performance, or can have a direct safety and environmental impact. There is therefore a need
to develop algorithms which can address these issues and guarantee optimal performance.

The last decades have seen the advent of global optimization algorithms applicable to broad classes of NLP problems. These algorithms fall into two categories, the stochastic (Boender & Romeijn, 1995) and the deterministic approaches (Horst & Tuy, 1996). Stochastic methods rely on statistical arguments to prove their convergence. On the other hand, deterministic methods guarantee the location of the global optimum solution within a prespecified tolerance. For a thorough description of deterministic methods and their applications the reader is referred to Floudas and Pardalos (2000) and Floudas (2000).

Several researchers have used stochastic optimization to address dynamic optimization problems. Dynamic programming utilizing grids was used by Luus (1990a, 1990b) and by Dadebo and McAuley (1995). Banga and Seider (1996) applied a stochastic algorithm called integrated controlled random search for dynamic systems (ICRS/DS) for the optimal control of chemical engineering systems. Carrasco and Banga (1997) applied ICRS/DS and a modification called adaptive randomly directed search for dynamic systems (ARDS/DS) for the dynamic optimization of chemical batch reactors. Banga, Irizarry-Rivera, and Seider (1998) used ICRS/DS for state constrained optimal and model predictive control. Luus and Hennessy (1999) used the direct search procedure of Luus and Jaakola (1973) for the dynamic optimization of fed-batch reactors and Hanke and Li (2000) utilized simulated annealing for the dynamic optimization of batch distillation processes.

Recently, deterministic algorithms have been considered for dynamic problems. Smith and Pantelides (1996) used their symbolic manipulation and spatial branch and bound (BB) algorithm and Esposito and Floudas (2000b) used the αBB method (Maranas & Floudas, 1994; Androulakis, Maranas, & Floudas, 1995; Adjiman & Floudas, 1996; Adjiman, Androulakis, & Floudas, 1998; Adjiman, Dallwig, Floudas, & Neumaier, 1998) for the solution of the completely discretized dynamic optimization problem. The αBB method was also used by Esposito and Floudas (2000a,b) for the solution of the NLP problem that arises from the use of the sequential approach on the dynamic optimization problem. Esposito and Floudas (2002) applied this algorithm to isothermal reactor network synthesis. A theoretical guarantee of attaining the global solution of the problem is offered as long as rigorous values for the parameters needed or rigorous bounds on these parameters are obtained. A deterministic spatial BB global optimization algorithm was presented by Papamichail and Adjiman (2002b) for problems with differential equations in the constraints. A new approach was proposed to handle the dynamic information using constant and nonlinear underestimators. The theoretical convergence of the algorithm was proved by Papamichail and Adjiman (2002a). Singer and Barton (2002) presented theoretical results that can be utilized in a BB framework for the global optimization of a nonconvex integral
objective function subject to an embedded linear dynamic system.

This paper is organized as follows. Section 2 gives the mathematical formulation of the problem studied. It is a nonconvex minimization problem with an initial value problem (IVP) for a set of first-order parameter dependent differential equations in the constraints. Parameter independent as well as parameter dependent dynamic bounds on the solution space of the IVP are constructed in Section 3. The formulation of the convex relaxation of the problem is presented in Section 4. Well-known techniques are used for the convex underestimation of algebraic functions. A new convex relaxation of the dynamic information is proposed based on the concepts presented in Section 3. Section 5 describes the global optimization algorithm developed. Computational case studies are discussed in Section 6. The case studies include a simple dynamic optimization problem and two parameter estimation problems in chemical kinetics modeling.
2. Problem statement

The mathematical formulation of the nonconvex dynamic optimization problem studied is given in this section. The assumptions made are clearly stated.

2.1. Dynamic model

Dynamic mass and energy balances in chemical engineering modeling yield differential equations. The system considered in this work is described by the following set of first-order parameter dependent, typically nonlinear, differential equations:

ẋ = f(t, x, p)   ∀t ∈ [t_0, t_NP],   (1)

where t ∈ R is the independent variable, p ∈ R^r are the parameters, x ∈ R^n are the state variables and ẋ ∈ R^n are their derivatives with respect to t (throughout this paper, a dot over a variable denotes its derivative with respect to t). The function f is such that f : [t_0, t_NP] × R^n × R^r → R^n. The solution x(t, p) of this set satisfies the initial condition

x(t_0, p) = x_0(p),   (2)

where the function x_0 is such that x_0 : R^r → R^n.

2.2. Constraints

Inequality constraints can be imposed at discrete points, t_i. These are point constraints of the form:

g_i(x(t_i, p), p) ≤ 0,   i = 0, 1, . . . , NP,   (3)

where t_i ∈ [t_0, t_NP] ⊂ R and NP is the number of points considered in addition to the initial point.
The functions g_i, i = 0, 1, . . . , NP, are such that g_i : R^n × R^r → R^{s_i}. Of course, equality point constraints can be replaced by two inequality point constraints. Lower and upper bounds are imposed on the parameters p:

p^L ≤ p ≤ p^U.   (4)

2.3. Objective function

The objective function for a dynamic optimization problem can be expressed in terms of the values of the state variables at discrete points and of the parameters:

J(x(t_i, p), p; i = 0, 1, . . . , NP).   (5)

The function J is such that J : R^{n(NP+1)} × R^r → R. Integral terms that may appear in the objective function can always be eliminated by introducing additional state variables and equations in the set of differential equations.

2.4. Dynamic optimization problem

The formulation of the dynamic optimization problem studied is given by

min J(x(t_i, p), p; i = 0, 1, . . . , NP)
s.t. ẋ = f(t, x, p)   ∀t ∈ [t_0, t_NP]
     x(t_0, p) = x_0(p)
     g_i(x(t_i, p), p) ≤ 0,   i = 0, 1, . . . , NP
     p^L ≤ p ≤ p^U   (6)

The minimization of the objective function (5) is considered subject to the dynamics of the system, described by IVP (1) and (2), the point constraints (3) and the bounds on the parameters (4). Systems with controls that depend on t can be transformed to this form using control parameterization (Vassiliadis et al., 1994). The following assumptions are made on the properties of the functions in (6):

• J(x(t_i, p), p; i = 0, 1, . . . , NP) is once continuously differentiable with respect to x(t_i, p), i = 0, 1, . . . , NP, and p on R^{n(NP+1)} × R^r.
• Each element of g_i(x(t_i, p), p), i = 0, 1, . . . , NP, is once continuously differentiable with respect to x(t_i, p) and p on R^n × R^r.
• Each element of f(t, x, p) is continuous with respect to t and once continuously differentiable with respect to x and p on [t_0, t_NP] × R^n × R^r.
• Each element of x_0(p) is once continuously differentiable with respect to p on R^r.
• f(t, x, p) satisfies a uniqueness condition on [t_0, t_NP] × R^n × R^r.

The sequential approach is used for the solution of this dynamic optimization problem. The gradients with respect to p can be evaluated using the parameter sensitivities. These are given from the solution of the sensitivity equations (Vassiliadis et al., 1994). Due to the generally nonconvex nature of the functions used in the formulation of the dynamic optimization problem, the solution obtained using the sequential approach and a standard gradient-based NLP technique is a local optimum and therefore provides an upper bound for the global optimum solution.

3. Bounds on the solution of an IVP for a set of parameter dependent differential equations

The dependence of convex relaxations on variable bounds is a common feature of deterministic global optimization algorithms. Since the state variables appear in the nonconvex objective function and constraints, a method for the derivation of rigorous bounds on these variables at point t_i, i = 0, 1, . . . , NP, is needed. This issue can be resolved by generating bounds on the solution space of the dynamic system. The following IVP for a first-order parameter dependent differential equation is considered as an example:

ẋ = −x² + v   ∀t ∈ [0, 1]
x(t = 0, v) = 9   (7)

where v ∈ [−5, 5]. The solution x(t, v) of this IVP for different values of v is shown in Fig. 1. It can be observed that the solution for the upper bound of v and the solution for the lower bound of v give bounds on the trajectories. The aim of this section is to propose a systematic approach for the derivation of such bounds, applicable to IVPs for a system of generally nonlinear first-order parameter dependent differential equations.
[Fig. 1. Trajectories of the state variable for different values of v (v = 5, 2, 0, −2, −5).]
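To make the behaviour in Fig. 1 concrete, the following minimal sketch (not part of the original paper; it assumes NumPy and SciPy are available, whereas the authors' implementation used MATLAB) integrates IVP (7) for the sampled values of v used in the figure:

```python
# Sketch: trajectories of IVP (7), x' = -x^2 + v, x(0) = 9, for sampled v.
# Assumes SciPy >= 1.4 (for the args keyword of solve_ivp).
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, x, v):
    return -x**2 + v

t_eval = np.linspace(0.0, 1.0, 101)
traj = {v: solve_ivp(rhs, (0.0, 1.0), [9.0], args=(v,),
                     t_eval=t_eval, rtol=1e-8, atol=1e-10).y[0]
        for v in (-5.0, -2.0, 0.0, 2.0, 5.0)}

# The v = -5 and v = 5 trajectories bracket all the others pointwise.
assert all(np.all(traj[-5.0] <= x + 1e-6) and np.all(x <= traj[5.0] + 1e-6)
           for x in traj.values())
```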
First, some key definitions are given.

Definition 3.1. Let t_0 < t_NP ∈ R, I = [t_0, t_NP], I_0 = (t_0, t_NP] and the class X be defined as the set of all functions x : I → R^n, continuous on I and differentiable on I_0. Let also x = (x_1, x_2, . . . , x_n)^T and x_{k−} = (x_1, x_2, . . . , x_{k−1}, x_{k+1}, . . . , x_n)^T. The notation f(t, x, p) = f(t, x_k, x_{k−}, p) is used.

Definition 3.2. Let g(x) be a mapping g : D → R with D ⊆ R^n. Again the notation g(x) = g(x_k, x_{k−}) is used. The function g is called unconditionally partially isotone (antitone) on D with respect to the variable x_k if

g(x_k, x_{k−}) ≤ g(x̃_k, x_{k−})   for x_k ≤ x̃_k (x_k ≥ x̃_k)

and for all (x_k, x_{k−}), (x̃_k, x_{k−}) ∈ D.

Definition 3.3. Let f(t, x, p) = (f_1(t, x, p), . . . , f_n(t, x, p))^T and each f_k(t, x_k, x_{k−}, p) be unconditionally partially isotone on I × R × R^{n−1} × R^r with respect to any component of x_{k−}, but not necessarily with respect to x_k. Then f is quasi-monotone increasing on I × R^n × R^r with respect to x. (This name is due to Walter (1970).) It is interesting to notice that if n = 1, then any function f(t, x, p) is quasi-monotone increasing with respect to x.

The following IVP for a system of first-order parameter dependent differential equations is studied:

ẋ = f(t, x, p)   ∀t ∈ I
x(t = t_0, p) = x_0(p)   (8)

where x and ẋ ∈ R^n, p ∈ [p^L, p^U] ⊂ R^r, f : I × R^n × [p^L, p^U] → R^n and x_0 : [p^L, p^U] → R^n. Parameter independent bounds on the solution are first discussed. These have been derived by Papamichail and Adjiman (2002b) and are presented for completeness. The analysis continues with parameter dependent bounds.

3.1. Parameter independent bounds

Lower and upper parameter independent bounds can be determined for the solution x(t, p) of IVP (8) such that x̲(t) ≤ x(t, p) ≤ x̄(t) ∀p ∈ [p^L, p^U] ∀t ∈ I, where the inequalities are understood component-wise.

Theorem 3.1 (Papamichail & Adjiman, 2002b). Let f be continuous and satisfy a uniqueness condition on I_0 × R^n × [p^L, p^U]. If x̲(t), x̄(t) ∈ X satisfy the following inequalities:

x̲(t_0) ≤ □x_0([p^L, p^U]) ≤ x̄(t_0)
ẋ̲_k(t) ≤ □f_k(t, x̲_k(t), [x̲_{k−}(t), x̄_{k−}(t)], [p^L, p^U])   ∀t ∈ I_0 and k = 1, 2, . . . , n
ẋ̄_k(t) ≥ □f_k(t, x̄_k(t), [x̲_{k−}(t), x̄_{k−}(t)], [p^L, p^U])   ∀t ∈ I_0 and k = 1, 2, . . . , n

then x̲(t) is a subfunction and x̄(t) is a superfunction for the solution x(t, p) of IVP (8), i.e.

x̲(t) ≤ x(t, p) ≤ x̄(t)   ∀p ∈ [p^L, p^U]   ∀t ∈ I.

□x_0([p^L, p^U]) denotes the range of x_0 over the domain [p^L, p^U]. The reader is referred to Appendix A where the basic concepts of interval analysis are outlined.

If f is continuous and satisfies a uniqueness condition on I_0 × R^n × [p^L, p^U], then the solution x̲(t) and x̄(t) of the following IVP satisfies Theorem 3.1:

x̲(t_0) = inf x_0([p^L, p^U])
x̄(t_0) = sup x_0([p^L, p^U])
ẋ̲_k = inf f_k(t, x̲_k, [x̲_{k−}, x̄_{k−}], [p^L, p^U])   ∀t ∈ I and k = 1, 2, . . . , n
ẋ̄_k = sup f_k(t, x̄_k, [x̲_{k−}, x̄_{k−}], [p^L, p^U])   ∀t ∈ I and k = 1, 2, . . . , n   (9)

This IVP provides a practical procedure to construct bounding trajectories for IVP (8) if the appropriate continuity and uniqueness conditions are satisfied. Natural interval extensions are used as inclusion functions (see Appendix A). For the case of functions f(t, x, p), which are quasi-monotone increasing on I × R^n × [p^L, p^U] with respect to x, IVP (9) is decoupled. If that is not the case, then the so-called wrapping effect (the phenomenon that appears when a set, which is not representable exactly by an interval vector, has to be enclosed in an interval vector) produces poor enclosures. Moore (1966) noticed this effect and proposed a coordinate transformation. Nickel (1985) examined the wrapping effect and identified systems where it can be eliminated. Parallelotope enclosures (Lohner, 1987), ellipsoids (Neumaier, 1993) and high order zonotope enclosures (Kühn, 1998) have also been used for the reduction of the wrapping effect.

The example discussed at the beginning of this section is reconsidered here. Based on (9), IVPs whose solutions x̲(t) and x̄(t) can give bounds on the solution of IVP (7) ∀v ∈ [−5, 5] are constructed. The function f(t, x, v) = −x² + v on the right-hand side of the differential equation in IVP (7) is quasi-monotone increasing on [0, 1] × R × [−5, 5] with respect to x and therefore, the bounding system is decoupled. The subfunction is given by

ẋ̲ = −x̲² − 5   ∀t ∈ [0, 1]
x̲(0) = 9   (10)

and the superfunction is given by

ẋ̄ = −x̄² + 5   ∀t ∈ [0, 1]
x̄(0) = 9   (11)

The solutions of these bounding IVPs are shown in Fig. 2. They enclose the solution space of the original IVP for the parameter dependent differential equation.

[Fig. 2. Parameter independent bounds on the solution space.]
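A minimal numerical sketch of this construction (an illustration assuming SciPy is available, not the paper's MATLAB implementation) integrates the decoupled bounding IVPs (10) and (11) and checks the enclosure on sampled trajectories of IVP (7):

```python
# Sketch: parameter independent bounds from IVPs (10) and (11). Because
# f(t, x, v) = -x^2 + v is quasi-monotone increasing in x (n = 1), IVP (9)
# decouples into one subfunction IVP and one superfunction IVP.
import numpy as np
from scipy.integrate import solve_ivp

t_eval = np.linspace(0.0, 1.0, 101)
x_lo = solve_ivp(lambda t, x: -x**2 - 5.0, (0.0, 1.0), [9.0],
                 t_eval=t_eval, rtol=1e-8, atol=1e-10).y[0]   # IVP (10)
x_hi = solve_ivp(lambda t, x: -x**2 + 5.0, (0.0, 1.0), [9.0],
                 t_eval=t_eval, rtol=1e-8, atol=1e-10).y[0]   # IVP (11)

# Every trajectory of IVP (7) with v in [-5, 5] must lie inside [x_lo, x_hi].
for v in np.linspace(-5.0, 5.0, 11):
    x = solve_ivp(lambda t, x: -x**2 + v, (0.0, 1.0), [9.0],
                  t_eval=t_eval, rtol=1e-8, atol=1e-10).y[0]
    assert np.all(x_lo - 1e-6 <= x) and np.all(x <= x_hi + 1e-6)
```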
3.2. Parameter dependent bounds

Using the parameter independent bounds already presented, lower and upper parameter dependent bounds can be determined for the solution x(t, p) of IVP (8) such that x̳(t, p) ≤ x(t, p) ≤ x̿(t, p) ∀p ∈ [p^L, p^U] ∀t ∈ I. The sufficient conditions are given by the following key theorem.

Theorem 3.2. Let f be continuous and satisfy a uniqueness condition on I_0 × R^n × [p^L, p^U] and x̲(t), x̄(t) ∈ X satisfy the following inequalities:

x̲(t_0) ≤ □x_0([p^L, p^U]) ≤ x̄(t_0)
ẋ̲_k(t) ≤ □f_k(t, x̲_k(t), [x̲_{k−}(t), x̄_{k−}(t)], [p^L, p^U])   ∀t ∈ I_0 and k = 1, 2, . . . , n
ẋ̄_k(t) ≥ □f_k(t, x̄_k(t), [x̲_{k−}(t), x̄_{k−}(t)], [p^L, p^U])   ∀t ∈ I_0 and k = 1, 2, . . . , n

Let also f̲ : I × R^n × [p^L, p^U] → R^n, x̳_0 : [p^L, p^U] → R^n, f̄ : I × R^n × [p^L, p^U] → R^n and x̿_0 : [p^L, p^U] → R^n. If

f̲(t, x, p) ≤ f(t, x, p)   ∀x ∈ [x̲(t), x̄(t)]   ∀p ∈ [p^L, p^U]   ∀t ∈ I,   (12)
x̳_0(p) ≤ x_0(p)   ∀p ∈ [p^L, p^U]   (13)

and f̲ is quasi-monotone increasing on I × R^n × [p^L, p^U] with respect to x, then the solution x̳(t, p) of the IVP

ẋ̳ = f̲(t, x̳, p)   ∀t ∈ I
x̳(t_0, p) = x̳_0(p)   (14)

is such that

x̳(t, p) ≤ x(t, p)   ∀p ∈ [p^L, p^U]   ∀t ∈ I,   (15)

where x(t, p) is the solution of IVP (8). If

f(t, x, p) ≤ f̄(t, x, p)   ∀x ∈ [x̲(t), x̄(t)]   ∀p ∈ [p^L, p^U]   ∀t ∈ I,
x_0(p) ≤ x̿_0(p)   ∀p ∈ [p^L, p^U]

and f̄ is quasi-monotone increasing on I × R^n × [p^L, p^U] with respect to x, then the solution x̿(t, p) of the IVP

ẋ̿ = f̄(t, x̿, p)   ∀t ∈ I
x̿(t_0, p) = x̿_0(p)   (16)

is such that

x(t, p) ≤ x̿(t, p)   ∀p ∈ [p^L, p^U]   ∀t ∈ I.   (17)

Proof. The proof is presented in Appendix B.

In what follows, parameter dependent bounds are constructed for the example already discussed above. The function f(t, x, v) = −x² + v can be underestimated by the function f̲(t, x, v) = −(x̲ + x̄)x + x̲x̄ + v ∀x ∈ [x̲, x̄] ∀v ∈ [−5, 5] ∀t ∈ [0, 1]. Function f̲ is quasi-monotone increasing on [0, 1] × R × [−5, 5] with respect to x. Based on Theorem 3.2, the solution x̳(t, v) of the IVP

ẋ̳ = −(x̲ + x̄)x̳ + x̲x̄ + v   ∀t ∈ [0, 1]
x̳(0, v) = 9   (18)

is such that

x̳(t, v) ≤ x(t, v)   ∀v ∈ [−5, 5]   ∀t ∈ [0, 1],

where x̲(t) and x̄(t) are given from the solutions of IVPs (10) and (11) and x(t, v) is the solution of IVP (7). In the same manner it can be shown that the solution x̿_1(t, v) of the IVP

ẋ̿_1 = −2x̲x̿_1 + x̲² + v   ∀t ∈ [0, 1]
x̿_1(0, v) = 9   (19)

is such that

x(t, v) ≤ x̿_1(t, v)   ∀v ∈ [−5, 5]   ∀t ∈ [0, 1]

and the solution x̿_2(t, v) of the IVP

ẋ̿_2 = −2x̄x̿_2 + x̄² + v   ∀t ∈ [0, 1]
x̿_2(0, v) = 9   (20)

is such that

x(t, v) ≤ x̿_2(t, v)   ∀v ∈ [−5, 5]   ∀t ∈ [0, 1].

All these parameter dependent bounds, for v = 0, together with the parameter independent ones are shown in Fig. 3. It can be observed that the second overestimator is tighter than the first one ∀t ∈ [0, 1]. Although the first overestimator crosses the parameter independent upper bound at t = 0.8, based on Theorem 3.2, it is still a valid overestimator.

[Fig. 3. Parameter independent and dependent bounds for v = 0. The solution x(t, 0) (—), the underestimator (· · ·), the first overestimator (- · -) and the second overestimator (- - -) are shown.]
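The bounding IVPs (18)–(20) involve the parameter independent bounds in their right-hand sides, so in practice all five equations are integrated together. The following sketch (an illustration assuming SciPy, not the authors' code) reproduces the v = 0 curves of Fig. 3:

```python
# Sketch: parameter dependent bounds (18)-(20) for v = 0, integrated jointly
# with the parameter independent bounds (10)-(11) they depend on.
# State vector: y = [x_lo, x_hi, under, over1, over2].
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, y, v):
    x_lo, x_hi, under, over1, over2 = y
    return [-x_lo**2 - 5.0,                           # IVP (10)
            -x_hi**2 + 5.0,                           # IVP (11)
            -(x_lo + x_hi)*under + x_lo*x_hi + v,     # IVP (18)
            -2.0*x_lo*over1 + x_lo**2 + v,            # IVP (19)
            -2.0*x_hi*over2 + x_hi**2 + v]            # IVP (20)

v = 0.0
bounds = solve_ivp(rhs, (0.0, 1.0), [9.0]*5, args=(v,), rtol=1e-8, atol=1e-10)
x = solve_ivp(lambda t, x: -x**2 + v, (0.0, 1.0), [9.0],
              t_eval=bounds.t, rtol=1e-8, atol=1e-10).y[0]
# As in Fig. 3: under <= x(t, 0) <= min(over1, over2) for all t in [0, 1].
assert np.all(bounds.y[2] - 1e-6 <= x)
assert np.all(x <= bounds.y[3:5].min(axis=0) + 1e-6)
```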
4. Formulation of the convex relaxation

The dynamic optimization problem has been formulated as a nonconvex NLP problem in Section 2. The solution obtained provides an upper bound for the global optimum solution. A convex relaxation is now needed and is formulated based on well-known techniques and the theory developed in the previous section. Its solution can be used as a lower bound for the global optimum of the nonconvex problem. A reformulation of the NLP problem (6) is given by

min_{x̂,p} J(x̂, p)
s.t. g_i(x̂_i, p) ≤ 0,   i = 0, 1, . . . , NP
     x̂_i = x(t_i, p),   i = 0, 1, . . . , NP
     p ∈ [p^L, p^U]   (21)

where the values of x(t_i, p), i = 0, 1, . . . , NP, are obtained from the solution of the IVP

ẋ = f(t, x, p)   ∀t ∈ I
x(t_0, p) = x_0(p)   (22)

4.1. Bounds on variables

As previously noted, it is essential to have bounds on all the variables participating in a nonconvex manner. In problem (21), bounds on p are user-specified. The bounds on x̂_i depend on the parameter bounds and must be derived automatically. Based on the properties of the IVP and Section 3.1, parameter independent bounds x̲(t) and x̄(t) can be constructed for the solution x(t, p) of IVP (22). These are given from the solution of IVP (9) and for t = t_i they are also valid for the variable vectors x̂_i that have been introduced in the reformulated NLP problem:

x̲(t_i) ≤ x̂_i ≤ x̄(t_i),   i = 0, 1, . . . , NP.   (23)

4.2. Convex relaxation of algebraic functions

It is assumed that the functions J and g_{ij}, i = 0, 1, . . . , NP, j = 1, 2, . . . , s_i, can be decomposed into a sum of terms, where each term may be classified as convex, bilinear or univariate concave. Convex terms do not require any transformation. Bilinear terms and univariate concave terms are underestimated using well-known techniques.

The convex envelope for the bilinear term z_1 z_2 over the domain [z_1^L, z_1^U] × [z_2^L, z_2^U] was proposed by McCormick (1976) and is given by

max(z_1^L z_2 + z_2^L z_1 − z_1^L z_2^L, z_1^U z_2 + z_2^U z_1 − z_1^U z_2^U).

This result was proved by Al-Khayyal and Falk (1983). Each bilinear term is replaced by a new variable w defined by w = z_1 z_2. In the relaxed problem, this equation has to be replaced by a convex underestimator and a concave overestimator. Based on the McCormick underestimators, the convex overestimation of the space for w is given by

w ≥ z_1^L z_2 + z_2^L z_1 − z_1^L z_2^L
w ≥ z_1^U z_2 + z_2^U z_1 − z_1^U z_2^U
w ≤ z_1^L z_2 + z_2^U z_1 − z_1^L z_2^U
w ≤ z_1^U z_2 + z_2^L z_1 − z_1^U z_2^L

For a univariate concave function f_UT(z), the convex envelope over the domain [z^L, z^U] is simply given by the affine function of z:

f_UT(z^L) + [(f_UT(z^U) − f_UT(z^L)) / (z^U − z^L)] (z − z^L).

An overall convex underestimator is given by the summation of the convex underestimators for each term in the function and the introduction of the additional constraints required for the bilinear terms.
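The following sketch (illustrative Python, not part of the paper) evaluates the four McCormick cuts for a bilinear term; in the relaxed problem these appear as linear constraints on the new variable w:

```python
# Sketch: McCormick relaxation of w = z1*z2 on [z1L, z1U] x [z2L, z2U],
# evaluated pointwise: the tightest under- and overestimating cuts.
def mccormick(z1, z2, z1L, z1U, z2L, z2U):
    """Return (convex underestimator, concave overestimator) of z1*z2."""
    under = max(z1L*z2 + z2L*z1 - z1L*z2L,   # w >= ... cuts
                z1U*z2 + z2U*z1 - z1U*z2U)
    over = min(z1L*z2 + z2U*z1 - z1L*z2U,    # w <= ... cuts
               z1U*z2 + z2L*z1 - z1U*z2L)
    return under, over

# The cuts sandwich the bilinear term everywhere in the box and are exact at
# its corners.
lo, hi = mccormick(0.5, -1.0, 0.0, 2.0, -3.0, 1.0)
assert lo <= 0.5 * (-1.0) <= hi
```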
4.3. Convex relaxation of the dynamic information

The set of equalities in problem (21) can be written as two sets of inequalities:

x̂_i − x(t_i, p) ≤ 0,   i = 0, 1, . . . , NP
x(t_i, p) − x̂_i ≤ 0,   i = 0, 1, . . . , NP

Their relaxation is given by

x̂_i + x̆_−(t_i, p) ≤ 0,   i = 0, 1, . . . , NP   (24)
x̆(t_i, p) − x̂_i ≤ 0,   i = 0, 1, . . . , NP   (25)

where '˘' denotes the convex underestimator of the specified function and x_−(t_i, p) = −x(t_i, p). Thus, the function x̆(t_i, p) is a convex underestimator of x(t_i, p) and the function −x̆_−(t_i, p) is a concave overestimator of x(t_i, p). The generation of these under and overestimators is the most challenging step in the construction of the convex relaxation of the problem because no analytical form is available for x(t_i, p). Two strategies have been developed.

4.3.1. Constant bounds

The constant bounds given by inequalities (23) are valid convex underestimators and concave overestimators for x(t_i, p). This means that inequalities (24) and (25) can be replaced by inequalities (23). These bounds do not depend on the parameters p themselves, but do depend on the bounds on p.
4.3.2. Affine bounds

A tighter convex relaxation may be derived if parameter dependent under and overestimators can be constructed. This issue is tackled in this section, where affine functions of the parameters are obtained. If f̲(t, x, p) = A̲(t)x + B̲(t)p + C̲(t) and x̳_0(p) = D̲p + E̲, where A̲(t), B̲(t) and C̲(t) are continuous on I, then from linear systems theory (Zadeh & Desoer, 1963) the solution of IVP (14) is given by

x̳(t, p) = [Φ(t, t_0)D̲ + ∫_{t_0}^{t} Φ(t, τ)B̲(τ) dτ] p + Φ(t, t_0)E̲ + ∫_{t_0}^{t} Φ(t, τ)C̲(τ) dτ,   (26)

where Φ(t, t_0) is the transition matrix, which is the solution of the IVP

Φ̇(t, t_0) = A̲(t)Φ(t, t_0)   ∀t ∈ I
Φ(t_0, t_0) = I

and I is the identity matrix. From Eq. (26), it is clear that x̳(t, p) is an affine function of p of the form:

x̳(t, p) = M̲(t)p + N̲(t),   (27)

where M̲(t) is an n × r matrix and N̲(t) is an n × 1 matrix. In the same manner, if there exist functions f̄(t, x̿, p) = Ā(t)x̿ + B̄(t)p + C̄(t) and x̿_0(p) = D̄p + Ē, where Ā(t), B̄(t) and C̄(t) are continuous on I, then the solution of IVP (16) is of the form:

x̿(t, p) = M̄(t)p + N̄(t),   (28)

where M̄(t) is an n × r matrix and N̄(t) is an n × 1 matrix.

If the functions f̲, x̳_0, f̄ and x̿_0 defined above satisfy the conditions of Theorem 3.2, then Eqs. (27) and (28) provide parameter dependent bounds for the solution of IVP (22). Functions f̲, x̳_0, f̄ and x̿_0 that satisfy at least the inequality conditions can easily be constructed for dynamic systems with functions f and x_0 which can be decomposed into a sum of linear, bilinear, trilinear, univariate convex and univariate concave terms. The matrices A̲ to E̲ and Ā to Ē needed for the construction of these functions usually depend on the bounds p^L, p^U, x̲(t) and x̄(t). However, the functional form of x̲(t) and x̄(t) is not known and the matrices needed in (27) and (28) cannot be calculated analytically. The affine bounds for t = t_i can be used for the convex underestimation of x(t_i, p) and x_−(t_i, p) over the domain [p^L, p^U] ⊂ R^r:

x̆(t_i, p) = M̲(t_i)p + N̲(t_i),   i = 0, 1, . . . , NP   (29)
x̆_−(t_i, p) = −M̄(t_i)p − N̄(t_i),   i = 0, 1, . . . , NP   (30)

M̲(t_i) and N̲(t_i), i = 0, 1, . . . , NP, can be calculated from the solution of the linear system produced when Eq. (27) is applied for t = t_i and for r + 1 values of p chosen such that

det [ p_1  p_2  · · ·  p_{r+1}
       1    1   · · ·   1     ] ≠ 0.   (31)

This system has the form x̳(t_i, p_j) = M̲(t_i)p_j + N̲(t_i), j = 1, . . . , r + 1. Condition (31) ensures that a unique solution exists. The values of x̳(t_i, p_j), j = 1, . . . , r + 1, are given from the numerical solution of IVP (14) coupled, when necessary, with IVP (9). In the same manner, M̄(t_i) and N̄(t_i), i = 0, 1, . . . , NP, can be calculated.
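A small sketch of this recipe (with a hypothetical routine solve_lower_ivp standing in for the paper's ode45-based integration of IVP (14)) recovers M̲(t_i) and N̲(t_i) from r + 1 sampled parameter vectors:

```python
# Sketch: recover the affine-bound matrices of Eq. (27) from r + 1 solutions
# of IVP (14), following condition (31). solve_lower_ivp(t_i, p) is assumed
# to return the n-vector solution of IVP (14) at t_i (coupled, when
# necessary, with IVP (9)).
import numpy as np

def affine_bound_matrices(solve_lower_ivp, t_i, p_points):
    """p_points: r + 1 parameter vectors whose matrix in (31) is nonsingular."""
    r = len(p_points[0])
    X = np.column_stack([solve_lower_ivp(t_i, p) for p in p_points])  # n x (r+1)
    P = np.vstack([np.column_stack(p_points), np.ones(r + 1)])        # (r+1) x (r+1)
    MN = X @ np.linalg.inv(P)        # [M N] solves x_j = M p_j + N for all j
    return MN[:, :r], MN[:, r:]      # M(t_i): n x r, N(t_i): n x 1
```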
4.3.3. Comparison of the two strategies

The solution x(t, v) of IVP (7), for t = 1, is a concave function of the parameter v, as shown in Fig. 4(a) using a solid line. The two methods proposed can be applied to construct valid convex relaxations on the domain [−5, 5]. The constant bounds for the whole range of parameters are given from the solution of IVPs (10) and (11) and are shown using the dashed lines. The affine bounds are shown using the dotted lines. They are given from Eqs. (27) and (28), where the matrices needed are calculated from the solution of the linear system produced when these equations are applied for t = 1 and for v = v^L and v = v^U. The values of the state variables are given from the numerical solution at t = 1 for v = v^L and v = v^U of IVPs (18)–(20) coupled, when necessary, with IVPs (10) and (11). In Fig. 4(b), the domain of v is divided into two subdomains and the two strategies are applied again. The affine underestimator is tighter than the constant lower bound and one of the two affine overestimators is tighter than the constant upper bound. Although the second affine overestimator is not that tight for the whole range of v, it reduces the convex space even more. Since the constant bounds are generated at no extra cost when the affine bounds are used, the relaxation strategies used in practice always involve the constant bounds, with or without the affine bounds.

[Fig. 4. Over and underestimators for the solution of IVP (7) for t = 1. The solution x(1, v) (—), the constant bounds (- - -) and the affine bounds (· · ·) are shown. (a) One region: v ∈ [−5, 5]. (b) Two regions: v ∈ [−5, 0] and v ∈ [0, 5].]
4.4. Convex relaxation of the NLP

After underestimating the objective function and overestimating the feasible region, the convex relaxation of the NLP problem (21) is given by

min_{x̂,p,w} J̆(x̂, p, w)
s.t. ğ_i(x̂_i, p, w) ≤ 0,   i = 0, 1, . . . , NP
     x̲(t_i) ≤ x̂_i ≤ x̄(t_i),   i = 0, 1, . . . , NP
     C(x̂, p, w) ≤ 0
     p ∈ [p^L, p^U]   (32)

where 'J̆' denotes the convex underestimator of the specified function, C denotes the set of additional constraints arising from the convex relaxation of bilinear terms and w denotes the vector of new variables introduced by this relaxation. If the affine bounds are additionally used for the convex relaxation of the set of equality constraints then the following constraints can be added to the above formulation:

x̂_i + x̆_−(t_i, p) ≤ 0,   i = 0, 1, . . . , NP
x̆(t_i, p) − x̂_i ≤ 0,   i = 0, 1, . . . , NP   (33)

where x̆(t_i, p) and x̆_−(t_i, p) are given from Eqs. (29) and (30).

5. Global optimization algorithm

After constructing the convex relaxation of the original NLP problem, a deterministic spatial BB global optimization algorithm, which follows the one by Horst and Tuy (1996), can be used in order to obtain the global minimum within an optimality margin. Given a relative optimality margin, ε_r, and a maximum number of iterations, MaxIter:

Step 1. Initialization
• Set the upper bound on the objective function: J^u := +∞.
• Initialize the iteration counter: Iter := 0.
• Initialize a list of subregions L to an empty list: L := ∅.
• Initialize a region R to the region covering the full domain of variables p: R := [p^L, p^U].

Step 2. Upper bound
• Solve the original NLP problem with bounds on p given by R.
• If a feasible solution p_R is obtained with objective function J^u_R, then set the best feasible solution p* := p_R and J^u := J^u_R.

Step 3. Lower bound
• Obtain bounds on the differential variables.
• If affine bounds can be constructed and are additionally used for the convex relaxation of the set of equality constraints, then obtain the necessary matrices.
• Form the convex relaxation of the problem for R and solve it.
• If a feasible solution p*_R is obtained for R with objective function J^l_R, then add R to the list L together with J^l_R and p*_R.

Step 4. Subregion selection
• If the list L is empty, then the problem is infeasible. Terminate.
• Otherwise set the region R to the region from the list L with the lowest lower bound: R := arg min_{R_i ∈ L} J^l_{R_i}.
• Remove R from the list L.

Step 5. Checking for convergence
• If (J^u − J^l_R)/|J^l_R| ≤ ε_r, then the solution is p* with an objective function J^u. Terminate.
• If Iter = MaxIter, then terminate and report (J^u − J^l_R)/|J^l_R|.
• Otherwise increase the iteration counter by one: Iter := Iter + 1.

Step 6. Branching within R
• Apply the least reduced axis rule on region R to choose a variable on which to branch and generate two new subregions R_1 and R_2 which are a partition of R.
Step 7. Upper bound for each region
• For i = 1, 2, solve the original NLP problem with bounds on p given by R_i.
• For i = 1, 2, if a feasible solution p_{R_i} is obtained with objective function J^u_{R_i} < J^u, then update the best feasible solution found so far p* := p_{R_i}, set J^u := J^u_{R_i} and remove from the list L all subregions R such that J^l_R > J^u.

Step 8. Lower bound for each region
• Obtain bounds on the differential variables.
• If affine bounds can be constructed and are additionally used for the convex relaxation of the set of equality constraints, then obtain the necessary matrices.
• Form the convex relaxation of the problem for each subregion R_1 and R_2 and solve it.
• For i = 1, 2, if a feasible solution p*_{R_i} is obtained for R_i with objective function J^l_{R_i}, then:
  ◦ If affine bounds are used and J^l_{R_i} < J^l_R, then set J^l_{R_i} := J^l_R.
  ◦ If J^l_{R_i} ≤ J^u, then add R_i to the list L together with J^l_{R_i} and p*_{R_i}.
• Go to Step 4.

In order to reduce the computational expense arising from the repeated solution of local dynamic optimization problems, the upper bound generation (Step 7) does not have to be applied at every iteration of the algorithm. This does not affect the ability of the algorithm to identify the global solution.

In the BB algorithm of Horst and Tuy (1996), if the relaxed problem is feasible for a region, then it has to be at least as tight as the relaxation at its parent node to ensure that the bounding operation is improving. This is true when only constant bounds are used for the relaxation of the set of equality constraints because of their theoretical properties (Papamichail & Adjiman, 2002a). A step to enforce this requirement was included in the algorithm for the case in which affine bounds are used additionally to the constant bounds (Step 8, fifth bullet point).
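The following high-level sketch (with hypothetical helper routines solve_upper, solve_lower and branch standing in for the NLP solves of Steps 2, 3, 7 and 8 and for the least reduced axis rule) shows the control flow of the algorithm:

```python
# Sketch of the spatial branch and bound loop of Section 5. Each solver is
# assumed to return None if infeasible, else an object with fields .obj, .x.
import heapq, itertools

def spatial_bb(region, solve_upper, solve_lower, branch,
               eps_r=1e-3, max_iter=1000):
    J_up, p_best = float('inf'), None
    ub = solve_upper(region)                      # Step 2: local solution
    if ub is not None:
        J_up, p_best = ub.obj, ub.x
    tie, heap = itertools.count(), []
    lb = solve_lower(region)                      # Step 3: convex relaxation
    if lb is not None:
        heapq.heappush(heap, (lb.obj, next(tie), region))
    for _ in range(max_iter):
        if not heap:                              # Step 4: list empty
            return p_best, J_up
        J_lo, _, R = heapq.heappop(heap)          # Step 4: lowest lower bound
        if J_up - J_lo <= eps_r * abs(J_lo):      # Step 5: converged
            return p_best, J_up
        for Ri in branch(R):                      # Step 6: partition R
            ub = solve_upper(Ri)                  # Step 7: update incumbent
            if ub is not None and ub.obj < J_up:
                J_up, p_best = ub.obj, ub.x
                heap = [e for e in heap if e[0] <= J_up]   # fathom
                heapq.heapify(heap)
            lb = solve_lower(Ri)                  # Step 8: bound and store
            if lb is not None and lb.obj <= J_up:
                # max(...) enforces the improving relaxation (affine case)
                heapq.heappush(heap, (max(lb.obj, J_lo), next(tie), Ri))
    return p_best, J_up
```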
6. Case studies

The global optimization algorithm presented in Section 5 was implemented using MATLAB 5.3 (The MathWorks, 1999). The function fmincon was used for the solution of NLP problems. It is an implementation of a general NLP solver provided by the Optimization Toolbox 2.0 (Coleman, Branch, & Grace, 1999) for MATLAB. This solver uses either a subspace trust region method, based on the interior-reflective Newton method, or a sequential quadratic programming method. The MATLAB function ode45 (Shampine & Reichelt, 1997) was used for the integration of IVPs. It is an implementation of a Runge–Kutta method based on the Dormand–Prince pair. The interval calculations needed were performed explicitly using interval arithmetic. It is worth noting that this prototype implementation is not optimized. CPU times are therefore reported only for the purpose of comparing different bounding schemes. The results for the case of using only constant bounds for the relaxation of the equality constraints have been derived by Papamichail and Adjiman (2002b) and are presented here for comparison.

The first case study is a simple dynamic optimization problem. The next two are parameter estimation problems in chemical kinetics modeling. All the case studies were solved on an Ultra-60 workstation (2 × 360 MHz UltraSPARC-II CPU, 512 MB RAM).

6.1. Case study 1: a simple dynamic optimization problem

This example is a problem with one optimization parameter. IVP (7), considered as an example in Section 3, is included in the constraints. This problem has two local minima. Its formulation is given by

min_v −x(t = 1, v)²
s.t. ẋ = −x² + v   ∀t ∈ [0, 1]
     x(t = 0, v) = 9
     −5 ≤ v ≤ 5
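As a concrete illustration of the sequential approach used for the upper bounds (a sketch assuming SciPy; the paper itself uses fmincon and ode45), the IVP can be embedded in the objective and handed to a local solver:

```python
# Sketch: local (upper bounding) solution of case study 1 by the sequential
# approach: integrate IVP (7) inside the objective, optimize over v only.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

def objective(v):
    sol = solve_ivp(lambda t, x: -x**2 + v[0], (0.0, 1.0), [9.0],
                    rtol=1e-10, atol=1e-12)
    return -sol.y[0, -1]**2

# A gradient-based local search only guarantees a local minimum; depending on
# the starting point it may return v = 5 rather than the global v = -5, which
# is what motivates the branch and bound scheme.
res = minimize(objective, x0=[0.0], bounds=[(-5.0, 5.0)])
```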
One affine underestimator and two affine overestimators were produced for the parameter dependent solution of the IVP. The global optimum parameter is v = −5 and the value of the objective function for the global optimum parameter is equal to −8.23262. The results are presented in Table 1. The number of iterations needed for convergence is the same for both cases. When both constant and affine bounds are used for the convex relaxation of the equality constraint, the CPU time is larger because of the time spent on the solution of the IVPs needed for the calculation of the matrices in the affine bounds. Although the first lower bound calculated is equal to the global optimum, the first upper bound is not. That is why an iteration is needed to update the upper bound.

Table 1
Global optimization results for case study 1

Underestimation scheme    ε_r        Iterations   CPU time (s)
Constant                  1.00E−07   1            9
Constant and affine       1.00E−07   1            11
6.2. Case study 2: a first-order irreversible liquid-phase reaction

The second example is a parameter estimation problem with two parameters and two differential equations in the constraints. It appears in Tjoa and Biegler (1991) as well as in Floudas et al. (1999) and Esposito and Floudas (2000b). It involves a first-order irreversible isothermal liquid-phase chain reaction:

A →(k_1) B →(k_2) C

The problem can be formulated as follows:

min_{k_1,k_2} Σ_{j=1}^{10} Σ_{i=1}^{2} (x_i(t = t_j, k_1, k_2) − x_i^exp(t_j))²
s.t. ẋ_1 = −k_1 x_1   ∀t ∈ [0, 1]
     ẋ_2 = k_1 x_1 − k_2 x_2   ∀t ∈ [0, 1]
     x_1(t = 0, k_1, k_2) = 1
     x_2(t = 0, k_1, k_2) = 0
     0 ≤ k_1 ≤ 10
     0 ≤ k_2 ≤ 10

where x_1 and x_2 are the mole fractions of components A and B, respectively. k_1 and k_2 are the rate constants of the first and second reaction, respectively. x_i^exp(t_j) is the experimental point for the state variable i at time t_j. The points used are taken from Floudas et al. (1999).

Eight affine underestimators and eight affine overestimators were produced for the parameter dependent solution of the IVP. The global optimum parameters are k_1 = 5.0035 and k_2 = 1.0000 and the value of the objective function for the global optimum parameters is equal to 1.18562E−06. The experimental points and the state variable trajectories for the global optimum parameters are shown in Fig. 5. The results are presented in Table 2. The upper bound calculation was performed once every 100 iterations. When affine bounds are used additionally to the constant bounds, the number of iterations needed is decreased by several orders of magnitude. For the same type of underestimation only two more iterations are needed if the optimality margin ε_r is decreased from 1.00E−02 to 1.00E−03.

[Fig. 5. Experimental points and state variable trajectories for the global optimum parameters of case study 2.]
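For reference, the least-squares objective of this case study under the sequential approach can be sketched as follows (the data arrays are placeholders, not the Floudas et al. (1999) values):

```python
# Sketch: sum-of-squares objective for case study 2. The kinetic model is
# integrated at a trial (k1, k2) and compared with the measurements.
import numpy as np
from scipy.integrate import solve_ivp

t_meas = np.linspace(0.1, 1.0, 10)    # placeholder measurement times
x_meas = np.zeros((2, t_meas.size))   # placeholder data (see Floudas et al., 1999)

def sse(k1, k2):
    sol = solve_ivp(lambda t, x: [-k1*x[0], k1*x[0] - k2*x[1]],
                    (0.0, 1.0), [1.0, 0.0], t_eval=t_meas,
                    rtol=1e-9, atol=1e-11)
    return float(np.sum((sol.y - x_meas)**2))

# Local refinement of sse from several starting points only finds stationary
# points; the branch and bound scheme is what certifies the global solution.
```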
Table 2
Global optimization results for case study 2

Underestimation scheme    ε_r        Iterations   CPU time (s)
Constant                  1.00E−02   3501         2828
Constant                  1.00E−03   34508        22959
Constant and affine       1.00E−02   37           767
Constant and affine       1.00E−03   39           801

6.3. Case study 3: catalytic cracking of gas oil

This example is a parameter estimation problem with three parameters and two differential equations in the constraints. It appears in Tjoa and Biegler (1991) as well as in Floudas et al. (1999) and Esposito and Floudas (2000b). It involves an overall reaction of catalytic cracking of gas oil (A) to gasoline (Q) and other products (S):

A →(k_1) Q,   Q →(k_2) S,   A →(k_3) S

The problem can be formulated as follows:

min_{k_1,k_2,k_3} Σ_{j=1}^{20} Σ_{i=1}^{2} (x_i(t = t_j, k_1, k_2, k_3) − x_i^exp(t_j))²
s.t. ẋ_1 = −(k_1 + k_3)x_1²   ∀t ∈ [0, 0.95]
     ẋ_2 = k_1 x_1² − k_2 x_2   ∀t ∈ [0, 0.95]
     x_1(t = 0, k_1, k_2, k_3) = 1
     x_2(t = 0, k_1, k_2, k_3) = 0
     0 ≤ k_1 ≤ 20
     0 ≤ k_2 ≤ 20
     0 ≤ k_3 ≤ 20

where x_1 and x_2 are the mole fractions of components A and Q, respectively. k_1, k_2 and k_3 are the rate constants of the respective reactions. x_i^exp(t_j) is the experimental point for the state variable i at time t_j. The points used are again taken from Floudas et al. (1999).

Thirty-two affine underestimators and 64 affine overestimators were produced for the parameter dependent solution of the IVP. The global optimum parameters are k_1 = 12.2141, k_2 = 7.9799 and k_3 = 2.2215 and the value of the objective function for the global optimum parameters is equal to 2.65567E−03. The experimental points and the state variable trajectories for the global optimum parameters are shown in Fig. 6. The results are presented in Table 3. The upper bound calculation was performed once every 100 iterations.

[Fig. 6. Experimental points and state variable trajectories for the global optimum parameters of case study 3.]

Table 3
Global optimization results for case study 3

Underestimation scheme    ε_r        Iterations   CPU time (s)
Constant                  6.41E−02   10000        16729
Constant                  1.33E−02   100000       152816
Constant and affine       1.00E−02   67           26597
Constant and affine       1.00E−03   94           35478
While using only the constant bounds for the convex relaxation of the set of equality constraints, a maximum number of iterations was set. This number was reached and the algorithm was terminated. The relative optimality obtained is reported. When affine bounds are used additionally to the constant bounds, the number of iterations needed for even smaller optimality margins is decreased by several orders of magnitude. The CPU time per iteration is large because of the time spent on the solution of the IVPs needed for the calculation of the matrices in the 96 affine bounds.

7. Conclusions

Modeling the dynamic behavior of chemical engineering systems results in a set of differential equations. The dynamic optimization of these systems, using gradient-based methods, gives local optimum solutions. The need for global optimum solutions was the main motivation for this research.

The dynamic system considered is described by a set of first-order parameter dependent, typically nonlinear, differential equations. A dynamic optimization problem was formulated as a nonconvex NLP problem. A deterministic spatial BB global optimization algorithm was developed for the dynamic optimization problem. Local solutions, produced using the sequential approach, were used as an upper bound on the global minimum of the objective function value. Lower bounds were provided from the solution of a convex relaxation of the problem on subregions considered in the BB algorithm. This convex relaxation was achieved after defining a convex underestimation of the objective function and a convex overestimation of the feasible region. Algebraic functions were underestimated using well-known techniques. A new procedure was proposed for the convex relaxation of the dynamic information, using the theory already developed for the construction of parameter independent and parameter dependent bounds on the solution of the IVP included in the constraints of the problem studied.

An implementation of the proposed algorithm was used to solve case studies relevant to chemical engineering. Different underestimation schemes were considered. When affine bounds are used additionally to the constant bounds there is an increase of the computational expense for the production of the lower bound at each iteration. However, the results suggest that the algorithm converges in far fewer iterations than in the case of using the constant bounds alone, and usually requires less CPU time.

Acknowledgements

Financial support (GR/R52312/01) from the Engineering and Physical Sciences Research Council (EPSRC) is gratefully acknowledged.

Appendix A. Interval analysis

Interval analysis was developed by Moore (1966). An introduction to the subject can also be found in the textbook by Alefeld and Herzberger (1983). Ratschek and Rokne (1984, 1988) presented the development of interval tools appropriate for global optimization algorithms. The basic concepts of interval arithmetic are outlined in this Appendix.

A real compact interval Z is a subset of the real domain R:

Z = [z^L, z^U] = {z ∈ R : z^L ≤ z ≤ z^U}.

Let I be the set of all real compact intervals. The interval arithmetic operations are defined by

A ∘ B = {a ∘ b : a ∈ A, b ∈ B}   for A, B ∈ I,

where the binary operator ∘ ∈ {+, −, ·, /}. This definition is equivalent to the following rules:

[a, b] + [c, d] = [a + c, b + d]
[a, b] − [c, d] = [a − d, b − c]
[a, b] · [c, d] = [min(ac, ad, bc, bd), max(ac, ad, bc, bd)]
[a, b]/[c, d] = [a, b] · [1/d, 1/c]   if 0 ∉ [c, d].

The following property is called inclusion isotonicity of interval operations and follows from their definition: If A, B, C, D ∈ I then A ⊆ B, C ⊆ D implies that A ∘ C ⊆ B ∘ D.

Let X ⊆ R^n and the function f : X → R. The set of compact n-dimensional intervals contained in X is denoted by I(X). □f(Y) = {f(x) : x ∈ Y} for Y ∈ I(X) is the range of f over Y.
A function F : I(X) → I is called an inclusion function for f if

□f(Y) ⊆ F(Y)   ∀Y ∈ I(X).

In addition to the basic operations, predeclared functions can be used. Predeclared functions in programming languages are functions like sin, cos, exp, ln. For these functions predeclared interval functions are known. For example, the predeclared function exp(x) has as a predeclared interval function the function EXP(X) = [exp(x^L), exp(x^U)] on the interval X = [x^L, x^U].

If f(x) is a function expression in the variable x ∈ R^n and has an explicit formula using the four real arithmetic operations and some predeclared functions, then the expression which arises if each occurrence of x in f(x) is replaced by Y, each predeclared function is replaced by its predeclared interval function and each real arithmetic operation is replaced by the corresponding interval arithmetic operation, is called the natural interval extension of f(x) to Y and is denoted by f(Y). It follows from the inclusion isotonicity of the interval operations and from the properties of the predeclared functions that natural interval extensions are functions for which the inclusion monotonicity property holds: If Y_1, Y_2 ∈ I(X) then Y_1 ⊆ Y_2 implies that f(Y_1) ⊆ f(Y_2). Natural interval extensions are inclusion functions. Many other methods (mean value forms, Taylor forms) have been proposed for the calculation of inclusion functions. These methods can produce tighter inclusions but involve natural interval extension calculations of the derivatives of f(x).

Appendix B. Proof of Theorem 3.2

Proof. Based on the assumptions made and Theorem 3.1, the following is true:

x̲(t) ≤ x(t, p) ≤ x̄(t)   ∀p ∈ [p^L, p^U]   ∀t ∈ I.   (B.1)

Using the initial condition from IVPs (8) and (14) and inequality (13), the following is true:

x̳(t_0, p) ≤ x(t_0, p)   ∀p ∈ [p^L, p^U].   (B.2)

If

x̳(t, p) ≤ x(t, p)   ∀p ∈ [p^L, p^U]   ∀t ∈ I_0   (B.3)

is not true, then there exists ξ ∈ I such that for some k ∈ {1, 2, . . . , n} and some p ∈ [p^L, p^U]

x̳_k(ξ, p) = x_k(ξ, p),   (B.4)
ẋ̳_k(ξ, p) > ẋ_k(ξ, p)   (B.5)

and

x̳(t, p) ≤ x(t, p)   ∀p ∈ [p^L, p^U]   ∀t ∈ [t_0, ξ].   (B.6)

Using Eqs. (8) and (14) and inequality (B.5), the following must hold:

f̲_k(ξ, x̳(ξ, p), p) > f_k(ξ, x(ξ, p), p).

Using Eq. (B.4), inequality (B.6) and the fact that f̲(t, x, p) is quasi-monotone increasing on I × R^n × [p^L, p^U] with respect to x, the following is true:

f̲_k(ξ, x̳(ξ, p), p) ≤ f̲_k(ξ, x(ξ, p), p).

Combining the last two inequalities, the following must hold:

f̲_k(ξ, x(ξ, p), p) > f_k(ξ, x(ξ, p), p).   (B.7)

However, using inequalities (12) and (B.1), inequality (B.7) is not true. This contradiction establishes the claim that inequality (B.3) is true and combined with inequality (B.2) it is proved that inequality (15) is true. In the same manner inequality (17) can be proved.
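As a complement to Appendix A, a toy interval class (a sketch without the directed rounding a rigorous implementation would require) shows how a natural interval extension is evaluated:

```python
# Sketch: interval arithmetic operations of Appendix A and the natural
# interval extension of f(x, v) = -x^2 + v used by the bounding IVP (9).
class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, o):
        return Interval(self.lo + o.lo, self.hi + o.hi)
    def __sub__(self, o):
        return Interval(self.lo - o.hi, self.hi - o.lo)
    def __mul__(self, o):
        p = (self.lo*o.lo, self.lo*o.hi, self.hi*o.lo, self.hi*o.hi)
        return Interval(min(p), max(p))
    def __neg__(self):
        return Interval(-self.hi, -self.lo)

def f_extension(X, V):
    """Natural interval extension of -x^2 + v (x^2 written as x*x)."""
    return -(X * X) + V

Y = f_extension(Interval(-4.0, 9.0), Interval(-5.0, 5.0))
# Y = [-86, 41] encloses the true range [-86, 5]; the overestimation comes
# from treating the two occurrences of x in x*x as independent.
```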
References

Adjiman, C. S., Androulakis, I. P., & Floudas, C. A. (1998). A global optimization method, αBB, for general twice-differentiable constrained NLPs. II. Implementation and computational results. Computers and Chemical Engineering, 22(9), 1159–1179.
Adjiman, C. S., Dallwig, S., Floudas, C. A., & Neumaier, A. (1998). A global optimization method, αBB, for general twice-differentiable constrained NLPs. I. Theoretical advances. Computers and Chemical Engineering, 22(9), 1137–1158.
Adjiman, C. S., & Floudas, C. A. (1996). Rigorous convex underestimators for general twice-differentiable problems. Journal of Global Optimization, 9, 23–40.
Al-Khayyal, F. A., & Falk, J. E. (1983). Jointly constrained biconvex programming. Mathematics of Operations Research, 8(2), 273–286.
Alefeld, G., & Herzberger, J. (1983). Introduction to interval computations. New York: Academic Press.
Androulakis, I. P., Maranas, C. D., & Floudas, C. A. (1995). αBB: A global optimization method for general constrained nonconvex problems. Journal of Global Optimization, 7, 337–363.
Banga, J. R., Irizarry-Rivera, R., & Seider, W. D. (1998). Stochastic optimization for optimal and model-predictive control. Computers and Chemical Engineering, 22(4–5), 603–612.
Banga, J. R., & Seider, W. D. (1996). Global optimization of chemical processes using stochastic algorithms. In C. A. Floudas & P. M. Pardalos (Eds.), State of the art in global optimization. Series in nonconvex optimization and its applications (pp. 563–583). Dordrecht: Kluwer Academic Publishers.
Biegler, L. T. (1984). Solution of dynamic optimization problems by successive quadratic programming and orthogonal collocation. Computers and Chemical Engineering, 8(3–4), 243–248.
Boender, C. G. E., & Romeijn, H. E. (1995). Stochastic methods. In R. Horst & P. M. Pardalos (Eds.), Handbook of global optimization. Series in nonconvex optimization and its applications (pp. 829–869). Dordrecht: Kluwer Academic Publishers.
Carrasco, E. F., & Banga, J. R. (1997). Dynamic optimization of batch reactors using adaptive stochastic algorithms. Industrial and Engineering Chemistry Research, 36(6), 2252–2261.
Coleman, T., Branch, M. A., & Grace, A. (1999). Optimization toolbox. For use with MATLAB. User's guide (Version 2). The MathWorks, Inc.
Dadebo, S. A., & McAuley, K. B. (1995). Dynamic optimization of constrained chemical engineering problems using dynamic programming. Computers and Chemical Engineering, 19(5), 513–525.
Esposito, W. R., & Floudas, C. A. (2000a). Deterministic global optimization in nonlinear optimal control problems. Journal of Global Optimization, 17(1–4), 97–126.
Esposito, W. R., & Floudas, C. A. (2000b). Global optimization for the parameter estimation of differential–algebraic systems. Industrial and Engineering Chemistry Research, 39, 1291–1310.
Esposito, W. R., & Floudas, C. A. (2002). Deterministic global optimization in isothermal reactor network synthesis. Journal of Global Optimization, 22, 59–95.
Floudas, C. A. (2000). Deterministic global optimization: Theory, methods and applications. Series in nonconvex optimization and its applications. Dordrecht: Kluwer Academic Publishers.
Floudas, C. A., & Pardalos, P. M. (2000). Recent developments in global optimization and their relevance to process design. In FOCAPD'99. AIChE Symposium Series (Vol. 96, pp. 84–98).
Floudas, C. A., Pardalos, P. M., Adjiman, C. S., Esposito, W. R., Gümüs, Z. H., Harding, S. T., Klepeis, J. L., Meyer, C. A., & Schweiger, C. A. (1999). Handbook of test problems in local and global optimization. Series in nonconvex optimization and its applications. Dordrecht: Kluwer Academic Publishers.
Goh, C. J., & Teo, K. L. (1988). Control parameterization: A unified approach to optimal control problems with general constraints. Automatica, 24(1), 3–18.
Hanke, M., & Li, P. (2000). Simulated annealing for the optimization of batch distillation processes. Computers and Chemical Engineering, 24, 1–8.
Horst, R., & Tuy, H. (1996). Global optimization. Deterministic approaches (3rd ed.). Berlin: Springer-Verlag.
Kühn, W. (1998). Rigorously computed orbits of dynamical systems without the wrapping effect. Computing, 61, 47–67.
Lohner, R. J. (1987). Enclosing the solutions of ordinary initial and boundary value problems. In E. W. Kaucher, U. W. Kulisch, & C. Ullrich (Eds.), Computer arithmetic: Scientific computation and programming languages. Series in computer science (pp. 255–286). Stuttgart: Wiley-Teubner.
Luus, R. (1990a). Application of dynamic programming to high-dimensional nonlinear optimal control problems. International Journal of Control, 52(1), 239–250.
Luus, R. (1990b). Optimal control by dynamic programming using systematic reduction in grid size. International Journal of Control, 51(5), 995–1013.
Luus, R., & Cormack, D. E. (1972). Multiplicity of solutions resulting from the use of variational methods in optimal control problems. Canadian Journal of Chemical Engineering, 50, 309–311.
Luus, R., & Hennessy, D. (1999). Optimization of fed-batch reactors by the Luus–Jaakola optimization procedure. Industrial and Engineering Chemistry Research, 38(5), 1948–1955.
Luus, R., & Jaakola, T. H. I. (1973). Optimization by direct search and systematic reduction of the size of search region. AIChE Journal, 19(4), 760–766.
Maranas, C. D., & Floudas, C. A. (1994). Global minimum potential energy conformations of small molecules. Journal of Global Optimization, 4, 135–170.
McCormick, G. P. (1976). Computability of global solutions to factorable nonconvex programs. Part I. Convex underestimating problems. Mathematical Programming, 10, 147–175.
Moore, R. E. (1966). Interval analysis. Englewood Cliffs, NJ: Prentice-Hall.
Neumaier, A. (1993). The wrapping effect, ellipsoid arithmetic, stability and confidence regions. In Validation numerics. Computing supplements (Vol. 9, pp. 175–190). Wien: Springer.
Nickel, K. L. (1985). How to fight the wrapping effect. In K. L. Nickel (Ed.), Interval mathematics. Lecture notes in computer science (No. 212, pp. 121–132). Berlin: Springer-Verlag.
Oh, S. H., & Luus, R. (1977). Use of orthogonal collocation methods in optimal control problems. International Journal of Control, 26(5), 657–673.
Papamichail, I., & Adjiman, C. S. (2002a). Proof of convergence for a global optimization algorithm for problems with ordinary differential equations, submitted for publication.
Papamichail, I., & Adjiman, C. S. (2002b). A rigorous global optimization algorithm for problems with ordinary differential equations. Journal of Global Optimization, 24, 1–33.
Ratschek, H., & Rokne, J. (1984). Computer methods for the range of functions. Series in mathematics and its applications. England: Ellis Horwood Ltd.
Ratschek, H., & Rokne, J. (1988). New computer methods for global optimization. Series in mathematics and its applications. England: Ellis Horwood Ltd.
Shampine, L. F., & Reichelt, M. W. (1997). The MATLAB ODE suite. SIAM Journal on Scientific Computing, 18(1), 1–22.
Singer, A. B., & Barton, P. I. (2002). Global solution of linear dynamic embedded optimization problems. Part I. Theory. Journal of Optimization Theory and Applications, in press.
Smith, E. M. B., & Pantelides, C. C. (1996). Global optimisation of general process models. In I. E. Grossmann (Ed.), Global optimization in engineering design. Series in nonconvex optimization and its applications (Chapter 12, pp. 355–386). Dordrecht: Kluwer Academic Publishers.
The MathWorks (1999). Using MATLAB.
Tjoa, I. B., & Biegler, L. T. (1991). Simultaneous solution and optimization strategies for parameter estimation of differential–algebraic equation systems. Industrial and Engineering Chemistry Research, 30, 376–385.
Vassiliadis, V. S., Sargent, R. W. H., & Pantelides, C. C. (1994). Solution of a class of multistage dynamic optimization problems. 1. Problems without path constraints. Industrial and Engineering Chemistry Research, 33(9), 2111–2122.
Walter, W. (1970). Differential and integral inequalities. Berlin: Springer-Verlag.
Zadeh, L. A., & Desoer, C. A. (1963). Linear system theory: The state space approach. Series in system science. New York: McGraw-Hill.