Adaptive Economic Optimizing Model Predictive Control of Uncertain Nonlinear Systems


4th IFAC Nonlinear Model Predictive Control Conference International Federation of Automatic Control Noordwijkerhout, NL. August 23-27, 2012

Adaptive Economic Optimizing Model Predictive Control of Uncertain Nonlinear Systems ⋆

M. Guay ∗

Department of Chemical Engineering, Queen’s University, Kingston, Ontario, Canada (e-mail: [email protected])

Abstract: In this paper, we propose the design of economic MPC systems based on a single-step adaptive MPC technique for a class of uncertain nonlinear systems subject to parametric uncertainties and exogenous variables. The framework assumes that the economic function is a known function of the constrained system's states, parameterized by unknown parameters. The objective and constraint functions may explicitly depend on time, so the proposed method applies to both dynamic and steady-state economic optimization. A simulation example demonstrates the effectiveness of the design technique.

Keywords: Adaptive control, Real-time optimization, Model predictive control

1. INTRODUCTION

One of the key challenges in the process industry is how best to operate plants in light of varying processing and economic conditions. If one focuses on economic considerations, it may be extremely difficult to know, a priori, what the optimal operating conditions are. Several technologies have been developed to address this problem. The leading approach remains real-time optimization (RTO) (Basak et al. (2002), Lauks et al. (1992)), which refers to the online economic optimization of a process plant. RTO is a widely employed technology that can be used to address this challenge. RTO attempts to optimize process performance (usually measured in terms of profit or operating cost), thereby enabling companies to push the profitability of their processes to their true potential as operating conditions change. RTO systems are usually designed to solve steady-state optimization problems. It is therefore generally assumed that the process dynamics can be neglected if the optimization execution interval is long enough for the process to reach and maintain steady state. The integration of RTO and control for optimal plant operation is an active area of research.
RTO can be seen as belonging to the family of adaptive extremum-seeking control systems. Many techniques have been developed in the literature to address the regulation of processes to optimal (unknown) set-points. The main challenge has been to address the combined task of steady-state optimization and transient performance. One approach, proposed in Guay and Zhang (2003), is to use the profit (or cost) function to be optimized to construct a Lyapunov function. Since the profit function is not generally measured and may depend on unknown model parameters, an adaptive control approach is usually required to ensure that the control system

⋆ The authors would like to acknowledge the financial support of the Natural Sciences and Engineering Research Council of Canada.

978-3-902823-07-6/12/$20.00 © 2012 IFAC


can reach the true unknown optimum without any bias associated with model and parametric uncertainties. In the context of MPC, the problem has been treated in Adetola and Guay (2010), where the integration of real-time optimization and model predictive control is considered. The technique in Adetola and Guay (2010) is particularly well suited, as it provides a robust adaptive integrated control approach that can effectively deal with uncertainties and economic objectives. It is based on the robust adaptive model predictive control techniques proposed in Adetola and Guay (2011) and provides robustness to parametric uncertainties. Another approach proposed in the literature is a framework for the integration of RTO and MPC called economic MPC, the leading approach being that of Angeli and Rawlings (2011). In contrast to the two-step approach proposed in Adetola and Guay (2010), economic MPC uses the economic objective as the stage cost of the MPC. Assuming that the cost function is exactly known, the resulting MPC can be shown to converge asymptotically to a neighbourhood of the best feasible operating conditions for the control system. Questions associated with transient performance and stability of the control system remain open. Stability results are limited to linear systems subject to convex costs and convex constraints. Recent studies (Angeli and Rawlings (2010), Diehl et al. (2011)) provide a summary of results for economic MPC for nonlinear systems. In this paper, we propose the design of economic MPC systems based on a single-step adaptive MPC technique for a class of uncertain nonlinear systems subject to parametric uncertainties and exogenous variables. The framework assumes that the economic function is a known function of the constrained system's states, parameterized by unknown parameters. The objective and constraint functions may explicitly

10.3182/20120823-5-NL-3013.00072


depend on time, which means that the proposed method is applicable to both dynamic and steady-state economic optimization. The control objective is to simultaneously identify and regulate the system to the operating point that optimizes the economic function. The control input and state trajectories of the closed-loop system may also be required to satisfy constraints. The main difference from the approach proposed in Adetola and Guay (2010) is that the cost is incorporated in the MPC stage cost, but in a very specific way that allows one to address stability and robustness issues for the class of economic MPC control systems.

2. PROBLEM DESCRIPTION

Consider a constrained optimization problem of the form

  \min_{x \in \mathbb{R}^{n_x}} \; p(x, \theta)   (1a)
  \text{s.t. } c_j(x) \le 0, \quad j = 1, \dots, m_c   (1b)

with θ representing unknown parameters, assumed to be uniquely identifiable and to lie within an initially known convex set Θ⁰ ≜ B(θ⁰, z_θ⁰). The functions p and c_j are assumed to be C² in all of their arguments (with locally Lipschitz second derivatives). The constraints c_j ≤ 0 must be satisfied along the system's state trajectory x(t). In contrast to existing economic MPC techniques, it is assumed that the optimum solution depends on the unknown parameters θ. This is in line with standard RTO, where plant data can be used to update unknown optimal operating conditions.

Assumption 1. The following assumptions are made about (1):
(1) There exists ε₀ > 0 such that ∂²p/∂x² ≥ ε₀ I and ∂²c_j/∂x² ≥ 0 for all (x, θ) ∈ (R^{n_x} × Θ_ε), where Θ_ε is an ε-neighborhood of Θ.
(2) The feasible set X = { x ∈ R^{n_x} | max_j c_j(x) ≤ 0 } has a nonempty interior.

Assumption 1 states that the cost surface is strictly convex in x and that X is a non-empty convex set. Standard nonlinear optimization results guarantee the existence of a unique minimizer x*(t, θ) ∈ X of problem (1). In the case of a nonconvex cost surface, only local attraction to an extremum can be guaranteed.

Consider the uncertain nonlinear system

  \dot{x} = f(x) + g(x)u + q(x)\theta + \vartheta \triangleq F(x, u, \theta, \vartheta)   (2)

where the disturbance ϑ ∈ D ⊂ R^{n_d} is assumed to satisfy a known upper bound ‖ϑ(t)‖ ≤ M_ϑ < ∞.

Remark 2. In this study, the exogenous variable ϑ represents an unstructured bounded time-varying uncertainty.

The objective of the study is to (robustly) stabilize the plant to some target set Ξ ⊂ R^{n_x} while satisfying the pointwise constraints x ∈ X ⊂ R^{n_x} and u ∈ U ⊂ R^{n_u}. The target set is compact, contains the origin and is robustly invariant under no control. It is assumed that θ is uniquely identifiable and lies within an initially known compact set Θ⁰ = B(θ⁰, z_θ), where θ⁰ is a nominal parameter value and z_θ is the radius of the parameter uncertainty set. It is assumed that all vector fields f(x), g(x) = [g₁(x), …, g_m(x)] and q(x) = [q₁(x), …, q_p(x)] are smooth vector-valued functions.

3. EXTREMUM SEEKING SETPOINT DESIGN

The control objective is to stabilize the nonlinear system (2) to the optimum operating point or trajectory given by the solution of (1) while obeying the input constraint u ∈ U ⊂ R^{n_u} in addition to the state constraint x ∈ X ⊂ R^{n_x}. The dynamics of the state ξ are assumed to satisfy the following input-to-state stability condition with respect to x.

Assumption 3. If x is bounded by a compact set B_x ⊆ X, then there exists a compact set B_ξ ⊆ R^{n_ξ} such that ξ ∈ B_ξ is positively invariant under (2).
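As a concrete illustration, the model structure (2) can be simulated with an explicit Euler scheme. The sketch below uses hypothetical f, g and q (not any plant from the paper) and a small random disturbance standing in for the bounded ϑ(t).

```python
import numpy as np

# Explicit-Euler simulation of the model structure (2):
#   x_dot = f(x) + g(x) u + q(x) theta + v
# f, g, q and theta below are illustrative placeholders.
rng = np.random.default_rng(0)

def f(x):                                   # known drift
    return -x

def g(x):                                   # known input map
    return np.eye(2)

def q(x):                                   # regressor of the unknown parameters
    return np.diag(x)

def step(x, u, theta, v, dt=0.01):
    """One Euler step of x_dot = F(x, u, theta, v)."""
    return x + dt * (f(x) + g(x) @ u + q(x) @ theta + v)

x = np.array([1.0, -0.5])
theta = np.array([0.2, 0.1])                # true (unknown) parameters
for _ in range(500):
    v = 0.01 * rng.standard_normal(2)       # small disturbance standing in for v(t)
    x = step(x, u=np.zeros(2), theta=theta, v=v)
```

With this stable choice of drift the state decays toward the origin despite the parametric term and the disturbance.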

3.1 Parameter Adaptation

Let the estimator model for (2) be selected as

  \dot{\hat{x}} = f(x) + g(x)u + q(x)\hat\theta + k_w e + w\dot{\hat\theta}, \quad k_w > 0   (3)
  \dot{w} = q(x) - k_w w, \quad w(t_0) = 0,   (4)

resulting in the state prediction error e = x − x̂ and auxiliary variable η = e − wθ̃ with dynamics

  \dot{e} = q(x)\tilde\theta - k_w e - w\dot{\hat\theta} + \vartheta, \quad e(t_0) = x(t_0) - \hat{x}(t_0)   (5)
  \dot{\eta} = -k_w \eta + \vartheta, \quad \eta(t_0) = e(t_0).   (6)

Since ϑ is not known, an estimate of η is generated from

  \dot{\hat\eta} = -k_w \hat\eta, \quad \hat\eta(t_0) = e(t_0),   (7)

with resulting estimation error η̃ = η − η̂ dynamics

  \dot{\tilde\eta} = -k_w \tilde\eta + \vartheta, \quad \tilde\eta(t_0) = 0.   (8)

Let Σ ∈ R^{n_θ × n_θ} be generated from

  \dot{\Sigma} = w^T w, \quad \Sigma(t_0) = \alpha I \succ 0.   (9)

Based on equations (3), (4) and (7), the preferred parameter update law is given by

  \dot{\Sigma}^{-1} = -\Sigma^{-1} w^T w \Sigma^{-1}, \quad \Sigma^{-1}(t_0) = \tfrac{1}{\alpha} I   (10a)
  \dot{\hat\theta} = \mathrm{Proj}\{ \Sigma^{-1} w^T (e - \hat\eta), \hat\theta \}, \quad \hat\theta(t_0) = \theta_0   (10b)

where θ₀ ∈ Θ⁰ and Proj{φ, θ̂} denotes a Lipschitz projection operator such that

  -\mathrm{Proj}\{\phi, \hat\theta\}^T \tilde\theta \le -\phi^T \tilde\theta,   (11)
  \hat\theta(t_0) \in \Theta^0 \;\Rightarrow\; \hat\theta(t) \in \Theta^0_\epsilon, \;\forall\, t \ge t_0,   (12)

where Θ⁰_ε ≜ B(θ⁰, z_θ⁰ + ε), ε > 0. More details on parameter projection can be found in Krstic et al. (1995).

To prove the following lemma, we need the following result.

Lemma 4. (Desoer and Vidyasagar (1975)) Consider the system

  \dot{x}(t) = A x(t) + u(t).   (13)

Suppose the equilibrium state x_e = 0 of the homogeneous equation is exponentially stable. Then
(1) if u ∈ L_p for 1 < p < ∞, then x ∈ L_p, and
(2) if u ∈ L_p for p = 1 or 2, then x → 0 as t → ∞.

Lemma 5. (Adetola and Guay (2011)) The identifier (10) is such that the estimation error θ̃ = θ − θ̂ is bounded. Moreover, if

  \vartheta \in L_2 \quad \text{or} \quad \int_{t_0}^{\infty} \left[ \|\tilde\eta\|^2 - \|e - \hat\eta\|^2 \right] d\tau < +\infty   (14)
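A minimal numerical sketch of the identifier (3)-(10), assuming a toy two-state plant and a crude stand-in for the smooth projection operator; all plant functions and gains here are illustrative, not from the paper. In the noise-free case e = wθ̃ holds exactly (with e(t₀) = 0), so the update reduces to dθ̂/dt = Σ⁻¹wᵀwθ̃ and Σθ̃ stays constant, which is why the error shrinks as λ_min(Σ) grows.

```python
import numpy as np

n_x, n_th = 2, 2
f = lambda x: -x                    # illustrative drift
g = lambda x: np.eye(n_x)           # illustrative input map
q = lambda x: np.diag(x)            # illustrative regressor

def proj(phi, th_hat, radius=1.0):
    """Crude stand-in for Proj{phi, th_hat}: drop the update direction
    if a small step along it would leave the ball B(0, radius)."""
    if np.linalg.norm(th_hat + 1e-2 * phi) > radius:
        return np.zeros_like(phi)
    return phi

dt, kw = 1e-2, 5.0
theta = np.array([0.3, -0.2])       # true parameters (unknown to the estimator)
th_hat = np.zeros(n_th)             # theta_hat(t0) = theta_0
x = np.array([1.0, 1.0])
xh = x.copy()                       # x_hat(t0) = x(t0), so e(t0) = 0
w = np.zeros((n_x, n_th))
Sinv = np.eye(n_th)                 # Sigma^{-1}(t0) = I/alpha with alpha = 1
eta_h = np.zeros(n_x)               # eta_hat(t0) = e(t0) = 0

for k in range(20000):
    t = k * dt
    u = 2.0 * np.array([np.sin(t), np.cos(0.7 * t)])   # exciting input
    e = x - xh
    th_dot = proj(Sinv @ w.T @ (e - eta_h), th_hat)    # update law (10b)
    xh = xh + dt * (f(x) + g(x) @ u + q(x) @ th_hat + kw * e + w @ th_dot)  # (3)
    w = w + dt * (q(x) - kw * w)                       # (4)
    eta_h = eta_h + dt * (-kw * eta_h)                 # (7)
    Sinv = Sinv + dt * (-Sinv @ w.T @ w @ Sinv)        # (10a)
    th_hat = th_hat + dt * th_dot
    x = x + dt * (f(x) + g(x) @ u + q(x) @ theta)      # noise-free plant (2)
```

With persistently exciting inputs the estimate approaches the true parameter vector, consistent with Lemma 5.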


with the strong condition

  \lim_{t \to \infty} \lambda_{\min}(\Sigma) = \infty   (15)

is satisfied, then θ̃ converges to zero asymptotically.

3.2 Set Adaptation

An update law that measures the worst-case progress of the parameter identifier in the presence of the disturbance is given by

  z_\theta = \sqrt{ \frac{V_{z_\theta}(t)}{\lambda_{\min}(\Sigma)} }   (16a)
  V_{z_\theta}(t_0) = \lambda_{\max}\big(\Sigma(t_0)\big) (z_\theta^0)^2   (16b)
  \dot{V}_{z_\theta} = -(e - \hat\eta)^T (e - \hat\eta) + \frac{M_\vartheta^2}{k_w}   (16c)

where V_{z_θ}(t) is the solution of the ordinary differential equation (16c) with initial condition (16b). Using the parameter estimator (10) and its error bound z_θ from (16), the uncertain ball Θ ≜ B(θ̂, z_θ) is adapted online according to the following algorithm.

Algorithm 1. Beginning from time t_{i−1} = t₀, the parameter and set adaptation is implemented iteratively as follows:
(1) Initialize z_θ(t_{i−1}) = z_θ⁰, θ̂(t_{i−1}) = θ̂⁰ and Θ(t_{i−1}) = B(θ̂(t_{i−1}), z_θ(t_{i−1})).
(2) At time t_i, using equations (10) and (16), perform the update

  (\hat\theta, \Theta) =
  \begin{cases}
    \big(\hat\theta(t_i), \Theta(t_i)\big), & \text{if } z_\theta(t_i) \le z_\theta(t_{i-1}) - \|\hat\theta(t_i) - \hat\theta(t_{i-1})\| \\
    \big(\hat\theta(t_{i-1}), \Theta(t_{i-1})\big), & \text{otherwise.}
  \end{cases}   (17)

(3) Iterate back to step 2, incrementing i = i + 1.

The algorithm ensures that Θ is only updated when the value of z_θ has decreased by an amount that guarantees a contraction of the set. Moreover, the evolution of z_θ given in (16) ensures non-exclusion of θ, as shown below.

Lemma 6. (Adetola and Guay (2011)) The evolution of Θ = B(θ̂, z_θ) under (10), (16) and Algorithm 1 is such that
(i) Θ(t₂) ⊆ Θ(t₁), t₀ ≤ t₁ ≤ t₂;
(ii) θ ∈ Θ(t₀) ⇒ θ ∈ Θ(t), ∀ t ≥ t₀.

4. IMPLEMENTATION BY ROBUST TRACKING MPC

A possible mechanism for integrating real-time optimization and MPC is to follow the standard two-step approach, in which the MPC is made to track an optimal setpoint generated in real time via a two-degree-of-freedom approach. The first step requires the computation of the optimal setpoint signal.

4.1 Constraint Removal

An interior-point barrier function method is used to enforce the inequality constraints. The state constraint is incorporated by augmenting the cost function p as follows:
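The update rule (17) can be sketched as a single guard: the set is replaced only when the radius has shrunk by at least the distance the center moved, which is exactly the condition for B(θ̂(t_i), z_θ(t_i)) ⊆ B(θ̂(t_{i−1}), z_θ(t_{i−1})).

```python
import numpy as np

# Guard implementing the set update (17): accept the new ball B(th_new, z_new)
# only when the radius drop covers the center motion, guaranteeing
# B(th_new, z_new) is contained in B(th_prev, z_prev).
def update_set(th_prev, z_prev, th_new, z_new):
    """Return the (center, radius) of the uncertainty ball after step 2."""
    if z_new <= z_prev - np.linalg.norm(th_new - th_prev):
        return th_new, z_new      # contraction guaranteed
    return th_prev, z_prev        # otherwise keep the previous set
```

Non-exclusion of θ is inherited from the bound (16); the guard only decides whether the nominal set is allowed to shrink.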

  p_a(x, \theta) \triangleq p(x, \theta) - \frac{1}{\eta_c} \sum_{j=1}^{m_c} \ln(-c_j(x))   (18)

with η_c > 0 a fixed constant. The augmented cost function (18) is strictly convex in x, and the unconstrained minimization of p_a therefore has a unique minimizer in int{X}, which converges to that of (1) in the limit as η_c → ∞ (Bertsekas (1995)).

4.2 Setpoint Update Law

Let x_r ∈ R^{n_x} denote a reference setpoint to be tracked by x and θ̂ denote an estimate of the unknown parameter θ. A setpoint update law ẋ_r can be designed based on Newton's method, such that x_r(t) converges exponentially to the (unknown) θ̂-dependent optimum of (18). To this end, consider the optimization Lyapunov function candidate

  V_r = \frac{1}{2} \left\| \frac{\partial p_a}{\partial x}(x_r, \hat\theta) \right\|^2 \triangleq \frac{1}{2} \|z_r\|^2.   (19)

For the remainder of this section, omitted arguments of p_a and its derivatives are evaluated at (t, x_r, θ̂). Differentiating (19) yields

  \dot{V}_r = \frac{\partial p_a}{\partial x} \left( \frac{\partial^2 p_a}{\partial x^2} \dot{x}_r + \frac{\partial^2 p_a}{\partial x \partial \theta} \dot{\hat\theta} \right).   (20)

Using the update law

  \dot{x}_r = -\left( \frac{\partial^2 p_a}{\partial x^2} \right)^{-1} \left( \frac{\partial^2 p_a}{\partial x \partial \theta} \dot{\hat\theta} + k_r \frac{\partial p_a^T}{\partial x} \right) \triangleq f_r(t, x_r, \hat\theta)   (21)

with k_r > 0 and x_r(0) = x_r^0 ∈ int{X} results in

  \dot{V}_r \le -k_r \|z_r\|^2,   (22)

which implies that the gradient function z_r converges exponentially to the origin.

Lemma 7. Suppose (θ, θ̂) is bounded. Then the optimal setpoint x_r(t) generated by (21) is feasible and converges exponentially to x*_{p_a}(θ̂), the minimizer of (18).

Proof: Feasibility follows from the boundedness of (θ, θ̂) and Assumption 1.1, while convergence follows from (22) and the fact that z_r is a diffeomorphism.

Since the true optimal setpoint depends on θ, the actual desired trajectory x*_r(t, θ) is not available in advance. However, x_r(t, θ̂) can be generated from the setpoint update law (21), and the corresponding reference input u_r(x_r) must be computed on-line.

Assumption 8. x_r(t, θ̂) is such that there exists u_r(x_r) satisfying

  0 = f(x_r, u_r, \hat\theta).   (23)

The design objective is to design a model predictive control law such that the true plant state x tracks the reference trajectory x_r(t, θ̂). Given the desired time-varying trajectory (x_r, u_r), an attractive approach is to transform the tracking problem for a time-invariant system into a regulation problem for an associated time-varying control system in terms of the state error x_e = x − x_r and to stabilize the x_e = 0 state. The formulation requires the MPC controller to drive the tracking error x_e into the terminal
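A small sketch of the barrier-augmented cost (18) and the Newton-flow setpoint update (21) on a toy problem: quadratic p, one linear constraint, and θ̂ held fixed so the θ̂-rate term in (21) drops out. The constants η_c, k_r and θ̂ are illustrative choices.

```python
import numpy as np

# Toy instance of (18) and (21):
# p(x) = 0.5 ||x - th_hat||^2 with a single constraint c(x) = x1 - 1 <= 0.
eta_c, k_r = 100.0, 2.0
th_hat = np.array([2.0, 0.0])      # unconstrained optimum x1 = 2 is infeasible

def grad_hess(x):
    """Gradient and Hessian of p_a(x) = p(x) - (1/eta_c) ln(1 - x1)."""
    gpa = (x - th_hat) + np.array([1.0 / (eta_c * (1.0 - x[0])), 0.0])
    hpa = np.eye(2)
    hpa[0, 0] += 1.0 / (eta_c * (1.0 - x[0]) ** 2)
    return gpa, hpa

xr, dt = np.array([0.0, 0.5]), 1e-2
for _ in range(5000):
    gpa, hpa = grad_hess(xr)
    # Newton flow (21) with theta_hat_dot = 0
    xr = xr + dt * (-np.linalg.solve(hpa, k_r * gpa))
```

The flow settles at the barrier minimizer x1 ≈ 0.990 (the interior solution of (2 − x1)(1 − x1) = 1/η_c), strictly inside the feasible set, as Lemma 7 predicts.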


set X_ef(θ̃) at the end of the horizon. Since the system dynamics are uncertain, we use the set-based parameter identification approach and incorporate robust features into the adaptive controller formulation to account for the impact of the parameter estimation error θ̃ in the design.

4.3 Min-max Adaptive MPC

Feedback min-max robust MPC is employed to provide robustness for the MPC controller during the adaptation phase. The controller maximizes a cost function with respect to θ and minimizes it over feedback control policies κ. The integrated controller is given as

  u = \kappa_{mpc}(t, x_e, \hat\theta) \triangleq \kappa^*(0, x_e, \hat\theta)   (24a)
  \kappa^* \triangleq \arg\min_{\kappa(\cdot,\cdot,\cdot,\cdot)} J(t, x_e, \hat\theta, \kappa)   (24b)

where J(t, x_e, θ̂, κ) is the (worst-case) cost associated with the optimal control problem

  J(T, x_e, \hat\theta, \kappa) \triangleq \max_{\theta \in \Theta, \vartheta \in D} \int_0^T L(x_e^p, u^p, u_r)\, d\tau + W\big(x_e^p(T), \tilde\theta^p(T)\big)   (25a)

subject to, for all τ ∈ [0, T]:

  \dot{x}^p = f(x^p) + g(x^p)u^p + q(x^p)\theta + \vartheta, \quad x^p(0) = x   (25b)
  \dot{\hat{x}}^p = f(x^p) + g(x^p)u^p + q(x^p)\hat\theta^p + k_w(x^p - \hat{x}^p) + w\dot{\hat\theta}, \quad \hat{x}^p(0) = \hat{x}   (25c)
  \dot{\hat\eta}^p = -k_w \hat\eta^p, \quad \hat\eta^p(0) = \hat\eta   (25d)
  \dot{w}^p = q^T(x^p) - k_w w^p, \quad w^p(0) = w   (25e)
  (\dot{\Sigma}^{-1})^p = -(\Sigma^{-1})^p w^T w (\Sigma^{-1})^p, \quad (\Sigma^{-1})^p(0) = \Sigma^{-1}   (25f)
  \dot{\hat\theta}^p = \mathrm{Proj}\big\{ \gamma (\Sigma^{-1})^p (w^p)^T (e^p - \hat\eta^p), \hat\theta \big\}, \quad \tilde\theta^p = \theta - \hat\theta^p, \quad \hat\theta^p(0) = \hat\theta   (25g)
  \dot{x}_r^p = f_r(t, x_r^p, \theta), \quad x_r^p(0) = x_r   (25h)
  x_e^p = x^p - x_r^p, \quad x_e^p(\tau) \in X_e,   (25i)
  x_e^p(T) \in X_{ef}\big(\tilde\theta^p(T)\big),   (25j)
  u^p(\tau) \triangleq \kappa\big(\tau, x_e^p(\tau), \hat\theta^p(\tau)\big) \in U   (25k)

where X_e = { x_e^p : x^p ∈ X }, X_ef is the terminal constraint and β ∈ {0, 1}. The effect of the future parameter adaptation is incorporated in the controller design via (25a) and (25k), which results in less conservative worst-case predictions and terminal conditions.

4.4 Implementation Algorithm

Algorithm 2. The MPC algorithm performs as follows. At sampling instant t_i:
(1) Measure the current state of the plant x(t) and obtain the current value of the desired setpoint x_r = x_r(t_i) via the update law (21), and the current values of the matrices w and Σ⁻¹ from equations (4) and (10a), respectively.
(2) Obtain the current value of the parameter estimates θ̂ and uncertainty bound z_θ from (10b) and (16). Update the uncertainty sets following (17).
(3) Solve the optimization problem (25) and apply the resulting feedback control law to the plant until the next sampling instant.
(4) Increment i = i + 1. Repeat the procedure from step 1 for the next sampling instant.
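The receding-horizon loop of Algorithm 2 can be skeletonized as follows. The min-max problem (25) is far too heavy to reproduce here, so `solve_mpc` is a hypothetical proportional stand-in, and `ToyEstimator` only mimics the shape of steps 1-2 (equations (10), (16)-(17)) on a scalar toy plant with dynamics dx/dt = u + θ; none of this is the paper's controller.

```python
# Runnable skeleton of Algorithm 2 on a scalar toy plant x_dot = u + theta.
class ToyPlant:
    def __init__(self):
        self.x, self.theta, self.dt = 0.0, 0.4, 0.05
    def measure(self):
        return self.x
    def apply(self, u):                      # step 3: hold u over one sample
        self.x += self.dt * (u + self.theta)

class ToyEstimator:
    def __init__(self):
        self.th_hat, self.z = 0.0, 1.0       # estimate and set radius
    def update(self, x_meas, x_pred):        # crude prediction-error correction
        self.th_hat += 0.5 * (x_meas - x_pred)
        self.z = max(0.0, self.z - 0.01)     # radius never grows (cf. Lemma 6)

def solve_mpc(xe, th_hat, z):                # placeholder for the min-max problem
    return -2.0 * xe - th_hat                # cancel the estimate, damp the error

plant, est = ToyPlant(), ToyEstimator()
xr = 1.0                                     # setpoint (would come from (21))
for _ in range(400):
    x = plant.measure()                      # step 1
    u = solve_mpc(x - xr, est.th_hat, est.z)
    x_pred = x + plant.dt * (u + est.th_hat) # one-step model prediction
    plant.apply(u)
    est.update(plant.measure(), x_pred)      # step 2
```

Even with these crude stand-ins, the loop exhibits the behaviour the section describes: the estimate converges, the uncertainty radius contracts, and the state settles at the setpoint.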

Since the algorithm is such that the uncertainty set Θ contracts over time, the conservatism introduced by the robustness feature, in terms of constraint satisfaction and controller performance, reduces over time; when Θ contracts upon θ, the min-max adaptive framework becomes that of a nominal MPC. The drawback of the finite-time identifier is attenuated in this application since the matrix invertibility condition is checked only at sampling instants. The benefit of the identifier, however, is that it allows an earlier and immediate elimination of the robustness feature.

4.5 Robust Stability

Robust stability is guaranteed under the standard assumptions that X_ef ⊆ X_e is an invariant set, W is a local robust CLF for the resulting time-varying system, and the decay rate of W is greater than the stage cost L within the terminal set X_ef, in conjunction with the requirement that W decrease and X_f enlarge with decreased parametric uncertainty. The details are given in the appendix. The result of the implementation via the robust tracking MPC can be summarized as follows.

Theorem 9. Consider problem (1) subject to the system dynamics (2) and satisfying Assumption 1. Let the controller be (24) with setpoint update law (21) and the set-based parameter identification procedure. If the conditions of Lemma 5 are satisfied, then for any ϱ > 0 there exists a constant η_c such that lim_{t→∞} ‖x(t) − x*(t, θ)‖ ≤ ϱ, with x*(t, θ) the unique minimizer of (1). In addition, x ∈ X, u ∈ U for all t ≥ 0.

5. DIRECT EXTREMUM-SEEKING MPC IMPLEMENTATION

The two-step approach proposed above yields an effective method to design control systems that integrate MPC and real-time optimization. The main advantage of this technique is that it can be used to provide robust regulation of control systems to optimal set-points in the presence of parametric and disturbance uncertainties. In this section, we explore the possibility of implementing the integrated RTO/MPC task using a one-step approach.
As reported in Angeli and Rawlings (2011), real-time optimization objectives can be integrated by incorporating the cost function directly in the stage cost of the MPC. The main disadvantage of this technique is that the problem of real-time optimization is transformed artificially into a dynamic optimization problem. One alternative is to consider a stage cost that is associated with the best possible transient performance achievable for a gradient system. Consider the cost to be minimized, y = p(x), and assume that the closed-loop system is such that

  \dot{x} = -\frac{\partial p}{\partial x}.

Then the rate of change of the cost is given by

  \dot{y} = -\left\| \frac{\partial p}{\partial x} \right\|^2.

Note that if the Hessian of p(x) is such that

  \frac{\partial^2 p(x)}{\partial x \partial x^T} \ge \alpha I_n, \quad \forall x \in X,

then the closed-loop system would converge exponentially to the local minimum of p(x). This simple observation suggests a suitable stage cost for the combined problem:

  L(x, u) = \left\| \frac{\partial p}{\partial x} \right\|^2.

We propose the following min-max adaptive MPC routine to solve the optimization problem.

5.1 A Min-max Approach

The formulation of the min-max MPC consists of maximizing a cost function with respect to θ ∈ Θ, ϑ ∈ D and minimizing over feedback control policies κ. The robust receding-horizon control law is

  u = \kappa_{mpc}(x, \hat\theta, z_\theta) \triangleq \kappa^*(0, x, \hat\theta, z_\theta)   (26a)
  \kappa^* \triangleq \arg\min_{\kappa(\cdot,\cdot,\cdot,\cdot)} J(x, \hat\theta, z_\theta, \kappa)   (26b)

where

  J(x, \hat\theta, z_\theta, \kappa) \triangleq \max_{\theta \in \Theta, \vartheta \in D} \int_0^T L\big(\|\gamma^p\|, u^p\big)\, d\tau + W\big(x^p(T), \tilde\theta^p(T)\big)   (27a,b)

subject to, for all τ ∈ [0, T]:

  \dot{x}^p = f(x^p) + g(x^p)u^p + q(x^p)\theta + \vartheta, \quad x^p(0) = x   (27c)
  \gamma^p = \frac{\partial p(x^p, \hat\theta^p)}{\partial x}   (27d)
  \dot{\hat{x}}^p = f(x^p) + g(x^p)u^p + q(x^p)\hat\theta^p + k_w(x^p - \hat{x}^p) + w\dot{\hat\theta}, \quad \hat{x}^p(0) = \hat{x}   (27e)
  \dot{\hat\eta}^p = -k_w \hat\eta^p, \quad \hat\eta^p(0) = \hat\eta   (27f)
  \dot{w}^p = q^T(x^p) - k_w w^p, \quad w^p(0) = w   (27g)
  (\dot{\Sigma}^{-1})^p = -(\Sigma^{-1})^p w^T w (\Sigma^{-1})^p, \quad (\Sigma^{-1})^p(0) = \Sigma^{-1}   (27h)
  \dot{\hat\theta}^p = \mathrm{Proj}\big\{ \gamma (\Sigma^{-1})^p w^T (e^p - \hat\eta^p), \hat\theta \big\}, \quad \tilde\theta^p = \theta - \hat\theta^p, \quad \hat\theta^p(0) = \hat\theta   (27i)
  u^p(\tau) \triangleq \kappa\big(\tau, x^p(\tau), \hat\theta^p(\tau)\big) \in U   (27j)
  x^p(\tau) \in X, \quad x^p(T) \in X_f\big(\tilde\theta^p(T)\big).   (27k)
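The gradient-flow argument motivating this stage cost is easy to check numerically: for a strongly convex cost, the flow dx/dt = −∂p/∂x drives the gradient, and hence the stage cost L = ‖∂p/∂x‖², to zero exponentially. The quadratic below is an illustrative choice.

```python
import numpy as np

# Gradient flow x_dot = -dp/dx for a strongly convex quadratic
# p(x) = 0.5 x^T Q x, with Q >= alpha*I and alpha = lambda_min(Q) > 0.
# Along the flow, y_dot = -||dp/dx||^2 <= -2*alpha*y, so the cost and the
# stage cost decay exponentially.
Q = np.array([[2.0, 0.5],
              [0.5, 1.0]])
grad = lambda x: Q @ x
p = lambda x: 0.5 * x @ Q @ x

x0 = np.array([1.0, -1.0])
x, dt = x0.copy(), 1e-3
for _ in range(20000):                 # integrate to t = 20
    x = x + dt * (-grad(x))            # x_dot = -dp/dx
```

Both p(x(t)) and ‖∂p/∂x‖ are driven to (numerical) zero, which is the behaviour the stage cost L = ‖∂p/∂x‖² rewards.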

The effect of future parameter adaptation is also accounted for in this formulation. The conservativeness of the algorithm is reduced by parameterizing both W and X_f as functions of θ̃(T). While it is possible for the set Θ to contract upon θ over time, the robustness feature due to ϑ ∈ D will still remain.

Algorithm 3. The MPC algorithm performs as follows. At sampling instant t_i:
(1) Measure the current state of the plant x(t) and obtain the current values of the matrices w and Σ⁻¹ from equations (4) and (10a), respectively.
(2) Obtain the current value of the parameter estimates θ̂ and uncertainty bound z_θ from (10b) and (16), respectively. Update the uncertainty sets following (17).
(3) Solve the optimization problem (26) and apply the resulting feedback control law to the plant until the next sampling instant.
(4) Increment i = i + 1. Repeat the procedure from step 1 for the next sampling instant.

Criterion 10. The terminal penalty function W : X_f × Θ̃⁰ → [0, +∞] and the terminal constraint function X_f : Θ̃⁰ → X are such that for each (θ̂, θ̃) ∈ (Θ⁰ × Θ̃⁰), there exists a feedback k_f(·, θ̂) : X_f → U satisfying
(1) x*(X, θ) ∈ X_f(θ̃), X_f(θ̃) closed;
(2) k_f(x, θ̂) ∈ U, ∀ x ∈ X_f(θ̃);
(3) W(x, θ̃) is continuous with respect to x ∈ R^{n_x};
(4) ∀ x ∈ X_f(θ̃)\Ξ, X_f(θ̃) is strongly positively invariant under k_f(x, θ̂) with respect to ẋ ∈ f(x) + g(x)k_f(x, θ̂) + q(x)Θ + D;
(5) L(γ, k_f(x, θ̂)) + (∂W/∂x) F(x, k_f(x, θ̂), θ, ϑ) ≤ 0, ∀ x ∈ X_f(θ̃)\Ξ, with γ = ∂p(x, θ̂)/∂x.

Criterion 11. For any θ̃₁, θ̃₂ ∈ Θ̃⁰ such that ‖θ̃₂‖ ≤ ‖θ̃₁‖:
(1) W(x, θ̃₂) ≤ W(x, θ̃₁), ∀ x ∈ X_f(θ̃₁);
(2) X_f(θ̃₂) ⊇ X_f(θ̃₁).

The target set Ξ has the same significance as in other standard robust MPC approaches. In this case, Ξ ⊂ X is a set containing the unknown optimal setpoint x*. The use of the gradient of p(x, θ) in the stage cost makes it possible to force convergence of the closed-loop system to a neighbourhood of an estimated optimal setpoint x*(θ̂). Convergence to the set Ξ comes as a result of the proposed set-based parameter identification routine, which provides conditions under which the parameter estimates θ̂ approach the true value. The proposed choice of stage cost attempts to combine the goals of MPC and real-time optimization. By minimizing a measure of the estimated gradient of the objective function, one can guarantee that the MPC controller achieves the real-time optimization while ensuring some degree of transient performance. The robust stabilization of the target set requires the revised condition C5, stating that the function W is a local robust CLF for the uncertain system (2) with respect to θ ∈ Θ and ϑ ∈ D. The main challenge with this formulation is that the resulting optimal equilibrium to be stabilized cannot be known in advance. This particular property of the proposed MPC formulation requires some care in the definition of a suitable terminal cost and terminal set. We state the robust stability of the closed-loop MPC system to the target set Ξ, that is, a set containing the unknown optimal setpoint.

5.2 Main Results

Theorem 12. Let X_{d0} ≜ X_{d0}(Θ⁰) ⊆ X denote the set of initial states with uncertainty Θ⁰ for which (26) has a solution. Assuming Criteria 10 and 11 are satisfied, then


the closed-loop system state x, given by (2), (10), (16) and (26), originating from any x₀ ∈ X_{d0}, feasibly approaches the target set Ξ as t → +∞.

Proof: Feasibility: The closed-loop stability is based upon the feasibility of the control action at each sample time. Assume, at time t, that an optimal solution u^p_{[0,T]} to the optimization problem (26) exists and is found. Let Θ^p denote the estimated uncertainty set at time t and Θ^v denote the set at time t + δ that would result from the feedback implementation of u_{[t,t+δ]} = u^p_{[0,δ]}. Also, let x^p represent the worst-case state trajectory originating from x^p(0) = x(t) and x^v represent the trajectory originating from x^v(0) = x(t + δ) under the same feasible control input u^v_{[δ,T]} = u^p_{[δ,T]}. Moreover, let

  X^a_{\Theta^b} \triangleq \{ x^a \mid \dot{x}^a \in F(x^a, u^p, \Theta^b, D) \triangleq f(x^a) + g(x^a)u^p + q(x^a)\Theta^b + D \}.

Since u^p_{[0,T]} is optimal with respect to the worst-case uncertainty scenario, it follows that u^p_{[0,T]} steers any trajectory x^p ∈ X^p_{Θ^p} to the terminal region X^p_f. Since Θ^p is guaranteed not to increase in size over time, it follows that Θ^v ⊆ Θ^p. This, in turn, implies that x^v ∈ X^p_{Θ^v} ⊆ X^p_{Θ^p}. Since the terminal region X^p_f is strongly positively invariant for the nonlinear system (2) under the feedback k_f(·, ·), and since the input constraints are satisfied in X^p_f and X^v_f ⊇ X^p_f by Criterion 10(2), 10(4) and 11(2), respectively, one can conclude that the input u = [u^p_{[δ,T]}, k_{f[T,T+δ]}] is a feasible solution of (26) at time t + δ. By induction, it follows that the dynamic optimization problem is feasible for all t ≥ 0.

Stability: The stability of the closed-loop system is established by proving strict decrease of the optimal cost J*(x, θ̂, z_θ) ≜ J(x, θ̂, z_θ, κ*). Let the trajectories (x^p, θ̂^p, θ̃^p, z_θ^p) and control u^p correspond to any worst-case minimizing solution of J*(x, θ̂, z_θ), and let

  \gamma^p = \frac{\partial p(x^p, \hat\theta^p)}{\partial x}.

If x^p_{[0,T]} were extended to τ ∈ [0, T + δ] by implementing the feedback u(τ) = k_f(x^p(τ), θ̂^p(τ)) on τ ∈ [T, T + δ], then Criterion 10(5) guarantees the inequality

  \int_T^{T+\delta} L\big(\gamma^p, k_f(x^p, \hat\theta^p)\big) d\tau + W(x^p_{T+\delta}, \tilde\theta^p_T) - W(x^p_T, \tilde\theta^p_T) \le 0   (28)

where, in (28) and in the remainder of the proof, x^p_σ ≜ x^p(σ) and θ̃^p_σ ≜ θ̃^p(σ) for σ = T, T + δ. The optimal cost satisfies

  J^*(x, \hat\theta, z_\theta) = \int_0^T L(\gamma^p, u^p) d\tau + W(x^p_T, \tilde\theta^p_T)
  \ge \int_0^\delta L(\gamma^p, u^p) d\tau + \int_\delta^T L(\gamma^p, u^p) d\tau + \int_T^{T+\delta} L\big(\gamma^p, k_f(x^p, \hat\theta^p)\big) d\tau + W(x^p_{T+\delta}, \tilde\theta^p_T)   (29)
  \ge \int_0^\delta L(\gamma^p, u^p) d\tau + \int_\delta^T L(\gamma^p, u^p) d\tau + \int_T^{T+\delta} L\big(\gamma^p, k_f(x^p, \hat\theta^p)\big) d\tau + W(x^p_{T+\delta}, \tilde\theta^p_{T+\delta})   (30)
  \ge \int_0^\delta L(\gamma^p, u^p) d\tau + J^*\big(x(\delta), \hat\theta(\delta), z_\theta(\delta)\big),   (31)

where (29) follows from (28) and (30) follows from Criterion 11. Then, it follows from (31) that

  J^*\big(x(\delta), \hat\theta(\delta), z_\theta(\delta)\big) - J^*(x, \hat\theta, z_\theta) \le - \int_0^\delta L(\gamma^p, u^p) d\tau.   (32)

Since the stage cost is assumed to be such that L(0, u^p) = 0 and locally convex with respect to the gradient of p(x^p, θ̂^p), it follows that x(t) converges asymptotically to a neighbourhood of x*(θ̂), where x*(θ̂) is the critical value of p(x, θ̂). Closed-loop stability is established by the feasibility of the control action at each sample time and the strict decrease of the optimal cost J*. The proof follows from the fact that the control law is optimal with respect to the worst-case uncertainty (θ, ϑ) ∈ (Θ, D) scenario and that the terminal region X^p_f is strongly positively invariant for (2) under the (local) feedback k_f(·, ·). If the conditions of Lemma 5 are met, it follows that z_θ → 0 and therefore the closed-loop system reaches a neighbourhood of the true unknown setpoint x* subject to the worst-case disturbance ϑ ∈ D.

6. SIMULATION EXAMPLE

Consider the parallel isothermal stirred-tank reactor in which reagent A forms product B and waste-product C (DeHaan and Guay (2005)). Let x = [A₁, A₂]ᵀ, θ = [k₁₁, k₁₂, k₂₁, k₂₂]ᵀ and u = [F₁ⁱⁿ, F₂ⁱⁿ]ᵀ, where Aᵢ denotes the concentration of chemical A in reactor i, the k_{ij} are the reaction kinetic constants, which are only nominally known, and the inlet flows Fᵢⁱⁿ are the control inputs. The dynamics of the system can be expressed in the form

  \dot{x} = -\underbrace{\begin{bmatrix} x_1 k_{V1}(\xi_1 - V_{10} + \xi_3)/\xi_1 \\ x_2 k_{V2}(\xi_2 - V_{20} + \xi_4)/\xi_2 \end{bmatrix}}_{f_{p1}}
  + \underbrace{\begin{bmatrix} A^{in}/\xi_1 & 0 \\ 0 & A^{in}/\xi_2 \end{bmatrix}}_{g} u
  - \underbrace{\begin{bmatrix} x_1 & 2x_1^2 & 0 & 0 \\ 0 & 0 & x_2 & 2x_2^2 \end{bmatrix}}_{f_{p2}} \theta,

where ξ₁, ξ₂ are the two tank volumes and ξ₃, ξ₄ are the PI integrators. The system parameters are V₁₀ = 0.9, V₂₀ = 1.5, k_{v1} = k_{v2} = 1, P_A = 5, P_B = 26, p₁₁ = p₂₁ = 3 and p₁₂ = p₂₂ = 1.

The economic cost function is the net expense of operating the process at steady state.


  p(A_i, s, \theta) = \sum_{i=1}^{2} \Big[ (p_{i1} s_i + P_A - P_B) k_{i1} A_i V_{i0} + (p_{i2} s_i + 2P_A) k_{i2} A_i^2 V_{i0} \Big]   (33)

where P_A, P_B denote component prices and p_{ij} is the net operating cost of reaction j in reactor i. The disturbances s₁, s₂ reflect changes in the operating cost (utilities, etc.) of each reactor. The control objective is to robustly regulate the process to the operating point that optimizes the economic cost (33) while satisfying the state constraints 0 ≤ Aᵢ ≤ 3 and c_v = A₁²V₁₀ + A₂²V₂₀ − 15 ≤ 0 and the input constraints 0.01 ≤ Fᵢⁱⁿ ≤ 0.2. The reaction kinetics are assumed to satisfy 0.01 ≤ kᵢ ≤ 0.2. The sampling time is taken to be 0.1 seconds. The robustness of the adaptive controller is guaranteed via the Lipschitz bound method. The stage cost is selected as the quadratic cost L(γ, u) = (1/2) Σ_{i=1}^{2} (∂p(Aᵢ, s, θ̂)/∂Aᵢ)².

6.1 Terminal Penalty and Terminal Set Design

A Lyapunov function for the terminal penalty is defined as the input-to-state stabilizing control Lyapunov function (iss-clf):
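Equation (33) can be evaluated directly. The sketch below plugs in the stated prices P_A = 5, P_B = 26, costs p_{i1} = 3, p_{i2} = 1 and volumes V₁₀ = 0.9, V₂₀ = 1.5; the Aᵢ, sᵢ and kinetic values are illustrative assumptions (k_{ij} chosen inside the stated bounds [0.01, 0.2]).

```python
import numpy as np

# Direct evaluation of the economic cost (33) for the two-reactor example.
PA, PB = 5.0, 26.0
p1 = np.array([3.0, 3.0])        # p_11, p_21
p2 = np.array([1.0, 1.0])        # p_12, p_22
V0 = np.array([0.9, 1.5])        # V_10, V_20

def econ_cost(A, s, k1, k2):
    """p(A_i, s, theta) from (33): net expense of steady-state operation."""
    return float(np.sum((p1 * s + PA - PB) * k1 * A * V0
                        + (p2 * s + 2.0 * PA) * k2 * A ** 2 * V0))

cost = econ_cost(A=np.array([1.0, 1.0]), s=np.zeros(2),
                 k1=np.array([0.1, 0.1]), k2=np.array([0.05, 0.05]))
# a negative cost means net profit from converting A to B at these conditions
```

Raising the disturbances sᵢ increases the operating cost, which is the effect the ramp in s₂ exercises in the simulation below.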

  W(x_e) = \frac{1}{2} \left\| \frac{\partial p(x, s, \hat\theta)}{\partial x} \right\|^2.   (34)

Choosing a terminal controller

  u = k_f(x_e) = -f_p^{-1}\big( -f + g(x)\hat\theta + k_1 x_e + k_2\, g\, g^T x_e \big)   (35)

with design constants k₁, k₂ > 0, the time derivative of (34) becomes

  \dot{W}(x_e) = -k_1 x_e^T x_e - x_e^T g\, \theta - k_2 x_e^T g\, g^T x_e   (36)
  \le -k_1 \|x_e\|^2 + \frac{1}{4 k_2} \|\theta\|^2.   (37)

The stability condition requires Ẇ(x_e(T)) + L(T) ≤ 0. We choose the weighting matrices of L as Q = 0.5I and R = 0. The terminal state region is selected as

  X_{ef} = \{ x_e : W(x_e) \le \alpha_e \}   (38)

such that

  k_f(x_e) \in U, \quad \dot{W}(T) + L(T) \le 0, \quad \forall (\theta, x_e) \in (\Theta, X_{ef}).   (39)

Since the given constraints require the reaction kinetics θ and the concentration x to be positive, it follows that

  \dot{W} + L = -(k_1 - 0.5)\|x_e\|^2 - x_e^T g\, \theta - k_2 x_e^T g\, g^T x_e \le 0   (40)

for all k₁ > 0.5 and x_e > 0. Moreover, for x_e < 0, the constants k₁ and k₂ can always be selected such that (40) is satisfied for all θ ∈ Θ. The task of computing the terminal set is then reduced to finding the largest possible α_e such that k_f(·) ∈ U for all x ∈ X_ef. The terminal cost (34) is used for this simulation, and the terminal set is re-computed at every sampling instant using the current setpoint value.

The system was simulated subject to a ramping measured economic disturbance in s₂ from t = 6 to 10. The simulation results are presented in Figures 1 to 5. The phase trajectories displayed in Figure 1 show that the reactor states

x1 and x2 obey the imposed constraints. The concentration of A in the reactor, x1, approaches its upper bound, and the system trajectories approach the constraints towards the end of the simulation. Figure 2 shows that the actual, unknown setpoint cost p(t, xr, θ) converges to the optimal, unknown p∗(t, x∗, θ). The effect of the parameter estimation transient is observed initially but vanishes quickly. As soon as the parameter estimates reach their true values, the economic MPC approach converges quickly to the unknown optimum, as expected. Figure 3 confirms the effectiveness of the adaptive MPC in tracking the desired setpoint, and Figure 4 demonstrates the convergence of the parameter estimates to their true values. Note that, in principle, the adaptive MPC has a self-exciting feature that penalizes large estimation errors. The simulation results confirm the effectiveness of the proposed adaptive control design. The control variables are shown in Figure 5; the required control action is implementable and satisfies the given constraints.

7. CONCLUSIONS

This paper provides a formal design technique for solving economic optimization problems for a class of constrained nonlinear uncertain systems subject to parametric and disturbance uncertainties. Two approaches are discussed. The first considers a robust adaptive tracking MPC formulation. The second proposes a new direct adaptive extremum-seeking MPC approach for a class of uncertain nonlinear systems. The main advantage of the direct approach is that it naturally leads to a robust adaptive economic MPC approach that guarantees robust stabilization of the unknown optimal operating conditions. The technique is also readily implementable for discrete-time nonlinear systems.

REFERENCES

V. Adetola and M. Guay. Robust adaptive MPC for constrained uncertain nonlinear systems. Int. J. Adaptive Control and Signal Processing, 25(2):155–167, 2011.
V. Adetola and M. Guay. Integration of real-time optimization and model predictive control. J. Process Control, 20(2):125–133, 2010.
D. Angeli and J.B. Rawlings. Economic optimization using model predictive control with a terminal cost. Annual Reviews in Control, 35(2):178–186, 2011.
D. Angeli and J.B. Rawlings. Receding horizon cost optimization and control for nonlinear plants. In Proceedings of the 8th IFAC Symposium on Nonlinear Control Systems, pages 1217–1223, Bologna, Italy, 2010.
K. Basak, K.S. Abhilash, S. Ganguly, and D.N. Saraf. Online optimization of a crude distillation unit with constraints on product properties. Industrial & Engineering Chemistry Research, 41:1557–1568, 2002.
D.P. Bertsekas. Nonlinear Programming. Athena Scientific, Belmont, MA, 1995.
D. DeHaan and M. Guay. Extremum seeking control of state constrained nonlinear systems. Automatica, 41(9):1567–1574, 2005.
C.A. Desoer and M. Vidyasagar. Feedback Systems: Input-Output Properties. Academic Press, New York, 1975.
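The certainty-equivalence idea behind the adaptive economic MPC approaches discussed above — estimate the unknown parameters online and re-optimize the economic cost over a receding horizon using the current estimates — can be sketched in a few lines. The scalar plant, gradient update law, gains, and grid-search optimizer below are illustrative assumptions only, not the algorithm developed in the paper.

```python
import numpy as np

# Illustrative sketch only (not the paper's algorithm): certainty-equivalence
# adaptive economic MPC for the scalar plant  x_dot = theta*u - x  with
# unknown parameter theta. The controller estimates theta online and, at each
# step, maximizes the economic objective (here, the state x itself) over a
# receding horizon, subject to |u| <= 1 and the state constraint x <= 2.

dt, N = 0.05, 20              # sampling time, prediction horizon
theta_true = 1.5              # unknown to the controller

def predict(x0, u, th, n):
    """Forward-simulate the model under a constant input u."""
    xs, x = [], x0
    for _ in range(n):
        x = x + dt * (th * u - x)
        xs.append(x)
    return np.array(xs)

def mpc(x0, th):
    """Crude economic MPC: grid search over constant inputs, keeping the
    feasible input that maximizes the predicted sum of x (the profit)."""
    best_u, best_J = 0.0, -np.inf
    for u in np.linspace(-1.0, 1.0, 41):
        xs = predict(x0, u, th, N)
        if xs.max() > 2.0:            # enforce state constraint x <= 2
            continue
        J = xs.sum()                  # economic objective: maximize x
        if J > best_J:
            best_J, best_u = J, u
    return best_u

x, th_hat, gamma = 0.0, 0.5, 20.0     # state, parameter estimate, adaptation gain
for _ in range(400):
    u = mpc(x, th_hat)
    x_pred = x + dt * (th_hat * u - x)        # one-step model prediction
    x = x + dt * (theta_true * u - x)         # plant update (simulated)
    th_hat += gamma * (x - x_pred) * dt * u   # gradient update on prediction error

print(round(th_hat, 2), round(x, 2))          # prints: 1.5 1.5
```

As in the simulation study, the parameter estimate converges to its true value and the closed-loop state settles at the (here, constraint-interior) economically optimal operating point.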


M. Diehl, R. Amrit, and J.B. Rawlings. A Lyapunov function for economic optimizing model predictive control. IEEE Transactions on Automatic Control, 56(3):703–707, 2011.
M. Guay and T. Zhang. Adaptive extremum seeking control of nonlinear dynamic systems with parametric uncertainties. Automatica, 39:1283–1293, 2003.
M. Krstic, I. Kanellakopoulos, and P. Kokotovic. Nonlinear and Adaptive Control Design. John Wiley and Sons Inc, Toronto, 1995.
U.E. Lauks, R.J. Vasbinder, P.J. Valkenburg, and C. van Leeuwen. On-line optimization of an ethylene plant. Computers & Chemical Engineering, 16:S213–S220, 1992.

Fig. 3. Reference trajectories and closed-loop states

Fig. 1. Phase diagram (x2 vs. x1) and feasible state region (constraint boundary cv = 0)

Fig. 4. Unknown parameters k11, k12, k21, k22 and their estimates versus time

Fig. 2. Optimal and actual profit functions, p(t, x∗, θ) and p(t, x, θ̂), versus time

Fig. 5. Closed-loop system inputs u1 and u2 versus time
