4th IFAC Nonlinear Model Predictive Control Conference International Federation of Automatic Control Noordwijkerhout, NL. August 23-27, 2012
MPC under the hood / sous le capot / unter der Haube

Daniel J. Riggs, Robert R. Bitmead

Department of Mechanical & Aerospace Engineering, University of California, San Diego, 9500 Gilman Drive, La Jolla CA 92093-0411, USA
[email protected],
[email protected]
Abstract: The paper develops achieved-performance bounds for MPC applied over the infinite horizon to constrained systems with persistent disturbances and persistently active constraints. This is approached from a minimalist perspective: introducing as few assumptions as possible and separating the feasibility analysis, which is shown to be purely topological, from the performance analysis, which is based on value functions and builds on the work of Jadbabaie, Hauser, Grüne and Rantzer. A specific focus is on the requirements for stability, where MPC stability in the sense of bounded-input/bounded-state is addressed.

1. INTRODUCTION

It is C for Control in MPC. This emphasizes achieving stability and disturbance-rejection control via the MPC paradigm. Our aim in this paper is to explore what can be said about the stabilization and disturbance-rejection performance achieved by MPC and how this is linked to the receding-horizon problem specification. The driving benefit of MPC lies in its capacity to handle constraints via the link to constrained optimization. The viability of this optimization lies, in turn, in its finite dimensionality, which is tied to the MPC horizon. This raises a central question for MPC design: how does the behavior over this finite horizon translate, or fail to translate, into infinite-horizon stability and performance on the target system? Proceeding from the explication of the computer-bound, finite-horizon designed system and of the real-world, receding-horizon, closed-loop achieved system, we seek to explore what each of the MPC recipe ingredients (objective function, horizon, constraints, terminal penalty, stage cost, terminal constraint, etc.) brings into play, individually or in concert, for the achieved system's properties. The objective is the reexamination and contextualizing of MPC with a focus on the achieved control-system behavior.
Throughout the paper, we refer to and build upon the work of Jadbabaie et al. (2001), Jadbabaie and Hauser (2005) and Grüne and Rantzer (2008) and their use of value-function analysis. The core attraction of MPC is its capacity to handle constraints on system inputs, states, and/or outputs. Its principal domain of application success has been as a disturbance-rejection controller in the process industries, where the constraints are persistently in play because of the disturbances, and sample times permit significant on-line computation. We shall concentrate on MPC for disturbance rejection with constraints, since this is where performance can be assessed. We commence with the target system to be controlled.
x_{t+1} = f(x_t, u_t) + w_t,   x_0 given.   (1)

The state and control are constrained to x_t ∈ X and u_t ∈ U.

Assumption 1. (Available Information). The state x_t is available at every time t. Further, we presume the dynamic model f is known exactly.

Assumption 2. The stochastic process {w_t, t ≥ 0}, defined on probability space (Ω, B, P_w), is independent and identically distributed (i.i.d.) and takes values in the compact set W ⊂ R^n.

Remark 1. The assumed compactness of the disturbance space W allows for analysis of systems which operate with persistently active, hard constraints; that is, systems where constraint violation is prohibited but constraint activity has non-zero probability in the future σ-algebra from any state. Further, almost-sure constraint satisfaction would not be possible if X or U were taken to be compact while w_t had infinite support. We also note that the assumption of bounded i.i.d. fundamental stochastic variables {w_t} does not preclude the system being subject to unbounded disturbances, only that their independent increments are required to be bounded.

1.1 The Three Stooges

We identify three distinct performance measures at play in MPC. Since our interest is in state-regulation performance with persistent disturbances and over the infinite horizon, a discounted cost is applied in order that the performance value function be finite.

Definition 1. The Discount Factor α is a number in [0, 1) when disturbances are present, i.e. when the w_t are not necessarily zero. In the undisturbed case, where w_t = 0 for all t, the choice α = 1 is also possible and is described as undiscounted.

The inclusion of the discounting factor in the definition of the optimal value function to accommodate the disturbance-rejection framework is a central departure from
the work of Grüne and Rantzer (2008) and Jadbabaie and Hauser (2005).

Definition 2. The Optimal Value Function is

V_∞(x_0) = min Σ_{i=0}^{∞} α^i E_w[l(x_i, u_i(x_i))],   (2)

for system (1) over admissible controls u_i(x_i) and state constraints x_t ∈ X, t ≥ 0, almost surely. Here the expectation operator, E_w, is taken with respect to measure P_w over the σ-algebra generated by {w_t}. We denote the associated feedback controller as u = µ_∞(x).

Assumption 3. The stage cost l(·,·) is positive definite in x; that is, l(x, u) ≥ W(x), ∀u ∈ U, where W(x) is a positive definite function.

The MPC problem poses and solves a related horizon-N optimization.

Definition 3. The Designed Value Function is

V_N(x_0) = min Σ_{i=0}^{N−1} α^i E_w[l(x_i, u_i)] + α^N E_w[F(x_N)],   (3)

for system (1) over admissible control policies u_i(x_i) and state constraints {x_j ∈ X_j, j = 1, …, N} almost surely. Here l(x, u) coincides with that in the optimal value function (2) and F(·) is the terminal cost in the MPC design. The solution to this design problem is a sequence of control policies {u_0^d(·), …, u_{N−1}^d(·)}, from which the MPC control law is taken as µ_N(x) = u_0^d(x).

The stage cost l(·,·) is bequeathed from the infinite-horizon problem to the MPC design problem and represents a measure of disturbance-rejection performance that is of importance to controlled-system operation. Since our specification of the MPC problem is time-invariant, its solution µ_N(x) is a static state-feedback law. The interest in MPC performance lies in the properties of this feedback law as a disturbance-rejection controller. Accordingly, we define the achieved value function.

Definition 4. The Achieved Value Function is the infinite-horizon value function of system (1) under MPC control u = µ_N(x),

V_∞^{µ_N}(x_0) = Σ_{i=0}^{∞} α^i E_w[l(x_i, µ_N(x_i))].   (4)

As identified in Grüne and Rantzer (2008) in general and in Bitmead et al. (1990) for linear unconstrained problems, the focus of analysis is to bound the achieved value function in terms of the known designed value function and the desired optimal value function. In particular, the adoption of design techniques for MPC flowing from, and thereby ensuring, these bounds is the ultimate goal.

1.2 Admissible controls

The MPC problem at time t and from state x_t involves constrained optimization of the designed value function V_N(x_t) from (3) subject to the system dynamics (1) and the following state and input constraints:

x_{t+j+1} ∈ X_{j+1} ⊆ X,   (5)
u_{t+j} ∈ U,   (6)

almost surely, for j = 0, …, N − 1. This set of constraints defines the class of admissible or feasible controls.

Assumption 4. The set of initial conditions such that (2) has a feasible solution is non-empty.

Assumption 5. (Realizability of Performance). The optimal value function, V_∞(x), is finite for all feasible initial conditions x_0 = x.

Associated with these control laws, we may define three systems at time t with state x_t.

Definition 5. The Optimal System is the infinite-horizon closed-loop system under the application of the infinite-horizon optimal control, µ_∞(x_t),

x_{t+1}^o = f(x_t^o, µ_∞(x_t^o)) + w_t.   (7)

Its state sequence is {x_t^o : t = 0, 1, …}.

Definition 6. The Designed System at time t is the constrained, finite-horizon, optimally-controlled system of the MPC problem with state sequence {x_{t+i,t}^d : i = 0, …, N}, starting with x_{t,t}^d = x_t,

x_{t+j+1,t}^d = f(x_{t+j,t}^d, u_j^d(x_{t+j,t}^d)) + w_{t+j,t},   (8)

for j = 0, …, N − 1. Here {w_{t,t}, …, w_{t+N−1,t}} is a sequence of i.i.d. random variables with probability distribution P_w.

Definition 7. The Achieved System is the infinite-horizon closed-loop system under the application of the MPC control input, µ_N(x_t),

x_{t+1}^a = f(x_t^a, µ_N(x_t^a)) + w_t.   (9)

Its state sequence is {x_t^a : t = 0, 1, …}.

Under MPC, the designed system's initial state is set equal to the achieved system's state, x_t, and the designed system's optimal control, u_0^d(x_t) = µ_N(x_t), is applied to the achieved system at time t. This leads to the central observation about MPC.

Theorem 1. (Central Observation). At time t, the achieved system state and the designed system state coincide,

x_{t,t}^d = x_t^a,   (10)

and at the next time, t + 1, the achieved state satisfies

x_{t+1}^a = f(x_{t,t}^d, µ_N(x_t)) + w_t,   (11)

for random variable w_t ∈ W. In the disturbance-free case, (10)-(11) correspond to x_{t,t}^d = x_t^a and x_{t+1}^a = x_{t+1,t}^d.

This concordance between the states of the achieved and designed systems at times t and t + 1 has been commented upon by Maciejowski (2002). It leads to some very simple but powerful implications between the designed and achieved systems. We remark that the designed state values beyond the first need not be mimicked in the behavior of the achieved state.

Property 1. If the designed system possesses a solution at time t then the achieved system satisfies x_{t+1}^a ∈ X_1.
2. FEASIBILITY

We proceed by identifying conditions under which the MPC design problem has a feasible solution; first for a single time t and later for the MPC algorithm, which considers successive solution of (3). Our results are set-theoretic, as we leverage the work of Blanchini and Miani (2008).
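Before the formal development, the flavor of these set-theoretic computations can be conveyed by a small sketch. For a scalar linear system x⁺ = Ax + u + w with interval constraint sets (all numerical values below are illustrative assumptions, not from the paper), the N-feasible set of Definition 8 below reduces to a backward interval recursion: at each stage, keep the states from which some admissible input drives the state into the next feasible interval for every admissible disturbance.

```python
# Backward recursion for the N-feasible set of a scalar linear system
# x+ = A*x + u + w, |u| <= U_MAX, |w| <= W_BAR, stage constraints X_k = X_BOX.
# All numerical values are illustrative assumptions, not from the paper.
A, U_MAX, W_BAR, N = 1.2, 0.1, 0.1, 5
X_BOX = (-1.0, 1.0)

def pre(lo, hi):
    """States x for which some admissible u places A*x + u + w in [lo, hi]
    for every |w| <= W_BAR (robust one-step controllability to [lo, hi])."""
    lo2, hi2 = lo + W_BAR - U_MAX, hi - W_BAR + U_MAX
    if lo2 > hi2:
        return None          # no state can be driven in robustly
    return (lo2 / A, hi2 / A)

feas = X_BOX                 # stage-N constraint interval
for _ in range(N):
    p = pre(*feas)
    if p is None:
        feas = None
        break
    feas = (max(p[0], X_BOX[0]), min(p[1], X_BOX[1]))  # intersect with X_k

# For this unstable A with weak control authority, the feasible interval
# shrinks geometrically with horizon: here feas is roughly (-0.40, 0.40).
```

Any nonempty choice X_1 ⊆ feas in this example then satisfies the sufficient condition X_1 ⊆ X_Nφ of Theorem 2 below for recursive feasibility.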
The concordance between the designed system state x_{t+1,t}^d and the achieved system state x_{t+1}^a offered by the Central Observation (Theorem 1) allows us to ascertain constraint satisfaction on the infinite horizon via analysis of the designed state constraints {X_k, k = 1, …, N} from (3). In particular, we provide a condition on the first constraint set, X_1, which yields constraint satisfaction for the achieved system state and control signal for all time. We conclude the section by establishing conditions under which the achieved system state is bounded under MPC control. This is accomplished via consideration of the designed constraint sets and the dynamics f; that is, we prove existence of a bound on the achieved system state without specification of stage or terminal cost functions. This is new; existence of a state bound is generally established via input-to-state stability analysis, which requires specification of stage and terminal cost functions (cf. Raimondo et al. (2009)).

2.1 Recursive feasibility

Definition 8. (N-feasible set). Given the sequence of N subsets of R^n, {X_k : k = 1, …, N}, from (3), the N-feasible set of states, X_Nφ, is the subset of R^n such that there exists a finite sequence of admissible control policies {π_{k−1}^{N,φ} : k = 1, …, N} ∈ Π such that x_t ∈ X_Nφ and x_{t+k,t}^d = f(x_{t+k−1,t}^d, π_{k−1}^{N,φ}(x_{t+k−1,t}^d)) + w_{t+k−1,t} imply x_{t+k,t}^d ∈ X_k almost surely and u_{t+k−1}^d = π_{k−1}^{N,φ}(x_{t+k−1,t}^d) ∈ U for k = 1, …, N.

Definition 9. (Recursive N-feasibility). The MPC design problem is recursively N-feasible if, given that x_t ∈ X_Nφ and that π_0^{N,φ} from Definition 8 is applied, we have x_{t+1,t}^d = f(x_t, π_0^{N,φ}(x_t)) + w_{t,t} ∈ X_Nφ almost surely and u_{t+k,t}^d = π_{k−1}^{N,φ}(x_{t+k−1,t}^d) ∈ U.

Theorem 2. The MPC design problem is recursively N-feasible if X_1 ⊆ X_Nφ and X_1 ≠ ∅.

The importance of Theorem 2 is that it demonstrates that recursive N-feasibility is a topological property of the constraint sets, divorced from the presence of an optimization problem associated with the MPC control. This is particularly evident in the MPC control design of Wang et al. (2009), for example, where the optimization objective function is loosely tied to the control and state signals and the MPC design converts to a sequence of feasibility problems, although such an interpretation is not immediately evident from that paper.

The condition in Theorem 2 is only sufficient; recursive N-feasibility as given in Definition 9 implies x_{t+1,t}^d ∈ X_1 ∩ X_Nφ almost surely for arbitrary X_1 ⊆ R^n. We note that Theorem 2 deals with the MPC design problem and the possibly varying state constraint sets {X_1, X_2, …, X_N} in R^n and control constraint set U in R^m. It is a simple matter to establish that X_{(N−1)φ} ⊇ X_Nφ no matter the choice of constraint sets.

Denote by X_∞φ the feasible initial state set corresponding to the fixed constraint sets X and U from (2). By Assumption 4, X_∞φ is non-empty. When the MPC constraints are taken as subsets of X, the feasibility of the MPC problem may be simply related to that of the infinite-horizon problem. We have the following direct observations.

Lemma 1. If X_i = X for i = 1, 2, …, N, then X_1φ ⊇ X_2φ ⊇ · · · ⊇ X_Nφ ⊇ X_∞φ.

Lemma 2. For state constraint sets {X_1, X_2, …, X_N}, if X_1 ⊆ X_Nφ and X_1 ≠ ∅, so that (3) is N-feasible, and X_1 ⊆ X, then X_1 ⊆ X_∞φ.

The proof of Lemma 2 follows from recursive feasibility and the evident MPC-control invariance of the set X_1. The upshot of Lemmas 1 and 2 is that both the feasible sets and the recursive feasibility of the horizon-N MPC design problem and of the infinite-horizon problem are linked, even though their constraint sets might differ. This becomes apparent in the feature that X_Nφ need not be a superset of X_∞φ. Accordingly, we identify the set X_∞φ ∩ X_Nφ as important: it is the set of initial conditions feasible for (2) and recursively feasible for (3).

Assumption 6. The designed constraint set X_1 satisfies X_1 ⊆ X_∞φ ∩ X_Nφ and X_1 ≠ ∅.

The following corollary is immediate and extends the results of Theorem 2 to constraint satisfaction of the achieved system (9) on the infinite horizon.

Corollary 1. Suppose Assumption 6 holds and x_0 ∈ X_∞φ ∩ X_Nφ. Then the achieved system state x_t^a lies in X almost surely for all t ≥ 0.

2.2 Recursive feasibility and BIBS stability

One may establish bounded-input/bounded-state (BIBS) stability under MPC control without resort to full specification of the objective function in (3). This requires an assumption on the system (1): that f be a proper map (Bourbaki, 1998).

Definition 10. (Proper Map). A function g is a proper map if pre-images under g of compact sets are compact.

Effectively, this rules out systems with zero gain over unbounded intervals. In the linear-system case, it is a requirement met by having the A-matrix full rank. The following theorem expresses the main result of this section: via specification of a bounded state constraint X_j and supposing f is a proper map, boundedness of the achieved system state x_t^a under MPC control can be guaranteed.

Theorem 3. Suppose that: the MPC design problem is recursively N-feasible for the sequence of constraint sets {X_k : k = 1, …, N}; U is compact; and the set X_k is compact for at least one value of k ∈ {1, …, N}. Further, suppose that the function f(·, u) : X → X is a proper map for each u ∈ U. Then, provided the initial state is N-feasible, x_0 ∈ X_Nφ, the MPC control law remains feasible and the achieved system state x_t^a is almost surely bounded for all time t ≥ 0.

The novelty of this result is that it invokes no property of the optimization objective function, only of the system function f, of the MPC control constraint set U, and of the recursive feasibility. The stability established is BIBS stability from the disturbance to the state, which is stability in the sense of Lagrange rather than in the sense of
Lyapunov (Rekasius, 1963), even though we have so far not introduced any assumption about an equilibrium. In incorporating the notion of a compact set, and hence of boundedness, we have also moved the analysis from purely topological considerations to a metric-space context, although compactness of sets, and thus properness of maps, can be defined solely in a topological space.

Remark 2. Competing stability statements using Lyapunov arguments deal with asymptotic properties of the unforced or disturbance-free system (Grüne and Pannek, 2011), or deal directly with disturbances using a min-max objective function (Raimondo et al., 2009). These approaches aim to show input-to-state stability (ISS) of (1), only guaranteeing existence of a bound.

The BIBS result of Theorem 3 establishes a signal bound. ISS approaches also supply existence of a bound, though with more effort, e.g., the prescription of a cost function. The issue therefore must lie in the quantification of the magnitude of the bound. We now move towards this goal through performance quantification. But we note that the simple achievement of BIBS is an important adjunct to the description of infinite-horizon performance via discounted value functions to be developed next.

3. VALUE FUNCTION, ACHIEVED PERFORMANCE AND STABILITY

Monotonicity of the designed value function V_N(x) (3) is a property that allows for proving asymptotic properties of the MPC control in the disturbance-free case (Mayne et al., 2000; Jadbabaie et al., 2001; Jadbabaie and Hauser, 2005) and in min-max formulations with disturbances (Raimondo et al., 2009). It has also recently been used to derive quantifiable state and performance bounds on disturbance-free systems (Grüne and Rantzer, 2008; Grüne and Pannek, 2011).

In this section, we extend the use of monotonicity of V_N(x) to prescribing infinite-horizon performance bounds for the stochastic nonlinear system (1) with discounted cost. The performance-bounding inequalities presented here are generalizations of existing inequalities to the stochastic system (1) and discounted-cost formulation. In all of the following results, we assume the MPC design problem is recursively feasible with X_j ⊆ X, j = 1, …, N.

3.1 Monotonically Increasing Value Function with Horizon

In this section we consider MPC formulations yielding value functions which are monotonically non-decreasing with horizon. Jadbabaie and Hauser (2005) consider the undisturbed case w_t ≡ 0 and prove that, with selection of zero terminal cost and for sufficiently long horizon, the MPC control yields asymptotic stability of the undisturbed system; this is an important result, as it does not employ terminal constraint sets (Mayne et al., 2000) to guarantee stability. Grüne and Rantzer (2008) extend the result of Jadbabaie and Hauser (2005) by considering practical stability and deriving performance bounds for a modified stage cost which takes zero value inside a set that is attractive for the MPC control law. We generalize this result to the current stochastic-system setting and offer comparison to the new discounted-cost formulation result that is presented at the end of the section.

We begin with an elementary observation concerning the designed value function (3), which pertains regardless of discounting in the cost function or the presence of disturbances, assuming the infinite-horizon value function is finite. This is a generalization of observations made in Jadbabaie and Hauser (2005) and Grüne and Rantzer (2008), the generalization being in the consideration of the designed constraint sets involved in the MPC design problem.

Theorem 4. Consider any x ∈ X_∞φ. Assume V_∞(x) < ∞. Let the terminal cost from (3) be F(y) = 0, ∀y ∈ X, and let X_j = X, j = 1, 2, …. Then V_N(x) from (3) has a feasible solution ∀N ≥ 0 and, for i = 0, 1, …,

V_i(x) ≤ V_{i+1}(x) ≤ V_∞(x) < ∞.

The above result holds for the solution of the MPC design problem at a single time t with infinite-horizon-feasible initial condition x ∈ X_∞φ, as recursive feasibility is not considered in the theorem statement. As noted by Jadbabaie and Hauser (2005) and Grüne and Rantzer (2008), once the value functions cease to change with horizon, we have infinite-horizon optimality.

Disturbed, undiscounted, zero terminal cost case: w_t ≠ 0, α = 1, F(x) = 0.

Here we consider V_∞(x) in (2) and V_N(x) in (3) with stochastic process {w_t}, F(x) = 0 and α = 1. Grüne and Rantzer (2008) establish a relationship between the infinite-horizon achieved value and the finite-horizon optimal designed value functions for the non-zero disturbance case. Their analysis is couched as "Practical Optimality" and involves an altered stage cost function l̄(·,·), which takes the value zero in a certain set, L. Here we present a generalization of their result, originally intended to capture the convergence of systems to a neighborhood of the origin, by quoting it with reference to the stochastic system (1) outside a limit set and given constraint set X.

Theorem 5. (Grüne and Rantzer (2008)). Consider the disturbed, undiscounted (α = 1), zero terminal cost MPC problem of horizon N with prescribed designed state constraint sets and with resultant control law µ_N(x). Define:

– the set L ⊂ X to be the minimal (almost surely) invariant set under µ_N,
– the alternate stage cost

l̄(x, u) = max{l(x, u) − ε, 0} for x ∉ L, and l̄(x, u) = 0 for x ∈ L,

with corresponding optimal and achieved infinite-horizon alternate value functions V̄_∞(x) and V̄_∞^{µ_N}(x) evaluated using l̄,
– the constant σ = inf{E_w[V_N(f(x, µ_N(x)) + w)] : x ∈ X\L}.

Assume that, for some ε > 0 and for all x ∈ (X_∞φ ∩ X_Nφ)\L, there exists a scalar γ_N ∈ [0, 1] such that

V_N(x) − E_w[V_N(f(x, µ_N(x)) + w)] ≥ max[(1 − γ_N) l(x, µ_N(x)) − ε, 0].   (12)
Then, for all x ∈ X_∞φ ∩ X_Nφ,

(1 − γ_N) V̄_∞(x) ≤ (1 − γ_N) V̄_∞^{µ_N}(x) ≤ V_N(x) − σ.   (13)

The implication of condition (12) is that L is attractive for the MPC control law, though no terminal constraint is explicitly imposed. When the state is inside L, the modified stage cost is 0. The cost V̄_∞^{µ_N} then only captures the cost of convergence to L. We will see that such a modification to the cost is not required when a discount factor α ∈ [0, 1) is introduced, and hence the analysis will not be limited to evaluation of transient performance.

Given our aim to evaluate performance against the original infinite-horizon problem of interest (and associated optimal value function), we now move toward presenting results which do not require such stage-cost modification.

Disturbed, discounted, zero terminal cost case: w_t ≠ 0, α ∈ [0, 1), F(x) = 0.

The following result is an extension of the undisturbed, undiscounted performance result from Grüne and Rantzer (2008) to the case with stochastic disturbances and discounted stage cost. The result can be contrasted with Theorem 5 in that it requires neither the introduction of a modified stage cost l̄ nor convergence to a set, provided the conditions of the theorem can be satisfied.

Theorem 6. For the disturbed, discounted, zero terminal cost MPC design problem with horizon N and prescribed designed state constraints, suppose there exist a γ_N ∈ [0, 1] and a scalar w̄ ≥ 0 such that

E_w[V_N(f(x, µ_N(x)) + w)] − E_w[V_{N−1}(f(x, µ_N(x)) + w)] ≤ γ_N l(x, µ_N(x)) + w̄,   (14)

for all x ∈ X_∞φ ∩ X_Nφ. Then, with β_N := 1 − αγ_N,

β_N V_∞(x) ≤ β_N V_∞^{µ_N}(x) ≤ V_N(x) + (α/(1 − α)) w̄.   (15)

The inequalities (14) and (15) are generalizations of inequalities given in Grüne and Rantzer (2008) to the stochastic, discounted-cost case. However, we note the discount factor α does not appear in the first of these inequalities. The scalar w̄ is introduced in inequality (14) to ameliorate difficulties which might arise in achieving an upper bound on the monotonic increase rate due to the nature of the stochastic disturbance w. The penalty paid for prescription of large w̄ is evident in the performance-bound inequality (15). When longer horizons N are specified, the size required of w̄ decreases, as the designed value function approaches the infinite-horizon optimal value function. We thus have a tradeoff between horizon length and the performance bound we would like to guarantee.

The discount factor can be selected as α = 1 with the incurred penalty of a meaningless performance bound; however, selection of α = 1 and w̄ = 0 for the undisturbed case recovers the results for the undisturbed case in Grüne and Rantzer (2008). Specification of a small discount factor might yield tight bounds. Indeed, selection of α = 0 can yield infinite-horizon optimal performance, as both the infinite-horizon and MPC design problems reduce to the one-step-ahead problem. But useful state bounds on the infinite horizon might be compromised, and the state bound could be relegated to that established by the constraint-based BIBS result in Theorem 3.

Furthermore, Theorem 6 does not require modification of the stage cost. This allows for comparison of the achieved value function under MPC control directly with the original infinite-horizon problem of interest.

The approach using value functions which are monotonically non-decreasing with horizon, N, yields a sequence of results "for sufficiently large N." The monotonicity of the value functions follows in Theorem 4 directly from the zero terminal cost, F(·) = 0, regardless of discounting or disturbances and assuming that V_∞(x) is finite. The study of value functions monotonically non-increasing with horizon is then limited to consideration of the terminal cost, which cannot be taken as zero.

3.2 Monotonically Decreasing Value Function with Horizon

In this section we consider design of the positive definite terminal cost F(x) to yield designed value functions which are monotonically non-increasing with horizon. The main idea employed here is based on assuming F(x) is a special type of Control Lyapunov Function (CLF) (Artstein, 1983). The attraction of selecting the terminal cost as a CLF is that asymptotic stability and bounded achieved performance can be established for any horizon.

Jadbabaie et al. (2001), Limón et al. (2006), Grüne and Rantzer (2008), and Grüne and Pannek (2011) discuss selection of the terminal cost as a CLF for the undisturbed, undiscounted case. We generalize existing results to the stochastic system and discounted cost of the present formulation. In the following lemma, an inequality condition on the terminal cost is established for the stochastic, discounted formulation that mirrors undisturbed, undiscounted results from Grüne and Rantzer (2008).

Lemma 3. Suppose that for all x ∈ X and all admissible controls, the terminal cost F(·) satisfies, for some w̄ ∈ [0, ∞),

α E_w[F(f(x, u) + w)] − F(x) ≤ −l(x, u) + w̄.   (16)

Then, for all k ≥ 1 and for x ∈ X_∞φ ∩ X_Nφ,

V_k(x) ≤ V_{k−1}(x) + α^{k−1} w̄.   (17)

The inequality condition on the terminal cost (16) yields the inequality on the designed value function (17), which in turn yields bounds on the achieved value function under MPC control, as we shall show shortly. Paralleling inequality (14) in Theorem 6, the positive scalar w̄ in (16) relaxes the inequality. In fact, it can be shown that, even for scalar linear systems with stochastic disturbances and quadratic stage cost, the inequality cannot be satisfied with positive definite F(x) and w̄ = 0.

Here, specification of a small discount factor α might also ease difficulties in establishing inequality (16). Though, as discussed in the previous section, small α might also relegate bounds on the achieved system state, x_t^a, to those established by the BIBS stability result in Theorem 3. The terminal cost function requirement (16) must hold on the entire space X. This might appear unfortunate,
but the relaxation of the inequality provided by w̄ allows for flexibility in the selection of F(x). The penalty paid for poor terminal-cost selection, given large w̄, is a conservative performance bound, which we now show.

Theorem 7. For the disturbed, discounted, non-zero terminal cost MPC design problem with horizon N and prescribed state constraints, suppose

V_N(x) − V_{N−1}(x) ≤ w̄,   (18)

for all x ∈ X_∞φ ∩ X_Nφ and some w̄ ≥ 0. Then,

V_∞(x) ≤ V_∞^{µ_N}(x) ≤ V_N(x) + (α/(1 − α)) w̄.   (19)

Here we have made it apparent that the performance bound (19) follows if the designed value function satisfies inequality (18), which in turn can be made possible via judicious selection of the terminal cost.

The achieved performance bound (19), like its counterpart with zero terminal cost in Theorem 6, depends on the discount factor α and the positive scalar w̄. The comments made following Theorem 6 also apply in this case. However, contrary to the performance result from Theorem 6, in which long horizons are required to achieve a sensible bound, the bound here can be satisfied for all N ≥ 1, as long as the selected terminal cost satisfies (16).

The following corollary extends Theorem 7 to yield a result on the rate of convergence of the designed value function. With this result in place, it becomes apparent why selection of small α in Theorems 6 and 7 might not be appropriate.

Corollary 2. Suppose that the terminal cost F(x) satisfies (16) from Lemma 3, ∀x ∈ X. Then,

E_w[V_N(f(x, µ_N(x)) + w)] ≤ (1/α) (1 − l(x, µ_N(x)) / (F(x) + w̄/(1 − α))) V_N(x) + w̄.

For boundedness of the value function, and hence boundedness of the state, the term (1/α)(1 − l(x, µ_N(x))/(F(x) + w̄/(1 − α))) must be strictly bounded by unity. This encourages selection of a larger discount factor α.

4. DISCUSSION AND CONCLUSIONS

Our approach has been to extend the analysis of achieved performance to include disturbance rejection. In order to do this, we have introduced a value function based on discounted costs. Techniques from the undisturbed analysis due to Jadbabaie et al. (2001), Jadbabaie and Hauser (2005) and Grüne and Rantzer (2008) have been adapted to apply with disturbances. The centerpiece of the analysis has been to explore the application of (relaxed) monotonicity of the value function with horizon, both its dependence on the terminal cost function and its consequences for achieved performance bounds.

The paper has considered the recursive feasibility and achieved performance of a disturbed MPC problem. In doing so, we have built on the value-function approaches due to Jadbabaie, Hauser, Grüne and Rantzer. But we have also drawn in modifications to handle the persistent presence of disturbances, to yield performance bounds of more direct application to the usage of MPC as a feedback controller. We have extended the focus on monotonicity of the value functions with horizon length to gain a handle on modifications to the MPC design problem, chiefly through the terminal cost function.

The benefit of a discounted cost function is that the infinite-horizon optimal control performance value is finite. A consequence is that the normal questions of closed-loop stability become moot. Indeed, because of the presence of the disturbances, asymptotic stability is not achievable and a different measure of behavior is needed. We note that BIBS stability arises as a side-effect of the MPC problem formulation, via the (almost) topological analysis of recursive feasibility. It is our view that BIBS is the most appropriate form of stability for MPC in application.

REFERENCES

Artstein, Z. (1983). Stabilization with relaxed controls. Nonlinear Analysis: Theory, Methods and Applications, 7, 1163–1173.
Bitmead, R.R., Gevers, M., and Wertz, V. (1990). Adaptive Optimal Control: The Thinking Man's GPC. Prentice Hall, Englewood Cliffs, New Jersey.
Blanchini, F. and Miani, S. (2008). Set-Theoretic Methods in Control. Birkhäuser, Boston, MA, USA.
Bourbaki, N. (1998). General Topology. Springer-Verlag, Berlin.
Grüne, L. and Pannek, J. (2011). Nonlinear Model Predictive Control: Theory and Algorithms. Springer-Verlag, London.
Grüne, L. and Rantzer, A. (2008). On the infinite horizon performance of receding horizon controllers. IEEE Transactions on Automatic Control, 53(9), 2100–2111. doi:10.1109/TAC.2008.927799.
Jadbabaie, A. and Hauser, J. (2005). On the stability of receding horizon control with a general terminal cost. IEEE Transactions on Automatic Control, 50(5), 674–678. doi:10.1109/TAC.2005.846597.
Jadbabaie, A., Yu, J., and Hauser, J. (2001). Unconstrained receding-horizon control of nonlinear systems. IEEE Transactions on Automatic Control, 46(5), 776–783. doi:10.1109/9.920800.
Limón, D., Álamo, T., Salas, F., and Camacho, E. (2006). On the stability of constrained MPC without terminal constraint. IEEE Transactions on Automatic Control, 51(5), 832–836. doi:10.1109/TAC.2006.875014.
Maciejowski, J.M. (2002). Predictive Control with Constraints. Prentice Hall, Englewood Cliffs, New Jersey.
Mayne, D.Q., Rawlings, J.B., Rao, C., and Scokaert, P. (2000). Constrained model predictive control: Stability and optimality. Automatica, 36(6), 789–814. doi:10.1016/S0005-1098(99)00214-9.
Raimondo, D.M., Limón, D., Lazar, M., Magni, L., and Camacho, E. (2009). Min-max model predictive control of nonlinear systems: A unifying overview on stability. European Journal of Control, 15(1), 5–21.
Rekasius, Z. (1963). Lagrange stability of nonlinear feedback systems. IEEE Transactions on Automatic Control, 8(2), 160–163. doi:10.1109/TAC.1963.1105547.
Wang, C., Ong, C.J., and Sim, M. (2009). Convergence properties of constrained linear system under MPC control law using affine disturbance feedback. Automatica, 45(7), 1715–1720. doi:10.1016/j.automatica.2009.03.002.