An active set solver for input-constrained robust receding horizon control

Automatica 50 (2014) 155–161

Johannes Buerger, Mark Cannon, Basil Kouvaritakis
Department of Engineering Science, University of Oxford, OX1 3PJ, UK

Article history: Received 7 August 2012. Received in revised form 28 May 2013. Accepted 9 September 2013. Available online 30 October 2013.

Keywords: Robust control; Control of constrained systems; Dynamic programming; Min–max optimal control

Abstract: An efficient optimization procedure is proposed for computing a receding horizon control law for linear systems with linearly constrained control inputs and additive disturbances. The procedure uses an active set approach to solve the dynamic programming problem associated with the min–max optimization of an H∞ performance index. The active constraint set is determined at each sampling instant using first-order necessary conditions for optimality. The computational complexity of each iteration of the algorithm depends linearly on the prediction horizon length. We discuss convergence, closed loop stability and bounds on the disturbance l2-gain in closed loop operation.

1. Introduction

The aim of robust control is to provide guarantees of stability and of performance with respect to a suitable measure, despite uncertainty in the model of the controlled system. Model Predictive Control (MPC) uses a receding horizon strategy to derive robust control laws by repeatedly solving a constrained optimization problem online, and consequently the approach is effective for systems with constraints and bounded disturbances.

Robust receding horizon control based on a worst-case optimization was proposed in Witsenhausen (1968). The approach employed a min–max optimization, which was subsequently adopted in Campo and Morari (1987) to derive an MPC law for linear systems with uncertain impulse response coefficients. In this strategy, and in the related work (Allwright & Papavasiliou, 1992), an open loop predicted future input sequence was used to minimize the worst-case predicted performance. It was argued in Lee and Yu (1997) that by optimizing instead over closed loop predicted input sequences, control laws with improved performance and larger regions of attraction could be obtained. However, unless a


degree of optimality is sacrificed through the use of suboptimal controller parameterizations (such as, for example, those proposed in Goulart, Kerrigan, and Maciejowski (2006), Kothare, Balakrishnan, and Morari (1996) and Löfberg (2003)), strategies that involve a receding horizon optimization over predicted feedback policies generally require impractically large computational loads. For example, Kerrigan and Maciejowski (2003) and Scokaert and Mayne (1998) apply a scenario-based approach to constrained linear systems with bounded additive uncertainty, which leads to an optimization problem in a number of variables that grows exponentially with the prediction horizon length.

Parametric solution methods aim to avoid the explosion in computational complexity of robust dynamic programming with horizon length by characterizing the solution of the receding horizon optimization problem offline, typically as a feedback law that is a piecewise affine function of the model state. In Bemporad, Borrelli, and Morari (2003) and Diehl and Björnberg (2004) this method was applied to linear systems with polytopic parametric uncertainty. However, whereas MPC typically solves an optimization problem for a single initial condition at each instant, this approach requires the solution at all points in state space, and moreover necessitates determining online which of a large number of polytopic regions contains the current state.

This paper extends the methodology developed in Cannon, Liao, and Kouvaritakis (2008), Ferreau, Bock, and Diehl (2008) and Best (1996) to the case of linear systems with bounded additive uncertainty and input constraints in order to derive a robust dynamic programming solver. An online active set method is described which avoids the need to compute the solution over the entire


state space, and which forms the basis of an efficient line-search-based point location technique. The control law is optimal for a convex–concave min–max H∞ performance index, which ensures closed loop stability and a specified l2-disturbance gain bound. The algorithm's computational complexity per iteration grows only linearly with horizon length. We consider the case, which was presented in Buerger, Cannon, and Kouvaritakis (2011), of systems subject to constraints on control inputs alone; this paper provides further theoretical and numerical results and gives comparisons with max–min and open loop strategies, as well as a numerical comparison with the suboptimal min–max strategy of Goulart, Kerrigan, and Alamo (2009).

2. Problem statement and notation

We consider linear discrete time systems with model

$$x_{t+1} = A x_t + B u_t + D w_t, \quad t = 0, 1, \ldots \quad (1)$$

with state x_t ∈ R^{n_x}, control input u_t ∈ R^{n_u} and disturbance input w_t ∈ R^{n_w} at time t. Here u_t and w_t are subject to constraints u_t ∈ U, w_t ∈ W, and U and W are assumed to be convex polytopic sets defined by

$$U \triangleq \{u \in \mathbb{R}^{n_u} : F u \le \mathbf{1}\}, \qquad W \triangleq \{w \in \mathbb{R}^{n_w} : G w \le \mathbf{1}\}$$

for F ∈ R^{n_F×n_u}, G ∈ R^{n_G×n_w}, where 1 = [1 ··· 1]^T denotes a vector of conformal dimensions. We define the feedback law u*_N(x) as the solution to the following closed loop robust optimal control problem (Bemporad et al., 2003; Mayne, Raković, Vinter, & Kerrigan, 2006) over a finite horizon of N time-steps:

$$\big(u_m^*(x),\, w_m^*(x,u)\big) \triangleq \arg\min_{u \in U}\max_{w \in W} J_m(x,u,w) \quad (2a)$$

with J_m defined for m = 1, 2, ..., N by

$$J_m(x,u,w) \triangleq \tfrac{1}{2}\big(\|x\|_Q^2 + \|u\|_R^2 - \gamma^2\|w\|^2\big) + J_{m-1}^*(x^+) \quad (2b)$$
$$J_m^*(x) \triangleq J_m\big(x,\, u_m^*(x),\, w_m^*(x, u_m^*(x))\big) \quad (2c)$$

where x^+ = Ax + Bu + Dw, with the terminal cost

$$J_0^*(x) \triangleq \tfrac{1}{2}\|x\|_P^2. \quad (2d)$$

Here R is a positive-definite matrix (denoted R ≻ 0), Q is a positive-semidefinite matrix (Q ⪰ 0), ‖x‖²_Q denotes xᵀQx, and the scalar γ is chosen (as discussed in Section 3.1) to be sufficiently large that (2a) defines a strictly convex–concave min–max problem. We make the assumption that P is chosen so that

$$\|x_0\|_P^2 = \sum_{t=0}^{\infty}\big(\|x_t\|_Q^2 + \|u_t\|_R^2 - \gamma^2\|w_t\|^2\big)$$

with u_t = u^f_∞(x_t) and w_t = w^f_∞(x_t, u_t), where u^f_∞(·), w^f_∞(·) are the optimal solutions of (2a)–(2c) in the limit as N → ∞ and in the absence of the constraints u ∈ U, w ∈ W. In order to guarantee the existence of this solution we assume that (A, B) is controllable and (Q^{1/2}, A) observable. Note that u^f_∞(·), w^f_∞(·) can be computed by solving a semidefinite programming problem (see e.g. Boyd, El Ghaoui, Feron, & Balakrishnan, 1994, for details).

The problem defined in (2a)–(2d) is formulated under the assumption that the disturbance w_t is unknown when the control input u_t is chosen at time t. Since the solutions u*_m(·) and w*_m(·) depend on x and on (x, u) respectively, (2a)–(2d) defines a closed loop optimal control problem (see e.g. Lee & Yu, 1997). The sequential nature of this min–max problem and the fact that the optimization is performed over arbitrary feedback laws {u*_m(x), w*_m(x, u), m = 1, ..., N} imply that, unlike open-loop formulations of robust MPC (e.g. Campo & Morari, 1987), (2a)–(2d) cannot be solved exactly by a single quadratic program.

For given x_0, we denote the optimal state, input and disturbance sequences as x(x_0) = {x_0, ..., x_N}, u(x_0) = {u_0, ..., u_{N−1}} and w(x_0) = {w_0, ..., w_{N−1}}, where, for t = 0, ..., N−1, we define u_t = u*_{N−t}(x_t), w_t = w*_{N−t}(x_t, u_t) and x_{t+1} = Ax_t + Bu_t + Dw_t. As discussed in Section 4, a receding horizon control law is obtained by setting u_t = u*_N(x_t) at each time t.
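Although the paper points to a semidefinite programming computation of u^f_∞(·), w^f_∞(·) and P, a simple numerical alternative is to iterate the unconstrained min–max dynamic programming recursion to its fixed point. The following Python sketch is ours (not the authors' code); the helper name terminal_cost is hypothetical, and the iteration assumes γ²I − DᵀPD ≻ 0 holds at every iterate.

```python
import numpy as np

def terminal_cost(A, B, D, Q, R, gamma2, iters=5000, tol=1e-10):
    """Iterate the unconstrained min-max Riccati map to estimate P in (2d).

    Each pass maximizes over w (giving a compressed matrix Phat) and then
    minimizes over u; this is the unconstrained specialization of the
    recursion derived in Section 3.2.  R is an (nu, nu) array.
    """
    nw = D.shape[1]
    P = Q.copy()
    for _ in range(iters):
        S = gamma2 * np.eye(nw) - D.T @ P @ D   # must stay positive definite
        Phat = P + P @ D @ np.linalg.solve(S, D.T @ P)
        K = -np.linalg.solve(R + B.T @ Phat @ B, B.T @ Phat @ A)
        P_next = Q + A.T @ Phat @ (A + B @ K)
        if np.max(np.abs(P_next - P)) < tol:
            return P_next
        P = P_next
    return P
```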

3. Active set solution via Riccati recursion

This section describes a method of solving (2a)–(2d) in order to determine u*_N(x) for a given plant state x = x_p; we therefore aim to determine a local solution to the closed-loop formulation of problem (2). For a given active set, we use a Riccati recursion to solve the Karush–Kuhn–Tucker (KKT) conditions (Nocedal & Wright, 2006), which provide first-order necessary optimality conditions for problem (2). The optimal control and disturbance inputs for the corresponding equality constrained problem are obtained as a sequence of affine state feedback functions, and we give necessary and sufficient conditions for optimality of these policies with respect to problem (2). For the given active constraint set, our approach then determines state, control, disturbance and multiplier sequences as functions of the initial state x_0 using the system model (1). As in Cannon et al. (2008), we use a line search through x_0-space to update the active set, and the process is repeated until x_0 = x_p. This line search is based on homotopy of solutions to problem (2); the solution can either be initialized using the unconstrained optimal control law with x_0 = 0 or warm started using the optimal solution for the plant state at the preceding time instant. Finally we discuss how the computation required by this approach depends on the problem size.

3.1. First order optimality conditions

Let λ_t and λ̂_t denote the Lagrange multipliers associated with the constraints x_{t+1} = x̂_{t+1} + Dw_t and x̂_{t+1} = Ax_t + Bu_t, and let µ_t and η_t denote the Lagrange multipliers for the constraints u_t ∈ U and w_t ∈ W respectively. Define Lagrangian functions for the stage-wise maximization and minimization subproblems recursively as follows:

$$\hat{H}_t(\hat{x}_{t+1}, w_t, \eta_t, \lambda_t, x_{t+1}, \ldots, x_N) \triangleq -\tfrac{1}{2}\gamma^2\|w_t\|^2 + \eta_t^T(\mathbf{1} - G w_t) - \lambda_t^T\big(x_{t+1} - (\hat{x}_{t+1} + D w_t)\big) + H_{t+1}(x_{t+1}, u_{t+1}, \mu_{t+1}, \hat{\lambda}_{t+1}, \hat{x}_{t+2}, \ldots, x_N)$$

for the maximization subproblem and

$$H_t(x_t, u_t, \mu_t, \hat{\lambda}_t, \hat{x}_{t+1}, \ldots, x_N) \triangleq \tfrac{1}{2}\|x_t\|_Q^2 + \tfrac{1}{2}\|u_t\|_R^2 - \mu_t^T(\mathbf{1} - F u_t) - \hat{\lambda}_t^T\big(\hat{x}_{t+1} - (A x_t + B u_t)\big) + \hat{H}_t(\hat{x}_{t+1}, w_t, \eta_t, \lambda_t, x_{t+1}, \ldots, x_N)$$

for the minimization subproblem, with terminal condition $H_N(x_N) = J_0^*(x_N) = \tfrac{1}{2}\|x_N\|_P^2$.

Lemma 1. The solution of problem (2) at time-step t satisfies

$$J_{N-t}^*(x_t) = \min\Big\{\tfrac{1}{2}\|x_t\|_Q^2 + \tfrac{1}{2}\|u_t\|_R^2 + \hat{H}_t(\hat{x}_{t+1}, w_t, \eta_t, \lambda_t, x_{t+1}, \ldots, x_N)\Big\} \quad (3)$$

where the minimization is over variables u_t, x̂_{t+1}, and w_j, η_j, λ_j, x_{j+1} for j = t, ..., N−1 and u_j, µ_j, λ̂_j, x̂_{j+1} for j = t+1, ..., N−1, and subject to the constraints: x̂_{t+1} = Ax_t + Bu_t, Fu_t ≤ 1, and ∇_{w_j}Ĥ_j = 0, ∇_{x_{j+1}}Ĥ_j = 0 for j = t, ..., N−1, and ∇_{u_j}H_j = 0, ∇_{x̂_{j+1}}H_j = 0 for j = t+1, ..., N−1. Furthermore the objective of the maximization in (2) at time-step t, defined by $\hat{J}_{N-t}^*(\hat{x}_{t+1}) \triangleq -\tfrac{1}{2}\gamma^2\|w_{N-t}^*\|^2 + J_{N-t-1}^*(x_{t+1})$, satisfies

$$\hat{J}_{N-t}^*(\hat{x}_{t+1}) = \max\Big\{-\tfrac{1}{2}\gamma^2\|w_t\|^2 + H_{t+1}(x_{t+1}, u_{t+1}, \mu_{t+1}, \hat{\lambda}_{t+1}, \hat{x}_{t+2}, \ldots, x_N)\Big\} \quad (4)$$

where the maximization is over variables w_t, x_{t+1}, and u_j, µ_j, λ̂_j, x̂_{j+1}, w_j, η_j, λ_j, x_{j+1} for j = t+1, ..., N−1, and subject to the constraints: x_{t+1} = x̂_{t+1} + Dw_t, Gw_t ≤ 1, and µ_j ≥ 0, η_j ≥ 0, ∇_{w_j}Ĥ_j = 0, ∇_{x_{j+1}}Ĥ_j = 0, ∇_{u_j}H_j = 0, ∇_{x̂_{j+1}}H_j = 0 for j = t+1, ..., N−1.

Theorem 2. The KKT conditions defining first order necessary conditions for the optimal solution of problem (2) can be expressed as follows:

$$x_{t+1} = A x_t + B u_t + D w_t, \quad t = 0, \ldots, N-1 \quad (5)$$
$$\lambda_{t-1} = A^T \lambda_t + Q x_t, \quad t = 1, \ldots, N-1 \quad (6)$$

and, for t = 0, ..., N−1:

$$R u_t = -B^T \lambda_t - F^T \mu_t \quad (7a)$$
$$\mu_t \ge 0, \quad \mu_t^T(\mathbf{1} - F u_t) = 0, \quad \mathbf{1} - F u_t \ge 0 \quad (7b)$$
$$\gamma^2 w_t = D^T \lambda_t - G^T \eta_t \quad (8a)$$
$$\eta_t \ge 0, \quad \eta_t^T(\mathbf{1} - G w_t) = 0, \quad \mathbf{1} - G w_t \ge 0 \quad (8b)$$

with the terminal and initial conditions:

$$\lambda_{N-1} = P x_N \quad (9)$$
$$x_0 = x_p. \quad (10)$$

3.2. Riccati recursion

An active set approach solves the optimization problem (2) by solving a sequence of problems involving only equality constraints. Let s = (s^u, s^w) define a set of active constraints in (2), namely a set of constraints that are satisfied with equality at a solution of (2) for some initial state x_0. Specifically, let s^u ≜ {s^u_{0,i}, ..., s^u_{N−1,i}, i = 1, ..., n_F} and s^w ≜ {s^w_{0,i}, ..., s^w_{N−1,i}, i = 1, ..., n_G}, where s^u_{t,i}, s^w_{t,i} can take values of 0 or 1, and rewrite (7b) and (8b) as

$$e_i^T F u_t = 1,\ \ e_i^T \mu_t \ge 0 \ \text{ if } s_{t,i}^u = 1; \qquad e_i^T F u_t \le 1,\ \ e_i^T \mu_t = 0 \ \text{ if } s_{t,i}^u = 0 \quad (11)$$
$$e_i^T G w_t = 1,\ \ e_i^T \eta_t \ge 0 \ \text{ if } s_{t,i}^w = 1; \qquad e_i^T G w_t \le 1,\ \ e_i^T \eta_t = 0 \ \text{ if } s_{t,i}^w = 0 \quad (12)$$

where e_i denotes the ith column of an identity matrix of conformal dimensions. Also let F_t, G_t denote the matrices that consist of the rows of F, G corresponding to the active sets indicated by s^u_{t,i} = 1, i = 1, ..., n_F, and s^w_{t,i} = 1, i = 1, ..., n_G, respectively, and denote the multipliers of these active constraints as µ_{a,t} and η_{a,t}. Then the equality constraints in (7a), (11) and (8a), (12) are equivalent to

$$R u_t = -B^T \lambda_t - F_t^T \mu_{a,t} \ \text{ and } \ F_t u_t = \mathbf{1} \quad (13)$$
$$\gamma^2 w_t = D^T \lambda_t - G_t^T \eta_{a,t} \ \text{ and } \ G_t w_t = \mathbf{1}. \quad (14)$$

Let Σ denote the set of all s such that (5), (6), (13), (14) are feasible for some x_0. Then, for given s ∈ Σ, these constraints and (9), (10) define a two-point boundary value problem. To solve this equality constrained problem using a Riccati recursion, we first assume that λ_t may be expressed

$$\lambda_t = P_t x_{t+1} + q_t. \quad (15)$$

Then, using (5), (14) gives

$$\begin{bmatrix} \gamma^2 I - D^T P_t D & G_t^T \\ G_t & 0 \end{bmatrix} \begin{bmatrix} w_t \\ \eta_{a,t} \end{bmatrix} = \begin{bmatrix} D^T P_t (A x_t + B u_t) + D^T q_t \\ \mathbf{1} \end{bmatrix}. \quad (16)$$

Therefore, if (16) has the unique solution

$$\begin{bmatrix} w_t \\ \eta_{a,t} \end{bmatrix} = \begin{bmatrix} M_t^w \\ M_t^\eta \end{bmatrix} (A x_t + B u_t) + \begin{bmatrix} m_t^w \\ m_t^\eta \end{bmatrix}, \quad (17)$$

then (15) gives λ_t = P̂_t(Ax_t + Bu_t) + q̂_t with P̂_t, q̂_t defined:

$$\hat{P}_t = P_t + P_t D M_t^w \quad (18a)$$
$$\hat{q}_t = q_t + P_t D m_t^w \quad (18b)$$

and hence (13) gives

$$\begin{bmatrix} R + B^T \hat{P}_t B & F_t^T \\ F_t & 0 \end{bmatrix} \begin{bmatrix} u_t \\ \mu_{a,t} \end{bmatrix} = \begin{bmatrix} -B^T \hat{P}_t A x_t - B^T \hat{q}_t \\ \mathbf{1} \end{bmatrix}. \quad (19)$$

Furthermore, assuming that (19) has a unique solution

$$\begin{bmatrix} u_t \\ \mu_{a,t} \end{bmatrix} = \begin{bmatrix} L_t^u \\ L_t^\mu \end{bmatrix} x_t + \begin{bmatrix} l_t^u \\ l_t^\mu \end{bmatrix}, \quad (20)$$

Eq. (6) yields λ_{t−1} = P_{t−1}x_t + q_{t−1}, where

$$P_{t-1} = Q + A^T \hat{P}_t (A + B L_t^u) \quad (21a)$$
$$q_{t-1} = \hat{q}_t + A^T \hat{P}_t B l_t^u. \quad (21b)$$

Finally, from (9) we have P_{N−1} = P and q_{N−1} = 0, and it follows by induction that (15) holds for t = N−1, ..., 0. The necessary and sufficient conditions for optimality of the Riccati recursion in (17), (18) and (20), (21) are as follows.

Proposition 3. The optimal solution of (2a)–(2d) is given by

$$w_{N-t}^*(x, u) = M_t^w (A x + B u) + m_t^w \quad (22a)$$
$$u_{N-t}^*(x) = L_t^u x + l_t^u \quad (22b)$$

if and only if:

$$G_{t,\perp}^T (\gamma^2 I - D^T P_t D) G_{t,\perp} \succ 0 \ \text{ or } \ \operatorname{rank}(G_t) = n_w \quad (23a)$$
$$F_{t,\perp}^T (R + B^T \hat{P}_t B) F_{t,\perp} \succ 0 \ \text{ or } \ \operatorname{rank}(F_t) = n_u \quad (23b)$$

(where the columns of F_{t,⊥} and G_{t,⊥} form bases for the kernels of F_t and G_t respectively), and

$$G\big(M_t^w (A x_t + B u_t) + m_t^w\big) \le \mathbf{1}, \qquad F(L_t^u x_t + l_t^u) \le \mathbf{1} \quad (24a)$$
$$M_t^\eta (A x_t + B u_t) + m_t^\eta \ge 0, \qquad L_t^\mu x_t + l_t^\mu \ge 0. \quad (24b)$$

Remark 4. Condition (23b) is necessarily satisfied since R ≻ 0 by assumption and Q ⪰ 0 implies P_t, P̂_t ≻ 0 for all t. However (23a) is very difficult to verify in practice, since this would require checking all active sets s ∈ Σ. In this paper we therefore assume that γ is sufficiently large to satisfy (23a) for all active sets likely to be encountered. We note that the terms P_t and q_t define the optimal DP cost at stage N−t in the solution to problem (2) (omitting the constant term for simplicity, since it has no effect on the solution).
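To make the recursion concrete, the sketch below (ours, in Python/numpy) performs a single backward step of (16)–(21) for a given stage active set, using dense linear solves in place of the null-space QR/Cholesky factorizations discussed in Section 3.4. Ft and Gt are the (possibly empty) arrays of active rows of F and G, and q_t is stored as an (nx, 1) column.

```python
import numpy as np

def riccati_step(A, B, D, Q, R, gamma2, Pt, qt, Ft, Gt):
    """One backward step of the active-set Riccati recursion (16)-(21)."""
    nx, nu = A.shape[0], B.shape[1]
    nw = D.shape[1]
    nFa, nGa = Ft.shape[0], Gt.shape[0]
    # (16): KKT system of the stage-wise maximization; its solution (17)
    # is affine in (A x_t + B u_t).
    Kw = np.block([[gamma2 * np.eye(nw) - D.T @ Pt @ D, Gt.T],
                   [Gt, np.zeros((nGa, nGa))]])
    Mwe = np.linalg.solve(Kw, np.vstack([D.T @ Pt, np.zeros((nGa, nx))]))
    mwe = np.linalg.solve(Kw, np.vstack([D.T @ qt, np.ones((nGa, 1))]))
    Mw, mw = Mwe[:nw, :], mwe[:nw, :]
    # (18a,b): costate lambda_t = Phat (A x_t + B u_t) + qhat.
    Phat = Pt + Pt @ D @ Mw
    qhat = qt + Pt @ D @ mw
    # (19): KKT system of the stage-wise minimization; its solution (20)
    # is affine in x_t.
    Ku = np.block([[R + B.T @ Phat @ B, Ft.T],
                   [Ft, np.zeros((nFa, nFa))]])
    Lue = np.linalg.solve(Ku, np.vstack([-B.T @ Phat @ A, np.zeros((nFa, nx))]))
    lue = np.linalg.solve(Ku, np.vstack([-B.T @ qhat, np.ones((nFa, 1))]))
    Lu, lu = Lue[:nu, :], lue[:nu, :]
    # (21a,b): propagate lambda_{t-1} = P_{t-1} x_t + q_{t-1}.
    Pprev = Q + A.T @ Phat @ (A + B @ Lu)
    qprev = qhat + A.T @ Phat @ B @ lu
    return (Mw, mw, Lu, lu), (Phat, qhat), (Pprev, qprev)
```

The multiplier feedback terms of (17) and (20), needed for the optimality test (24b), are the remaining rows of the solved systems (Mwe, mwe, Lue, lue).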


3.3. Active set method

Using the feedback law (22) in conjunction with (5) to simulate forward over the N-step horizon, we obtain

$$x_t = \Phi_t x_0 + \phi_t, \quad t = 1, \ldots, N \quad (25)$$

where Φ_t ∈ R^{n_x×n_x} and φ_t ∈ R^{n_x} are given by

$$\Phi_{t+1} = (I + D M_t^w)(A + B L_t^u)\,\Phi_t \quad (26a)$$
$$\phi_{t+1} = (I + D M_t^w)\big[(A + B L_t^u)\phi_t + B l_t^u + D m_t^w\big] \quad (26b)$$

with initial conditions Φ_0 = I and φ_0 = 0. Therefore the input, disturbance and costate sequences u(x_0), w(x_0) and λ(x_0) = {λ_0, ..., λ_{N−1}}, as well as the corresponding multiplier sequences µ(x_0) = {µ_0, ..., µ_{N−1}} and η(x_0) = {η_0, ..., η_{N−1}}, can be determined as affine functions of x_0 by substituting (25) into (17), (15) and (20). Hence, for a given active set s, we can define a region of state space X(s) ⊂ R^{n_x} in which the KKT conditions hold:

$$X(s) \triangleq \{x_0 : \mathbf{x}(x_0) \text{ satisfies (24a,b)}\}. \quad (27)$$

Lemma 5. The sets X(s) defined by (27) are convex polytopes, and the collection {X(s) : s ∈ Σ} is a complex with the properties (see e.g. Grunbaum, 2000):

$$\partial X(s) \subset X(s) \quad (28a)$$
$$X(s_1) \cap X(s_2) = \partial X(s_1) \cap \partial X(s_2) \quad (28b)$$

for any s_1, s_2 ∈ Σ (where ∂X(s) denotes the boundary of X(s)). Furthermore the union ∪_{s∈Σ} X(s) over all admissible active sets covers the set of feasible initial conditions for (2).

The algorithm we propose solves (2) using a line-search technique that exploits the homotopy of optimal solutions. The method is initialized by solving the equality constrained problem for an optimal active set at some initial state x_0^{(0)}, and then updates this active set at successive iterations to find the optimal active set at the current measured plant state x_p. At each iteration j the algorithm determines s^{(j+1)} from s^{(j)} by performing a line search over x_0 ∈ X(s^{(j)}) in the direction of the current plant state x_p. Constraints are added to the active set by setting s^u_{t,i} or s^w_{t,i} to 1 whenever the corresponding inequality in (24a) (and hence (11), (12)) becomes satisfied with equality. Constraints are dropped from the active set by setting s^u_{t,i} or s^w_{t,i} to 0 if the corresponding multiplier in (24b) (and (11), (12)) becomes equal to zero. This results in a sequence of dual-feasible iterates x_0^{(j)} that generate trajectories satisfying (24a,b) but not necessarily (10).
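Step (i) of Algorithm 1 below evaluates the recursion (26a,b); a minimal sketch, assuming the stage gains have already been computed by a backward pass (e.g. with the riccati_step sketch above):

```python
import numpy as np

def forward_simulate(A, B, D, gains, nx):
    """Recursion (26a,b): x_t = Phi[t] @ x0 + phi[t] under the feedback (22).

    `gains` is a length-N list of (Mw, mw, Lu, lu) tuples, indexed by
    stage t = 0, ..., N-1 as in (25)-(26)."""
    Phi = [np.eye(nx)]
    phi = [np.zeros((nx, 1))]
    for Mw, mw, Lu, lu in gains:
        Acl = A + B @ Lu                  # optimal control feedback (22b)
        Wcl = np.eye(nx) + D @ Mw         # worst-case disturbance map (22a)
        Phi.append(Wcl @ Acl @ Phi[-1])
        phi.append(Wcl @ (Acl @ phi[-1] + B @ lu + D @ mw))
    return Phi, phi
```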

Algorithm 1. Initialize with x_0^{(0)} and an active set s^{(0)} such that x_0^{(0)} ∈ X(s^{(0)}), and set j = 0. At iteration j = 0, 1, ...:

(i) Compute {P_t, q_t} for t = N−1, ..., 0, and {Φ_t, φ_t} for t = 0, ..., N−1, and hence X(s^{(j)}).
(ii) Perform the line search: α^{(j)} = max { α ∈ (0, 1] : x_0^{(j)} + α(x_p − x_0^{(j)}) ∈ X(s^{(j)}) }.
(iii) If α^{(j)} < 1, then set x_0^{(j+1)} := x_0^{(j)} + α^{(j)}(x_p − x_0^{(j)}), j := j + 1, and update s^{(j)} on the basis of the new set of active constraints. Return to step (i).
(iv) Otherwise set s* := s^{(j)}, compute u*_N(x_p) and stop.

Theorem 6. Algorithm 1 converges after a finite number of iterations to s* such that the trajectories for x, u and w generated by (1) and (22) with s = s* are optimal for (2).
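Steps (ii)–(iii) reduce to a ratio test over linear inequalities, since substituting (25) into (24a,b) makes X(s^{(j)}) a polytope {x : Cx ≤ c} in x_0-space. The sketch below is ours and assumes C and c have been assembled accordingly (1-D numpy arrays throughout); a warm start as in Remark 8 simply changes x_0^{(0)} and s^{(0)}.

```python
import numpy as np

def line_search(C, c, x0, xp, tol=1e-9):
    """Largest alpha in (0, 1] with x0 + alpha*(xp - x0) in {x : C x <= c},
    plus the indices of constraints blocking a full step; the latter are
    the candidates for the active-set update in step (iii)."""
    d = xp - x0
    Cd = C @ d
    slack = c - C @ x0                # nonnegative, since x0 lies in X(s)
    alpha = 1.0
    for i in range(len(c)):
        if Cd[i] > tol:               # constraint tightens along d
            alpha = min(alpha, slack[i] / Cd[i])
    blocking = [i for i in range(len(c))
                if Cd[i] > tol and slack[i] - alpha * Cd[i] < tol]
    return alpha, blocking
```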

Remark 7. If (2) is strictly convex–concave in the absence of constraints (i.e. if (23a,b) hold with F_{t,⊥} = I and G_{t,⊥} = I for t = 1, ..., N), then u*_t(0) = 0 and w*_t(0, 0) = 0 for t = 1, ..., N, and hence a possible initialization of Algorithm 1 is x_0^{(0)} = 0 and s^{(0)} = {0, ..., 0}.

Remark 8. In the context of MPC, the computation required by Algorithm 1 may be reduced through warm-starting. For example, Algorithm 1 can be initialized at time t+1 using the time-shifted optimal sequence computed at time t by setting x_0^{(0)} := Ax_t + Bu*_N(x_t) + Dw*_N(x_t, u*_N(x_t)) and s^{(0)} := {s*_1(t), ..., s*_N(t), 0} at t+1, where s*(t) is the optimal sequence of active sets at time t. The choice s*_{N+1}(t) = 0 corresponds to the assumption (discussed in Section 4) that the state of (5) enters a terminal set after N time-steps within which the unconstrained optimal control law and worst-case disturbances satisfy the constraints u ∈ U and w ∈ W.

3.4. Computation

In order to estimate how the computational complexity of Algorithm 1 depends on the problem size, we make the assumption that (16) and (19) are solved using the null space method commonly employed by QP active set solvers (see e.g. Fletcher, 2000). This approach, applied to (16), involves computing the QR decomposition of G_t, which requires O(n_w²) floating point operations (assuming that incremental rank-1 updates are employed), as well as calculating the inverse of the matrix on the LHS of (23a), which requires O((n_w − n_G)³) operations (assuming that Cholesky decomposition is used), where n_G ≤ n_w is the number of rows of G_t. Applying the same approach to the solution of (19) requires O(n_u²) operations for the QR decomposition of F_t plus O((n_u − n_F)³) operations for the Cholesky decomposition of the LHS of (23b), where n_F ≤ n_u is the number of rows of F_t. The other significant computation in (17)–(21) is from the matrix multiplications in (18) and (21), which require O(2n_x³ + (3n_u + 2n_w)n_x² + n_u²n_x) operations. Combining these estimates, and noting that the computation required for the forward simulation is O(n_x²N) (since only the projection, Φ_t(x_p − x_0^{(j)}), of Φ_t in (26a,b) is needed), and also that the computation involved in the line search in step (ii) is comparatively insignificant, we estimate the computation per iteration of Algorithm 1 to grow as

$$O\Big(\big[2n_x^3 + n_x^2(3n_u + 2n_w) + c_1(n_w^3 + n_u^3) + c_2(n_w^2 + n_u^2)\big]\,N\Big).$$

Here c_1, c_2 are constants that depend on the implementation of Cholesky and QR decompositions, and we have used the conservative approximations n_u − n_F ≈ n_u, n_w − n_G ≈ n_w. Thus the dependence of computation per iteration on the horizon length N is linear. The required number of iterations is problem-dependent and, in common with active set solvers for quadratic programming, its upper bound is in general exponential, but empirical evidence (see the examples of Section 5) suggests that this also grows approximately linearly with N. Furthermore the number of iterations can be minimized using warm-starting, as described in Remark 8. This is in stark contrast to existing schemes for min–max receding horizon control, which, for the case of optimal approaches that are based on dynamic programming, have computational loads that necessarily depend exponentially on N (see e.g. Kerrigan & Maciejowski, 2003; Scokaert & Mayne, 1998). Likewise, approaches such as Löfberg (2003), Goulart et al. (2006) and Goulart et al. (2009) based on suboptimal controller parameterizations require the solution of a convex optimization in a number of variables that grows quadratically with the horizon length, which (as demonstrated in Section 5) leads to much higher computational loads than Algorithm 1.
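For reference, this estimate can be transcribed directly; the helper below is ours, with placeholder constants c1 = c2 = 1:

```python
def flops_per_iteration(nx, nu, nw, N, c1=1.0, c2=1.0):
    """Order-of-magnitude operation count per iteration of Algorithm 1
    (Section 3.4); c1, c2 depend on the Cholesky/QR implementations."""
    per_stage = (2 * nx**3 + nx**2 * (3 * nu + 2 * nw)
                 + c1 * (nw**3 + nu**3) + c2 * (nw**2 + nu**2))
    return per_stage * N
```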


4. Closed loop stability and l2-gain bound

This section discusses the stability and disturbance attenuation properties of the receding horizon controller defined by the solution of problem (2). The problem description (2a)–(2d) does not include inequality state constraints, and hence it does not allow terminal state constraints to be included in the definition of the receding horizon policy. Nevertheless, the associated receding horizon control law u*_N(·) ensures robust stability and induces a specified disturbance l2-gain bound for a particular set of initial conditions. To show this we define a robust, controlled, positively invariant set X^f under the infinite horizon unconstrained optimal solutions u^f_∞(·), w^f_∞(·) of problem (2). The control law u*_N(·) ensures that the specified l2-gain bound holds for x_0 ∈ X^N, where X^N is a set of initial conditions from which X^f is reached in N steps under u = u*_N(x). The section concludes with a discussion of the connections between min–max, max–min and open-loop problem formulations.

Definition 9. A set X^f ⊆ R^{n_x} is robust positively invariant for (1) under u_t = u^f_∞(x_t) and feasible for w_t = w^f_∞(x_t, u^f_∞(x_t)) if: (i) u^f_∞(x) ∈ U for all x ∈ X^f; (ii) Ax + Bu^f_∞(x) + Dw ∈ X^f for all x ∈ X^f, w ∈ W; and (iii) w^f_∞(x, u^f_∞(x)) ∈ W for all x ∈ X^f.

Definition 10. For t = 1, ..., N, the set X^t is the preimage of X^{t−1} for (1) under u_t = u*_N(x_t) if X^0 = X^f and X^t = {x ∈ R^{n_x} : Ax + Bu*_N(x) + Dw ∈ X^{t−1}, ∀w ∈ W}.

Lemma 11. (i) X^1 ⊆ X^2 ⊆ ··· ⊆ X^N; (ii) X^N is invariant and a region of attraction of X^f under u = u*_N(x); (iii) for any k ≤ N, u*_N(x) = u*_k(x) ∀x ∈ X^k.

The main result of this section, concerning the stability of the dynamics x_{t+1} = Ax_t + Bu_t + Dw_t, u_t ∈ U, w_t ∈ W, under the receding horizon control law u = u*_N(x), is stated next.

Proposition 12. If x_0 ∈ X^N, then, for any admissible disturbance sequence {w_t ∈ W, t = 0, 1, ...}: (i) the state of (1) with u_t = u*_N(x_t) satisfies x_t ∈ X^f for all t ≥ N; (ii) the following l2-gain bound holds:

$$\sum_{t=0}^{\infty}\big(\|x_t\|_Q^2 + \|u_t\|_R^2\big) \le \gamma^2 \sum_{t=0}^{\infty}\|w_t\|^2 + 2 J_N^*(x_0). \quad (29)$$

4.1. Min–max, max–min and open loop problems

The formulation of problem (2) assumes that w_t is unknown when u_t is determined. If, on the other hand, u_t is allowed to depend explicitly on w_t, then it is straightforward to adapt Algorithm 1 to solve the max–min problem defined by

$$\big(\tilde{w}_m^*(x),\, \tilde{u}_m^*(x,w)\big) \triangleq \arg\max_{w \in W}\min_{u \in U} \tilde{J}_m(x,u,w) \quad (30)$$

where $\tilde{J}_m(x,u,w) \triangleq \tfrac{1}{2}(\|x\|_Q^2 + \|u\|_R^2 - \gamma^2\|w\|^2) + \tilde{J}_{m-1}^*(x^+)$ and $\tilde{J}_m^*(x) \triangleq \tilde{J}_m\big(x,\, \tilde{u}_m^*(x, \tilde{w}_m^*(x)),\, \tilde{w}_m^*(x)\big)$, with terminal cost $\tilde{J}_0^*(x) \triangleq \tfrac{1}{2}\|x\|_P^2$.

It is easy to show that the receding horizon max–min control law u_t = ũ*_N(x_t, w_t) ensures the performance bound (29) with J*_N(x_0) replaced by J̃*_N(x_0). The additional information (i.e. knowledge of w_t) needed to compute ũ*_N(x_t, w_t) is expected to allow a smaller l2-gain bound; indeed this is demonstrated by the following result.

Theorem 13. For each active set s ∈ Σ, there exist scalars γ(s) and γ̃(s) such that for all x ∈ X(s) problem (2) is strictly convex–concave for γ ≥ γ(s) and problem (30) is strictly concave–convex for γ ≥ γ̃(s), where γ̃(s) ≤ γ(s).

The max–min formulation (30) cannot however achieve better performance than the min–max formulation (2) whenever γ is large enough to ensure that the min–max problem is strictly convex–concave, as is shown by the following result.

Theorem 14. For all x ∈ X(s), J*_N(x) = J̃*_N(x) if γ ≥ γ(s).

Similarly, the open loop min–max problem defined by

$$\big(u_{ol}^*(x_0),\, w_{ol}^*(x_0, u_{ol}^*(x_0))\big) \triangleq \arg \min_{\substack{\{u_0,\ldots,u_{N-1}\} \\ u_t \in U}} \; \max_{\substack{\{w_0,\ldots,w_{N-1}\} \\ w_t \in W}} \; \sum_{t=0}^{N-1} \tfrac{1}{2}\big(\|x_t\|_Q^2 + \|u_t\|_R^2 - \gamma^2\|w_t\|^2\big) + \tfrac{1}{2}\|x_N\|_P^2$$

is convex–concave and its optimal objective is equal to J*_N(x_0) if γ ≥ γ_{ol}(s) ≥ γ(s) ≥ γ̃(s). Finally we note that if γ ≥ γ(s), then the KKT conditions defining the optimal solutions for problems (2) and (30) are identical. However a crucial difference between min–max and max–min formulations is the value of γ required to ensure convexity of each problem. Thus (5)–(10) do not define an optimal solution of problem (2) if γ(s) > γ ≥ γ̃(s), so in this case the l2-gain bound (29) does not necessarily hold and the max–min controller may have better disturbance rejection capabilities.

5. Numerical examples

[Fig. 1. Execution times of Alg. 1 and the DA policy of Goulart et al. (2009), averaged over 90 plant states with ‖x_0‖ = 2.5, for Example A.]



Example A: Consider the model (1) and cost (2b) with

$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad Q = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad R = 1, \quad \gamma^2 = 20,$$

and constraint sets U = {u ∈ R : −1 ≤ u ≤ 1}, W = {w ∈ R² : −[0.3 0.3]ᵀ ≤ w ≤ [0.3 0.3]ᵀ}.
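For reproducibility, the Example A data can be entered as follows (our transcription; F and G encode U and W as in Section 2, so that Fu ≤ 1 and Gw ≤ 1):

```python
import numpy as np

# Example A data, transcribed from the text.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
B = np.array([[0.0],
              [1.0]])
D = np.eye(2)
Q = np.array([[1.0, 0.0],
              [0.0, 0.0]])
R = np.array([[1.0]])
gamma2 = 20.0
F = np.array([[1.0], [-1.0]])                  # -1 <= u <= 1
G = np.vstack([np.eye(2), -np.eye(2)]) / 0.3   # |w_i| <= 0.3
```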

The unconstrained optimal feedback laws u^f_∞(x), w^f_∞(x) and terminal cost (2d) were computed using standard procedures (e.g. Buerger et al., 2011), and the maximal robust positively invariant set X^f under u = u^f_∞(x) was computed using the procedure of Kolmanovsky and Gilbert (1998). Fig. 1 compares the computational load of the exact DP active set algorithm of Alg. 1 with that of a Disturbance Affine (DA) feedback policy (Goulart et al., 2009) applied to problem (2) for 90 initial conditions equispaced around a circle of radius 2.5 (which is contained within X^9). The DA policy was formulated, as in Goulart et al. (2009), as a semidefinite program, and computation times are SeDuMi (Sturm, 1999) execution times. Alg. 1 was implemented in Matlab and initialized with x_0^{(0)} = 0. This comparison shows that Alg. 1 is several orders of magnitude faster than the DA approach. However the DA policy can be significantly suboptimal, as shown by the predicted costs in Table 1; the degree of suboptimality falls in closed loop operation, but the naive strategy of saturating u^f_∞(x) at the input bounds is highly suboptimal.

Table 1. Comparison of initial predicted cost, closed loop cost for worst-case disturbances, and CPU time (s), for Example A with N = 12, for 25 initial conditions on the boundary of X^12.

                                              Alg. 1      DA (Goulart et al., 2009)   Saturated u^f_∞
  Predicted cost suboptimality (mean/max)     0/0         6%/27%                      –
  Closed loop cost suboptimality (mean/max)   0/0         2%/5%                       186%/1027%
  Execution time (mean/max)                   0.24/0.33   43.5/54.0                   –

The execution times of Alg. 1 in Table 1 are for cold starts (x_0^{(0)} = 0), for which the average number of iterations was 37. If Alg. 1 is warm-started using Remark 8, a single iteration is required for all t ≥ 1 for the worst-case disturbances, while 4.8 iterations were required on average for t ≥ 1 for the case of random disturbances uniformly distributed within W.

Example B: A model of aircraft pitch dynamics, with

$$A = \begin{bmatrix} 0.938 & 0.134 & 0 & 0 \\ -0.536 & 0.402 & 0 & 0 \\ 0.119 & 0.007 & 1 & 0.2 \\ 1.174 & 0.092 & 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 0.046 \\ 0.402 \\ -0.019 \\ -0.180 \end{bmatrix}, \quad D = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix},$$

has constraints U = {u ∈ R : −25 ≤ u ≤ 25}, W = {w ∈ R² : −0.25 ≤ w_i ≤ 0.25, i = 1, 2}, and cost weights Q = diag(0, 0, 0.04, 0), R = 2.5 × 10⁻⁴ and γ² = 5. Fig. 2 shows the average execution time and number of iterations of Alg. 1 (initialized with x_0^{(0)} = 0) at t = 0 for 100 initial states equispaced on the circle Ω = {x ∈ R⁴ : xᵀ = [0 0 x₃ x₄], ‖(x₃, x₄)‖ = 50}. The required number of iterations is independent of N for N ≥ 18 (since Ω ⊂ X^18, so increasing N causes no more constraints to become active); hence the linear increase in execution time for N ≥ 18 confirms that computation per iteration depends linearly on N. For N ≤ 10, X^N ⊂ Ω, and the required number of iterations grows approximately linearly with N, causing the execution time to rise approximately quadratically with N. When Alg. 1 is warm-started as in Remark 8, the average number of iterations and average execution time required at t = 1 with random (uniformly distributed) disturbances fall to 1.54 and 0.017 s respectively for N ≥ 18.

[Fig. 2. Execution time and number of iterations of Algorithm 1 with x_0^{(0)} = 0, averaged over 100 plant states for Example B.]

6. Conclusions

For a robust min–max optimal control problem for input-constrained linear systems with bounded disturbances, an active set solver is proposed with complexity per iteration that scales linearly with horizon length. Numerical examples demonstrate that the execution time is similar to the time typically required to solve the corresponding nominal MPC problem. Though not considered here, extensions to systems with state constraints are currently under development.

Appendix. Proofs

Proof of Lemma 1. The proof is by induction and uses Wolfe duality (Fletcher, 2000). By the dynamic programming principle, at time t we can consider the subproblems at t+1, ..., N−1 to involve only equality constraints (since an optimal active set exists for these subproblems) and append the conditions corresponding to the inequality-constrained tth stage problem using the Wolfe dual. Continuity of the objective function and constraints follows from the linear-quadratic nature of (2); hence the Wolfe duals of (3) and (4) exist given the assumption that (3) and (4) are convex problems. □

Proof of Theorem 2. Lemma 1 allows the first-order necessary conditions to be found using conditions (3) and (4), i.e. for active constraints: ∇_{u_j}H_j = 0, ∇_{x̂_{j+1}}H_j = 0, ∇_{λ̂_j}H_j = 0, ∇_{µ_j}H_j = 0, ∇_{λ_j}Ĥ_j = 0, ∇_{η_j}Ĥ_j = 0, and for inactive constraints: ∇_{w_j}Ĥ_j = 0, ∇_{x_{j+1}}Ĥ_j = 0. These conditions imply λ_t = λ̂_t and, with (2d) and the initial condition x_0 = x_p, they prove the theorem. □

Proof of Proposition 3. Conditions (24a,b) ensure that a solution of the equality constrained problem (5), (6), (13), (14) coincides with that of the KKT conditions (5)–(10) for the given active set s. Furthermore problem (2) is strictly concave in w_t and strictly convex in u_t if and only if conditions (23a) and (23b) hold. Therefore, under (23a,b), the KKT conditions (5)–(10) admit a unique solution and are sufficient as well as necessary for optimality (Nocedal & Wright, 2006). □

Proof of Lemma 5. The linear inequality constraints (24a,b) defining the boundary of X(s) imply convexity and (28a); (28b) results from the piecewise continuity of the trajectories x(x_0), λ(x_0), u(x_0), w(x_0), µ(x_0) and η(x_0), which follows from (16), (18), (19), (21). Finally, a solution of (2) exists for all feasible x_p, so ∪_{s∈Σ}X(s) necessarily covers the set of feasible initial conditions x_p. □

Proof of Theorem 6. The line search in step (ii) of Algorithm 1 implies that each iterate x_0^{(j)} lies on the line segment defined by x_0^{(0)} + β^{(j)}(x_p − x_0^{(0)}) with β^{(j)} ∈ [0, 1]. Also {β^{(j)}, j = 0, 1, ...} is a non-decreasing sequence and each iterate x_0^{(j)} lies at an intersection of this line with the boundary ∂X(s^{(j)}) or at x_0^{(j)} = x_p. Thus β^{(j)} = 1 for finite j because the number of admissible active sets s ∈ Σ is finite. It follows that Algorithm 1 terminates with x_0^{(j)} = x_p. □

Proof of Lemma 11. Definition 9 and the terminal cost (2d) imply u*_N(x) = u^f_∞(x) ∀x ∈ X^f. Properties (i) and (ii) therefore follow from Definition 10. We next demonstrate property (iii) by induction. Thus, if, for some k ≤ N, u*_i(x) = u*_N(x) ∀x ∈ X^i for all i < k, then x ∈ X^k implies u*_k(x) = u*_N(x), and since u*_N(x) = u^f_∞(x) ∀x ∈ X^f, we therefore obtain u*_k(x) = u*_N(x) ∀x ∈ X^k for all k ≤ N. □

Proof of Proposition 12. For x_0 ∈ X^N, Definitions 9 and 10 and Lemma 11 imply x_t ∈ X^f for all t ≥ N and J*_N(x_t) ≥ J*_N(x_{t+1}) + ½(‖x_t‖²_Q + ‖u_t‖²_R − γ²‖w_t‖²) for all t ≥ 0; the bound (29) follows from this inequality (see e.g. Mayne et al., 2006). □

Proof of Theorem 13. Problem (30) is strictly concave–convex for a given active set s if and only if:

$$F_{t,\perp}^T (R + B^T \tilde{P}_t B) F_{t,\perp} \succ 0 \quad (31a)$$
$$G_{t,\perp}^T (\gamma^2 I - D^T \hat{\tilde{P}}_t D) G_{t,\perp} \succ 0 \quad (31b)$$

where $\hat{\tilde{P}}_t = \tilde{P}_t - \tilde{P}_t B \tilde{L}_t^u$, $\tilde{P}_{t-1} = Q + A^T \hat{\tilde{P}}_t (A + D \tilde{M}_t^w)$, and $\tilde{u}_{N-t}^*(x, w) = \tilde{L}_t^u (A x + D w) + \tilde{l}_t^u$, $\tilde{w}_{N-t}^*(x) = \tilde{M}_t^w x + \tilde{m}_t^w$ for t = 0, ..., N−1, with $\tilde{P}_{N-1} = P$. Since P_{N−1} = P and since u*_m(x) and w̃*_m(x) are suboptimal solutions for ũ*_m(x, w) and w*_m(x, u) respectively, it follows by induction that P̃_t ⪯ P_t for t = 0, ..., N−1. Therefore λ(DᵀP̃_tD) ≤ λ(DᵀP_tD) (λ(·) denotes minimum eigenvalue) and hence, for all t = 0, ..., N−1, the lower bound on γ imposed by (23a) exceeds that imposed by (31b). □

Proof of Theorem 14. If γ ≥ γ(s), then problem (2) is strictly convex–concave and the Wolfe dual permits reformulation as a single minimization. The order in which u*_m and w*_m are determined can then be interchanged, and subsequent re-application of the Wolfe dual, which is applicable since γ ≥ γ̃(s) by Theorem 13, yields problem (30). □

References

Allwright, J. C., & Papavasiliou, G. C. (1992). On linear programming and robust model-predictive control using impulse-responses. Systems & Control Letters, 18(2), 159–163.
Bemporad, A., Borrelli, F., & Morari, M. (2003). Min–max control of constrained uncertain discrete-time linear systems. IEEE Transactions on Automatic Control, 48(9), 1600–1606.
Best, M. J. (1996). An algorithm for the solution of the parametric quadratic programming problem. In Applied mathematics and parallel computing (pp. 57–76). Heidelberg: Physica-Verlag.
Boyd, S., El Ghaoui, L., Feron, E., & Balakrishnan, V. (1994). Linear matrix inequalities in system and control theory. Studies in Applied Mathematics, Vol. 15. Philadelphia, PA: SIAM.
Buerger, J., Cannon, M., & Kouvaritakis, B. (2011). An active set solver for input-constrained robust receding horizon control. In 50th IEEE Conference on Decision and Control (pp. 7931–7936).
Campo, P. J., & Morari, M. (1987). Robust model predictive control. In Proc. American Control Conference (pp. 1021–1026).
Cannon, M., Liao, W., & Kouvaritakis, B. (2008). Efficient MPC optimization using Pontryagin's minimum principle. International Journal of Robust and Nonlinear Control, 18(8), 831–844.
Diehl, M., & Björnberg, J. (2004). Robust dynamic programming for min–max model predictive control of constrained uncertain systems. IEEE Transactions on Automatic Control, 49(12), 2253–2257.
Ferreau, H. J., Bock, H. G., & Diehl, M. (2008). An online active set strategy to overcome the limitations of explicit MPC. International Journal of Robust and Nonlinear Control, 18, 816–830.
Fletcher, R. (2000). Practical methods of optimization. Wiley.
Goulart, P. J., Kerrigan, E. C., & Alamo, T. (2009). Control of constrained discrete-time systems with bounded l2 gain. IEEE Transactions on Automatic Control, 54(5), 1105–1111.
Goulart, P. J., Kerrigan, E. C., & Maciejowski, J. M. (2006). Optimization over state feedback policies for robust control with constraints. Automatica, 42(4), 523–533.
Grunbaum, B. (2000). Convex polytopes (2nd ed.). Springer.
Kerrigan, E. C., & Maciejowski, J. M. (2003). Robustly stable feedback min–max model predictive control. In Proc. American Control Conference, Denver, CO.
Kolmanovsky, I., & Gilbert, E. (1998). Theory and computation of disturbance invariant sets for discrete-time linear systems. Mathematical Problems in Engineering, 4, 317–367.
Kothare, M. V., Balakrishnan, V., & Morari, M. (1996). Robust constrained model predictive control using linear matrix inequalities. Automatica, 32(10), 1361–1379.
Lee, J. H., & Yu, Z. (1997). Worst-case formulations of model predictive control for systems with bounded parameters. Automatica, 33.
Löfberg, J. (2003). Minimax approaches to robust model predictive control. Ph.D. thesis, Linköping University, Sweden.
Mayne, D. Q., Raković, S. V., Vinter, R. B., & Kerrigan, E. C. (2006). Characterization of the solution to a constrained H∞ optimal control problem. Automatica, 42, 371–382.
Nocedal, J., & Wright, S. (2006). Numerical optimization. Springer.
Scokaert, P. O. M., & Mayne, D. Q. (1998). Min–max feedback model predictive control for constrained linear systems. IEEE Transactions on Automatic Control, 43(8), 1136–1142.
Sturm, J. F. (1999). Using SeDuMi 1.02, a Matlab toolbox for optimization over symmetric cones. Optimization Methods and Software, 11–12, 625–653.
Witsenhausen, H. (1968). A minimax control problem for sampled linear systems. IEEE Transactions on Automatic Control, 13, 5–21.

Johannes Buerger was awarded an M.Eng. by the University of Oxford in Engineering, Economics and Management in 2008. In 2013 he obtained a D.Phil. in Engineering Science, also from the University of Oxford, for his work on efficient optimization techniques for Model Predictive Control. He currently works as a control engineer for hybrid powertrains at BMW, Munich.

Mark Cannon received the degrees of M.Eng. in Engineering Science in 1993 and D.Phil. in Control Engineering in 1998, both from the University of Oxford, and received the MS in Mechanical Engineering in 1995 from Massachusetts Institute of Technology. He is an Official Fellow of St. John’s College and a University Lecturer in Engineering Science, Oxford University.

Basil Kouvaritakis was awarded a B.Sc. in Electrical Engineering from the University of Manchester Institute of Science and Technology, where he also received his Masters and Doctorate. He is currently a Professor of Engineering at the Department of Engineering Science and a Tutorial Fellow at St. Edmund Hall, Oxford University.