Optimistic optimization for model predictive control of max-plus linear systems


Automatica 74 (2016) 16–22


Brief paper

Jia Xu, Ton van den Boom, Bart De Schutter
Delft Center for Systems and Control, Delft University of Technology, Delft, The Netherlands

Article history: Received 29 August 2014; received in revised form 24 February 2016; accepted 28 June 2016.

Keywords: Max-plus linear systems; Model predictive control; Optimistic optimization

Abstract

Model predictive control for max-plus linear discrete-event systems usually leads to a nonsmooth nonconvex optimization problem with real-valued variables, which may be hard to solve efficiently. An alternative approach is to transform the given problem into a mixed integer linear programming problem. However, the computational complexity of current mixed integer linear programming algorithms increases in the worst case exponentially as a function of the prediction horizon. The focus of this paper is on making optimistic optimization suited to solve the given problem. Optimistic optimization is a class of algorithms that can find an approximation of the global optimum for general nonlinear optimization problems. A key advantage of optimistic optimization is that one can specify the computational budget in advance and guarantee bounds on the suboptimality with respect to the global optimum. We prove that optimistic optimization can be applied to the given problem by developing a dedicated semi-metric and by proving that it satisfies the necessary requirements for optimistic optimization. Moreover, we show that the complexity of optimistic optimization is exponential in the control horizon instead of the prediction horizon. Hence, using optimistic optimization is more efficient when the control horizon is small and the prediction horizon is large.

✩ Research supported by the Chinese Scholarship Council. The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Jan Komenda under the direction of Editor Ian R. Petersen. E-mail addresses: [email protected] (J. Xu), [email protected] (T. van den Boom), [email protected] (B. De Schutter).

http://dx.doi.org/10.1016/j.automatica.2016.07.002 0005-1098/© 2016 Elsevier Ltd. All rights reserved.

1. Introduction

Max-plus algebra is a useful tool to model and analyze discrete-event systems (DES). Maximization and addition are the two basic operations of the max-plus algebra. In conventional algebra, DES usually result in nonlinear systems, but there is a class of DES that leads to linear systems in the max-plus algebra, called max-plus linear (MPL) systems (Baccelli, Cohen, Olsder, & Quadrat, 1992; Cohen, Dubois, Quadrat, & Viot, 1985). Many results on the control of MPL systems have been achieved, e.g. Amari, Demongodin, Loiseau, and Martinez (2012), Boimond and Ferrier (1996), Cottenceau, Hardouin, Boimond, and Ferrier (2001), Gazarik and Kamen (1999), Hardouin, Maia, Cottenceau, and Lhommeau (2010), Houssin, Lahaye, and Boimond (2013), Katz (2007), and Maia, Andrade, and Hardouin (2011).

The model predictive control (MPC) framework has been extended to MPL systems (De Schutter & van den Boom, 2001). For some special cases, the MPL-MPC problem can be formulated as a linear programming problem; however, in general, it results in a nonsmooth nonconvex optimization problem. To solve this problem, one approach is to recast it as a mixed integer linear programming (MILP) problem. Nonetheless, the computational complexity of most MILP algorithms grows in the worst case exponentially as the number of variables increases (Schrijver, 1986). For the MILP problem resulting from the MPL-MPC problem, the number of auxiliary binary variables is proportional to the number of max operators (i.e. it grows with the prediction horizon and the numbers of inputs and outputs). Thus, the computation time of the corresponding MILP problem becomes unacceptable if the prediction horizon is large. As the period corresponding to the prediction horizon should contain the crucial dynamics of the process, the prediction horizon can be very large for some MPL-MPC problems. In this paper, we show that the MPL-MPC problem can be solved efficiently by using optimistic optimization. Optimistic optimization (Munos, 2014) is a class of algorithms that can find an approximation of the globally optimal solution for nonlinear optimization problems. This method is called optimistic because the most promising solutions are examined first at each iteration.


The main advantage of optimistic optimization is that one can specify the computational budget (e.g. the number of node expansions) in advance and guarantee bounds on the suboptimality with respect to the global optimum. Note that the gap between the result returned by optimistic optimization and the optimal value can be made arbitrarily small as the computational budget increases. In our previous conference paper (Xu, De Schutter, & van den Boom, 2014), we applied optimistic optimization to the MPC problem for single-input single-output MPL systems. In the current paper, we study multi-input multi-output MPL systems; we also consider arbitrary piecewise linear output cost functions and two kinds of input cost functions. In addition, we develop an expression for the dedicated semi-metric required by optimistic optimization for each type of objective function. The proposed approach is evaluated on a group of random instances with different parameter settings. Moreover, we show that the computational complexity of optimistic optimization for the MPL-MPC problem depends on the control horizon instead of the prediction horizon. Situations with a short control horizon and a long prediction horizon are common in DES control, so it is useful to have a method that solves the corresponding MPC optimization problem without a significant influence of the prediction horizon. For a given MPL-MPC problem, the optimistic optimization method of this paper will be more efficient than the MILP method in case of small control horizons and large prediction horizons.

2. Preliminaries

Define ε = −∞ and R_ε = R ∪ {ε}. The max-plus-algebraic addition (⊕) and multiplication (⊗) are defined as follows: x ⊕ y = max(x, y) and x ⊗ y = x + y for numbers x, y ∈ R_ε, and

[A ⊕ B]_{ij} = a_{ij} ⊕ b_{ij} = max(a_{ij}, b_{ij}),
[A ⊗ C]_{ij} = ⊕_{k=1}^{m} a_{ik} ⊗ c_{kj} = max_{k=1,...,m}(a_{ik} + c_{kj}),

for matrices A, B ∈ R_ε^{n×m} and C ∈ R_ε^{m×p}.

Consider a multi-input multi-output MPL system

x(k) = A ⊗ x(k − 1) ⊕ B ⊗ u(k),  (1)
y(k) = C ⊗ x(k),  (2)

where k is the event counter, x(k) ∈ R_ε^{n_x} is the state, u(k) ∈ R_ε^{n_u} is the input, y(k) ∈ R_ε^{n_y} is the output, and where A ∈ R_ε^{n_x×n_x}, B ∈ R_ε^{n_x×n_u}, and C ∈ R_ε^{n_y×n_x} are the system matrices. Let yˆ(k + j|k), j = 0, 1, ..., be the estimate of the output at event step k + j based on the information available at event step k. Given a prediction horizon N_p, the estimated evolution of the MPL system from event step k up to k + N_p − 1 can be written as

y˜(k) = H ⊗ u˜(k) ⊕ g(k)  (3)

with y˜(k) = [yˆ^T(k|k) ... yˆ^T(k + N_p − 1|k)]^T and u˜(k) = [u^T(k) ... u^T(k + N_p − 1)]^T, for appropriate H and g(k) (see De Schutter and van den Boom (2001) for the details of H and g(k)).
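To make the max-plus notation concrete, here is a minimal Python sketch (not part of the original paper) that implements ⊕ and ⊗ for matrices over R ∪ {ε} and simulates one step of an MPL system of the form (1)–(2); the matrices A, B, C and the initial state are small illustrative placeholders.

```python
import numpy as np

EPS = -np.inf  # the max-plus zero element, epsilon = -infinity

def mp_add(X, Y):
    """Max-plus addition: [X ⊕ Y]_ij = max(x_ij, y_ij)."""
    return np.maximum(X, Y)

def mp_mul(X, Y):
    """Max-plus multiplication: [X ⊗ Y]_ij = max_k (x_ik + y_kj)."""
    X, Y = np.atleast_2d(X), np.atleast_2d(Y)
    Z = np.full((X.shape[0], Y.shape[1]), EPS)
    for i in range(X.shape[0]):
        for j in range(Y.shape[1]):
            Z[i, j] = np.max(X[i, :] + Y[:, j])
    return Z

# Illustrative 2-state, 1-input, 1-output MPL system (1)-(2)
A = np.array([[2.0, EPS], [3.0, 1.0]])
B = np.array([[0.0], [EPS]])
C = np.array([[EPS, 4.0]])

def mpl_step(x_prev, u):
    """One event step: x(k) = A ⊗ x(k-1) ⊕ B ⊗ u(k), y(k) = C ⊗ x(k)."""
    x = mp_add(mp_mul(A, x_prev), mp_mul(B, u))
    y = mp_mul(C, x)
    return x, y

x0 = np.array([[0.0], [0.0]])
x1, y1 = mpl_step(x0, np.array([[1.0]]))
print(x1.ravel(), y1.ravel())   # -> [2. 3.] [7.]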


3. The MPC problem for MPL systems

The MPC framework has been extended to MPL systems (De Schutter & van den Boom, 2001). The considered objective function J consists of the weighted sum of an output cost and an input cost: J = J_out + λJ_in with λ > 0. Different objective functions can be designed for different goals. In this paper, we consider four output cost functions and two input cost functions. We include both a tardiness and an earliness penalty in the output cost functions, with parameters expressing the trade-off between the two kinds of penalty. The parameters α_p, β_p, p = 1, ..., n_y, with α_p, β_p ≥ 0 and α_p + β_p > 0, are introduced as weighting coefficients for the tardiness and earliness penalties with respect to a reference signal r. Denote

Φ_{j,p,k} = max(α_p(yˆ_p(k + j|k) − r_p(k + j)), β_p(r_p(k + j) − yˆ_p(k + j|k))).

The four output cost functions are:

J_out^{1,1}(k) = Σ_{j=0}^{N_p−1} Σ_{p=1}^{n_y} Φ_{j,p,k},
J_out^{1,∞}(k) = Σ_{j=0}^{N_p−1} max_{p=1,...,n_y} Φ_{j,p,k},
J_out^{∞,1}(k) = max_{j=0,...,N_p−1} Σ_{p=1}^{n_y} Φ_{j,p,k},
J_out^{∞,∞}(k) = max_{j=0,...,N_p−1} max_{p=1,...,n_y} Φ_{j,p,k}.

It is easy to show that J_out,1 and J_out,2 in De Schutter and van den Boom (2001) are special cases of J_out^{1,1}. In De Schutter and van den Boom (2001) the following input cost functions are introduced:

J_in^1(k) = − Σ_{j=0}^{N_p−1} Σ_{q=1}^{n_u} u_q(k + j),  (4)
J_in^2(k) = − ∥u˜(k)∥_2^2.  (5)

For an explanation and discussion of the input cost functions, we refer the reader to De Schutter and van den Boom (2001).

Let ∆u be the input rate: ∆u(k) = u(k) − u(k − 1). In MPL-MPC, a control horizon N_c with N_c < N_p is often introduced and the control input rate is taken to be constant from event step k + N_c on; the use of N_c thus reduces the computational burden. For an in-depth discussion of the tuning of N_c, we refer the reader to van den Boom and De Schutter (2002). Consequently, we assume ∆u(k + j) = ∆u(k + N_c − 1) for j = N_c, ..., N_p − 1. Denote u¯(k − 1) = [u^T(k − 1) ··· u^T(k − 1)]^T (N_p copies) and define L ∈ R^{N_p n_u × N_c n_u} as

L = [ I_{n_u}   0         ···   0
      I_{n_u}   I_{n_u}   ···   0
        ⋮         ⋮        ⋱     ⋮
      I_{n_u}   I_{n_u}   ···   I_{n_u}
      I_{n_u}   I_{n_u}   ···   2I_{n_u}
        ⋮         ⋮              ⋮
      I_{n_u}   I_{n_u}   ···   (N_p − N_c + 1)I_{n_u} ],  (6)

where the first N_c block rows (N_c n_u rows in total) form a lower-triangular pattern of identity blocks and, in the remaining (N_p − N_c) n_u rows, the last block column carries the multipliers 2, 3, ..., N_p − N_c + 1, with I_{n_u} the n_u × n_u identity matrix. Then

u˜(k) = L∆u˜(k) + u¯(k − 1),  (7)

where ∆u˜(k) = [∆u^T(k) ... ∆u^T(k + N_c − 1)]^T.
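As an illustration of (6)–(7), the following sketch builds L for given N_p, N_c, n_u and maps an input-rate sequence ∆u˜(k) to the input sequence u˜(k). It assumes the block structure described above (identity blocks up to block row N_c, and last-column multipliers 2, 3, ..., N_p − N_c + 1 afterwards); the function names are illustrative and not from the paper.

```python
import numpy as np

def build_L(Np, Nc, nu):
    """Construct L in (6), so that u~(k) = L @ Δu~(k) + u_bar(k-1) as in (7)."""
    L = np.zeros((Np * nu, Nc * nu))
    I = np.eye(nu)
    for j in range(Np):                # block row j corresponds to u(k+j)
        for c in range(Nc):            # block column c corresponds to Δu(k+c)
            if c < min(j + 1, Nc - 1):
                L[j*nu:(j+1)*nu, c*nu:(c+1)*nu] = I
            elif c == Nc - 1 and j >= Nc - 1:
                # Δu is held constant from step k+Nc-1 on, so it appears j-Nc+2 times
                L[j*nu:(j+1)*nu, c*nu:(c+1)*nu] = (j - Nc + 2) * I
    return L

def input_sequence(du_tilde, u_prev, Np, Nc, nu):
    """Recover u~(k) from Δu~(k) and u(k-1) via (7)."""
    u_bar = np.tile(u_prev, Np)        # stacked copies of u(k-1)
    return build_L(Np, Nc, nu) @ du_tilde + u_bar

# Example: Np = 4, Nc = 2, nu = 1 gives the 4 x 2 matrix [[1,0],[1,1],[1,2],[1,3]]
print(build_L(4, 2, 1))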

Using (3) to eliminate y˜ from J_out, the resulting objective function depends only on u˜; using (7) to replace u˜ by ∆u˜, it then depends only on ∆u˜ and is denoted J_∆. Simple bound constraints on the input rate are common in practice, meaning that there is a minimum and a maximum separation between input events:

a ≤ ∆u˜(k) ≤ b,  (8)

with a, b real vectors of size N_c n_u × 1. Therefore, we finally obtain the following MPL-MPC problem, for given σ, τ ∈ {1, ∞} and ω ∈ {1, 2}:

min_{∆u˜(k)} J_∆^{σ,τ,ω}(k) = J_out^{σ,τ}(k) + λJ_in^ω(k)  (9)

subject to (8). A finite optimal solution of the MPL-MPC problem exists if the feasible set is bounded and closed and the objective function is finite for finite arguments. These conditions hold in general.
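To fix ideas, the sketch below evaluates J_∆^{1,1,1}(∆u˜(k)) for the case σ = τ = ω = 1 by combining (3), (4), and (7). The way H, g, the stacked reference r, and the stacked weight vectors alpha, beta (each of length N_p n_y, i.e. ordered like y˜(k)) are passed in is an assumption made for illustration; the paper refers to De Schutter and van den Boom (2001) for the construction of H and g(k).

```python
import numpy as np

def J_delta_111(du_tilde, H, g, r, u_prev, L, alpha, beta, lam):
    """Evaluate J_Δ^{1,1,1} = J_out^{1,1} + λ J_in^1 for an input-rate sequence Δu~(k).

    H, g encode the prediction model (3); r, alpha, beta are stacked like y~(k);
    L and u_prev implement (7).  Finite predicted outputs are assumed.
    """
    u_tilde = L @ du_tilde + np.tile(u_prev, L.shape[0] // len(u_prev))   # (7)
    # max-plus product H ⊗ u~, then ⊕ g(k): predicted outputs y~(k) of (3)
    y_tilde = np.maximum(np.max(H + u_tilde[None, :], axis=1), g)
    # output cost J_out^{1,1}: sum of the tardiness/earliness penalties Φ_{j,p,k}
    phi = np.maximum(alpha * (y_tilde - r), beta * (r - y_tilde))
    J_out = np.sum(phi)
    # input cost J_in^1 of (4): minus the sum of all input times
    J_in = -np.sum(u_tilde)
    return J_out + lam * J_in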

4. Optimistic optimization

Optimistic optimization can be used for nonlinear optimization problems with a deterministic (Munos, 2011) or stochastic (Valko, Carpentier, & Munos, 2013) objective function. It has been used for the consensus problem of agents with nonlinear dynamics (Busoniu & Morarescu, 2014). Optimistic planning (Busoniu, Postoyan, & Daafouz, 2013; Hren & Munos, 2008; Máthé, Buşoniu, Munos, & De Schutter, 2014) is a class of algorithms related to optimistic optimization.

To introduce optimistic optimization generically, we consider the minimization of a general deterministic function f over a feasible set X. The implementation of optimistic optimization is based on a hierarchical partitioning of X. For any integer h ∈ {0, 1, ...}, X is partitioned into K^h sets, called cells X_{h,d} with d = 0, ..., K^h − 1. This partitioning may be represented by a tree where each cell X_{h,d} corresponds to a node (h, d) such that each node (h, d) possesses K child nodes {(h + 1, d_k)}_{k=1}^{K}. In addition, the set {X_{h+1,d_k}}_{k=1}^{K} forms a partition of the parent cell X_{h,d}. The root node of the tree (i.e. the cell X_{0,0}) corresponds to the whole domain X. To each cell X_{h,d}, we assign a representative point x_{h,d} ∈ X_{h,d}, where f may be evaluated. To use optimistic optimization, some requirements should be satisfied (Munos, 2011).

Definition 1 (Semi-Metric). A semi-metric on a set X is a function ℓ : X × X → R_0^+ satisfying: (1) ∀x, y ∈ X, ℓ(x, y) = ℓ(y, x) ≥ 0; and (2) ∀x, y ∈ X, ℓ(x, y) = 0 if and only if x = y.

Requirement 1. There exists a semi-metric ℓ on X.

Requirement 2. There exists at least one global optimizer x* ∈ X of f (i.e. f(x*) = min_{x∈X} f(x)) and for all x ∈ X, f(x) − f(x*) ≤ ℓ(x, x*).

Requirement 3. There exists a decreasing sequence {δ(h)}_{h=0}^{∞} with δ(h) > 0 such that for any depth h ∈ {0, 1, ...} and any cell X_{h,d} at depth h, we have sup_{x∈X_{h,d}} ℓ(x, x_{h,d}) ≤ δ(h), where δ(h) is called the maximum diameter of the cells at depth h.

Requirement 4. There exists a scalar ν > 0 such that any cell X_{h,d} at any depth h contains an ℓ-ball of radius νδ(h) centered in x_{h,d}.

These requirements guarantee bounds on the suboptimality with respect to the global optimum and on the computational budget. In particular, Requirements 1 and 2 concern the semi-metric ℓ and the local behavior of the objective function near the optimum with respect to ℓ. Requirements 3 and 4 guarantee that the partitioning of the feasible set generates well-shaped cells that shrink with further partitioning.
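As a rough illustration of the scheme just described (deterministic optimistic optimization in the sense of Munos, 2011), the following sketch maintains the partition tree over a box, evaluates f at cell centers, and repeatedly expands the leaf with the most optimistic value f(x_{h,d}) − δ(h) (the b-value used in Section 5). The bisection of every dimension, giving K = 2^n children, anticipates the partitioning used for the MPL-MPC problem in Section 5; delta is a user-supplied function satisfying Requirement 3. This is a sketch under those assumptions, not the authors' implementation.

```python
import itertools
import numpy as np

def optimistic_minimize(f, a, b, delta, budget):
    """Deterministic optimistic optimization over the box X = [a, b].

    f      : objective to minimize (evaluated at cell centers)
    a, b   : box bounds, arrays of equal length n
    delta  : function h -> delta(h), decreasing in h (Requirement 3)
    budget : number of node expansions
    Returns the best evaluated point and its value.
    """
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n = len(a)

    def make_leaf(h, lo, hi):
        center = (lo + hi) / 2.0
        return (h, lo, hi, center, f(center))

    leaves = [make_leaf(0, a, b)]
    best_x, best_f = leaves[0][3], leaves[0][4]
    for _ in range(budget):
        # expand the leaf with the smallest b-value  b_{h,d} = f(x_{h,d}) - delta(h)
        idx = int(np.argmin([fval - delta(h) for (h, _, _, _, fval) in leaves]))
        h, lo, hi, _, _ = leaves.pop(idx)
        mid = (lo + hi) / 2.0
        # bisect every dimension of the cell: K = 2^n children
        for corner in itertools.product((False, True), repeat=n):
            mask = np.array(corner)
            child = make_leaf(h + 1, np.where(mask, mid, lo), np.where(mask, hi, mid))
            if child[4] < best_f:
                best_x, best_f = child[3], child[4]
            leaves.append(child)
    return best_x, best_f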

5. Optimistic optimization for the MPL-MPC problem

In this section, we show that Requirements 1–4 hold for the MPL-MPC problem (9) with σ = τ = ω = 1, i.e. J_∆^{1,1,1} = J_out^{1,1} + λJ_in^1. The other cases, which can be proved similarly, are treated in the appendix. Let X = {∆u˜(k) | a ≤ ∆u˜(k) ≤ b}. Before proceeding further, we first present the following result.

Theorem 1. Let ∆u˜(k) be an arbitrary input rate sequence and ∆u˜*(k) be the optimal input rate sequence for problem (9). Then it holds that

|J_∆^{1,1,1}(∆u˜) − J_∆^{1,1,1}(∆u˜*)| ≤ Σ_{p=1}^{n_y} max(α_p, β_p) Σ_{j=0}^{N_p−1} max_{i=1,...,(j+1)n_u} |L_{i,·}(∆u˜ − ∆u˜*)| + λ∥L(∆u˜ − ∆u˜*)∥_1,  (10)

where α_p, β_p are as defined in Section 3 and L_{i,·} is the ith row of L in (6).

Proof. Assume that y˜(k) is the estimate of the output sequence corresponding to ∆u˜(k) and that y˜*(k) is defined similarly for ∆u˜*(k). Let p˜ = jn_y + p; thus y˜_{p˜}(k) = yˆ_p(k + j|k), y˜*_{p˜}(k) = yˆ*_p(k + j|k), and r˜_{p˜}(k) = r_p(k + j), for p = 1, ..., n_y, j = 0, ..., N_p − 1. It is easy to verify that, for any x, y, z ∈ R,

|max(α(x − z), β(z − x)) − max(α(y − z), β(z − y))| ≤ max(α, β)|x − y|,

where α, β are non-negative real numbers. Hence,

|J_out^{1,1}(∆u˜) − J_out^{1,1}(∆u˜*)| ≤ Σ_{j=0}^{N_p−1} Σ_{p=1}^{n_y} max(α_p, β_p)|y˜_{p˜}(k) − y˜*_{p˜}(k)|.  (11)

From (3), we have y˜_{p˜}(k) = max(H_{p˜,·} ⊗ u˜(k), g_{p˜}(k)) and y˜*_{p˜}(k) = max(H_{p˜,·} ⊗ u˜*(k), g_{p˜}(k)), where H_{p˜,·} is the p˜th row of H and u˜(k) and u˜*(k) are the respective input sequences corresponding to ∆u˜(k) and ∆u˜*(k). Thus,

|y˜_{p˜}(k) − y˜*_{p˜}(k)| ≤ |H_{p˜,·} ⊗ u˜(k) − H_{p˜,·} ⊗ u˜*(k)|.  (12)

Denote

H_{p˜,·} ⊗ u˜(k) = max_{w=1,...,(j+1)n_u}(H_{p˜w} + u˜_w(k)) = H_{p˜w_0} + u˜_{w_0}(k),
H_{p˜,·} ⊗ u˜*(k) = max_{z=1,...,(j+1)n_u}(H_{p˜z} + u˜*_z(k)) = H_{p˜z_0} + u˜*_{z_0}(k) ≥ H_{p˜w_0} + u˜*_{w_0}(k).  (13)

Then

|H_{p˜,·} ⊗ u˜(k) − H_{p˜,·} ⊗ u˜*(k)| = |H_{p˜w_0} + u˜_{w_0}(k) − H_{p˜z_0} − u˜*_{z_0}(k)|
  ≤ |H_{p˜w_0} + u˜_{w_0}(k) − H_{p˜w_0} − u˜*_{w_0}(k)|  (by (13))
  ≤ |u˜_{w_0}(k) − u˜*_{w_0}(k)|
  ≤ max_{i=1,...,(j+1)n_u} |u˜_i(k) − u˜*_i(k)|
  = max_{i=1,...,(j+1)n_u} |L_{i,·}(∆u˜(k) − ∆u˜*(k))|  (by (7)).  (14)

From (11), (12) and (14), we have

|J_out^{1,1}(∆u˜) − J_out^{1,1}(∆u˜*)| ≤ Σ_{p=1}^{n_y} max(α_p, β_p) Σ_{j=0}^{N_p−1} max_{i=1,...,(j+1)n_u} |L_{i,·}(∆u˜ − ∆u˜*)|.  (15)

On the other hand,

|J_in^1(∆u˜) − J_in^1(∆u˜*)| = |Σ_{i=1}^{N_p n_u} (u˜*_i(k) − u˜_i(k))| ≤ ∥u˜(k) − u˜*(k)∥_1 ≤ ∥L(∆u˜(k) − ∆u˜*(k))∥_1.  (16)

From (15) and (16), we deduce that (10) holds. □

Based on Theorem 1, we can define ℓ^{1,1,1} : X × X → R^+ such that for any ∆u˜(k), ∆v˜(k) ∈ X,

ℓ^{1,1,1}(∆u˜(k), ∆v˜(k)) := ℓ_out^{1,1}(∆u˜(k), ∆v˜(k)) + λℓ_in^1(∆u˜(k), ∆v˜(k))  (17)

with

ℓ_out^{1,1}(∆u˜(k), ∆v˜(k)) = Σ_{p=1}^{n_y} max(α_p, β_p) Σ_{j=0}^{N_p−1} max_{i=1,...,(j+1)n_u} |L_{i,·}(∆u˜(k) − ∆v˜(k))|,
ℓ_in^1(∆u˜(k), ∆v˜(k)) = ∥L(∆u˜(k) − ∆v˜(k))∥_1,

where λ > 0 and α_p, β_p are as defined in Section 3. Because L has full column rank, it is easy to verify that the function ℓ^{1,1,1} defined by (17) is a semi-metric on X. Therefore, Requirements 1–2 are satisfied for σ = τ = ω = 1.
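A direct transcription of (17) could look as follows; it assumes L is available as a numpy array and that alpha, beta are the length-n_y weight vectors of Section 3. The function name is an illustrative choice only.

```python
import numpy as np

def ell_111(du, dv, L, alpha, beta, lam, Np, nu):
    """Semi-metric ℓ^{1,1,1}(Δu~, Δv~) of (17) for the case σ = τ = ω = 1."""
    diff = L @ (du - dv)
    # ℓ_out^{1,1}: for each j, the largest |L_{i,·}(Δu~ - Δv~)| over the first (j+1)*nu rows
    row_max = np.array([np.max(np.abs(diff[:(j + 1) * nu])) for j in range(Np)])
    ell_out = np.sum(np.maximum(alpha, beta)) * np.sum(row_max)
    # ℓ_in^1: the 1-norm ||L(Δu~ - Δv~)||_1
    ell_in = np.sum(np.abs(diff))
    return ell_out + lam * ell_in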

For the partitioning of X = {∆u˜(k) | a ≤ ∆u˜(k) ≤ b}, we take the center of X as the starting point (i.e. the root node of the tree). At each iteration, we bisect each dimension of X, so the number of branches K equals 2^{N_c n_u}. From (8), we have, for any ∆u˜(k) ∈ X_{h,d} with its center denoted as ∆u˜_{h,d}(k),

∥∆u˜(k) − ∆u˜_{h,d}(k)∥_∞ ≤ (1/2^{h+1}) ∥b − a∥_∞,  (18)
∥∆u˜(k) − ∆u˜_{h,d}(k)∥_1 ≤ (1/2^{h+1}) ∥b − a∥_1.  (19)

Optimistic optimization expands a leaf with the best b-value b_{h,d} := f(x_{h,d}) − δ(h) by adding its K children to the current tree (i.e. splitting the corresponding cell into K sub-cells). We now derive the expression δ^{1,1,1}(h) for δ(h) corresponding to ℓ^{1,1,1}.

Theorem 2. Define

δ^{1,1,1}(h) = (1/2^{h+1}) (δ_out^{1,1} + λδ_in^1)  (20)

for h ∈ {0, 1, ...} with

δ_out^{1,1} = (N_p(N_p + 1)∥b − a∥_∞ / 2) Σ_{p=1}^{n_y} max(α_p, β_p),  (21)
δ_in^1 = ∥L(b − a)∥_1.  (22)

Then for any h ∈ {0, 1, ...} and d ∈ {0, ..., K^h − 1}, it holds that sup_{∆u˜(k)∈X_{h,d}} ℓ^{1,1,1}(∆u˜(k), ∆u˜_{h,d}(k)) ≤ δ^{1,1,1}(h), where ∆u˜_{h,d}(k) is the center of cell X_{h,d}.

Proof. For any ∆u˜(k) ∈ X_{h,d}, we have (using the fact that |x^T y| ≤ ∥x∥_1 ∥y∥_∞ for every x, y ∈ R^n, and that max_{i=1,...,(j+1)n_u} ∥L_{i,·}∥_1 = j + 1 by the structure of L in (6))

ℓ_out^{1,1}(∆u˜(k), ∆u˜_{h,d}(k)) ≤ Σ_{p=1}^{n_y} max(α_p, β_p) Σ_{j=0}^{N_p−1} max_{i=1,...,(j+1)n_u} ∥L_{i,·}∥_1 ∥∆u˜(k) − ∆u˜_{h,d}(k)∥_∞
  ≤ (1/2^{h+1}) (N_p(N_p + 1)/2) ∥b − a∥_∞ Σ_{p=1}^{n_y} max(α_p, β_p)  (by (18))
  = (1/2^{h+1}) δ_out^{1,1}  (by (21)),

ℓ_in^1(∆u˜(k), ∆u˜_{h,d}(k)) ≤ (1/2^{h+1}) ∥L(b − a)∥_1 = (1/2^{h+1}) δ_in^1  (by (19) and (22)).

Thus, if we define δ^{1,1,1}(h) as in (20), then sup_{∆u˜(k)∈X_{h,d}} ℓ^{1,1,1}(∆u˜(k), ∆u˜_{h,d}(k)) ≤ δ^{1,1,1}(h). □

Theorem 3. Choose ν^{1,1,1} such that

0 < ν^{1,1,1} ≤ ρ min_{i=1,...,N_c n_u}(b_i − a_i) / (δ_out^{1,1} + λδ_in^1).

Then any cell X_{h,d} at any depth h contains an ℓ-ball B_{h,d} of radius ν^{1,1,1}δ^{1,1,1}(h) centered in ∆u˜_{h,d}, where 0 < ρ < 1 and δ^{1,1,1}(h), δ_out^{1,1}, δ_in^1 are as defined in (20)–(22).

Proof. According to Theorem 2, we can define a decreasing sequence {δ^{1,1,1}(h)}_{h=0}^{∞} as in (20). Select a real number ρ with 0 < ρ < 1. From (8), the ℓ-ball B_{h,d} of radius ν^{1,1,1}δ^{1,1,1}(h) centered in ∆u˜_{h,d} is inside the cell X_{h,d} if we choose ν^{1,1,1} such that ν^{1,1,1}δ^{1,1,1}(h) ≤ ρ min_{i=1,...,N_c n_u}(b_i − a_i)/2^{h+1}. Then ν^{1,1,1} can be chosen as ν^{1,1,1} ≤ ρ min_{i=1,...,N_c n_u}(b_i − a_i) / (δ_out^{1,1} + λδ_in^1). □

Up to now, we have proved that Requirements 1–4 are satisfied for σ = τ = ω = 1. In a similar way, we can obtain corresponding results for the other cases. The dedicated semi-metrics ℓ and the δ(h) expressions for different σ, τ, ω are presented in the appendix, where ℓ^{σ,τ,ω} = ℓ_out^{σ,τ} + λℓ_in^ω and δ^{σ,τ,ω}(h) = (1/2^{h+1})(δ_out^{σ,τ} + λδ_in^ω).
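Putting (20)–(22) and Theorem 3 together, the quantities needed to run an optimistic optimization routine (such as the generic sketch in Section 4) on the MPL-MPC problem can be computed as sketched below; rho is the free parameter 0 < ρ < 1 from Theorem 3, and the function name is illustrative only.

```python
import numpy as np

def doo_parameters(L, a, b, alpha, beta, lam, Np, rho=0.5):
    """delta_out^{1,1}, delta_in^1 of (21)-(22), the diameter bound delta^{1,1,1}(h)
    of (20), and an admissible nu^{1,1,1} according to Theorem 3."""
    box = b - a
    delta_out = 0.5 * Np * (Np + 1) * np.max(box) * np.sum(np.maximum(alpha, beta))  # (21)
    delta_in = np.sum(np.abs(L @ box))                                               # (22)

    def delta(h):                                                                    # (20)
        return (delta_out + lam * delta_in) / 2.0 ** (h + 1)

    nu = rho * np.min(box) / (delta_out + lam * delta_in)                            # Theorem 3
    return delta, nu
```

The returned delta can then be used as the diameter bound δ(h) in a generic optimistic optimization routine, with the objective f taken as the map from ∆u˜(k) to J_∆^{1,1,1}(∆u˜(k)) over the box X = [a, b].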

Remark 1. The computational complexity of optimistic optimization in our implementation is exponential in the control horizon N_c. On the other hand, the MPL-MPC problem can also be formulated as an MILP problem (Bemporad & Morari, 1999; De Schutter & van den Boom, 2001). The number of auxiliary binary variables that are used to convert the max operators into linear equations is proportional to the prediction horizon N_p. As a result, the complexity of state-of-the-art MILP algorithms is in the worst case exponential in N_p (Schrijver, 1986). Therefore, optimistic optimization will be more efficient if N_c ≪ N_p.

6. Examples

In this section, we illustrate the approach and the statement in Remark 1 with random experiments and an application to an industrial manufacturing system.

6.1. Random systems

Consider the MPL system (1)–(2) with n_u = n_y = 1. We will consider n_x = 5, 10, 20. To verify the statement in Remark 1, we perform experiments for N_c = 3, 4, 5 and N_p = N_c + 1, ..., 60. Assume that λ = 0.01, u(0) = 0, σ = τ = ω = 1, and −15 ≤ ∆u(k) ≤ 15 for all k. The elements of A, B, C, x(0) are selected as random integers uniformly distributed in the interval [0, 10], but each element of A, B, C, x(0) may be equal to ε with probability 0.2. The increments of the reference sequence r are random integers uniformly distributed in the interval [0, 10]. For each n_x ∈ {5, 10, 20}, we generate 20 random (A, B, C, x(0)) combinations. For each choice of (A, B, C, x(0)), we generate 10 random reference sequences {r(k)}_{k=1}^{N_p}.
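A sketch of the random instance generation just described (integer entries uniformly distributed in [0, 10], entries set to ε with probability 0.2, and random integer reference increments in [0, 10]) is given below; the seeding and the return format are arbitrary choices, not taken from the paper.

```python
import numpy as np

EPS = -np.inf  # epsilon = -infinity

def random_instance(nx, Np, seed=0):
    """Random SISO MPL instance (nu = ny = 1) in the spirit of Section 6.1."""
    rng = np.random.default_rng(seed)

    def rand_mp(shape):
        M = rng.integers(0, 11, size=shape).astype(float)   # integers in [0, 10]
        M[rng.random(shape) < 0.2] = EPS                     # entries set to epsilon w.p. 0.2
        return M

    A, B, C = rand_mp((nx, nx)), rand_mp((nx, 1)), rand_mp((1, nx))
    x0 = rand_mp((nx, 1))
    r = np.cumsum(rng.integers(0, 11, size=Np)).astype(float)  # reference with random increments
    return A, B, C, x0, r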

We compare the efficiency of our method with the MILP solvers cplex and glpk for solving problem (9). This comparison is fair because our method and the MILP solvers are all implemented in object code. The computational budget of optimistic optimization is set to 200 node expansions.

20

J. Xu et al. / Automatica 74 (2016) 16–22

(a) Nc = 3.

(c) Nc = 5.

(b) Nc = 4.

(d) Relative error of oo.

Fig. 1. (a–c) The CPU time for optimistic optimization (oo), cplex, and glpk for Nc = 3, 4, 5; (d) Relative error between oo and the MILP solvers.

The cplex solver is from the Tomlab toolbox and the glpk solver is from the Multi-Parametric Toolbox. All experiments are performed in Matlab. The CPU time of each method is plotted on a logarithmic scale in Fig. 1. We can see in Fig. 1(a) that, for N_c = 3, the mean CPU time curves of optimistic optimization (oo) and the MILP solvers intersect at N_p = 6. For N_c = 4 and N_c = 5, the intersections of the mean CPU time curves for oo and for the MILP solvers occur at N_p = 9 and N_p = 14, respectively, as shown in Fig. 1(b–c). Thus, optimistic optimization is faster than MILP when N_p is about two or three times as large as N_c. We can also see that the computation time of the MILP solvers is exponential in N_p, while N_p has no significant influence on the computation time of optimistic optimization. We also compute the relative error between the objective function value obtained by optimistic optimization and the best value among the MILP solvers (see Fig. 1(d)). The difference between the objective function values provided by the MILP solvers is negligible, so it is not plotted. For each n_x and each combination of A, B, C, x(0) and r(k), the relative error of optimistic optimization is computed; the plotted relative errors are the average values over all instances. We can see that for each value of N_c considered, the average relative errors are less than 3.5 × 10^{−3}.

6.2. Industrial manufacturing system

Now we consider the manufacturing unit for producing rubber tubes for automobile equipment presented in Atto, Martinez, and Amari (2011).

The dynamic behavior of this system is described by an MPL system with 19 states, 2 inputs, and 1 output (see Atto et al. (2011) for details). Let N_c = 2. We run experiments for N_p = N_c + 1, ..., 40 with λ = 0.0001, σ = τ = ω = 1, u(0) = [0 0]^T, and 2 ≤ ∆u_i(k) ≤ 8, i = 1, 2, for all k. The increments of the reference sequence r are random integers uniformly distributed in the interval [2, 10]. We use optimistic optimization and the cplex solver to solve the corresponding MPL-MPC problem (the glpk solver is not used for comparison because cplex is much faster than glpk when solving the resulting MILP problem for this example). The computational budget of optimistic optimization is set to 700 node expansions. Fig. 2 shows the CPU time for optimistic optimization and cplex and the relative error between the values of J_∆^{1,1,1} provided by both methods. We can see that optimistic optimization is faster than cplex after N_p = 14 and that the relative error between the objective function values is less than 9%.

7. Conclusions

We have considered model predictive control for max-plus linear systems, which usually results in a nonsmooth nonconvex optimization problem. We have extended optimistic optimization to solve the given problem and derived expressions for the required parameters.

Fig. 2. (a) The CPU time for optimistic optimization (oo) and cplex; (b) Relative error between oo and cplex.

Based on the theoretical analysis, we found that the complexity of the proposed approach increases exponentially in the control horizon instead of the prediction horizon, whereas the worst-case complexity of the mixed integer linear programming (MILP) method is exponential in the prediction horizon. As illustrated by the numerical results, optimistic optimization is more efficient than MILP when the prediction horizon is large and the control horizon is small. We have only considered simple bounds on the input rate; the case with general linear constraints on inputs and outputs will be considered in the future. We will also consider leveraging the non-expansivity property of max-plus linear systems to further reduce the complexity and to obtain tighter expressions for the parameters of Requirements 1–4.

Acknowledgment

Research supported by the Chinese Scholarship Council.

Appendix

The main results in Section 5 are for the case σ = τ = ω = 1. Here we present the dedicated results for the other cases; the proofs are similar.

(1) σ = 1, τ = ∞:

ℓ_out^{1,∞}(∆u˜(k), ∆v˜(k)) = max_{p=1,...,n_y} max(α_p, β_p) Σ_{j=0}^{N_p−1} max_{i=1,...,(j+1)n_u} |L_{i,·}(∆u˜(k) − ∆v˜(k))|,
δ_out^{1,∞} = (N_p(N_p + 1)∥b − a∥_∞ / 2) max_{p=1,...,n_y} max(α_p, β_p).

(2) σ = ∞, τ = 1:

ℓ_out^{∞,1}(∆u˜(k), ∆v˜(k)) = Σ_{p=1}^{n_y} max(α_p, β_p) max_{j=0,...,N_p−1} max_{i=1,...,(j+1)n_u} |L_{i,·}(∆u˜(k) − ∆v˜(k))|,
δ_out^{∞,1} = N_p ∥b − a∥_∞ Σ_{p=1}^{n_y} max(α_p, β_p).

(3) σ = ∞, τ = ∞:

ℓ_out^{∞,∞}(∆u˜(k), ∆v˜(k)) = max_{p=1,...,n_y} max(α_p, β_p) max_{j=0,...,N_p−1} max_{i=1,...,(j+1)n_u} |L_{i,·}(∆u˜(k) − ∆v˜(k))|,
δ_out^{∞,∞} = N_p ∥b − a∥_∞ max_{p=1,...,n_y} max(α_p, β_p).

(4) ω = 2:

ℓ_in^2(∆u˜(k), ∆v˜(k)) = 2∥Lb + u¯(k − 1)∥_2 ∥L(∆u˜(k) − ∆v˜(k))∥_2,
δ_in^2 = 2∥Lb + u¯(k − 1)∥_2 ∥L(b − a)∥_2.
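For reference, a small sketch that evaluates the constants δ_out^{σ,τ} and δ_in^{ω} listed above, together with (21)–(22), for all combinations σ, τ ∈ {1, ∞} and ω ∈ {1, 2}; u_prev stands for u(k − 1), and the dictionary-based interface is only an illustrative choice.

```python
import numpy as np

def delta_constants(L, a, b, alpha, beta, Np, u_prev):
    """delta_out^{sigma,tau} and delta_in^{omega} for (21)-(22) and the appendix cases."""
    box = b - a
    w = np.maximum(alpha, beta)
    d_out = {
        ('1', '1'):     0.5 * Np * (Np + 1) * np.max(box) * np.sum(w),   # (21)
        ('1', 'inf'):   0.5 * Np * (Np + 1) * np.max(box) * np.max(w),   # case (1)
        ('inf', '1'):   Np * np.max(box) * np.sum(w),                    # case (2)
        ('inf', 'inf'): Np * np.max(box) * np.max(w),                    # case (3)
    }
    u_bar = np.tile(u_prev, L.shape[0] // len(u_prev))
    d_in = {
        '1': np.sum(np.abs(L @ box)),                                          # (22)
        '2': 2.0 * np.linalg.norm(L @ b + u_bar) * np.linalg.norm(L @ box),    # case (4)
    }
    return d_out, d_in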

References

Amari, S., Demongodin, I., Loiseau, J. J., & Martinez, C. (2012). Max-plus control design for temporal constraints meeting in timed event graphs. IEEE Transactions on Automatic Control, 57(2), 462–467.
Atto, A. M., Martinez, C., & Amari, S. (2011). Control of discrete event systems with respect to strict duration: Supervision of an industrial manufacturing plant. Computers & Industrial Engineering, 61(4), 1149–1159.
Baccelli, F. L., Cohen, G., Olsder, G. J., & Quadrat, J.-P. (1992). Synchronization and linearity: An algebra for discrete event systems. New York: John Wiley & Sons.
Bemporad, A., & Morari, M. (1999). Control of systems integrating logic, dynamics, and constraints. Automatica, 35(3), 407–427.
Boimond, J.-L., & Ferrier, J.-L. (1996). Internal model control and max-algebra: Controller design. IEEE Transactions on Automatic Control, 41(3), 457–461.
Busoniu, L., & Morarescu, I.-C. (2014). Consensus for black-box nonlinear agents using optimistic optimization. Automatica, 50(4), 1201–1208.
Busoniu, L., Postoyan, R., & Daafouz, J. (2013). Near-optimal strategies for nonlinear networked control systems using optimistic planning. In 2013 American Control Conference (pp. 3020–3025). Washington, DC, USA.
Cohen, G., Dubois, D., Quadrat, J.-P., & Viot, M. (1985). A linear-system-theoretic view of discrete-event processes and its use for performance evaluation in manufacturing. IEEE Transactions on Automatic Control, 30(3), 210–220.
Cottenceau, B., Hardouin, L., Boimond, J.-L., & Ferrier, J.-L. (2001). Model reference control for timed event graphs in dioids. Automatica, 37(9), 1451–1458.
De Schutter, B., & van den Boom, T. (2001). Model predictive control for max-plus-linear discrete event systems. Automatica, 37(7), 1049–1056.
Gazarik, M. J., & Kamen, E. W. (1999). Reachability and observability of linear systems over max-plus. Kybernetika, 35(1), 2–12.
Hardouin, L., Maia, C. A., Cottenceau, B., & Lhommeau, M. (2010). Observer design for (max, plus) linear systems. IEEE Transactions on Automatic Control, 55(2), 538–543.
Houssin, L., Lahaye, S., & Boimond, J.-L. (2013). Control of (max,+)-linear systems minimizing delays. Discrete Event Dynamic Systems, 23(3), 261–276.
Hren, J. F., & Munos, R. (2008). Optimistic planning of deterministic systems. In Recent advances in reinforcement learning (pp. 151–164). Berlin: Springer.
Katz, R. D. (2007). Max-plus (A,B)-invariant spaces and control of timed discrete-event systems. IEEE Transactions on Automatic Control, 52(2), 229–241.
Maia, C. A., Andrade, C. R., & Hardouin, L. (2011). On the control of max-plus linear system subject to state restriction. Automatica, 47(5), 988–992.
Máthé, K., Buşoniu, L., Munos, R., & De Schutter, B. (2014). Optimistic planning with a limited number of action switches for near-optimal nonlinear control. In 53rd IEEE Conference on Decision and Control (pp. 3518–3523). Los Angeles, CA, USA.


Munos, R. (2011). Optimistic optimization of a deterministic function without the knowledge of its smoothness. In 25th Annual Conference on Neural Information Processing Systems (pp. 783–791). Granada, Spain.
Munos, R. (2014). From bandits to Monte-Carlo tree search: The optimistic principle applied to optimization and planning. Foundations and Trends in Machine Learning, 7(1), 1–130.
Schrijver, A. (1986). Theory of linear and integer programming. Chichester, England: John Wiley & Sons.
Valko, M., Carpentier, A., & Munos, R. (2013). Stochastic simultaneous optimistic optimization. In 30th International Conference on Machine Learning (ICML-13), vol. 28 (pp. 19–27). Atlanta, GA, USA.
van den Boom, T., & De Schutter, B. (2002). Properties of MPC for max-plus-linear systems. European Journal of Control, 8(5), 453–462.
Xu, J., De Schutter, B., & van den Boom, T. (2014). Model predictive control for max-plus-linear systems via optimistic optimization. In 12th International Workshop on Discrete Event Systems (pp. 111–116). Cachan, France.

Jia Xu received the B.Sc. degree in Statistics from Shandong University, Weihai, China, in 2009. From 2009 to 2011 she was a master student with the Department of Mathematics, and from 2011 to 2012 a Ph.D. student with the Department of Control Science and Engineering, both at Tongji University, Shanghai, China. She is currently working toward the Ph.D. degree with the Delft Center for Systems and Control, Delft University of Technology, Delft, The Netherlands. Her research interests include the control of discrete-event systems, hybrid systems control, and optimization.

Ton van den Boom received his Ph.D. degree in Electrical Engineering from Eindhoven University of Technology, The Netherlands, in 1993. Currently he is an Associate Professor at Delft University of Technology, The Netherlands. His research interests are in the areas of linear and nonlinear model predictive control, and control of discrete-event and hybrid systems.

Bart De Schutter (IEEE member since 2008, senior member since 2010) received the Ph.D. degree in 1996 at K.U. Leuven, Belgium. Currently, he is a Full Professor at the Delft Center for Systems and Control of Delft University of Technology in Delft, The Netherlands. Bart De Schutter is Senior Editor of the IEEE Transactions on Intelligent Transportation Systems and Associate Editor of Automatica. His current research interests include intelligent transportation and infrastructure systems, discrete-event systems, hybrid systems, and multi-level and distributed control.