Preprints, Preprints, 8th 8th IFAC IFAC International International Symposium Symposium on on Preprints, IFAC Advances in Automotive Automotive Control Symposium Preprints, 8th 8th IFAC International International Symposium on on Advances in Control Advances in Automotive Control Available online at www.sciencedirect.com June Norrköping, Sweden Advances Automotive Control June 19-23, 19-23,in2016. 2016. Norrköping, Sweden June June 19-23, 19-23, 2016. 2016. Norrköping, Norrköping, Sweden Sweden
ScienceDirect
IFAC-PapersOnLine 49-11 (2016) 781–786
A new D-optimal input sequence design A A new new D-optimal D-optimal input input sequence sequence design design 22 algorithm using PI using algorithm using PI PI 2 algorithm ∗∗ Masaki Yamakita ∗∗ Hiroaki Ishiyama Hiroaki Ishiyama ∗ Masaki Yamakita ∗ Hiroaki Ishiyama Hiroaki Ishiyama ∗ Masaki Masaki Yamakita Yamakita ∗ ∗∗ Department Control and Systems, of Department of of Mechanical Mechanical Control and Systems, Tokyo Tokyo Institute Institute of ∗ ∗ Department Mechanical Control and Institute Department of2-12-1 Mechanical Control and Systems, Systems, Tokyo Institute of of Technology,of 2-12-1 Ookayama, Meguro, Tokyo, Tokyo Japan(e-mail: Technology, Ookayama, Meguro, Tokyo, Japan(e-mail: Technology, 2-12-1 Ookayama, Meguro, Tokyo, Japan(e-mail: Technology, 2-12-1 Ookayama, Meguro, Tokyo, Japan(e-mail:
[email protected]).
[email protected]).
[email protected]).
[email protected]).
Abstract: In this paper, aa new statistical approach to design optimal input sequences is proposed. The Abstract: In this paper, new statistical approach to design optimal input sequences is proposed. The Abstract: In this paper, aa combination new statistical approach to design optimal input sequences is proposed. The 22 algorithm. Abstract: In this paper, new statistical approach to design optimal input sequences is proposed. The proposed algorithm is the of D-optimal input design method and the PI In the proposed algorithm is the combination of D-optimal input design method and the PI algorithm. In the 2 2 proposed algorithm is the combination of D-optimal input design method and the PI algorithm. In the 2 2 algorithm algorithmaa is thefunction combination input design method and the PI algorithm. In the proposed method, method, cost function of the theofPI PID-optimal algorithm is modified to calculate control sequence, which proposed cost of is modified to calculate control sequence, which 2 algorithm is modified to calculate control sequence, which 2 proposed method, a cost function of the PI proposed method, a cost function of the PI algorithm is modified to calculate control sequence, which maximizes FIC FIC in in aa statistical statistical way. way. The The proposed proposed method method is is compared compared with with aa previous previous optimal optimal input input maximizes maximizes FIC in aa statistical way. proposed method is with previous optimal input maximizes FICin innumerical statistical way. The The The proposed method is compared compared with aa shows previous optimal input design method simulations. result of the numerical simulation the advantage of design method in numerical simulations. The result of the numerical simulation shows the advantage of design method in numerical simulations. The result of the numerical simulation shows the advantage of design method in numerical simulations. The result of the numerical simulation shows the advantage of the proposed method under a noisy situation. the proposed method under a noisy situation. the proposed method under a noisy situation. the proposed method under a noisy situation. © 2016, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved. Keywords: D-optimal, Input design Keywords: D-optimal, Input design Keywords: D-optimal, Input Input design design Keywords: D-optimal, 1. INTRODUCTION 1. INTRODUCTION 1. INTRODUCTION INTRODUCTION 1. In this paper, aa new statistical approach to design optimal input In this paper, new statistical approach to design optimal input In this this paper, paper, a new new statistical statistical approach to design design optimal input In a approach to optimal input sequences is proposed. The proposed algorithm is the combinasequences is proposed. The proposed algorithm is the combinasequences is proposed. The proposed algorithm is the combinasequences is proposed. The proposed algorithm isHirsch the combination of D-optimal input design Hirsch (2010) and (2012) tion of D-optimal input design Hirsch (2010) and Hirsch (2012) tion PI of 22D-optimal D-optimal input design(2010). HirschD-optimal (2010) and andinput Hirsch (2012) tion of design Hirsch (2010) Hirsch (2012) and algorithminput Evangelos sequence and PI algorithm Evangelos (2010). D-optimal input sequence 2 2 and PI algorithm Evangelos (2010). D-optimal input sequence andaa PI algorithm Evangelos (2010). D-optimal input sequence is dynamical sequence design method, which is designed to is sequence design method, which is designed to is aa dynamical dynamical sequence design method, which is designed to is dynamical sequence design method, which is designed to maximize the total value of Fisher information criteria (FIC). maximize the total value of Fisher information criteria (FIC). maximize the total value of Fisher information criteria (FIC). maximize the total value of Fisher information criteria (FIC). Since D-optimal D-optimal input input design design method method is is aa model model based based method, method, Since Since D-optimal input method is based method, Since D-optimal input design design method is aa model modelshould based be method, the influence of modeling error and disturbance taken the influence of modeling error and disturbance should be taken the influence of modeling error and disturbance should be taken 2 2 the influence of modeling error and disturbance should be taken into account. The PI algorithm is a statistical way to solve HJB into account. The PI algorithm is a statistical way to solve HJB 2 2 into account. The PI algorithm is a statistical way to solve HJB into account. PI algorithm a statistical way to solve HJB equation in aa The non-linear control isproblem problem by simulating simulating several equation in non-linear control by several equation in a non-linear control problem by simulating several equation in a non-linear control problem by simulating several state trajectories trajectories and and computing computing cost-minimizing cost-minimizing solutions. solutions. In In state state trajectories and cost-minimizing solutions. In 2 algorithm 2 state trajectories and computing computing cost-minimizing solutions. In the proposed method, the cost function of the PI the proposed method, the cost function of the PI algorithm 2 2 the proposed method, the cost function of the PI algorithm the proposed method, the cost function of the PI algorithm is modified to calculate the input sequence, which maximizes is modified to calculate the input sequence, which maximizes is to the input maximizes is modified modified to calculate calculate theproposed input sequence, sequence, which maximizes FIC as the the state state cost. The The proposed method which is compared compared with FIC as cost. method is with FIC as the state cost. The proposed method is compared with FIC as the state cost. The proposed method is compared with previous optimal optimal input input design design method method Kawaguchi Kawaguchi (2013) (2013) in previous in previous optimal input design method Kawaguchi (2013) in previous optimal input design method Kawaguchi (2013) in numerical simulations. The result of the numerical simulation numerical simulations. The result of the numerical simulation numerical simulations. The result of the numerical simulation numerical simulations. The result of the numerical simulation shows the advantage of the proposed method under the noisy shows the advantage of the proposed method under the noisy shows shows the the advantage advantage of of the the proposed proposed method method under under the the noisy noisy situation. situation. situation. situation. 2 2. PI 2. PI 222 BASED BASED D-OPTIMAL D-OPTIMAL ALGORITHM ALGORITHM 2. 2. PI PI BASED BASED D-OPTIMAL D-OPTIMAL ALGORITHM ALGORITHM 2 2 In this section, a way to modify PI apply In this section, aa way to modify PI algorithm to to apply DD2 algorithm In this way to PI to DIn this section, section, way to modify modify PI 2 algorithm algorithm to apply apply Doptimal criteriona is is shown. The basic basic idea of of the the proposed optimal criterion shown. The idea proposed optimal criterion is shown. The basic idea of the proposed optimal criterion is shown. The basic idea of the proposed method is is to to use use −FIC −FIC as as the the state state cost, cost, which which makes makes PI PI 222 method method to −FIC as cost, which makes PI 22 method is is work to use useas −FIC as the the state stateTo cost, which makes PI algorithm FIC maximizer. apply the original algorithm work as FIC maximizer. To apply the original PI 2 algorithm work as FIC maximizer. To apply the original PI work as FIC maximizer. To(Design apply the original PI 2 algorithm to the D-optimal based DoE of Experiments) algorithm to the D-optimal based DoE (Design of Experiments) algorithm to D-optimal DoE Experiments) algorithm,, we to the the D-optimal based DoEaa(Design (Design ofset Experiments) problem we would like to to based start from from problemof set up. See See the the problem would like start problem up. problem , we would like to start from a problem set up. 2 2 problem , we would like to start from a problem set up. See See the the detail about about PI PI 2 algorithm algorithm in in the the Appendix. Appendix. detail detail detail about about PI PI 2 algorithm algorithm in in the the Appendix. Appendix. 2.1 Problem 2.1 Problem set set up up 2.1 2.1 Problem Problem set set up up The system system in in PI PI 222 algorithm algorithm can can be be discretized discretized as as following, following, The The in PI can be discretized as � � � � � � � � The system system in � PI 2 algorithm algorithm can be� discretized as following, following, � � (m) � � � � (m) (m) 0k×p (m) � � xxtt(m) � � �0 k×p � ff (x tii ) � � (x ) (m) i+1 t 0 (m) i+1 k×p T xxt(c) = ff (x + 0(c) θttii + + εεttii )) ,, (1) T = (((θ + (1) (c) k×p i+1 (xtttii )))(m) (c) (c) T = + (1) i+1 gtt(c) (c) (x (c) = fff (x (θ + g θttii + + εεttii )) ,, (1) xxxttt(c) tii ) ii T i+1 (c) (c) (c) (x ) g i+1 t tti i) f (x g xtti+1 t �� Sponsor and i i goes here. Paper titles should financial support acknowledgment i+1 and financial support acknowledgment goes here. Paper titles should �� Sponsor Sponsor and financial support acknowledgment goes here. be written uppercase lowercase letters, uppercase. and financialand support acknowledgment goes here. Paper Paper titles titles should should be Sponsor written in in uppercase and lowercase letters, not not all all uppercase. be written in be written in uppercase uppercase and and lowercase lowercase letters, letters, not not all all uppercase. uppercase.
(c)
where, g gtt(c) is an input input basis basis function function vector vector with with input input sequence sequence (c) where, i is an (c) where, g input function vector input sequence ttii is where, gbeing is an an input basis function vector with with inputand sequence horizon N, and llbasis being the dimension of input, horizon being N, and being the dimension of input, and i horizon being N, and l being the dimension of input, and horizon being N, and l being the dimension of input, and ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ [g11 , ..., ..., g g1N ] ⎡ [g ⎤ , ] 11 1N [g , ..., g ] T 11 1N T ⎢ ⎥ (c) [g , ..., g ] . 11 1N ⎢ ⎥ (c) T . . =⎢ gtt(c) ⎣ ⎦ ⎥ g .. .. .. ⎣ ⎦ ii T = ⎢ ⎥ ,,, (c) = g ⎣ ⎦ . .. gttii = ⎣ ⎦, [g , ..., g ] l1 , ..., glN lN ] [g l1 [g , ..., g ] l1 lN [gl1 , ..., glN ] ⎡ ⎤ ⎡ ⎤ ⎡ 11ti θ ti ⎤ ⎡θ ⎤ θ 1 ti θ212titi ⎥ ⎢ ⎢ ⎥ θ ⎢θ ⎥ .. ⎢ θ = θ22.. titi ⎥ θ = ⎣ ⎦ ⎢ ⎥ ⎣ ⎦ θ = . ⎢ ⎥ .. θ =⎣ . ⎣ .. ⎦ ⎦ θ θ ti llti θ l ti θlti (c) (c) .. In Here, θ is aa weight vector of the basis function g to Here, θ is weight vector of the basis function g In order order to (c) (c) .. In Here, θ is a weight vector of the basis function g order to Here, θ is a weight vector of the basis function g In order to consider the FIC for the system, assume that the system output consider the FIC for the system, assume that the system output consider the FIC for the system, assume that the system output consider the FICas is approximated asfor the system, assume that the system output is approximated is is approximated approximated as as Ψ, (2) =X X TTT Ψ, (2) yyy = (2) Ψ, (2) y= =X X T Ψ, 2 2 (k), ..., ..., xxnn (k), (k), xx112 (k), (k), xx11 (k) (k) (3) where X X= = [x [x11 (k), (3) where 2 (k), ..., x (k), x (k), x (k) (3) where X = [x x11 (k) (3) where X = [x11 (k), ..., xnn (k), x11 (k), (k), ..., ..., xx11 (k)x (k)xnn (k), (k), ..., ..., xxnnlll (k)]. (k)]. (4) (4) xxx22 (k), l (k), ..., x (k)x (k), ..., x (k)]. (4) nn (k), ..., xn (k)]. ..., x11 (k)x (4) x22 (k),vector n X denotes denotes the the regressor regressor whose elements are the the combicombiX vector whose elements are X denotes the regressor vector whose elements are the combination of i-th elements of the state vector x. Under this assumpX denotes the regressor vector whose elements are the combination of elements of the state vector x. Under this assumpnation of i-th i-th elements of the state vector x. Under this assump22 with nation i-th elements of of thePI state vector x.is Under this assumpFIC defined as tion, aa of new cost function with FIC is defined as tion, new cost function of PI 2 2 with FIC is defined as tion, a new cost function of PI with FIC is defined as tion, a new cost function of PI � � � N−1 � N−1 11 T � N−1 � �qqt j + T Rut � + u 1 ∑ t JJtiti = = N−1 Ru u j 212 uttTTjj Rutt jj qqtt jj + JJti = ∑ j=i j = + Ru u j=i ti t ∑ 22 t j t j j=i � j � j=i � � N−1 1 � � .. N−1 N−1 � R(θ −FICtt j + + 11 ((θ θ+ +M Mtt j εεtt j ))TTT R( θ+ +M Mtt j εεtt j ))� .. = ∑ .. N−1 −FIC = j + 2 j ε j )T R(θ + M j ε j ) . 1 −FIC θ + M = ( j=i −FICtt j + 2 (θ + Mtt j εtt j ) R(θ + Mtt j εtt j ) . = ∑ j=i ∑ 22 j j j j j j=i j=i
Here, FIC FICttii is is the the Fisher Fisher information information criteria criteria on on time time ttii ,, which which Here, FIC Here, FICttii is is the Fisher Fisher information information criteria criteria on on time time ttii ,, which which isHere, calculated asthe is calculated as is is calculated calculated as as �� �� T T X)� ��det(X log FIC tii = � = log det(X X) FIC T t det(X FIC = log log ti +Δ det(X T X) X) FICttii = t +Δ 1 tii +Δ +Δ tr(log|r|), 1 ti∑ = 11 tr(log|r|), = = tr(log|r|), 2Δ ∑ = 2Δ t −Δ tii∑ −Δ tr(log|r|), 2Δ 2Δ ttii −Δ −Δ produced by QR decompowhere rr is is the the upper upper triangle triangle matrix matrix where produced by QR decompowhere rr is the triangle matrix produced T X.upper where is the upper triangle matrix produced by by QR QR decompodecompoT sition of X sition of X X. T sition sition of of X X T X. X.
Copyright 2016 IFAC IFAC 795 Hosting by Elsevier Ltd. All rights reserved. 2405-8963 © IFAC (International Federation of Automatic Control) Copyright © 2016, 2016 795 Copyright 2016 IFAC 795 Peer review© of International Federation of Automatic Copyright ©under 2016 responsibility IFAC 795Control. 10.1016/j.ifacol.2016.08.114
IFAC AAC 2016 782 June 19-23, 2016. Norrköping, Sweden
Hiroaki Ishiyama et al. / IFAC-PapersOnLine 49-11 (2016) 781–786
In the above equation, FICti is calculated as the average of FIC in 2Δ time steps near time ti , where Δ is a small time interval. In the cost function, the state cost in time scale of t = [t0 , ...,ti ] is calculated as the average of FICti from t = t0 to t = tN as FIC =
1 tN − t0
tN
FICti dti tN 1 1 t+Δ T = log det X (τ )X(τ )d τ dt tN − t0 t0 2Δ t−Δ
Algorithm.1
(1) Define simulation horizon N, the length L of input sequence, iteration number M(be sure to set M large enough), and the number of state trajectories K generated in each time ti . for j=1:L
t0
(5)
2.2 In the case of SISO with discrete input basis function In this subsection, we will consider a case of SISO and the discrete input as the basis function, which is easy to understand. (c) The discrete time impulse function vector gti for time horizon N can be described as,
(2) Set an initial input gain vector θ0 . (3) Set noise matrix ε ∈ RN×n×K , where n is the dimension of system input (for example, in case of SISO system, n = 1). (4) Inject K types of noise matrix ε ∈ RN×n to generate K different trajectories. For time scale {τi = t0 , ...ti }, calculate following cost function, weight function, and policy updating. for m = 1:M for i = 1:N-1 for k = 1:K
(c) T
gti where
= [g1 , g2 , ..., gN ] 1(i = ti ) gi = δ (i,ti ) = . 0(i �= ti )
P(τi , k) =
exp(− λ1 S(τi )) ∑Kk=1 exp(− λ1 S(τi ))d τi
S(τi , k) = φtN +
The case, i-th element of θ , θi will be an input on time ti .
N−1
∑
j=i
2.3 PI 2 based optimal input sequence design algorithm
N−1
∑ qt j dt + j=i
1 (θt + εt j )T Mt j T RMt j (θt j + εt j ) 2 j (c) (c) T
Mt j ,k =
According to the discussion above, the proposed optimal input design algorithm is summarized as in Algorithm 1.
R−1 gt j gt j
(c) (c) T trace(R−1 gt gt )
∈ RN×N
end 3. COMPARED METHOD (KAWAGUCHI’S METHOD)
δ θi =
δ θ (:, i) = δ θti Update policy,
θ (new) = θ (old) + δ θ . (13) (new) Set the new input gain vector θ to the new initial state vectorθ0 and continue updating the policy.
Θ = [θ0 , θ1 , . . . , θn , θ11 , θ12 , . . . , θnn , . . . , θnl ] X(k) = [1, x1 (k), . . . , xn (k), x1 (k)x1 (k), x1 (k)x2 (k), . . . , xn (k)xn (k), . . . , xn (k)l ]T
(7) (8) (9)
Here, Θ is a parameter vector, and X(k) is a regressor vector on time k. When the equations in all time steps, are augmented in one vector form, the system is expressed as, y = XT Θ + ε.
end (5) Inject the first element of the obtained optimal input sequence θ (new) (1) to the plant model, and set the next state vector as the new initial state vectorθ0 .
(6) T
(10) 796
(12)
end
3.1 NARX model
y(k) = X(k)T Θ + ε (k)
(11)
k=1
In this section, a brief explanation of Kawaguchi’s method is given. Author shows the mechanism of D-optimal DoE scheme, which is proposed in Hirsch (2010), which is used in Kawaguchi (2013).
NARX model which we consider is a parametric models which is represented by a linear sum of nonlinear basis functions and their gain. This model is generally used because of its expression ability and easiness. The NARX model is expressed as,
K
∑ [P(τi , k)Mti ,k εti ,k ]
end
In this case, X is the vector including each regressor in its row direction, and y is the output vector, and ε is the unknown disturbance vector. For the model in a form of equation(10), the least square estimate is calculated as ˆ = X+ · y Θ
(14)
IFAC AAC 2016 June 19-23, 2016. Norrköping, Sweden
Hiroaki Ishiyama et al. / IFAC-PapersOnLine 49-11 (2016) 781–786
with a pseudo inverse matrix X + = (XX T )−1 X T , where M = XX T is known as an information matrix. The determinant of the information matrix is desired to be as large as possible, because the inverse of the information matrix is needed. In D-optimal criteria, given that the set of possible input as Ω,the optimal inputs are those which maximize the value function: J(M) = det(M), (15) which can be rewritten as (16) u(k) = arg max(det(M)), u(k) ∈ Ω, k = 1, . . . , N. u(k)
However, it is not feasible because users have to solve a nonlinear optimization problem with N × nu valuables, in case of N steps with nu input dimensions. Thus in Hirsch (2010), instead, inputs are designed by maximizing the sum of prediction error ˆ k , which is estimated in N step ahead. Given the parameter Θ from the training data available at time k, the prediction error ˆ k and y(k + 1) is calculated a function of u(k + 1) in between Θ the input range Ω as ˆ k ) = σ 2 X(k + 1)T M(k)−1 X(k + 1). (17) var y(u(k ˆ + 1), Θ
783
4.2 Result.1 In this subsection, the simulation results of the proposed algorithm with parameter settings in Table.1 are shown. Fig.1 shows three inputs which are used in the simulation. Dots(.) are the optimal inputs with the proposed method, triangles(�) are the optimal inputs with Kawaguchi’s method, and asterisks(∗) are random inputs(∗). Fig.3 shows the instantaneous value of FIC on time ti calculated from equation (5.10), and Fig.4 shows the average of FIC from t0 to ti , with the inputs in Fig.1. In both Fig.3 and Fig.4, solid lines(−) denote the results of the proposed method, triangle markers(�) denote the Kawaguchi’s method, and dash lines(−−) denote the random inputs.
Therefore, the optimal input is asymptotically solved by iteratively maximizing the normalized variance of prediction d(u(k + 1), M ∗ (k)) in the next equation, d(u(k + 1), M ∗ (k)) = X(k + 1)T M ∗ (k)−1 X(k + 1) M ∗ (k) = M(k)/k. Thus, the optimal input in time k + 1 is defined as, u(k + i) = arg max
nd
∑ X(k + i)T M∗ (k)−1 X(k + i),
u(k+i) i=1
(18)
Fig. 1. With limitations in amplitude: PI 2 (.), compared method(�), random input(∗)
(19)
u(k) ∈ Ω, (20) which is calculated by designing the optimal N steps inputs, and using the first step as the optimal input in time k + 1. Notice that the optimization problem is not assured to be convex. Authors applied multi-shot optimization to solve the problem, to carry out ten optimizations from different initial values. 4. NUMERICAL SIMULATION In this section, results of some numerical simulations are shown. The proposed algorithm is applied to the ARX model to generate the D-optimal input sequence. The obtained input sequence is compared to two other types of input sequences. One is the sequence calculated with another D-optimal DoE algorithm, which was proposed by Kawaguchi (2013), and the another is Gaussian random input sequence. Simulations are carried out with MATLAB 2013(a).
Fig. 2. The system response with the designed input.(observed(broken line),true value(solid line))
4.1 Assumption The proposed method is applied to the following nine dimensional SISO system under the assumption that the user knows the system structure clearly. Plant model is described as xt+1 = a X t + e, where e is a disturbance, and parameter a , and state regressor X is assumed to be defined as a = [0.3, −0.2, 0.2, −0.1, −0.08, 0.05, −0.01, 0.01, 0.01]
Fig. 3. FICti on time ti with limitations in amplitude(PI 2 (line), compared method(�), random input(broken line))
X t = [xt , xt−1 , xt−2 , xt−3 , xt−4 , xt−5 , xt−6 , xt−7 , xt−8 ] e ∼ N (0, σe ), σe ∈ R
797
IFAC AAC 2016 784 June 19-23, 2016. Norrköping, Sweden
Hiroaki Ishiyama et al. / IFAC-PapersOnLine 49-11 (2016) 781–786
Fig. 4. Averaged FICti on timeti with limitations in amplitude(PI 2 (line),compared method(�), random input(broken line)) 4.3 Result.2 In this subsection, results of the numerical simulation are demonstrated to show that the proposed method is superior to the compared method. We have chosen the Kawaguchi’s method as the compared method as it is the another method with D-optimal structure. Here, we think about the way to set the upper and lower limitation of the designed signal amplitude, because our proposed method will not provide any limitation in amplitude of the optimal input. There are two ways, such as,
Fig. 6. FICti on timeti with limitations in amplitude(PI 2 (line), compared method(�), random input(broken line)), limitations in amplitude with clipping(PI 2 (o)
(1) Set upper and lower bounds and cut the signal at the limitation after designing the optimal input sequences. (2) Add a barrier function cost in the original PI 2 algorithm regarding the amplitude of the signal or clipping the input value at the bound in each iteration step. Fig.5, is the designed optimal input sequence with the bounding method 1. The modified algorithm is summarized in Algorithm 2. With the input sequence in Fig.5, FIC of the plant system were calculated as in Fig.6, and Fig.7. It is clear that by the proposed method, we gained the bigger FIC compared to Kawaguchi’s method. From the figures, it can be seen that even by limiting the amplitude in a coercive manner, PI 2 based optimal inputs show better FIC than the compared D-optimal inputs.
Fig. 7. Averaged FICti on time ti with limitations in amplitude(PI 2 (line), compared method(�), random input(broken line)), limitations in amplitude with clipping(PI 2 (o)
Parameter
Fig. 5. With limitations in amplitude: PI 2 (.), compared method(�), random input(∗)
Designation horizon of the optimal inputs : N Length of the optimal inputs:L Amplitude of the disturbance: ε ∼ (0, Σe ) Amplitude of the disturbance in test phase:e Variation of state trajectories in each iteration:K Number of iteration for θ : m Amplitude of the original inputs Weight matrix in cost function : R Weight of function S : h Sampling interval to calculate FIC : 2Δ
value 5 100 ε ∼ (0, 10) e ∼ (0, 100) 10 100 10 diag(0.1, .., 0.1) 1 13(steps)
Table 1. Parameter settings of the PI 2 based optimal input sequence designation
798
IFAC AAC 2016 June 19-23, 2016. Norrköping, Sweden
Hiroaki Ishiyama et al. / IFAC-PapersOnLine 49-11 (2016) 781–786
Algorithm.2
(1) Define simulation horizon N, the length L of input sequence, iteration number M(be sure to set M large enough), and the number of state trajectories K generated in each time ti .
(2) Set an initial input gain vector θ0 . (3) Set noise matrix ε ∈ RN×n×K , where n is the dimension of system input (for example, in case of SISO system, n = 1). (4) Inject K types of noise matrix ε ∈ RN×n to generate K different trajectories. For time scale {τi = t0 , ...ti }, calculate following cost function, weight function, and policy updating.
REFERENCES M. Kawaguchi and M. Yamakita Identification with Physical Error Models for Internal Combustion Engine Proc. of IFAC ACC 2013, pp780-785. Markus Hirsch and Luigi del Re Iterative Identification of Polynomial NARX Models for Complex Multi-Input Systems 8th IFAC Symposium on Nonlinear Control Systems, vol, 2, pp.845-950, 2010 Markus Hirsch and Thomas E.Passenbrunner Extension of Statistic Non-linear DoE Identification Algorithms to Dynamic Systems EUROCAST 2011, part, 2, LNCS 6928, pp.33-40, 2012 Evangelos A. Theodorou and Jonas Buchli and Stefan Schaal A Generalized Path Integral Control Approach to Reinforcement Learning Journal of Machine Learning Research, vol 11, pp.3137-3181,2010
for m = 1:M for i = 1:N-1 for k = 1:K ∑Kk=1
S(τi , k) = φtN + N−1
∑
j=i
exp(− λ1 S(τi )) exp(− λ1 S(τi ))d τi N−1
∑ qt j dt + j=i
1 (θt + εt j )T Mt j T RMt j (θt j + εt j ) 2 j (c) (c) T
Mt j ,k =
R−1 gt j gt j
(c) (c) T trace(R−1 gt gt )
Appendix A. PI 2 ALGORITHM
∈ RN×N
In this section, outline of the PI 2 algorithm is briefly described (for more detail see Evangelos (2010)). PI 2 is a statistical way to solve HJB equation in a non-linear control problem by simulating several state trajectories and computing costminimizing problems. In this algorithm, next five conditions are assumed.
end
δ θi =
K
∑ [P(τi , k)Mti ,k εti ,k ]
(21)
k=1
δ θ (:, i) = δ θti
5. CONCLUSION
In this paper, a new statistical approach to design optimal input sequence is proposed. The proposed algorithm is the combination of D-optimal input design and PI 2 algorithm. In the proposed method, the cost function of PI 2 algorithm is modified to calculate control sequence, which maximizes FIC in statistical context. The proposed method was compared to the previous optimal input design method Kawaguchi (2013) in numerical simulations. The results of the numerical simulation have shown the advantage of the proposed method in the noisy situation.
for j=1:L
P(τi , k) =
785
(22)
• System input is constructed with two parts, the basis function and its gain vector.
end Update policy,
• Disturbance is assumed to be additive to the input gain.
θ (new) = θ (old) + δ θ . (23) for j = 1:N if θ (new) ( j) > C θ (new) ( j) = C θ (new) ( j) = −C else if θ (new) ( j) < −C end end Set the new input gain vector θ (new) to the new initial state vectorθ0 and continue updating the policy.
• System structure is described as equation(1) and the system could be divided into the controllable part ((·)(c) ) and the uncontrollable part ((·)(uc) ).
In equation(1), x denotes the state vector of the system, and g is the basis function vector. ti denotes the i-th time step in a simulation, and m is the number of iteration handled in the end simulation. Besides, ε is the disturbance, and k is the number of trajectories which are generated in each iteraion, and f is the (5) Inject the first element of the obtained optimal input state equation of the plant. (new) sequence θ (1) to the plant model, and set the next state vector as the new initial state vectorθ0 . (uc) (uc) (uc) xti+1 fti+1 xti dt = + (c) (c) (c) fti+1 xti+1 xti end √ 0k×p (A.1) + θti dt + dt εti . (c) gti (·)ti means the value (·) in time ti , for example, xti is a state vector of the plant on time ti , εti denotes the noise vector on time ti, which is induced to the gain vector θti ,and gti is a basis
799
IFAC AAC 2016 786 June 19-23, 2016. Norrköping, Sweden
Hiroaki Ishiyama et al. / IFAC-PapersOnLine 49-11 (2016) 781–786
function vector subject to time ti. Here, a basic mechanism of the PI 2 algorithm is as followings. First, K types of noise vector εti are generated and injected to the system in equation (5.1) to gain state trajectories in time scale τti = [t0 , ...,ti ]. Secondly, stage cost of each trajectory paths are calculated according to the following cost function. N−1
λ S˜ = S + 2 = φtN +
∑ log |Ht j | j=i
N−1
∑ qt j dt j=i
2 (c) (c) xt j+1 − xt j 1 N−1 λ (c) + ∑ dt + − ft j 2 j=i dt 2 −1 Ht
N−1
∑ log |Ht j | j=i
j
2 N−1 1 N−1 (c) T + ) dt ∝ φtN + ∑ qt j dt + ∑ ( θ ε g tj tj 2 j=i t j H −1 j=i tj
= φtN +
+
1 2
N−1
∑ qt j dt j=i
(c) (c) T
gt j gt j
N−1
∑ (θt j + εt j )T j=i
= φtN +
(c) (c) T
trace(gt gt
)
(θt j + εt j )
N−1
∑ qt j dt j=i
1 N−1 + ∑ (θt j + εt j )T Mt j T RMt j (θt j + εt j ). 2 j=i
Finally, policy θti is updated with weight function P(τti ) which is calculated stage cost S as P=
S ∑S
(A.2)
S = exp h
S˜ − S˜min S˜max − S˜min
(A.3)
where S˜max and S˜min are maximum value and minimum value ofS˜ in the iteration each [See Evangelos (2010)]. The policy update [δ θ ]i is defined as, [δ θ ] j = where δ θti =
∑N−1 i=0 (N − i)w j,ti [δ θti ] j ∑N−1 i=0 w j,ti (N − i)
P(τti )Mti εti d τi
(A.4) (A.5)
where j is the j element of the vector θ Evangelos (2010)
θ (new) = θ (old) + δ θ .
(A.6)
800