Numerical Computational Improvement of the Stable-Manifold Method for Nonlinear Optimal Control*

Numerical Computational Improvement of the Stable-Manifold Method for Nonlinear Optimal Control*

Proceedings of the 20th World Congress Proceedings of the 20th World Congress The International Federation of Congress Automatic Control Proceedings o...

439KB Sizes 5 Downloads 66 Views

Proceedings of the 20th World Congress Proceedings of the 20th World Congress The International Federation of Congress Automatic Control Proceedings of the 20th World Proceedings of the 20th World The International Federation of Congress Automatic Control Toulouse, France,Federation July 9-14, 2017 Available online at www.sciencedirect.com The International of Automatic Control The International of Automatic Control Toulouse, France,Federation July 9-14, 2017 Toulouse, France, July 9-14, 2017 Toulouse, France, July 9-14, 2017

ScienceDirect

IFAC PapersOnLine 50-1 (2017) 5103–5108

Numerical Computational Improvement of Numerical Computational Improvement of Numerical Computational Improvement of Numerical Computational Improvement of the Stable-Manifold Method for Nonlinear the Stable-Manifold Method for Nonlinear the Stable-Manifold Method for Nonlinear ⋆⋆for Nonlinear the Stable-Manifold Method Optimal Control ⋆⋆ Optimal Control Optimal Control Optimal Control ∗ ∗

Yasuaki Oishi ∗ Noboru Sakamoto ∗ Yasuaki Oishi ∗ Noboru Sakamoto ∗ Yasuaki Oishi Yasuaki Oishi ∗ Noboru Noboru Sakamoto Sakamoto ∗ ∗ ∗ Department of Mechatronics, Nanzan University, Yamazatocho 18, ∗ Department of Mechatronics, Nanzan University, Yamazatocho 18, Mechatronics, Nanzan University, Yamazatocho ∗ Department of Showa-ku, Nagoya 466-8673, Japan Department of Showa-ku, Mechatronics, Nanzan University, Yamazatocho 18, 18, Nagoya 466-8673, Japan Showa-ku, Nagoya 466-8673, Japan (email: [email protected], [email protected]) Showa-ku, Nagoya 466-8673, Japan (email: [email protected], [email protected]) (email: (email: [email protected], [email protected], [email protected]) [email protected]) Abstract: Two numerical computational techniques are presented for of the Abstract: Two Two numerical numerical computational computational techniques techniques are are presented presented for for improvement improvement of the Abstract: improvement of stable-manifold method, which is effective for nonlinear optimal control. The first technique Abstract: Twomethod, numerical computational techniques areoptimal presented for improvement of the the stable-manifold which is effective for nonlinear control. The first technique stable-manifold method, which is for optimal control. The first is for generation of points on the stable manifold in aa robust against numerical errors. stable-manifold method, which is effective effective for nonlinear nonlinear optimalway control. The first technique technique is for generation of points on the stable manifold in robust way against numerical errors. is for of the manifold in against numerical errors. There, a special numerical method that preserves is used to solve aa differential is for generation generation of points points on on the stable stable manifold Hamiltonian in aa robust robust way way against numerical errors. There, a special special numerical numerical method that preserves preserves Hamiltonian is used used to solve solve differential There, a method that Hamiltonian is to a differential equation sensitive to numerical errors. The second technique is a sort of shooting method to There, a sensitive special numerical method thatThe preserves Hamiltonian is usedoftoshooting solve a differential equation to numerical errors. second technique is a sort method to equation sensitive to numerical errors. The second technique is a sort of shooting method to generate a point corresponding to the desired system state. Again, numerical robustness is an equation sensitive to numerical to errors. The second technique is a sort of shooting method to generate a point corresponding the desired system state. Again, numerical robustness is an generate a corresponding to the desired system state. Again, numerical robustness is issue there. The two techniques are applied to an example shown to be effective. generate a point point corresponding desired state.system Again,and numerical robustness is an an issue there. there. The two two techniques to arethe applied to system an example example system and shown to to be effective. effective. issue The techniques are applied to an system and shown be issue there. two techniques are of applied to an example system and shown to be effective. © 2017, IFACThe (International Federation Automatic Control) Hosting by Elsevier Ltd. All rights reserved. Keywords: nonlinear optimal control, stable-manifold method, Hamiltonian system, numerical Keywords: nonlinear nonlinear optimal optimal control, control, stable-manifold stable-manifold method, method, Hamiltonian Hamiltonian system, system, numerical numerical Keywords: computation, averaged-vector-field method, Gaussian quadrature formula, shooting method, Keywords: nonlinear optimal control, stable-manifold method, Hamiltonian system, numerical computation, averaged-vector-field method, Gaussian quadrature formula, shooting method, computation, averaged-vector-field QR-decomposition. computation, averaged-vector-field method, method, Gaussian Gaussian quadrature quadrature formula, formula, shooting shooting method, method, QR-decomposition. QR-decomposition. QR-decomposition. 1. solution against its theoretical property. Another problem 1. INTRODUCTION INTRODUCTION solution against its theoretical property. Another problem 1. solution against its theoretical property. Another problem is that a flow is produced only after determination of an 1. INTRODUCTION INTRODUCTION solution against its theoretical property. Another problem is that a flow is produced only after determination of an is that a flow is produced only after determination of an initial point and it is not easy to have a flow that gives Nonlinear optimal control has a long history of investiis that point a flowand is produced only after determination of an initial it is not easy to have a flow that gives Nonlinear optimal control has a long history of investiinitial point and it is not easy to have a flow that gives correspondence to a desired state x. It has been necessary Nonlinear optimal control has a long history of investigation mainly through the associated Hamilton–Jacobi– initial point andtoita isdesired not easy tox.have a flow that gives Nonlinear optimal control has a long history of investi- correspondence state It has been necessary gation mainly through the associated Hamilton–Jacobi– correspondence to aainitial desired state x. It been necessary so far to adjust an point by and error in order gation mainly through the Hamilton–Jacobi– Bellman equation (e.g., Al’brekht, 1961; Lukes, 1969; van to desired state x.trial It has has been necessary gation through the associated associated so far to adjust an initial point by trial and error in order Bellmanmainly equation (e.g., Al’brekht, Al’brekht, 1961;Hamilton–Jacobi– Lukes, 1969; 1969; van van correspondence so far to adjust an initial point by trial and error in order to have a desired flow. Bellman equation (e.g., 1961; Lukes, der Schaft, 1991, 1992; Sakamoto, 2002; Navasca & Krener, so far to adjust an initial point by trial and error in order Bellman equation (e.g.,Sakamoto, Al’brekht,2002; 1961;Navasca Lukes, & 1969; van to have a desired flow. der Schaft, 1991, 1992; Krener, to have a desired flow. der Schaft, 1991, 1992; Sakamoto, 2002; Navasca & Krener, 2007). Among the existing approaches, the stable-manifold to have a desired flow. der Schaft, 1991, 1992; Sakamoto, 2002; Navasca & Krener, this paper, two numerical computational techniques are 2007). Among Among the the existing existing approaches, approaches, the the stable-manifold stable-manifold In In this paper, two numerical computational techniques are 2007). method proposed Sakamoto and van der Schaft (2008) 2007). Among the by existing approaches, the stable-manifold In this paper, two numerical computational techniques are presented for the aforementioned two problems. For the method proposed by Sakamoto and van der Schaft (2008) In this paper, two numerical computational techniques are presented for the aforementioned two problems. For the method proposed by Sakamoto and van der Schaft (2008) looks promising and has been applied to many control method proposedand by Sakamoto and van der Schaft control (2008) presented for the aforementioned two problems. For the first problem, we propose the use of a numerical method looks promising has been applied to many presented for the aforementioned two problems. For the first problem, we propose the use of a numerical method looks promising and has been applied to many control problems such as swing-up of pendulum, control with looks promising has been to many control we propose the use method guaranteed to preserve In particular, problems such as asand swing-up of aa aapplied pendulum, control with first first problem, problem, we proposethe theHamiltonian. use of of aa numerical numerical method guaranteed to preserve the Hamiltonian. In particular, problems such swing-up of pendulum, control with saturated input, and nonlinear optimal servo (Sakamoto, problems such as swing-up of a pendulum, control with guaranteed to preserve the Hamiltonian. In particular, the averaged-vector-field method of Quispel and McLaren saturated input, and nonlinear optimal servo (Sakamoto, guaranteed to preserve the Hamiltonian. In particular, the averaged-vector-field method of Quispel and McLaren saturated input, and nonlinear optimal servo (Sakamoto, 2013; Ishikawa & Sakamoto, 2014; Horibe & Sakamoto, saturated input,&and nonlinear2014; optimal servo& (Sakamoto, averaged-vector-field method of and (2008) is proposed to be used, which is considered effec2013; Ishikawa Ishikawa Sakamoto, Horibe Sakamoto, the the averaged-vector-field method of Quispel Quispel and McLaren McLaren (2008) is proposed to be used, which is considered effec2013; & Sakamoto, 2014; Horibe Sakamoto, 2016). The idea of the method is to formulate optimal 2013; Ishikawa & Sakamoto, 2014; Horibe & & an Sakamoto, (2008) is proposed to be used, which is considered effective to a numerically sensitive problem such as long-time 2016). The idea of the method is to formulate an optimal (2008) is proposed to sensitive be used, problem which issuch considered effective to a numerically as long-time 2016). The idea of the method is to formulate an optimal control problem to computation of a stable manifold of the 2016). The idea of the method is to formulate an optimal tive to a numerically sensitive problem such as long-time simulation of the solar system. It may be of independent control problem to computation of a stable manifold of the tive to a numerically sensitive problem such as long-time simulation of the solar system. It may be of independent control problem to computation of a stable manifold of the associated Hamiltonian system in the space of the state x control problem to computation of the a stable of the of the system. It may be of interest that new technique the field of numerical associated Hamiltonian system in in spacemanifold of the the state state x simulation simulation ofsuch the aasolar solar system. It in may be of independent independent that such new technique in the field of numerical associated Hamiltonian system the of and the costate p. Here, a stable invariant associated Hamiltonian system in manifold the space spaceis ofan the state x x interest interest that such a new technique in the field of numerical computation is useful also for optimal control. For the and the costate p. Here, a stable manifold is an invariant interest that such a new technique in the field of numerical computation is useful also for optimal control. For the and the costate p. Here, a stable manifold is an invariant manifold of the system on which the origin becomes an and the costate p. Here, on a stable manifold is an invariant computation is useful also for optimal control. For second problem, a kind of shooting method is proposed. manifold of the system which the origin becomes an computation is useful also for optimal control. For the the second problem, a kind of shooting method is proposed. manifold of the system on which the origin becomes an asymptotically stable equilibrium. In practice, starting manifold of the system on which the origin becomes an problem, aaa kind of method is This basically successive improvement aa flow to asymptotically stable stable equilibrium. equilibrium. In In practice, practice, starting starting second secondis problem, kind of shooting shooting method of is proposed. proposed. This is basically a successive improvement of flow to asymptotically from an initial point close to the origin, we numerically asymptotically stable close equilibrium. In practice, starting This is basically aa successive improvement of aa flow to make its final point to have the x-component close to the from an initial point to the origin, we numerically This is basically successive improvement of flow to make its final point to have the x-component close to the from an initial point close to the origin, we numerically compute a flow of the Hamiltonian system in reversed time from an initial point close to the system origin, in wereversed numerically make its final point to have the x-component close to the desired value. Note that Horibe and Sakamoto (2017) used compute a flow of the Hamiltonian time make its final point to have the x-component close to the desired value. Note that Horibe and Sakamoto (2017) used compute aa flow of system inthis reversed time and obtain a sequence of points. We repeat procedure compute flow of the the Hamiltonian Hamiltonian system reversed time variational desired Note Horibe and (2017) used dynamics for exploration on the stable and obtain obtain sequence of points. points. We We repeatinthis this procedure desired value. value. Note that that Horibe and Sakamoto Sakamoto (2017)maniused variational dynamics for exploration on the stable maniand aaainitial sequence of repeat procedure for several points and compute a manifold so as and obtain sequence of points. We repeat this procedure variational dynamics for exploration on the stable manifold, which plays an important role also in the proposed for several initial points and compute a manifold so as variational dynamics for exploration on the stable manifold, which plays an important role also in the proposed for several initial points and compute a manifold so as to interpolate all the obtained points approximately. This for several initial points and compute a manifold so as method. fold, which plays an role also in proposed Their method, however, does not aim at shooting. to interpolate interpolate all the the obtained points approximately. approximately. This fold, which plays an important important role also in the the proposed method. Their method, however, does not aim at shooting. to all This manifold gives correspondence between x and p and then to interpolate all the obtained obtained points points approximately. This method. Their method, however, does not aim at shooting. The proposed two techniques are applied to a swing-up manifold gives correspondence between x and p and then method. Their method, however, does not aim at shooting. The proposed two techniques are applied to a swing-up manifold gives correspondence between x and p and then an optimal control law. manifold gives correspondence between x and p and then The proposed two techniques are applied to a problem of a pendulum and give interesting results. an optimal control law. The proposed two techniques areinteresting applied toresults. a swing-up swing-up problem of a pendulum and give an optimal control law. an optimal here control law.a flow of the Hamiltonian system is problem of aa pendulum and results. A problem is that problem of pendulum and give give interesting interesting results. The rest of the paper is organized as follows. In Section 2, A problem here is that a flow of the Hamiltonian system is rest of the paper is organized as follows. In Section 2, A problem is a Hamiltonian system numerically difficult to compute because linear approximaA problem here here is that that a flow flow of of the the Hamiltonian system is is The The rest of the paper is organized as follows. In Section 2, the stable-manifold method is reviewed and its two probnumerically difficult to compute because linear approximaThestable-manifold rest of the papermethod is organized as follows. In Section 2, the is reviewed and its two probnumerically difficult to compute because linear approximation of the system has the same number of unstable and numerically difficulthas to compute because linear approximathe stable-manifold method is reviewed and its two problems are pointed out. In order to resolve these problems, tion of the system the same number of unstable and the stable-manifold method is reviewed and its two problems are pointed out. In order to resolve these problems, tion of the system has the same number of unstable and stable modes at any points. Application of the classical tion the system haspoints. the same number of and Hamiltonian lems are out. to these preservation is discussed in Section 33 and aa stableof modes modes at any Application of unstable the classical lems are pointed pointed out. In In order order to resolve resolve these problems, problems, Hamiltonian preservation is discussed in Section and stable at Application of Runge–Kutta often fails to give a precise solution stable modes method at any any points. points. Application of the the classical classical Hamiltonian preservation is discussed in Section 3 and aa shooting method is proposed in Section 4. An example is Runge–Kutta method often fails to give a precise solution Hamiltonian preservation is discussed in4.Section 3 and is shooting method is proposed in Section An example Runge–Kutta method often fails give a solution for aa sufficiently long time range. this case, the value Runge–Kutta method often fails to toIn give a precise precise solution shooting method is proposed in Section 4. An example is presented in each of the two sections. Section 5 concludes for sufficiently long time range. In this case, the value shooting method isof proposed in Section 4. An5example is in each the two sections. Section for aa sufficiently long time range. In case, the value of the Hamiltonian may not be constant the resulting for sufficiently long time In this thison case, value presented presented in each of the two sections. Section 55 concludes concludes the paper. of the the Hamiltonian may notrange. be constant constant on the the resulting presented in each of the two sections. Section concludes the paper. of Hamiltonian may not be on the resulting ⋆ This of maybynot constant on Grant the resulting the work is supported thebeJSPS Kakenhi Number ⋆ the Hamiltonian the paper. paper.

This work is supported by the JSPS Kakenhi Grant Number ⋆ JP15K06157 the Nanzan Research Subsidy This work and is supported by University the JSPS Pache Kakenhi Grant Number ⋆ This work and is supported by University the JSPS Pache Kakenhi Grant Number JP15K06157 the Nanzan Research Subsidy I-A-2 for the academic years 2016 and 2017. JP15K06157 and the Nanzan University Pache Research Subsidy JP15K06157 and the Nanzan University Pache Research Subsidy I-A-2 for the academic years 2016 and 2017. I-A-2 for the academic years 2016 and 2017. I-A-2 for the academic years 2016 and 2017. Copyright © 2017, 2017 IFAC 5274Hosting by Elsevier Ltd. All rights reserved. 2405-8963 © IFAC (International Federation of Automatic Control) Copyright © 2017 IFAC 5274 Copyright ©under 2017 responsibility IFAC 5274Control. Peer review of International Federation of Automatic Copyright © 2017 IFAC 5274 10.1016/j.ifacol.2017.08.777

Proceedings of the 20th IFAC World Congress 5104 Yasuaki Oishi et al. / IFAC PapersOnLine 50-1 (2017) 5103–5108 Toulouse, France, July 9-14, 2017

2. STABLE-MANIFOLD METHOD Consider a nonlinear system x(t) ˙ = f (x(t)) + G(x(t))u(t) (1) having an n-dimensional state x and an m-dimensional input u. Here, f (x) and G(x) are sufficiently smooth functions with f (0) = 0 and their values are an ndimensional vector and an n-by-m matrix, respectively. We would like to obtain an input u(t) that minimizes ∫ ∞ x(t)T Qx(t) + u(t)T Ru(t) dt (2) 0

for some positive-definite matrices Q and R.

The stable-manifold method of Sakamoto and van der Schaft is a method to solve this problem numerically. First, consider a Hamiltonian system ∂H ∂H (x(t), p(t))T , p(t) (x(t), p(t))T (3) x(t) ˙ = ˙ =− ∂p ∂x for the Hamiltonian of the optimal control problem above, 1 H(x, p) = pT f (x) − pT G(x)R−1 G(x)T p + xT Qx, (4) 4 where p is an n-dimensional costate. For this system, we consider an invariant manifold in the (x, p)-space so that the origin is an asymptotically stable equilibrium there. If this manifold is n-dimensional and expressed as {(x, p(x))}, it is called the stable manifold and the function p(x) gives an optimal control law 1 u(t) = − R−1 G(x)T p(x). 2 If the linearization of the original system (1) around x = 0, x(t) ˙ = Ax(t) + Bu(t), has a stabilizing solution P for its Riccati equation P A + AT P − P BR−1 B T P + Q = O, (5) n the stabilizing subspace {(x, 2P x) | x ∈ R } is tangent to the stable manifold at the origin. In the stable-manifold method, the stable manifold is numerically computed. An initial point (x0 , p0 ) is chosen around the origin so as to be on the stable manifold and let evolve in reversed time subject to the Hamiltonian dynamics (3) with a numerical method such as the classical Runge–Kutta method. Since the stable manifold is invariant with respect to the dynamics (3), the produced sequence of points is considered to be on the stable manifold if the computation is precise enough. One repeats this procedure for several initial points and obtain a function p(x) by interpolation of the computed points. A problem of this method is numerical instability in computation of the Hamiltonian dynamics (3). Since the Jacobian of its right-hand side has the same number of unstable and stable modes, it is sensitive to numerical errors both in ordinary time and in reversed time. Although it has a theoretical property that the value of the Hamiltonian (4) is constant on its flow, the value may diverge for a numerical solution. In the stable-manifold method, one has to obtain a solution that goes considerably distant from the origin, which means such numerical instability is harmful. This problem has been circumvented so far by obtaining an initial point precisely on the stable manifold by the use of an iterative method.

Another problem of the stable-manifold method is that a point on the stable manifold is not directly obtained for a given state x but is obtained only after the choice of an initial point. Usually, trial and error are necessary for the choice of an initial point in order to have a point corresponding to the desired x. In the following sections, we make a numerical computational approach to the two problems above. For the first problem, we introduce a numerical method that preserves a Hamiltonian. This method is said to give a flow of a Hamiltonian system in a numerically more stable way than a standard solver of a differential equation. For the second problem, we propose a sort of shooting method. Given a flow of the Hamiltonian system, which is actually a sequence of points, we iteratively improve its initial point so that its final point tends to have a desired x. For the improvement, we consider variational dynamics of the Hamiltonian system at each point in the given sequence and let the subspace tangent to the stable manifold evolve in reversed time. Reaching the final point of the sequence, that subspace shows how the initial point should be moved in order to make the final point have a closer x to the desired one. Some additional techniques are also used for avoidance of numerical instability. 3. HAMILTONIAN PRESERVATION In the stable-manifold method, a flow of the Hamiltonian system (3) is numerically computed in reversed time. More precisely, we fix a step size h > 0, start from an initial point (x0 , p0 ), and iteratively compute a sequence of points (x−k , p−k ), k = 1, 2, . . . , K, so that (x−k , p−k ) approximates (x(−kh), p(−kh)) in the flow of (3). The most standard is the classical Runge–Kutta method, which comT T T T T putes z−k−1 := (xT −k−1 p−k−1 ) from z−k := (x−k p−k ) by ( ∂H )T ∂H (z−k ) − (z−k ) , v1 := ∂p ∂x ( ∂H )T ∂H v2 := (z−k − (h/2)v1 ) − (z−k − (h/2)v1 ) , ∂p ∂x ( ∂H )T ∂H v3 := (z−k − (h/2)v2 ) − (z−k − (h/2)v2 ) , ∂p ∂x ( ∂H )T ∂H v4 := (z−k − hv3 ) − (z−k − hv3 ) , ∂p ∂x z−k−1 := z−k − (h/6)(v1 + 2v2 + 2v3 + v4 ). This method is, however, not guaranteed to preserve a Hamiltonian, that is, the equality H(x−k−1 , p−k−1 ) = H(x−k , p−k ) may not hold for the Hamiltonian H(x, p) in (4) while H(x(t), p(t)) is constant on the exact flow (x(t), p(t)). If there is a numerical method that preserves a Hamiltonian, it is expected to work in a numerically stable manner even for a numerically sensitive system. One of such methods is the averaged-vector-field method proposed by Quispel and McLaren (2008). In our context, their method is written as ∫ 1( )T ∂H ∂H (z(r)) − (z(r)) dr, (6) z−k−1 := z−k − h ∂p ∂x 0 where z(r) := (1 − r)z−k + rz−k−1 . This method preserves a Hamiltonian as below.

5275

Proceedings of the 20th IFAC World Congress Toulouse, France, July 9-14, 2017 Yasuaki Oishi et al. / IFAC PapersOnLine 50-1 (2017) 5103–5108

Table 1. The values of wi and ri for the Gaussian quadrature formula in the cases of ℓ = 2, 3, 4. ri √ 1/2 ± 3/6 1/2 √ 1/2 ± 15/10

ℓ=2 ℓ=3 ℓ=4

1/2 ± 1/2 ±

√ √

21 − 14 21 + 14



/

6/5 14



/

6/5 14

Hamiltonian H(x, p)

1

wi 1/2 4/9 5/18 √ (18 + 30)/72 (18 −



0.5 0 -0.5

30)/72

-1 -0.8

T T T T T Fact. If z−k−1 = (xT −k−1 p−k−1 ) and z−k = (x−k p−k ) satisfy (6), they also satisfy H(z−k−1 ) = H(z−k ).

For completeness, its proof is given in the appendix. Remark. It is well known that a group of numerical methods, called symplectic methods, shows numerically good properties in their application to Hamiltonian systems (Hairer, Lubich, & Wanner, 2006). Symplectic methods, however, do not preserve Hamiltonian. In our context, preservation of Hamiltonian is important because it implies H(x−k , p−k ) = 0 for the obtained (x−k , p−k ), which is nothing but the Hamilton–Jacobi–Bellman equation. This is the reason why we employed the average-vector-field method rather than symplectic methods.  The averaged-vector-field method has second-order precision, that is, the error of the final point (x−K , p−K ) from the exact value (x(−Kh), p(−Kh)) is proportional to h2 when h is made sufficiently small with Kh kept constant. In this sense, it is inferior to the classical Runge–Kutta method, which has fourth-order precision. However, it has an attractive property of Hamiltonian preservation and is used for a numerically sensitive problem such as long-term simulation of the solar system. In order to apply the averaged-vector-field method to our problem, there are some issues to be addressed. First, we need to consider how to carry out the integration in (6). Analytical integration is possible in the case of a polynomial H(x, p), for example. When it is not easy, numerical integration can be used. Here, we use the Gaussian quadrature formula and replace (6) by z−k−1 := z−k −

ℓ ∑ i=1

hwi

( ∂H ∂p

(z(ri )) −

5105

)T ∂H (z(ri )) , (7) ∂x

where wi and ri are some constants chosen so that the integration becomes exact for a polynomial integrant of degree up to 2ℓ − 1. Table 1 gives the values of wi and ri for ℓ = 2, 3, 4 (Davis & Polonsky, 1972). Next, we notice that z−k−1 appears not only in the lefthand side of (7) but also in its right-hand side because z(r) = (1−r)z−k +rz−k−1 . In other words, the method (7) (and also the original method (6)) determines z−k−1 only implicitly and requires us to solve a nonlinear equation for z−k−1 with Newton’s method, for example. Although this means extra computational burden, the convergence is usually fast from a good initial point such as z−k . Moreover, all the computation is made offline in the stablemanifold method and thus computational burden is not a serious problem.

-0.6

-0.4

-0.2

0

time (s) Fig. 1. The values of the Hamiltonian for the sequences produced by the proposed method (solid line) and the classical Runge–Kutta method (dashed line). In the case of the classical Runge–Kutta method, the values of the Hamiltonian are away from zero for the time less than −0.5, which shows unreliability of its result. Example 1. In order to see the effect of the proposed method, we applied it to swing-up control of an pendulum attached to a massless cart following Sakamoto (2013). (Stabilization of a pendulum has been used as an example of nonlinear optimal control by many authors such as Holzh¨ uter (2004), Osinga and Hauser (2006), and Fujimoto and Sakamoto (2011).) Here, the mass of the pendulum is m = 9.21 × 10−2 kg, its length is 2L = 0.30 m, its inertia is J = 6.91 × 10−4 kg · m2 , and the gravitational acceleration is g = 9.81 m/s2 . Then, the dynamics of the pendulum can be written in the form of (1) with ) ( x2 f (x) = mgL sin x1 −mL2 x22 sin x1 cos x1 , G(x) =

(

J+mL2 sin2 x1

0

)

−L cos x1 J+mL2 sin2 x1 = (x1 x2 )T

,

where the state x consists of the angle x1 of the pendulum from the swing-up position and its time derivative x2 and the input u is the force given to the cart. We are to obtain the input u(t) so as to minimize (2) for Q = 0.01I and R = 2. Linearize the dynamics at x = 0 and formulate the Riccati equation (5). For its stabilizing solution P , a point (x0 , 2P x0 ) is close to the stability manifold for a small enough x0 . Here, with x0 = (−0.005 0)T and the step size h = 0.002, we numerically solved the equation (3) in reversed time to obtain a sequence of points on the stability manifold. In order to see the reliability of the result, we evaluated the Hamiltonian (4) at each point in the sequence. Since the Hamiltonian should be preserved and the sequence should converge to the origin in ordinary time, the value of the Hamiltonian should be constantly equal to zero in an ideal situation. In Figure 1, the dashed line shows the value of the Hamiltonian obtained from the classical Runge–Kutta method. Here, the horizontal axis stands for the time −kh. It can be seen that the value considerably deviates from zero for the time less than −0.5. On the other hand, the solid line shows the result of the proposed method with the quadrature formula of ℓ = 4. There,

5276

Proceedings of the 20th IFAC World Congress 5106 Yasuaki Oishi et al. / IFAC PapersOnLine 50-1 (2017) 5103–5108 Toulouse, France, July 9-14, 2017

the Hamiltonian is preserved close to zero up to the time −0.7. (This time can be further extended with a larger ℓ.) The proposed method required more computational time than the classical Runge–Kutta method. Namely, it took 307 s and the classical Runge–Kutta method took 36 s with a computer equipped with Intel Core i7-5600U CPU (2.60 GHz) and 16.0 GByte memory. This increase of computational time, however, should be acceptable in offline computation.  4. SHOOTING METHOD Next, we consider how to make the final point of a sequence have the x-component close to the desired value. Suppose that we have a sequence {(x−k , p−k )}k=0,1,...,K by numerically solving the equation (3) with a step size h > 0 in reversed time. The initial point (x0 , p0 ) is chosen close enough to the stable manifold and close enough to the origin. One way to do this is to use (x0 , 2P x0 ) for a small enough x0 and the stabilizing solution P of the Riccati equation (5). With the initial point close enough to the stable manifold, all the points in the sequence can be regarded to be on the stable manifold as well. Suppose that the final point (x−K , p−K ) of the sequence has the x-component not very distant from the desired value x∗ . Our objective is to iteratively improve the sequence on the stable manifold so that the final point is to have the x-component close enough to x∗ . The algorithm for this objective is presented later. Its idea is to be explained first.

whose final point has the x-component closer to x∗ . We obtain this sequence by solving the nonlinear differential equation (3) from this new initial point. If the final point indeed has the x-component close enough to x∗ , we are done. Otherwise, with the updated sequence of points, we go back to the beginning and consider its perturbation again. This is a basic idea of the proposed method. To make it more practical, we further apply some numerical computational techniques. Since the columns of (U0T V0T )T form a basis of the allowable perturbation subspace, they are better to be made orthonormal for numerical stability. This is especially so noting that the time evolution of the variational dynamics (8) is numerically sensitive because its coefficient matrix has the same number of unstable and stable eigenvalues. We hence apply the QR-decomposition (Golub & Van Loan, 1996) to have ( ) ( ) �0 U U0 R := 0 V0 V�0

� T V� T )T are orthonormal and so that the columns of (U 0 0 R0 is an upper triangular square matrix. For each of k = 1, 2, . . . , K − 1, we similarly apply the QR-decomposition to the matrix resulting from the one-step time evolution. If we remember R−k for all k = 0, 1, 2, . . . , K − 1, it is possible to choose an appropriate ∆x0 so as to make the final point to have the x-component close to x∗ .

Basically, the given sequence {(x−k , p−k )}k=0,1,...,K is perturbed so that the x-component of the final point moves closer to x∗ . Since the initial point (x0 , p0 ) is close to the origin, if the perturbation (∆x0 , ∆p0 ) satisfies ∆p0 = 2P ∆x0 for the stabilizing solution P , the perturbed point (x0 + ∆x0 , p0 + ∆p0 ) can be regarded on the stable manifold.

Sometimes, the step ∆x0 is too large and does not make the final point to have the x-component close to x∗ . In this case, a damping technique is useful. Namely, if the standard step ∆x0 does not improve the final point, we try a shorter step, e.g., (1/2)∆x0 . If it does not make any progress either, try (1/22 )∆x0 and so on.

Next, we are to obtain the corresponding perturbation to the succeeding point in the sequence, i.e., (x−1 , p−1 ). This can be made by letting (∆x0 , ∆p0 ) evolve for the time −h subject to the variational dynamics  2  ∂ H ∂2H ( ) (x0 , p0 )  ( )  ∂p∂x (x0 , p0 ) 2 d ∆x ∂p   ∆x . = 2  ∆p ∂2H ∂ H dt ∆p (x0 , p0 ) − 2 (x0 , p0 ) − ∂x ∂x∂p (8) Since the variational dynamics is linear, its numerical solution can be obtained not for each (∆x0 , ∆p0 ) but for the whole subspace of allowable perturbation {(∆x0 , 2P ∆x0 ) | ∆x0 ∈ Rn }. More specifically, the classical Runge–Kutta method, for example, is applied to each column of the matrix (U0T V0T )T := (I 2P )T and let it evolve for the time −h. The resulting vectors are aligned to T T T compose a matrix (U−1 V−1 ) . This matrix gives the subspace of allowable perturbation to (x−1 , p−1 ) in the form of {(∆x−1 , ∆p−1 ) | ∆x−1 = U−1 ∆x0 , ∆p−1 = V−1 ∆x0 }.

Algorithm. Input: a sequence {(x−k , p−k )}k=0,1,...,K on the stable manifold such that the initial point (x0 , p0 ) is close to the origin and the final point (x−K , p−K ) has the xcomponent not very distant from the desired value x∗ .

The same procedure is repeated to the final point (x−K , p−K ). Since we want to move the x-component close to x∗ , it is reasonable to ask x−K + ∆x−K = x−K + U−K ∆x0 = x∗ to determine ∆x0 . Then the initial point (x0 + ∆x0 , p0 + 2P ∆x0 ) is expected to give a sequence

To combine these ideas, we have the following algorithm.

Output: a sequence {(x−k , p−k )}k=0,1,...,K on the stable manifold such that the final point has the x-component close enough to x∗ .

Procedure: 1. With the stabilizing solution P of the Riccati equation (5), let (U0T V0T )T := (I 2P )T . 2. For k = 0, 1, . . . , K − 1, repeat the following. (1) Apply the QR-decomposition to have ( ) ) ( �−k U U−k R . := −k V−k V�−k

(2) For (x−k , p−k ) in the given sequence, consider the variational dynamics  2  ∂ H ∂2H ( ) (x−k , p−k )  ( )  ∂p∂x (x−k , p−k ) 2 d ∆x ∂p   ∆x = 2  ∆p ∂2H ∂ H dt ∆p (x−k , p−k ) − 2 (x−k , p−k ) − ∂x ∂x∂p

5277

Proceedings of the 20th IFAC World Congress Yasuaki Oishi et al. / IFAC PapersOnLine 50-1 (2017) 5103–5108 Toulouse, France, July 9-14, 2017

it accessible to anybody. The proposed techniques are believed to be useful for this objective.

0 -10

3rd

4th

APPENDIX: PROOF OF FACT

2nd

-20 x2

-30 1st sequence

-40 -50

5107

0

1

2

3

4

5

x1 Fig. 2. The behavior of the proposed shooting method applied to the pendulum. At the fourth iteration, the final point of the sequence almost achieved the desired value x∗ = (4 − 10)T . and apply the classical Runge–Kutta method to compute the time evolution of each column of � T V� T )T for the time −h. The resulting vectors (U −k −k T T V−k−1 )T . are aligned to define a matrix (U−k−1

3. Evaluate and store the present error in the Euclid norm as e := ∥x∗ − x−K ∥. Choose ∆x0 so that x∗ − x−K = U−K R−K+1 · · · R−1 R0 ∆x0 . With (x0 + ∆x0 , p0 + 2P ∆x0 ) as a new initial point, produce a sequence of points by numerically solving the equation (3). If the produced final point (x−K , p−K ) improves the error as ∥x∗ − x−K ∥ < e, accept the produced sequence and proceed to the next step. Otherwise, replace ∆x0 by ∆x0 /2 and produce a sequence again. 4. If the error ∥x∗ − x−K ∥ is close enough to zero, output the present sequence and stop. Otherwise, go back to the step 1.  Example 2. Figure 2 shows a result of the proposed method applied to the preceding pendulum. Our first sequence of points was produced for the initial point with x0 = (−0.01 0.05)T , the step size h = 0.002, and the final time −Kh = −0.45. Suppose that we want a sequence whose final point has x∗ = (4 − 10)T . Application of the proposed method improved gradually a sequence of points. At the fourth iteration, the final point almost had x∗ as its x-component. At each iteration, the value of the Hamiltonian was kept smaller than 0.02. The time required for the whole computation was 52 s.  5. CONCLUSION Two numerical computational techniques are proposed for improvement of the stable-manifold method. The first technique is to preserve the Hamiltonian during generation of points on the stable manifold and to make the point generation robust to numerical errors. The second technique is a shooting method for generation of a point corresponding to a desired system state. The stable-manifold method has been applied to many control problems and shown to be effective. A remaining task is to decrease its know-how factors and to make

� For a function H(r) := H((1 − r)z−k + rz−k−1 ), we have ∫ 1 � dH � � (r) dr. H(1) − H(0) = 0 dr Since ( ∂H ) dz � ∂H dH (r) = (z(r)) (z(r)) (r) dr ∂x ∂p dr ( ∂H ) ∂H = (z(r)) (z(r)) (z−k−1 − z−k ), ∂x ∂p we have H(z−k−1 ) − H(z−k ) ∫ 1( ) ∂H ∂H (z(r)) (z(r)) dr (z−k−1 − z−k ). = ∂x ∂p 0 On the other hand, the equation (6) means ∫ 1( )T ∂H ∂H (z(r)) − (z(r)) dr, z−k−1 − z−k = −h ∂p ∂x 0 which gives H(z−k−1 ) − H(z−k ) = 0. REFERENCES Al’brekht, E. G. (1961). On the optimal stabilization of nonlinear systems. Journal of Applied Mathematics and Mechanics, 25 (5), 1254–1266. Davis, P. J., & Polonsky, I. (1972). Numerical interpolation, differentiation, and integration. In M. Abramowitz & I. A. Stegun (Eds.), Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 10th printing (pp. 875–924). Washington, D. C., USA: National Bureau of Standards. Fujimoto, R., & Sakamoto, N. (2011). The stable manifold approach for optimal swing up and stabilization of an inverted pendulum with input saturation. In Proceedings of the 18th IFAC World Congress, Milan, Italy (pp. 8046–8051). Golub, G. H., & Van Loan, C. F. (1996). Matrix Computations, 3rd ed. Baltimore, USA: Johns Hopkins University Press. Hairer, E., Lubich, C., & Wanner, G. (2006). Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations, 2nd ed. Heidelberg, Germany: Springer. Holzh¨ uter, T. (2004). Optimal regulator for the inverted pendulum via Euler–Lagrange backward integration. Automatica, 40 (9), 1613–1620. Horibe, T., & Sakamoto, N. (2016). Swing up and stabilization of the acrobot via nonlinear optimal control based on stable manifold method. In Proceedings of the 10th IFAC Symposium on Nonlinear Control Systems, Monterey, USA (pp. 380–385). Horibe, T., & Sakamoto, N. (2017). Optimal swing up and stabilization control for inverted pendulum via stable manifold method. IEEE Transactions on Control Systems Technology, to appear. Ishikawa, K., & Sakamoto, N. (2014). Optimal control for control moment gyros: center-stable manifold approach. In Proceedings of the 53rd IEEE Conference on Decision and Control, Los Angeles, USA (pp. 5874–5879).

5278

Proceedings of the 20th IFAC World Congress 5108 Yasuaki Oishi et al. / IFAC PapersOnLine 50-1 (2017) 5103–5108 Toulouse, France, July 9-14, 2017

Lukes, D. L. (1969). Optimal regulation of nonlinear dynamical systems. SIAM Journal on Control and Optimization, 7 (1), 75–100. Navasca, C., & Krener, A. J. (2007). Patchy solutions of Hamilton Jacobi Bellman partial differential equations. In A. Chiuso, A. Ferrante, & S. Pinzoni (Eds.), Modeling, Estimation and Control: Festschrift in Honor of Giorgio Picci on the Occasion of His Sixty-Fifth Birthday (pp. 251–270). Berlin, Germany: Springer. Osinga, H. M., & Hauser, J. (2006). The geometry of the solution set of nonlinear optimal control problems. Journal of Dynamics and Differential Equations, 18 (4), 881–900. Quispel, G. R. W., & McLaren, D. I. (2008). A new class of energy-preserving numerical integration methods. Journal of Physics A: Mathematical and Theoretical, 41 (4), 1–7. Sakamoto, N. (2002). Analysis of the Hamilton–Jacobi equation in nonlinear control theory by symplectic geometry. SIAM Journal on Control and Optimization, 40 (6), 1924–1937. Sakamoto, N. (2013). Case studies on the application of the stable manifold approach for nonlinear optimal control design. Automatica, 49 (2), 568–576. Sakamoto, N., & van der Schaft, A. J. (2008). Analytical approximation methods for the stabilizing solution of the Hamilton–Jacobi equation. IEEE Transactions on Automatic Control, 53 (10), 2335–2350. van der Schaft, A. J. (1991). On a state space approach to nonlinear H∞ control. Systems & Control Letters, 16 (1), 1–8. van der Schaft, A. J. (1992). L2 -gain analysis of nonlinear systems and nonlinear state feedback H∞ control. IEEE Transactions on Automatic Control, 37 (6), 770–784.

5279