Proceedings, 2018 IFAC Workshop on Networked & Autonomous Proceedings, 2018 Air & Space Systems Proceedings, 2018 IFAC IFAC Workshop Workshop on on Networked Networked & & Autonomous Autonomous Proceedings, 2018Santa IFAC Fe, Workshop Networked & Autonomous Air Space Systems June NM USAonAvailable online at www.sciencedirect.com Air & & 13-15, Space 2018. Systems Proceedings, 2018 IFAC Fe, Workshop on Networked & Autonomous Air & Space 2018. Systems June June 13-15, 13-15, 2018. Santa Santa Fe, NM NM USA USA Air & Space Systems June 13-15, 2018. Santa Fe, NM USA June 13-15, 2018. Santa Fe, NM USA
ScienceDirect
IFAC PapersOnLine 51-12 (2018) 136–141
Optimal trade-off analysis for efficiency and Optimal trade-off analysis for efficiency and Optimal trade-off analysis for efficiency and Optimal trade-off analysis rendezvous for efficiency and safety in the spacecraft and Optimal trade-off analysis for efficiency and safety in the spacecraft rendezvous and safety in the spacecraft rendezvous and safety in the spacecraft rendezvous and docking problem safety in the spacecraft rendezvous and docking problem docking docking problem problem problem Abraham docking P. Vinod and Meeko M. K. Oishi ∗∗
Abraham Abraham P. P. Vinod Vinod and and Meeko Meeko M. M. K. K. Oishi Oishi ∗ Abraham P. Vinod and Meeko M. K. Oishi ∗∗ Electrical and Computer Engineering, University New Mexico, Abraham P. Vinod and Meeko M. K. of Oishi Electrical and Computer Computer Engineering, University of Electrical and Engineering, University of New New Mexico, Mexico, Albuquerque, NM 87131 USA (e-mail:
[email protected], Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM 87131 USA (e-mail:
[email protected], Albuquerque, NM 87131 USA (e-mail:
[email protected],
[email protected]) Electrical and Computer Engineering, of New Mexico, Albuquerque, NM 87131 USA (e-mail:University
[email protected],
[email protected])
[email protected]) Albuquerque, NM 87131 USA (e-mail:
[email protected],
[email protected])
[email protected]) Abstract: We study the problem of spacecraft rendezvous and docking, specifically the tradeAbstract: We study the problem of spacecraft rendezvous and and docking, specifically the tradeAbstract: We study theefficiency. problem of spacecraft rendezvous tradeoff between safety and The safety component thisdocking, problemspecifically is to staythe within a Abstract: We study theefficiency. problem of spacecraft rendezvousin and docking, specifically the tradeoff between safety and The safety component in this problem is to stay within off between We safety and efficiency. The safety component inand thisdocking, problem is to stay within aa line-of-sight cone and reach the target at an appropriate time in the presence of stochastic Abstract: study the problem of spacecraft rendezvous specifically the tradeoff between safety andreach efficiency. The safety intime this inproblem is to stay within a line-of-sight cone and and the target target at an an component appropriate the presence presence of stochastic stochastic line-of-sight cone reach the at appropriate inproblem the of disturbance sensing and The efficiency off between due safety and efficiency. The safety intime thiscomponent isin tothis stay within is a line-of-sight coneto and reach the modeling target aterrors. an component appropriate time in the presence of problem stochastic disturbance due to sensing and modeling errors. The efficiency component in this problem is disturbance toand sensing and modeling errors. Thethis efficiency component in this problem is our desire to due minimize control effort or fuel. We appropriate pose problem as the a bi-criterion optimization line-of-sight cone reach the target at an time in presence of stochastic disturbance due to sensing and modeling errors. The efficiency component in this problem is our desire to to minimize minimize control effort or or fuel. fuel. We We pose this problem as aa bi-criterion bi-criterion optimization our desire control effort problem as problem. our recent convexity forpose the stochastic reach-avoid problem when the disturbance to sensing and modeling Thethis efficiency component in thisoptimization problem is our desireUsing to due minimize control effort orresults fuel.errors. We pose this problem as a bi-criterion optimization problem. Using our recent convexity results for the stochastic reach-avoid problem when the the problem. Using ourrestricted recent convexity results forpose thewe stochastic reach-avoid problem when control policies are to open-loop policies, show that this bi-criterion optimization our desire to minimize control effort or fuel. We this problem as a bi-criterion optimization problem. Using our recent convexity results for the stochastic reach-avoid problem when the control policies are restricted restricted to open-loop open-loop policies, wethe show that this this bi-criterion bi-criterion optimization control are to policies, show that optimization problempolicies isUsing convex. The resulting tractability permits exploration of theproblem trade-off between problem. ourrestricted recent convexity results for thewe stochastic when the control policies are to open-loop policies, we show thatreach-avoid this bi-criterion optimization problem is efficiency convex. The resulting tractability permits the exploration of increasing the trade-off trade-off between problem is convex. The resulting tractability permits the exploration of the between safety and in the stochastic system. We also show that while the control control policies are restricted to open-loop policies, we show that this bi-criterion optimization problem is convex. The resulting tractability permits the exploration of the trade-off between safety and efficiency in the the stochastic system. We also show that that while while guarantees, increasing the the control safety in stochastic We show increasing control boundsand need not always translate to ansystem. increase in also thethe attainable decreasing problem is efficiency convex. The resulting tractability permits exploration ofguarantees, the trade-off between safety and efficiency in the stochastic system. We also show thatsafety while increasing the control bounds need not always translate to an increase in the attainable safety decreasing bounds need not always translate to ansystem. increase in also the guarantees, decreasing the control bounds typically translates a decrease in attainable the safety guarantees. safety and efficiency in the stochastic We showattainable thatsafety while increasing the control bounds need not always translate to an to increase in the safety guarantees, decreasing the control control bounds typically translates to decrease in attainable the attainable safety guarantees. the bounds typically translates to aa decrease in the attainable safety guarantees. bounds need not always translate to an increase in the attainable safety guarantees, decreasing the control bounds typically translates to a decrease in the attainable safety © 2018, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd.guarantees. All rights reserved. the control Stochastic bounds typically translates to aoptimization, decrease in the attainable safety guarantees. Keywords: reachability, convex bi-criterion optimization, spacecraft Keywords: Stochastic reachability, convex optimization, optimization, bi-criterion bi-criterion optimization, optimization, spacecraft spacecraft Keywords: Stochastic reachability, convex rendezvous and docking problem Keywords: Stochastic reachability, convex optimization, bi-criterion optimization, spacecraft rendezvous and docking problem rendezvous and docking problem Keywords: reachability, rendezvous Stochastic and docking problem convex optimization, bi-criterion optimization, spacecraft rendezvous and docking problem 1. INTRODUCTION satellite staying within a predetermined safe region (such 1. INTRODUCTION INTRODUCTION satellite staying predetermined (such 1. satellite staying within within predetermined safe region (such as a line-of-sight cone)aaa in the presencesafe of region a stochastic 1. INTRODUCTION satellite staying within predetermined safe region (such as a line-of-sight cone) in the presence of a stochastic as a line-of-sight cone) in the presence of a stochastic disturbance. The goal of this paper is to compute conThe safety-critical1.and expensive nature of space systems satellite INTRODUCTION staying within a predetermined safe region (such as a line-of-sight cone)ofinthis thepaper presence ofcompute a stochastic disturbance. The goal is to conThe safety-critical and expensive nature of space systems disturbance. The goal ofinthis paper is toofcompute conThe safety-critical and expensive natureparamount. of space systems trollers that optimize both the performance and safety of makes the need for reliable autonomy While as a line-of-sight cone) the presence a stochastic disturbance. The goal of this paper is to compute conThe safety-critical and expensive natureparamount. of space systems trollers that optimize both the performance and safety of makes the need for reliable autonomy While trollers that optimize both the performance and safety of makes the need for reliable autonomy paramount. While the spacecraft rendezvous and docking problem (a foursafety must be assured, despite modeling uncertainties and disturbance. The goalboth of this is problem to compute conThe safety-critical expensive natureparamount. of space systems trollers that optimize the paper performance and safety of makes the need forand reliable autonomy While the spacecraft rendezvous and docking (a foursafety must be assured, despite modeling uncertainties and the spacecraft rendezvous and docking problem (a probfoursafety must be assured, despite modeling uncertainties and dimensional discrete-time stochastic optimal control other disturbance forces, it is generally only one of the trollers that optimize both the performance and safety of makes the need for reliable autonomy paramount. While the spacecraft rendezvous and docking problem (a foursafety must be assured, despite modeling uncertainties and dimensional discrete-time stochastic optimal control probprobother disturbance forces, it itthe is design generally only one of of For the dimensional discrete-time stochastic optimal control other disturbance forces, is generally only one the lem). We achieve this via bi-criterion optimization and several factors influencing of a controller. the spacecraft rendezvous and docking problem (a foursafety must be assured, despite modeling uncertainties and discrete-time stochastic optimal control probother disturbance forces, itthe is design generally one of For the dimensional lem). We this via bi-criterion optimization and several factors influencing of aaonly controller. lem). We achieve this viastochastic bi-criterion optimization and several factors influencing of controller. ourachieve recent results the convexity of the safety example, in spacecraft rendezvous and docking maneuvers, dimensional discrete-time optimal control probother disturbance forces, itthe is design generally one of For the exploit lem). We achieve this viaon bi-criterion optimization and several factors influencing the design of aonly controller. For exploit our recent results on the convexity of the safety example, in spacecraft rendezvous and docking maneuvers, exploit our recent results on the convexity of the safety example, in spacecraft rendezvous and docking maneuvers, problem (Vinod and Oishi (2017, 2018)). one may wish to dock safely while minimizing the fuel lem). We achieve this via bi-criterion optimization and several factors influencing the design of a controller. For exploit our recent results on the convexity of the safety example, in spacecraft rendezvous and docking maneuvers, (Vinod and Oishi (2017, 2018)). one may wish to to Such dockmaneuvers safely while while minimizing the fuel fuel problem problemour (Vinod and Oishi on (2017, 2018)). one wish dock safely the costsmay (efficiency). areminimizing crucial in resupplyexploit recent results the convexity of the the safety example, in spacecraft rendezvous and docking maneuvers, problem (Vinod and Oishi (2017, 2018)). one may wish to dock safely while minimizing the fuel Reachability analysis is a useful tool for identifying set costs (efficiency). Such collection maneuversofare are crucial in resupplyresupplycosts (efficiency). maneuvers crucial in ing the space data, satellite repair, Reachability analysis is tool for identifying the (Vinod and Oishi (2017, 2018)). one may wish station, to Such dockcollection safely while minimizing the fuel problem Reachability analysis is aa useful useful tool for identifying the set set costs (efficiency). Such maneuvers are crucial in resupplyof initial states in a dynamical system that can be driven to ing the space station, of data, satellite repair, analysis is a useful tool for identifying the set ing space station, collection ofare data, satellite repair, and the other missions. Typically, this class of in problems is Reachability of states in dynamical system that can be to costs (efficiency). Such maneuvers crucial resupplyof initial initial states in aaset dynamical system that can be driven driven to ing the space station, collection of data, satellite repair, the desired target at an appropriate time while avoiding and other missions. Typically, this class of problems is Reachability analysis is a useful tool for identifying the set initial states in aset dynamical system that can be driven to and other missions. Typically, this class or ofthe problems is of treated either from the safety perspective efficiency the desired target at an appropriate time while avoiding ing the space station, collection of data, satellite repair, the desired target set at an appropriate time while avoiding and other missions. Typically, this class of problems is a set of undesired states (the “reach-avoid” problem). treated either from the safety perspective or the efficiency of initial states in a dynamical system that can be driven to the desired target set states at an appropriate time whileproblem). avoiding treated either from the safety perspective orofimportant the efficiency perspective, but treatment of both of these oba set of undesired (the “reach-avoid” and other missions. Typically, this class problems is atheset of undesired states (the “reach-avoid” problem). treated eitherbut from the safety perspective orimportant the efficiency For stochastic systems, the stochastic reach-avoid problem perspective, treatment of both of these obdesired target set at an appropriate time while avoiding a set of undesired states (the “reach-avoid” problem). perspective, but treatment of both of these important objectives either simultaneously is largely lacking inorimportant the stochastic systems, the reach-avoid problem treated from the safety perspective theliterature. efficiency Forset stochastic systems, the stochastic stochastic reach-avoid problem perspective, but treatment of both of these ob- For formulated and states solved using“reach-avoid” dynamic programming jectives simultaneously is has largely lacking in the the literature. awas of undesired (the problem). For stochastic systems, the stochastic reach-avoid problem jectives simultaneously largely lacking in literature. Model predictive controlis provided efficient controllers was formulated and solved using dynamic programming perspective, but treatment of both of these important obwas formulated and solved using dynamic programming jectives simultaneously is largely lacking in the literature. (Summers and Lygeros (2010) and Abate et al. (2008)). Model predictive control has provided efficient controllers For stochastic systems, the stochastic reach-avoid problem was formulated and solved using dynamic programming Model predictive control has provided efficient controllers to perform these maneuvers (Weiss et al. (2012); Park (Summers and Lygeros (2010) and Abate et al. jectives simultaneously is has largely lacking in the literature. (Summers and Lygeros (2010) anddynamic Abate etprogramming al. (2008)). (2008)). Model predictive control provided efficient controllers The grid-based dynamic programming implementation of to perform these maneuvers (Weiss et al. (2012); Park was formulated and solved using (Summers and Lygeros (2010) and Abate et al. (2008)). to perform these maneuvers (WeissHartley etefficient al. et (2012); Park The et al. (2011); Gavilan et al. (2012); al. (2012); grid-based dynamic programming implementation of Model predictive control has provided controllers The grid-based dynamic programming implementation of to perform these maneuvers (Weiss et al. (2012); Park the stochastic reach-avoid problem, proposed in Abate et al. (2011); (2011); GavilanStochastic et al. al. (2012); (2012); Hartleyhas et al. al. (2012); (Summers and reach-avoid Lygeros and Abate et al.in(2008)). The grid-based dynamic(2010) programming implementation of et al. Gavilan et Hartley et (2012); Weiss et al. (2015)). reachability been used the stochastic problem, proposed Abate to perform these maneuvers (Weiss et al. (2012); Park theal.stochastic reach-avoid problem, proposed in Abate et al. (2011); GavilanStochastic et al. (2012); Hartleyhas et al. (2012); et (2007), suffers from the curse of dimensionality and is Weiss et al. (2015)). reachability been used The grid-based dynamic programming implementation the stochastic reach-avoid problem, proposed in Abate Weiss et al. (2015)). Stochastic reachability has been used to verify the problem, i.e., characterize the initial set of et al. (2007), (2007), suffers from the the curse curse of ofproblems. dimensionality and of is et al. (2011); GavilanStochastic eti.e., al. characterize (2012); Hartley et al. (2012); et al. suffers from dimensionality and is Weiss et al. (2015)). reachability has been used limited to at most three-dimensional Polynomial to verify the problem, the initial set of the stochastic reach-avoid problem, proposed in Abate al. (2007), suffers from the curse ofproblems. dimensionality and is to verify the problem, i.e., characterize initial set of et states from which such maneuvers can be the performed safely limited to at most three-dimensional Polynomial Weiss et al. (2015)). Stochastic reachability has been used limited to at most three-dimensional problems. Polynomial to verify the problem, i.e., characterize the initial set of optimization-based approximate dynamic programming states from which such maneuvers maneuvers can be performed safely et al. (2007), suffers from the curse dynamic ofproblems. dimensionality and is to at most three-dimensional Polynomial states from which such performed safely (Lesser etthe al. (2013); et can al. be (2017); Vinodset and optimization-based approximate programming to verify problem, i.e., characterize the initial of limited optimization-based approximate dynamic programming states from which such Gleason maneuvers can be performed safely approaches have been proposed that have shown moder(Lesser et al. (2013); Gleason et al. (2017); Vinod and limited to at most three-dimensional problems. Polynomial optimization-based approximate dynamic programming (Lesser et al. (2013); Gleason et al. (2017); Vinod and Oishi (2018)). The safety problem, in case, is to design have been proposed have shown moderstates from which such maneuvers be performed safely approaches have been proposed that have However, shown moder(Lesser et al. (2013); Gleason et can al.this (2017); Vinod and approaches ate scalability (Kariotoglou et al.that (2013)). these Oishi (2018)). The safety problem, in this case, is todocking design optimization-based approximate dynamic programming approaches have been proposed that have However, shown moderOishi (2018)). The safety problem, in this case, is to design controllers that maximize the probability of the ate scalability (Kariotoglou et al. (2013)). these (Lesser et al. (2013); Gleason et al. (2017); Vinod and ate scalability (Kariotoglou et al. (2013)). However, these Oishi (2018)). The safety problem, in this case, is to design approaches typically provide an overapproximation of the controllers that maximize the probability of the docking approaches have been proposed that have shown moderate scalability (Kariotoglou etanal.overapproximation (2013)). However, of these controllers that maximize the probability of the typically provide the Oishi The safety problem, in this case, is todocking design approaches approaches typically provide an overapproximation of the controllers that the probability of docking stochastic reach-avoid set, resulting in nonconservative This (2018)). material is maximize based upon work supported by the the National ate scalability (Kariotoglou et al. (2013)). However, these approaches typically provide an overapproximation of the stochastic reach-avoid set, resulting in nonconservative This material is based upon work supported by the National controllers that maximize the probability of the docking stochastic reach-avoid set, resulting in nonconservative Science Foundation and by the Air Force Research Lab. Oishi and This material is based upon work supported by the National solutions. The spacecraft rendezvous and docking problem approaches typically provide an overapproximation of the This material is based upon work supported by the National stochastic reach-avoid set, resulting in nonconservative Science are Foundation andunder by the the Air Grant Force Research Research Lab. Oishi Oishi and and solutions. The spacecraft rendezvous and docking Vinod supported NSF Number CMMI-1254990 Science Foundation and by Air Force Lab. solutions. The spacecraft rendezvous and docking problem This material was approximated usingset, particle filters and convexproblem chance stochastic reach-avoid resulting in nonconservative is and based work supported by theOishi National Science Foundation byupon the Air Force Research Lab. and solutions. The spacecraft rendezvous and docking problem Vinod are supported under NSF Grant Number CMMI-1254990 was approximated using particle particle filters and convex convex chance (CAREER, Oishi) and under CNS-1329878 and Oishi is supported under Vinod are supported NSF Grant Number CMMI-1254990 was approximated using filters and chance constrained optimization techniques inand Lesser et al.problem (2013). Science Foundation andunder by the Air Grant Force Research Lab. Oishi and solutions. The spacecraft rendezvous docking Vinod are supported NSF Number CMMI-1254990 (CAREER, Oishi) and and supported under was approximated using particle filters and convex chance constrained optimization techniques in Lesser et al. (2013). AFRL Grant Number FA9453-17-C-0087. Any is findings, (CAREER, Oishi) and CNS-1329878 CNS-1329878 and Oishi Oishi isopinions, supported under constrained optimization techniques in Lesser et al. (2013). The underlying approximation used Lesser et al. (2013) Vinod are supported under NSF Grant Number CMMI-1254990 (CAREER, Oishi) and CNS-1329878 and Oishi is supported under was approximated using particle filters and convex chance AFRL Grant FA9453-17-C-0087. Any findings, constrained optimization techniques in Lesser et al. (2013). and conclusions or recommendations expressed this material are AFRL Grant Number Number FA9453-17-C-0087. Any inopinions, opinions, findings, The underlying approximation used in Lesser et al. (2013) The underlying approximation used in Lesser et al. (2013) (CAREER, Oishi) and CNS-1329878 and Oishi is supported under was shown to be an underapproximation, and a Fourier AFRL Grant Number FA9453-17-C-0087. Any opinions, findings, and or expressed in are constrained optimization techniques et (2013). The underlying approximation used in in Lesser Lesser et al. al. (2013) those of the authors and do not necessarily reflect the material views of the and conclusions conclusions or recommendations recommendations expressed in this this material are was shown to be an underapproximation, and a Fourier AFRL Grant Number FA9453-17-C-0087. Any opinions, findings, was shown to be an underapproximation, and a Fourier and conclusions or recommendations expressed in this material are transform-based grid-free and recursion-free solution to those of the authors and do not necessarily reflect the views of the The underlying approximation used in Lesser et al. (2013) National Science Foundation. those of the authors and do not necessarily reflect the views of the was shown to be an underapproximation, and a Fourier transform-based grid-free and recursion-free solution to and conclusions or recommendations expressed in this are those of the authors and do not necessarily reflect the material views of the transform-based grid-free andproposed recursion-free solution to National Foundation. this underapproximation was (Vinod and Oishi was shown to be an underapproximation, and a Fourier M. Oishi Science is the corresponding National Science Foundation. author for this work. transform-based grid-freewas andproposed recursion-free solution to this underapproximation (Vinod and Oishi those of the authors and do not necessarily reflect the views of the National Science Foundation. M. Oishi is the corresponding author for this work. this underapproximation was proposed (Vinod and Oishi M. Oishi is the corresponding author for this work. transform-based grid-freewas andproposed recursion-free to this underapproximation (Vinodsolution and Oishi National Foundation. author for this work. M. Oishi Science is the corresponding M. Oishi is©the corresponding author for this work. Copyright 2018 IFAC 136 this underapproximation was proposed (Vinod and Oishi ∗ ∗ ∗ ∗ ∗
2405-8963 © 2018, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved. Copyright 2018 136 Copyright © under 2018 IFAC IFAC 136 Control. Peer review© responsibility of International Federation of Automatic Copyright © 2018 IFAC 136 10.1016/j.ifacol.2018.07.101 Copyright © 2018 IFAC 136
2018 IFAC NAASS June 13-15, 2018. Santa Fe, NM USA
Abraham P. Vinod et al. / IFAC PapersOnLine 51-12 (2018) 136–141
(2017)). The Fourier transform-based underapproximation was demonstrated to be faster and less conservative to the particle filter and chance-constrained approaches (Vinod and Oishi (2018)). We rely upon this underapproximation technique in our analysis here. Since the stochastic reachavoid problem inherently does not optimize for any other cost function apart from safety, the corresponding optimal controller may not be efficient with respect to any performance cost other than safety. Therefore, we perform a trade-off analysis between safety and performance in the spacecraft rendezvous and docking problem. Researchers have recently examined multi-objective optimization with safety as one of the objectives. Lexicographic optimization techniques have been used in performing multi-objective optimization in space applications (Dueri et al. (2016)), and a in more general POMDP framework (Lesser and Abate (2017)). We have analyzed the problem of path-planning (minimizing a trajectory cost) while ensuring probabilistic safety from stochastic obstacles (HomChaudhuri et al. (2017)). In this paper, we pose the problem of guaranteeing both safety (through an underapproximation) and performance as a bi-criterion optimization problem. Building on the convexity results of the Fourier transform-based underapproximation presented in Vinod and Oishi (2018), we show that this problem is convex resulting in a tractable implementation. This analysis quantifies the balance between the desire of lower control effort (performance) and higher probability of constraint satisfaction (safety). Our main contributions are: (1) formulate a bi-criterion optimization problem that optimizes simultaneously the performance and the safety of the system (Sec. 3), (2) provide sufficient conditions for this problem to be convex allowing a tractable implementation (Sec. 3), (3) construct an optimal trade-off curve between performance and safety (Sec. 4.2), and (4) investigate the effect of control bounds on the stochastic reach-avoid maneuvers (Sec. 4.3). We discuss strategies to implement the Fourier transformbased underapproximation in Sec. 3.1. We formulate our problem statement and revisit existing results in the literature in Sec. 2, provide numerical values used in simulations in Sec. 4.1, and provide conclusions and directions for future work in Sec. 5. 2. PROBLEM FORMULATION We denote a discrete-time time interval by N[a,b] for a, b ∈ N and a ≤ b, which inclusively enumerates all integers in between (and including) a and b, random variables/vectors with bold case, non-random vectors with an overline, and concatenated random variables/vectors as bold case with overline. We denote the Cartesian product of the set S with itself k ∈ N times as S k . 2.1 Spacecraft Rendezvous and Docking We consider two spacecraft in the same circular orbit. One spacecraft, referred to as the deputy, must approach and dock with another spacecraft, referred to as the 137
137
chief, at a specified time (the control time horizon) while remaining in a line-of-sight cone. The line-of-sight cone is the region where accurate sensing of the deputy is possible. As given in Wiesel (1989), the relative planar dynamics are described by the Clohessy-Wiltshire-Hill (CWH) equations, Fx x ¨ − 3ωx − 2ω y˙ = (1a) md Fy (1b) y¨ + 2ω x˙ = md The chief is located at the origin, the position of the deputy is x, y ∈ R, ω = µ/R03 is the orbital frequency, µ is the gravitational constant, and R0 is the orbital radius of the chief spacecraft. We define the state as x = [x y x˙ y] ˙ ∈ R4 and the input as u = [Fx Fy ] ∈ U ⊂ R2 . We discretize the dynamics (1) in time, via zero-order hold, to obtain the discrete-time LTI system and add a Gaussian disturbance to account for the modeling uncertainties and the disturbance forces, (2) xk+1 = Axk + Buk + wk with wk ∈ R4 as an IID Gaussian zero-mean random process with a known covariance matrix Σw . We denote the known initial condition of (2) as x0 ∈ R4 and denote the time horizon using N ∈ N. 2.2 Bi-criterion optimization A general bi-criterion optimization problem can be stated as J1 (y) 2 minimize (w.r.t. R+ ) (3) π J2 (y) subject to y ∈ Y.
where Y ⊆ Rn and Ji : Y → R+ . The objective in (3) is ordered using generalized inequalities defined on R2+ (Boyd and Vandenberghe, 2004, Sec. 2.4). As discussed in (Boyd and Vandenberghe, 2004, Sec. 4.7), bi-criterion optimization problems typically do not have an optimal solution since Ji are rarely non-competing. In other words, in order to solve (3), compromises have to be made among the objectives. In these cases, the minimal elements of the set of achievable values of the optimization problem are used. Recall that minimal elements (w.r.t R2+ ) of a set means that no other solution of the set lies to the left and below the minimal element of interest (e.g. (2.17) in Boyd and Vandenberghe (2004)). These solutions are known as Pareto optimal solutions and the set of Pareto optimal solutions is called the optimal trade-off curve. This curve is typically obtained via scalarization of (3). For some λ1 , λ2 ≥ 0, we have J1 (y) minimize [λ1 λ2 ] = λ1 J1 (y) + λ2 J2 (y) (4) π J2 (y) subject to y ∈ Y. We obtain the trade-off curve by solving (4) for each pair λ1 , λ2 ≥ 0. Note that λ1 and λ2 can not simultaneously be equal to zero. To facilitate analysis, we define a relative weight λ1 ≥0 (5) λ= λ2
2018 IFAC NAASS 138 13-15, 2018. Santa Fe, NM USA June
and simplify (4) as minimize π subject to
Abraham P. Vinod et al. / IFAC PapersOnLine 51-12 (2018) 136–141
λJ1 (y) + J2 (y) y ∈ Y.
maximize
(6)
We solve (6) for every λ ≥ 0 to obtain the optimal trade-off curve. The optimal trade-off curve consists of the optimal solutions to (3) under varying degree of compromises, modeled via λ, between the objectives. 2.3 Probabilistic safety via stochastic reach-avoid problem Define a Markov policy π = (µ0 , µ1 , . . . , µN −1 ) ∈ M as a sequence of universally measurable state-feedback laws, µ : X → U. The random vector X = [x 1 x2 . . . xN ] generated from x0 under the action of π has a probability measure PxX0 ,π as described in Summers and Lygeros (2010) and Vinod and Oishi (2017). Further, define the terminal time probability, rˆxπ0 (S, T ), for known x0 and π, as the probability that the execution with policy π is inside the target set T at time N and stays within the safe set S for all time up to N . From Summers and Lygeros (2010), rˆxπ0 (S, T ) = PxX0 ,π ∀k ∈ N[0,N −1] xk ∈ S ∧ xN ∈ T . (7)
From (Summers and Lygeros, 2010, Def. 10), a Markov policy π ∗ is a maximal reach-avoid policy in the terminal sense if and only if it is the optimal solution of (8), ∗
rˆxπ0 (S, T ) = sup rˆxπ0 (S, T ).
(8)
π∈M
A dynamic programming-based solution of (8) is given in (Summers and Lygeros, 2010, Thm. 11). However, implementation of a grid-based dynamic programming solution suffers from the curse of dimensionality, preventing its use in this problem. 2.4 Fourier transform-based underapproximation of the stochastic reach-avoid problem We next propose a grid-free and recursion-free scalable underapproximation to the optimal value function of (8). Define an open-loop policy as ρ : X → U N that maps an initial condition x0 ∈ X to a input vector concatenated N over time, such that ρ(x0 ) = [u for 0 u1 . . . uN −1 ] ∈ U each x0 ∈ X . Given x0 and W = [w0 w1 . . . wN −1 ] ∈ W N , we obtain (9) X = Ax0 + Hρ(x0 ) + GW . The matrices A, H, G are given by specific combinations of the matrices A and B (see (Skaf and Boyd, 2010, Sec. 2)). Here, X is defined using a known ρ and x0 has a probability measure PxX0 ,ρ . In Vinod and Oishi (2017), an underapproximation for (8) for a given x0 ∈ X was proposed as ∗
rˆxρ0 (S, T ) =
sup
ρ(x0 )∈U N
rˆxρ0 (S, T )
(10)
with decision variable ρ(x0 ), and rˆxρ0 (S, T ) = PxX0 ,ρ ∀k ∈ N[0,N −1] xk ∈ S ∧ xN ∈ T . (11) The problem (10) can be equivalently written as
138
ψX (Z; x0 , ρ)dZ
(12) subject to ρ(x0 ) ∈ U N where ψX is the probability density function of X. The probability density function ψX may be obtained using Fourier transforms as described in et al. (2017). For Vinod a Gaussian disturbance wk ∼ N 0, Σw , (13a) X ∼ N mX , ΣX ρ(x0 )
S N −1 ×T
mX = Ax0 + Hρ(x0 )
(13b)
ΣX = G(IN ⊗ Σw )G . (13c) For polytopic S and T , the objective of (12) thus simplifies to an integral of a Gaussian distribution over a polytope. Lemma 1. (Vinod et al., 2017, Thm. 3) Given x0 ∈ X , if ψw is a log-concave and continuous density, the sets X , S, and T are Borel and convex, and the set U is compact and convex, then (12) is a log-concave optimization problem. Note that the underapproximation given in (12) is conservative. In other words, given an optimal solution ρ∗ of (12), there exists a Markov controller that provides an equal or better probabilistic safety guarantee than ρ∗ . 2.5 Problem statement We apply these techniques to the problem of spacecraft rendezvous and docking. Hence we consider the relative dynamics of two spacecraft as in (2) in the following. Problem 2. Formulate a tractable bi-criterion convex optimization problem to simultaneously maximize the terminal time probability and minimize the control effort for the spacecraft rendezvous and docking problem. Problem 3. Compute the optimal trade-off curve for Problem 2. Problem 4. Characterize the influence of ubound on Prob2 lem 2 when U = [−ubound , ubound ] . 3. BI-CRITERION STOCHASTIC OPTIMAL CONTROL The bi-criterion stochastic optimal control problem is Jx0 (π) 2 minimize (w.r.t. R+ ) π − log rˆxπ0 (S, T )) subject to π∈M (14) where Jx0 (π) : M → R+ is the fuel cost (minimize for efficiency) associated with a Markov policy π. Note that − log rˆxπ0 (S, T )) ensures that the bi-criterion objective in (14) may be compared using generalized inequalities defined on R2+ , while still allowing for maximization of the terminal time probability. Due to the computational challenge in computing rˆxπ0 (S, T ), we only consider open-loop policies, and use a convex function Lx0 (ρ) : U N → R+ for the fuel cost, Lx0 (ρ) 2 minimize (w.r.t. R+ ) ρ(x0 ) . (15) − log rˆxρ0 (S, T ) subject to
ρ(x0 ) ∈ U N
2018 IFAC NAASS June 13-15, 2018. Santa Fe, NM USA
Abraham P. Vinod et al. / IFAC PapersOnLine 51-12 (2018) 136–141
139
Theorem 5. Given x0 ∈ X , if ψw is a log-concave and continuous density, the sets X , S, and T are Borel and convex, the set U is compact and convex, and the function Lx0 (ρ) is convex over U N , then (15) is a convex optimization problem. Proof: A bi-criterion optimization problem is convex if each of its objectives are convex and the constraints are convex (see Sec. 4.7.5 of Boyd and Vandenberghe (2004)). Lemma 1 and convexity of Lx0 (ρ) completes the proof. Theorem 5 solves Problem 2. On scalarization of (15), we obtain using (5) rxρ0 (S, T )) minimize λLx0 (ρ) − log(ˆ ρ(x0 )
subject to ρ(x0 ) ∈ U N
.
(16)
2
For each λ ≥ 0, (16) is a convex optimization problem. In this paper, we consider the fuel cost to be Lx0 (ρ) = ρ(x0 )2
Fig. 1. Pareto optimal curve for x0 = [0.75 − 0.75 0 0] with the corresponding sets of achievable bi-criterion objective values compared to which they are minimal.
(17)
which is a convex function (Boyd and Vandenberghe, 2004, Sec 3.1.5). Note that construction of the optimal trade-off curve is highly parallelizable due to the recursion-free property of the Fourier-transform approach and the structure in (16). The optimal trade-off curve can be made arbitrarily accurate by evaluating (16) for more values of λ. The curve provides several insights on the trade-offs present in bi-criterion optimization, as discussed in Sec. 4.7.5 of Boyd and Vandenberghe (2004). We explore some of these insights in Sec. 4. 3.1 Implementation
minimize ρ(x0 )2 ρ(x0 ) (18) mX ∈ S N −1 × T . subject to N ρ(x0 ) ∈ U using mX defined in (13b). Problem (18) optimizes for a safe mean trajectory while minimizing the fuel cost. A safe mean trajectory would typically result in higher values for the safety probability rˆxρ0 (S, T ). Empirically, we found this approach to serve as a good initial guess for patternsearch. For polytopic S and T , (18) is a quadratic program. One may consider relaxations of (18) when it is infeasible. 4. TRADE-OFF ANALYSIS In this section, we compute the Pareto optimal curve for the problem (15). We first discuss the numerical values used in the simulation and then perform trade-off analysis. 4.1 Numerical values
Despite establishing that (16) is a convex optimization problem, two major challenges arise:
We consider a diagonal disturbance covariance matrix Σw = 10−4 ×diag(1, 1, 5×10−4 , 5×10−4 ), and time horizon (1) Unavailability of a closed-form expression for rˆxρ0 (S, T ), of N = 5. We define the target and safe sets as and T = z ∈ R4 : |z1 | ≤ 0.1, −0.1 ≤ z2 ≤ 0, ρ (2) Dependence of rˆx0 (S, T ) on the tail of the distribution |z3 | ≤ 0.01, |z4 | ≤ 0.01} (19) ψX for large regions of U N . 4 S = z ∈ R : |z1 | ≤ z2 , |z3 | ≤ 0.05, |z4 | ≤ 0.05 . (20) The first challenge arises from the fact that rˆxρ0 (S, T ) is an integral of a Gaussian over a polytope for which no closed- We analyze the initial condition x0 = [0.75 − 0.75 0 0] . form expression is available. We mitigate this challenge by Note that x0 lies on the boundary of S implying that the using Genz’s algorithm (see Genz (2017)) to compute an problem of ensuring safety is non-trivial (see Figure 3). approximation of rˆxρ0 (S, T ). Genz’s algorithm uses quasi- All computations were performed using MATLAB on an Monte-Carlo simulations and Cholesky decomposition to Intel core i7-4600 processor with 2.1GHz clock rate and 8 evaluate this integral (see Genz (1992)). Since Genz’s GB RAM. The codes used for this paper are available at algorithm provides a noisy evaluation of rˆxρ0 (S, T ), we https://github.com/unm-hscl/abyvinod-NAASS2018. use MATLAB’s patternsearch solver instead of fmincon as suggested in Vinod and Oishi (2017). 4.2 Trade-off analysis for a specific x0 ∈ S The second challenge arises from the fact that if ρ(x0 ) 2 does not drive the system towards satisfying the reach- We define the input space as U = [−0.1, 0.1] . Figure 1 ρ avoid constraints, then rˆx0 (S, T ) will depend heavily on shows the Pareto optimal curve corresponding to the the tail of the distribution ψX . Recall that Monte-Carlo optimization problem (3), and Figure 2 shows the tradesimulations do not model distribution tails well. Further, off between performance and safety. In Figure 2, we see rˆxρ0 (S, T ) will be close to zero for large regions of U N caus- that the maximum terminal time probability that may ing convergence issues for the solver. We partially mitigate be attained for the given x0 and U is 0.85. We do not this problem by appropriately initializing the solver when plot the Pareto optimal solution corresponding ∗to λ = 0, ρ solving (16). We solve the following optimization problem since for ρ∗ (x0 )2 = 0, the corresponding rˆx0 (S, T ) is to obtain the initialization for (16), insignificant, as seen in Figure 3. 139
2018 IFAC NAASS 140 13-15, 2018. Santa Fe, NM USA June
Abraham P. Vinod et al. / IFAC PapersOnLine 51-12 (2018) 136–141
Fig. 2. Trade-off curve between safety and performance for x0 = [0.75 − 0.75 0 0] .
Fig. 5. Influence of bounded control on safety obtained in this analysis is an underapproximation and may be further improved upon using a Markov policy. 4.3 Influence of the bounded control on safety 2
We define the input space as U = [−ubound , ubound ] and consider ubound ∈ {0.05, 0.075, 0.1, 0.5}. Figure 5 shows the Pareto optimal curve for varying ubound at x0 = [0.75 − 0.75 0 0] . Fig.
3. Pareto optimal trajectories from [0.75 − 0.75 0 0] for time horizon N = 5.
x0
=
The Pareto optimal curves for ubound = 0.1 and ubound = 0.5 are very similar, indicating that moderate increases in control bounds from ubound = 0.1 will not translate to an increase in safety. On the other hand, even a small decrease in the control bounds from ubound = 0.1 show a significant decrease in the achievable terminal time probability. This characterizes the effect of control saturation in the given reach-avoid problem. From Figure 5, we need ubound ≥ 0.75 to obtain a probabilistic safety guarantee of 0.8. We thus obtain empirical lower bounds on ubound to ensure a desired probabilistic safety guarantee. This solves Problem 4. 5. CONCLUSION
Fig. 4. Computation time for various values of λ (see (5)). The computation time was 44.92 seconds for λ1 = 0 (λ = 0) and 0.03 seconds for λ2 = 0 (λ = ∞). The shaded regions in Figure 1 are the sets corresponding to the positive cone R2+ shifted to the Pareto optimal solutions. These regions cover the achievable bi-criterion objective values that are sub-optimal compared to the respective Pareto optimal solutions. We see that a slight decrease in the desired terminal time probability guarantee provides a compromise that is optimal with respect to a significantly larger portion of the achievable bi-criterion objective values. The fuel cost may be reduced even further if the desired terminal time probability can be further relaxed. Figure 3 shows the trajectories for the optimal solutions corresponding to the compromises, a solution that preferentially weights safety, a solution that preferentially weights efficiency, and the solution to (18). Figure 4 shows the computational time taken for evaluating (16) at different λ values. The total computation time for 17 evaluations of (16) was approximately 59 minutes. This analysis solves Problem 3 and may be done for any other x0 ∈ S as well. Note that the safety probability 140
In this paper, we used bi-criterion optimization and stochastic reachability to analyze the trade-off between safety and performance in a stochastic system, namely the spacecraft rendezvous and docking problem. Solving a stochastic reach-avoid problem typically yields controllers that incur large control costs since the control design does not optimize for control effort. We discuss an approach to compute optimal trade-off curves that are conservative with respect to safety, and help minimize control effort while maximizing the terminal time probability. We also show the effect of bounded control authority on the given reach-avoid problem. REFERENCES Abate, A., Amin, S., Prandini, M., Lygeros, J., and Sastry, S. (2007). Computational approaches to reachability analysis of stochastic hybrid systems. In Proc. Hybrid Syst.: Comput. and Ctrl., 4–17. Abate, A., Prandini, M., Lygeros, J., and Sastry, S. (2008). Probabilistic reachability and safety for controlled discrete time stochastic hybrid systems. Automatica, 44(11), 2724–2734. Boyd, S. and Vandenberghe, L. (2004). Convex optimization. Cambridge Univ. Press.
2018 IFAC NAASS June 13-15, 2018. Santa Fe, NM USA
Abraham P. Vinod et al. / IFAC PapersOnLine 51-12 (2018) 136–141
Dueri, D., Leve, F.A., and A¸cikmese, B. (2016). Minimum error dissipative power reduction control allocation via lexicographic convex optimization for momentum control systems. IEEE Trans. Contr. Sys. Techn., 24(2), 678–686. Gavilan, F., Vazquez, R., and Camacho, E.F. (2012). Chance-constrained model predictive control for spacecraft rendezvous with disturbance estimation. In Control Engineering Practice. Genz, A. (1992). Numerical computation of multivariate normal probabilities. J. of Comp. and Graph. Stat., 1(2), 141–149. Genz, A. (2017). QSCMVNV. URL http://www.math. wsu.edu/faculty/genz/software/matlab/qscmvnv. m. Gleason, J., Vinod, A., and Oishi, M. (2017). Underapproximation of reach-avoid sets for discrete-time stochastic systems via Lagrangian methods. In IEEE Conf. Dec. Ctrl. (accepted). [Online]. Available: https: //arxiv.org/abs/1704.03555. Hartley, E.N., Trodden, P.A., Richards, A.G., and Maciejowski, J.M. (2012). Model predictive control system design and implementation for spacecraft rendezvous. In Control Engineering Practice. HomChaudhuri, B., Vinod, A.P., and Oishi, M. (2017). Computation of forward stochastic reach sets: Application to stochastic, dynamic obstacle avoidance. In Proc. American Ctrl. Conf. Seattle, WA. Kariotoglou, N., Summers, S., Summers, T., Kamgarpour, M., and Lygeros, J. (2013). Approximate dynamic programming for stochastic reachability. In Proc. European Ctrl. Conf., 584–589. Lesser, K. and Abate, A. (2017). Multiobjective optimal control with safety as a priority. IEEE Trans. Control Syst. Technol., PP(99), 1–13. Lesser, K., Oishi, M., and Erwin, R. (2013). Stochastic reachability for control of spacecraft relative motion. In IEEE Conf. Dec. Ctrl., 4705–4712. Park, H., Di Cairano, S., and Kolmanovsky, I. (2011). Model predictive control for spacecraft rendezvous and docking with a rotating/tumbling platform and for debris avoidance. In Proc. American Ctrl. Conf. Skaf, J. and Boyd, S. (2010). Design of affine controllers via convex optimization. IEEE Trans. Auto. Ctr., 55(11), 2476–87. Summers, S. and Lygeros, J. (2010). Verification of discrete time stochastic hybrid systems: A stochastic reachavoid decision problem. Automatica, 46(12), 1951–1961. Vinod, A.P. and Oishi, M. (2017). Scalable underapproximation for the stochastic reach-avoid problem for highdimensional LTI systems using fourier transforms. IEEE Ctrl. Syst. Lett., 1(2), 316–321. Vinod, A.P., HomChaudhuri, B., and Oishi, M. (2017). Forward stochastic reachability analysis for uncontrolled linear systems using Fourier transforms. In Hybrid Systems: Comp. & Control, 35–44. Vinod, A.P. and Oishi, M. (2018). Exploiting compactness and convexity of reach-avoid sets for discrete-time stochastic LTI systems for a scalable polytopic underapproximation. In Proc. Hybrid Syst.: Comput. and Ctrl. (submitted). Weiss, A., Kolmanovsky, I., Baldwin, M., and Erwin, R.S. (2012). Model predictive control of three dimensional 141
141
relative motion. In Proc. American Ctrl. Conf. Weiss, A., Baldwin, M., Erwin, R.S., and Kolmanovsky, I. (2015). Model Predictive Control for Spacecraft Rendezvous and Docking: Strategies for Handling Constraints and Case Studies. IEEE Transactions on Control Systems Technology, 23(4), 1638–1647. Wiesel, W. (1989). Spaceflight Dynamics. McGraw-Hill, New York.