17th IFAC Workshop on Control Applications of Optimization
Yekaterinburg, Russia, October 15-19, 2018
Available online at www.sciencedirect.com
ScienceDirect
IFAC PapersOnLine 51-32 (2018) 428–433
Algorithms for the Parametric Optimization of Nonlinear Systems Based on the Conditions of Optimal System*
Valery Afanas’ev*. Anna Presnova**

*Moscow Institute of Electronics and Mathematics, National Research University "Higher School of Economics" ([email protected])
** Moscow Institute of Electronics and Mathematics, National Research University "Higher School of Economics" ([email protected])

*This work (research grant №16-08-00552) was supported by Russian Foundation for Basic Research, project 16-08-00552

Abstract: The problem of optimal control is formulated for a class of nonlinear objects that can be represented as objects with a linear structure and parameters that depend on the state. For the synthesis of optimal control, i.e. of the parameters of the regulator, the linear structure of the transformed nonlinear system and the quadratic quality functional allow a move from the search for solutions of the Hamilton-Jacobi equation to an equation of the Riccati type with parameters that depend on the state. The main problem of implementing optimal control is related to the problem of finding a solution to such an equation at the pace of object functioning. The paper proposes an algorithmic method of parametric optimization of the regulator. This method is based on the use of the necessary conditions for the optimality of the control system under consideration. The constructed algorithms can be used both to optimize the non-stationary objects themselves, if the corresponding parameters are selected for this purpose, and to optimize the entire managed system by means of the corresponding parametric adjustment of the regulators. The effectiveness of the developed algorithms is demonstrated on the example of drug treatment of patients with HIV.

2405-8963 © 2018, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved. Peer review under responsibility of International Federation of Automatic Control. 10.1016/j.ifacol.2018.11.422

Keywords: Nonlinear differential equations, optimal control, the Hamilton-Jacobi equation, the Pontryagin minimum principle, the Riccati equation with parameters that depend on the state.

1. INTRODUCTION

The problem of control of nonlinear objects with their equivalent representation in the form of linear models (State Dependent Coefficient, SDC) with state-dependent parameters and functionals whose penalty matrices also depend on the state of the object was first formulated in the early 1960s (Pearson, 1962). The transformation of the original nonlinear differential equation, which describes the original control system, into a system with a linear structure, but with state-dependent parameters, and the use of a quadratic quality functional allow the synthesis of control to make the transition from the Hamilton-Jacobi-Bellman equation to the Riccati type equation with state-dependent parameters (State Dependent Riccati Equation, SDRE). This is the basis of the SDRE method of synthesis of optimal nonlinear control systems (Cimen, 2008).

It should be noted that there are still a number of issues related to the ambiguity of representation of a nonlinear object in the form of a model with a linear structure and with state-dependent parameters. The main problem of implementing the regulator obtained based on the SDRE method is the difficulty of finding a solution to the algebraic Riccati matrix equation with state-dependent parameters at the rate of operation of the system.

The material of the article is organized as follows. In the second section, the problem of controlling a nonlinear object is formulated. In the third section, the method of extended linearization is used in the synthesis of optimal control in a problem with an undetermined end time of the transient process. The implementation of the synthesized control encounters the complexity of solving a Riccati-type equation with parameters that depend on the state of the system. To solve this problem, it is proposed to use one of the methods of algorithmic construction of systems with incomplete information. The fourth section presents a method for algorithmic construction of a system with parametric optimization, based on the application of a function of admissible control actions (Hamiltonians). The fifth section demonstrates the use of the theoretical results obtained, using a mathematical model that describes the behavior of the human immune system in the presence of the HIV virus in the management of the supply of HAART medications.

2. NON-LINEAR OPTIMAL REGULATOR

2.1. Problem statement

Consider a deterministic controllable non-linear system described by the ordinary differential equation
IFAC CAO 2018 Valery Afanas’ev et al. / IFAC PapersOnLine 51-32 (2018) 428–433 Yekaterinburg, Russia, October 15-19, 2018
$$\frac{d}{dt}x(t) = f(x(t)) + g(x(t))u(t), \quad x(t_0) = x_0. \tag{1}$$

Here $x(\cdot) = x(t) \in R^n$, $t \in [t_0, T]$, is the state vector of the system; $x(\cdot) \in \Omega_x$, where $X_0 \subset \Omega_x$ is the range of possible initial conditions of the system; $u \in R^r$ is the control; $f(x)$, $g(x)$ are continuous matrix functions.

Assumption 2.1. The function $f(x(t))$ is continuously differentiable with respect to $x \in \Omega_x$, i.e. $f(\cdot) \in C^1(\Omega_x)$. In addition, we shall assume that the functions $f(x(t))$, $g(x(t))$ are such that for any initial conditions $(t_0, x_0) \in R \times \Omega_x$ only one solution $x(t, t_0, x_0)$ of (1) is possible.

Assumption 2.2. Suppose that for $x = 0 \in \Omega_x$ the following conditions are fulfilled: $f(0) = 0$ and $g(x(t)) \neq 0$, $\forall x(t) \in \Omega_x$.

Introduce the cost functional

$$J(x(\cdot), u(\cdot)) = \frac{1}{2}\int_{t_0}^{T}\left\{x^T(t)Qx(t) + u^T(t)Ru(t)\right\}dt. \tag{2}$$

In functional (2) the symmetric matrices $Q$ and $R$ are positive definite.

Assumption 2.3. Let $f(x(t))$, $g(x(t))$ be sufficiently smooth functions such that the function $V(t, x)$, defined as

$$V(s, x(\cdot)) = \inf_{u(\cdot) \in U}\frac{1}{2}\int_{s}^{T}\left\{x^T(t)Qx(t) + u^T(t)Ru(t)\right\}dt, \tag{3}$$

is a differentiable function for any admissible controls.

Assumption 2.4. The function $V(t, x)$ defined in (3) is locally Lipschitz in $x$.

The problem consists in constructing an optimal strategy, i.e. in finding a control $u(t)$ that minimizes the functional (2) on the object (1).

2.2. Optimal controls

In the general case, the value function $V(t, x(t))$ is a solution of the dynamic programming problem connected with the first-order partial differential equation (first-order PDE) of Hamilton-Jacobi-Bellman (Bellman, 1957):

$$\frac{\partial V(t, x(t))}{\partial t} + \min_{u \in U} H\left(x(t), u(t), \frac{\partial V(t, x(t))}{\partial x}\right) = 0, \tag{4}$$

where $H$ is the Hamiltonian

$$H\left(x, u, \frac{\partial V(t, x(t))}{\partial x}\right) = \frac{1}{2}\left\{x^T(t)Qx(t) + u^T(t)Ru(t)\right\} + \frac{\partial V(t, x(t))}{\partial x}\left[f(x(t)) + g(x(t))u(t)\right]. \tag{5}$$

The optimal control $u^0(t)$ is the stationary point of the Hamiltonian (5) and is determined by the relation

$$u^0(t) = -R^{-1}g^T(x(t))\left\{\frac{\partial V(t, x(t))}{\partial x}\right\}^T, \tag{6}$$

where the vector $\partial V(t, x(t))/\partial x(t)$ is determined by the solution of the Hamilton-Jacobi-Bellman equation

$$\frac{\partial V(t, x(t))}{\partial t} + \frac{\partial V(t, x(t))}{\partial x}f(x(t)) - \frac{1}{2}\frac{\partial V(t, x(t))}{\partial x}g(x(t))R^{-1}g^T(x(t))\left\{\frac{\partial V(t, x(t))}{\partial x}\right\}^T + \frac{1}{2}x^T(t)Qx(t) = 0, \quad V(T, x(T)) = 0. \tag{7}$$

The main difficulty in implementing controls of the form (6) consists in finding the vector $\partial V(t, x(t))/\partial x(t)$ that satisfies the scalar partial differential equation (7). If (7) is successfully solved, the control is carried out using the principle of state feedback, i.e. $u(t) = u(t, x(t))$.

Consider the case when the transient time is not specified: $t \in [t_0, T]$, $T \to \infty$. In this case $\partial V(t, x(t))/\partial t = 0$ in (7). The trajectory of the system under optimal control will take the form

$$\frac{d}{dt}x(t) = f(x(t)) - g(x(t))R^{-1}g^T(x(t))\left\{\frac{\partial V(x(t))}{\partial x}\right\}^T, \quad x(t_0) = x_0, \tag{8}$$

where $\partial V(x(t))/\partial x$ is found as a solution of the equation

$$\frac{\partial V(x(t))}{\partial x}f(x(t)) - \frac{1}{2}\frac{\partial V(x(t))}{\partial x}g(x(t))R^{-1}g^T(x(t))\left\{\frac{\partial V(x(t))}{\partial x}\right\}^T + \frac{1}{2}x^T(t)Qx(t) = 0. \tag{9}$$
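As a hedged sanity check of relations (6), (8) and (9), consider the scalar linear special case. This case is an illustrative assumption and is not taken from the paper: $f(x) = ax$ and $g(x) = b$ with constants $a$, $b \neq 0$, and scalar weights $Q = q > 0$, $R = r > 0$. Under these assumptions the equations above reduce to a quadratic equation for a constant $s$:

```latex
\begin{aligned}
&\text{Ansatz: } \frac{\partial V(x)}{\partial x} = s\,x, \qquad s > 0. \\
&\text{Equation (9): } s a x^{2} - \frac{b^{2} s^{2}}{2r}x^{2} + \frac{q}{2}x^{2} = 0
 \;\Longrightarrow\; \frac{b^{2}}{r}s^{2} - 2as - q = 0. \\
&\text{Positive root: } s = \frac{r}{b^{2}}\left(a + \sqrt{a^{2} + \frac{q b^{2}}{r}}\right). \\
&\text{Control (6): } u^{0}(t) = -\frac{b s}{r}\,x(t). \\
&\text{Closed loop (8): } \frac{d}{dt}x(t) = \left(a - \frac{b^{2}s}{r}\right)x(t)
 = -\sqrt{a^{2} + \frac{q b^{2}}{r}}\;x(t).
\end{aligned}
```

The closed-loop coefficient is negative for any sign of $a$, so in this special case the feedback obtained from (9) stabilizes the system even when the open-loop object is unstable.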
3. THE METHOD OF EXTENDED LINEARIZATION AND THE PROBLEM OF OPTIMAL CONTROL

The search for a solution of the partial differential equation (9) will be carried out by applying to the initial system (1) the method of "extended linearization" (State-Dependent-Coefficient, SDC-representation) (Afanasyev, 2015). Taking Assumptions 2.1 and 2.2 into account, we represent the
vector $f(x(t))$ in the form $f(x(t)) = A(x(t))x(t)$. We note that such a representation is not unique for systems whose order is higher than the first. The equation of the model of the object (1), taking into account the assumptions made, takes the form

$$\frac{d}{dt}x(t) = A(x(t))x(t) + g(x(t))u(t), \quad x(t_0) = x_0. \tag{10}$$

We call the representation of the nonlinear control system (1) in the form (10) an SDC-representation.

Assumption 3.1. The system (10), which is the SDC-representation of the nonlinear control system (1), is stabilizable (controllable) in the region $[t_0, T] \times \Omega_x$ if the pair $\{A(x(t)), g(x(t))\}$ is stabilizable (controllable) for $\forall (t, x(t)) \in [t_0, t_f] \times \Omega_x$.

The fulfillment of Assumptions 2.1, 2.2, 3.1 is a necessary and sufficient condition for representing the controlled system (1) in the form of its controllable model (10) (Krasovsky, 1959).

To find solutions of the Hamilton-Jacobi-Bellman equation (9), we introduce the relation

$$\left\{\frac{\partial V(x(t))}{\partial x}\right\}^T = S(x(t))x(t). \tag{11}$$

Then, in accordance with (6), the optimal control will be determined by the relation

$$u(t) = -R^{-1}g^T(x(t))S(x(t))x(t). \tag{12}$$

We rewrite equation (9) with allowance for (11):

$$S(x(t))A(x(t)) + A^T(x(t))S(x(t)) - S(x(t))g(x(t))R^{-1}g^T(x(t))S(x(t)) + Q = 0, \tag{13}$$

which allows us to find the matrix $S(x(t))$ for the expression (12). Equation (13) is called in the literature the algebraic matrix Riccati equation with state-dependent parameters (State-Dependent-Riccati-Equation, SDRE) (Cimen, 2008).

Lemma 3.1. The model (10) of system (1) is stable if the matrix $S(x(t))$ that is a solution of equation (13) is positive definite.

The proof of Lemma 3.1 can be found in (Cimen, 2008; Afanasyev, 2015). The main problem of implementing a control of the form (12) is the complexity of finding the matrix $S(x(t))$ as a solution to equation (13).

4. ALGORITHMS OF PARAMETRIC OPTIMIZATION

Before proceeding to the formation of an algorithmic method of searching for realizable solutions for the matrix $S(x(t))$, we shall consider in general form the method for the formation of optimization algorithms for non-stationary control systems based on the application of functions of admissible values of control actions. In the calculus of variations, these functions are Hamiltonians.

Let the controlled object be described by a nonlinear equation of general form

$$\frac{d}{dt}x(t) = f(x(t), u(t), \alpha(t), \theta(t)), \quad x(t_0) = x_0. \tag{14}$$

Here $x \in R^n$, $u \in R^r$, $u(\cdot) \in U$, $\alpha \in R^k$ is the vector of the parameters of the object subjected to uncontrolled disturbances, and $\theta \in R^h$ is the vector of the parameters selected for optimizing the system. Note that in the general case $k \neq h$. The quality functional is written in the general form

$$J(x(\cdot), u(\cdot)) = K(x(T)) + \int_{t_0}^{T}L(t, x(t), u(t))\,dt. \tag{15}$$

Let us consider the case when the perturbing effects $\alpha \in R^k$ are absent, or are completely compensated by a change in the parameters $\theta$ selected for this purpose, i.e. the right-hand side of equation (14) has the form $f(x(t), u(t))$. We form the Hamiltonian (Atans and Falb, 1968; Afanasyev et al., 2003)

$$H(t, x(t), \psi(t), u(t)) = L(t, x(t), u(t)) + \psi^T(t)f(x(t), u(t)), \tag{16}$$

where $\psi(t)$ is the adjoint function, which is a solution of the differential equation

$$\frac{d\psi(t)}{dt} = -\left\{\frac{\partial H(t, x(t), u(t), \psi(t))}{\partial x}\right\}^T, \quad \psi(T) = \left\{\frac{\partial K(x(T))}{\partial x}\right\}^T. \tag{17}$$

We note that the boundary conditions at the right end for the function $\psi(t)$ and the behavior of the Hamiltonian depend on the object, on the type of the given region of final values of the state of the system, and on whether or not the time of the transient process is specified.

Differentiating $H(t, x(t), u(t), \psi(t))$ with respect to time, taking into account the possibility of transition to an open region of control actions, we obtain

$$\frac{d}{dt}H(t, x, u, \psi) = \frac{\partial H}{\partial t} + \frac{\partial H}{\partial x}\frac{dx}{dt} + \frac{\partial H}{\partial \psi}\frac{d\psi}{dt} + \frac{\partial H}{\partial u}\frac{du}{dt}. \tag{18}$$

Taking into account that the differential equations (14) and (17) form a canonical system, and also that $\partial H/\partial u = 0$, the last expression can be rewritten in the form

$$\frac{d}{dt}H(t, x^0(t), u^0(t), \psi^0(t)) = \frac{\partial}{\partial t}H(t, x^0(t), u^0(t), \psi^0(t)). \tag{19}$$

Thus, under the optimal control $u^0(t)$ and the corresponding trajectory $x^0(t)$, the Hamiltonian changes during a transient process along a completely defined trajectory determined by solving the differential equation (19) with the boundary condition at the right end (with the exception of the stationary case, when the Hamiltonian does not depend explicitly on time (Atans and Falb, 1968)). The construction of algorithms for optimizing the control system is based on this property of the Hamiltonian.
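The stationary case can be illustrated with a minimal numerical sketch. It assumes a scalar plant that is not taken from the paper: $f(x) = x - x^3$ with constant $g = 1$ and weights $Q = R = 1$, so that the SDC factorization is $A(x) = 1 - x^2$. For a scalar system the Riccati-type equation (13) has a closed-form positive root, and the Hamiltonian (5), evaluated with $\partial V/\partial x = S(x)x$ as in (11), should stay at zero along the closed-loop trajectory. All function names below are hypothetical.

```python
import math

# Illustrative scalar plant (an assumption, not the paper's example):
# dx/dt = f(x) + G*u with f(x) = x - x**3, so f(0) = 0 and
# the SDC factorization f(x) = A(x)*x holds with A(x) = 1 - x**2.
G = 1.0  # constant input gain g(x)
Q = 1.0  # state weight in the quadratic functional
R = 1.0  # control weight in the quadratic functional


def a_sdc(x):
    """State-dependent coefficient A(x) in f(x) = A(x) * x."""
    return 1.0 - x * x


def sdre_gain(x):
    """Positive root S(x) of the scalar form of (13): 2*S*A - S**2*G**2/R + Q = 0."""
    a = a_sdc(x)
    return R * (a + math.sqrt(a * a + Q * G * G / R)) / (G * G)


def control(x):
    """State feedback u = -R^{-1} g S(x) x, the scalar form of (12)."""
    return -G * sdre_gain(x) * x / R


def hamiltonian(x, u):
    """Scalar form of (5) with dV/dx replaced by S(x) * x as in (11)."""
    running_cost = 0.5 * (Q * x * x + R * u * u)
    return running_cost + sdre_gain(x) * x * (a_sdc(x) * x + G * u)


def simulate(x0, dt=1e-3, t_end=10.0):
    """Explicit Euler integration of the closed loop; returns (final x, max |H|)."""
    x = x0
    max_abs_h = 0.0
    for _ in range(int(round(t_end / dt))):
        u = control(x)
        max_abs_h = max(max_abs_h, abs(hamiltonian(x, u)))
        x += dt * (a_sdc(x) * x + G * u)
    return x, max_abs_h


x_final, max_abs_h = simulate(2.0)
print(f"x(T) = {x_final:.3e}, max |H| = {max_abs_h:.3e}")
```

In this sketch the Hamiltonian vanishes up to rounding error because $S(x)$ satisfies (13) at every state, and the state decays to the origin. For systems of higher order no such closed form is available and (13) has to be solved numerically at each state, which is the implementation difficulty discussed above.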
Let us write down the necessary conditions for a minimum of the quality functional, expressed through the behavior of the Hamiltonian on the optimal trajectory. Let $\varphi(t)$ be the value of the Hamiltonian at each instant of time in the absence of parametric perturbations (or when they are completely compensated) under the optimal control and along the corresponding trajectory of system (14). The scalar function $\varphi(t)$ takes specific values in each formulation of the optimal control problem. Thus, $H^0(t) = H(t, x^0(t), u^0(t), \lambda^0(t)) = \varphi(t)$. We introduce a scalar function $\sigma^0(t)$ such that

$$\sigma^0(t) = H^0(t) - \varphi(t) = 0. \tag{20}$$

Thus, condition (20) is a necessary condition for the optimality of the control system in the absence of parametric perturbations. If the problem has only one minimum and there are no other points of stationarity of the Hamiltonian (for example, a linear object and a quadratic quality criterion), or the researcher has information about the region of the main extremum (minimum) of the functional corresponding to the domain of variation of the admissible control actions, then condition (20) is also a sufficient condition for optimality.

Obviously, if parametric perturbations are present and are not compensated, equality (20) will not be satisfied, i.e. $\sigma(t) = H(t) - \varphi(t) \neq 0$, where $H(t) = H(t, x(t), u(t), \lambda(t), \mu(t), \alpha(t))$, $\mu(t)$ denotes the perturbing parameters and $\alpha(t)$ the adjustable parameters. This is the basis for optimization algorithms. We introduce the Lyapunov function

$$V_L(\sigma(\cdot), \varphi(\cdot)) = \frac{1}{2}\left[\sigma(t) - \sigma^0(t)\right]^2 = \frac{1}{2}\sigma^2(t). \tag{21}$$

Then

$$\frac{d}{dt} V_L(\sigma(\cdot), \varphi(\cdot)) = \left\{ \frac{\partial H(t)}{\partial \mu}\frac{d\mu(t)}{dt} + \frac{\partial H(t)}{\partial \alpha}\frac{d\alpha(t)}{dt} \right\} \sigma(t) \le 0. \tag{22}$$

Let the algorithm of parametric optimization have the form

$$\frac{d\alpha(t)}{dt} = -\left\{ \frac{\partial H(t, x(t), u(t), \lambda(t), \mu(t), \alpha(t))}{\partial \alpha} \right\}^T \sigma(t), \quad \alpha(t_0) = \alpha_0. \tag{23}$$

Then the inequality (22) takes the form

$$\sigma(t)\, \frac{\partial H(t, x(t), u(t), \lambda(t), \mu(t), \alpha(t))}{\partial \mu}\frac{d\mu(t)}{dt} - \left\| \frac{\partial H(t, x(t), u(t), \lambda(t), \mu(t), \alpha(t))}{\partial \alpha} \right\|^2 \sigma^2(t) \le 0. \tag{24}$$

Let $\left\| d\mu(t)/dt \right\|^*$ be the maximum rate of change of the perturbing parameters. Then from (24) it follows that parametric optimization will be successful if the following inequality holds:

$$\left\| \frac{\partial H(t, x(t), u(t), \lambda(t), \mu(t), \alpha(t))}{\partial \alpha} \right\|^2 |\sigma(t)| - \left\| \frac{\partial H(t, x(t), u(t), \lambda(t), \mu(t), \alpha(t))}{\partial \mu} \right\| \left\| \frac{d\mu(t)}{dt} \right\|^* \ge 0. \tag{25}$$

The fulfillment of condition (25) provides asymptotic properties to the process of parametric optimization of a nonstationary object of the form (14).

5. THE OPTIMIZATION METHOD IN THE PROBLEM OF ADMINISTERING THE SUPPLY OF HAART PREPARATIONS

Let us consider an example of applying the method of parametric optimization of a nonlinear system described above. The object of study is a mathematical model that describes the behavior of the human immune system in the presence of HIV. Such mathematical models have been synthesized and analyzed for many years not only by biologists but also by mathematicians. This article uses a mathematical model proposed by American scientists (Wodarz, 1999), which is in excellent agreement with clinical data:

$$\begin{aligned}
\frac{d}{dt} i(t) &= \lambda - d\, i(t) - \beta (1 - \eta u)\, i(t)\, y(t), \\
\frac{d}{dt} y(t) &= \beta (1 - \eta u)\, i(t)\, y(t) - a\, y(t) - p_1 z_1(t)\, y(t) - p_2 z_2(t)\, y(t), \\
\frac{d}{dt} z_1(t) &= c_1 z_1(t)\, y(t) - b_1 z_1(t), \\
\frac{d}{dt} w(t) &= c_2\, i(t)\, y(t)\, w(t) - c_2 q\, y(t)\, w(t) - b_2 w(t), \\
\frac{d}{dt} z_2(t) &= c_2 q\, y(t)\, w(t) - h z_2(t),
\end{aligned} \tag{26}$$

where $i$ is the concentration of uninfected cells of the immune system (T-helpers); $\lambda$ is the production rate of T-helpers in the body; $d$ is the rate of natural death of T-helpers. When a virus enters the bloodstream, uninfected cells become infected at the rate $\beta$. The concentration of infected cells is $y$. Infected cells die naturally at the rate $a$; in addition, T-killers ($z_1$) kill them at the rate $p_1$, and immunoglobulins ($z_2$) kill them at the rate $p_2$. B-lymphocytes ($w$) are activated in the body at the rate $c_2$ and turn into immunoglobulins at the rate $q$. The maximum effectiveness of the drugs is expressed by the coefficient $\eta$, and $u$ is the dose of the drug administered, i.e. the control action. The values of the parameters are taken from (Zurakowski, 2006).

Since the system satisfies the necessary assumptions, using the SDC method we represent (26) in the form

$$\frac{d}{dt} x(t) = A(x(t))\, x(t) + g(x(t))\, u(t), \quad x(t_0) = x_0, \tag{27}$$
where $x^T = [\, i \;\; y \;\; z_1 \;\; w \;\; z_2 \,]$. The representation of the matrix $A(x)$ is not unique. The problem of choosing the optimal representation of the original model (26) in the form (27), i.e. of finding the optimal matrix $A(x(t))$, is unresolved. To compare the results, the simulation is carried out for 5 different representations of the matrix $A(x(t))$. The matrices $A(x(t))$ and $g(x(t))$ can, for example, take the form

$$A(x) = \begin{pmatrix}
\lambda / i - d & -\beta i & 0 & 0 & 0 \\
\beta y & -a - p_1 z_1 - p_2 z_2 & 0 & 0 & 0 \\
0 & c_1 z_1 & -b_1 & 0 & 0 \\
0 & c_2 (i w - q w) & 0 & -b_2 & 0 \\
0 & c_2 q w & 0 & 0 & -h
\end{pmatrix}, \quad
g(x) = \begin{pmatrix} \beta \eta\, i y & -\beta \eta\, i y & 0 & 0 & 0 \end{pmatrix}^T.$$

For the synthesis of the control $u(t)$, we introduce the quality functional

$$J(x, u) = \lim_{T \to \infty} \frac{1}{2} \int_{t_0}^{T} \left\{ x^T(t)\, Q\, x(t) + u^T(t)\, R\, u(t) \right\} dt. \tag{28}$$

The optimal control is determined by relation (12), where the matrix $S(x(t))$ is the solution of the Riccati equation (13) with state-dependent parameters.

As already mentioned, the matrix $S(x(t))$, as the solution of a Riccati equation with parameters that depend on the state of the object, is very difficult to find in the general case. To find it, we use the algorithmic method presented in Section 4. We represent this matrix in the form (Presnova, 2016)

$$S(x(t)) = S_0 + s(t). \tag{29}$$

The Hamiltonian of the system under the optimal control (12) and along the corresponding trajectory is zero. This is easily verified by writing down the Hamiltonian of the system and using (13). Note that when $S(x(t)) = S_0 + s(t)$, the optimality condition (20) of the system is not satisfied, which is the basis of the algorithm for parametric optimization of the system by adjusting the parameters of the matrix $s(t)$. The matrix $S_0 = S(x_0)$ in (29) is found from the solution of the Riccati equation (13) with constant parameters (for $x(t_0) = x_0$) using the lqr function in MATLAB, and the matrix $s(t)$ is found using the algorithmic method. In accordance with the above method of forming the parametric optimization algorithm (23), we write

$$\frac{d}{dt} s(t) = -\left\{ \frac{\partial H\big(x(t), u(t), [S_0 + s(t)]\, x(t)\big)}{\partial s} \right\}^T \sigma(t), \quad s(t_0) = 0, \tag{30}$$

where $\sigma(t) = H\big(x(t), u(t), [S_0 + s(t)]\, x(t)\big)$. Taking into account that for $S(x(t)) = S_0 + s(t)$ the Hamiltonian has the form

$$H\big(x(t), u(t), [S_0 + s(t)]\, x(t)\big) = \frac{1}{2} x^T(t)\, Q\, x(t) + \frac{1}{2} u^T(t)\, R\, u(t) + \frac{d x^T(t)}{dt} [S_0 + s(t)]\, x(t), \tag{31}$$

the sensitivity function is defined by the expression

$$\left\{ \frac{\partial H\big(x(t), u(t), [S_0 + s(t)]\, x(t)\big)}{\partial s} \right\}^T = -\left[ g R^{-1} g^T (S_0 + s(t)) - A^T(x(t)) \right] x(t)\, x^T(t). \tag{32}$$

Thus, the algorithm (30) takes the form

$$\frac{d}{dt} s(t) = -\left\{ \frac{\partial H\big(x, u, [S_0 + s]\, x\big)}{\partial s} \right\}^T H\big(x, u, [S_0 + s]\, x\big), \quad s(t_0) = 0, \tag{33}$$

where the sensitivity function is determined by expression (32) and the Hamiltonian by expression (31). The control with parametric optimization takes the form

$$u(t) = -R^{-1} g^T(x(t))\, [S_0 + s(t)]\, x(t), \tag{34}$$

and the system (27) with control (34) takes the form

$$\frac{d}{dt} x = f(x(t)) - B(x(t))\, R^{-1} g^T(x(t))\, [S_0 + s(t)]\, x(t), \quad x(t_0) = x_0. \tag{35}$$

6. COMPUTER MODELLING
Here we present a computer simulation of the object with the control synthesized by the algorithmic method. Let the initial conditions of system (26) correspond to a very weak patient with HIV: $i(t_0) = 0.2$, $y(t_0) = 0.8$, $z_1(t_0) = 0.08$, $w(t_0) = 0.01$, $z_2(t_0) = 0.01$ (the data are normalized numbers of cells, according to (Zurakowski, 2006)). For the given model of the immune system, the concentration of healthy cells of the immune system required for normal vital activity should be in the range 8-10. The simulation was carried out for the given initial state in two modes: with the treatment synthesized by the algorithmic method, and in the absence of any treatment, i.e. with $u = 0$. Figure 1 shows that the synthesized control copes with the task and brings the concentration of healthy cells of the human immune system to an acceptable level. Figure 2 shows that, in the presence of treatment, the concentration of HIV-infected cells in the body quickly decreases to the lowest possible level but does not reach 0 (in reality the virus cannot be completely removed from the body). Figure 3 shows the values of the elements of the matrix $s(t)$ during the transient process, found by the proposed algorithm (33). Since the representation of the vector function $f(x(t))$ in the form $f(x(t)) = A(x(t))\, x(t)$ is not unique, several variants of possible matrices $A(x(t))$ were considered, for which the corresponding matrices $S$ were found and the simulation was performed.
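The simulation setup can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the authors' program: it implements the five-state model with state $x = [i, y, z_1, w, z_2]$, one possible state-dependent factorization $A(x)x + g(x)u$, and a Runge-Kutta integration of the untreated case $u = 0$ from the initial state quoted above. The parameter values in `P` are placeholders chosen only for illustration and should be replaced by those of (Zurakowski, 2006); all function names are hypothetical.

```python
# Placeholder parameter values, for illustration only (assumption).
P = {"lam": 1.0, "d": 0.1, "beta": 1.0, "a": 0.2, "p1": 1.0, "p2": 1.0,
     "c1": 0.03, "c2": 0.06, "b1": 0.1, "b2": 0.01, "q": 0.5, "h": 0.1,
     "eta": 0.98}

def f(x, u, p=P):
    """Right-hand side of the model for the state x = [i, y, z1, w, z2]."""
    i, y, z1, w, z2 = x
    infection = p["beta"] * (1.0 - p["eta"] * u) * i * y
    return [
        p["lam"] - p["d"] * i - infection,
        infection - p["a"] * y - p["p1"] * z1 * y - p["p2"] * z2 * y,
        p["c1"] * z1 * y - p["b1"] * z1,
        p["c2"] * i * y * w - p["c2"] * p["q"] * y * w - p["b2"] * w,
        p["c2"] * p["q"] * y * w - p["h"] * z2,
    ]

def A(x, p=P):
    """One possible state-dependent coefficient matrix (requires i > 0)."""
    i, y, z1, w, z2 = x
    return [
        [p["lam"] / i - p["d"], -p["beta"] * i, 0.0, 0.0, 0.0],
        [p["beta"] * y, -p["a"] - p["p1"] * z1 - p["p2"] * z2, 0.0, 0.0, 0.0],
        [0.0, p["c1"] * z1, -p["b1"], 0.0, 0.0],
        [0.0, p["c2"] * (i * w - p["q"] * w), 0.0, -p["b2"], 0.0],
        [0.0, p["c2"] * p["q"] * w, 0.0, 0.0, -p["h"]],
    ]

def g(x, p=P):
    """Input vector: the drug reduces the infection term in the first two equations."""
    k = p["beta"] * p["eta"] * x[0] * x[1]
    return [k, -k, 0.0, 0.0, 0.0]

def sdc_rhs(x, u, p=P):
    """Evaluate A(x) x + g(x) u, which should coincide with f(x, u)."""
    M, gv = A(x, p), g(x, p)
    return [sum(M[r][c] * x[c] for c in range(5)) + gv[r] * u for r in range(5)]

def rk4_step(x, u, dt, p=P):
    """One classical Runge-Kutta step with the input u held constant."""
    def shifted(k, scale):
        return [xi + scale * ki for xi, ki in zip(x, k)]
    k1 = f(x, u, p)
    k2 = f(shifted(k1, 0.5 * dt), u, p)
    k3 = f(shifted(k2, 0.5 * dt), u, p)
    k4 = f(shifted(k3, dt), u, p)
    return [xi + dt / 6.0 * (a1 + 2.0 * a2 + 2.0 * a3 + a4)
            for xi, a1, a2, a3, a4 in zip(x, k1, k2, k3, k4)]

x0 = [0.2, 0.8, 0.08, 0.01, 0.01]   # initial state quoted in the text
x = list(x0)
for _ in range(1000):               # untreated case u = 0, horizon 10 time units
    x = rk4_step(x, 0.0, 0.01)
print("state without treatment at t = 10:", x)
```

Comparing `sdc_rhs` with `f` at a few states is a quick way to confirm that a candidate matrix $A(x)$ reproduces the original model before it is used for control synthesis.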
Fig. 2. The concentration of cells infected with HIV in the human body.

Fig. 3. Time variation of the parameters of the matrix $s(t)$ optimizing the system.

Fig. 4. Dynamics of healthy cells with different matrices $A(x)$.

Fig. 5. Dynamics of infected cells with different matrices $A(x)$.

Fig. 6. Control actions for different matrices $A(x)$.

Fig. 7. Hamiltonian for various matrices $A(x)$.

As can be seen from the results of computer simulation (Figures 4, 5, 6, 7), the choice of representation of the vector function $f(x(t))$ in the form $f(x(t)) = A(x(t))\, x(t)$ does not significantly affect the quality of the transient processes. The simulation results obtained for the chosen mathematical model of the immune system of a person with HIV demonstrate that the control constructed by the algorithmic method is successful.

7. CONCLUSIONS

In this paper, conditions on the optimization process were formulated that ensure the asymptotic transfer of the quality functional from its peripheral values to a minimum. Based on the method of algorithmic construction, algorithms for the parametric optimization of nonlinear control systems with a quadratic quality functional were presented and investigated. The construction of these algorithms uses the property of the behavior of the Hamiltonian on the corresponding trajectory.

REFERENCES

Atans M., Falb P. (1968). Optimal control. Mechanical engineering, Moscow.
Afanasyev V.N., Kolmanovskyi V.B. and Nosov V.R. (2003). Mathematical theory of control system design. High school, Moscow.
Afanasyev V.N. (2015). Control of nonlinear uncertain dynamic objects. URSS, Moscow.
Bellman R. (1957). Dynamic programming. Princeton University Press, Princeton.
Cimen T.D. (2008). State-Dependent Riccati Equation (SDRE) Control: A Survey. Proc. 17th World Conf. IFAC, p. 3771-3775.
Krasovsky N.N. (1959). Some problems of the theory of stability of motion. Fiz.mat.lit., Moscow.
Pearson J.D. (1962). Approximation methods in optimal control. Journal of Electronics and Control, № 12, p. 453-469.
Presnova A. (2016). Method of extended linearization in the uncertain object control problem. Quality. Innovation. Education, № 2, p. 31-40.
Wodarz D., Nowak M.A. (1999). Specific therapy regimes could lead to long-term immunological control of HIV. Proceedings of the National Academy of Sciences, vol. 96, № 6, p. 14464-14469.
Zurakowski R., Teel A. (2006). A model predictive control based scheduling method for HIV therapy. Journal of Theoretical Biology, vol. 238, p. 368-382.