17th IFAC Workshop on Control Applications of Optimization
Yekaterinburg, Russia, October 15-19, 2018
Available online at www.sciencedirect.com
ScienceDirect
IFAC PapersOnLine 51-32 (2018) 503–508
Two Targets Pursuit-Evasion Differential Game with a Restriction on the Targets Turning ⋆

Evgeny Ja. Rubinovich ∗

∗ Trapeznikov Institute of Control Sciences, 65 Profsoyuznaya Str., Moscow, 117997, Russia (e-mail: [email protected]).

⋆ This work was partially supported by Russian Foundation for Basic Research under grant 16-08-01076.

Abstract: The differential game under consideration belongs to the class of pursuit-evasion games in which there are fewer pursuers than targets. Namely, the differential game of one pursuer against a coalition of two coherently dodging targets, one of which is false, is considered on the plane. The probabilities of targets classification are given. The pursuer has a simple motion. Each of the targets has a restriction on the minimum allowable turning radius. The main criterion is the mathematical expectation of the distance to the true target at a terminal point in time that is not fixed in advance and is chosen by the pursuer during the pursuit process. The saddle point of the game in closed-loop strategies is found. Illustrative examples are given.

2405-8963 © 2018, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
Peer review under responsibility of International Federation of Automatic Control.
10.1016/j.ifacol.2018.11.471

Keywords: pursuit-evasion differential game, false targets.

1. INTRODUCTION

The considered formulation belongs to the class of differential pursuit-evasion games with a group target. The specificity of these problems is that there are fewer pursuers than targets, and some of the targets may be false. These problems, in turn, are subdivided into differential games of consecutive and joint pursuit. In the first class, as a rule, the task of the pursuer includes the sequential capture of all targets or of all true targets, if the pursuer is able to classify a target from a given distance or by direct contact. In the case of joint pursuit, the task of the pursuer is to approach the group of targets directly. In both the first and the second cases, criteria of the "Time" or "Miss" type can be used to optimize the pursuit-evasion process: for example, minimizing the time of pursuit of all the targets, or minimizing the mathematical expectation of the penalty to the true target, etc. In the simplest formulation, i.e. with simple (inertialess) movements of the players, the joint pursuit of two targets was considered in Ol'shansky (1974). The simplest problems of alternate pursuit of two targets were investigated in Abramyants (1980), Breakwell (1979). The main idea behind the introduction of joint pursuit is that joint pursuit brings the pursuer closer to the system of targets as a whole, and thus brings him closer to the true target.

2. PROBLEM STATEMENT

Consider a planar pursuit-evasion differential game of one pursuer $P$ against a coalition of two consistently evading targets $E_1$ and $E_2$, one of which is false. The players' motion is described by the equations (here and later $i = 1, 2$)

$$\dot Z_i = v_i\, r[\psi_i] - u, \quad Z_i(0) = Z_i^0,$$
$$\dot \psi_i = v_i w_i, \quad \psi_i(0) = \psi_i^0, \tag{1}$$

where $Z_i = Z_i(t)$ is a 2D vector with the components $\{Z_{ix}(t), Z_{iy}(t)\}$ directed from $P$ to $E_i$ at the current instant $t$ (Fig. 1; the superscripts 0 and $t$ mark the initial and current positions of the players); $r[\psi_i] = r[\psi_i(t)] \triangleq \{\cos\psi_i(t), \sin\psi_i(t)\}$ is a unit vector directed along the velocity of the target $E_i$, forming an angle $\psi_i(t)$ with the axis $OX$ of some fixed coordinate system $XOY$. Here $\triangleq$ means "equal by definition". The player $P$ controls its velocity vector $u = u(t)$, and the target $E_i$ controls $v_i = v_i(t)$ and $w_i = w_i(t)$, which are the magnitude of its velocity vector and its track curvature, respectively. The controls are subject to the restrictions

$$E_i:\ v_i \in [0, \beta_i], \quad w_i \in [-\kappa_i, \kappa_i],$$
$$P:\ |u| \le 1,$$

where $\beta_i < 1$ and $\kappa_i$ are given constants.

The payoff functional (criterion) is

$$G\big(Z_1(T), Z_2(T)\big) \triangleq p_1 |Z_1(T)| + p_2 |Z_2(T)| \to \min_{P} \max_{E_1,E_2},$$

where $p_1$ and $p_2 = 1 - p_1$ are given probabilities of targets classification. The criterion has the meaning of the mathematical expectation of the distance to the true target at the end of the game. $P$ is the minimizing player, and the targets $E_1$, $E_2$ play the role of the maximizing player. The terminal time $T$ is not fixed in advance, is unknown to the targets, and is chosen by the pursuer during the pursuit process.

3. PROBLEM SOLUTION

Following Pontryagin's maximum principle for dynamic games Bryson (1980), consider the Hamiltonian of the system (1) (the time argument $t$ is omitted)
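$$H = \sum_{i=1}^{2} v_i \big(\lambda_i \cdot r[\psi_i] + \mu_i w_i\big) - (\lambda_1 + \lambda_2)\cdot u \to \max_{P} \min_{E_1,E_2},$$

where the symbol $\cdot$ denotes the scalar product of vectors in 2D. Here the conjugate variables $\lambda_i = \lambda_i(t)$ and $\mu_i = \mu_i(t)$ satisfy the system of differential equations

$$\dot\lambda_i = -\partial H/\partial Z_i = 0, \quad \text{i.e. } \lambda_i = \mathrm{const}, \tag{2}$$
$$\dot\mu_i = -\partial H/\partial \psi_i = -v_i\,[-\lambda_{ix}\sin\psi_i + \lambda_{iy}\cos\psi_i]. \tag{3}$$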
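A saddle point of the Hamiltonian $H$ is realized by

$$u^0 = -(\lambda_1 + \lambda_2)\,|\lambda_1 + \lambda_2|^{-1}, \tag{4}$$
$$w_i^0 = -\kappa_i\, \mathrm{sign}\, \mu_i, \tag{5}$$
$$v_i^0 = \begin{cases} 0, & \text{if } \lambda_i \cdot r[\psi_i] + \mu_i w_i \ge 0, \\ \beta_i, & \text{if } \lambda_i \cdot r[\psi_i] + \mu_i w_i < 0. \end{cases} \tag{6}$$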
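The control laws (4)–(6) can be sketched in code as follows. This is only an illustrative sketch: the function names `pursuer_control` and `target_controls`, the tuple representation of the 2D vectors, and the explicit error raised when $\lambda_1 + \lambda_2 = 0$ are assumptions made here, not part of the paper.

```python
import math

def r(psi):
    """Unit heading vector r[psi] = (cos psi, sin psi)."""
    return (math.cos(psi), math.sin(psi))

def sign(x):
    """Sign function returning -1, 0 or 1."""
    return (x > 0) - (x < 0)

def pursuer_control(lam1, lam2):
    """Eq. (4): u0 = -(lam1 + lam2) / |lam1 + lam2|."""
    sx, sy = lam1[0] + lam2[0], lam1[1] + lam2[1]
    norm = math.hypot(sx, sy)
    if norm == 0.0:
        # Eq. (4) is undefined here; this guard is an assumption of the sketch.
        raise ValueError("lam1 + lam2 = 0: eq. (4) does not define u0")
    return (-sx / norm, -sy / norm)

def target_controls(lam, mu, psi, beta, kappa):
    """Eqs. (5)-(6): bang-bang curvature and speed of one target."""
    w0 = -kappa * sign(mu)                              # eq. (5)
    rx, ry = r(psi)
    switching = lam[0] * rx + lam[1] * ry + mu * w0     # lam . r[psi] + mu * w
    v0 = 0.0 if switching >= 0.0 else beta              # eq. (6)
    return v0, w0
```

For instance, with $\lambda_i = (-0.6, 0)$, $\mu_i = 0.5$, $\psi_i = 0$, $\beta_i = 0.6$ and $\kappa_i = 1$ the sketch returns full speed and a maximum-curvature turn, `(0.6, -1.0)`.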
It follows from (2) and (4) that $P$ moves along a straight line with the maximum admissible speed. Then only two cases are possible: a) $Z_1(T) \ne 0$, $Z_2(T) \ne 0$; b) $Z_1(T) = 0$, $Z_2(T) \ne 0$ (resp. $Z_1(T) \ne 0$, $Z_2(T) = 0$). In case a) the terminal transversality conditions give

$$\Big[\delta G + \sum_{i=1}^{2} \big(\lambda_i \cdot \delta Z_i + \mu_i\, \delta\psi_i\big) - H\,\delta t\Big]_{T} = 0, \tag{7}$$
where $\delta G = p_1 e_1 \cdot \delta Z_1 + p_2 e_2 \cdot \delta Z_2$, $e_i \triangleq Z_i(T)\,|Z_i(T)|^{-1}$. It follows from the transversality conditions that

$$\lambda_i = -p_i e_i, \tag{8}$$
$$\mu_i(T) = 0, \tag{9}$$
$$H(T) = 0. \tag{10}$$

Note that $H(t) = H(T) = 0$, inasmuch as the right-hand side of (1) does not depend on $t$.

Now we investigate the possibility of special modes (SM) in the controls $v_i$, $w_i$ of the player $E_i$. Let $v_i \not\equiv 0$; then an SM is possible

1) in the components $v_i$ and $w_i$ concurrently, under
$$\mu_i = 0, \tag{11}$$
$$\lambda_i \cdot r[\psi_i] = 0, \quad \text{i.e. } \lambda_i \perp r[\psi_i]; \tag{12}$$

2) in the component $w_i$ only, under
$$\mu_i = 0; \tag{13}$$

3) in the component $v_i$ only, under
$$\lambda_i \cdot r[\psi_i] + \mu_i w_i = 0. \tag{14}$$

The cases 1)–3) are described by Lemmas 1–4 (the proofs of the lemmas are given in Appendix A).

Lemma 1. Case 1) of SM initiation simultaneously by two components is impossible.

Lemma 2. An SM by the component $w_i$ is characterized by the conditions:
1∗. $w_i^{SM} = 0$;
2∗. the SM takes place only at the final stage of the movement, that is, once on an SM, the movement does not leave it until the end of the control process;
3∗. $v_i^* = \beta_i$; moreover, the vectors $\lambda_i$ and $r[\psi_i]$ are directed opposite to each other.

Corollary 3. From Lemma 2 and (5) it follows that $w_i^0 \in \{-\kappa_i, 0, \kappa_i\}$.

Lemma 4. Case 3) of SM initiation by the $v_i$ component is impossible.

Corollary 5. Starting to move at $t = 0$, the target $E_i$ moves with the maximum speed $v_i = \beta_i$ until the end of the game.

From this follows

Theorem 6. Whenever $Z_1(T) \ne 0$ and $Z_2(T) \ne 0$, a saddle point of the game is implemented on the following movements of the players. $P$ moves along a straight line with maximum speed. The movement of the target $E_i$ consists of a sharp turn maneuver (STM) with the maximum admissible track curvature $\kappa_i$ and an SM movement along a straight line with the maximum speed $\beta_i$.

We continue the analysis of the problem. It follows from 3∗ (Lemma 2) and the transversality conditions (8)–(10) that at the moment $T$ the targets' velocities are directed away from the point $P^T$ (the terminal position of the pursuer). Using this fact and (10), we get a formula for the evading angles $\alpha_i$ measured from the pursuit direction:

$$\cos\alpha_1 = (\beta^2 + p_1 - p_2)/(2\beta p_1), \qquad \cos\alpha_2 = (\beta^2 + p_2 - p_1)/(2\beta p_2), \tag{15}$$

where $\beta \triangleq p_1\beta_1 + p_2\beta_2$. It follows from the restriction $-1 \le \cos\alpha_i \le 1$ that (15) takes place when

$$\beta \ge \max\{p_1 - p_2,\ p_2 - p_1\}. \tag{16}$$

Hence, if an SM takes place, the structure of the players' optimal trajectories at the last stage of the movement (after the STM) coincides with the analogous structure for the simple-motion problem statement of Ol'shansky (1974) (see Fig. 2 and Fig. 3, where $\tau_i$ is the time of the end of the STM by the target $E_i$).
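Formulas (15) and the admissibility condition (16) are easy to evaluate numerically. The sketch below is illustrative only; the function name `evading_angles` and the convention of returning `None` when (16) fails are assumptions introduced here.

```python
import math

def evading_angles(p1, beta1, beta2):
    """Evading angles (alpha1, alpha2) from eq. (15), or None if (16) fails."""
    if not 0.0 < p1 < 1.0:
        raise ValueError("p1 must lie strictly between 0 and 1")
    p2 = 1.0 - p1
    beta = p1 * beta1 + p2 * beta2          # beta = p1*beta1 + p2*beta2
    if beta <= 0.0 or beta < abs(p1 - p2):  # condition (16)
        return None
    c1 = (beta ** 2 + p1 - p2) / (2.0 * beta * p1)
    c2 = (beta ** 2 + p2 - p1) / (2.0 * beta * p2)
    # Clamp to [-1, 1] to absorb floating-point round-off at the boundary.
    c1 = max(-1.0, min(1.0, c1))
    c2 = max(-1.0, min(1.0, c2))
    return math.acos(c1), math.acos(c2)
```

With the data of Fig. 2 ($\beta_1 = \beta_2 = 0.6$, $p_1 = 0.6$) this gives $\cos\alpha_1 = 0.56/0.72 \approx 0.778$ and $\cos\alpha_2 = 0.16/0.48 \approx 0.333$; for $p_1 = p_2 = 0.5$ both cosines reduce to $\beta$.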
Fig. 1. Problem geometry

Fig. 2. Optimal players trajectories; β1 = β2 = 0.6; p1 = 0.6

Fig. 3 demonstrates the so-called diversionary maneuver, in which $E_1$, moving initially toward the point $P^T$, manages to pass it before the time $T$. For some initial positions $E_2^0$, using the evading angles
(15) is not an optimal strategy for the players, because a paradoxical situation becomes possible in which the target $E_1$ does not have time to arrive at the point $P^T$ before the pursuer. We call this situation a regime of the first kind (R1). It corresponds to the case b) $Z_1(T) = 0$, $Z_2(T) \ne 0$ with inequality (16) fulfilled. When (16) is not fulfilled (regime of the second kind (R2)), the case b) takes the form: $Z_1(T) = 0$ if $p_1 > p_2$, or $Z_2(T) = 0$ if $p_2 > p_1$. Unlike R1, R2 can occur under any initial positions $E_2^0$. Physically, this means that the probabilistic characteristics of the targets, rather than the geometric ones, play the main role, and the meeting with the target $E_i$ takes place because of the high probability that it is the true target (see Ol'shansky (1974)).

Fig. 3. Diversionary maneuver; β1 = β2 = 0.6; p1 = 0.6

Note that in this problem another situation may occur (one that does not arise in the simplest case), characterized by the fact that at the time $T$ of arrival of the pursuer at the point $P^T$, built taking the SM into account (i.e., according to (15)), one of the targets (or both targets) does not have time to finish the STM and get onto the SM (i.e. the motion along a straight line). Two cases correspond to this situation. In the first, it makes sense for the targets to move (regime 3, (R3)), i.e. the targets carry out the STM with $v_i = \beta_i$. In the second, it makes sense for one (or both) of the targets to stand still, i.e. $v_i \equiv 0$ (regime 4, (R4)). In this case the moving target performs the STM and, if time permits, reaches the SM. If the moving target $E_1$ ($E_2$) has time to enter the SM, then relations (15) remain true; one only needs to put $\beta_2 = 0$ ($\beta_1 = 0$). When one target is stationary and the other does not have time to complete the STM, formulas (15) lose their validity.

In these cases, $\alpha_i$, $u^0$ and $e_i$ depend on the initial data of the problem $(Z_i^0, \psi_i^0)$, which follows from the integration of the motion equations taking into account the transversality conditions. Let us perform this integration by denoting $r_\perp[\psi_i] = r_\perp[\psi_i(t)] \triangleq \{\sin\psi_i(t), -\cos\psi_i(t)\}$ and introducing the rotation index $\sigma_i$ of the target $E_i$:

$$\sigma_i \triangleq \begin{cases} 1 & \text{for the left STM}, \\ -1 & \text{for the right STM}, \\ 0 & \text{for an immovable target}. \end{cases}$$

From the transversality conditions (8), (10) we have

$$\sum_{i=1}^{2} \sigma_i^2 \beta_i p_i\, e_i \cdot r[\psi_i(T)] = \Big|\sum_{i=1}^{2} p_i e_i\Big|. \tag{17}$$

By integrating (1) from 0 to $T$, taking into account (4) and (8), we get

1) in the absence of an SM

$$\psi_i(T) = \psi_i^0 + \sigma_i \beta_i \kappa_i T, \tag{18}$$
$$Z_i(T) = Z_i^0 + \sigma_i \kappa_i^{-1}\big(r_\perp[\psi_i(T)] - r_\perp[\psi_i^0]\big) - u^0 T, \tag{19}$$

2) in the presence of an SM of the target $E_i$ ($\tau_i$ is the duration of the STM for $E_i$)

$$\psi_i(t) = \psi_i^0 + \begin{cases} \sigma_i \beta_i \kappa_i t, & \text{for } t \le \tau_i, \\ \sigma_i \beta_i \kappa_i \tau_i, & \text{for } t > \tau_i. \end{cases}$$

Let $e_i = Z_i(T)\,|Z_i(T)|^{-1} \triangleq \{\cos\gamma_i, \sin\gamma_i\}$, that is

$$\cos\gamma_i = Z_{ix}(T)\big[Z_{ix}^2(T) + Z_{iy}^2(T)\big]^{-1/2}, \qquad \sin\gamma_i = Z_{iy}(T)\big[Z_{ix}^2(T) + Z_{iy}^2(T)\big]^{-1/2}. \tag{20}$$

Then, in view of (8), we find

$$\psi_i(T) = \psi_i^0 + \sigma_i \beta_i \kappa_i \tau_i = \gamma_i. \tag{21}$$

By integrating (1) we have

$$Z_i(T) = Z_i^0 + \sigma_i \kappa_i^{-1}\big(r_\perp[\psi_i(T)] - r_\perp[\psi_i^0]\big) + (T - \tau_i)\beta_i\, r[\gamma_i] - u^0 T, \tag{22}$$

where $\tau_i = (\gamma_i - \psi_i^0)/(\sigma_i \beta_i \kappa_i)$ with $\sigma_i \ne 0$. In view of the notation introduced for $\gamma_i$, we get from (4), (8) and (17)

$$u^0 = n^{-1} \sum_{i=1}^{2} p_i\, r[\gamma_i], \tag{23}$$

where

$$n = \sqrt{\Big(\sum_{i=1}^{2} p_i \cos\gamma_i\Big)^2 + \Big(\sum_{i=1}^{2} p_i \sin\gamma_i\Big)^2} = \sum_{i=1}^{2} \sigma_i^2 \beta_i p_i \big(\cos\gamma_i \cos\psi_i(T) + \sin\gamma_i \sin\psi_i(T)\big). \tag{24}$$
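The closed-form expressions (18), (19) for the case without an SM can be cross-checked against a direct numerical integration of system (1). The following is a minimal sketch under stated assumptions: the pursuer velocity `u` is taken as a given constant vector, the target uses $v_i = \sigma_i^2\beta_i$ and $w_i = \sigma_i\kappa_i$, and the function names and the midpoint integration scheme are choices made here for illustration.

```python
import math

def r_perp(psi):
    """r_perp[psi] = (sin psi, -cos psi)."""
    return (math.sin(psi), -math.cos(psi))

def closed_form_no_sm(z0, psi0, sigma, beta, kappa, u, T):
    """Eqs. (18)-(19): terminal heading and relative position without an SM."""
    psi_T = psi0 + sigma * beta * kappa * T                    # eq. (18)
    a, b = r_perp(psi_T), r_perp(psi0)
    zx = z0[0] + sigma / kappa * (a[0] - b[0]) - u[0] * T      # eq. (19), x
    zy = z0[1] + sigma / kappa * (a[1] - b[1]) - u[1] * T      # eq. (19), y
    return (zx, zy), psi_T

def integrate_no_sm(z0, psi0, sigma, beta, kappa, u, T, steps=20000):
    """Midpoint-rule integration of system (1) with constant controls."""
    dt = T / steps
    v = sigma * sigma * beta      # speed: beta when turning, 0 when immovable
    w = sigma * kappa             # signed curvature of the sharp turn
    zx, zy, psi = z0[0], z0[1], psi0
    for _ in range(steps):
        psi_mid = psi + 0.5 * dt * v * w
        zx += dt * (v * math.cos(psi_mid) - u[0])
        zy += dt * (v * math.sin(psi_mid) - u[1])
        psi += dt * v * w
    return (zx, zy), psi
```

As a sanity check of the sign conventions: a target starting at the origin with heading $\psi^0 = 0$, $\sigma = 1$, $\beta = 0.5$, $\kappa = 2$ and a motionless pursuer traverses a half circle of radius $1/\kappa$ in time $T = \pi$, so (19) should return a relative position close to $(0, 1)$.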
Expressing with the help of (20), (21), (23), (24) the variables $\cos\gamma_i$, $\sin\gamma_i$, $\psi_i(T)$, $n$, $u^0$ through $Z_{ix}(T)$, $Z_{iy}(T)$, we get, after substitution in (22) and (24), a system in five unknowns (the coordinates $Z_{ix}(T)$, $Z_{iy}(T)$ and the instant $T$), which gives the solution of the problem in the case 2). By comparing (21), (22) with (18), (19), we conclude that in the case 1) we need to put $\tau_i = T$ in (20)–(24) and to calculate $\psi_i(T)$ by (18) instead of (21). If the target $E_i$ is stationary, then in (20)–(24) we put $\tau_i = T$, $\sigma_i = 0$ and $\psi_i(T) = \psi_i^0$.

3) The case $\sigma_1 = \sigma_2 = 0$ (both targets stand still) is a degenerate case. Here $|p_1 e_1 + p_2 e_2| = 0$, i.e. the vectors $e_1$ and $e_2$ are directed opposite to each other. This case corresponds to an SM in the control $u$. Let us consider this case. If the vectors $e_1$ and $e_2$ are directed opposite to each other and the targets stand still, then $P^T$ belongs to the segment $[E_1^0, E_2^0]$ of length $a$. Let the length of the segment $[E_1^0, P^T]$ be equal to $\xi$. Then $|Z_1(T)| = \xi$ and $|Z_2(T)| = a - \xi$, so our problem is reduced to minimizing (over $\xi \in [0, a]$) the linear function $G = p_1\xi + p_2(a - \xi) = (p_1 - p_2)\xi + p_2 a$. The required minimum $\xi^0$ is achieved at one end of the segment $[0, a]$:

$$\xi^0 = \begin{cases} 0, & \text{if } p_1 > p_2 \ (\text{meeting } P \text{ with } E_1), \\ a, & \text{if } p_1 < p_2 \ (\text{meeting } P \text{ with } E_2), \\ \text{any in } [0, a], & \text{if } p_1 = p_2. \end{cases}$$
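The degenerate case 3) reduces to minimizing a linear function on a segment, which the following sketch makes explicit. The function name `degenerate_meeting` and the choice to return the set of minimizers as an interval `(lo, hi)` are assumptions made here for illustration.

```python
def degenerate_meeting(p1, a):
    """Minimize G(xi) = p1*xi + (1 - p1)*(a - xi) over xi in [0, a].

    Returns ((lo, hi), G_min), where [lo, hi] is the set of minimizers xi0.
    """
    p2 = 1.0 - p1
    if p1 > p2:
        xi_range = (0.0, 0.0)   # P meets E1
    elif p1 < p2:
        xi_range = (a, a)       # P meets E2
    else:
        xi_range = (0.0, a)     # any point of the segment is optimal
    xi = xi_range[0]
    payoff = p1 * xi + p2 * (a - xi)
    return xi_range, payoff
```

For example, with $p_1 = 0.6$ and $a = 10$ the pursuer goes to $E_1$ and the payoff is $p_2 a = 4$.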
We just need to consider the case b) $Z_1(T) = 0$, $Z_2(T) \ne 0$ (the case $Z_1(T) \ne 0$, $Z_2(T) = 0$ is considered analogously), which means a meeting of $P$ and $E_1$. In this case the transversality conditions take the form (7), where $\delta G = e_2 \cdot \delta Z_2$, $\delta Z_1 = 0$, and hence $\lambda_2 = -e_2$, $\mu_i(T) = 0$, $H(T) = 0$. It gives (see (4)) $u^0 = -(\lambda_1 - e_2)\,|\lambda_1 - e_2|^{-1}$. As in the case a), it is not difficult to show that if an SM by $w_i$ takes place, then Lemma 2 is true. Define now $\lambda_1 \triangleq -l\{\cos\gamma_1, \sin\gamma_1\}$, $e_2 \triangleq \{\cos\gamma_2, \sin\gamma_2\}$. Then

$$u^0 = n^{-1}\{l\cos\gamma_1 + \cos\gamma_2,\ l\sin\gamma_1 + \sin\gamma_2\},$$

where

$$n \triangleq |e_2 - \lambda_1| = \big[(l\cos\gamma_1 + \cos\gamma_2)^2 + (l\sin\gamma_1 + \sin\gamma_2)^2\big]^{1/2}.$$

By integrating (1) we get the system (18), (19) if an SM is absent, and the system (21), (22) if an SM takes place. It follows from $H(T) = 0$ (transversality condition) that

$$n = l \sum_{i=1}^{2} \beta_i \sigma_i^2 \cos\big(\gamma_i - \psi_i(T)\big).$$

The obtained system is analogous to the system (20), (24), with the only difference that instead of the unknown variables $Z_{1x}(T)$, $Z_{1y}(T)$ (which are equal to 0 in the case b)) the unknown variables $l$ and $\gamma_1$ appear. In conclusion, we note that at the meeting with the target $E_1$ it is possible that $v_1 \equiv 0$ or $v_2 \equiv 0$ (the case $v_1 \equiv v_2 \equiv 0$ has already been dealt with).

So, the structure of optimal maneuvers is revealed. The choice of the optimal maneuver corresponding to the given initial conditions is recommended as follows. For each maneuver from the above finite set of admissible maneuvers, the value of the criterion $G$ is calculated. The optimal maneuver corresponds to the largest value of the criterion.

4. THE CASE OF IDENTICAL TARGETS

Let the targets be identical, i.e. $\beta_1 = \beta_2 \triangleq \beta < 1$, $p_1 = p_2 = 0.5$, $\kappa_1 = \kappa_2 = 1$, and have the same initial positions $Z_1^0 = Z_2^0 \triangleq Z^0$ ($E_1^0 = E_2^0 \triangleq E^0$), $\psi_1^0 = \psi_2^0 \triangleq \psi^0$. Let $XOY$ be a fixed coordinate system in which the origin $O$ coincides with $E^0$ and $\psi^0 = 0$ (Fig. 4). In this coordinate system, the space of initial conditions is two-dimensional and represents the region of possible initial positions of the pursuer $\{Z_x^0, Z_y^0\}$. So, it is possible to construct a partition of the upper (similarly, lower) half-plane into zones that determine the players' maneuvers.

Fig. 4. Game space zoning, β1 = β2 = 0.6, p1 = p2 = 0.5

It is clear that for a sufficiently large value of $|Z^0|$ the targets manage to make the STM and take the SM with $\cos\alpha = \beta$ (see (15)). Let $Z^0$ be such that an SM takes place. Then the following three cases are possible. At the initial time, $P^0$ may be behind, at the side of, or ahead of the target. The sets of points $P^0$ in these cases form the zones A, B and D (Fig. 4). The boundaries of the zones are defined below. Here we note that in zones A and D $\sigma_1\sigma_2 = -1$, i.e. one target performs a right STM and the other a left STM. In zone B $\sigma_1\sigma_2 = 1$, i.e. both targets perform STMs of the same name (right). In zones A, B and D the following lemma on the localization of the aiming point $P^T$ holds.

Lemma 7. 1. For zone A (Fig. 5), the point $P^T$ lies on the inner lobe of the Pascal's limacon with a pole at the point $Q_1 = \{\tan\alpha, 0\}$, where $\alpha = \arccos\beta$, and the parametric equation ($t$ is the parameter)

$$x(t) = (2\cos\alpha\cos t - \cos 2t - \cos 2\alpha)/\sin 2\alpha, \qquad y(t) = (-2\cos\alpha\sin t + \sin 2t)/\sin 2\alpha. \tag{25}$$

Fig. 5. P0 ∈ A, β = 0.6, PT is the aiming point

2. For zone D (Fig. 6), the point $P^T$ lies on the outer lobe of the Pascal's limacon with a pole at the point $Q_2 = \{-\tan\alpha, 0\}$ and the parametric equation

$$x(t) = (2\cos\alpha\cos t + \cos 2t + \cos 2\alpha)/\sin 2\alpha, \qquad y(t) = (2\cos\alpha\sin t + \sin 2t)/\sin 2\alpha. \tag{26}$$

Fig. 6. P0 ∈ D, β = 0.6, PT is the aiming point

3. For zone B (Fig. 7), the point $P^T$ lies on a circle of radius $1/\beta$ with center at the point $O_2 = \{0, -1\}$, given by the equation

$$x^2 + (y + 1)^2 = \beta^{-2}. \tag{27}$$
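The curves of Lemma 7 can be tabulated directly from (25)–(27). The sketch below is illustrative; the function names are assumptions made here. It relies on an elementary identity obtained from (25), (26) with $\tan\alpha = 2\sin^2\alpha/\sin 2\alpha$: relative to the pole, the point of (25) is at signed distance $2(\cos\alpha - \cos t)/\sin 2\alpha$ along the direction $(\cos t, -\sin t)$, and the point of (26) is at signed distance $2(\cos\alpha + \cos t)/\sin 2\alpha$ along $(\cos t, \sin t)$, so the curves pass through their poles at $t = \alpha$ and $t = \pi - \alpha$ respectively.

```python
import math

def limacon_zone_a(t, beta):
    """Point of the limacon (25) for parameter t; alpha = arccos(beta)."""
    alpha = math.acos(beta)
    s = math.sin(2.0 * alpha)
    x = (2.0 * math.cos(alpha) * math.cos(t) - math.cos(2.0 * t) - math.cos(2.0 * alpha)) / s
    y = (-2.0 * math.cos(alpha) * math.sin(t) + math.sin(2.0 * t)) / s
    return x, y

def limacon_zone_d(t, beta):
    """Point of the limacon (26) for parameter t; alpha = arccos(beta)."""
    alpha = math.acos(beta)
    s = math.sin(2.0 * alpha)
    x = (2.0 * math.cos(alpha) * math.cos(t) + math.cos(2.0 * t) + math.cos(2.0 * alpha)) / s
    y = (2.0 * math.cos(alpha) * math.sin(t) + math.sin(2.0 * t)) / s
    return x, y

def circle_zone_b_residual(x, y, beta):
    """Residual of the circle equation (27); zero when (x, y) lies on the circle."""
    return x * x + (y + 1.0) ** 2 - beta ** (-2)
```

For $\beta = 0.6$ the pole $Q_1$ is at $(\tan\alpha, 0) = (4/3, 0)$, and `limacon_zone_a(math.acos(0.6), 0.6)` returns this point up to round-off.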
5. CONCLUSION
The differential game of joint pursuit of two targets, one of which is false, is considered on the plane with given classification probabilities and restrictions on the turn. The case of identical targets is analyzed in detail. A further numerical study of the players’ behavior in the case of nonidentical targets is of interest.
REFERENCES
V.K. Ol’shansky and Ye.Ya. Rubinovich. Simplest Differential Games of Pursuing a Systrm of two Plants. Automation and Remote Control, 1:24–34, 1974. T.G. Abramyants, E.P. Maslov and E.Ya. Rubinovich. A Simplest Differential Games of Alternate Pursuer. Automation and Remote Control, 8:5–15, 1980. J.V. Breakwell and P. Hagedorn. Point Capture of Two Evaders in Succession. Journal of Optimization Theory and Applications, 27:, 1980. A.E. Bryson and Ho Yu-Chi. Applied optimal control. Toronto, London: Blaisdell Publishing Company, 1969.
Fig. 7. P 0 ∈ B,
β = 0.6, P T – the aiming point
The position of point P T for zones A, B and D defines a following lemma on the direction of pursuit. Lemma 8. The velocity vector of the pursuer P is directed in zones A and D to the poles of limacons Q1 and Q2 , respectively, and in zone B - along a tangent to the circle (27).
Let us now define the boundaries between zones A, B and D.

Lemma 9. The boundary $\Gamma_{AB}$ between zones A and B coincides with the straight line passing through the pole $Q_1$ at an angle $\pi - \alpha$ to the axis OX (Fig. 7).

To construct the boundary $\Gamma_{BD}$ between zones B and D, the geometric characteristics of the aiming points alone are insufficient, since from points of zone B and from points of zone D alike one can draw both a straight line through the pole $Q_2$ and a tangent to the circle (27). Therefore, the boundary $\Gamma_{BD}$ is constructed numerically as the locus of points $P^0 = \{x, y\}$ satisfying the equation $G_B(x, y) = G_D(x, y)$, where $G_B(x, y)$ and $G_D(x, y)$ are the payoffs in zones B and D corresponding to the point $P^0 = \{x, y\}$ (Fig. 4).

We continue the partition of the space of starting positions of player P. As noted above, for some initial positions (when $P^0$ is close to $E^0$) the targets do not have time to complete the STM by the time P arrives at the point $P^T$ constructed according to the rules of zones A, B and D. The totality of such points $P^0$ forms zone C. The boundary $\Gamma$ of zone C is the locus of the points $P^0$ for which the pursuit time equals the time at which the STM ends (the pieces of $\Gamma$ passing through zones A, B and D are denoted $\Gamma_A$, $\Gamma_B$ and $\Gamma_D$). Zone C is divided by the boundaries $\gamma_A$, $\gamma_B$ and $\gamma_D$ into subzones $C_A$, $C_B$, $C_D$ and $C_E$ (Fig. 4). When $P^0 \in C_E$, the meeting of P and $E_1$ occurs. The numerical construction of the boundaries $\Gamma$, $\gamma_A$, $\gamma_B$, $\gamma_D$ and of the optimal trajectories of the players in zone C for different values of $\beta$ relies on the analysis of a number of particular cases and is not presented here, since it is of a technical nature.

Appendix A. PROOF OF LEMMAS

A.1 Proof of Lemma 1

By differentiating (11) and (12) with respect to time by virtue of the system (1), (2) and (3), we find
$$\dot\mu_i = v_i\,[-\lambda_{ix}\sin\psi_i + \lambda_{iy}\cos\psi_i] = 0, \qquad [-\lambda_{ix}\sin\psi_i + \lambda_{iy}\cos\psi_i]\,v_i w_i = 0. \tag{A.1}$$
It follows from (A.1) that
$$-\lambda_{ix}\sin\psi_i + \lambda_{iy}\cos\psi_i = 0, \tag{A.2}$$
i.e. $\lambda_i \parallel r[\psi_i]$, which contradicts (12). The lemma is proved.
A.2 Proof of Lemma 2

Differentiating (13) with respect to $t$ by virtue of the system (1), (2) and (3) gives the equality (A.1). From this, (A.2) follows (because $v_i \neq 0$). Differentiating (A.2) with respect to time by virtue of the same system, we obtain
$$[\lambda_{ix}\cos\psi_i + \lambda_{iy}\sin\psi_i]\,\dot\psi_i = 0.$$
Since, according to (A.2), the equality $\lambda_{ix}\cos\psi_i + \lambda_{iy}\sin\psi_i = 0$ is impossible, it remains that $\dot\psi_i = v_i w_i = 0$, that is, $w_i^{SM} = 0$ (condition 1*) and $\psi_i^{SM} = \text{const}$. But due to (A.2) and (3) we have $\dot\mu_i = 0$, and in addition $\mu_i = 0$ by (13). Therefore, taking into account (9), we conclude that once on SM, the motion does not leave it until the end of the control process. So, if SM takes place, it does so only at the final stage of the motion (condition 2*).

Next, we note that under SM in $w_i$ (i.e. if $w_i = 0$) the equality $\psi_i^{SM} = \text{const}$ holds. Hence, $\lambda_i \cdot r[\psi_i] = \text{const} \neq 0$ and, by (A.2), $\lambda_i \parallel r[\psi_i]$. It follows from this (in view of (6)) that under SM $v_i = \beta_i$ and the vectors $\lambda_i$ and $r[\psi_i]$ are directed oppositely (condition 3*). The lemma is proved.
A.3 Proof of Lemma 4

We show that if $v_i^0(0) = \beta_i$ at time $t_0 = 0$, i.e. the second inequality (6) holds, then this inequality remains valid until the end of the game. This proves the impossibility of equality (14), i.e. of SM in the component $v_i$.

So, let $v_i^0(0) = \beta_i$, i.e. the second inequality (6) holds at $t_0 = 0$. Choose the distance scale so that $\beta_i \kappa_i = 1$, where $i$ is the fixed number of the target under consideration. Then, denoting $\sigma_i \triangleq \operatorname{sign}\mu_i(t)$, we get
$$w_i^0 = -\sigma_i \kappa_i, \tag{A.3}$$
$$\dot\psi_i = -\sigma_i. \tag{A.4}$$
Let $t_1$ be the first moment when the second inequality (6) turns into an equality, and let $t_2$ be the first moment when the function $\mu_i(t)$ vanishes. Define $\tau \triangleq \min\{t_1, t_2\}$ and show that on the segment $[0, \tau]$ the function $f \triangleq \lambda_i \cdot r[\psi_i] + \mu_i w_i$ is constant. Indeed, for $t \in [0, \tau]$ it follows from (A.4) that $\psi_i(t) = \psi_i(0) - \sigma_i t$. Integrating (3) in view of the last equality, we get
$$\mu_i(t) = \mu_i(0) + \sigma_i \beta_i \lambda_i \cdot \big(r[\psi_i(t)] - r[\psi_i(0)]\big).$$
Substituting $\mu_i(t)$ into $f$ in view of (A.3) and Lemma 1, we find
$$f = \lambda_i \cdot r[\psi_i] - \sigma_i \kappa_i \Big(\mu_i(0) + \sigma_i \beta_i \lambda_i \cdot \big(r[\psi_i(t)] - r[\psi_i(0)]\big)\Big) = -|\mu_i(0)|\kappa_i + \beta_i \lambda_i \cdot r[\psi_i(0)] = \text{const} < 0.$$
From Lemma 1 (on the impossibility of simultaneous SM in $v_i$ and $w_i$) it follows that the function $f = \lambda_i \cdot r[\psi_i] - |\mu_i|\kappa_i$ is continuous in $t$. It follows from the above that $\tau = t_2$. For $t > t_2$ (i.e. after $\mu_i(t)$ changes sign) we carry out similar reasoning, taking the moment $t_2$ as the initial one. So, for $t > t_2$,
$$f = -|\mu_i(t_2)|\kappa_i + \beta_i \lambda_i \cdot r[\psi_i(t_2)] = f(t_2 - 0) = -|\mu_i(0)|\kappa_i + \beta_i \lambda_i \cdot r[\psi_i(0)] = \text{const} < 0.$$
This completes the proof.

A.4 Proof of Lemma 7

From (8) and the fact that $\lambda_i \parallel r[\psi_i]$ under SM (see (A.2)) it follows that after the STM the targets move along tangents to the turning circles, and that these tangents pass through the point $P^T$. The angle between the tangents to the circles $O_1$ and $O_2$ is obviously $2\alpha$, and the angle between the tangents to the circle $O_2$ is equal to $\pi - 2\alpha$ (Fig. 5–7). Consequently, for zones A and D the aiming point $P^T$ lies in the right half-plane and belongs to the locus of points at which the tangents to the STM circles ($O_1$ and $O_2$) intersect at the given angle $2\alpha$ with $\cos\alpha = \beta$. For zone B, the point $P^T$ lies in the right half-plane and belongs to the locus of points from which the right STM circle ($O_2$) is visible at the angle $\pi - 2\alpha$. We now derive the equations of these loci.

The situation $P^0 \in A$ is depicted in Fig. 5. The tangents to the circles $O_1$ and $O_2$ satisfy the equations
$$y = \big(x - \tan(\psi_1/2)\big)\tan\psi_1, \qquad y = -\big(x - \tan(\psi_2/2)\big)\tan\psi_2. \tag{A.5}$$
Let us rotate the tangents around $O_1$ and $O_2$ with unit angular velocity. Then
$$\psi_1(t) = \alpha - t, \qquad \psi_2(t) = \alpha + t. \tag{A.6}$$
Solving the system (A.5), (A.6) for $x = x(t)$ and $y = y(t)$, we obtain formulas (25).

The situation $P^0 \in D$ is depicted in Fig. 6. To obtain formulas (26), it suffices to replace $\alpha$ by $\pi - \alpha$ and $t$ by $-t$ in (25).

The situation $P^0 \in B$ is depicted in Fig. 7. The locus of points from which the circle $O_2$ is visible at the given angle $\pi - 2\alpha$ is obviously a concentric circle of radius $1/\sin(\pi/2 - \alpha) = 1/\cos\alpha = 1/\beta$. The lemma is proved.

A.5 Proof of Lemma 8

1. Let $P^0 \in D$ (Fig. 6) and let $P^T$ be the aiming point corresponding to $P^0$. At the point $P^T$, the tangents $E_1^T P^T$ and $E_2^T P^T$ intersect at the angle $2\alpha$, and the direction of pursuit $P^0 P^T$ runs along the bisector of the angle $\angle E_1^T P^T E_2^T$. Denote by $M$, $N$ and $Q$ the points at which the tangents $E_1^T P^T$, $E_2^T P^T$ and the straight line $P^0 P^T$, respectively, intersect the axis OX. Then, by the law of sines, from the triangles $\triangle P^T Q M$ and $\triangle P^T N M$ we obtain
$$\frac{l}{\sin(\pi - \psi_2 - \alpha)} = \frac{q + \tan(\psi_1/2)}{\sin(\pi - 2\alpha + \alpha)}, \qquad \frac{l}{\sin(\pi - \psi_2)} = \frac{\tan(\psi_1/2) - \tan(\psi_2/2)}{\sin(\pi - 2\alpha)}, \tag{A.7}$$
with
$$\cos\alpha = \beta, \qquad \psi_1 + \psi_2 = 2\pi - 2\alpha, \tag{A.8}$$
where $l$ and $q$ denote the lengths of the segments $[P^T, M]$ and $[0, Q]$, respectively. From (A.7) we get
$$q + \tan(\psi_1/2) = \frac{\sin\psi_2\,\big(\tan(\psi_1/2) - \tan(\psi_2/2)\big)}{2\beta\sin(\psi_2 + \alpha)}. \tag{A.9}$$
From (A.9), in view of (A.8), we have $q = \tan\alpha = \text{const}$. So the length $q = \tan\alpha$ of the segment $[0, Q]$ coincides with the distance from the origin to the pole of the limacons, i.e. the points $Q$ and $Q_2$ coincide. For $P^0 \in A$ the proof is similar (Fig. 5).

2. Let $P^0 \in B$ (Fig. 7). Since the circle $O_2$ is seen from the point $P^T$ at the angle $\pi - 2\alpha$, the straight line $P^T E_2^T$ cuts off an arc equal, obviously, to $2\alpha$. But the angle between $P^T E_2^T$ and the direction of pursuit is $\alpha$, that is, half of the cut-off arc; therefore, $P^0 P^T$ is tangent to the circle (27). The lemma is proved.
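The step from (A.9) and (A.8) to $q = \tan\alpha$ in the proof of Lemma 8 can be checked numerically. The sketch below is only an illustration under stated assumptions: the relation in (A.8) is read as $\psi_1 + \psi_2 = 2\pi - 2\alpha$, and the angles are parametrised as $\psi_1 = \pi - \alpha - t$, $\psi_2 = \pi - \alpha + t$ with $0 < |t| < \alpha$ (a hypothetical parametrisation chosen to satisfy that relation). The function name `q_from_a9` is not from the paper.

```python
import math


def q_from_a9(alpha, t):
    """Solve (A.9) for q under the assumed parametrisation
    psi1 = pi - alpha - t, psi2 = pi - alpha + t (so psi1 + psi2 = 2*pi - 2*alpha)."""
    beta = math.cos(alpha)
    psi1 = math.pi - alpha - t
    psi2 = math.pi - alpha + t
    a1 = math.tan(psi1 / 2)
    a2 = math.tan(psi2 / 2)
    # Right-hand side of (A.9); t = 0 is excluded because sin(psi2 + alpha) vanishes there.
    rhs = math.sin(psi2) * (a1 - a2) / (2 * beta * math.sin(psi2 + alpha))
    return rhs - a1


if __name__ == "__main__":
    alpha = math.acos(0.6)  # beta = 0.6, as in the figures
    for t in (0.1, 0.3, 0.5):
        print(t, q_from_a9(alpha, t), math.tan(alpha))
```

For every admissible $t$ the computed $q$ stays equal to $\tan\alpha$, in line with the claim that $Q$ does not depend on the position of $P^T$ on the limacon.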
A.6 Proof of Lemma 9

It is clear that for $P^0 \in \Gamma_{AB}$ the direction of pursuit must possess both the properties of zone A and the properties of zone B. Therefore, by Lemma 7, the line $P^0 P^T$ must simultaneously pass through the pole $Q_1$ of the limacons (25) and be tangent to the circle (27). Let us show that the straight line passing through the pole $Q_1$ at an angle $\pi - \alpha$ to the axis OX is tangent to the circle (27). To do this, it suffices to show that $\angle O Q_1 O_2 = \pi/2 - \alpha$ or, equivalently, that $\angle O O_2 Q_1 = \alpha$. The latter follows from the fact that $OQ_1 = \tan\alpha$ (Lemma 7) and $OO_2 = 1$. Lemma 9 is proved.
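The tangency claimed in Lemma 9 can also be verified with a few lines of Python. This is a minimal sketch assuming that the pole $Q_1$ lies on the axis OX at $(\tan\alpha, 0)$ (the text gives only $OQ_1 = \tan\alpha$, so the sign is an assumption) and that the circle (27) has center $O_2 = (0, -1)$ and radius $1/\beta$. The helper `distance_point_to_line` is illustrative.

```python
import math


def distance_point_to_line(px, py, qx, qy, angle):
    """Distance from (px, py) to the line through (qx, qy) with direction angle `angle`."""
    dx, dy = math.cos(angle), math.sin(angle)
    # Magnitude of the 2D cross product of the offset with the unit direction vector.
    return abs((px - qx) * dy - (py - qy) * dx)


def tangency_gap(beta):
    """Difference between the distance from O2 to the line of Lemma 9 and the radius 1/beta."""
    alpha = math.acos(beta)
    q1 = (math.tan(alpha), 0.0)   # assumed position of the pole Q1
    o2 = (0.0, -1.0)              # center of the circle (27)
    dist = distance_point_to_line(o2[0], o2[1], q1[0], q1[1], math.pi - alpha)
    return dist - 1.0 / beta


if __name__ == "__main__":
    for beta in (0.3, 0.6, 0.9):
        print(beta, tangency_gap(beta))
```

A gap of zero (up to rounding) means that the line through $Q_1$ at the angle $\pi - \alpha$ touches the circle (27).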