Numerical Study of Two-on-One Pursuit-Evasion Game

Proceedings of the 18th World Congress The International Federation of Automatic Control Milano (Italy) August 28 - September 2, 2011 Numerical Study...

Download PDF

391KB Sizes 0 Downloads 47 Views

Report

PDF Reader
Full Text

Proceedings of the 18th World Congress The International Federation of Automatic Control Milano (Italy) August 28 - September 2, 2011

Numerical Study of Two-on-One Pursuit-Evasion Game S.A. Ganebny ∗ S.S. Kumkov ∗ S. Le M´ enec ∗∗ V.S. Patsko ∗ ∗ Institute of Mathematics and Mechanics, Ural Branch of Russian Academy of Sciences, Ekaterinburg, Russia (e-mail: {ganebny,kumkov.jr,patsko}@imm.uran.ru). ∗∗ EADS / MBDA, Paris, France (e-mail: [email protected]).

Abstract: A problem of pursuit-evasion with two pursuers and one evader is studied. The formulation of the problem was suggested in Le M´enec (2008, 2010). In that work, some cases were solved by analytic methods. This paper deals with the problem by means of numerical construction of solvability sets. Variants of the game are considered where application of analytic approaches can be very diﬃcult. Keywords: bang-bang control, computational methods, diﬀerential games, feedback control 1. INTRODUCTION

the pursuers are joined into one useful control. Such an investigation is the aim of this paper.

Nowadays, group pursuit-evasion games (several pursuers and/or several evaders) are studied intensively: Pschenichnyi (1976); Hagedorn and Breakwell (1976); Grigorenko (1991); Levchenkov and Pashkov (1990); Chikrii (1997); Abramyantz and Maslov (2004); Stipanovic et al. (2009); Blagodatskih and Petrov (2009). From a general point of view, a group pursuit-evasion game (without any hierarchy among players) can often be treated as an antagonistic diﬀerential game, where all pursuers are joined into a player, whose objective is to minimize some functional, and, similarly, all evaders are joined into another player, who is the opponent to the ﬁrst one. The theory of diﬀerential games gives an existence theorem for the value function of such a game. But, usually, any more concrete results (for example, concerning eﬀective constructing the value function) cannot be obtained. This is due to high dimension of the state vector of the corresponding game and absence of convexity of time sections of level sets (Lebesgue sets) of the value functions. Just these reasons can explain why group pursuit-evasion games are very diﬃcult and are investigated usually by means of speciﬁc methods and under very strict assumptions. In this paper, we investigate a pursuit-evasion game with two pursuers and one evader. Such a model formulation arises during analysis of a problem, where two aircrafts (or missiles) intercept another one in the horizontal plane. The peculiarity of the game explored in the paper is that solvability sets (the sets wherefrom the interception can be guaranteed with a miss, which is not greater than some given value) and optimal feedback controls can be build numerically in a one-to-one antagonistic game, where ⋆ This work is supported by the Russian Foundation for Basic Research (projects 09-01-00436 and 10-01-96006) and by the Program “Mathematical control theory” of the Presidium of the RAS (project 09-Π-1-1015).

978-3-902661-93-7/11/$20.00 © 2011 IFAC

2. FORMULATION OF PROBLEM We consider a game in the plane. At the initial instant, which is assumed to be zero, one evader (E) heads two pursuers (P 1 and P 2), see Fig. 1.

Fig. 1. The initial positions of the pursuers and evader Let us assume that initial closing velocities are parallel and quite large, and control accelerations aﬀect only lateral components of objects’ velocities. Thus, one can suppose that instants of passages of the evader by each of the pursuers are ﬁxed. Below, we call them the termination instants and denote by Tf 1 and Tf 2 , respectively. We consider both cases of equal and diﬀerent termination instants. The players’ controls deﬁne the lateral deviations of the evader from the ﬁrst and second pursuers at the termination instants. Minimum of the absolute values of these deviations is called the resulting miss. The objective of the pursuers is minimization of the resulting miss, the evader maximizes it. The pursuers generate their controls by a coordinated eﬀort (from one control center). In the relative linearized system, the dynamics is the following (see Le M´enec (2008, 2010)): y¨1 = −aP 1 + aE , y¨2 = −aP 2 + aE , a˙ P 1 = (AP 1 u1 − aP 1 )/lP 1 , a˙ P 2 = (AP 2 u2 − aP 2 )/lP 2 , a˙ E = (AE v − aE )/lE . (1)

9326

10.3182/20110828-6-IT-1002.00797

18th IFAC World Congress (IFAC'11) Milano (Italy) August 28 - September 2, 2011

Here, y1 and y2 are the current lateral deviations of the evader from the ﬁrst and second pursuers; aP 1 , aP 2 , aE are the lateral accelerations of the pursuers and evader; u1 , u2 , v are the players’ controls; AP 1 , AP 2 , AE are the maximal values of the accelerations; lP 1 , lP 2 , lE are the time constants describing the inertiality of servomechanisms. The controls have bounded absolute values: |u1 | ≤ 1, |u2 | ≤ 1, |v| ≤ 1. (2) The linearized dynamics of all three objects in the problem under consideration is typical (see, for example, Shinar and Shima (2002)). Consider new coordinates x1 and x2 , which are the values of y1 and y2 forecasted to the corresponding termination instants Tf 1 and Tf 2 under zero players’ controls. One has xi = yi + y˙ i τi − aP i lP2 i h(τi /lP i )+ 2 h(τi /lE ), i = 1, 2. (3) aE lE Here, xi , yi , aP i , and aE depend on t, and τi = Tf i − t; h(α) = e−α + α − 1. We have xi (Tf i ) = yi (Tf i ). Passing to a new dynamics in the “equivalent” coordinates x1 and x2 (see Le M´enec (2008, 2010)), we obtain x˙ 1 = −AP 1 lP 1 h(τ1 /lP 1 )u1 + AE lE h(τ1 /lE )v, (4) x˙ 2 = −AP 2 lP 2 h(τ2 /lP 2 )u2 + AE lE h(τ2 /lE )v. Join both pursuers P 1 and P 2 into one player, which will be called the ﬁrst player. The evader E is the second player. The ﬁrst player governs the controls u1 and u2 ; the second one governs the control v. We introduce the following payoﬀ functional: ) ( ) ( φ x1 (Tf 1 ), x2 (Tf 2 ) = min x1 (Tf 1 ) , x2 (Tf 2 ) , (5) which is minimized by the ﬁrst player and maximized by the second one. Thus, we get a standard antagonistic game with dynamics (4) and payoﬀ functional (5). This game has the value function V (t, x), where x = (x1 , x2 ). Each level set { } Wc = (t, x) : V (t, x) ≤ c of the value function coincides with the maximal stable bridge (see Krasovskii and Subbotin (1974, 1988)) built from the target set { } Mc = (t, x) : t = Tf 1 , |x1 | ≤ c; t = Tf 2 , |x2 | ≤ c . The set Wc can be treated as the solvability set for the pursuit-evasion game with the result c. When c = 0, we have the situation of the exact capture. The exact capture implies equality to zero of at least one of yi at the instant Tf i , i = 1, 2. The main aim of this paper is to construct the sets Wc for the game under consideration. The diﬃculty of the problem is that time sections Wc (t) of these sets are nonconvex. Constructions are made by means of an algorithm for building maximal stable bridges worked out by the authors for problems with two-dimensional state variable. The algorithm is similar to one used in Patsko and Turova (2001). Another objective is to build optimal feedback controls of the ﬁrst player (that is, of the pursuers P 1 and P 2) and the second one (the evader E). The works Le M´enec (2008, 2010) consider only cases with exact captures and pursuers “stronger” than the evader. The latter means that the parameters AP i , AE and lP i , lE (i = 1, 2) are such that the maximal stable bridges

in the individual games (P 1 vs. E and P 2 vs. E) grow monotonically in the backward time. 3. MAXIMAL STABLE BRIDGE: CONTROL WITH DISCRIMINATION OF OPPONENT Let Tf 1 = Tf 2 . Denote for briefness Tf = Tf 1 . Below, MSB is an abbreviation for “maximal stable bridge”. MSB Wc is the set maximal by inclusion in the space t, x, t ≤ Tf and possessing the following stability property: for any position (t∗ , x∗ ) ∈ Wc (t∗ ), t∗ < Tf , any instant t∗ > t∗ , (t∗ ≤ Tf ), any constant control v of the second player, which obeys the (constraint )|v| ≤ 1, there is a measurable control t → u1 (t), u2 (t) of the ﬁrst player, t ∈ [t∗ , t∗ ), |u1 (t)| ≤ 1, |u2 (t)| ≤ 1, guiding system (4) from the state x∗ to the set Wc (t∗ ) at the instant t∗ . So, the stability property assumes discrimination of the second player by the ﬁrst one: the choice the ﬁrst player’s control in the interval [t∗ , t∗ ) is made after the second player announces his control in this interval. It is known (see Krasovskii and Subbotin (1974, 1988)) that any stable bridge Wc is close. Moreover, the set ( ) (2) Wc (t) = cl R2 \ Wc (t) (here, the symbol cl denotes the (2) operation of closure) is the time section of MSB Wc for the second player at the instant t. The bridge terminates (2) at the instant Tf on the set Mc = cl(R2 \ Mc ). If the (2) initial position of system (4) is in Wc and if the ﬁrst player is discriminated by the second one, then the second player is able to guide the motion of the system to the set (2) (2) Mc at the instant Tf . Thus, ∂Wc = ∂Wc . It is proved that for any initial position (t0 , x0 ) ∈ ∂Wc , the value c is the best guaranteed result for the ﬁrst (second) player in the class of feedback controls. Presence of an idealized element (the discrimination of the opponent) allowed to create eﬀective numerical methods for backward construction of MSBs (see Ushakov (1998)). Linearity of the dynamics and two-dimensionality of the state variable suﬃciently simplify the algorithms. If Tf 1 ̸= Tf 2 , then there is no any appreciable complication in constructing MSBs for the problem considered in this paper in comparison with the case Tf 1 = Tf 2 . Indeed, let Tf 1 > Tf 2 . Then in the interval (Tf 2 , Tf 1 ] in (4), we take into account only the dynamics of the variable x1 , when building the bridge Wc backwardly from the instant Tf 1 . With that, the terminal set at the instant Tf 1 is taken as {(x1 , x2 ) : |x1 | ≤ c}. When the constructions are made up to the instant Tf 2 , we take ∪{ } Wc (Tf 2 ) = Wc (Tf 2 + 0) (x1 , x2 ) : |x2 | ≤ c , and further constructions are made on the basis of this set. 4. STRONG PURSUERS, EQUAL TERMINATION INSTANTS Let us add dynamics (4) by a “cross-like” target set Mc = {|x1 | ≤ c} ∪ {|x2 | ≤ c}, c ≥ 0, at the instant Tf = Tf 1 = Tf 2 . Then we get a standard linear diﬀerential game with ﬁxed termination instant and nonconvex target set. The collection {Wc } of MSBs describes the value function of the game (4) with payoﬀ functional (5).

9327

18th IFAC World Congress (IFAC'11) Milano (Italy) August 28 - September 2, 2011

Choose the following typical parameters of the game: AP 1 = 2, AP 2 = 3, AE = 1, lP 1 = 1/2, lP 2 = 1/0.857, lE = 1, Tf 1 = Tf 2 = 6. Let us explain results of the study by means of Figs. 2–5. Fig. 2 shows results of constructing the set W = W0 (that is, for c = 0). In the ﬁgure, one can see several time sections W (t) of this set. The bridge has a quite simple structure. At the initial instant τ = 0 of the backward time (that is, when t = 6), its section coincides with the target set M0 , which is the union of two coordinate axes. Further, at the instants t = 4, 2, 0, the cross thickens, and two triangles are added to it. The widths of the vertical and horizontal parts of the cross correspond to sizes of the MSBs in the individual games with the ﬁrst and second pursuers. These triangles are located in the II and IV quadrants (where the signs of x1 and x2 are diﬀerent, that is, the evader is between the pursuers) give the zone where the capture is possible only under collective actions of both pursuers (trying to avoid one of the pursuer, the evader is captured by another one). These additional triangles have a simple explanation from the point of view of problem (1). Their hypotenuses have slope equal to 45◦ , that is, are described by the equation |x1 | + |x2 | = const. As the hypotenuse reaches a point (x1 , x2 ), the instant τ corresponds to the instant, when the pursuers cover together the distance |x1 (0)| + |x2 (0)|, which is between them at the initial instant t = 0. Therefore, at this instant, both pursuers come to the same point. Since the evader was initially between the pursuers, it is captured at this instant.

Fig. 3. Level sets of the value function for the case of two strong pursuers and equal termination instants

The set W built in the coordinates of system (4) coincides with the description of the solvability set obtained analytically in Le M´enec (2008, 2010). The solvability set for system (1) is deﬁned as follows: if in the current position of system (1) at the instant t, the forecasted coordinates

Fig. 4. Result of optimal motion simulation for the case of two strong pursuers and equal termination instants

Fig. 2. Time sections of the bridge W for the case of two strong pursuers and equal termination instants

Fig. 5. Trajectories in the original space for the case of two strong pursuers and equal termination instants

9328

18th IFAC World Congress (IFAC'11) Milano (Italy) August 28 - September 2, 2011

x1 , x2 are inside the time section W (t), then under the controls u1 , u2 the motion is guided to the target set M0 ; otherwise, if the forecasted coordinates are outside the set W (t), then there is an evader’s control, which deviates system (4) from the target set, therefore, there is no exact capture in original system (1). Time sections Wc (t) of other bridges Wc , c > 0, have shape similar to W (t). In Fig. 3, one can see the sections Wc (t) at t = 2 (τ = 4) for a collection {Wc } corresponding to some series of values of the parameter c. For other instants t, the structure of the sections Wc (t) is similar. The sets Wc (t) describe the value function x → V (t, x). The ﬁrst player governs two controls u1 and u2 . Velocity component of system (4) depending on u1 is horizontal, and the component depending on u2 is vertical. Analyzing the structure of the sections Wc (t) at some instant t, one can conclude that on any horizontal line, a minimum of the value function x → V (t, x) is attained in some interval including x1 = 0. It follows from this that for optimal feedback control it is necessary to take u01 (t, x) = 1 if x1 > 0, and u01 (t, x) = −1 if x1 < 0. Thus, the vertical axis is a switching line for the control u1 . In the axis, the optimal control can be taken arbitrary under constraint |u1 | ≤ 1. In the same way, on any vertical line, the minimum of the function x → V (t, x) is attained in some segment including x2 = 0. Take u02 (t, x) = 1 if x2 > 0, and u02 (t, x) = −1 if x2 < 0. The switching line for the control u2 is the horizontal axis. In the axis, the choice of the control is also arbitrary under condition |u2 | ≤ 1. The switching lines (the coordinate axes) at any t divide the plane x1 , x2 into 4 cells. In each of these cells, the optimal control of the ﬁrst player is constant. The vector control (u01 (t, x), u02 (t, x)) is applied in a discrete scheme with some time step ∆: a chosen control is kept constant during the next time step of length ∆. Then, on the basis of the new position, a new control is chosen, etc. When ∆ → 0, this control guarantees to the ﬁrst player a result not greater than V (t0 , x0 ) for any initial position (t0 , x0 ). Now, let us describe the optimal control of the second player. The vectogram of the second player in system (4) is a segment parallel to the diagonal of I and III quadrants. Using the sets Wc (t) at some instant t, let us analyze the change of the function x → V (t, x) along the lines parallel to this diagonal. Consider some of these line such that it passes through the II quadrant. One can see that local minima are attained at points, where the line crosses the axes Ox1 and Ox2 , and a local maximum is in the segment, where the line coincides with the boundary of some level set of the value function. The situation is similar for lines passing through the IV quadrant. As the switching lines for the second player’s control v, let us take three lines: the axes Ox1 and Ox2 , and a slope line Π(t), which consists of two half-lines passing through middles of the diagonal parts of the level sets boundaries in the II and IV quadrants. In the switching line, the control v can take arbitrary values such that |v| ≤ 1. Inside each of 6 cells, into which the plane is separated by the switching lines, the control is taken either v = +1 or v = −1, such that pulls the system towards the points of maximum. Applying this control in a discrete scheme with the time

step ∆, the second player guarantees as ∆ → 0 the result not less than V (t0 , x0 ) for any initial position (t0 , x0 ). Note: since W (t) ̸= ∅, then the global minimum of the function x → V (t, x) is attained at any x ∈ W (t) and equal 0. Thus, when the position (t, x) of the system is such that x ∈ W (t), the players can choose, generally speaking, any controls from their constraints. If x ∈ / W (t), the choices should be made according to the rules described above and based on the switching lines. In Fig. 4, one can see results of optimal motion simulation. This ﬁgure contains time sections W (t) (thin solid lines; the same sections as in Fig. 2), switching lines Π(0) at the initial instant and Π(6) at the termination instant of the direct time (dotted lines), and two trajectories for two diﬀerent initial positions: ξI (t) (thick solid line) and ξII (t) (dashed line). The motion ξI (t) starts from the point x01 = 40, x02 = −25 (marked by a black circle), which is inside the initial section W (0) of the set W . Thus, the evader is captured: the endpoint of the motion (also marked by a black circle) is at the origin. The initial point of the motion ξII (t) has coordinates x01 = 25, x02 = −50 (marked by a star). This position is outside the section W (0), and the evader escapes from the exact capture: the endpoint of the motion (also marked by a star) has nonzero coordinates. Fig. 5 gives the trajectories of the objects in the original space. For all simulations here and below, we take y10 = −x01 , y20 = −x02 , y˙ 10 = y˙ 20 = 0, a0P 1 = a0P 2 = a0E = 0. Solid lines correspond to the ﬁrst case, when the evader is successfully captured (at the termination instant, the positions of both pursuers are the same as the position of the evader). Dashed lines show the case, when the evader escapes: at the termination instant no one of the pursuers superposes with the evader. In this case, one can see as the evader aims itself to the middle between the terminal positions of the pursuers (this guarantees the maximum of the payoﬀ functional φ). 5. STRONG PURSUERS, DIFFERENT TERMINATION INSTANTS Take the parameters as in the previous section, except the termination instants. Now they are Tf 1 = 7 and Tf 2 = 5. MSB W = W0 for system (4) with the taken target set M0 = {t = Tf 1 , x1 = 0} ∪ {t = Tf 2 , x2 = 0} is built in the following way. At the instant τ1 = 0 (that is, t = Tf 1 ), the section of the bridge coincides with the vertical axis x1 = 0. At the instant τ1 = 2, τ2 = 0 (that is, t = Tf 2 ), we add the horizontal axis x2 = 0 to the bridge, which has expanded after the passed time period. Further, the time sections of the bridge are constructed using standard procedure under relation τ2 = τ1 −2. In the same manner, bridges Wc , c > 0, corresponding to target sets Mc = {t = Tf 1 , |x1 | ≤ c} ∪ {t = Tf 2 , |x2 | ≤ c} can be built: at the instant τ1 = 0 we take a vertical strip |x1 | 6 c, which shows the nonzero terminal distance c between the ﬁrst pursuer and the evader; then MSB from this strip is constructed up to the instant τ1 = 2; at this instant, unite it with the horizontal strip |x2 | 6 c corresponding to the same deviation c from the second pursuer; further, a bridge is constructed starting from this new section.

9329

18th IFAC World Congress (IFAC'11) Milano (Italy) August 28 - September 2, 2011

Results of construction of the set W are given in Fig. 6. When τ1 > 2, time sections W (t) grow both horizontally and vertically; two additional triangles appear, but now they are curvilinear. Analytical description of these curvilinear parts of the boundary is diﬃcult. Due to this, in Le M´enec (2008, 2010), there is only an upper estimation for the solvability set for this variant of the game. Total structure of the sections Wc (t) at t = 2 (τ1 = 5, τ2 = 3) is shown in Fig. 7. Optimal feedback controls of the pursuers and evader are constructed in the same way as in the previous example, except that the switch line Π(t) for the evader is formed by the corner points of the additional curvilinear triangles of the sets Wc (t), c ≥ 0. In Fig. 6, the trajectory for the initial point x01 = 50, x02 = −25 is shown as a solid line between two points marked by starts. The trajectories in the original space are shown in Fig. 8. One can see that at the beginning the evader escapes from the second pursuer and goes downstairs, after that the evader’s control is changed to escape from the ﬁrst pursuer and the evader goes upstairs. 6. TWO WEAK PURSUERS, DIFFERENT TERMINATION INSTANTS

Fig. 6. The bridge W and optimal motions for the case of two strong pursuers and diﬀerent termination instants

Now, we consider a more interesting variant of the game when both pursuers are weaker than the evader. Let us take the following parameters: AP 1 = 0.9, AP 2 = 0.8, AE = 1, lP 1 = lP 2 = 1/0.7, lE = 1, and diﬀerent termination instants Tf 1 = 7, Tf 2 = 5.

Fig. 7. Level sets of the value function for the case of two strong pursuers and diﬀerent termination instants

Fig. 8. Trajectories in the original space for the case of two strong pursuers and diﬀerent termination instants

Since in this variant the evader is more maneuverable than the pursuers, they cannot guarantee the exact capture. Let us take some level of the miss, namely, x1 (Tf 1 ) ≤ 2.0, x2 (Tf 2 ) ≤ 2.0. Time sections W2.0 (t) of the corresponding MSB are shown in Fig. 9. The upper-left subﬁgure corresponds to the instant when the ﬁrst player stops to pursue. The upper-right subﬁgure shows the picture for the instant, when the second pursuer ﬁnishes its pursuit. At this instant, the horizontal strip is added, which is a bit wider than the vertical one contracted during the passed period of the backward time. Then, the bridges contracts both in horizontal and vertical directions, and two additional curvilinear triangles appear (see middleleft subﬁgure). The middle-right subﬁgure gives the view of the section when the vertical strip collapses, and the lower-left subﬁgure shows the conﬁguration just after the collapse of the horizontal strip. At this instant, the section loses connectivity and disjoins into two parts symmetrical with respect to the origin. Further, these parts continue to contract (as it can be seen in the lower-right subﬁgure) and ﬁnally disappear. { } Time sections Wc (t) and corresponding switching lines of the ﬁrst player are given in Fig. 10 at the instant t = 0 (τ1 = 7, τ2 = 5). The dashed line is the switching line for the component u1 ; the dotted one is for the component u2 . The switching lines are obtained as a result of the analysis of the function x → V (t, x) in horizontal (for u1 ) and vertical (for u2 ) lines. In some region around the origin, the switching lines for u1 (respectively, for u2 ) diﬀer from the vertical (horizontal) axis. If in the considered horizontal (vertical) line, the minimum of the value function is attained in a segment, then the middle of such a segment

9330

18th IFAC World Congress (IFAC'11) Milano (Italy) August 28 - September 2, 2011

Fig. 11. Typical map of switching lines and optimal controls for the second player (the evader)

Fig. 12. Trajectories of the objects in the original space

Fig. 9. Time sections of MSB W2.0

is taken as a point for the switching line. Arrows show directions of components of the control in 4 cells. Similarly, in Fig 11, switching lines and optimal controls are displayed for the second player. Here, the switching lines are drawn with thick solid lines. We have 4 cells, where the second player’s control is constant. For simulations, let us take the initial position x01 = 12, x02 = −12 for system (4). In Fig. 12, trajectories of the objects are shown in the original space. At the beginning of the pursuit, the evader closes to the ﬁrst (lower) pursuer. It is done to increase the miss from the second (upper) pursuer at the instant Tf 2 . Further closing is not reasonable, and the evader switches its control to increase the miss from the ﬁrst pursuer at the instant Tf 1 . 7. COMPARATIVE ANALYSIS OF THE CASES OF STRONG AND WEAK PURSUERS

Fig. 10. Typical map of switching lines and optimal controls for the ﬁrst player (the pursuers)

1) The case of strong pursuers is, of course, the simplest. From the mathematical point of view, the origin is that for any c ≥ 0 the t-sections Wc (t) expand when the backward time grows (as, respectively, t decreases). Therefore, the (2) sections Wc (t) of MSB of the second player contract. Let (2) Tf 1 ≥ Tf 2 . At the instant Tf 2 , the set Wc (Tf 2 ) consists (2) of four convex sets Wci (Tf 2 ), i = 1, 4, one per quadrant of

9331

18th IFAC World Congress (IFAC'11) Milano (Italy) August 28 - September 2, 2011

the plane x1 , x2 . Due to linearity of dynamics (4) for the case of strong pursuers, such a structure (one convex set (2) Wci (t) in each quadrant) is kept for any instant t ≤ Tf 2 . (2) So, we can build any of Wci (t), i = 1, 4, independently using a simpler algorithm (see Isakova et al. (1984); Kumkov and Patsko (2001)) for backward construction of MSBs dealing with the case of convex sections. This was made by authors to check the results of the algorithm worked out for a general case. The check was successful, and the sets built in two ways coincide. (2) Wci (t)

The convexity of the sets allows to use the works (Botkin and Patsko (1983); Patsko (2006); Zarkh (1990)) for justiﬁcation of constructing the players’ optimal feedback strategies on the basis of switching lines. These works suggest algorithms for the case of linear diﬀerential games with payoﬀ function having convex Lebesgue sets. In the case of strong pursuers, the optimal feedback control of the ﬁrst player is very simple: switching lines deﬁning the controls u1 and u2 do not depend on t; the switching line for u1 coincides with the axis x2 , and the switching line for u2 is the axis x1 . So, we have u0i (t, x) = sign xi (t), i = 1, 2, that is, each pursuer, when acts optimally, independently targets to the evader (in the forecasted coordinates). But the optimal control of the second player is not so trivial: one of switching lines for his control, namely, Π(t) depends on t, and its computation, especially for the case Tf 1 ̸= Tf 2 , is diﬃcult. 2) In the case of weak pursuers, the character of evolution of MSBs Wc changes. For any c > 0 with decreasing t, there is an instant, when a connected set Wc (t) disjoins into two separate parts. As t decreases further, the set Wc (t) becomes empty. Also, there are two instants when the set Wc (t) contracts jump-like (at one instant, the contraction is along one coordinate axis, at another one, the sets contracts along the second axis). Complication of the structure of the sets Wc , c > 0, aﬀects the geometry of the switching lines of the players. Justiﬁcation of optimality of the suggested methods of control (as parameters of time and space discretization tend to zero) needs additional study in this case. 3) Numerical construction of the switching lines and usage of them for generating optimal feedback controls have acceptable computation diﬃculty. But note that obtaining the current state x(t) of system (4), which is necessary to get the optimal control, needs knowledge of all state variables of the original system including ap1 (t), ap2 (t), and aE (t), which is quite problematic in practical situations. 8. CONCLUSION Presence of two pursuers acting together and minimizing the miss from the evader leads to nonconvexity of the time sections of the value function, when the situation is considered as a standard antagonistic diﬀerential game, where both pursuers are joined into one player. In the paper, results of numerical study of this problem are given for several variants of the parameters. The structure of the solution depends on the presence or absence of dynamic

advantage of one or both pursuers over the evader. We have not considered a diﬃcult case, when at the ﬁnal stage of pursuit the advantage belongs to the evader, but at the initial stage, at least one of the pursuers has advantage over the evader. Further investigations will deal with just this variant of the game. REFERENCES Abramyantz, T.G. and Maslov, E.P. (2004). A diﬀerential game of pursuit of a group target. Izv. Akad. Nauk Teor. Sist. Upr, (5), 16–22. (in Russian). Blagodatskih, A.I. and Petrov, N.N. (2009). Conﬂict Interaction Controlled Objects Groups. Udmurt State University, Izhevsk, Russia. (in Russian). Botkin, N.D. and Patsko, V.S. (1983). Universal strategy in a diﬀerential game with ﬁxed terminal time. Problems of Control and Inform. Theory, (11), 419–432. Chikrii, A.A. (1997). Conﬂict-Controlled Processes, volume 405 of Mathematics and its Applications. Kluwer Academic Publishers Group, Dordrecht. Grigorenko, N.L. (1991). The problem of pursuit by several objects. In Diﬀerential games — developments in modelling and computation (Espoo, 1990), volume 156 of Lecture Notes in Control and Inform. Sci., 71–80. Springer, Berlin. Hagedorn, P. and Breakwell, J.V. (1976). A diﬀerential game with two pursuers and one evader. Journal of Optimization Theory and Applications, 18(2), 15–29. Isakova, E.A., Logunova, G.V., and Patsko, V.S. (1984). Computation of stable bridges for linear diﬀerential games with ﬁxed time of termination. In A.I. Subbotin and V.S. Patsko (eds.), Algorithms and Programs for Solving Linear Diﬀerential Games, 127–158. Institute of Mathematics and Mechanics, Ural Scientiﬁc Center, Academy of Sciences of USSR, Sverdlovsk. (in Russian). Krasovskii, N.N. and Subbotin, A.I. (1974). Positional Diﬀerential Games. Nauka, Moscow. (in Russian). Krasovskii, N.N. and Subbotin, A.I. (1988). Game-Theoretical Control Problems. Springer-Verlag, New York. Kumkov, S.S. and Patsko, V.S. (2001). Construction of singular surfaces in linear diﬀerential games. In E. Altman and O. Pourtallier (eds.), Annals of the International Society of Dynamic Games, Vol. 6: Advances in Dynamic Games and Applications, 185–202. Birkhauser, Boston. Levchenkov, A.Y. and Pashkov, A.G. (1990). Diﬀerential game of optimal approach of two inertial pursuers to a noninertial evader. Journal of Optimization Theory and Applications, 65, 501–518. Le M´enec, S. (2008). Linear diﬀerential game with two pursuers and one evader. In Abstracts of 13th International Symposium on Dynamic Games and Applications, 149–151. Wroclaw University of Technology, Wroclaw, Poland. Le M´enec, S. (2010). Linear diﬀerential game with two pursuers and one evader. In Annals of the International Society of Dynamic Games, Vol. 11: Advances in Dynamic Games and Applications. Birkhauser, Boston. (to appear). Patsko, V.S. (2006). Switching surfaces in linear diﬀerential games. Journal of Mathematical Sciences, 139(5), 6909–6953.

9332

18th IFAC World Congress (IFAC'11) Milano (Italy) August 28 - September 2, 2011

Patsko, V.S. and Turova, V.L. (2001). Level sets of the value function in diﬀerential games with the homicidal chauﬀeur dynamics. International Game Theory Review, 3(1), 67–112. Pschenichnyi, B.N. (1976). Simple pursuit by several objects. Kibernetika, 3, 145–146. (in Russian). Shinar, J. and Shima, T. (2002). Non-orthodox guidance law development approach for the interception of maneuvering anti-surface missiles. Journal of Guidance, Control, and Dynamics, 25(4), 658–666. Stipanovic, D.M., Melikyan, A.A., and Hovakimyan, N. (2009). Some suﬃcient conditions for multi-player pursuit-evasion games with continuous and discrete observations. In P. Bernhard, V. Gaitsgory, and O. Pourtallier (eds.), Annals of the International Society of Dynamic Games, vol. 11: Advances in Dynamic Games and Applications, 133–145. Springer, Berlin. Ushakov, V.N. (1998). Construction of solutions in diﬀerential games of pursuit-evasion. In Diﬀerential Inclusions and Optimal Control, volume 2 of Lecture Notes in Nonlinear Analysis, 269–281. Nicholas Copernicus University, Torun. Zarkh, M.A. (1990). Universal strategy of the second player in a linear diﬀerential game. Prikl. Math. Mekh., 54(3), 395–400. (in Russian).

9333

Numerical Study of Two-on-One Pursuit-Evasion Game

Numerical Study of Two-on-One Pursuit-Evasion Game

Recommend Documents