Mathematical and Computer Modelling 52 (2010) 1506–1520
Modeling crowd dynamics by the mean-field limit approach
Christian Dogbé
Department of Mathematics, University of Caen, CNRS UMR 6139, BP 5186, F-14032 Caen, France
Article history: Received 25 February 2010; received in revised form 28 May 2010; accepted 1 June 2010
Keywords: Crowd behavior dynamics; Differential games; Feedback control; Mean-field limit
Abstract
We propose to couple, through optimal control theory, dynamical crowd evolution models in which pedestrians make decisions with a well-defined set of actions and strategies by minimizing their utility function. We explain how situations of this kind can be modeled using differential game theory. Using some simple examples, we introduce the fundamental concepts of mean field games, recently introduced by Lasry and Lions [J.-M. Lasry, P.-L. Lions, Mean field games, Jpn. J. Math. 2 (1) (2007) 229–260]. The main interest of mean field games in our work is to simplify the interactions between pedestrians. As a consequence, we derive a continuum formulation of the crowd dynamics and nonlinear systems of partial differential equations for crowd models. © 2010 Elsevier Ltd. All rights reserved.
1. Motivations and contents This paper extends the modeling of crowd dynamics on the basis of the conceptual development of a new class of equations recently proposed by Bellomo and Dogbé [1,2] to model the evolution of systems of interacting pedestrians. Usually, one distinguishes two levels of modeling: microscopic and macroscopic. In the microscopic framework, pedestrians are treated as individual entities (particles). The evolution of the particles in time is determined by physical and social laws which describe the interactions among the particles as well as their interactions with the physical surroundings. Examples of microscopic methods are social-force models (see [3–5] and the references therein). In contrast to microscopic models, macroscopic models treat the whole crowd as an entity without considering the movement of single individuals. They express the law of pedestrian movement through paradigms such as fluid mechanics. This approach was initiated by Hughes [6,7] and subsequently developed by various authors [1,8] by means of classical methods of continuum mechanics, based on mass and momentum conservation equations properly closed by phenomenological models relating the acceleration term, or mean velocity, to local flow conditions. Pedestrian dynamics has some obvious similarities with fluids; it is therefore not surprising that, very much as for vehicular dynamics, the earliest models of pedestrian dynamics took inspiration from hydrodynamics or gas-kinetic theory. Classical approaches use the well-known concepts of fluid and gas dynamics [9]. Various models developed for macroscopic traffic flow can be readily extended to capture the dynamics of pedestrian flow. For an extensive review of the different modeling approaches we refer to [10,11].
Though such models seem to reproduce pedestrian movement, the movements of pedestrians in these models must be directly related to the decision-making processes of pedestrians, and, as the characteristics of pedestrian flow are apparently affected by the decisions of pedestrians, it is necessary to model these processes properly. It is interesting to note that stochasticity is an essential ingredient in capturing the real essence of the interactions among pedestrians. The dynamics of pedestrians can be either deterministic or stochastic. In the first case, the behavior at a certain time is completely determined by the present state. In stochastic models, the behavior is controlled by certain probabilities, so that pedestrians can react differently in the same situation.
The author is indebted to the referees, whose valuable remarks helped to improve the quality of this paper.
0895-7177/$ – see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.mcm.2010.06.012
The main question is how to model such a process. One way to tackle this problem is to use the techniques of statistical mechanics, particularly the mean-field limit approach, linked to differential population games. In these games, the population profile evolves according to deterministic or stochastic differential equations or inclusions, and the state process is a jump and drift process. Each individual in a large population interacts with other randomly selected pedestrians. The states and actions of each pedestrian in an interaction together determine the instantaneous payoff for all involved pedestrians. This class is related to the so-called mean field games with additional individual state processes. Lasry and Lions [12] develop a mean field approach for optimal control and differential games (with continuous state and time): mean field games with a continuum of players, where the payoff to each player depends on the other players' actions only through their statistical distribution. The basic idea is that in very large games, players are not influenced by individual actions but only by average properties. An equilibrium is then a distribution of actions among the population such that each individual action is a best reply against this distribution. A mean field game is a generalization of the statistical mechanics problem solved by the Hamilton–Jacobi equations, which govern a continuum of particles moving so as to optimize their final location and minimize their cost of travel. In a mean field game, the particles also ‘‘care’’ about the density of the other particles around them, so the utility of each particle depends on the behavior of the other particles. This is what makes it a form of game theory.
Furthermore, one of the complexity problems to be taken into account in this kind of modeling is the ‘‘rationality hypothesis’’ for the pedestrian, in the sense that each pedestrian's choice derives from an optimization given his preferences. For example, consider large crowds of pedestrians attending events, such as sporting events or religious gatherings. The behavior of such crowds can be described as rational and goal-directed, because the members of the crowds have clear knowledge of their goals and of where they lie. The motion of a crowd can therefore be modeled using rational principles. In this paper, it is assumed that the decisions of any single individual have a negligible effect on the performance of others. The above criterion is based on what is called the Nash equilibrium in game theory [13]. One principle of this criterion is that, in the equilibrium situation, no cooperation between individuals is assumed, and all individuals make their decisions in an egoistic and rational way. The usual concept of Nash equilibrium requires that, at an equilibrium profile, every player (almost every player in the case of infinitely many players represented as a measure space) maximizes his payoff as a function of his strategy, given the strategies of the remaining players. This leads us to use the concept of Nash equilibrium to describe the anticipation behavior of pedestrians. Finally, to solve these problems, we have been led to introduce the fundamental concept of the empirical measure. The advantage of working with the empirical measure rather than the microscopic configuration is that all empirical measures live in the same (infinite-dimensional) space of probability measures on Euclidean space. It then becomes possible to ask whether these empirical measures converge to something, whereas this would not be obvious at all for the microscopic configurations, which live in different spaces as N varies.
We will see below that, in the random case, the expectation of the empirical measure provides a good concept of density of the pedestrians. The paper is structured into five further sections. In order to be self-contained, we present an overview of the stochastic game in Section 2. Then, in Section 3, we model the problem in the case of a single pedestrian, in the deterministic and stochastic settings. Under certain assumptions, the optimal control problem is partially solved by deriving the so-called Hamilton–Jacobi–Bellman equation. Section 4 is devoted to the modeling of N pedestrians distributed through space. This leads formally to the Fokker–Planck equation. Since we are faced with a family of problems with finitely many pedestrians, in the sense that the interactions are envisaged between a finite number of pedestrians, it is conceptually unclear what we mean by ‘‘optimality’’ and even more unclear what we mean by ‘‘an optimal control law’’; so our first task is to specify exactly what problem we are facing when the number of pedestrians tends to infinity. This task is accomplished in Section 5, where we consider stochastic differential games and N-pedestrian Nash points, and then pass to the limit as N goes to infinity and formally derive the mean field games equations. Section 6 develops an analysis of some complexity problems related to various features of the class of models we are dealing with and provides an outlook on research perspectives concerning the mathematical problems generated by the models reviewed in this paper.
2. Notation and definitions In order to appreciate the forthcoming sections and to avoid unnecessary confusion, it is helpful to understand the form of game that represents people's behavior in pedestrian flow. In our work we adopt the normal form representation. Formally, a normal form of a game Γ is given by
$$ \Gamma = \langle \{1, \dots, N\}, \Sigma, U, \{J_i\} \rangle, \tag{2.1} $$
where $\{1, \dots, N\}$ is the set of players (pedestrians); Σ refers to the dynamical system, described by the stochastic differential equation we deal with; $U_i$ is the action set of player i; $U = U_1 \times U_2 \times \cdots \times U_N$ is the Cartesian product of the sets of actions available to each player; and $\{J_i\} = \{J_1, \dots, J_N\}$ is the set of the so-called utility functions that each player i wishes to minimize, where $J_i : U \to \mathbb{R}$. For every player i, the utility function is a function of the action $u_i$ chosen by pedestrian i and of the actions chosen by all the pedestrians in the game other than pedestrian i, denoted by $u_{-i}$. Together, $u_i$ and $u_{-i}$ make up the action tuple u. An action tuple is a unique choice of actions by each pedestrian. From this model, steady-state conditions known as Nash equilibria can be identified. The Nash equilibrium concept is used to analyze the outcome of the strategic interaction of several decision makers. It is a way of predicting what will happen if several people are making decisions at
Fig. 1. Geometry of the domain occupied by the crowd.
the same time, and if the decision of each one depends on the decisions of the others. Mathematically, $u_i^*$ is a best response by pedestrian i to $u_{-i}$ if $u_i^* \in \arg\min_{u_i \in U_i} J_i(u_i, u_{-i})$, i.e. $u_i^* \in U_i$ and $J_i(u_i^*, u_{-i}) \le J_i(u_i, u_{-i})$ for all $u_i \in U_i$. A Nash equilibrium (NE) is an action tuple that corresponds to the mutual best response: for each pedestrian i, the action selected is a best response to the actions of all others. Formally, the action tuple $u^* = (u_1^*, u_2^*, \dots, u_N^*)$ is a NE if $J_i(u_i^*, u_{-i}^*) \le J_i(u_i, u_{-i}^*)$ for all $u_i \in U_i$ and all $i \in \{1, \dots, N\}$. In the setting considered here, from the point of view of the ith pedestrian, $J_i(u_i, (u_j)_{j \ne i})$ is a symmetric function of the $u_j$, $j \ne i$, and these functions are independent of i. Broadly, there are static or dynamic games, with complete or incomplete information. Crucially, in a static game, players take their decisions simultaneously (individually and independently); they then move (not necessarily simultaneously, but bound to the decisions they took) and then receive their payoffs. That is, the pedestrians in a static game are unaware of the strategies the other players in the game may choose, but any player may hypothesize on the strategies other players may choose. In a dynamic game, decisions are taken sequentially. A game with complete information is a game in which each pedestrian knows the game Γ, notably the set of pedestrians $\{1, \dots, N\}$, the set of strategies U and the set of payoff functions $\{J_i\}$. Otherwise, the game is classified as a game of incomplete information. At this point of the discussion, it is very important to state explicitly that in this paper we consider the game to be with complete information. Under this assumption, we postulate that the pedestrians know the state of the system at each instant of time.
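As an illustration of the Nash equilibrium condition above, the following minimal Python sketch enumerates the pure Nash equilibria of a small finite game by checking that no pedestrian can lower his own cost by a unilateral deviation. The function name `pure_nash_equilibria` and the two-door congestion example are hypothetical and are not taken from the paper; the finite action sets are an assumption made only to keep the check exhaustive.

```python
from itertools import product


def pure_nash_equilibria(costs, n_actions):
    """Return all action tuples u such that no player i can lower J_i
    by changing only his own action u_i (pure Nash equilibria).

    costs[i](u)   -- cost J_i of player i for the action tuple u
    n_actions[i]  -- number of actions available to player i
    """
    equilibria = []
    for u in product(*(range(n) for n in n_actions)):
        is_ne = True
        for i, n in enumerate(n_actions):
            for alt in range(n):
                deviated = u[:i] + (alt,) + u[i + 1:]
                if costs[i](deviated) < costs[i](u):
                    is_ne = False  # player i has a profitable deviation
                    break
            if not is_ne:
                break
        if is_ne:
            equilibria.append(u)
    return equilibria


# Hypothetical example: two pedestrians each choose door 0 or door 1.
# Sharing a door costs 2 (congestion); using a door alone costs 1.
def door_cost(u):
    return 2.0 if u[0] == u[1] else 1.0


ne = pure_nash_equilibria([door_cost, door_cost], [2, 2])
print(ne)
```

In this toy game the cost of each pedestrian depends on the other only through whether the door is shared, which is the kind of symmetric dependence discussed above.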
3. Optimal control formulation of a single pedestrian Before considering the problem in its general form, we use a calculus of variations approach to develop the dynamics of a pedestrian. We consider the behavior of a single pedestrian trying to optimize his path in spacetime with respect to a fixed cost function to be minimized. One could also reverse the sign here, and maximize a utility function rather than minimize a cost function; mathematically, there is no distinction between the two. Some geometrical notation is necessary to identify the target of the crowd. The dynamical state of a pedestrian is defined by his position x ∈ Ω and velocity v ∈ Ω, where Ω is a bounded domain representing the pedestrian area, and we assume that the target $T_a$ lies on the boundary ∂Ω. The geometry is represented in Fig. 1, showing an outlet zone corresponding to evacuation, for instance a point of the boundary corresponding to the exit. The geometry can be further modified by inserting internal obstacles and an inlet zone [1]. It is important to note that when the domain Ω is strewn with obstacles, we must take into account that some points x ∈ Ω cannot be directly connected to the target point by straight paths. We define the pedestrian velocity field $\mathbf{v}$ at the point x ∈ Ω at time t by $\mathbf{v}(t,x) = v(t,x)\,\nu(x)$, where v is the magnitude of $\mathbf{v}$ (i.e., the pedestrian speed) and ν is a unit vector in the direction of $\mathbf{v}$. In general, the direction of pedestrians is modeled as the superposition of two contributions: the desired direction, i.e. the direction that the pedestrian would take to reach his goal in a clear area, and the direction of interaction, namely the direction that pedestrians take to avoid congested areas. For simplicity, we can take Ω to be a Euclidean space $\mathbb{R}^d$, $d \ge 2$, with all of its usual structure. Then the natural phase space for the system of N pedestrians is $(\mathbb{R}^d)^N$.
For the state of the pedestrians we adopt the notation:
$$ X(t) := (x, v), \quad \text{where } x := (x_1, \dots, x_N) \text{ and } v := (v_1, \dots, v_N). \tag{3.1} $$
As the positions of the pedestrians can be represented by points $x_i(t)$ in space, which change continuously over time t, pedestrian dynamics can be described by the following equations of motion [1]:
$$ \frac{d}{dt} x(t) = v(t) \quad \text{and} \quad \frac{d}{dt} v(t) = \gamma(t), \tag{3.2} $$
written in vector form to take into account that pedestrians move in more than one space dimension. Here, γ(t) denotes the acceleration vector. A pedestrian can influence the state X directly via the control α defined by the acceleration γ(t), i.e.
$\alpha(t) := \gamma(t)$. Using the state definition in (3.1), the dynamics of the pedestrian traffic system are described by a system of coupled ordinary differential equations
$$ \frac{d}{dt} X(t) = (\dot{x}, \dot{v}) = (v, \gamma) = b(X(t), \alpha(t)), \tag{3.3} $$
where b is an arbitrary function representing the autonomous dynamics and $\alpha = (\alpha_1, \dots, \alpha_N)$ is the control vector, reflecting how the pedestrians influence the state X (e.g. by accelerating and changing direction). In the following, we will suppose that b is sufficiently differentiable. This is a strong assumption. Before getting into the body of the modeling of our problem, we make the following assumptions:
(H1)
(1) The pedestrians are assumed to have complete and accurate knowledge of the global state of the system.
(2) The dynamics of each pedestrian is additive in the control and disturbed by additive Brownian noise.
(3) We assume the existence of objective interactions among the individuals and the surrounding environment.
(4) The performance of the pedestrians is valued by a global cost function, which is an integral of instantaneous costs plus an end cost. The joint task of the pedestrians is modeled by the end cost.
Assume now that a pedestrian is at some location x(0) at time t = 0 in the domain Ω and would like to end up at some better location x(T) at a later time t = T > 0. Each location x in the domain Ω has some cost $J(T, x) \in \mathbb{R}$ at this final time T, which is small when x is a desirable location and large otherwise, so that the pedestrian would like to minimize J(T, x(T)). The cost function J weights every possible trajectory of the pedestrian. If we are not concerned with the transportation problem [14,15], we simply find the value of x(T) that minimizes J(T, x(T)), and the pedestrian takes an arbitrary path (e.g. a constant-velocity straight-line path) from x(0) at time t = 0 to x(T) at time t = T. Now suppose that there is a transportation cost in addition to the cost of the final location; for instance, moving at too high a velocity may incur an energy cost. To model this, we introduce a velocity cost function, the so-called ‘‘running cost’’ $L : \Omega \to \mathbb{R}$, where L(v)dt measures the marginal cost of moving at a given velocity v for time dt, and then define the functional of a trajectory $x : [0, T] \to \Omega$ of the form
$$ J(x) = \int_0^T L(x'(t))\,dt + \phi(T, x(T)). \tag{3.4} $$
The function φ is a pure cost term (the terminal cost), which corresponds to the effort the pedestrian must supply to move. It is arbitrary, and represents the environment of the pedestrian. Examples of running costs are: distance between origin and destination, proximity of obstacles or other physical obstructions, stimulation and attractiveness of the environment. It is often assumed that L and φ are convex functions. Generally, it is natural to choose L to be convex or strictly convex and to assume that L is even, i.e. L(−v) = L(v). This is a finite-horizon control problem, in which the control is the time derivative of the state and there is a finite number of stages. Note that in (3.4), the horizon of the problem is T and the starting time of the study of the system is t = 0. The goal now is to select a trajectory that minimizes the functional (3.4). In (3.4) the running cost depends only on the velocity and not on space and time. We can also consider cost functions depending on these parameters, i.e. L(t, x, v), in which case we replace Eq. (3.4) by
$$ J(x, v) = \int_0^T L(t, x(t), v(t))\,dt + \phi(T, x(T)). \tag{3.5} $$
Alternatively, one can consider the case of an infinite time horizon, in which the pedestrians are not aware of the duration of the game. Clearly, this is often the case in strategic interactions. In this case, it is natural to insert an exponential factor, discounting the value of gains far into the future. For some discount factor ζ > 0, we thus define
$$ J(x, v) = \int_0^{\infty} e^{-\zeta t} L(t, x(t), v(t))\,dt. \tag{3.6} $$
Different kinds of cost functions are relevant in problems presenting congestion effects, such as traffic on highways or crowds moving in domains with obstacles. A model example of a cost function is the quadratic cost function $L(v) = \frac{1}{2}|v|^2$ (normalizing factors other than $\frac{1}{2}$ can of course be used here). Another case that can be considered is the discomfort of a pedestrian due to walking too close to obstacles and walls, based on the specification of the repellent force term of obstacles proposed by Helbing [16] and Hoogendoorn [17]. In that case, if obstacles are described by areas $O_k \subset \Omega$, for k = 1, . . . , M, one can assume that, for any obstacle k, the running cost component L is given by a monotonically decreasing function of the distance $d(x, O_k)$ between the location x of the pedestrian and the obstacle, i.e. $L(t, x, v) = a_k \exp(-d(x, O_k)/b_k)$, where the distance d is defined as the minimum distance between the pedestrian location x and obstacle k:
$$ d(x, O_k) = \min_{y \in O_k} \|x - y\|. \tag{3.7} $$
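The obstacle cost and the distance (3.7) can be sketched numerically as follows. This is only an illustration under stated assumptions: the obstacle $O_k$ is approximated by a finite set of sample points, and the helper names `obstacle_distance` and `obstacle_running_cost`, as well as the numerical values, are hypothetical and do not come from the paper.

```python
import numpy as np


def obstacle_distance(x, obstacle_points):
    """d(x, O_k) = min over y in O_k of ||x - y||, with O_k given as sample points."""
    diffs = np.asarray(obstacle_points, dtype=float) - np.asarray(x, dtype=float)
    return float(np.min(np.linalg.norm(diffs, axis=1)))


def obstacle_running_cost(x, obstacle_points, a_k, b_k):
    """Running cost a_k * exp(-d(x, O_k) / b_k): large near the obstacle, small far away."""
    return a_k * np.exp(-obstacle_distance(x, obstacle_points) / b_k)


# Hypothetical obstacle sampled by two points, and a pedestrian at the origin.
obstacle = [(1.0, 0.0), (3.0, 0.0)]
near = obstacle_running_cost((0.0, 0.0), obstacle, a_k=2.0, b_k=1.0)
far = obstacle_running_cost((-1.0, 0.0), obstacle, a_k=2.0, b_k=1.0)
print(near, far)
```

Moving the pedestrian one unit further from the obstacle multiplies the cost by $e^{-1/b_k}$, which makes the decay of the cost with distance explicit.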
The parameters $a_k > 0$ and $b_k > 0$ are scaling parameters, describing the region of influence of obstacle k. Both depend on the type of obstacle considered. One way to compute the optimal trajectory is to solve the Euler–Lagrange equation associated with Eq. (3.4), with the boundary condition that the initial position x(0) is fixed. This is an ODE, which can be solved by a variety of methods. Another method consists in deriving a PDE rather than an ODE. To do so, let us take $0 \le t_0 \le T$ and $x_0 \in \Omega$ and define the optimal cost $u(t_0, x_0)$ at the point $(t_0, x_0)$ in spacetime to be
$$ u(t_0, x_0) = \inf \left[ \int_{t_0}^{T} L(x'(t))\,dt + \phi(T, x(T)) \right] \tag{3.8} $$
over all (smooth) paths $x : [t_0, T] \to \Omega$ starting at $x(t_0) = x_0$ and with an arbitrary endpoint x(T). Informally, this is the cost the pedestrian would place on being at position $x_0$ at time $t_0$. By definition, when $t_0 = T$, the optimal cost at $x_0$ coincides with the final cost at $x_0$. The final cost u(T, ·) can thus be viewed as a boundary condition for the optimal cost. For times $t_0$ less than T, it turns out that (under some regularity hypotheses, which we will not discuss here) the optimal cost function obeys a partial differential equation, known as the Hamilton–Jacobi–Bellman equation, which we shall derive heuristically (using infinitesimals as an informal tool) as follows. Assume that the pedestrian finds himself at position $x_0$ at some time $t_0 < T$ and is deciding where to go next. Presumably there is some optimal velocity v at which the pedestrian should move (a priori, this velocity need not be unique). So, if dt is an infinitesimal time, the pedestrian should move at this velocity for time dt, ending up at a new position $x_0 + v\,dt$ at time $t_0 + dt$, and incurring a travel cost of L(v)dt. At this point, the optimal cost for the remainder of the pedestrian's journey is given by $u(t_0 + dt, x_0 + v\,dt)$, by definition of u. This leads to the heuristic formula
$$ u(t_0, x_0) = u(t_0 + dt, x_0 + v\,dt) + L(v)\,dt, \tag{3.9} $$
which on Taylor expansion (and omitting higher order terms) gives
$$ u(t_0, x_0) = u(t_0, x_0) + dt \left[ \partial_t u(t_0, x_0) + v \cdot \nabla_x u(t_0, x_0) + L(v) \right]. \tag{3.10} $$
On the one hand, v is chosen to minimize the total cost. Thus we see that v should be chosen to minimize the expression
$$ v \cdot \nabla_x u(t_0, x_0) + L(v). \tag{3.11} $$
On the other hand, by the strict convexity of L, the minimizing v is unique, and is some function of $\nabla_x u(t_0, x_0)$. If we introduce the Legendre transform $H : \Omega \to \mathbb{R}$ of $L : \Omega \to \mathbb{R}$ by the formula
$$ H(p) := \sup_{v \in \Omega} \left[ v \cdot p - L(v) \right], \tag{3.12} $$
then, by using the hypothesis that L is even, the minimal value of $v \cdot \nabla_x u(t_0, x_0) + L(v)$ is just $-H(\nabla_x u(t_0, x_0))$. We conclude that
$$ u(t_0, x_0) = u(t_0, x_0) + dt \left[ \partial_t u(t_0, x_0) - H(\nabla_x u(t_0, x_0)) \right], \tag{3.13} $$
leading to the Hamilton–Jacobi–Bellman equation
$$ -\partial_t u + H(\nabla_x u) = 0. \tag{3.14} $$
Eq. (3.14) is solved backwards in time, as the optimal cost is prescribed at the final time t = T, but we are interested in its value at earlier times, and in particular at t = 0. Once one solves this equation, one can work out the optimal velocity $v = v(t_0, x_0)$ at which to travel at each point $(t_0, x_0)$. Indeed, from the above discussion we know that v minimizes the expression $v \cdot \nabla_x u(t_0, x_0) + L(v)$, and thus, L being even, $\tilde{v} := -v$ maximizes the expression $\tilde{v} \cdot p - L(\tilde{v})$, where $p := \nabla_x u(t_0, x_0)$. By definition, the value of this expression is equal to H(p):
$$ H(p) = \tilde{v} \cdot p - L(\tilde{v}). $$
We can view $\tilde{v}$ as a function of p. As $\tilde{v}$ maximizes this expression for fixed p, we see that
$$ \frac{\partial}{\partial \tilde{v}} \left( \tilde{v} \cdot p - L(\tilde{v}) \right) = 0. $$
Applying the chain rule, we conclude that
$$ H'(p) = \tilde{v} $$
and so the correct velocity at which to move is given by
$$ v = -H'(\nabla_x u). \tag{3.15} $$
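The heuristic formula (3.9) can be turned directly into a backward recursion on a grid. The sketch below does this in one space dimension for the quadratic running cost $L(v) = \frac{1}{2}|v|^2$ and compares the result with a closed-form value. The terminal cost $\phi(x) = \frac{1}{2}x^2$, the grid, the horizon and the function name `backward_dp` are assumptions chosen for illustration only; for this particular choice the optimal cost is known to be $u(t, x) = x^2 / (2(1 + T - t))$, which is used as the reference value.

```python
import numpy as np


def backward_dp(phi, x, T, n_steps):
    """Backward recursion u(t, x_i) = min_j [ u(t + dt, x_j) + L(v) dt ]
    with L(v) = |v|^2 / 2 and v = (x_j - x_i) / dt, on the grid x."""
    dt = T / n_steps
    # move_cost[i, j] = L(v) * dt for travelling from x[i] to x[j] during dt
    move_cost = (x[None, :] - x[:, None]) ** 2 / (2.0 * dt)
    u = phi(x)  # boundary condition: u(T, .) equals the terminal cost
    for _ in range(n_steps):
        u = np.min(move_cost + u[None, :], axis=1)
    return u


# Illustrative (assumed) data: quadratic terminal cost on the interval [-2, 2].
x = np.linspace(-2.0, 2.0, 401)
T = 1.0
u0 = backward_dp(lambda y: 0.5 * y**2, x, T, n_steps=10)

i = int(np.argmin(np.abs(x - 1.0)))      # grid index closest to x0 = 1
exact = x[i] ** 2 / (2.0 * (1.0 + T))    # closed-form optimal cost at t = 0
print(u0[i], exact)
```

The recursion restricts the pedestrian to jumps between grid points, so the computed cost is slightly above the closed-form value; refining the grid reduces the gap.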
We point out that also introducing chance into the movement of the pedestrian influences the state trajectory throughout a given time interval. Since these trajectories depend on a random parameter, the actions of this additional player (chance) will be realizations of a stochastic process whose statistics are known a priori. Since the pedestrian evolves in Ω, a d-dimensional continuous state space in $\mathbb{R}^d$, and his state evolves in time according to a controlled stochastic differential equation, we can describe this mathematically in terms of a state equation as
$$ dX(t) = b(X(t))\,dt + \sigma(X(t))\,dB(t), \tag{3.16} $$
in accordance with assumption (H1). The noise in the dynamics (3.16) is modeled by the Brownian motion B(t), i.e., a normally distributed d-dimensional stochastic process in continuous time with mean 0 and variance t, and by the d × d matrix σ, which represents the variance of the noise. Any autonomous dynamics are modeled by b, which is an $\mathbb{R}^d$-valued function of X(t) and t. The state change dX(t) is the sum of the noisy control and the autonomous dynamics. So assume that the pedestrian's steering mechanism is subject to a little fluctuation, so that if the pedestrian wishes to get from $x_0$ to $x_0 + v\,dt$ in time dt, the pedestrian instead ends up at $x_0 + v\,dt + \sigma\,dB_t$, where $dB_t$ is the infinitesimal increment of a standard Brownian motion in $\mathbb{R}^d$, and σ > 0 is a parameter measuring the noise level. The behavior of the pedestrian is valued by a cost function which is now a stochastic quantity rather than a deterministic one. Therefore, the only rational thing for pedestrians to do now is to minimize the cost functional
$$ J(x) = E\left[ \int_0^T L(x'(t))\,dt + \phi(T, x(T)) \,\Big|\, x_s = x_0 \right]. \tag{3.17} $$
The expectation E[·] is taken with respect to the probability measure under which X(t) is the solution to (3.16) given the condition $X(s) = x_0$. Indeed, we can define the optimal (expected) cost function $u(t_0, x_0)$ much as before, as the minimal expected cost over all strategies of the pedestrian; this is a deterministic function, because expectations are taken. Eq. (3.9) is now modified to
$$ u(t_0, x_0) = E\,u(t_0 + dt, x_0 + v\,dt + \sigma\,dB_t) + L(v)\,dt. \tag{3.18} $$
We Taylor expand this, using the Itô formula heuristic $dB_t = O(\sqrt{dt})$, to obtain
$$ u(t_0, x_0) = E\Big[ u(t_0, x_0) + \partial_t u(t_0, x_0)\,dt + v \cdot \nabla_x u(t_0, x_0)\,dt + \sigma\,dB_t \cdot \nabla_x u(t_0, x_0) + \frac{\sigma^2}{2} \nabla^2 u(t_0, x_0)(dB_t, dB_t) \Big] + L(v)\,dt. $$
Here, the Brownian increment $dB_t$ over time dt has zero expectation, and each of its d components has variance dt (and the covariances are zero). As such, we can compute the expectation here and obtain
$$ u(t_0, x_0) = u(t_0, x_0) + \partial_t u(t_0, x_0)\,dt + v \cdot \nabla_x u(t_0, x_0)\,dt + \frac{\sigma^2}{2} \Delta u(t_0, x_0)\,dt + L(v)\,dt. $$
So the only effect of the noise is reflected by the additional term $\frac{\sigma^2}{2} \Delta u(t_0, x_0)\,dt$ on the right-hand side. This term does not depend on v and so does not affect the remainder of the analysis. We obtain the following evolution partial differential equation for the unknown scalar function u = u(t, x):
$$ -\partial_t u - \nu \Delta u + H(\nabla_x u) = 0, \tag{3.19} $$
the so-called viscous Hamilton–Jacobi–Bellman equation for the optimal expected cost, where $\nu := \sigma^2/2$ is the viscosity, and the optimal velocity is still given by formula (3.15). Eq. (3.19) can be viewed as a nonlinear backward heat equation, which makes sense since we are looking for this cost backwards in time. The diffusive effect of the heat equation then reflects the uncertainty of future cost caused by the random noise. If we put $H(p) = \frac{1}{2}|p|^2$, we obtain the (backward) Hamilton–Jacobi–Bellman equation:
$$ -\partial_t u - \nu \Delta u + \frac{1}{2} |\nabla_x u|^2 = 0. \tag{3.20} $$
4. Optimal control of a multi-pedestrian system We now turn to the issue of optimally controlling a multi-agent system of N pedestrians. In principle, the theory developed for a single pedestrian straightforwardly generalizes to the multi-pedestrian situation. Assume now that instead of having just one pedestrian, one has a huge number N of pedestrians distributed throughout space. In addition, let us assume that all the pedestrians have identical motivations, in particular, they are all trying to minimize the same cost function, which implies in particular that all the pedestrians at a given point (t0 , x0 ) in spacetime
will all move at a given velocity $v(t_0, x_0)$. Let us ignore the random noise for this initial discussion, in which case we have the deterministic model
$$ dX(t) = b(X(t), \alpha(t))\,dt, \tag{4.21} $$
with the initial condition
$$ X(s) = x_0 \in \Omega, \quad t \ge s. \tag{4.22} $$
The map $t \mapsto \alpha(t) = (\alpha_1, \dots, \alpha_N)$ is the control, $\alpha_i$ being the component implemented by the ith pedestrian. The ith pedestrian minimizes a criterion of the form
$$ u(x, s) = \inf_{\alpha} E\left[ \int_s^T L(x_s, \alpha_s)\,ds + \psi(x(T), T) \right]. \tag{4.23} $$
In Eq. (4.23), L is an additive cost defined for $(x_s, \alpha) \in (\mathbb{R}^d)^N \times \mathbb{R}^d$ and ψ is a Lipschitz function, $\psi \in C^{0,1}(\mathbb{R}^d \times \mathbb{R}_+)$. Rather than deal with each of the pedestrians separately, we will pass to a continuum limit N → ∞ and consider just the (normalized) density function ρ(t, x) of the pedestrians, which is a non-negative function with total mass $\int_\Omega \rho(t, x)\,dx = 1$ for each time t. Informally, for an infinitesimal box in space [x, x + dx], the number of pedestrians in that box should be approximately N ρ(t, x) dx. We now suppose that the velocity field v is given to us, as well as the initial distribution ρ(0, x) of the pedestrians, and ask how the distribution will evolve as time goes forward. There are several ways to find the answer, but we will take a distributional viewpoint and test the density ρ(t, x) against various test functions F(x), i.e. smooth, compactly supported functions of space (independent of time). The integral $\int_\Omega \rho(t, x) F(x)\,dx$ can be viewed as the continuum limit of the sum $\frac{1}{N} \sum_{i=1}^{N} F(x_i(t))$, where $x_i(t)$ is the location of the ith pedestrian at time t. At time t, this pedestrian should move at velocity $v(t, x_i(t))$. Differentiating both sides of
$$ \int_\Omega \rho(t, x) F(x)\,dx \simeq \frac{1}{N} \sum_{i=1}^{N} F(x_i(t)) $$
using the chain rule, we thus arrive at the heuristic formula
$$ \int_\Omega \partial_t \rho(t, x) F(x)\,dx \simeq \frac{1}{N} \sum_{i=1}^{N} v(t, x_i(t)) \cdot \nabla F(x_i(t)). \tag{4.24} $$
The right-hand side of (4.24), in the continuum limit N → ∞, should become
$$ \frac{1}{N} \sum_{i=1}^{N} v(t, x_i(t)) \cdot \nabla F(x_i(t)) = \int_\Omega \rho(t, x)\, v(t, x) \cdot \nabla F(x)\,dx, $$
which, after an integration by parts, becomes
$$ \int_\Omega \rho(t, x)\, v(t, x) \cdot \nabla F(x)\,dx = -\int_\Omega \nabla \cdot (\rho v)(t, x) F(x)\,dx. \tag{4.25} $$
In summary, for every test function F we have
$$ \int_\Omega \left[ \partial_t \rho(t, x) + \nabla \cdot (\rho v)(t, x) \right] F(x)\,dx = 0, $$
which leads to the transport equation
$$ \partial_t \rho(t, x) + \nabla \cdot (\rho v)(t, x) = 0. $$
We reintroduce additional random noise to the system by controlling the value of α such that Eq. (4.21) becomes
$$ dX(t) = b(X(t), \alpha(t))\,dt + \sigma(X(t), \alpha(t))\,dB_t, \tag{4.26} $$
in accordance with assumption 1 and with the initial condition (4.22). Thus, the pedestrians that are infinitesimally close to $x$ at time $t$ all try to move to $x + v\,dt$ at time $t + dt$, but each one instead ends up at a slightly different location $x + v\,dt + \sigma\,dB_t$, where the Brownian increment $dB_t$ is different for each pedestrian. These observations lead to the formal equation:

$$\int_\Omega \rho(t+dt,x)F(x)\,dx \simeq \frac{1}{N}\sum_{i=1}^{N} F\big(x_i(t) + v(t,x_i(t))\,dt + \sigma\,dB_t\big).$$

Taylor expanding the right-hand side as before, and then passing to the continuum limit, we eventually see that the right-hand side takes the form

$$\frac{1}{N}\sum_{i=1}^{N} F\big(x_i(t) + v(t,x_i(t))\,dt + \sigma\,dB_t\big) = \int_\Omega \rho(t,x)\left[F(x) + v\,dt \cdot \nabla F(x) + \frac{\sigma^2}{2}\,\Delta F(x)\,dt\right] dx,$$

and then repeating the above computations leads us to the Fokker–Planck equation

$$\partial_t \rho(t,x) - \nu \Delta \rho(t,x) + \nabla \cdot (\rho v)(t,x) = 0, \tag{4.27}$$

with $\nu = \sigma^2/2$.
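The particle picture behind (4.27) can be illustrated with a short simulation. This is a minimal sketch under assumptions that are not in the paper: one space dimension, a constant velocity v and constant noise level sigma, all pedestrians starting at the origin, and an Euler–Maruyama time stepping. In that special case the solution of (4.27) is a Gaussian with mean v t and variance sigma^2 t, which the empirical statistics should reproduce.

```python
import math
import random

def simulate(n=10000, v=1.0, sigma=0.5, t_final=1.0, steps=20, seed=0):
    """Euler-Maruyama simulation of dX = v dt + sigma dB for n independent pedestrians."""
    rng = random.Random(seed)          # fixed seed so the run is reproducible
    dt = t_final / steps
    xs = [0.0] * n                     # assumed initial condition: everyone at x = 0
    for _ in range(steps):
        # Each pedestrian gets its own independent Brownian increment.
        xs = [x + v * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0) for x in xs]
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return mean, var

mean, var = simulate()
```

With the default (assumed) parameters, `mean` should be close to v t = 1.0 and `var` close to sigma^2 t = 0.25, up to sampling error of order 1/sqrt(n).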
5. From pedestrian dynamics to mean field games crowd dynamics

As we have seen so far, in the derivation of the Hamilton–Jacobi–Bellman equations above, each pedestrian had a fixed cost function to minimize that did not depend on the location of the other pedestrians. We now generalize the previous case. More precisely, we describe through new systems of PDEs the asymptotic behavior of our differential games when the number $N$ of players tends to infinity. In this way one obtains a model which is linked to game theory and Nash equilibria. We postulate that:

(H2) The cost function of each pedestrian also depends on the density function $\rho$ of all the other pedestrians.

We take controls with a feedback information pattern, where the strategies depend only on the current value of the state and time; that is,

$$\alpha_s = \alpha(s, X_s). \tag{5.28}$$
In fact, in our interacting dynamics, the state evolution of an individual pedestrian is affected by an empirical average of coupling terms involving all other pedestrians. More precisely, $\alpha(s, X_s) \in \mathbb{R}^d$ represents the strategy applied by the pedestrian at time $s$, which describes the move each pedestrian wants to make given her position. If one hypothesizes that the number of pedestrians is large and that they interact through their probability density function $\rho(t)$, then this modifies the structure of Eq. (3.3). Consequently, the dynamics of pedestrians seeking to attain a preferred objective can be reproduced by the stochastic differential equation

$$dX(t) = b(X(t), \alpha(t, X_t); \rho(t))\,dt + \sigma(X(t), \alpha(t, X_t); \rho(t))\,dB_t, \tag{5.29}$$

with initial condition

$$X(s) = x_0, \qquad t \geq s. \tag{5.30}$$
Taking into account the dependence on the density of pedestrians, which models interactions between pedestrians but reflects the insignificance of the individual influence, the criterion of the $i$th pedestrian is given by

$$u(x,s) := J(x, \alpha; \rho) = \inf_{\alpha} \mathbb{E}\left[\int_s^T L(x(t), \alpha(t); \rho(t))\,dt + F(x(T); \rho(T))\right] \tag{5.31}$$
over all admissible controls $\alpha_t$. In Eq. (5.31), the running cost $L$ takes arguments $(x_s, \alpha; \rho) \in (\mathbb{R}^d)^N \times \mathbb{R}^d \times \mathbb{R}_+$, and $F \in C^{0,1}(\mathbb{R}^d \times P_2(\mathbb{R}^d))$ represents the marginal cost to a pedestrian of having a given density at the current location. We point out that in Eq. (5.31), the functionals $J$, $L$ and $F$ stand for $J_i(\cdot)$, $L_i(\cdot)$ and $F_i(\cdot)$, where the index $i$ labels the pedestrian.

Finally, it is necessary to specify the kind of information available to the pedestrians. The approach we present requires that each pedestrian has full information on the states, which results in very high control complexity for a large population. We consider situations where each pedestrian chooses his optimal strategy in view of limited global information on the game. Thus we hypothesize that the pedestrians know the state of the system at each instant of time $t$. We call this the case of complete information. In this respect, attention should be paid to the following fact: we have $N$ criteria and $N$ pedestrians, we are within the framework of game theory, and we are going to work with Nash points.

In mean field systems the cost function of a pedestrian depends on the actions taken by other agents only through their empirical distribution; all interactions are global. It is useful to consider the static problem. In our setup it is natural to use the following definition: for $N \geq 1$, let $(x_i)_{1 \leq i \leq N}$ be a finite system of random variables. We call empirical measure associated with the configuration $(x_i)_{1 \leq i \leq N}$ the probability measure
$$\hat{\mu}_x = \hat{\mu}^N_x = \frac{1}{N}\sum_{i=1}^{N} \delta_{x_i}. \tag{5.32}$$
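As a concrete reading of (5.32), integrating a function against the empirical measure is just averaging it over the configuration, and the result does not depend on how the points are labeled. The following minimal sketch uses hypothetical one-dimensional positions and a hypothetical test function chosen only for illustration.

```python
import random

def empirical_average(points, F):
    """Integrate F against the empirical measure (1/N) * sum of Dirac masses at the points."""
    return sum(F(x) for x in points) / len(points)

# Hypothetical configuration of N = 4 pedestrians on the real line.
points = [0.1, 0.4, 0.4, 0.9]
F = lambda x: x * x

avg = empirical_average(points, F)

# Relabeling the pedestrians (a permutation of the indices) leaves the measure unchanged.
shuffled = points[:]
random.Random(1).shuffle(shuffled)
avg_shuffled = empirical_average(shuffled, F)
```

Both `avg` and `avg_shuffled` equal the plain average of the squared positions, up to floating-point rounding.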
A function $u_N(x_1, \ldots, x_N)$ is called symmetric with respect to $(x_1, \ldots, x_N) \in \Omega^N$ if $u_N(x) = u_N(x_\sigma)$ for all $x \in \Omega^N$ and all $\sigma \in S_N$, where $x = (x_1, \ldots, x_N)$, $x_\sigma = (x_{\sigma(1)}, \ldots, x_{\sigma(N)})$ and $S_N$ denotes the set of permutations of $\{1, \ldots, N\}$. In this setup, we postulate that:

(H3) The pedestrians are identical or indistinguishable.

Hypothesis (H3) ensures that one can relabel the pedestrians without changing the problem, so that the optimal cost of the $i$th pedestrian has the same form for every $i$, that is:

$$J_i^N\big(x_i, (x_j)_{j \neq i}\big) \equiv J^N\big(x_i, (x_j)_{j \neq i}\big).$$
In other words, $J_i^N(x)$ is a symmetric function of $(x_j)_{j \neq i}$ and is independent of $i$. It is well known that a function $J$ which is symmetric with respect to the $x_j$ becomes a function on the probability measures when $N$ tends to infinity. To describe the behavior of the symmetric function $J_i$ of $N$ variables when $N$ tends to infinity, we denote by

$$\varepsilon_x := \frac{1}{N-1}\sum_{j=1,\, j \neq i}^{N} \delta_{x_j}$$
the empirical distribution for the system configuration $(x_i)_{1 \leq i \leq N}$, where $\delta_x$ is the Dirac measure centered at $x$. We then interpret Hypothesis (H3) by introducing the following assertion:

(H4) For a sequence of symmetric functions $G_N : \Omega^N \to \mathbb{R}^d$, up to extraction of a subsequence,

$$G_N(y_1, \ldots, y_N) \simeq G(\varepsilon_y) \quad \text{as } N \to \infty, \qquad G \in C(P(E)),$$

where the space $P(E)$ of probability measures is equipped with a distance metrizing weak convergence. After all these preparations, the following hypothesis appears to be natural:

(H5) The overall state of the system depends weakly on each of the pedestrians, in the sense that, given two sets $\{y_1, \ldots, y_N\}$ and $\{z_1, \ldots, z_N\}$, we have

$$\left|G_N(y_1, \ldots, y_N) - G_N(z_1, \ldots, z_N)\right| \leq \omega\big(d_{LP}(y, z)\big). \tag{5.33}$$
Here $\omega$ is a modulus of uniform continuity and $d_{LP}$ is the Lévy–Prokhorov metric. The distance $d_{LP}$ reflects the weak dependence on any small number of variables compared to $N$, meaning that, in a situation that involves a large number of individuals, a single individual cannot change the dynamics of the system. For the cost function, hypothesis (H5) means that

$$J_i^N(x) \xrightarrow[N \to \infty]{} J(x, \rho) \in C(E, P(E)), \qquad 1 \leq i \leq N. \tag{5.34}$$
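The weak dependence on a single individual can be seen on a toy functional. The sketch below is an illustration only, under assumptions not taken from the paper: the symmetric function is the empirical average of sin (a function with Lipschitz constant 1), the pedestrians sit at equally spaced points of [0, 1), and the plain difference of averages stands in for the Lévy–Prokhorov distance. Moving one pedestrian by a fixed amount then changes the functional by at most shift / N.

```python
import math

def single_move_effect(n, shift=1.0):
    """Change in the empirical average of sin when one of n pedestrians is displaced."""
    points = [i / n for i in range(n)]   # hypothetical configuration
    moved = points[:]
    moved[0] += shift                    # displace a single pedestrian
    avg = sum(math.sin(x) for x in points) / n
    avg_moved = sum(math.sin(x) for x in moved) / n
    return abs(avg_moved - avg)

effects = {n: single_move_effect(n) for n in (10, 100, 1000)}
```

The values in `effects` decay like 1/N, which is the elementary mechanism behind the statement that one individual cannot change the state of a large crowd.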
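In other words, the limit

$$\sup_{x \in \mathbb{R}^d} \left|J_i^N(x) - J(x_i, \varepsilon_x)\right| \xrightarrow[N \to \infty]{} 0 \tag{5.35}$$

holds. Going back to the running cost $L$ in the criterion (4.23), (H3) can be interpreted by introducing in the cost $J_i$ the functional $L_i : (\mathbb{R}^d)^N \times \mathbb{R}^d \to \mathbb{R}^d \times \mathbb{R}^d \times P_2(\mathbb{R}^d)$ defined as follows:

$$L_i(x_s, \alpha) = L(x_i(s), \alpha; \varepsilon_x), \qquad j \neq i, \tag{5.36}$$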
where $\varepsilon_x$ represents the empirical distribution of the positions of the pedestrians. The formulation (5.36) makes precise the symmetric character of the function $L$, which depends on an empirical measure. This idea is expressed by the following technical assumption:

(H6) We assume that

$$\inf_{x, \rho} \frac{L(x, \alpha; \rho)}{1 + \|\alpha\|} \to +\infty \quad \text{as } \|\alpha\| \to \infty,$$

and $\varphi_i(x) = \psi(x_i; \varepsilon_x)$, $j \neq i$, where $\|\cdot\|$ denotes the Euclidean norm in $\mathbb{R}^d$.
We will now pass formally to the limit $N \to +\infty$, when the number of pedestrians becomes large. Before doing so, we briefly describe the strategy: we fix the feedback of $(N-1)$ pedestrians and obtain diffusive behaviors; for the last pedestrian (this is the key point of the Nash points), we minimize and obtain the Hamilton–Jacobi–Bellman equations. The analysis is divided into two steps: the first step is to derive an equation that incorporates all equations for all initial distributions. Once we have specified the initial distributions of the pedestrians, the second step is to derive the resulting mean field games equations. Finally, letting $N \to +\infty$, an evolution equation is derived for the value function $u$ via the symmetry property of $u$. Let us introduce the following hypothesis:

(H7) The value function $u(t,x)$, $(t,x) \in [0,T] \times (\mathbb{R}^d)^N$, is symmetric with respect to the positions $(x_j)_{j \neq i}$, meaning that $u$ is invariant under the transformation

$$(x_1, \ldots, x_N) \longmapsto (x_{\sigma(1)}, \ldots, x_{\sigma(N)}), \qquad \text{for } \sigma \in S_N.$$

Throughout the paper, we assume that $u_i \in C^2([0,T] \times (\mathbb{R}^d)^N)$. The resulting equation has the following structure:
Step 1. Each pedestrian sees all the positions and all the Brownian motions; in the absence of the control term, one would obtain the heat equation in $\mathbb{R}^d$, with the Laplacian taken with respect to $x$, described by the operator:

$$-\frac{\partial u_i}{\partial t} - \frac{\sigma^2}{2}\Delta_x u_i.$$
Next, we have stochastic control with respect to the $i$th variable. This corresponds to an optimal dynamics with respect to $x_i$, appearing in the Fokker–Planck equations. Since we are concerned with a control problem, this equation is a nonlinear Fokker–Planck equation. In a Nash equilibrium, the other strategies are frozen and one looks for the best strategy of the $i$th pedestrian. We obtain the Hamiltonian function $H(x_i, \nabla_{x_i} u_i; \varepsilon_x)$. On the other hand, the other strategies also enter through the Hamiltonian function $H$. Infinitesimally, the $i$th pedestrian looks for the optimal strategy by maximizing:

$$\sup_{\alpha_i} \left\{\alpha_i \cdot \nabla_{x_i} u_i + L(x_i, \alpha_i)\right\},$$

so that the convexity of $L$ makes it possible to define the Legendre transform (with respect to the control $\alpha$):

$$H(x, p, \rho) = \sup_{\alpha} \left\{\alpha \cdot p + L(x, \alpha; \rho)\right\},$$

where $H$ is defined on $\mathbb{R}^d \times \mathbb{R}^d \times P_2(\mathbb{R}^d)$ and where we put $p := \nabla_{x_i} u_i$. The optimal control $\alpha_i(x, p)$ is obtained by differentiating the Hamiltonian $H$ with respect to $p$:

$$\frac{\partial H}{\partial p}(x, p; \rho) = \alpha_{\mathrm{opt}},$$

meaning that the control $\alpha_{\mathrm{opt}}$ attains the supremum defining $H$ at the point $(x, p)$. The information about the other pedestrians enters through the term

$$\sum_{j \neq i} \frac{\partial H}{\partial p}(x_j, \nabla_{x_j} u_j; \varepsilon_x) \cdot \nabla_{x_j} u_i.$$
This gives the optimal control of the $i$th pedestrian in the Nash equilibrium. The dynamics of the $j$th coordinates are given for the criterion of the $i$th pedestrian. Combining all these terms, we arrive at the following final equation for the functions $u_i(t, x) \in C^2([0,T] \times (\mathbb{R}^d)^N)$:

$$-\frac{\partial u_i}{\partial t} - \frac{\sigma^2}{2}\Delta_x u_i + H(x_i, \nabla_{x_i} u_i; \varepsilon_x) + \sum_{j \neq i} \frac{\partial H}{\partial p}(x_j, \nabla_{x_j} u_j; \varepsilon_x) \cdot \nabla_{x_j} u_i = 0, \tag{5.37}$$

with the terminal condition ($j \neq i$)

$$u_i|_{t=T} = \psi(x_i, \varepsilon_x). \tag{5.38}$$
This completes the first step. We point out that the quasi-linear parabolic equation (5.37) contains three main parts. The first part is a heat operator with terminal condition (5.38), giving the Fokker–Planck equation; it is what one obtains in the absence of a control term. The second is the Hamiltonian term $H(x_i, \nabla_{x_i} u_i; \varepsilon_x)$, given by

$$H(x, p, \rho) = \sup_{\alpha_i} \left\{\alpha \cdot p + L(x, \alpha; \rho)\right\},$$

which is what one would obtain if there were only stochastic control with a single pedestrian. Finally, the last term results from the actions of the other pedestrians.
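The relation between the Hamiltonian and the optimal control can be made concrete in the simplest scalar case. The sketch below is a numerical illustration under assumptions not taken from the paper: one space dimension, the running cost L(alpha) = -alpha^2 / 2 (no dependence on x or rho), and a brute-force search over a grid of controls. In that case the supremum is H(p) = p^2 / 2, attained at alpha = p, which is also the derivative of H with respect to p.

```python
def hamiltonian(p, L, alphas):
    """Approximate H(p) = sup over alpha of (alpha * p + L(alpha)) on a grid of controls.

    Returns the pair (H(p), maximizing alpha).
    """
    best = max(alphas, key=lambda a: a * p + L(a))
    return best * p + L(best), best

# Hypothetical running cost: L(alpha) = -alpha^2 / 2.
L_quad = lambda a: -0.5 * a * a
# Grid of candidate controls on [-5, 5] with step 0.001.
alphas = [-5.0 + 0.001 * k for k in range(10001)]

H_val, alpha_opt = hamiltonian(1.5, L_quad, alphas)
```

For p = 1.5 the grid search returns a value close to 1.125 and a maximizer close to 1.5, in agreement with the closed-form Legendre transform of this quadratic cost.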
Step 2. Next, we examine the passage to the limit for the equation in $\mathbb{R}^d \times P_2(\mathbb{R}^d)$, starting from a sufficient condition ensuring weak dependence on each pedestrian.

(H8) For small $T$, we assume that

$$|\nabla_{x_j} u_i| \leq \frac{C}{N}, \quad j \neq i, \qquad \text{and} \qquad |\nabla_x u_i| \leq C. \tag{5.39}$$
The first inequality seems natural in our particular case: looking at the terminal condition, the derivative with respect to $x_j$ reflects the weak dependence on each of the other pedestrians, and whenever $x_j$ appears in the equation it always comes with a factor $1/N$. The second inequality gives equicontinuity. It is now useful to introduce a further assumption for the derivation of our evolution equation. To this end, we postulate that:

(H9) Up to extraction of a subsequence and by indistinguishability, we assume that

$$u_i(x, t) \xrightarrow[N \to +\infty]{} U(x_i, \rho, t),$$

where $U \in C(\mathbb{R}^d \times P_2(\mathbb{R}^d) \times [0,T])$, meaning that $u_i$ is the same function of $x_i$ for all $i$. By passing to the limit as $N \to +\infty$ with the help of hypothesis (H9), we deduce the following.
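On the one hand, the first two terms give in the limit the heat operator describing random walks:

$$-\frac{\partial u_i}{\partial t} - \frac{\sigma^2}{2}\Delta_x u_i \rightharpoonup -\frac{\partial U}{\partial t} - \frac{\sigma^2}{2}\Delta_x U + \left\langle \frac{\partial U}{\partial \rho}, -\frac{1}{2}\Delta\rho \right\rangle. \tag{5.40}$$

On the other hand, for $j \neq i$, we get

$$H(x_i, \nabla_{x_i} u_i; \varepsilon_x) \rightharpoonup H(x, \nabla_x U, \rho). \tag{5.41}$$

Furthermore, for $j \neq i$,

$$\frac{\partial H}{\partial p}(x_j, \nabla_{x_j} u_j; \varepsilon_x) \cdot \nabla_{x_j} u_i \rightharpoonup \frac{\partial H}{\partial p}(x, \nabla_x U(x, \rho), \rho) \cdot \nabla_x \rho,$$

leading to the following quasi-linear parabolic equation:

$$-\frac{\partial U}{\partial t} - \frac{\sigma^2}{2}\Delta_x U + \left\langle \frac{\partial U}{\partial \rho}, -\frac{1}{2}\Delta\rho \right\rangle + H(x, \nabla_x U; \rho) + \left\langle \frac{\partial U}{\partial \rho}, -\mathrm{div}\left(\frac{\partial H}{\partial p}(x, \nabla_x U; \rho)\right) \right\rangle = 0, \tag{5.42}$$

$$U|_{t=T} = F(x, \rho), \tag{5.43}$$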
which is the main equation of mean field games (MFG) (see [12]). The term quasi-linear means that nonlinearities only affect the first derivatives. The above equations have to be regarded as a mathematical framework which may generate specific models on the basis of an analysis and modeling of pedestrian interactions. The following example illustrates the concept of MFG. We replace the cost function (5.31) by

$$J(x, T) = \int_0^T \left[L(x'(t)) + F(\rho(t, x(t)))\right] dt + u(T, x(T)), \tag{5.44}$$
where $F : \mathbb{R}_+ \to \mathbb{R}$ represents the marginal cost to a pedestrian of having a given density at the current location. If $F$ is increasing, this intuitively means that the pedestrian prefers to be away from the other pedestrians (a reasonable hypothesis for traffic congestion), which should lead to a repulsive effect; conversely, a decreasing $F$ should lead to an attractive effect. The Hamilton–Jacobi–Bellman equation can be formally derived for this cost functional by an analysis similar to the one above, leading (in the presence of viscosity) to the equation

$$-\partial_t u - \nu \Delta u + H(\nabla_x u) = F(\rho). \tag{5.45}$$
Meanwhile, the velocity field $v$ is still given by the formula (3.15), so the Fokker–Planck equation (4.27) becomes (after reversing sign)

$$-\partial_t \rho + \nu \Delta \rho + \nabla_x \cdot \big(\rho\, H'(\nabla_x u)\big) = 0. \tag{5.46}$$
The system (5.45), (5.46), with prescribed final data $u(T, x)$ for $u$ at time $T$ and initial data $\rho(0, x)$ for $\rho$ at time $0$, is then an example of a mean field game. The backward evolution equation (5.45) represents the pedestrians' decisions based on where they want to be in the future, and the forward evolution equation (5.46) represents where they actually end up, based on their initial distribution. Typical examples for $F$ include nonlocal smoothing operators. For example, we can choose $F(y) = y^2$, which is nondecreasing on $\mathbb{R}_+$. Such a function is used to model repulsive cases in which the players do not like to share their position with others. Various models can be derived using different expressions of the cost function. For instance,
• A specific example of our type of modeling is the case where the running cost $L$ is quadratic in the control, i.e.

$$L(t, x, \alpha(t, x)) = g(t, x, \rho(t, x)) - \frac{1}{2}\|\alpha\|^2. \tag{5.47}$$

This means that there is a separation between the gain from the position $x$ at time $t$ and the cost of moving $\alpha$, which is quadratic. The simplest case of $L$ is

$$L(x, \alpha) = \frac{1}{2}\|\alpha\|^2 - V(x), \tag{5.48}$$

where $L$ is quadratic in $\alpha$, for the given potential energy $V$. Typical examples for $V$ include nonlocal smoothing operators. To determine the control parameter $\alpha$, observe that

$$\alpha(x) = \arg\max_{a} \left\{-\frac{1}{2}\|a\|^2 + \nabla u(x) \cdot a\right\}.$$

Subsequently, we deduce the control parameter: $\alpha(x) = \nabla u(x)$.
• As an immediate consequence of (5.47), the following particular case can be considered:

$$L(t, x, \alpha(t, x)) = g(t, x, \rho(t, x)) - \frac{\rho^\theta}{2}\|\alpha\|^2,$$

where $\theta > 0$ corresponds to the model of congestion; the control parameter is then

$$\alpha(x) = \rho^{-\theta}\, \nabla u(x).$$

• The next example exhibits both local and global interaction of pedestrians. We consider the case of a "mean field planning problem" for a pedestrian's path from its state at $x_s$ to various destinations. Assume that the planned walking speed $\|v\|$ is a trade-off between the time remaining to reach the destination in time and the energy used by walking at a particular speed. In addition, this kinetic energy consumption is a quadratic function of the pedestrian speed. If we add the local cost $C(t, x, \rho(t, x))$ per unit distance of movement at time $t \in T$ in the facility, depending on the local operating conditions in the walking facility, then the running cost is given by

$$J(x, v) = \int_0^\tau \left[\frac{1}{2}\|v(s)\|^2 + L(s, x, \rho(s, x))\right] ds + V(T, \rho(T, x)), \tag{5.49}$$
where $L, V : \Omega \to \mathbb{R}$ and $\|\cdot\|$ denotes the Euclidean norm in $\mathbb{R}^d$. A typical example is when the crowd aims to reach zones of low potential $V$ at the terminal time $t_0$ (as an extreme case, if $V$ is $0$ on some set $\Omega$ and $+\infty$ elsewhere, then the aim of the crowd is to be in the safe zone $\Omega$ at the terminal time). In the case of the running cost (5.49), the dynamics is governed by the following Fokker–Planck equation

$$\begin{cases} \dfrac{\partial \rho}{\partial t} - \dfrac{\sigma^2}{2}\Delta\rho + \mathrm{div}(\rho v) = 0 \\[1ex] \rho(0, \cdot) = \rho_0 \end{cases} \tag{5.50}$$

with the diffusion process of the type $dX_s = \sigma\,dw_s + v\,ds$, showing a remarkable analogy with the nonlinear diffusion model. Eq. (5.50) is a generalization of the following simpler system (5.45), (5.46), with $F = 0$ and $\nu = 0$:

$$\frac{\partial u}{\partial t} + \frac{1}{2}|\nabla u|^2 = F(\rho), \tag{5.51}$$

$$\frac{\partial \rho}{\partial t} + \mathrm{div}(\rho \nabla u) = 0, \qquad \rho(0, x) = \rho_0(x), \quad \rho(T, x) = \rho_T(x), \tag{5.52}$$
introduced by the authors of [18] as a fluid mechanics formulation of the Monge–Kantorovich mass transfer problem. It is a "planning problem". Indeed, we have some outcome function $\rho$ across the population, we know where it starts, and we define where we want it to end.

• In analogy with mean field games applied to population dynamics, the following example shows how our model can accommodate the framework considered in [19]. The pedestrians can basically maximize a log utility as running cost:

$$L(\rho(t, x)) = \ln(\rho(t, x)), \tag{5.53}$$

meaning that, inside the crowd, pedestrians want to be like other pedestrians. The running cost (5.53) generates the following utility function

$$J(x, \alpha) := \int_0^T \left[\ln(\rho(s, x_s)) - \frac{1}{2}|\alpha(s, x_s)|^2\right] ds. \tag{5.54}$$
In the infinite horizon problem, the pedestrians maximize the utility function of the form

$$u(t, x) = \max_{(\alpha_s)_{s \geq t},\; x_t = x} \mathbb{E}\left[\int_t^T \left(\ln(\rho(s, x_s)) - \frac{1}{2}|\alpha(s, x_s)|^2\right) e^{-\zeta(s-t)}\,ds\right], \tag{5.55}$$

with

$$dX_t = \alpha(t, X_t)\,dt + \sigma\,dB_t. \tag{5.56}$$
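Therefore, the control problem is equivalent to the following system of HJB and Kolmogorov equations, respectively given by

$$\max_{b}\left\{\nabla u \cdot b - \frac{\|b\|^2}{2}\right\} + \frac{\partial u}{\partial t} + \frac{\sigma^2}{2}\Delta u - \zeta u = -\ln(\rho), \tag{5.57}$$

$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho b) = \frac{\sigma^2}{2}\Delta\rho, \tag{5.58}$$

such that

$$\alpha(t, x_t) = \nabla u(t, x_t), \tag{5.59}$$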
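which can be rewritten in the form

$$\begin{cases} \dfrac{\partial u}{\partial t} + \dfrac{\sigma^2}{2}\Delta u + \dfrac{1}{2}|\nabla u|^2 - \zeta u = -\ln(\rho) \\[1ex] \dfrac{\partial \rho}{\partial t} + \nabla \cdot (\rho \nabla u) = \dfrac{\sigma^2}{2}\Delta\rho. \end{cases} \tag{5.60}$$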
• We can also consider the case in which each pedestrian minimizes a criterion of the type

$$\min_{\alpha} \mathbb{E}\left[\int_0^T L(t, x_t, \alpha(t, x_t), \rho(t, x_t))\,dt + \varphi(T, x_T, \alpha(T, x_T), \rho(T, x_T))\right].$$

Then, it is easy to check that the control parameter is given by

$$\alpha(t, x_t) = \arg\min_{b} \big(L(t, x_t, b, \rho(t, x_t)) + \nabla u(t, x_t) \cdot b\big),$$

from which we deduce the following Hamilton–Jacobi type equation

$$\partial_t u(t, x) + \sup_{\alpha(\cdot)} \big\{L(t, x, \alpha, \rho(t, x)) + \nabla u(t, x) \cdot \alpha\big\} + \frac{1}{2}\,\mathrm{tr}\big(\sigma\sigma^T D_x^2 u(t, x)\big) = 0, \qquad 0 < s < T, \tag{5.61}$$

$$u|_{s=T} = \varphi(T, x, \rho(T, x)), \tag{5.62}$$
where $\mathrm{tr}$ denotes the trace and $D^2\varphi = (\varphi_{r_i r_j})$, $i, j = 1, \ldots, n$, is the matrix of second-order partial derivatives of $\varphi$. Eq. (5.61) is associated with the Kolmogorov equation:

$$\partial_t \rho(t, x) + \mathrm{div}\big(\alpha(t, x_t)\rho(t, x)\big) = \frac{1}{2}\,\mathrm{tr}\big(\sigma(t, x)'\, D_{xx}^2 \rho(t, x)\, \sigma(t, x)\big) \quad \text{in } \Omega, \tag{5.63}$$

$$\rho(0, x) = \rho_0(x), \qquad \rho(T, x) = \rho_T(x), \tag{5.64}$$

given two probability densities $\rho_0$ and $\rho_T$. Since the function $\rho$ represents a probability density, it is natural to supplement (5.63)–(5.64) with the further conditions

$$\int_\Omega \rho(t, x)\,dx = 1, \qquad \rho \geq 0. \tag{5.65}$$
If we confine our analysis to the case of the feedback information pattern, i.e. controls depending only on the instantaneous state of the system and not on the past, and set

$$\begin{cases} \alpha_0(t) := \alpha(x_t, t) \\ dx_t = \sigma(x_t, \alpha_0(t); \rho(t))\,dB_t + b(x_t, \alpha_0(t); \rho(t))\,dt \\ x(s) = x_0, \quad t \geq s, \end{cases} \tag{5.66}$$

then, thanks to the preceding arguments, the resulting transport equation for the measure is written as

$$\begin{cases} \dfrac{\partial \rho}{\partial s} - \partial_{ij}(a_{ij}\rho) + \partial_i(b_i \rho) = 0 \\[1ex] \rho|_{s=0} = \rho_0, \end{cases} \tag{5.67}$$

where the vector-valued function $b = (b_1, \ldots, b_N)$ and the matrix-valued function $a = (a_{ij})$, $i, j = 1, \ldots, N$, are defined by

$$a_{ij} = \frac{1}{2}\,\sigma\sigma^T(x, \alpha_0(x, s)), \qquad b = b_i(x, \alpha_0(x, s)).$$
This is a Fokker–Planck equation associated to the system (or a Liouville equation in the deterministic case). Therefore, $\rho$ acts through the optimal coupling. Note that the terms $a_{ij}$ and $b_i$ depend on $\rho$. For a given $\rho$, $u$ is determined by Eq. (5.66), $u$ determines $\alpha_0$, and $\alpha_0$ determines $a_{ij}$ and $b_i$. A specific example of this type of analysis is the planning problem where $\sigma = 1$, leading to the stochastic equation

$$dX(t) = dB_t + \alpha(t)\,dt, \tag{5.68}$$

supplemented with the value function of the form

$$u(x, s) = \inf_{\alpha} \mathbb{E}\left[\int_s^T \left(\frac{1}{\theta}|\alpha(t)|^\theta + F(x(t); \rho(t))\right) dt + \varphi(x(T); \rho(T))\right] \tag{5.69}$$
and the (nonlinear) mapping $F$ associates to a probability density $\rho$ a Lipschitz function on $\mathbb{R}^d$ or $\mathbb{T}^d$. The dynamics of each pedestrian being free, the coupling between pedestrians is done through the structure of the cost. Thus we get the following system of equations

$$\begin{cases} \dfrac{\partial u}{\partial t} - \dfrac{1}{2}\Delta u + \dfrac{1}{p}|\nabla u|^p = F(x, \rho), & p = \dfrac{q}{q-1} \\[1ex] u|_{t=T} = \varphi(x, \rho), & 0 < t < T \\[1ex] \dfrac{\partial \rho}{\partial t} - \dfrac{1}{2}\Delta\rho + \mathrm{div}\big((\nabla\rho)^{p-1}\big) = 0 \\[1ex] \rho|_{t=0} = \rho_0 \end{cases} \tag{5.70}$$

with the notation $|\xi|^p = (\xi)^{p-1} \cdot \xi$, where $\xi$ is a vector. For instance, if one takes $\sigma = 0$, $p = 2$, $H(x, p) = \frac{1}{2}|p|^2$, and $F \equiv F(\rho(x))$, it is easily checked that $u$ satisfies the following nonlinear system:

$$\begin{cases} \dfrac{\partial u}{\partial t} + \dfrac{1}{2}|\nabla u|^2 = F(\rho) \\[1ex] \dfrac{\partial \rho}{\partial t} + \mathrm{div}(\rho \nabla u) = 0. \end{cases} \tag{5.71}$$

More precisely, setting $v = \nabla u$ and differentiating the first equation of (5.71) with respect to $x$, we get

$$\begin{cases} \dfrac{\partial (\rho v)}{\partial t} + \mathrm{div}(\rho v \otimes v) + \nabla p(\rho) = 0 \\[1ex] \dfrac{\partial \rho}{\partial t} + \mathrm{div}(v \rho) = 0, \end{cases} \tag{5.72}$$

with the pressure $p$ given by $p'(z) = -z F'(z)$. This is a compressible Euler equation in the so-called barotropic regime, widely studied in the literature (see for instance [20,21]). We also have planning problems, in which, instead of prescribing an initial density $\rho$ and a terminal condition on the utility function $u$, one prescribes two conditions on the density (an initial condition and a final condition) for the compressible Euler equation. Typically, taking the pressure equal to zero, $F \equiv 0$, i.e. the compressible Euler equation without pressure with $\rho_0 = \rho|_{t=0}$ and $\rho_1 = \rho|_{t=T}$ (see [22]), we obtain the optimality equation for the definition of the so-called Monge–Kantorovich (or Wasserstein) distance between the two measures $\rho_0$ and $\rho_1$.

6. Critical analysis and perspectives

In this paper a mean-field limit framework for the modeling of crowd dynamics has been proposed, based on the concept of differential game. We have put forward models for pedestrian behavior based on the assumption that pedestrians are cost minimizers. We have applied different techniques of optimal control theory and the new theory of mean field games. It is worth providing a critical analysis of the approach of this paper compared with the existing literature based on the probabilistic concept of distribution [23,24]. One of the criticisms of game theory, as applied to the modeling of human decisions, is that human beings are, in practice, rarely fully rational.
Our approach is developed thanks to mean field games theory, both deterministic and stochastic, using the concept of pedestrians' preference. It is based on the rationality assumption: each pedestrian is assumed to be rational in the sense that he optimizes his own cost and bases his strategy upon the assumption that the other pedestrians are rational. Nevertheless, according to some works of the recent literature such as [25], it is extremely unrealistic to think that individuals are perfectly rational, in the sense that all of them want to maximize some fixed utility functions: in reality, opinions and preferences can vary dynamically over time. Thus, we can also describe this kind of phenomenon using the probabilistic concept of distribution of preferences [11]. Many of the formalisms in traditional game theory imply a degree of rationality of the players/pedestrians involved in a game. A few comments may be of interest. When deciding what to do, a pedestrian has to make a conjecture or prediction about what those around him will do. On the basis of this prediction or belief, a rational pedestrian then chooses the strategy that maximizes his utility function. As crucial as the notion of rationality is for the theory, the term is not without problems. For one thing, rationality is not universally defined, and for another, human pedestrians are often not the hyper-rational pedestrians the theory requires them to be. Many applications of game theory therefore involve abstractions and simplifications to various degrees. For instance, this happens when game theory is applied to the modeling of interactions in genes, viruses, or cells, as is the case in evolutionary game theory [26].
Finally, in this paper, we mostly investigate pedestrian motion in connection with differential game theory with complete information. This means that each pedestrian knows the identity of the other pedestrians, their strategy functions and the resulting utility function. The concept of complete information should not be confused with the concept of perfect information, which means that all pedestrians know all moves that have taken place. On the other hand, the modeling approach leaves various problems open, which need to be properly treated to refine the simple models delivered in this paper. A further aspect to be considered, as remarked in Daganzo's criticisms [27], is the heterogeneous behavior of pedestrians.
This aspect is even more critical in the physics of crowds, where changes in the environmental conditions can introduce substantial modifications in individual behaviors. Indeed, this is the case of the transition from normal to panic conditions, where individuals lose their tendency to move toward the target and are attracted by directions, correctly or incorrectly, identified as escapes from danger. In general, the modeling approach has to take into account not only local interactions, but also long-range interactions, which can be identified either by specific targets, such as an exit zone, or by attraction or repulsion from groups of individuals. Regarding the search for solutions of the models obtained, solving the coupled system of Eqs. (5.45)–(5.46), one evolving backwards in time and one evolving forwards in time, is highly non-trivial, and in some cases existence or uniqueness or both break down (which suggests that the mean field approximation is not valid in this setting). However, there are various situations in which solutions are obtained: having a small T (so that the pedestrians only plan ahead for a short period of time), having positive viscosity, and having an increasing F (i.e. an aversion to crowds) all help significantly, and one typically has a good existence and uniqueness theory in such cases, based to a large extent on energy methods. One interesting feature is that when T becomes large enough (and in the attractive case when F is decreasing), uniqueness can start to break down, due to the existence of standing waves that are sustained solely by nonlinear interactions between the forward evolution equation and the backward one. As already laid out in the introduction, the models proposed are to be regarded as applications consistent with the framework of [1,12]. It should be clear that the aim of the present paper is only to give a preliminary methodological pattern of crowd behavior dynamics in terms of mean field games models.
We have shown how certain mathematical structures can be used to model the collective behavior of crowd dynamics. Importantly, we have established dynamic game theoretic models. Looking at research perspectives, an interesting program consists in extending this result to the modeling of swarms (see [28] and the references therein). Hopefully, the procedure can be developed also for other collective behaviors.

References

[1] N. Bellomo, C. Dogbé, On the modelling crowd dynamics from scaling to hyperbolic macroscopic models, Math. Models Methods Appl. Sci. 18 (Suppl.) (2008) 1317–1345.
[2] C. Dogbé, On the numerical solution of second-order macroscopic models of pedestrian flows, Math. Comput. Appl. 56 (2008) 1884–1898.
[3] D. Helbing, P. Molnár, Social force model for pedestrian dynamics, Phys. Rev. E 51 (1995) 4282–4286.
[4] D. Helbing, I. Farkas, T. Vicsek, Simulating dynamical feature of escape panic, Nature 407 (2000) 487–490.
[5] D. Helbing, I.J. Farkas, P. Molnar, T. Vicsek, Simulation of pedestrian crowds in normal and evacuation situations, in: M. Schreckenberg, S.D. Sharma (Eds.), Pedestrian and Evacuation Dynamics, Springer, Berlin, 2002, pp. 21–58.
[6] R.L. Hughes, A continuum theory for the flow of pedestrians, Transp. Res. B 36 (2002) 507–536.
[7] R.L. Hughes, The flow of human crowds, Annu. Rev. Fluid Mech. 35 (2003) 169–182.
[8] V. Coscia, C. Canavesio, First-order macroscopic modelling of human crowd dynamics, Math. Models Methods Appl. Sci. 18 (Suppl.) (2008) 1217–1247.
[9] L.F. Henderson, The statistics of crowd fluids, Nature 229 (1971) 381–383.
[10] N. Bellomo, C. Bianca, M. Delitala, Complexity analysis and mathematical tools towards the modelling of living systems, Phys. Life Rev. 6 (2009) 144–175.
[11] N. Bellomo, C. Dogbé, On the modelling of traffic and crowds. A survey of models, speculations and perspectives, SIAM Rev. (in press).
[12] J.-M. Lasry, P.-L. Lions, Mean field games, Jpn. J. Math. 2 (1) (2007) 229–260.
[13] J.F. Nash, Non-cooperative games, Ann. of Math. 54 (1951) 286–295.
[14] L. Kantorovich, On the translocation of masses, Dokl. Akad. Nauk SSSR 37 (7–8) (1942) 227–229.
[15] L. Kantorovich, On a problem of Monge, Uspekhi Mat. Nauk. 3 (1948) 225–226.
[16] D. Helbing, Traffic Dynamics: New Physical Modeling Concepts, Springer-Verlag, Berlin, 1997 (in German).
[17] S.P. Hoogendoorn, P.H.L. Bovy, Pedestrian route-choice and activity scheduling theory and models, Transp. Res. B 38 (2004) 169–190.
[18] J.-D. Benamou, Y. Brenier, A computational fluid mechanics solution to the Monge–Kantorovich mass transfer problem, Numer. Math. 84 (3) (2000) 375–393.
[19] O. Guéant, A reference case for mean field games models, J. Math. Pures Appl. 92 (3) (2009) 276–294.
[20] P.-L. Lions, Mathematical Topics in Fluid Mechanics, Vol. 1, Oxford Science Publications, Oxford, 1996, Vol. 2, Clarendon Press, 1998.
[21] A.J. Chorin, J.E. Marsden, A Mathematical Introduction to Fluid Mechanics, third ed., in: Texts in Applied Mathematics, vol. 4, Springer-Verlag, New York, 1993.
[22] J.-D. Benamou, Y. Brenier, K. Guittet, The Monge–Kantorovitch mass transfer and its computational fluid mechanics formulation, Internat. J. Numer. Methods Fluids 40 (1–2) 21–30.
[23] L.F. Henderson, On the fluid mechanics of human crowd motion, Transp. Res. 8 (1974) 509–515.
[24] S.P. Hoogendoorn, P.H.L. Bovy, A gas-kinetic model for pedestrian flows, Transp. Res. Rec. 1710 (2000) 28–36.
[25] A. Rubinstein, Economics and Languages, Cambridge University Press, 2000.
[26] M.A. Nowak, K. Sigmund, Evolutionary dynamics of biological games, Science 303 (5659) (2004) 793–799.
[27] C.F. Daganzo, Requiem for second-order fluid approximations of traffic flow, Transp. Res. B 29B (1995) 277–286.
[28] S. Albeverio, W. Alt, Stochastic dynamics of viscoelastic skeins: condensation waves and continuum limit, Math. Models Methods Appl. Sci. 18 (2008) 1149–1192.