Short time solution to the master equation of a first order mean field game


JID:YJDEQ AID:10108 /FLA

[m1+; v1.304; Prn:25/11/2019; 13:20] P.1 (1-68)

Available online at www.sciencedirect.com

ScienceDirect J. Differential Equations ••• (••••) •••–••• www.elsevier.com/locate/jde

Sergio Mayorga

Department of Mathematics, Baylor University, Waco TX 76798, USA

Received 25 June 2019; revised 10 November 2019; accepted 12 November 2019

Abstract

The goal of this paper is to show existence of short-time classical solutions to the so-called Master Equation of first order Mean Field Games, which can be thought of as the limit of the corresponding master equation of a stochastic mean field game as the individual noises approach zero. Despite being the equation of an idealistic model, its study is justified as a way of understanding mean field games in which the individual players' randomness is negligible; in this sense it can be compared to the study of ideal fluids. We restrict ourselves to mean field games with smooth coefficients but do not impose any monotonicity conditions on the running and initial costs, and we do not require convexity of the Hamiltonian, thus extending the result of Gangbo and Swiech to a considerably broader class of Hamiltonians.

© 2019 Elsevier Inc. All rights reserved.

MSC: 34A13; 35R06; 35R15; 45K05; 49L99; 49N70; 91A13; 91A23

Keywords: Mean field games; Master equation; Hamilton-Jacobi equations; Fixed-point method; Characteristic equations; Wasserstein gradient

E-mail address: [email protected]. https://doi.org/10.1016/j.jde.2019.11.031 0022-0396/© 2019 Elsevier Inc. All rights reserved.


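1. Introduction

The master equation (ME, for short) of first order mean field games, namely,

\[
\left\{
\begin{array}{ll}
\partial_s u(s,q,\mu) + \displaystyle\int_{\mathbb{T}^d} \nabla_\mu u(s,q,\mu)(x) \cdot \nabla_p H(x, \nabla_q u(s,x,\mu))\,\mu(dx) & \\
\qquad {} + H(q, \nabla_q u(s,q,\mu)) + F(q,\mu) = 0 & \text{in } (0,T) \times \mathbb{T}^d \times \mathcal{P}(\mathbb{T}^d), \\
u(0,q,\mu) = g(q,\mu) & \text{on } \mathbb{T}^d \times \mathcal{P}(\mathbb{T}^d)
\end{array}
\right.
\tag{1.1}
\]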

is a non-local, infinite-dimensional partial differential equation that arises in mean field game (abbreviated MFG) theory. It can be interpreted either as the limit, as $N \to \infty$, of a system of $N$ coupled Hamilton-Jacobi equations that represent the Nash equilibrium of a differential game played by $N$ interacting deterministic particles, or as the limiting case (formally, at least) of the master equation of a stochastic differential game when the viscosity parameter, associated with the intrinsic noise of the infinitesimal particles, tends to zero. In (1.1), $\mathcal{P}(\mathbb{T}^d)$ is the Wasserstein space of Borel probability measures on the $d$-dimensional torus $\mathbb{T}^d$. The objective of this paper is to construct a short-time solution to (1.1) for an arbitrary smooth Hamiltonian $H$. To our knowledge, existence of solutions to (1.1) has only been shown for the particular case [24] of the quadratic Hamiltonian $H(q,p) = \frac{1}{2}|p|^2$.

Mean-field theory in differential games began with the works of P.L. Lions and J.M. Lasry [34] and M. Huang, P.E. Caines, R.P. Malhamé [32], attracting great interest since then for its numerous applications and posing challenging theoretical questions. Equation (1.1) and its higher-order version were first introduced by Lasry and Lions [36], motivated by, among other reasons [16], the need to clarify the connection between games with a finite but large number of players and MFGs. Regarding the latter, considerably more scrutinized so far than (1.1) is the so-called first order mean field game system:

\[
\left\{
\begin{array}{lll}
\partial_t U(t,q) + H(q, \nabla_q U(t,q)) + F(q, \sigma_t) = 0 & \text{in } (0,s) \times \mathbb{T}^d, & \text{(a)} \\
\partial_t \sigma_t + \operatorname{div}(\sigma_t \nabla_p H(q, \nabla_q U)) = 0 & \text{in } \mathcal{D}'((0,s) \times \mathbb{T}^d), & \text{(b)} \\
U(0,\cdot) = g(\cdot, \sigma_0), & & \text{(c)} \\
\sigma_s = \mu, & & \text{(d)}
\end{array}
\right.
\tag{1.2}
\]

which describes a Nash-type equilibrium state of a differential game (see footnote 1) played by a continuum of players on $\mathbb{T}^d$ who seek to minimize a certain cost function that depends on the collective behavior of all the players; in such a state, $U(t,q)$ represents the value function of a typical player $q$ at time $t$ and $\sigma_t$ is the mass distribution of all the players at time $t$, represented by a Borel probability measure on $\mathbb{T}^d$. The first equation in (1.2a)-(1.2d) is a forward Hamilton-Jacobi equation and the second a backward continuity equation. These equations can be derived as the optimality conditions of the aforementioned game (see, e.g., [6,12,31]) or as the limit of approximate Nash equilibria in finitely-many player games (e.g. [33]). Alternatively, a solution to (1.1) can be used directly to construct an optimal control for the Nash equilibrium of the mean field game [22]. The now extensive and rapidly growing literature on MFGs includes surveys and books [7,12,18,28] that give comprehensive accounts of the theory, its models and applications, with at least one book [29] devoted entirely to regularity theory of MFGs.

Footnote 1. Systems such as (1.2a)-(1.2d) are often called deterministic because they derive from MFGs where the differential equation that governs the evolution of each player has no stochastic terms, with the resulting MFG system featuring no second order derivatives.

In the present work, the Hamiltonian $H$ is smooth and not necessarily convex in $p$ (convexity is commonly required), while the couplings $F$ and $g$ are smooth and jointly Lipschitz, with a continuous "mixed" derivative (see Section 2.1 for the assumptions). In the master equation, $F$ and $g$ are required to possess some more regularity in both variables. For convenience, we refer the reader to Section 3 for a summary of our results. When $H(q,\cdot)$ is not convex, the typical interpretation of the MFG system as an optimization problem in mean field games is lost, since the Legendre transform of $H$ is not defined (however, nonconvex MFGs appear in the literature [40], [5]). Nevertheless, this does not prevent us from looking at (1.1) and (1.2a)-(1.2d) as a legitimate problem in PDEs. Considerable understanding of the relationship between master equations and mean field game systems has been gained since their inception, and, indeed, an established strategy [16], which we also use here, to construct a solution to the master equation is to use solutions to the MFG system as the starting point; a probabilistic approach can be found, e.g., in [19,18]. However, it is MFG systems that have seen more rigorous and abundant treatment in the literature. Theoretically, this is mostly due to the derivative in measure that appears in (1.1). The main issue with the strategy just mentioned is then to prove that the solutions to (1.2a)-(1.2d) behave nicely enough with respect to the terminal measure $\mu$.
Concerning the system (1.2a)-(1.2d) (see footnote 2), weak solutions in the viscosity sense were initially obtained on $\mathbb{T}^d$ and $\mathbb{R}^d$ for regularizing couplings, Hamiltonians with quadratic growth in the momentum variable and arbitrary time horizons [4,12,13,34]. Uniqueness is usually obtained by imposing monotonicity conditions on the couplings. Several significant modifications and refinements of these results have been obtained since then, e.g., existence and uniqueness of weak solutions in the case of local couplings [14], first-order [15,35,39] and higher [30] Sobolev estimates of such solutions, with different growth conditions of the Hamiltonian, whose convexity in the momentum variable is always required, and absolute continuity of the terminal measure $\mu$; all the approaches in these developments work for arbitrary time horizons. Regarding second order MFGs, i.e.,

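\[
\left\{
\begin{array}{ll}
\Delta U + \partial_t U(t,q) + H(q, \nabla_q U(t,q)) + F(q, \sigma_t) = 0 & \text{in } [0,s] \times \mathbb{T}^d, \\
-\Delta \sigma_t + \partial_t \sigma_t + \operatorname{div}(\sigma_t \nabla_p H(q, \nabla_q U)) = 0 & \text{in } \mathcal{D}'([0,s] \times \mathbb{T}^d),
\end{array}
\right.
\tag{1.3}
\]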

strong solutions for considerably larger classes of coefficients, in which the interaction of the particles enters the equation directly in $H$ and not necessarily through $F$ as a separate summand, have been obtained by Ambrose [1,2], with the requirement of certain smallness assumptions on the data (see, also, [20]). In general, the presence of the viscosity term in (1.3) affords solutions better regularity properties. For this reason, and because (1.3) accommodates models where players act non-deterministically, second order mean field games have received more attention in the scientific literature [28,29]. Incidentally, we see that due to the sign of the viscosity term in the first equation of (1.3), the system can be well posed only if an initial condition is prescribed, while the opposite sign in the second equation makes it mandatory to prescribe the terminal value of $\sigma$.

Footnote 2. Our presentation of the MFG is reversed in time with respect to the one most frequently encountered in the literature, which has a terminal condition for $g$ and an initial one for $\sigma$, and minus signs in front of the time derivatives.


In the course of our work towards the ME, we obtain classical solutions for short times of the system (1.2a)-(1.2d), with the conditions on the coefficients $H, F, g$ mentioned above. With respect to master equations, most available literature has dealt with the higher-order variants [8,9,18,17]. The recent paper by Cardaliaguet et al. [16] includes rigorous proofs of existence of classical solutions for the master equations of second-order MFGs, such as (1.3), and their characterization as the limit of $N$-player Nash systems as $N \to \infty$. As for the first-order master equation (1.1), a major step was achieved by Gangbo-Święch in [24], where short-time strong solutions to both the MFG system (1.2a)-(1.2d) and the ME are obtained for quadratic Hamiltonians and potential-derived couplings. The same result was later proved by Bessi [10] using different techniques. The present paper works with general smoothness conditions on $H$, $F$ and $g$ that include [24] as a particular case, but which constitute a solid generalization, especially since it does not require any geometric assumptions: the Hamiltonian $H$ may be nonconvex in $p$. We use similar ideas and techniques to arrive at the ME, but our route to the MFG system is different.

Let us explain the differences with [24]. Given $H, F, g$ as in Section 2.1, and given $0 < s < T$, $\mu \in \mathcal{P}(\mathbb{T}^d)$, we prove that, granted $T$ is small, there are functions $\Sigma^1 : [0,T] \times \mathbb{T}^d \to \mathbb{T}^d$, $\Sigma^2 : [0,T] \times \mathbb{T}^d \to \mathbb{R}^d$ that solve the infinite-dimensional Hamiltonian system

\[
\partial_t \Sigma^1 = \nabla_p H(\Sigma^1, \Sigma^2), \qquad
\partial_t \Sigma^2 = -\nabla_q H(\Sigma^1, \Sigma^2) - \nabla_q F(\Sigma^1, \Sigma^1_{\#}\mu)
\tag{1.4}
\]

with initial and terminal conditions

\[
\Sigma^2(0,q) = \nabla_q g(\Sigma^1(0,q), \Sigma^1(0,\cdot)_{\#}\mu), \qquad \Sigma^1(s,q) = q,
\]

providing us with a path in $\mathcal{P}(\mathbb{T}^d)$ given by $t \mapsto \sigma_t := \Sigma^1_{t\,\#}\mu$, and we prove that there is a function $U : [0,T] \times \mathbb{T}^d \to \mathbb{R}$ such that (see footnote 3)

\[
\nabla_q U(t, \Sigma^1_t) = \Sigma^2_t.
\tag{1.5}
\]

On the other hand, we have that the velocity vector $v_t$ driving the path $\sigma_t$ satisfies $v(t, \Sigma^1_t) = \partial_t \Sigma^1_t$, and the first equation in (1.4) implies

\[
\nabla_p H(q, \nabla_q U(t, \Sigma^1_t)) = \partial_t \Sigma^1_t.
\tag{1.6}
\]

Comparing (1.6) and (1.5), we see that they are the same if $H(q,p) = \frac{1}{2}|p|^2$, which is the Hamiltonian in [24], and, indeed, in that case, $\partial_t \Sigma^1 = \Sigma^2$, with $\nabla_q U(t,\cdot)$ coinciding with the velocity $v_t$ (a posteriori from (1.5)). Thus, the function $\Sigma^2$ is not present in [24], with $\partial_t \Sigma^1$ taking its place. In their case, to show that $\nabla_q U(t, \Sigma^1_t) = \partial_t \Sigma^1_t$, that is, to show that $\nabla_q U$, where $U$ is the solution to eq. (1.2a), drives the flow of players in eq. (1.2b), they study the following value function on the Wasserstein space (see footnote 4):

\[
\mathcal{U}(s,\mu) = \inf_{(\sigma,v)} \left\{ \int_0^s \int_{\mathbb{T}^d} \bigl( L(q, v_t(q)) - \mathcal{F}(\sigma_t) \bigr)\,\sigma_t(dq)\,dt + \mathcal{G}(\sigma_0) \;:\; \sigma_s = \mu,\ \sigma \in AC^2(0,s;\mathcal{P}(\mathbb{T}^d)) \right\},
\tag{1.7}
\]

where $L = \frac{1}{2}|v|^2$, and $\mathcal{F}, \mathcal{G} : \mathcal{P}(\mathbb{T}^d) \to \mathbb{R}$ are functions whose Wasserstein gradients are $F$ and $g$. By solving the optimality equations, they produce the unique minimizer for this value function, and due to the monotonicity with respect to $|v|$ in $\frac{1}{2}|v|^2$, it follows, by the theory of absolutely continuous curves on the Wasserstein space, that $v$ is the gradient of a function; thus the symmetry of $\nabla_q v_t(q)$ is used to establish $\nabla_q U(t, \Sigma^1_t) = \partial_t \Sigma^1_t$ and is the key to constructing the solution to the MFG system by characteristics. In the case of the general Hamiltonian, it is no longer clear how this approach can give us (1.5). We turn, instead, to a more direct procedure (Lemma 4.19) that also helps to shed further light on how the equations (4.3) are indeed the characteristics of (1.2a)-(1.2d). This viewpoint allows us to present a sort of uniqueness counterpart (Theorem 4.22) to the existence result, namely, that if a solution $(\tilde{U}, \tilde{\sigma})$ is in $W^{2,3;\infty}((0,T) \times \mathbb{T}^d) \times AC^2(0,T;\mathcal{P}(\mathbb{T}^d))$, then it must coincide, at least for a shorter time $T$, with the pair $(U,\sigma)$ constructed from $(\Sigma^1, \Sigma^2)$. With this approach we manage to circumvent the specific potential forms for $F$ and $g$ present in [24].

In Section 5 we work out the differentiability of $\Sigma$ in $\mu$ through the same discretization approach used in [24]. This is followed by the chain rules and Lipschitz estimates of the composite functions that enter the representation formula for $u$ in (4.37). We should say that the formulas for the Wasserstein gradients have to be defined and their Lipschitz estimates proved; there is no general rigorous rule on composite functions that we can invoke. Finally, and due to the preceding remarks about $\partial_t \Sigma^1_t$ and $\Sigma^2_t$, an extra tool (Lemma 2.5) is needed to complete the chain rule for $u$ that is really the essence of the master equation (Theorem 6.7).

Footnote 3. We follow the convention, common in this field, of using the subindex $t$ to mean "at time $t$", and thus a shorthand for $(t, \ldots)$ rather than the time derivative.

Footnote 4. See also [27] for a recent connection between value functionals such as (1.7) and Hopf-Lax formulae on the Wasserstein space.

2. Preliminaries

For full details on the theory of optimal transport and the Wasserstein space of probability measures on the $d$-dimensional torus $\mathbb{T}^d := \mathbb{R}^d/\mathbb{Z}^d$, we refer the reader to [25]. In this section we set down the notation for the paper, present a few general results that will be needed and fix the class of coefficients for the MFG equations. We also fix our meaning of classical solution to the MFG system and to the master equation.

• The set of equivalence classes on $\mathbb{R}^d$ with respect to the equivalence relation

\[
x \sim y \quad \text{iff} \quad \text{there exist integers } n_1, \ldots, n_d \text{ such that } x^{(j)} - y^{(j)} = n_j,\ j = 1, \ldots, d
\]

is denoted by $\mathbb{T}^d$, where $x^{(j)}, y^{(j)}$ are the $j$-th coordinates of $x, y$. If $x, y \in \mathbb{T}^d$ then

\[
|x - y|_{\mathbb{T}^d} := \min\{ |x' - y'| \;:\; x', y' \in \mathbb{R}^d,\ x' \sim x,\ y' \sim y \}.
\]

• If $\mu, \nu$ are Borel probability measures on $\mathbb{R}^d$, $\Gamma(\mu,\nu)$ denotes the set of those Borel probability measures $\gamma$ on $\mathbb{R}^d \times \mathbb{R}^d$ whose marginals are $\mu$ and $\nu$: that is, $\pi^1_{\#}\gamma = \mu$ and $\pi^2_{\#}\gamma = \nu$, where $\pi^1, \pi^2 : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}^d$ are the first and second coordinate projections, respectively, and the subindex $\#$ stands for the pushforward operator.
• We use the standard notation $\mathcal{P}_2(\mathbb{R}^d)$ for the Wasserstein space of Borel probability measures on $\mathbb{R}^d$ whose second moments are finite, with quadratic Wasserstein distance $W_2$.
• For $\mu, \nu \in \mathcal{P}_2(\mathbb{R}^d)$, we define

\[
W(\mu,\nu) = \left( \inf_{\gamma \in \Gamma(\mu,\nu)} \int_{\mathbb{R}^d \times \mathbb{R}^d} |x - y|^2_{\mathbb{T}^d}\,\gamma(dx,dy) \right)^{1/2},
\tag{2.1}
\]

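and let $\Gamma_0(\mu,\nu)$ denote the set of optimal transport plans $\gamma$ between $\mu$ and $\nu$, i.e. those for which the infimum in (2.1) is attained. With the equivalence relation

\[
\mu \sim \nu \quad \text{iff} \quad \int_{\mathbb{R}^d} \varphi\,d\mu = \int_{\mathbb{R}^d} \varphi\,d\nu \ \text{ for all } \varphi \in C(\mathbb{T}^d)
\]

on $\mathcal{P}_2(\mathbb{R}^d)$, where $C(\mathbb{T}^d)$ is the set of all real-valued continuous functions $\varphi$ on $\mathbb{R}^d$ such that $\varphi(x) = \varphi(x')$ whenever $x \sim x'$, it is true that $W(\mu,\nu) = W(\mu',\nu')$ whenever $\mu \sim \mu'$ and $\nu \sim \nu'$. In this way, $W$ in formula (2.1) is defined on the set of equivalence classes, which we henceforth denote by $\mathcal{P}(\mathbb{T}^d)$. Moreover, $W$ is a metric on $\mathcal{P}(\mathbb{T}^d)$, with respect to which $\mathcal{P}(\mathbb{T}^d)$ is compact.

• By a mapping $F : \mathbb{T}^d \to S$, where $S$ is any set, we mean $F : \mathbb{R}^d \to S$ such that $F(x) = F(x')$ whenever $x \sim x'$. Likewise, a mapping $F : \mathcal{P}(\mathbb{T}^d) \to S$ is a function $F : \mathcal{P}(\mathbb{R}^d) \to S$ that takes constant values on the equivalence classes of $\mathcal{P}(\mathbb{T}^d)$. Furthermore, a function $F : \mathbb{T}^d \to \mathbb{T}^d$ is to be understood as a function $F : \mathbb{R}^d \to \mathbb{R}^d$ such that $F(x) \sim F(y)$ whenever $x \sim y$.
• If $x = (x_1, \ldots, x_n) \in (\mathbb{R}^d)^n$, then $\mu^x \in \mathcal{P}_2(\mathbb{R}^d)$ denotes the measure $\mu^x = \frac{1}{n}\sum_{j=1}^n \delta_{x_j}$. Such measures are called averages of Dirac masses.
• If $f, g : \mathbb{R}^d \to \mathbb{R}^d$ are Borel maps, and $\mu \in \mathcal{P}_2(\mathbb{R}^d)$, then, estimating through $(f \times g)_{\#}\mu$, one obtains

\[
W_2(f_{\#}\mu, g_{\#}\mu) \le \|f - g\|_{L^2(\mathbb{R}^d,\mu)}.
\tag{2.2}
\]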

• Let $\mu \in \mathcal{P}_2(\mathbb{R}^d)$. Then $L^2(\mathbb{T}^d,\mu)$ denotes the completion of $C(\mathbb{T}^d)$ with respect to the $L^2(\mathbb{R}^d,\mu)$ norm: $L^2(\mathbb{T}^d,\mu) = \overline{C(\mathbb{T}^d)}^{L^2(\mathbb{R}^d,\mu)}$. At the same time, we define the tangent space to $\mathcal{P}(\mathbb{T}^d)$ at $\mu$, $T_\mu\mathcal{P}(\mathbb{T}^d)$, to be the $L^2(\mathbb{R}^d,\mu)$-completion of the subspace of $L^2(\mathbb{T}^d,\mu)$ consisting of gradients of smooth periodic functions on $\mathbb{R}^d$:

\[
T_\mu\mathcal{P}(\mathbb{T}^d) := \overline{\nabla C^\infty(\mathbb{T}^d;\mathbb{R})}^{L^2(\mathbb{R}^d,\mu)}.
\]

Since $L^2(\mathbb{T}^d,\mu)$ is a Hilbert space, if we have $\xi, \eta \in L^2(\mathbb{T}^d,\mu)$, and $\eta \in T_\mu\mathcal{P}(\mathbb{T}^d)$, then

\[
\int_{\mathbb{T}^d} \xi(x) \cdot \eta(x)\,\mu(dx) = \int_{\mathbb{T}^d} \bar{\xi}(x) \cdot \eta(x)\,\mu(dx),
\tag{2.3}
\]

where $\bar{\xi}$ is the projection of $\xi$ onto $T_\mu\mathcal{P}(\mathbb{T}^d)$.
• Wasserstein distance between averages of Dirac masses. If $\mu, \nu \in \mathcal{P}_2(\mathbb{R}^d)$ are such that $\mu = \frac{1}{n}\sum_{j=1}^n \delta_{x_j}$ and $\nu = \frac{1}{n}\sum_{j=1}^n \delta_{y_j}$, where $x_j \ne x_k$, $y_j \ne y_k$ for $j \ne k$, then there is a permutation $p : \{1,\ldots,n\} \to \{1,\ldots,n\}$ such that

\[
W^2(\mu,\nu) = \frac{1}{n}\sum_{j=1}^n |y_{p(j)} - x_j|^2_{\mathbb{T}^d}.
\]

• We denote by $AC^2(0,T;\mathcal{P}(\mathbb{T}^d))$ the set of paths $\mu : (0,T) \to \mathcal{P}(\mathbb{T}^d)$ for which there exists $m \in L^2(0,T)$ such that $W(\mu_{t_1},\mu_{t_2}) \le \int_{t_1}^{t_2} m(\tau)\,d\tau$ whenever $0 < t_1 \le t_2 < T$.
• We say that a time-dependent vector field $v_t : \mathbb{T}^d \to \mathbb{R}^d$ is a velocity vector field for the absolutely continuous path $\mu_t$ if $v_t \in L^p(\mathbb{T}^d,\mu_t)$,

\[
\int_0^T \int_{\mathbb{T}^d} |v_t(q)|\,\mu_t(dq)\,dt < \infty
\]

and the continuity equation holds:

\[
\partial_t \mu_t + \operatorname{div}(v_t \mu_t) = 0 \quad \text{in } \mathcal{D}'((0,T) \times \mathbb{T}^d).
\]

• A path $\mu_t$ in $\mathcal{P}(\mathbb{T}^d)$, $0 \le t \le 1$, is said to be a constant-speed geodesic if

\[
W(\mu_{t_1},\mu_{t_2}) = |t_2 - t_1|\,W(\mu_0,\mu_1), \qquad t_1, t_2 \in [0,1].
\]
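The torus distance, the permutation formula for the distance between averages of Dirac masses, and the pushforward estimate (2.2) listed above can be checked numerically on small examples. The following is a minimal sketch and not part of the paper: the function names (`torus_dist`, `wasserstein_dirac`, `l2_gap`) and the sample points and maps are illustrative assumptions, and the brute-force search over permutations is only practical for a handful of atoms with equal weights.

```python
import itertools
import math

def torus_dist(x, y):
    """Distance |x - y|_{T^d}: each coordinate is minimized over integer shifts."""
    total = 0.0
    for a, b in zip(x, y):
        delta = (a - b) % 1.0            # representative of the difference in [0, 1)
        delta = min(delta, 1.0 - delta)  # shortest way around the circle
        total += delta * delta
    return math.sqrt(total)

def wasserstein_dirac(xs, ys):
    """W between (1/n) sum delta_{x_j} and (1/n) sum delta_{y_j}.

    Hypothetical helper: minimizes the matching cost over all permutations,
    so the cost grows like n! and n must stay small.
    """
    n = len(xs)
    assert n == len(ys)
    best = min(
        sum(torus_dist(xs[j], ys[p[j]]) ** 2 for j in range(n))
        for p in itertools.permutations(range(n))
    )
    return math.sqrt(best / n)

def l2_gap(f, g, xs):
    """||f - g||_{L^2(mu)} for mu the average of Dirac masses at xs."""
    return math.sqrt(sum(math.dist(f(x), g(x)) ** 2 for x in xs) / len(xs))

# Two averages of Dirac masses on the 1-dimensional torus (illustrative data).
xs = [(0.1,), (0.9,)]
ys = [(0.95,), (0.2,)]
w = wasserstein_dirac(xs, ys)  # the optimal matching pairs 0.1 with 0.2 and 0.9 with 0.95

# Illustration of (2.2) with two arbitrary maps f and g (assumptions).
f = lambda x: (x[0] + 0.05,)
g = lambda x: (0.5 * x[0],)
lhs = wasserstein_dirac([f(x) for x in xs], [g(x) for x in xs])
rhs = l2_gap(f, g, xs)
```

In this example the identity matching is not optimal, which is why the minimum over permutations is needed; the bound (2.2) corresponds to the cost of the particular coupling $(f \times g)_{\#}\mu$.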

Given a path $\mu_t$ in $\mathcal{P}(\mathbb{T}^d)$, a velocity $v_t$ for $\mu_t$ is in $L^2(\mathbb{T}^d;\mu_t)$ but it may or may not be in $T_{\mu_t}\mathcal{P}(\mathbb{T}^d)$. However, if $\mu_t$ is an $AC^2(0,T;\mathcal{P}(\mathbb{T}^d))$ path, a velocity field of minimal $L^2(\mathbb{T}^d,\mu_t)$-norm always exists, and it belongs to $T_{\mu_t}\mathcal{P}(\mathbb{T}^d)$. This is the content of [3, Theorem 8.3.1].

Remark 2.1. Let $\mu, \nu \in \mathcal{P}(\mathbb{T}^d)$, $\gamma \in \Gamma_0(\mu,\nu)$. For each $0 \le \tau \le 1$, let $\mu_\tau := [(1-\tau)\pi^1 + \tau\pi^2]_{\#}\gamma$. Let $w^\tau$, $0 \le \tau \le 1$, be the velocity vector field of minimal norm for $\mu_\tau$.

(i) For $0 \le \tau \le 1$, $\|w^\tau\|_{L^2(\mu_\tau)} = W(\mu,\nu)$.

(ii) For every $f \in C(\mathbb{T}^d;\mathbb{R}^d)$, every $0 \le \tau \le 1$,

\[
\int_{\mathbb{T}^d \times \mathbb{T}^d} f((1-\tau)x + \tau y) \cdot w^\tau((1-\tau)x + \tau y)\,\gamma(dx,dy)
= \int_{\mathbb{T}^d \times \mathbb{T}^d} f((1-\tau)x + \tau y) \cdot (y - x)\,\gamma(dx,dy).
\]

(iii) Furthermore, fix $\tau \in (0,1)$, and let $\gamma^\tau \in \Gamma_0(\mu,\mu_\tau)$. Then, for every $f_1, f_2 \in C(\mathbb{T}^d;\mathbb{R}^d)$,

\[
\int_{\mathbb{T}^d \times \mathbb{T}^d} [f_2(y) \cdot w^\tau(y) - f_1(x) \cdot w^0(x)]\,\gamma^\tau(dx,dy)
= \int_{\mathbb{T}^d \times \mathbb{T}^d} [f_2(y) - f_1(x)] \cdot \frac{y - x}{\tau}\,\gamma^\tau(dx,dy).
\]

We omit the proof of this remark, for which the reader can refer to [37].

• Density of averages of Dirac masses in $\mathcal{P}(\mathbb{T}^d)$. Let $\mu \in \mathcal{P}_2(\mathbb{R}^d)$. As is well known (see, for instance, [11, Ex. 8.1.6]), the set of averages of Dirac masses is dense in $\mathcal{P}_2(\mathbb{R}^d)$ with respect to narrow convergence. In $\mathcal{P}(\mathbb{T}^d)$, this convergence coincides with convergence in $W$. Thus, there exists a sequence $\{\mu(n)\}_{n=1}^\infty \subset \mathcal{P}_2(\mathbb{R}^d)$, with $\mu(n) = \frac{1}{n}\sum_{j=1}^n \delta_{x_j(n)}$ an average of Dirac masses, such that $W(\mu,\mu(n)) \to 0$ as $n \to \infty$. Moreover, this sequence can be chosen so that each $x_j(n) \in \operatorname{supp}(\mu)$, where $\operatorname{supp}(\mu)$ is the support of the measure $\mu$.

2.1. Assumptions for the mean-field game equations

1. Let $H \in C^3(\mathbb{T}^d \times \mathbb{R}^d)$, $H = H(q,p)$. In this manuscript, $\nabla_q H(\cdot,\cdot)$ will always denote the gradient of $H$ with respect to $q$, evaluated at $(\cdot,\cdot)$. Similarly for $\nabla_p H(\cdot,\cdot)$, and higher-order derivatives.
2. Let $F = F(q,\mu)$, $q \in \mathbb{T}^d$, $\mu \in \mathcal{P}(\mathbb{T}^d)$, be continuous in the $\mu$ variable and of class $C^3$ in $q$, and let $\kappa > 0$ be a constant such that

\[
|\nabla_q F(q,\mu)|,\ |\nabla^2_{qq} F(q,\mu)|,\ |\nabla^3_{qqq} F(q,\mu)| \le \kappa, \qquad q \in \mathbb{T}^d,\ \mu \in \mathcal{P}(\mathbb{T}^d).
\]

Suppose, further, that $\nabla_q F$ is $\kappa$-Lipschitz on $\mathbb{T}^d \times \mathcal{P}(\mathbb{T}^d)$, meaning that

\[
|\nabla_q F(q_1,\mu_1) - \nabla_q F(q_2,\mu_2)| \le \kappa \sqrt{|q_1 - q_2|^2 + W^2(\mu_1,\mu_2)}, \qquad q_1, q_2 \in \mathbb{T}^d,\ \mu_1, \mu_2 \in \mathcal{P}(\mathbb{T}^d).
\]

3. Furthermore, we require that the vector field $\nabla_q F(q,\mu)$ is differentiable with respect to $\mu$ at every $\mu$ and every $q$, and that

\[
\nabla_\mu \nabla_q F(q,\mu)(x) =: \nabla^2_{\mu q} F(q,\mu)(x)
\]

is continuous in $(q,\mu,x)$ (hence, uniformly bounded).
4. Let $g = g(q,\mu)$, $q \in \mathbb{T}^d$, $\mu \in \mathcal{P}(\mathbb{T}^d)$, and suppose $g$ satisfies exactly the same conditions asked of $F$.

We call the triple $(H,F,g)$ the coefficients for the mean-field game equations.

2.2. Assumptions for the master equation

In addition to the previous set of conditions, here we suppose that the functions $(q,\mu) \mapsto F(q,\mu)$, $(q,\mu) \mapsto g(q,\mu)$ are twice differentiable in the sense explained below in Section 2.4.1, and that $\nabla_\mu F$, $\nabla^2_{q\mu} F$, $\nabla^2_{\mu\mu} F$, $\nabla^2_{x\mu} F$ are continuous in all their variables (and, therefore, uniformly bounded). We suppose an identical statement holds for $g$.

Examples.

1. The following is the case in [24]:

\[
F(q,\mu) = \int_{\mathbb{T}^d} \phi(q - y)\,\mu(dy), \qquad
g(q,\mu) = U^0(q) + \int_{\mathbb{T}^d} U^1(q - y)\,\mu(dy),
\]

where $U^0, U^1, \phi$ are smooth functions and $\phi, U^1$ are even.
2. We can take

\[
F(q,\mu) = U(q) + \frac{1}{m}\int_{(\mathbb{T}^d)^m} \Phi(q, y_1, \ldots, y_m)\,\mu(dy_1) \cdots \mu(dy_m)
\]

(and similarly for $g$), where $U$ and $\Phi$ are smooth and $\Phi$ is symmetric in its $m+1$ variables; see [23].

2.3. Definitions of classical (strong) solutions

Let $T > 0$, and $F, g : \mathbb{T}^d \times \mathcal{P}(\mathbb{T}^d) \to \mathbb{R}$ be continuous; let $H : \mathbb{T}^d \times \mathbb{R}^d \to \mathbb{R}$ be continuous and differentiable in $p$.

MFG system. Let $0 < s < T$, $\mu \in \mathcal{P}(\mathbb{T}^d)$. We say that the pair of functions $U : (0,T) \times \mathbb{T}^d \to \mathbb{R}$, $\sigma : (0,T) \to \mathcal{P}(\mathbb{T}^d)$ is a classical solution to the first-order MFG system (1.2a)-(1.2d) on $\mathbb{T}^d$ with coefficients $(H,F,g)$ and parameters $s, \mu$ if the following hold:

• $U \in C^1((0,T) \times \mathbb{T}^d)$;
• the path $\sigma \in AC^2(0,T;\mathcal{P}(\mathbb{T}^d))$ and (1.2b) is true in the sense of distributions, i.e., for every $\varphi \in C_c^\infty((0,T) \times \mathbb{T}^d)$:

\[
\int_0^T \int_{\mathbb{R}^d} [\partial_t \varphi(t,q) + \nabla\varphi(t,q) \cdot \nabla_p H(q, \nabla_q U(t,q))]\,\sigma_t(dq)\,dt = 0;
\]

• equation (1.2a) is satisfied pointwise, along with the condition (1.2c) at time $t = 0$ for $U$ and the condition (1.2d) at time $t = s$ for $\sigma$.

We will often refer to the function $U$ in (1.2a) as the value function.

Master equation. We say that the function $u : (0,T) \times \mathbb{T}^d \times \mathcal{P}(\mathbb{T}^d) \to \mathbb{R}$ is a classical solution to the master equation of first-order MFGs (1.1) with coefficients $(H,F,g)$ if:

• $u$ is differentiable in $s$, with $\partial_s u(\cdot,\cdot,\mu)$ continuous at every $\mu \in \mathcal{P}(\mathbb{T}^d)$;
• $u$ is differentiable in $q$, with $\nabla_q u$ continuous in all three variables;
• $u$ is differentiable in $\mu$ (see the following section), and $u$ satisfies (1.1) pointwise.

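We will refer to the function $u$ in (1.1) as the full value function.

2.4. Differentiability in the Wasserstein space

Let $\mathcal{W}$ be a real-valued function on $\mathcal{P}(\mathbb{T}^d)$ and let $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ be fixed. For $\xi \in L^2(\mathbb{T}^d,\mu)$, $\nu \in \mathcal{P}_2(\mathbb{R}^d)$, $\gamma \in \Gamma(\mu,\nu)$, define

\[
e(\nu,\xi,\gamma) := \mathcal{W}(\nu) - \mathcal{W}(\mu) - \int_{\mathbb{R}^d \times \mathbb{R}^d} \xi(x) \cdot (y - x)\,\gamma(dx,dy).
\]

We have chosen to present this section with a notation similar to the one found in the paper [26], which unifies the different notions of differentiability on $\mathcal{P}_2(\mathbb{R}^d)$ used in the literature. If $r > 0$, set

\[
e[\xi,r] = \sup_{\nu \in \mathcal{P}_2(\mathbb{R}^d)}\ \sup_{\gamma \in \Gamma(\mu,\nu)} \left\{ \frac{|e(\nu,\xi,\gamma)|}{\|\pi^1 - \pi^2\|_\gamma} \;:\; \|\pi^1 - \pi^2\|_\gamma \le r \right\}
\]

and

\[
e_0[\xi,r] = \sup_{\nu \in \mathcal{P}_2(\mathbb{R}^d)}\ \sup_{\gamma \in \Gamma_0(\mu,\nu)} \left\{ \frac{|e(\nu,\xi,\gamma)|}{\|\pi^1 - \pi^2\|_\gamma} \;:\; \|\pi^1 - \pi^2\|_\gamma \le r \right\}.
\]

Here $\pi^1 : \mathbb{T}^d \times \mathbb{T}^d \to \mathbb{T}^d$ denotes projection onto the first component, and $\pi^2$ onto the second.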


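Definition 2.2. With the preceding notation, we say that $\mathcal{W}$ is differentiable at $\mu$ if, for some $\xi \in L^2(\mathbb{T}^d,\mu)$,

\[
\lim_{r \to 0^+} e_0[\xi,r] = 0.
\tag{2.4}
\]

The set of all $\xi \in L^2(\mathbb{T}^d,\mu)$ for which (2.4) holds is denoted $\partial\mathcal{W}(\mu)$.

Lemma 2.3. If $\xi \in \partial\mathcal{W}(\mu)$, then so is its projection $\bar{\xi}$ onto $T_\mu\mathcal{P}(\mathbb{T}^d)$, which is then the unique element of minimal norm in $\partial\mathcal{W}(\mu)$ and is denoted by $\nabla_\mu\mathcal{W}(\mu)$. We will call it the Wasserstein gradient of $\mathcal{W}$ at $\mu$.

Its proof can be found in [25].

Remark 2.4. The following is an alternative characterization of a vector field $\xi \in L^2(\mathbb{T}^d,\mu)$ that satisfies (2.4):

\[
\sup_{\gamma \in \Gamma_0(\mu,\nu)} \left| \mathcal{W}(\nu) - \mathcal{W}(\mu) - \int_{\mathbb{R}^d \times \mathbb{R}^d} \xi(x) \cdot (y - x)\,\gamma(dx,dy) \right| = o(W(\mu,\nu)).
\tag{2.5}
\]

Likewise, $\xi$ satisfies $\lim_{r \to 0^+} e[\xi,r] = 0$ if and only if

\[
\sup_{\gamma \in \Gamma(\mu,\nu)} \left| \mathcal{W}(\nu) - \mathcal{W}(\mu) - \int_{\mathbb{R}^d \times \mathbb{R}^d} \xi(x) \cdot (y - x)\,\gamma(dx,dy) \right| = o(W(\mu,\nu)).
\]

The following lemma will be used in the final section.

Lemma 2.5. With the foregoing notation, if $\mathcal{W} : \mathcal{P}(\mathbb{T}^d) \to \mathbb{R}$ is differentiable at $\mu$, then

\[
\lim_{r \to 0^+} e[\nabla_\mu\mathcal{W}(\mu), r] = 0.
\]

By Remark 2.4, this is the same as

\[
\sup_{\gamma \in \Gamma(\mu,\nu)} \left| \mathcal{W}(\nu) - \mathcal{W}(\mu) - \int_{\mathbb{R}^d \times \mathbb{R}^d} \nabla_\mu\mathcal{W}(\mu)(x) \cdot (y - x)\,\gamma(dx,dy) \right| = o(W(\mu,\nu)).
\]

Proof. See the Appendix. □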
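As a hedged illustration of these definitions (an example supplied here, not taken from the paper), consider the linear functional $\mathcal{W}(\mu) = \int f\,d\mu$ for a smooth periodic function $f$; the symbol $f$ is an assumption introduced for this example, and $\|\pi^1 - \pi^2\|_\gamma$ is assumed to be the $L^2(\gamma)$ norm of $(x,y) \mapsto x - y$. A second-order Taylor expansion of $f$ then suggests that $\xi = \nabla f$ satisfies both limits above:

```latex
\begin{align*}
e(\nu,\nabla f,\gamma)
  &= \int_{\mathbb{R}^d\times\mathbb{R}^d}
     \bigl[f(y)-f(x)-\nabla f(x)\cdot(y-x)\bigr]\,\gamma(dx,dy),\\
|e(\nu,\nabla f,\gamma)|
  &\le \tfrac12\,\|\nabla^2 f\|_\infty
     \int_{\mathbb{R}^d\times\mathbb{R}^d}|y-x|^2\,\gamma(dx,dy)
   = \tfrac12\,\|\nabla^2 f\|_\infty\,\|\pi^1-\pi^2\|_\gamma^2,\\
e_0[\nabla f,r] \le e[\nabla f,r]
  &\le \tfrac12\,\|\nabla^2 f\|_\infty\,r \;\longrightarrow\; 0
   \quad (r\to 0^+).
\end{align*}
```

Under these assumptions $\nabla f \in \partial\mathcal{W}(\mu)$, and since $\nabla f$ is the gradient of a smooth periodic function it already lies in $T_\mu\mathcal{P}(\mathbb{T}^d)$, so it would be the Wasserstein gradient in the sense of Lemma 2.3. In this simple case the conclusion of Lemma 2.5 is visible directly, because the estimate never uses optimality of $\gamma$.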


2.4.1. Twice differentiability In [23], the notion of Hessian of a function on the Wasserstein space is defined. We will follow the same framework. Let ρ,  be moduli of continuity, with ρ concave. We will say that a function V : T d × P(T d ) → R is twice differentiable at (q, μ) if the following hold: • the mapping x → ∇μ V (q, ν)(x) exists and is differentiable for every ν in a neighborhood of 2 V (q, ν)(x); μ, with its derivative denoted by ∇xμ 2 V (q, μ)(x) exists; • the gradient ∇q ∇μ V (q, μ)(x) =: ∇qμ • there exists a Borel, bounded matrix-valued function Aμμ : (T d )3 → Rd×d such that sup

γ ∈0 (μ,ν)

2 |∇μ V (q, ¯ ν)(y) − ∇μ V (q, μ)(x) − ∇qμ V (q, μ)(x)(q¯ − q) − Pγ [μ](q, x, y)|





≤ o(|q¯ − q|) + W (μ, ν) + |x − y| ρ(W (μ, ν)) + (|x − y|) , where  2 Pγ [μ](q, x, y) = ∇xμ V (q, μ)(x)(y − x) +

Aμμ (q, x, a)(b − a)γ (da, db).

T d ×T d

Without loss of generality, we may suppose that Aμμ (q, x, ·) ∈ Tμ P(T d ) for all q, x. We put 2 ∇μμ V (q, μ)(·, ·) := Aμμ (q, ·, ·),

q ∈ Td.

In regard to the former definition and notation, the following fact will be useful. Proposition 2.6. Let V : T d × P(T d ) → R be twice differentiable, in the sense explained above. Let h → q h , h → x h , be differentiable paths in T d defined on an interval I , and μh ∈ AC 2 (I ; P(T d )), with vh a continuous in h velocity vector field for μh . (i) There exists a set J ⊂ I , of equal measure to that of I , such that, if h0 ∈ I , then the function h → ∇μ V (q h , μh )(x h ) is differentiable at h0 and d [∇μ V (q h , μh )(x h )] h=h = 0 dh 2 2 = ∇qh V (q h0 , μh0 )(x h0 )(q h ) |h=h0 +∇xh V (q h0 ), μh0 )(x h0 )(x h ) |h=h0  2 + ∇μμ V (q h0 , μh0 )(x h0 , r)vh0 (r)μh0 (dr). Td


2 V , ∇ 2 V , ∇ 2 V are continuous, and the paths h → x h , h → q h are in C 1 (I ), then (ii) If ∇qμ xμ μμ

b ∇μ V (q , μb )(x ) − ∇μ V (q , μa )(x ) = b

b

a

a

d ∇μ V (q h , μh )(x h )dh dh

a

for any $a, b \in I$.

Proof. See the Appendix. □

3. Main statements

We collect here the three main statements that are proved in this paper. Let $H, F, g$ be as in Section 2.1.

Statement 1. (Theorem 4.21) If $T$ is sufficiently small, in a way that depends only on the coefficients $(H,F,g)$, then, for every $0 < s < T$, $\mu \in \mathcal{P}(\mathbb{T}^d)$, the MFG system (1.2a)-(1.2d) admits a classical solution $(U,\sigma)$, in the sense of Section 2.3. Moreover, $(U,\sigma) \in W^{2,2;\infty}((0,T) \times \mathbb{T}^d) \times AC^2(0,T;\mathcal{P}(\mathbb{T}^d))$.

Statement 2. (Theorem 4.22) If $(\tilde{U},\tilde{\sigma}) \in W^{2,3;\infty}((0,T) \times \mathbb{T}^d) \times AC^2(0,T;\mathcal{P}(\mathbb{T}^d))$ is a classical solution to the MFG system (1.2a)-(1.2d), then, at least during a possibly shorter interval $[0,T]$ than the one in the previous statement, the pair $(\tilde{U},\tilde{\sigma})$ must be the pair constructed for Theorem 4.21.

Additionally, let $F, g$ be as in Section 2.2.

Statement 3. (Theorem 6.7) If $T$ is small enough, in a way that depends only on the coefficients $(H,F,g)$, then the master equation (1.1) admits a classical solution in the sense of Section 2.3.

4. Solving the MFG system

In this section, we construct a solution to the first-order MFG system (1.2a)-(1.2d). A fixed-point argument will give us existence and uniqueness of solutions to the characteristics of the system. These solutions are functions $\Sigma : [0,T] \times \mathbb{T}^d \to \mathbb{T}^d \times \mathbb{R}^d$ that depend on $s$ and $\mu$ and are the backbone of our work. They incorporate enough regularity that we can construct classical solutions (in the sense defined above) to the MFG system on $\mathbb{T}^d$.

4.1. System of equations and its solution

For $T > 0$, we will denote by $M$ the space of continuous functions $Z = (Q,P) : [0,T] \times \mathbb{T}^d \to \mathbb{T}^d \times \mathbb{R}^d$, endowed with the uniform norm,

\[
\|Z\|_\infty = \max_{0 \le t \le T,\ q \in \mathbb{T}^d} |Z(t,q)| = \max\{ (|Q(t,q)|^2 + |P(t,q)|^2)^{1/2} \;|\; t \in [0,T],\ q \in \mathbb{T}^d \}.
\]

That is, $M = C([0,T] \times \mathbb{T}^d; \mathbb{T}^d \times \mathbb{R}^d)$. Similarly, let

\[
M_1 := C([0,T] \times \mathbb{T}^d; \mathbb{T}^d), \qquad M_2 := C([0,T] \times \mathbb{T}^d; \mathbb{R}^d).
\]

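Definition 4.1. Let $\theta \in \mathbb{R}_+$, and fix $\mu \in \mathcal{P}(\mathbb{T}^d)$, $s \in [0,T]$.

1. (Fixed point operator) Define the operator $\bar{m}^{s,\mu} : M \to M$, $\bar{m}^{s,\mu} = ((\bar{m}^{s,\mu})^1, (\bar{m}^{s,\mu})^2)$, as follows: if $\bar{Z} = (\bar{Q},\bar{P}) \in M$, then $\bar{m}^{s,\mu}(\bar{Z}) = ((\bar{m}^{s,\mu})^1(\bar{Z}), (\bar{m}^{s,\mu})^2(\bar{Z}))$, where:

\[
(\bar{m}^{s,\mu})^1(\bar{Z})(t,q) = q + \int_s^t \nabla_p H(\bar{Q}_\tau(q), \theta\bar{P}_\tau(q))\,d\tau,
\tag{4.1}
\]

\[
(\bar{m}^{s,\mu})^2(\bar{Z})(t,q) = \frac{1}{\theta}\nabla_q g(\bar{Q}_0(q), \bar{Q}_{0\,\#}\mu) - \frac{1}{\theta}\int_0^t \bigl[ \nabla_q H(\bar{Q}_\tau(q), \theta\bar{P}_\tau(q)) + \nabla_q F(\bar{Q}_\tau(q), \bar{Q}_{\tau\,\#}\mu) \bigr]\,d\tau,
\tag{4.2}
\]

$0 \le t \le T$, $q \in \mathbb{R}^d$. In equalities (4.1) and (4.2), $\bar{Q}_\tau(q) := \bar{Q}(\tau,q)$, $\bar{P}_\tau(q) := \bar{P}(\tau,q)$, $\tau \in [0,T]$, $q \in \mathbb{R}^d$.

2. (Coefficient bounds I) For $B > 0$, let

\[
\bar{l}(B) := \max_{q \in \mathbb{R}^d,\ |p| \le B,\ \mu \in \mathcal{P}(\mathbb{T}^d)} \left\{ \sqrt{2}\,|\nabla H(q,\theta p)| + |\nabla_q F(q,\mu)| \right\},
\]

\[
\bar{h}(B) := \max_{q \in \mathbb{R}^d,\ |p| \le B,\ \mu \in \mathcal{P}(\mathbb{T}^d)} \left\{ \sqrt{2}\,|\nabla^2 H(q,\theta p)| + \sqrt{2}\,|\nabla^3 H(q,\theta p)| + |\nabla^2_{qq} F(q,\mu)| + |\nabla^3_{qqq} F(q,\mu)| \right\},
\]

\[
c := \max\{d, \kappa\}.
\]

Thus, for a fixed $B$, the numbers $\bar{l}(B)$, $\bar{h}(B)$, $c$ depend only on the coefficients $(H,F,g)$.

Notes.
(1) Since $\bar{Q}, \bar{P}$ are periodic in $q$ (i.e., $q \in \mathbb{T}^d$), if $q' \sim q$ then $(\bar{m}^{s,\mu})^1(\bar{Z})(t,q) \sim (\bar{m}^{s,\mu})^1(\bar{Z})(t,q')$, so $(\bar{m}^{s,\mu})^1(\bar{Z})(t,\cdot)$ is indeed a mapping into $\mathbb{T}^d$, in the sense explained in the Preliminaries.
(2) Both the fixed-point operator $\bar{m}^{s,\mu}$ and the coefficient bounds depend on the value of $\theta$.
(3) Throughout this text, $|\nabla H(q,p)|^2 = \sum_{j=1}^d \left( \frac{\partial H}{\partial q^{(j)}}(q,p) \right)^2 + \sum_{j=1}^d \left( \frac{\partial H}{\partial p^{(j)}}(q,p) \right)^2$, and the norms of second order derivatives are defined similarly, i.e., we are using quadratic norms.

Suppose that the operator $\bar{m}^{s,\mu}$ has a fixed point $(\bar{Q},\bar{P})$, so on the left-hand side of (4.1) and (4.2) we would see $\bar{Q}(t,q)$ and $\bar{P}(t,q)$ respectively. Set $Q := \bar{Q}$ and $P := \theta\bar{P}$. Then $Z := (Q,P)$ satisfies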

JID:YJDEQ AID:10108 /FLA

[m1+; v1.304; Prn:25/11/2019; 13:20] P.15 (1-68)

S. Mayorga / J. Differential Equations ••• (••••) •••–•••

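$$\begin{cases}
\partial_t Q(t, q) = \nabla_p H(Q(t, q), P(t, q)) & \text{in } [0, s] \times \mathbb{T}^d, \\
\partial_t P(t, q) = -\nabla_q H(Q(t, q), P(t, q)) - \nabla_q F(Q(t, q), Q(t, \cdot) \# \mu) & \text{in } [0, s] \times \mathbb{T}^d, \\
Q(s, q) = q & \text{on } \mathbb{T}^d, \\
P(0, q) = \nabla_q g(Q(0, q), Q(0, \cdot) \# \mu) & \text{on } \mathbb{T}^d.
\end{cases} \tag{4.3}$$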
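The passage from a fixed point of (4.1)-(4.2) to a solution of (4.3) can be illustrated numerically. The following is only a minimal sketch under assumptions that are not in the paper: $d = 1$, $\theta = 1$, the toy data $H(q,p) = p^2/2$ and $F(q,\mu) = g(q,\mu) = \frac12 \int (1 - \cos(q - y))\,\mu(dy)$, an empirical measure $\mu$ with four atoms (so that $Q_\tau \# \mu$ is the empirical measure of the transported atoms), and a trapezoidal time discretization. All function names are hypothetical.

```python
import math

# Assumed toy data (not from the paper): d = 1, theta = 1, H(q, p) = p^2 / 2,
# F(q, mu) = g(q, mu) = (1/2) * average over atoms y of mu of (1 - cos(q - y)).
def grad_p_H(q, p):
    return p

def grad_q_H(q, p):
    return 0.0

def grad_q_F(q, cloud):
    return sum(math.sin(q - y) for y in cloud) / (2.0 * len(cloud))

grad_q_g = grad_q_F


def picard_step(Q, P, atoms, s_index, dt):
    """Apply a time-discretized version of the operator (4.1)-(4.2), theta = 1, once."""
    n_t, n = len(Q), len(atoms)
    new_Q = [[0.0] * n for _ in range(n_t)]
    new_P = [[0.0] * n for _ in range(n_t)]
    for i in range(n):
        # Cumulative trapezoidal integrals over [0, t_k] of velocity and force.
        cum_v, cum_f = [0.0], [0.0]
        for k in range(n_t - 1):
            v0 = grad_p_H(Q[k][i], P[k][i])
            v1 = grad_p_H(Q[k + 1][i], P[k + 1][i])
            f0 = grad_q_H(Q[k][i], P[k][i]) + grad_q_F(Q[k][i], Q[k])
            f1 = grad_q_H(Q[k + 1][i], P[k + 1][i]) + grad_q_F(Q[k + 1][i], Q[k + 1])
            cum_v.append(cum_v[-1] + 0.5 * dt * (v0 + v1))
            cum_f.append(cum_f[-1] + 0.5 * dt * (f0 + f1))
        p0 = grad_q_g(Q[0][i], Q[0])
        for k in range(n_t):
            new_Q[k][i] = atoms[i] + (cum_v[k] - cum_v[s_index])  # integral from s to t
            new_P[k][i] = p0 - cum_f[k]                            # integral from 0 to t
    return new_Q, new_P


def solve_hamiltonian_odes(atoms, T=0.2, n_steps=20, s_index=20, n_iter=40):
    """Iterate the discretized operator starting from Q(t, q) = q, P = 0."""
    dt = T / n_steps
    Q = [list(atoms) for _ in range(n_steps + 1)]
    P = [[0.0] * len(atoms) for _ in range(n_steps + 1)]
    gaps = []  # sup-norm distance between successive iterates
    for _ in range(n_iter):
        new_Q, new_P = picard_step(Q, P, atoms, s_index, dt)
        gaps.append(max(
            max(abs(a - b) for ra, rb in zip(new_Q, Q) for a, b in zip(ra, rb)),
            max(abs(a - b) for ra, rb in zip(new_P, P) for a, b in zip(ra, rb)),
        ))
        Q, P = new_Q, new_P
    return Q, P, gaps


atoms = [0.0, 1.0, 2.5, 4.0]
Q, P, gaps = solve_hamiltonian_odes(atoms)
```

In this toy setting the successive gaps shrink because the horizon `T` is short; the limit satisfies `Q(s, q) = q` at the index `s_index` and `P(0, q) = grad_q_g(Q(0, q), Q(0, .)#mu)` up to the iteration tolerance, mirroring the two boundary conditions of (4.3).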

We will refer to the system (4.3) as the Hamiltonian ODEs with parameters $s$ and $\mu$.

Definition 4.2. If $T > 0$ and $A_1, A_2, B, E, E_1, E_2 > 0$, define $\mathcal{M}_0(A_1, A_2, B, E, E_1, E_2, T) \subset \mathcal{M}$ to be the subset of those $\bar Z(\cdot, \cdot) = (\bar Q(\cdot, \cdot), \bar P(\cdot, \cdot))$ such that:

(i) $\bar Z(\cdot, \cdot)$ belongs to $W^{2,2;\infty}([0, T] \times \mathbb{T}^d; \mathbb{T}^d \times \mathbb{R}^d)$;

(ii) the following bounds hold:

$$\begin{cases}
\|\partial_t \bar Q\|_{\mathcal{M}_1} \le A_1, \quad \|\nabla_q \bar Q\|_{C([0,T] \times \mathbb{T}^d; \mathbb{T}^{d^2})} \le A_1, \quad \|\nabla^2_{qq} \bar Q\|_{C([0,T] \times \mathbb{T}^d; \mathbb{T}^{d^3})} \le A_1; \\
\|\partial_t \bar P\|_{\mathcal{M}_2} \le A_2, \quad \|\nabla_q \bar P\|_{C([0,T] \times \mathbb{T}^d; \mathbb{R}^{d^2})} \le A_2, \quad \|\nabla^2_{qq} \bar P\|_{C([0,T] \times \mathbb{T}^d; \mathbb{R}^{d^3})} \le A_2; \\
\|\bar P\|_{\mathcal{M}_2} \le B; \\
\|\nabla_q \bar Q_0\|_{C(\mathbb{T}^d; \mathbb{T}^{d^2})}, \ \|\nabla^2_{qq} \bar Q_0\|_{C(\mathbb{T}^d; \mathbb{T}^{d^3})} \le E;
\end{cases} \tag{4.4}$$

(iii) $\|\partial^2_{tt} \bar Q\|_{\mathcal{M}_1} \le E_1$, $\|\partial^2_{tt} \bar P\|_{\mathcal{M}_2} \le E_2$.

Here $W^{2,2;\infty}([0, T] \times \mathbb{T}^d; \mathbb{T}^d \times \mathbb{R}^d)$ is the Sobolev space of functions periodic in $q$, taking values in $\mathbb{T}^d \times \mathbb{R}^d$, with essentially bounded second-order weak derivatives in $t$ and second-order weak gradients in $q$. Since functions in $W^{1,1;\infty}$ are Lipschitz, $\mathcal{M}_0(A_1, A_2, B, E, E_1, E_2, T)$ is indeed a subset of $\mathcal{M}$. The following is a standard fact, so we will omit its proof.

Proposition 4.3. For any $A_1, A_2, B, E, E_1, E_2, T > 0$, $\mathcal{M}_0(A_1, A_2, B, E, E_1, E_2, T)$ is closed in $\mathcal{M}$.

Lemma 4.4. Let $\theta > 0$ of Definition 4.1 be arbitrary. There exist $A_1, A_2, B, E, E_1, E_2 > 0$, and $T > 0$, such that $\bar m^{s,\mu}$ maps $\mathcal{M}_0(A_1, A_2, B, E, E_1, E_2, T)$ into itself, for any $s \in (0, T)$ and $\mu \in \mathcal{P}(\mathbb{T}^d)$. The numbers $A_1, A_2, B, E, E_1, E_2, T$ depend only on the coefficients.

Proof. Observe that$^5$

$$|(\bar m^{s,\mu})^2(\bar Z)(t, q)| \le \frac{1}{\theta} |\nabla_q g(\bar Q_0(q), \bar Q_0 \# \mu)| + \frac{1}{\theta}\, t \sup_{0 \le \tau \le t} \big[ |\nabla_q H(\bar Q_\tau(q), \theta \bar P_\tau(q))| + |\nabla_q F(\bar Q_\tau(q), \bar Q_\tau \# \mu)| \big],$$

5 We make a convention here and in the rest of the paper that in the application of the classical chain rule, and only if we are concerned solely about estimates, juxtaposition is enough, i.e., we will not pay attention to the order of the factors or whether they are properly transposed.


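$$|\partial_t (\bar m^{s,\mu})^1(\bar Z)(t, q)| \le |\nabla_p H(\bar Q_t(q), \theta \bar P_t(q))|,$$

$$|\partial_t (\bar m^{s,\mu})^2(\bar Z)(t, q)| \le \frac{1}{\theta} |\nabla_q H(\bar Q_t(q), \theta \bar P_t(q))| + \frac{1}{\theta} |\nabla_q F(\bar Q_t(q), \bar Q_t \# \mu)|;$$

$$|\nabla_q (\bar m^{s,\mu})^1(\bar Z)(t, q)| \le \sqrt{d} + \int_s^t \big| \nabla^2_{qp} H(\bar Q_\tau(q), \theta \bar P_\tau(q)) \nabla_q \bar Q_\tau(q) + \nabla^2_{pp} H(\bar Q_\tau(q), \theta \bar P_\tau(q))\, \theta \nabla_q \bar P_\tau(q) \big|\, d\tau;$$

$$\begin{aligned}
|\nabla_q (\bar m^{s,\mu})^2(\bar Z)(t, q)| \le{} & \frac{1}{\theta} |\nabla^2_{qq} g(\bar Q_0(q), \bar Q_0 \# \mu) \nabla_q \bar Q_0(q)| + \frac{1}{\theta} \int_0^t |\nabla^2_{qq} F(\bar Q_\tau(q), \bar Q_\tau \# \mu) \nabla_q \bar Q_\tau(q)|\, d\tau \\
& + \frac{1}{\theta} \int_0^t \big| \nabla^2_{qq} H(\bar Q_\tau(q), \theta \bar P_\tau(q)) \nabla_q \bar Q_\tau(q) + \nabla^2_{pq} H(\bar Q_\tau(q), \theta \bar P_\tau(q))\, \theta \nabla_q \bar P_\tau(q) \big|\, d\tau.
\end{aligned}$$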

The previous lines are inequalities for the moduli of $Q$, $P$, and their derivatives. Let us also compute second-order derivatives. In the next two displays, $H$ and its derivatives are evaluated at $(\bar Q_\tau(q), \theta \bar P_\tau(q))$, $F$ and its derivatives at $(\bar Q_\tau(q), \bar Q_\tau \# \mu)$, and $g$ and its derivatives at $(\bar Q_0(q), \bar Q_0 \# \mu)$. We find:

$$\begin{aligned}
|\nabla^2_{qq} (\bar m^{s,\mu})^1(\bar Z)(t, q)| \le \int_s^t \Big| & \big( \nabla^3_{qqp} H\, \nabla_q \bar Q_\tau(q) + \nabla^3_{pqp} H\, \theta \nabla_q \bar P_\tau(q) \big) \nabla_q \bar Q_\tau(q) + \nabla^2_{qp} H\, \nabla^2_{qq} \bar Q_\tau(q) \\
& + \big( \nabla^3_{qpp} H\, \nabla_q \bar Q_\tau(q) + \nabla^3_{ppp} H\, \theta \nabla_q \bar P_\tau(q) \big) \theta \nabla_q \bar P_\tau(q) + \nabla^2_{pp} H\, \theta \nabla^2_{qq} \bar P_\tau(q) \Big|\, d\tau
\end{aligned}$$

for the first component of $\bar m$, and, for the second component,

$$\begin{aligned}
|\nabla^2_{qq} (\bar m^{s,\mu})^2(\bar Z)(t, q)| \le{} & \frac{1}{\theta} \Big| \big( \nabla^3_{qqq} g\, \nabla_q \bar Q_0(q) \big) \nabla_q \bar Q_0(q) + \nabla^2_{qq} g\, \nabla^2_{qq} \bar Q_0(q) \Big| \\
& + \frac{1}{\theta} \int_0^t \Big| \big( \nabla^3_{qqq} H\, \nabla_q \bar Q_\tau(q) + \nabla^3_{pqq} H\, \theta \nabla_q \bar P_\tau(q) \big) \nabla_q \bar Q_\tau(q) + \nabla^2_{qq} H\, \nabla^2_{qq} \bar Q_\tau(q) \\
& \qquad\quad + \big( \nabla^3_{qpq} H\, \nabla_q \bar Q_\tau(q) + \nabla^3_{ppq} H\, \theta \nabla_q \bar P_\tau(q) \big) \theta \nabla_q \bar P_\tau(q) + \nabla^2_{pq} H\, \theta \nabla^2_{qq} \bar P_\tau(q) \Big|\, d\tau \\
& + \frac{1}{\theta} \int_0^t \Big| \big( \nabla^3_{qqq} F\, \nabla_q \bar Q_\tau(q) \big) \nabla_q \bar Q_\tau(q) + \nabla^2_{qq} F\, \nabla^2_{qq} \bar Q_\tau(q) \Big|\, d\tau.
\end{aligned}$$

We deal with $A_1, A_2, B, E, T$ first. Let $A_1, A_2, B, E, T$ be for the moment arbitrary positive numbers. Suppose that $\bar Z \in \mathcal{M}_0$, that is, $\bar Z = (\bar Q, \bar P)$ satisfies (4.4). From the latter inequalities we see that:

(a) $c/\theta + T \bar l(B)/\theta \le B$ implies $|(\bar m^{s,\mu})^2(\bar Z)| \le B$;

(b1) $\bar l(B) \le A_1$ implies $|\partial_t (\bar m^{s,\mu})^1(\bar Z)| \le A_1$;

(b2) $\bar l(B)/\theta \le A_2$ implies $|\partial_t (\bar m^{s,\mu})^2(\bar Z)| \le A_2$;

(c) $c + T \bar h(B)(A_1 + \theta A_2) \le A_1$ implies $|\nabla_q (\bar m^{s,\mu})^1(\bar Z)| \le A_1$;

(d) $cE/\theta + \frac{T}{\theta} \bar h(B)(A_1 + \theta A_2) \le A_2$ implies $|\nabla_q (\bar m^{s,\mu})^2(\bar Z)| \le A_2$;

(e) $c + T \bar h(B)(A_1 + \theta A_2) \le E$ implies $|\nabla_q (\bar m^{s,\mu})^1(\bar Z)|_{t=0} \le E$;

(f1) $T \bar h(B)(A_1 + \theta A_2)(A_1 + \theta A_2 + 1) \le A_1$ implies $|\nabla^2_{qq} (\bar m^{s,\mu})^1(\bar Z)| \le A_1$;

(f2) $T \bar h(B)(A_1 + \theta A_2)(A_1 + \theta A_2 + 1) \le E$ implies $|\nabla^2_{qq} (\bar m^{s,\mu})^1(\bar Z)|_{t=0} \le E$;

(g) $\frac{1}{\theta} cE(E + 1) + \frac{T}{\theta} \bar h(B)(A_1 + \theta A_2)(A_1 + \theta A_2 + 1) + \frac{T}{\theta} \bar h(B) A_1 (A_1 + 1) \le A_2$ implies $|\nabla^2_{qq} (\bar m^{s,\mu})^2(\bar Z)| \le A_2$.

We need to set $A_1, A_2, B, E, T$ so that the above inequalities hold simultaneously. First choose $B > c/\theta$. The number $B$ now depends only on the coefficients and $\theta$ (through $c$), and thus $\bar l(B)$, $\bar h(B)$ depend only on the coefficients and $\theta$, through $B$. Let $T$ be small enough that $c/\theta + (T/\theta)\bar l(B) \le B$ (i.e. $T < (\theta B - c)/\bar l(B)$). This gives (a). Choose $E$ to be any number such that $E > c$, and pick $A_1, A_2$ such that

$$A_1 > \max\{\bar l(B), E, c\}, \tag{4.5}$$

$$A_2 > \max\Big\{ \frac{\bar l(B)}{\theta}, \frac{1}{\theta} cE(E + 1) \Big\}. \tag{4.6}$$

This gives (b1), (b2). Making $T$ possibly smaller by letting

$$T < R := \min\bigg\{ \frac{\theta B - c}{\bar l(B)},\ \frac{E - c}{\bar h(B)(A_1 + \theta A_2)},\ \frac{A_2 - cE(E + 1)/\theta}{(1/\theta)\bar h(B)[(A_1 + \theta A_2)(A_1 + \theta A_2 + 1) + A_1(A_1 + 1)]},\ \frac{E}{\bar h(B)(A_1 + \theta A_2)(A_1 + \theta A_2 + 1)} \bigg\}, \tag{4.7}$$


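we make sure that: (e) holds, and, consequently, (c) holds because $A_1 > E$; (g) holds and therefore (d) as well; and (f2) holds, hence, (f1) is true. Finally, let us address the existence of the constants $E_1, E_2$. We compute $\partial^2_{tt} (\bar m^{s,\mu})^1(\bar Z)(t, q)$ to be

$$\partial^2_{tt} (\bar m^{s,\mu})^1(\bar Z)(t, q) = \nabla^2_{pq} H(\bar Q_t(q), \theta \bar P_t(q))\, \partial_t \bar Q_t(q) + \theta \nabla^2_{pp} H(\bar Q_t(q), \theta \bar P_t(q))\, \partial_t \bar P_t(q)$$

and, thanks to the conditions in Section 2.1, we get for $\partial^2_{tt} (\bar m^{s,\mu})^2(\bar Z)(t, q)$:

$$\begin{aligned}
\partial^2_{tt} (\bar m^{s,\mu})^2(\bar Z)(t, q) ={} & -\frac{1}{\theta} \nabla^2_{qq} H(\bar Q_t(q), \theta \bar P_t(q))\, \partial_t \bar Q_t(q) - \nabla^2_{pq} H(\bar Q_t(q), \theta \bar P_t(q))\, \partial_t \bar P_t(q) \\
& - \frac{1}{\theta} \nabla^2_{qq} F(\bar Q_t(q), \bar Q_t \# \mu)\, \partial_t \bar Q_t(q) - \frac{1}{\theta} \int_{\mathbb{T}^d} \nabla^2_{\mu q} F(\bar Q_t(q), \bar Q_t \# \mu)(\bar Q_t(x))\, \partial_t \bar Q_t(x)\, \mu(dx).
\end{aligned}$$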

Therefore, it is enough to choose $E_1, E_2$ large enough such that

$$\bar h(B) A_1 + \theta \bar h(B) A_2 \le E_1/\sqrt{2},$$

$$\frac{1}{\theta} \bar h(B) A_1 + \bar h(B) A_2 + \frac{1}{\theta} c A_1 + \frac{1}{\theta} \|\nabla^2_{\mu q} F\|_\infty A_1 \le E_2/\sqrt{2}.$$

Proposition 4.5. (Contraction property) Let $\theta > 2\kappa$. Then there exist positive numbers $A_1, A_2, B, E, E_1, E_2, T$ such that for any $s \in [0, T]$, $\mu \in \mathcal{P}(\mathbb{T}^d)$, the operator $\bar m^{s,\mu}$ maps $\mathcal{M}_0(A_1, A_2, B, E, E_1, E_2, T)$ into itself and is a contraction.

Proof. We run the previous lemma to obtain the numbers $A_1, A_2, B, E, E_1, E_2$, and $T$, and decrease $T$, if necessary, so that

$$T < \min\bigg\{ R,\ \frac{1 - \frac{2\kappa}{\theta}}{\sqrt{2}(1 + 1/\theta)\bar h(B) + 2\kappa/\theta} \bigg\}, \tag{4.8}$$

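where $R$ is the number defined in (4.7). Let $\bar Z = (\bar Q, \bar P)$, $\bar Z' = (\bar Q', \bar P') \in \mathcal{M}_0$. Let $s \in [0, T]$, $\mu \in \mathcal{P}(\mathbb{T}^d)$ be arbitrary. We have, for the first component of $\bar m^{s,\mu}$, that

$$|(\bar m^{s,\mu})^1(\bar Z)(t, q) - (\bar m^{s,\mu})^1(\bar Z')(t, q)| \le |s - t| \max_{t \le \tau \le s} |\nabla_p H(\bar Q_\tau(q), \theta \bar P_\tau(q)) - \nabla_p H(\bar Q'_\tau(q), \theta \bar P'_\tau(q))|.$$

Since $H$ is $C^2$, we can write

$$|\nabla_p H(\bar Q_\tau(q), \theta \bar P_\tau(q)) - \nabla_p H(\bar Q'_\tau(q), \theta \bar P'_\tau(q))| \le M^1_{\tau,q}\, |\bar Z(\tau, q) - \bar Z'(\tau, q)|,$$

where

$$M^1_{\tau,q} = \max_{0 \le \lambda \le 1} \big| \nabla(\nabla_p H)\big[ (1 - \lambda)\bar Q_\tau(q) + \lambda \bar Q'_\tau(q),\ (1 - \lambda)\theta \bar P_\tau(q) + \lambda \theta \bar P'_\tau(q) \big] \big|.$$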


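For the second component of $\bar m$ we apply (2.2) to get

$$\begin{aligned}
|(\bar m^{s,\mu})^2(\bar Z)(t, q) - (\bar m^{s,\mu})^2(\bar Z')(t, q)| \le{} & \frac{\kappa}{\theta} \sqrt{|\bar Q_0(q) - \bar Q'_0(q)|^2 + \|\bar Q_0 - \bar Q'_0\|^2_{L^2(\mu)}} \\
& + \frac{t}{\theta} \max_{0 \le \tau \le t} \Big[ |\nabla_q H(\bar Q_\tau(q), \theta \bar P_\tau(q)) - \nabla_q H(\bar Q'_\tau(q), \theta \bar P'_\tau(q))| \\
& \qquad\qquad + \kappa \sqrt{|\bar Q_\tau(q) - \bar Q'_\tau(q)|^2 + \|\bar Q_\tau - \bar Q'_\tau\|^2_{L^2(\mu)}} \Big] \\
\le{} & \sqrt{2}\, \frac{\kappa}{\theta} (1 + t) \|\bar Z - \bar Z'\|_\infty + \frac{t}{\theta} \max_{0 \le \tau \le t} |\nabla_q H(\bar Q_\tau(q), \theta \bar P_\tau(q)) - \nabla_q H(\bar Q'_\tau(q), \theta \bar P'_\tau(q))|,
\end{aligned}$$

with

$$|\nabla_q H(\bar Q_\tau(q), \theta \bar P_\tau(q)) - \nabla_q H(\bar Q'_\tau(q), \theta \bar P'_\tau(q))| \le M^2_{\tau,q}\, |\bar Z(\tau, q) - \bar Z'(\tau, q)|,$$

where

$$M^2_{\tau,q} = \max_{0 \le \lambda \le 1} \big| \nabla(\nabla_q H)\big[ (1 - \lambda)\bar Q_\tau(q) + \lambda \bar Q'_\tau(q),\ (1 - \lambda)\theta \bar P_\tau(q) + \lambda \theta \bar P'_\tau(q) \big] \big|.$$

But, since $\mathcal{M}_0$ is a convex subset of $\mathcal{M}$, it is true that $M^1_{\tau,q}, M^2_{\tau,q} \le \bar h(B)$, $(\tau, q) \in [0, s] \times \mathbb{T}^d$. It follows that

$$|(\bar m^{s,\mu})^1(\bar Z)(t, q) - (\bar m^{s,\mu})^1(\bar Z')(t, q)| \le |s - t|\, \bar h(B)\, \|\bar Z - \bar Z'\|_\infty, \qquad 0 \le t \le s,\ q \in \mathbb{T}^d,$$

and

$$|(\bar m^{s,\mu})^2(\bar Z)(t, q) - (\bar m^{s,\mu})^2(\bar Z')(t, q)| \le \frac{1}{\theta} \sqrt{2}\, \kappa (1 + t) \|\bar Z - \bar Z'\|_\infty + \frac{1}{\theta}\, t\, \bar h(B)\, \|\bar Z - \bar Z'\|_\infty.$$

Consequently, since $0 \le t, s \le T$, we obtain

$$\begin{aligned}
\|\bar m^{s,\mu}(\bar Z) - \bar m^{s,\mu}(\bar Z')\|_\infty & \le \Big[ \sqrt{2}\, T \bar h(B) + \sqrt{2} \Big( \frac{1}{\theta} \sqrt{2}\, \kappa (1 + T) + \frac{1}{\theta} T \bar h(B) \Big) \Big] \|\bar Z - \bar Z'\|_\infty \\
& = \Big[ \frac{2\kappa}{\theta} + T \Big( \sqrt{2} \Big( 1 + \frac{1}{\theta} \Big) \bar h(B) + \frac{2\kappa}{\theta} \Big) \Big] \|\bar Z - \bar Z'\|_\infty.
\end{aligned} \tag{4.9}$$

Due to (4.8), the expression inside the square brackets in (4.9) is less than 1. $\square$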
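The order in which the constants are fixed in Lemma 4.4 and Proposition 4.5 ($B$, then $E$, then $A_1, A_2$, then $T$) can be replayed numerically. The sketch below is illustrative only: the functions `l_bar` and `h_bar` standing in for $\bar l(B)$ and $\bar h(B)$, the safety factor `margin`, and the sample values of $d$, $\kappa$, $\theta$ are all assumptions, not data from the paper. Condition (a) is checked in the form used when $T$ is chosen, $c/\theta + (T/\theta)\bar l(B) \le B$.

```python
import math

def choose_constants(d, kappa, theta, l_bar, h_bar, margin=1.5):
    """Pick B, E, A1, A2, T following (4.5)-(4.8); requires theta > 2 * kappa."""
    c = max(d, kappa)
    B = margin * c / theta                                   # B > c / theta
    l, h = l_bar(B), h_bar(B)
    E = margin * c                                           # E > c
    A1 = margin * max(l, E, c)                               # (4.5)
    A2 = margin * max(l / theta, c * E * (E + 1) / theta)    # (4.6)
    S = A1 + theta * A2
    R = min(                                                 # (4.7)
        (theta * B - c) / l,
        (E - c) / (h * S),
        (A2 - c * E * (E + 1) / theta) / ((h / theta) * (S * (S + 1) + A1 * (A1 + 1))),
        E / (h * S * (S + 1)),
    )
    T_contraction = (1 - 2 * kappa / theta) / (
        math.sqrt(2) * (1 + 1 / theta) * h + 2 * kappa / theta
    )                                                        # second entry of (4.8)
    T = 0.5 * min(R, T_contraction)                          # strictly below both bounds
    return {"c": c, "B": B, "E": E, "A1": A1, "A2": A2, "T": T, "l": l, "h": h}


def check_conditions(k, kappa, theta):
    """Evaluate conditions (a)-(g) and the bracket in (4.9) for the chosen constants."""
    c, B, E, A1, A2, T, l, h = (k[n] for n in ("c", "B", "E", "A1", "A2", "T", "l", "h"))
    S = A1 + theta * A2
    conditions = {
        "a": c / theta + T * l / theta <= B,
        "b1": l <= A1,
        "b2": l / theta <= A2,
        "c": c + T * h * S <= A1,
        "d": c * E / theta + T * h * S / theta <= A2,
        "e": c + T * h * S <= E,
        "f1": T * h * S * (S + 1) <= A1,
        "f2": T * h * S * (S + 1) <= E,
        "g": c * E * (E + 1) / theta + T * h * S * (S + 1) / theta
             + T * h * A1 * (A1 + 1) / theta <= A2,
    }
    factor = 2 * kappa / theta + T * (math.sqrt(2) * (1 + 1 / theta) * h + 2 * kappa / theta)
    return conditions, factor


# Hypothetical coefficient bounds, chosen only to exercise the formulas.
consts = choose_constants(d=2, kappa=1.0, theta=3.0,
                          l_bar=lambda B: 1.0 + B, h_bar=lambda B: 2.0 + B)
conditions, factor = check_conditions(consts, kappa=1.0, theta=3.0)
```

With these sample inputs every condition in `conditions` holds and `factor` is below 1, which is the role played by the hypothesis $\theta > 2\kappa$: without it the term $2\kappa/\theta$ alone would already reach 1.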

It follows now that the operator (4.1) and (4.2) has a unique fixed point in $\mathcal{M}_0(A_1, A_2, B, E, E_1, E_2, T)$, where $A_1, A_2, B, E, E_1, E_2, T$ are as above.

Definition 4.6. Fix $\mu \in \mathcal{P}(\mathbb{T}^d)$, $s \in [0, T]$. Define the operator $m^{s,\mu} : \mathcal{M} \to \mathcal{M}$, $m^{s,\mu} = ((m^{s,\mu})^1, (m^{s,\mu})^2)$, as follows: if $Z = (Q, P) \in \mathcal{M}$, then $m^{s,\mu}(Z) = ((m^{s,\mu})^1(Z), (m^{s,\mu})^2(Z))$, where:

$$(m^{s,\mu})^1(Z)(t, q) = q + \int_s^t \nabla_p H(Q_\tau(q), P_\tau(q))\, d\tau, \tag{4.10}$$

$$(m^{s,\mu})^2(Z)(t, q) = \nabla_q g(Q_0(q), Q_0 \# \mu) - \int_0^t \big[ \nabla_q H(Q_\tau(q), P_\tau(q)) + \nabla_q F(Q_\tau(q), Q_\tau \# \mu) \big]\, d\tau, \tag{4.11}$$

$0 \le t \le T$, $q \in \mathbb{T}^d$.

Corollary 4.7.

(i) For any $T > 0$, $s \in [0, T]$, $\mu \in \mathcal{P}(\mathbb{T}^d)$, $\theta > 0$, the operator $\bar m$ has a unique fixed point in $\mathcal{M}_0(A_1, A_2, B, E, E_1, E_2, T)$ if, and only if, $m$ has a unique fixed point in $\mathcal{M}_0(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$.

(ii) With $\theta > 2\kappa$, fix $s \in [0, T]$, $\mu \in \mathcal{P}(\mathbb{T}^d)$. Let $A_1, A_2, B, E, E_1, E_2, T$ be as obtained in Lemma 4.4. Then $m^{s,\mu}$ maps $\mathcal{M}_0(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$ into itself and the system (4.3) has a unique solution in $\mathcal{M}_0(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$.

Proof. (i) Suppose $\bar m^{s,\mu}$ has a unique fixed point $\bar\Sigma[s, \mu] = (\bar\Sigma^1[s, \mu], \bar\Sigma^2[s, \mu])$ that satisfies the bounds of Definition 4.2 with $\bar Q = \bar\Sigma^1$ and $\bar P = \bar\Sigma^2$. Define

$$\Sigma^1[s, \mu] := \bar\Sigma^1[s, \mu], \qquad \Sigma^2[s, \mu] := \theta \bar\Sigma^2[s, \mu]. \tag{4.12}$$

Then it is straightforward to check that $\Sigma[s, \mu] = (\Sigma^1[s, \mu], \Sigma^2[s, \mu])$ is the unique fixed point of the operator $m^{s,\mu}$ such that the inequalities of Definition 4.2 are true for $\Sigma^1, \Sigma^2$ with the new constants $\theta A_2$, $\theta B$, $\theta E_2$ in place of $A_2$, $B$, $E_2$ respectively, that is, such that $Q = \Sigma^1$, $P = \Sigma^2$ satisfy

$$\begin{cases}
\|\partial_t Q\|_{\mathcal{M}_1} \le A_1, \quad \|\nabla_q Q\|_{C([0,T] \times \mathbb{T}^d; \mathbb{T}^d \times \mathbb{T}^d)} \le A_1, \quad \|\nabla^2_{qq} Q\|_{C([0,T] \times \mathbb{T}^d; \mathbb{T}^{2d} \times \mathbb{T}^d)} \le A_1; \\
\|\partial_t P\|_{\mathcal{M}_2} \le \theta A_2, \quad \|\nabla_q P\|_{C([0,T] \times \mathbb{T}^d; \mathbb{R}^d \times \mathbb{R}^d)} \le \theta A_2, \quad \|\nabla^2_{qq} P\|_{C([0,T] \times \mathbb{T}^d; \mathbb{R}^{2d} \times \mathbb{R}^d)} \le \theta A_2; \\
\|P\|_{\mathcal{M}_2} \le \theta B; \\
\|\nabla_q Q_0\|_{C(\mathbb{T}^d; \mathbb{T}^d \times \mathbb{T}^d)}, \ \|\nabla^2_{qq} Q_0\|_{C(\mathbb{T}^d; \mathbb{T}^{2d} \times \mathbb{T}^d)} \le E; \\
\|\partial^2_{tt} Q\|_{\mathcal{M}_1} \le E_1, \quad \|\partial^2_{tt} P\|_{\mathcal{M}_2} \le \theta E_2.
\end{cases} \tag{4.13}$$

The sufficiency part of the statement is equally easily verified.

(ii) We take $Z = (Q, P) \in \mathcal{M}_0(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$, and evaluate $\bar m^{s,\mu}$ at $\bar Z = (\bar Q, \bar P)$ where $\bar Q = Q$ and $\bar P = P/\theta$, so that $\bar Z \in \mathcal{M}_0(A_1, A_2, B, E, E_1, E_2, T)$. Then, by Lemma 4.4, $\bar m^{s,\mu}(\bar Z) \in \mathcal{M}_0(A_1, A_2, B, E, E_1, E_2, T)$. But $(\bar m^{s,\mu})^1(\bar Z) = (m^{s,\mu})^1(Z)$ and $(\bar m^{s,\mu})^2(\bar Z) = \frac{1}{\theta}(m^{s,\mu})^2(Z)$, thus, $m^{s,\mu}(Z) \in \mathcal{M}_0(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$. Furthermore, Proposition 4.5 provides a unique fixed point $\bar\Sigma[s, \mu]$ of $\bar m$ in $\mathcal{M}_0(A_1, A_2, B, E, E_1, E_2, T)$. Defining $\Sigma[s, \mu]$ as in (4.12), then, by (i), $\Sigma[s, \mu]$ is the unique fixed point of the operator $m^{s,\mu}$ on $\mathcal{M}_0(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$, so it is the unique solution to (4.3) in $\mathcal{M}_0(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$. $\square$

We stress that, as long as $T$ is as in (4.8), we have a solution $\Sigma[s, \mu]$ to (4.3) for any $\mu \in \mathcal{P}(\mathbb{T}^d)$ and $0 \le s \le T$, and moreover, its $Q$ and $P$ components $\Sigma^1[s, \mu]$ and $\Sigma^2[s, \mu]$ satisfy the bounds of Definition 4.2 with $\theta A_2$, $\theta B$, $\theta E_2$ in place of $A_2$, $B$, $E_2$ respectively, independently of $\mu$, with solutions being continuous and differentiable in $t$ and $q$. $\Sigma[s, \mu]$ will always denote such solutions to (4.3) as found in Corollary 4.7.

Remark 4.8. The preceding proofs make it clear that $T$ can be assumed to be smaller if necessary at each following step, without affecting the validity of the previous statements. We choose to refer back to this remark in later stages instead of imposing tighter bounds on $T$ than (4.8) above that would make their purpose unclear at first reading. We may sometimes just say “$T$ is small”, having this remark in mind.

4.2. First regularity properties

Lemma 4.9. For any fixed $Z = (Q, P) \in \mathcal{M}$, $t \in [0, T]$, $q \in \mathbb{R}^d$, and $\mu \in \mathcal{P}(\mathbb{T}^d)$, the function $s \mapsto m^{s,\mu}(Z)(t, q)$ is continuous. Likewise, for any fixed $Z \in \mathcal{M}$, $t \in [0, T]$, $q \in \mathbb{R}^d$, and $s \in [0, T]$, the function $\mu \mapsto m^{s,\mu}(Z)(t, q)$ is continuous.

Proof. Continuity of $s \mapsto m^{s,\mu}(Z)(t, q)$ for fixed $t, q, \mu$ is immediate from Definition 4.6 (formulas (4.10) and (4.11)). Looking at the same definition for the continuity with respect to $\mu$, note that for each $\tau$, $0 \le \tau \le T$, the function $q \mapsto Q_\tau(q)$ is Lipschitz. This implies (see, e.g., [24, Remark 3.3]) that the function $\mu \mapsto Q_\tau \# \mu$ is Lipschitz from $\mathcal{P}(\mathbb{T}^d)$ into itself, with the same constant. Furthermore, since the mapping $(\tau, q) \mapsto Q(\tau, q)$ is Lipschitz, the Lipschitz constants of the functions $q \mapsto Q_\tau(q)$ are bounded with respect to $\tau$, $0 \le \tau \le T$. These facts, combined with the Lipschitz continuity of $\nabla_q F$ and $\nabla_q g$, show that

$$\mu_n \to \mu \ \text{in } \mathcal{P}(\mathbb{T}^d) \quad \Longrightarrow \quad m^{s,\mu_n}(Z)(t, q) \to m^{s,\mu}(Z)(t, q)$$

for all $Z \in \mathcal{M}_0$, $t, s \in [0, T]$, $q \in \mathbb{R}^d$. $\square$
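The Lipschitz property of $\mu \mapsto Q_\tau \# \mu$ used in the proof of Lemma 4.9 can be checked on a small example. The sketch below makes several simplifying assumptions that are not in the paper: it works on $\mathbb{R}$ rather than on the torus, uses the 1-Wasserstein distance between empirical measures with equally many atoms (computed by sorting), and takes the hypothetical map $Q(x) = x + 0.3 \sin x$, whose Lipschitz constant is at most 1.3.

```python
import math

def w1_empirical(xs, ys):
    """1-Wasserstein distance between two empirical measures on R with equally many atoms."""
    if len(xs) != len(ys):
        raise ValueError("both measures need the same number of atoms")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def push_forward(Q, atoms):
    """Push an empirical measure forward by Q: the atoms are simply transported."""
    return [Q(x) for x in atoms]

def Q(x):
    # Hypothetical Lipschitz map; |Q'(x)| = |1 + 0.3 cos x| <= 1.3.
    return x + 0.3 * math.sin(x)

mu = [0.1, 0.7, 1.9, 3.2]
nu = [0.3, 0.5, 2.4, 3.0]
lhs = w1_empirical(push_forward(Q, mu), push_forward(Q, nu))
rhs = 1.3 * w1_empirical(mu, nu)
```

Here `lhs` does not exceed `rhs`, which is the discrete counterpart of the statement that the push-forward by a Lipschitz map is Lipschitz with the same constant.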

We will need the continuity and differentiability of the fixed point $\Sigma[s, \mu]$ with respect to $s$, and its continuity with respect to $\mu$. This is addressed in Lemmas 4.12 and 4.14 below. Before that, let us name the coefficient bounds that will appear in the calculations.


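Definition 4.10. (Coefficient bounds II) For $B > 0$, let

$$l(B) := \max_{q \in \mathbb{R}^d,\ |p| \le B,\ \mu \in \mathcal{P}(\mathbb{T}^d)} \Big[ \sqrt{2}\, |\nabla H(q, p)| + |\nabla_q F(q, \mu)| \Big],$$

$$h(B) := \max_{q \in \mathbb{R}^d,\ |p| \le B,\ \mu \in \mathcal{P}(\mathbb{T}^d)} \Big[ \sqrt{2}\, |\nabla^2 H(q, p)| + \sqrt{2}\, |\nabla^3 H(q, p)| + |\nabla^2_{qq} F(q, \mu)| + |\nabla^3_{qqq} F(q, \mu)| \Big].$$

Unlike the coefficient bounds $\bar l(B)$ and $\bar h(B)$, here $l(B)$ and $h(B)$ are independent of the number $\theta$. However, if, say, $(Q, P) \in \mathcal{M}_0(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$, then $|\nabla_p H(Q(t, q), P(t, q))| \le l(\theta B)$.

Definition 4.11. For any $D = (D_1, D_2)$, $D_1, D_2 > 0$, define:

(i) $\mathcal{M}^*_{0,D}(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T) \subset W^{1,2,2;\infty}([0, T] \times [0, T] \times \mathbb{T}^d; \mathbb{T}^d \times \mathbb{R}^d)$ as the subset of those $Z(\cdot; \cdot, \cdot)$ such that, for each $s \in [0, T]$, $Z(s; \cdot, \cdot) \in \mathcal{M}_0(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$, and

$$\|\partial_s Q(s; \cdot, \cdot)\|_\infty \le D_1, \qquad \|\partial_s P(s; \cdot, \cdot)\|_\infty \le D_2 \tag{4.14}$$

wherever $\partial_s Q$, $\partial_s P$ are defined;

(ii) $\mathcal{Q}^*_{0,D_1}(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T) \subset W^{1,2,2;\infty}([0, T] \times [0, T] \times \mathbb{T}^d; \mathbb{T}^d)$ as the subset of those $Q(\cdot; \cdot, \cdot)$ such that, for each $s \in [0, T]$, $(Q(s; \cdot, \cdot), 0) \in \mathcal{M}^*_0(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$, and $\|\partial_s Q(s; \cdot, \cdot)\|_\infty \le D_1$ wherever $\partial_s Q$ is defined.

By an argument similar to that of Proposition 4.3, the sets $\mathcal{M}^*_0$ and $\mathcal{Q}^*_0$ just defined are closed subsets for the uniform convergence of $C([0, T] \times [0, T] \times \mathbb{T}^d; \mathbb{T}^d \times \mathbb{R}^d)$ and $C([0, T] \times [0, T] \times \mathbb{T}^d; \mathbb{T}^d)$ respectively.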


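Lemma 4.12. For fixed $\mu \in \mathcal{P}(\mathbb{T}^d)$, using the same notation of Definition 4.11, and $\theta > 2\kappa$:

(i) There exists a pair of positive constants $D = (D_1, D_2)$ such that, if $Z \in \mathcal{M}^*_{0,D}(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$, then the function $(s, t, q) \mapsto m^{s,\mu}(Z(s; \cdot, \cdot))(t, q)$ belongs to $\mathcal{M}^*_{0,D}(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$ for any $\mu \in \mathcal{P}(\mathbb{T}^d)$.

(ii) The mapping $s \mapsto \Sigma[s, \mu](t, q)$ is differentiable in $s$ a.e. on the interval $0 < s < T$, for every $\mu \in \mathcal{P}(\mathbb{T}^d)$, $t \in [0, T]$, and $q \in \mathbb{T}^d$, and it satisfies

$$\|\partial_s \Sigma^1[s, \mu](\cdot, \cdot)\|_{C([0,T] \times \mathbb{T}^d; \mathbb{R}^d)} \le D_1, \qquad \|\partial_s \Sigma^2[s, \mu](\cdot, \cdot)\|_{C([0,T] \times \mathbb{T}^d; \mathbb{R}^d)} \le D_2$$

for a.e. $s \in [0, T]$, $\mu \in \mathcal{P}(\mathbb{T}^d)$.

Proof. Given a function $Z \in \mathcal{M}^*_{0,D}(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$, define $m^\mu(Z) \in C([0, T] \times [0, T] \times \mathbb{T}^d; \mathbb{T}^d \times \mathbb{R}^d)$ to be the first function displayed in the statement, and write $(Q', P')$ for its two components.

(i) Let us show that $m^\mu(Z) \in \mathcal{M}^*_{0,D}(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$ for an appropriate $D$. Indeed, that $m^\mu(Z)$ is continuous is evident. For a.e. $s \in (0, T)$,

$$\partial_s Q'(s; t, q) = -\nabla_p H(Q(s; s, q), P(s; s, q)) + \int_s^t \big[ \nabla^2_{qp} H(Q(s; \tau, q), P(s; \tau, q))\, \partial_s Q(s; \tau, q) + \nabla^2_{pp} H(Q(s; \tau, q), P(s; \tau, q))\, \partial_s P(s; \tau, q) \big]\, d\tau,$$

so

$$\|\partial_s Q'(s; \cdot, \cdot)\|_\infty \le l(\theta B) + T h(\theta B)(D_1 + D_2).$$

Put $Q(s; \tau, \cdot) \# \mu =: \sigma^s_\tau$, $0 \le \tau \le T$. Then, for a.e. $s \in (0, T)$,

$$\begin{aligned}
\partial_s P'(s; t, q) ={} & \partial_s \big( \nabla_q g(Q(s; 0, q), \sigma^s_0) \big) - \int_0^t \big[ \nabla^2_{qq} H(Q(s; \tau, q), P(s; \tau, q))\, \partial_s Q(s; \tau, q) \\
& + \nabla^2_{pq} H(Q(s; \tau, q), P(s; \tau, q))\, \partial_s P(s; \tau, q) + \partial_s \big( \nabla_q F(Q(s; \tau, q), \sigma^s_\tau) \big) \big]\, d\tau.
\end{aligned}$$

By the joint Lipschitz constants of $\nabla_q g$ and $\nabla_q F$ being bounded by $\kappa$, we obtain

$$\|\partial_s P'(s; \cdot, \cdot)\|_\infty \le 2\kappa D_1 + T(2\kappa D_1 + h(\theta B)(D_1 + D_2)).$$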


Thus, if $D_1 > l(\theta B)$, and $D_2 > 2\kappa D_1$, we refer to Remark 4.8 and assume that

$$T < \min\bigg\{ \frac{D_1 - l(\theta B)}{h(\theta B)(D_1 + D_2)},\ \frac{D_2 - 2\kappa D_1}{2\kappa D_1 + h(\theta B)(D_1 + D_2)} \bigg\}$$

to obtain $\|\partial_s Q'(s; \cdot, \cdot)\|_\infty \le D_1$ and $\|\partial_s P'(s; \cdot, \cdot)\|_\infty \le D_2$. The fact that $m^\mu(Z)(s; \cdot, \cdot) \in \mathcal{M}_0(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$ follows from Corollary 4.7.

(ii) Let $Z^0 \in \mathcal{M}^*_{0,D}(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$ be arbitrary, with $D$ from part (i). Define inductively $Z^k = m^\mu(Z^{k-1})$, $k = 1, \ldots$ Then $\{Z^k\}_{k=0}^\infty$ is a sequence in $\mathcal{M}^*_{0,D}(\cdots)$, and for each $s \in [0, T]$, $Z^k(s; \cdot, \cdot) = m^{s,\mu}(Z^{k-1}(s; \cdot, \cdot))$, so for each fixed $s \in [0, T]$,

$$Z^k(s; \cdot, \cdot) \longrightarrow \Sigma[s, \mu](\cdot, \cdot) \quad \text{uniformly in } \mathcal{M}_0(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T),$$

by the fixed point theorem. We thus have pointwise convergence of $Z^k(\cdot; \cdot, \cdot)$ to $\Sigma[\cdot, \mu](\cdot, \cdot)$. We call on the equicontinuity and uniform boundedness of the sequence, and on the periodicity of the functions (i.e., they are defined on $[0, T] \times [0, T] \times \mathbb{T}^d$), to conclude that this convergence is actually uniform. The closedness of the subspace $\mathcal{M}^*_{0,D}(\cdots)$ with respect to uniform convergence now ensures that $\Sigma[\cdot, \mu](\cdot, \cdot)$ belongs to this subspace, so it is differentiable with respect to $s$ for a.e. $s \in [0, T]$ and satisfies (4.14). $\square$

In the following, the constants $D_1, D_2$ will always be as in Lemma 4.12. The next remark will not be used before Section 5.

Remark 4.13. If $Z = (Q, P)$ and $\bar Z = (\bar Q, \bar P)$ are related as in the proof of Corollary 4.7(ii), that is, $Q = \bar Q$, $P = \theta \bar P$, then $Z \in \mathcal{M}^*_{0,D}(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$ if, and only if, $\bar Z \in \mathcal{M}^*_{0,\bar D}(A_1, A_2, B, E, E_1, E_2, T)$, where $\bar D = (\bar D_1, \bar D_2)$,

$$\bar D_1 := D_1, \qquad \bar D_2 := D_2/\theta.$$



Lemma 4.14. Let $\theta > 2\kappa$. The mapping

$$[0, T] \times [0, T] \times \mathbb{T}^d \times \mathcal{P}(\mathbb{T}^d) \longrightarrow [0, T] \times [0, T] \times \mathbb{T}^d \times \mathcal{P}(\mathbb{T}^d), \qquad (t, s, q, \mu) \longmapsto (t, s, \Sigma^1[s, \mu](t, q), \mu)$$

is continuous, and for any fixed $\mu \in \mathcal{P}(\mathbb{T}^d)$, $\Sigma^1[\cdot, \mu](\cdot, \cdot)$ is a $C^1$ diffeomorphism.

Proof. Let $\{\mu_k\}_{k=1}^\infty$ be a sequence in $\mathcal{P}(\mathbb{T}^d)$ converging to $\mu \in \mathcal{P}(\mathbb{T}^d)$. Consider the sequence $\{\Sigma^1[\cdot, \mu_k](\cdot, \cdot)\}_{k=1}^\infty$ and an arbitrary subsequence $\{\Sigma^1[\cdot, \mu_{k_j}](\cdot, \cdot)\}_{j=1}^\infty$. Being in $\mathcal{Q}^*_{0,D}(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$, the latter is equicontinuous and uniformly bounded, so there is a subsubsequence, which we still index with $j$, converging to $S$ for some $S \in \mathcal{Q}^*_{0,D}(\cdots)$ as $j \to \infty$. For each $s \in [0, T]$, on one hand, $\Sigma^1[s, \mu_{k_j}](\cdot, \cdot) \to S(s; \cdot, \cdot)$ as $j \to \infty$. On the other, since, by Corollary 4.7 and Lemma 4.9, the mapping $(s, Z, \mu) \mapsto m^{s,\mu}(Z)$ is a continuous mapping of $[0, T] \times \mathcal{M}_0(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T) \times \mathcal{P}(\mathbb{T}^d)$ into $\mathcal{M}_0(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$ (because the Lipschitz constant of $m^{s,\mu}$ is independent of $s$ and $\mu$), we have

$$\Sigma^1[s, \mu_{k_j}](\cdot, \cdot) = (m^1)^{s,\mu_{k_j}}(\Sigma^1[s, \mu_{k_j}](\cdot, \cdot)) \underset{j \to \infty}{\longrightarrow} (m^1)^{s,\mu}(S(s; \cdot, \cdot)).$$

Therefore, for each $s \in [0, T]$, $S(s; \cdot, \cdot) = (m^1)^{s,\mu}(S(s; \cdot, \cdot))$, that is, $S(s; \cdot, \cdot)$ is a fixed point of $(m^1)^{s,\mu}$, so, by uniqueness, $S(s; \cdot, \cdot) = \Sigma^1[s, \mu](\cdot, \cdot)$. Thus, every subsequence of $\{\Sigma^1[\cdot, \mu_k](\cdot, \cdot)\}_{k=1}^\infty$ has a subsequence that converges to $\Sigma^1[\cdot, \mu](\cdot, \cdot) \in \mathcal{Q}^*_{0,D}(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$. Hence,

$$\Sigma^1[\cdot, \mu_k](\cdot, \cdot) \longrightarrow \Sigma^1[\cdot, \mu](\cdot, \cdot) \quad \text{uniformly},$$

which implies the claimed continuity. The second assertion of the lemma will be an immediate consequence of Lemma 4.15. $\square$
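The role of a Lipschitz constant that does not depend on the parameter can be seen in a scalar toy problem. The sketch below is not the compactness argument of the proof; it illustrates the elementary estimate $|x_\mu - x_\nu| \le |m(x_\nu, \mu) - m(x_\nu, \nu)|/(1 - L)$ for fixed points of a family of contractions with common constant $L < 1$. The map $m(x, \mu) = \frac12 \cos x + \mu$ and the function name are assumptions made for illustration.

```python
import math

def fixed_point(mu, tol=1e-14, max_iter=200):
    """Fixed point of the assumed toy contraction x -> 0.5 * cos(x) + mu (constant L = 0.5)."""
    x = 0.0
    for _ in range(max_iter):
        x_new = 0.5 * math.cos(x) + mu
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

limit = fixed_point(0.0)
# Parameters mu_k = 1/k converge to 0; the fixed points follow at a rate <= |mu_k| / (1 - L).
errors = [(1.0 / k, abs(fixed_point(1.0 / k) - limit)) for k in (1, 10, 100, 1000)]
```

Each entry of `errors` pairs the parameter gap $|\mu_k - 0|$ with the gap between fixed points, and the latter stays below twice the former, as the estimate with $L = 1/2$ predicts.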

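Lemma 4.15. Let $\theta > 2\kappa$, $0 \le s \le T$, $\mu \in \mathcal{P}(\mathbb{T}^d)$. The mapping $q \mapsto \Sigma^1[s, \mu](t, q)$ is a $C^1$ diffeomorphism, for $0 \le t \le T$, with

$$\frac{1}{2} < \det \nabla_q \Sigma^1[s, \mu](t, q), \qquad |(\nabla_q \Sigma^1[s, \mu](t, q))^{-1}| < 4(1 + \sqrt{d})^{d-1}, \tag{4.15}$$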

provided T is sufficiently small. Proof. We know the mapping is already C 1 because W 2;∞ mappings are continuously differentiable. To prove invertibility, put (t, q) := 1 (t, q) − q. Computing ∇q (t, q), we have |∇q (t, q)| ≤ |s − t|(A1 + θ A2 )h(θ B)

(4.16)

because  ∈ M0 (A1 , θ A2 , θ B, E, E1 , θ E2 , T ). By Remark 4.8, this means the function q → (t, q) has Lipschitz constant strictly less than 1. Therefore, the function q → q + (t, q) = 1 (t, q) is injective, for 0 ≤ t ≤ T . To prove that q → 1 (t, q) is onto, note that supq∈Rd |1 (t, q) − q| ≤ T l(θB) < 2T l(θ B), for 0 ≤ t ≤ T . Let y be a point in the ball of radius R − 2T l(θ B) in Rd centered at the origin, where R > 1 > 2T l(θ B). Then for all q on the boundary of BR (0) — the ball of radius R in Rd centered at the origin — we have: 1 (t, q) = y, for 0 ≤ t ≤ T . Therefore f (t) := deg(1t , BR (0), y),

0 ≤ t ≤ T,

the topological degree of 1t is well defined at y ∈ BR−2T l(θB) (0). This counts the number of “signed” solutions (see, e.g., [21]) x in BR (0) of the equation 1t (x) = y. Since f is a continuous function taking on integer values only, we conclude that f (t) = f (s) = 1. This means that the range of 1t includes BR−2T l(θB) (0). Since R > 1 is arbitrary, we conclude that the range of 1t is Rd . We will denote the inverse of 1 by X, so X[s, μ](t, q) = [1 [s, μ](t)]−1 (q), for 0 < s ≤ T , 0 ≤ t ≤ T , q ∈ T d , μ ∈ P(T d ). Next, note that for 0 ≤ t ≤ T , q ∈ T d , |∇q 1t (q) − Id | ≤ T (A1 + θ A2 )h(θ B) < 1, since T is small, where Id is the d × d matrix with


1’s in the diagonal and 0’s everywhere else. This implies that $\nabla_q \Sigma^1_t(q)$ is an invertible matrix, for $0 \le t \le T$, $q \in \mathbb{T}^d$. By the inverse function theorem, $X_t$ is differentiable. Moreover, since

$$\nabla_q X_t(q) = [\nabla_q \Sigma^1_t(X_t(q))]^{-1}, \tag{4.17}$$

and $q \mapsto \nabla_q \Sigma^1_t(q)$ is continuous, continuity of matrix inversion gives that the mapping $q \mapsto \nabla_q X_t(q)$ is continuous; this means that $X$ is $C^1$ in $q$. To show (4.15), we may use the fact that$^6$ the determinant function $\det : \mathbb{R}^{d^2} \to \mathbb{R}$ has derivative $\nabla \det$ satisfying

$$|\nabla \det(\xi)| \le 2|\xi|^{d-1}, \qquad \xi \in \mathbb{R}^{d^2},$$

and the inverse matrix formula

$$\xi^{-1} = \frac{1}{\det \xi} (\nabla \det \xi)^t,$$

where the superscript t denotes transposition. By the mean-value theorem, there is τ ∈ [0, 1] such that det(Id + Ts ∇q ) − det Id = ∇ det(Id + τ Ts ∇q ) · Ts ∇q , where ∇ abbreviates ∇(t, q) at an arbitrary (t, q) ∈ [0, T ] × T d . Hence, by the aforementioned fact, | det(Id +

s s s s s ∇q ) − det Id | ≤ |∇ det(Id + τ ∇q )||∇q | ≤ 2|Id + τ ∇q |d−1 ||∇q | T T T T T √ √ d−1 d−1 ≤ 2( d + |∇q |) |∇q | ≤ 2(1 + d) T (A1 + A2 )h(θ B) 1 < , 2

by (4.16) and because T is small enough that T<

4(1 +



1 d)d−1 (A1 + θ A2 )h(θ B)

.

Since Id + ∇q (t, q) = ∇q 1 (t, q) and det Id = 1, we obtain the first inequality in (4.15). Using the inverse matrix formula and the inequality |∇ det(ξ )| ≤ 2|ξ |d−1 once more, we have |∇q X(t, q)| = |(Id + ∇q )−1 | = (det(Id + ∇q ))−1 [∇ det(Id + ∇q )]t √ 2 2 |Id + ∇q |d−1 ≤ (1 + d)d−1 ≤ det(Id + ∇q ) det(Id + ∇q ) √ d−1 < 4(1 + d) , and since this holds for any t ∈ [0, T ] and q ∈ T d , we have obtained the second inequality in (4.15). 6 See, for instance, [24, Remark 3.11].
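The inverse matrix formula $\xi^{-1} = (\det \xi)^{-1} (\nabla \det \xi)^t$ used above can be verified numerically for a matrix close to the identity. The following is a minimal sketch for $d = 3$; the sample matrix and the helper names are assumptions. Since $\det$ is affine in each single entry, a central difference in one entry reproduces the corresponding partial derivative (the cofactor) up to rounding.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def grad_det(m, h=1e-3):
    """Gradient of det with respect to each entry, by central differences."""
    grad = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            plus = [row[:] for row in m]
            minus = [row[:] for row in m]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (det3(plus) - det3(minus)) / (2 * h)
    return grad

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

# Hypothetical matrix of the form identity + small perturbation.
xi = [[1.10, 0.20, -0.10],
      [0.00, 0.90, 0.30],
      [0.20, -0.10, 1.05]]
g = grad_det(xi)
d = det3(xi)
inverse = [[g[j][i] / d for j in range(3)] for i in range(3)]  # transpose of gradient over det
product = matmul(xi, inverse)
```

The matrix `product` agrees with the identity up to rounding, and `d` stays well above $1/2$ for this small perturbation, in line with the first inequality in (4.15).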


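Bound on $\nabla_q X$. Due to the formula $\nabla_q X_t(q) = (\nabla_q \Sigma^1_t)^{-1} \circ X_t(q)$, i.e., formula (4.17), the second inequality in (4.15) implies

$$\|\nabla_q X[s, \mu](t, \cdot)\|_{C(\mathbb{T}^d; \mathbb{T}^d \times \mathbb{T}^d)} < 4(1 + \sqrt{d})^{d-1}. \tag{4.18}$$

$\square$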

Definition 4.16. Let $\theta > 2\kappa$. Given $\mu \in \mathcal{P}(\mathbb{T}^d)$, $s \in [0, T]$, and $\Sigma$ the unique fixed point of $m^{s,\mu}$ in $\mathcal{M}_0(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$, set

$$\sigma_t = \sigma(t) := \Sigma^1[s, \mu](t, \cdot) \# \mu, \qquad v[s, \mu](t, q) := \partial_t \Sigma^1_t[s, \mu] \circ X_t[s, \mu](q), \tag{4.19}$$

for $0 \le t \le T$. It should be kept in mind that the path $\sigma_t$ depends on $s$ and $\mu$. Also, the arguments $s, \mu$ may often be omitted in the notation for $v$, as has been done for $\Sigma$.

Proposition 4.17. The path $\sigma$ belongs to $AC^2(0, T; \mathcal{P}(\mathbb{T}^d))$ and $v$ is a velocity associated to $\sigma$, that is, $\partial_t \sigma + \operatorname{div}(\sigma v) = 0$ in the distribution sense, with $v_t = v(t, \cdot) \in L^2(\mathbb{T}^d, \sigma_t)$, $0 < t < T$.

Proof. The proof is simple and follows by direct estimation and calculation, so we omit it. $\square$

Note that the definition of the field $v_t$ means that the mappings $t \mapsto \Sigma^1(t, q)$ are the flow lines of $v_t$.

4.3. Value function U and characteristics

In what follows, the hypotheses of Corollary 4.7 will be in force, with $\Sigma$ denoting the solution to (4.3). The following statement stems from the fact that $\Sigma^1[s, \mu](\cdot, \cdot) \in W^{2,2;\infty}((0, T) \times \mathbb{T}^d; \mathbb{T}^d)$; we omit its proof.

Proposition 4.18. For every $s \in [0, T]$, $\mu \in \mathcal{P}(\mathbb{T}^d)$, the function $X[s, \mu](\cdot, \cdot)$ is in $W^{2,2;\infty}((0, T) \times \mathbb{T}^d; \mathbb{T}^d)$.

Define now

$$\mathcal{V}[s, \mu](t, q) := \Sigma^2_t[s, \mu] \circ X_t[s, \mu](q) = \Sigma^2_t[s, \mu] \circ (\Sigma^1_t[s, \mu])^{-1}(q), \tag{4.20}$$

for $s, t \in [0, T]$, $\mu \in \mathcal{P}(\mathbb{T}^d)$, $q \in \mathbb{T}^d$. Alternatively, we may write $\mathcal{V}_t[s, \mu](q)$. We can now proceed to solve the MFG system.

Lemma 4.19. Let $T$ be small according to Remark 4.8, $s \in [0, T]$, $\mu \in \mathcal{P}(\mathbb{T}^d)$, and, as in Definition 4.16:

$$\sigma_t = \Sigma^1[s, \mu](t, \cdot) \# \mu, \qquad t \in [0, T].$$

For each $q \in \mathbb{T}^d$, let $U(t, q) = z(t, X[s, \mu](t, q))$, where $z(\cdot, q)$ satisfies

$$\partial_t z(t, q) = \Sigma^2[s, \mu](t, q) \cdot \nabla_p H(\Sigma^1[s, \mu](t, q), \Sigma^2[s, \mu](t, q)) - H(\Sigma^1[s, \mu](t, q), \Sigma^2[s, \mu](t, q)) - F(\Sigma^1[s, \mu](t, q), \sigma_t) \quad \text{in } (0, T), \tag{4.21}$$

$$z(0, q) = g(\Sigma^1[s, \mu](0, q), \sigma_0). \tag{4.22}$$

Then $U \in C^1((0, T) \times \mathbb{T}^d)$ and solves the Hamilton-Jacobi equation of the mean-field game system:

$$\partial_t U(t, x) + H(x, \nabla_q U(t, x)) + F(x, \sigma_t) = 0 \quad \text{in } (0, T) \times \mathbb{T}^d, \tag{1.2a}$$

$$U(0, \cdot) = g(\cdot, \sigma_0). \tag{1.2c}$$

Proof. Since $s$ and $\mu$ are fixed, we abbreviate $\Sigma[s, \mu](t, q) = \Sigma_t(q)$. Observe that the right-hand side of (4.21) is $C^1$ in $q$, $C^0$ in $t$, so $z$ is $C^1$ in $q$, $C^1$ in $t$. Therefore, $U$ is $C^1$ in both variables $t$ and $q$, because of Proposition 4.18. Moreover, since $\partial_t z(t, q)$ is $C^1$ in $q$, then (e.g. see [38, Thm. 9.41]) $\nabla^2_{tq} z$ exists and is equal to $\nabla^2_{qt} z$. Thus, the calculations below are legitimate. We have $U(t, \Sigma^1_t(q)) = z(t, q)$, so

$$\begin{aligned}
\partial_t z(t, q) = \partial_t \big( U(t, \Sigma^1_t(q)) \big) & = \partial_t U(t, \Sigma^1_t(q)) + \nabla_q U(t, \Sigma^1_t(q)) \cdot \partial_t \Sigma^1_t(q) \\
& = \partial_t U(t, \Sigma^1_t(q)) + \nabla_q U(t, \Sigma^1_t(q)) \cdot \nabla_p H(\Sigma^1_t(q), \Sigma^2_t(q)).
\end{aligned} \tag{4.23}$$

Now, if

$$\nabla_q U(t, \Sigma^1_t(q)) = \Sigma^2_t(q), \qquad t \in (0, T),\ q \in \mathbb{T}^d, \tag{4.24}$$

then, comparing (4.23) and (4.21), we get $\partial_t U(t, \Sigma^1_t(q)) = -H(\Sigma^1_t(q), \Sigma^2_t(q)) - F(\Sigma^1_t(q), \sigma_t)$, and the change of variable $x = \Sigma^1_t(q)$ then yields (1.2a), (1.2c). We set out to prove (4.24) now. Let

$$r_i(t) := \frac{\partial z(t, q)}{\partial q^{(i)}} - \sum_{j=1}^d (\Sigma^2_t)^{(j)}(q) \frac{\partial x^{(j)}}{\partial q^{(i)}},$$

$i = 1, \ldots, d$, where $x = \Sigma^1_t(q)$. We know that $r_i(0) = 0$, from the initial conditions, and

$$\dot r_i(t) = \frac{\partial^2 z(t, q)}{\partial t\, \partial q^{(i)}} - \frac{\partial}{\partial t} \sum_{j=1}^d (\Sigma^2_t)^{(j)}(q) \frac{\partial x^{(j)}}{\partial q^{(i)}} =: a - b.$$

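Using the first line, $\partial_t \Sigma^1_t(q) = \nabla_p H(\Sigma^1_t(q), \Sigma^2_t(q))$, in (4.3), we have

$$\begin{aligned}
a = \partial_{q^{(i)}} \partial_t z(t, q) ={} & \sum_{k=1}^d \Big[ \partial_{q^{(i)}} (\Sigma^2_t)^{(k)}(q)\, \partial_t (\Sigma^1_t)^{(k)}(q) + (\Sigma^2_t)^{(k)}(q)\, \partial^2_{t q^{(i)}} (\Sigma^1_t)^{(k)}(q) \Big] \\
& - \sum_{l=1}^d \Big[ \partial_{q^{(l)}} H(\Sigma^1_t(q), \Sigma^2_t(q))\, \partial_{q^{(i)}} (\Sigma^1_t)^{(l)}(q) + \partial_{p^{(l)}} H(\Sigma^1_t(q), \Sigma^2_t(q))\, \partial_{q^{(i)}} (\Sigma^2_t)^{(l)}(q) \Big] \\
& - \sum_{l=1}^d \partial_{q^{(l)}} F(\Sigma^1_t(q), \sigma_t)\, \partial_{q^{(i)}} (\Sigma^1_t)^{(l)}(q),
\end{aligned}$$

and, by the second line in (4.3), namely, $\partial_t \Sigma^2_t(q) = -\nabla_q H(\Sigma^1_t(q), \Sigma^2_t(q)) - \nabla_q F(\Sigma^1_t(q), \sigma_t)$, this simplifies to

$$a = \sum_{k=1}^d \Big[ (\Sigma^2_t)^{(k)}(q)\, \partial^2_{t q^{(i)}} (\Sigma^1_t)^{(k)}(q) + \partial_t (\Sigma^2_t)^{(k)}(q)\, \partial_{q^{(i)}} (\Sigma^1_t)^{(k)}(q) \Big].$$

As for $b$,

$$b = \partial_t \sum_{j=1}^d (\Sigma^2_t)^{(j)}(q)\, \partial_{q^{(i)}} (\Sigma^1_t)^{(j)}(q) = \sum_{j=1}^d \Big[ \partial_t (\Sigma^2_t)^{(j)}(q)\, \partial_{q^{(i)}} (\Sigma^1_t)^{(j)}(q) + (\Sigma^2_t)^{(j)}(q)\, \partial^2_{t q^{(i)}} (\Sigma^1_t)^{(j)}(q) \Big],$$

so $a = b$. Therefore $\dot r_i(t) \equiv 0$, and $r_i(t) \equiv 0$ on $(0, T]$, by the uniqueness of (4.21) on $[0, T]$. Now we differentiate $U$, keeping in mind that $U(t, x) = z(t, q)$; using the fact that $r_i(t) = 0$, $0 \le t \le T$, we have

$$\partial_{x^{(i)}} U(t, x) = \sum_{j=1}^d \frac{\partial z(t, q)}{\partial q^{(j)}} \frac{\partial q^{(j)}}{\partial x^{(i)}} = \sum_{j=1}^d \sum_{k=1}^d (\Sigma^2_t)^{(k)}(q) \frac{\partial x^{(k)}}{\partial q^{(j)}} \frac{\partial q^{(j)}}{\partial x^{(i)}} = \sum_{k=1}^d (\Sigma^2_t)^{(k)}(q) \frac{\partial x^{(k)}}{\partial x^{(i)}} = (\Sigma^2_t)^{(i)}(q),$$

for $i = 1, \ldots, d$ and $0 \le t \le T$. This proves (4.24), completing the proof of the lemma. $\square$

Corollary 4.20. By (4.24) in the proof of the preceding lemma, we have

$$\nabla_q U(t, q) = \mathcal{V}[s, \mu](t, q), \qquad t \in (0, T),\ q \in \mathbb{T}^d. \tag{4.25}$$

The dependence of $U$ on the parameters $s$ and $\mu$, made clear by its definition, should not be forgotten. This corollary means that $\mathcal{V}[s, \mu](t, \cdot)$ is the $C^1$ gradient of a function. Thus, $\nabla_q \mathcal{V}[s, \mu](t, q)$ is symmetric, for every $t \in (0, T)$, $q \in \mathbb{T}^d$.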
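The construction of $U$ along characteristics, and the identity (4.24), can be tested on a one-dimensional toy problem. The sketch below rests on assumptions that are not in the paper: $H(q, p) = p^2/2$, $F \equiv 0$, an initial cost $g(q) = \frac12 \sin q$ with no dependence on the measure, and the normalization $Q(0, q) = q$ (that is, $s = 0$). Under these assumptions $P$ is constant along characteristics, $Q(t, q) = q + t\, g'(q)$, and $\partial_t z = p\, \nabla_p H - H = p^2/2$.

```python
import math

def g(q):
    return 0.5 * math.sin(q)      # assumed initial cost

def dg(q):
    return 0.5 * math.cos(q)      # its derivative, the initial momentum

def flow(t, q):
    """Q(t, q): position at time t of the characteristic starting at q."""
    return q + t * dg(q)

def inverse_flow(t, x):
    """X(t, x): solve Q(t, q) = x by bisection (Q is increasing in q for t < 2)."""
    lo, hi = x - 1.0, x + 1.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if flow(t, mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def U(t, x):
    """U(t, x) = z(t, X(t, x)) with z(t, q) = g(q) + t * p^2 / 2, p = g'(q)."""
    q = inverse_flow(t, x)
    p = dg(q)
    return g(q) + 0.5 * t * p * p

def hj_residual(t, x, eps=1e-5):
    """Central-difference evaluation of dU/dt + (dU/dx)^2 / 2 and of dU/dx."""
    Ut = (U(t + eps, x) - U(t - eps, x)) / (2 * eps)
    Ux = (U(t, x + eps) - U(t, x - eps)) / (2 * eps)
    return Ut + 0.5 * Ux * Ux, Ux

residual, Ux = hj_residual(0.5, 1.0)
momentum = dg(inverse_flow(0.5, 1.0))
```

For this example `residual` vanishes up to discretization error, and `Ux` matches `momentum`, the value of $P$ carried by the characteristic through $(t, x)$, which is the content of (4.24) in this simplified setting.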


To conclude our statement about the MFG system, observe that

$$v_t(q) = \partial_t \Sigma^1_t((\Sigma^1_t)^{-1}(q)) = \nabla_p H(q, \mathcal{V}(t, q)) = \nabla_p H(q, \nabla_q U(t, q)). \tag{4.26}$$

Hence, combining with Proposition 4.17, we have the following.

Theorem 4.21. (Existence of solution to the MFG system) Let $\mu \in \mathcal{P}(\mathbb{T}^d)$, $T$ be in accordance with Remark 4.8 and Proposition 4.5, $0 < s < T$, and let $\sigma_t$, $v_t$ be as in (4.19), where $(\Sigma^1[s, \mu], \Sigma^2[s, \mu])$ is the unique solution to (4.3) with parameters $s, \mu$. Then the pair $(U, \sigma)$, where $U$ is as in Lemma 4.19, is a classical solution to the mean-field game system (1.2a)-(1.2d) in the sense explained in Section 2.3.

Note that, by Proposition 4.18, the pair $(U, \sigma)$ constructed above is in $W^{2,2;\infty}((0, T) \times \mathbb{T}^d) \times AC^2(0, T; \mathcal{P}(\mathbb{T}^d))$. The following is, in a sense, a consistency, or restricted uniqueness, complement to the latter theorem.

Theorem 4.22. (The case of $W^{2,3;\infty}((0, T) \times \mathbb{T}^d) \times AC^2(0, T; \mathcal{P}(\mathbb{T}^d))$ solutions to the MFG system) Let $(\tilde U, \tilde\sigma)$ be a classical solution to the MFG system (1.2a)-(1.2d), in the sense explained in Section 2.3, such that $\tilde U \in W^{2,3;\infty}((0, T) \times \mathbb{T}^d)$. Then $(\tilde U, \tilde\sigma) = (U, \sigma)$, where $(U, \sigma)$ is the pair constructed for Theorem 4.21.

Proof. Let $(\tilde U, \tilde\sigma)$ be a solution to the system (1.2a)-(1.2d) with parameters $s \in (0, T)$ and $\mu \in \mathcal{P}(\mathbb{T}^d)$, according to the definition of Section 2.3, and suppose, moreover, that $\tilde U$ is $W^{3;\infty}$ in $q$.

1. We will prove that the characteristics of the MFG system satisfied by $(\tilde U, \tilde\sigma)$ must solve the Hamiltonian system (4.3). Set $\tilde v_t(q) = \tilde v(t, q) := \nabla_p H(q, \nabla_q \tilde U(t, q))$, $0 \le t \le T$, $q \in \mathbb{T}^d$. Since $\tilde U \in W^{2,3;\infty}((0, T) \times \mathbb{T}^d)$ and $H \in C^3$, we have that $\operatorname{Lip}(\tilde v_t, K) + \sup_{q \in K} |\tilde v_t(q)|$ is bounded on $[0, T]$, where $K$ is any compact subset of $\mathbb{R}^d$ and $\operatorname{Lip}(\tilde v_t, K)$ is the Lipschitz constant of $\tilde v_t|_K$. By elementary ODE theory (see, e.g., [3, Lemma 8.1.4]), if $q \in \mathbb{R}^d$, the ODE

$$\tilde\Sigma^1(s, q) = q, \qquad \frac{\partial}{\partial t} \tilde\Sigma^1(t, q) = \tilde v_t(\tilde\Sigma^1(t, q)) \tag{4.27}$$

has a unique maximal solution in a neighborhood $I(q, s) \subset (0, T)$ of $s$, but, since $\tilde\Sigma^1(t, \cdot)$ is periodic, and therefore bounded for every $t \in (0, T)$, then $I(q, s) = (0, T)$. Clearly, the path $t \mapsto \tilde\Sigma^1(t, \cdot) \# \mu$ solves the continuity equation with velocity $\tilde v_t$. We can apply Proposition 8.1.7 of [3] to conclude that

$$\tilde\sigma_t = \tilde\Sigma^1(t, \cdot) \# \mu, \tag{4.28}$$

$0 \le t \le T$. Let

$$\tilde{\mathcal{V}}(t, q) := \nabla_q \tilde U(t, q), \qquad \tilde\Sigma^2(t, q) := \tilde{\mathcal{V}}(t, \tilde\Sigma^1(t, q)), \tag{4.29}$$


0 ≤ t ≤ T , q ∈ T d . Then ˜ 1 (t, q)) = ∇p H ( ˜ 1 (t, q),  ˜ 2 (t, q)), ˜ 1 (t, q) = v˜t ( ∂t 

(4.30)

which is the first equation in (4.3). To obtain the second one, observe that ˜  ˜  ˜ 1 (q, t)) + ∇q V(t, ˜ 1 (t, q))∂t  ˜ 2 (t, q) = ∂t V(t, ˜ 1 (t, q) ∂t  2 ˜ 2 ˜ ˜ 1 (t, q)) + ∇qq ˜ 1 (t, q))∂t  ˜ 1 (t, q), = ∇tq U (t,  U (t, 

while, differentiating the Hamilton-Jacobi equation with respect to q gives 2 ˜ 2 ˜ ∇qt U (t, q) + ∇q H (q, ∇q U˜ (t, q)) + ∇p H (q, ∇q U˜ (t, q))∇qq U (t, q) + ∇q F (q, σ˜ t ) = 0,

˜ 1 (t, q) in place of q, and using (4.29), (4.30), serves to simplify the former which, evaluating at  equality to ˜ 1 (t, q),  ˜ 2 (t, q)) − ∇q F ( ˜ 1 (t, q), σ˜ t ), ˜ 2 (t, q) = −∇q H ( ∂t  ˜ 2 (0, q) = ∇q g( ˜ 1 (0, q),  ˜ 1 (0, ·)# μ) folwhich is the second equation in (4.3). The condition  lows readily from (4.29) and (1.2c). ˜ 1,  ˜ 2 ) to (4.3) from the previous 2. We prove that, for a possibly smaller T , the solutions ( paragraph belongs to M0 (A1 , θ A2 , θ B, E, E1 , θ E2 , T ), i.e., they satisfy the bounds (4.13) and ˜ 2 (0, q)| = |∇q g( ˜ 1 (q), σ˜ 0 )| ≤ κ, continuity (iii) of Definition 4.2. If T is small enough, since | 0 ˜ 2 (t, q)| ≤ κ + ε, 0 ≤ t ≤ T , q ∈ T d for an ε > 0 such that implies | ˜ 2 (t, q)| ≤ θ |

κ 1 + ε ≤ θ max{d, κ} + ε = θ c/θ + ε ≤ θ B, θ θ

because B in the proof of Lemma 4.4 was chosen as B > c/θ (in these lines we are referring back to the proof of Lemma 4.4, in particular (4.5), (4.6) and the paragraph preceding ˜ 2 M2 ≤ θ B, we have |∂t  ˜ 1 (t, q)| = those inequalities). This is the third line in (4.13). Since  1 2 ¯ ˜ (t, q),  ˜ (t, q))| ≤ l(B) |∇p H ( (see Definition 4.1, “Coefficient bounds I”) for small enough ¯ T and all q, and since A1 in Lemma 4.4 was chosen to be larger than l(B), we obtain the bound ˜ 1 in (4.13). The one for ∂t  ˜ 2 M2 goes in a similar way, because A2 in Lemma 4.4 for ∂t  ¯ was chosen larger than l(B)/θ . The bounds for the second-order time derivatives are dealt with in a similar way, keeping in mindthe way E1 and E2 were chosen in the proof of Lemma 4.4. ˜ 1 (t, q) = q + t vτ ( ˜ 1 (τ, q))dτ , which makes it clear that, upon taking the From (4.27),  s ˜ 1 (t, q) will be only slightly larger than gradient in q, if T is small enough, the norm of ∇q  √ 2  2 (q) ≡ 0, can ˜ 1 , due to ∇qq d, making it less than A1 , because of (4.5). The bound for ∇qq 2  ˜ 2 and ∇qq ˜ 2, actually be made arbitrarily small by choosing T small enough. To address ∇q  since U˜ (t, q) = g(q, σ˜ 0 ) +

t 0

[H (q, ∇q U˜ (τ, q)) + F (q, σ˜ τ )]dτ,


2 U ˜ (t,  ˜ 1 (t, q))∇q  ˜ 2 (t, q) = ∇qq ˜ 1 (t, q), the norm of ∇q  ˜ 2 is the product of a number and ∇q  √ slightly larger than κ and one slightly larger than d, for small times T . But the constant E in the proof of Lemma 4.4 is larger than c = max{d, κ}, and A2 > θ1 cE(E + 1). This ensures that ˜ 2 ≤ θ A2 . Next, given that

∇q  2 ˜2 3 2 ˜ 2 ˜1 ˜ 1 ∇q  ˜ 1 + ∇qq U˜ ∇q  U ∇qq  ∇qq  = ∇qqq 2  ˜ 1 |, as already mentioned, can be made as small as needed (because U˜ is W 3;∞ in q), and |∇qq 2  ˜ 2 is also no greater than θ A2 , since the by reducing T , the same argument shows that ∇qq 3 U ˜ ∇q  ˜ 1 ∇q  ˜ 1 is the product of a number slightly larger than κ and one slightly norm of ∇qqq 1 2  ˜ , ∇qq ˜ 1 ≤ E follow from E having been picked larger than d (and, than d. Finally, ∇q  0 0 √ therefore, than d), and taking T smaller if necessary. ˜ 1,  ˜ 2 ) constructed in (4.27) and (4.29) from (U˜ , σ˜ ) coincides with Thus, the mapping ( 1 the unique solution ( [s, μ], 2 [s, μ]) of (4.3) in M0 (A1 , θ A2 , θ B, E, E1 , θ E2 , T ) during a possibly shorter interval [0, T ]. Consequently, by (4.28), we further have that σ˜ = σ . Also, now ˜ q) = V(t, q), 0 ≤ t ≤ T , we get that V(t,

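\[
\tilde U(t,q) = g(q, \tilde\sigma_0) - \int_0^t [H(q, \tilde{\mathcal{V}}(\tau,q)) + F(q, \tilde\sigma_\tau)]\,d\tau
= g(q, \sigma_0) - \int_0^t [H(q, \mathcal{V}(\tau,q)) + F(q, \sigma_\tau)]\,d\tau = U(t,q),
\]
for any $t \in [0,T]$. Thus, $(\tilde U, \tilde\sigma) = (U, \sigma)$ on the possibly smaller interval $[0,T]$. □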

4.4. The full value function $u(s,q,\mu)$

In this section we begin our study of the dependence of our solution $U$ to (1.2a) on the parameter $\mu$. First, we present a list of facts that will be used in this and following sections.

Proposition 4.23. Let $0 \le s, t_0 \le T$, $\mu \in \mathcal{P}(\mathbb{T}^d)$. Set $\sigma_{t_0} = \Sigma^1_{t_0}[s,\mu]_\# \mu$. Then,

(i) For every $0 \le t \le T$:
\[
\Sigma_t[t_0, \sigma_{t_0}] \circ \Sigma^1_{t_0}[s,\mu] = \Sigma_t[s,\mu]. \tag{4.31}
\]

(ii) For every $0 \le t \le T$:
\[
\Sigma^1_t[t_0, \Sigma^1_{t_0}[t,\mu]_\# \mu] \ \text{and} \ \Sigma^1_{t_0}[t,\mu] \ \text{are inverses of each other}, \tag{4.32}
\]
\[
v_t[s,\mu] = v_t[t_0, \sigma_{t_0}], \tag{4.33}
\]
\[
\Sigma^2_t[t_0, \sigma_{t_0}] \circ \Sigma^1_{t_0}[s,\mu] = \Sigma^2_t[s,\mu], \tag{4.34}
\]
\[
\partial_s \Sigma^1_t[s,\mu] = -\nabla_q \Sigma^1_t[s,\mu]\, v_s[t, \sigma_t]. \tag{4.35}
\]

(iii) If $0 \le \tau, t \le T$, then
\[
\Sigma^2_\tau[t, \sigma_t] \circ (\Sigma^1_\tau[t, \sigma_t])^{-1} = \Sigma^2_\tau[s,\mu] \circ \Sigma^1_\tau[s,\mu]^{-1}. \tag{4.36}
\]

Proof. (i) Let Q(t, q) = (1t [s, μ] ◦ 1t0 [s, μ]−1 )(q), P (t, q) = (2t [s, μ] ◦ 1t0 [s, μ]−1 )(q), for 0 ≤ t ≤ T , q ∈ T d . By differentiating, and noting that Q(0, ·)# σt0 = 10 [s, μ]# μ, one verifies that Q and P defined this way satisfy the Hamiltonian ODEs (4.3) with s = t0 ,

μ = σ t0 .

Since solutions to (4.3) are unique, we conclude that (Qt , Pt ) = (1t [t0 , σt0 ], 2t [t0 , σt0 ]), yielding (4.31). (ii) Fact (4.32) follows readily from (i), by setting t = s. For (4.33), see [24, p. 6593]. Formula (4.34) is just the second component of (4.31). By (4.32), with t = s, t0 = t , we have id = 1t [s, μ] ◦ 1s [t, σt ] = 1 [s, μ](t, 1s [t, σt ]). By Lemma 4.12, we can differentiate both sides with respect to s: 0 = ∂s 1 [s, μ](t, 1s [t, σt ](q)) + ∇q 1 [s, μ](t, 1s [t, σt ](q))∂s 1s [t, σt ](q),

q ∈ Td.

Substituting 1s [t, σt ](q) for q, we get 0 = ∂s 1 [s, μ](t, q) + ∇q 1 [s, μ](t, q)∂s 1s [t, σt ](1s [t, σt ]−1 ) = ∂s 1 [s, μ](t, q) + ∇q 1 [s, μ](t, q)vs [t, σt ], which gives (4.35). (iii) For (4.36), simply use (4.34) with τ in place of t and t in place of t0 : 2τ [t, σt ] ◦ (1τ [t, σt ])−1 = 2τ [s, μ] ◦ t [s, μ]−1 ◦ 1τ [t, σt ]−1 = 2τ ◦ (1τ [t, σt ] ◦ t [s, μ])−1 = 2τ [s, μ] ◦ 1τ [s, μ]−1 . 2


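Given $s \in [0,T]$, $q \in \mathbb{T}^d$, $\mu \in \mathcal{P}(\mathbb{T}^d)$, define
\[
u(s,q,\mu) = g(q, \Sigma^1[s,\mu](0,\cdot)_\# \mu) - \int_0^s \big[ H(q, \mathcal{V}[s,\mu](\tau,q)) + F(q, \Sigma^1[s,\mu](\tau,\cdot)_\# \mu) \big]\,d\tau, \tag{4.37}
\]
and, as before, $\sigma_t = \Sigma^1_t[s,\mu]_\# \mu$, $0 \le t \le T$. Note that (4.36) reads now as $\mathcal{V}_\tau[t, \sigma_t] = \mathcal{V}_\tau[s,\mu]$. This, coupled with the fact that, by (4.31) of Proposition 4.23, $\Sigma^1_\tau[t, \sigma_t] \circ \Sigma^1_t[s,\mu] = \Sigma^1_\tau[s,\mu]$, $0 \le \tau \le T$, gives us
\[
\begin{aligned}
u(t,q,\sigma_t) &= g(q, \Sigma^1_0[t, \sigma_t]_\# \sigma_t) - \int_0^t \big[ H(q, \mathcal{V}[t,\sigma_t](\tau,q)) + F(q, \Sigma^1[t,\sigma_t](\tau,\cdot)_\# \sigma_t) \big]\,d\tau \\
&= g(q, \sigma_0) - \int_0^t \big[ H(q, \mathcal{V}[s,\mu](\tau,q)) + F(q, \Sigma^1_\tau[s,\mu]_\# \mu) \big]\,d\tau.
\end{aligned} \tag{4.38}
\]
Since $\mathcal{V}[s,\mu](t, \Sigma^1[s,\mu](t,q)) = \Sigma^2[s,\mu](t,q)$, and $\Sigma^2$ satisfies the second of the Hamiltonian ODEs (4.3), it follows, by taking the total time derivative of $\mathcal{V}[s,\mu](t, \Sigma^1[s,\mu](t,q))$, and then changing variable from $\Sigma^1(t,q)$ to $q$, that $\mathcal{V}(t,q) = \mathcal{V}[s,\mu](t,q)$ satisfies the equation
\[
\partial_t \mathcal{V}(t,q) + \nabla_q \mathcal{V}(t,q) \nabla_p H(q, \mathcal{V}(t,q)) = -\nabla_q H(q, \mathcal{V}(t,q)) - \nabla_q F(q, \Sigma^1_t{}_\# \mu), \tag{4.39}
\]
\[
\mathcal{V}(0,q) = \nabla_q g(q, \sigma_0). \tag{4.40}
\]
If we differentiate $u(t,q,\sigma_t)$ with respect to $q$ in (4.38), and use (4.39) and (4.40), we get
\[
\nabla_q u(t,q,\sigma_t) = \nabla_q g(q, \sigma_0) + \int_0^t \big[ \partial_t \mathcal{V}[s,\mu](\tau,q) + \nabla_q \mathcal{V}[s,\mu](\tau,q) \nabla_p H(q, \mathcal{V}[s,\mu](\tau,q)) - \nabla_p H(q, \mathcal{V}[s,\mu](\tau,q)) \nabla_q \mathcal{V}[s,\mu](\tau,q) \big]\,d\tau.
\]
Since $\nabla_q \mathcal{V}_\tau$ is a symmetric matrix for $\tau \in [0,T]$, only the term $\partial_t \mathcal{V}$ survives in the integral. Hence
\[
\nabla_q u(t,q,\sigma_t) = \mathcal{V}[s,\mu](t,q), \qquad 0 \le t \le T, \ q \in \mathbb{T}^d. \tag{4.41}
\]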


Differentiating now with respect to $t$ in (4.38), and substituting (4.41) into it, we conclude that
\[
\partial_t (u(t,q,\sigma_t)) + H(q, \nabla_q u(t,q,\sigma_t)) + F(q, \sigma_t) = 0. \tag{4.42}
\]
Thus, we have shown:

Lemma 4.24. For $s \in [0,T]$, $\mu \in \mathcal{P}(\mathbb{T}^d)$, we have:

(i) For any $(t,q) \in [0,T] \times \mathbb{T}^d$,
\[
u(0,\cdot,\mu) = g(\cdot,\mu), \qquad \nabla_q u(t,q,\sigma_t) = \mathcal{V}[s,\mu](t,q).
\]

(ii) The function $t \mapsto u(t,q,\sigma_t)$ is continuously differentiable and
\[
\partial_t (u(t,q,\sigma_t)) + H(q, \nabla_q u(t,q,\sigma_t)) + F(q, \sigma_t) = 0, \qquad (t,q) \in [0,T] \times \mathbb{T}^d.
\]

5. Regularity of $\Sigma[s,\cdot](t,q)$

Besides the conditions of Section 2.1, we now activate the conditions of Section 2.2 for the remainder of the paper.

5.1. The discretized map $M$

For the remainder of the paper, let
\[
\theta > \max\{1, 5\sqrt{2}\kappa\}, \tag{5.1}
\]

and $A_1, A_2, B, E, T, D$ be as in Proposition 4.5 and Corollary 4.7, with $T$ being subject to Remark 4.8. The functions $\Sigma[s,\mu]$, $s \in [0,T]$, $\mu \in \mathcal{P}(\mathbb{T}^d)$, as before, denote the fixed points of the operators $m^{s,\mu}$, while $\bar\Sigma[s,\mu] = (\Sigma^1[s,\mu], \frac{1}{\theta}\Sigma^2[s,\mu]) =: (\bar\Sigma^1[s,\mu], \bar\Sigma^2[s,\mu])$ are the fixed points of the operators $\bar m^{s,\mu}$; recall (4.12) from the proof of Corollary 4.7.

Let $M = (M_1, M_2)$ be the map $\Sigma = (\Sigma^1, \Sigma^2)$ restricted to averages of Dirac masses. Namely:

Definition 5.1. For any $s, t \in [0,T]$, $q \in \mathbb{T}^d$, $x \in (\mathbb{T}^d)^n$, let
\[
M = (M_1, M_2) : [0,T] \times [0,T] \times \mathbb{T}^d \times (\mathbb{T}^d)^n \longrightarrow \mathbb{T}^d \times \mathbb{R}^d, \qquad
(t,s,q,x) \longmapsto \Sigma[s,\mu^x](t,q) = (\Sigma^1[s,\mu^x](t,q), \Sigma^2[s,\mu^x](t,q)).
\]
Note. The domain of the mapping $M$ depends on $n \in \mathbb{Z}_+$.

Definition 5.2. (i) For $n \in \mathbb{N}$, let $\bar M^k$, $k = 0, 1, \dots$ be the sequence of $\mathbb{T}^d \times \mathbb{R}^d$-valued functions on $[0,T] \times [0,T] \times \mathbb{T}^d \times (\mathbb{T}^d)^n$ defined by
\[
\bar M^0 \equiv (q, 0), \qquad \bar M^{k+1} = \bar m^{s,\mu^x}(\bar M^k(\cdot, s, \cdot, x)),
\]
where $\mu^x = \frac{1}{n}\sum_{j=1}^n \delta_{x_j}$, $x = (x_1, \dots, x_n) \in (\mathbb{T}^d)^n$. That is,
\[
\bar M_1^{k+1}(t,s,q,x) = q + \int_s^t \nabla_p H(\bar M_1^k(\tau,s,q,x), \theta \bar M_2^k(\tau,s,q,x))\,d\tau
\]
and
\[
\begin{aligned}
\bar M_2^{k+1}(t,s,q,x) ={}& \frac{1}{\theta}\nabla_q g_n(\bar M_1^k(0,s,q,x), \bar M_1^k(0,s,x_1,x), \dots, \bar M_1^k(0,s,x_n,x)) \\
& - \frac{1}{\theta}\int_0^t \big[ \nabla_q H(\bar M_1^k(\tau,s,q,x), \theta \bar M_2^k(\tau,s,q,x)) \\
& \qquad + \nabla_q F_n(\bar M_1^k(\tau,s,q,x), \bar M_1^k(\tau,s,x_1,x), \dots, \bar M_1^k(\tau,s,x_n,x)) \big]\,d\tau,
\end{aligned}
\]
where
\[
F_n(q,x) := F(q,\mu^x), \qquad g_n(q,x) := g(q,\mu^x).
\]

(ii) Let $M^k$, $k = 0, 1, \dots$ be the sequence of $\mathbb{T}^d \times \mathbb{R}^d$-valued functions on $[0,T] \times [0,T] \times \mathbb{T}^d \times (\mathbb{T}^d)^n$ defined by
\[
M^0 \equiv (q, 0), \qquad M^{k+1} = m^{s,\mu^x}(M^k(\cdot, s, \cdot, x)).
\]
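The iteration in Definition 5.2 is a Picard scheme for a Hamiltonian system with data split between time $s$ (position) and time $0$ (momentum). The following is a minimal numerical sketch of that structure only, under loudly stated assumptions that are not in the paper: dimension one, no measure coupling (no $F$, and $g$ depending on $q$ alone), no $\theta$-rescaling, the toy Hamiltonian $H(q,p) = p^2/2 + \cos q$, and $g'(q) = 0.5\cos q$. All function names are hypothetical.

```python
import math

def cumtrapz(values, dt):
    """Cumulative trapezoidal integral starting at the first grid point."""
    out = [0.0]
    for a, b in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * dt * (a + b))
    return out

def picard_characteristics(q, s_idx, T=0.2, N=200, iters=40):
    """Iterate Q <- q + int_s^t d_pH and P <- g'(Q(0)) - int_0^t d_qH on a grid.

    Toy data (assumptions): H(q, p) = p**2/2 + cos(q), g'(q) = 0.5*cos(q).
    Returns the last iterate and the sup-norm gaps between successive iterates.
    """
    dt = T / N
    Q = [q] * (N + 1)       # zeroth iterate (q, 0), as in Definition 5.2
    P = [0.0] * (N + 1)
    gaps = []
    for _ in range(iters):
        int_dpH = cumtrapz(P, dt)                            # d_pH(Q, P) = P
        int_dqH = cumtrapz([-math.sin(x) for x in Q], dt)    # d_qH(Q, P) = -sin(Q)
        new_Q = [q + int_dpH[i] - int_dpH[s_idx] for i in range(N + 1)]
        p0 = 0.5 * math.cos(Q[0])                            # g'(Q^k(0))
        new_P = [p0 - int_dqH[i] for i in range(N + 1)]
        gap = max(max(abs(a - b) for a, b in zip(new_Q, Q)),
                  max(abs(a - b) for a, b in zip(new_P, P)))
        gaps.append(gap)
        Q, P = new_Q, new_P
    return Q, P, gaps
```

In this toy setting the gaps between successive iterates shrink geometrically because $T$ is small and $g'$ has Lipschitz constant below one; in the paper the analogous smallness is obtained through the choice of $T$ and the rescaling by $\theta$.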

Remark 5.3. It follows that, for every $k = 0, 1, \dots$
\[
M_1^k(t,s,q,x) = \bar M_1^k(t,s,q,x); \qquad M_2^k(t,s,q,x) = \theta \bar M_2^k(t,s,q,x),
\]
$t, s \in [0,T]$, $q \in \mathbb{T}^d$, $x \in (\mathbb{T}^d)^n$.

The objective now is to obtain estimates on the derivatives of $M^k$ with respect to $x$. Using the definition of Wasserstein gradient directly, one verifies the formulas
\[
\nabla^2_{x_i q} F_n(q,x) = \frac{1}{n}\nabla_\mu \nabla_q F(q,\mu^x)(x_i), \qquad
\nabla^2_{x_i q} g_n(q,x) = \frac{1}{n}\nabla_\mu \nabla_q g(q,\mu^x)(x_i). \tag{5.2}
\]
The previous is a matrix equality: the entries on the left hand side are $\partial_{x_i^{(j)}} \partial_{q^{(l)}} F_n(q,x)$ and those on the right are $\frac{1}{n}\nabla_{\mu_j} \partial_{q^{(l)}} F(q,\mu^x)(x_i)$, for $j, l = 1, \dots, d$, where $\nabla_{\mu_j}$ denotes the $j$-th component of the Wasserstein gradient of $\partial_{q^{(l)}} F(q,\mu^x)$ at $x_i$. Furthermore, since $\nabla_q F$ and $\nabla_q g$ are twice differentiable in the measure variable, we also know that
\[
\nabla^2_{x_j x_i} \nabla_q F_n(q,x) = \frac{1}{n^2}\nabla^2_{\mu\mu} \nabla_q F(q,\mu^x)(x_i,x_j), \qquad
\nabla^2_{x_j x_i} \nabla_q g_n(q,x) = \frac{1}{n^2}\nabla^2_{\mu\mu} \nabla_q g(q,\mu^x)(x_i,x_j). \tag{5.3}
\]
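The $1/n$ scaling in (5.2) can be checked numerically on a toy example. The sketch below assumes, purely for illustration, $d = 1$ and the hypothetical coupling $F(q,\mu) = \int \cos(2\pi(q-y))\,\mu(dy)$, which is not taken from the paper. For a linear functional $\mu \mapsto \int \varphi\,d\mu$ the Wasserstein gradient at $x$ is $\varphi'(x)$, so here $\nabla_\mu \partial_q F(q,\mu)(x) = 4\pi^2\cos(2\pi(q-x))$.

```python
import math

TWO_PI = 2.0 * math.pi

def dq_Fn(q, xs):
    """d/dq of F_n(q, x) = (1/n) * sum_j cos(2*pi*(q - x_j)) (toy coupling)."""
    return sum(-TWO_PI * math.sin(TWO_PI * (q - x)) for x in xs) / len(xs)

def grad_mu_dqF(q, x):
    """Wasserstein gradient of mu -> d_q F(q, mu), evaluated at the point x.

    For a linear functional mu -> int phi d(mu) this is phi'(x); here
    phi(y) = -2*pi*sin(2*pi*(q - y)).
    """
    return TWO_PI ** 2 * math.cos(TWO_PI * (q - x))

def fd_dxi_dqFn(q, xs, i, h=1e-6):
    """Central finite difference of d_q F_n in the i-th particle position."""
    up = list(xs)
    up[i] += h
    dn = list(xs)
    dn[i] -= h
    return (dq_Fn(q, up) - dq_Fn(q, dn)) / (2.0 * h)
```

Comparing `fd_dxi_dqFn(q, xs, i)` with `grad_mu_dqF(q, xs[i]) / len(xs)` reproduces the first identity in (5.2) for this particular $F$.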


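We begin with the $x_j$-derivative of the $(k+1)$-th iteration ($j = 1, \dots, n$), and in this and subsequent calculations, we will include in the arguments of the functions only the variables that are relevant to them.
\[
\nabla_{x_j} \bar M_1^{k+1}(t,s,q,x) = \int_s^t \big[ \nabla^2_{qp} H(\bar M_1^k, \theta \bar M_2^k) \nabla_{x_j} \bar M_1^k + \theta \nabla^2_{pp} H(\bar M_1^k, \theta \bar M_2^k) \nabla_{x_j} \bar M_2^k \big]\,d\tau, \tag{5.4}
\]
\[
\begin{aligned}
\nabla_{x_j} \bar M_2^{k+1}(t,s,q,x) ={}& \frac{1}{\theta}\nabla^2_{qq} g_n(\bar M_1^k(q,x), \bar M_1^k(x_1,x), \dots, \bar M_1^k(x_n,x)) \nabla_{x_j} \bar M_1^k(q,x) \\
& + \frac{1}{\theta}\nabla^2_{x_j q} g_n(\bar M_1^k(q,x), \bar M_1^k(x_1,x), \dots, \bar M_1^k(x_n,x)) \big[ \nabla_q \bar M_1^k(x_j,x) + \nabla_{x_j} \bar M_1^k(x_j,x) \big] \\
& + \frac{1}{\theta}\sum_{i \ne j} \nabla^2_{x_i q} g_n(\bar M_1^k(q,x), \bar M_1^k(x_1,x), \dots, \bar M_1^k(x_n,x)) \nabla_{x_j} \bar M_1^k(x_i,x) \\
& - \frac{1}{\theta}\int_0^t \big[ \nabla^2_{qq} H(\bar M_1^k, \theta \bar M_2^k) \nabla_{x_j} \bar M_1^k + \theta \nabla^2_{pq} H(\bar M_1^k, \theta \bar M_2^k) \nabla_{x_j} \bar M_2^k \big]\,d\tau \\
& - \frac{1}{\theta}\int_0^t \Big[ \nabla^2_{qq} F_n(\bar M_1^k(q,x), \bar M_1^k(x_1,x), \dots, \bar M_1^k(x_n,x)) \nabla_{x_j} \bar M_1^k(q,x) \\
& \qquad + \nabla^2_{x_j q} F_n(\bar M_1^k(q,x), \bar M_1^k(x_1,x), \dots, \bar M_1^k(x_n,x)) \big[ \nabla_q \bar M_1^k(x_j,x) + \nabla_{x_j} \bar M_1^k(x_j,x) \big] \\
& \qquad + \sum_{i \ne j} \nabla^2_{x_i q} F_n(\bar M_1^k(q,x), \bar M_1^k(x_1,x), \dots, \bar M_1^k(x_n,x)) \nabla_{x_j} \bar M_1^k(x_i,x) \Big]\,d\tau.
\end{aligned} \tag{5.5}
\]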

For the next lemma, we remind the reader of Remark 4.13.

Lemma 5.4. Fix $n \in \mathbb{Z}_+$. Using the terminology of Definition 5.2, the following hold:

(i) For each $k = 0, 1, \dots$ and each $x \in (\mathbb{T}^d)^n$, $M^k(\cdot,\cdot,\cdot,x) \in \mathcal{M}^*_{0,D}(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$.

(ii) There is a constant $C > 0$, independent of $k$, such that for any $j = 1, \dots, n$:
\[
\|\nabla_{x_j} M^k\|_\infty \le \frac{C}{n}. \tag{5.6}
\]

Proof. (i) Clearly, $\bar M^0(\cdot,\cdot,\cdot,x) \in \mathcal{M}^*_{0,\bar D}(A_1, A_2, B, E, E_1, E_2, T)$, and by Lemma 4.12, each $\bar M^k(\cdot,\cdot,\cdot,x) \in \mathcal{M}^*_{0,\bar D}(A_1, A_2, B, E, E_1, E_2, T)$. By Remark 4.13, each $M^k(\cdot,\cdot,\cdot,x) \in \mathcal{M}^*_{0,D}(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$.

(ii) Recall formulas (5.2), (5.3). Since $\nabla_q F$, $\nabla_q g$ and their first and second order Wasserstein gradients are uniformly bounded (conditions imposed in Section 2.2), we get


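\[
\begin{aligned}
\|\nabla_{x_j} \bar M^{k+1}\|_\infty \le{}& \Big( \frac{\sqrt{2}\kappa}{\theta} E + \frac{\sqrt{2}\kappa}{\theta} T A_1 \Big)\frac{1}{n} \\
& + \|\nabla_{x_j} \bar M^k\|_\infty \Big[ \frac{\sqrt{2}\kappa}{\theta} + \frac{\sqrt{2}\kappa}{\theta}\frac{1}{n} + \frac{\sqrt{2}\kappa}{\theta}\frac{n-1}{n} \\
& \qquad + T\Big( 2\sqrt{2}(1+\theta)\bar h(B) + \frac{\sqrt{2}\kappa}{\theta} + \frac{\sqrt{2}\kappa}{\theta}\frac{1}{n} + \frac{\sqrt{2}\kappa}{\theta}\frac{n-1}{n} \Big) \Big] \\
\le{}& \frac{\sqrt{2}\kappa}{\theta}\frac{1}{n}(E + T A_1) + \|\nabla_{x_j} \bar M^k\|_\infty \Big[ 3\frac{\sqrt{2}\kappa}{\theta} + T\Big( 2\sqrt{2}(1+\theta)\bar h(B) + 3\frac{\sqrt{2}\kappa}{\theta} \Big) \Big].
\end{aligned}
\]
By (5.1), we can invoke Remark 4.8 to obtain that the latter is an expression of the form
\[
\|\nabla_{x_j} \bar M^{k+1}\|_\infty \le \frac{a}{n} + b\,\|\nabla_{x_j} \bar M^k\|_\infty
\]
with positive constants $a$, $b$ in which $b < 1$. This holds for every $k = 0, 1, \dots$ Applying this inequality recursively (see [24, Remark 8.1]), we find a constant $C > 0$ such that $\|\nabla_{x_j} \bar M^k\|_\infty \le C/n$, for every $k \in \mathbb{Z}_+$. Thus, by Remark 5.3,
\[
\Big( \|\nabla_{x_j} M_1^k\|_\infty^2 + \frac{1}{\theta^2}\|\nabla_{x_j} M_2^k\|_\infty^2 \Big)^{1/2} = \big( \|\nabla_{x_j} \bar M_1^k\|_\infty^2 + \|\nabla_{x_j} \bar M_2^k\|_\infty^2 \big)^{1/2} \le \frac{C}{n}. \tag{5.7}
\]
Multiplying by $\theta$ on both sides we get, since $\theta > 1$, that $( \|\nabla_{x_j} M_1^k\|_\infty^2 + \|\nabla_{x_j} M_2^k\|_\infty^2 )^{1/2} \le \frac{\theta C}{n}$, which is (5.6) for a larger constant $C$. □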
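The recursive step used in the proof of Lemma 5.4(ii) is the standard geometric-series bound: if $x_{k+1} \le a/n + b\,x_k$ with $0 \le b < 1$ and $x_0 = 0$ (the zeroth iterate $(q,0)$ does not depend on $x$), then $x_k \le a/((1-b)n)$ for all $k$. A small sketch, with hypothetical names and arbitrary sample values of $a$ and $b$, illustrates that the resulting constant is independent of both $k$ and $n$.

```python
def worst_case_iterates(a, b, n, steps, x0=0.0):
    """Iterate the equality case x_{k+1} = a/n + b*x_k of the recursive bound."""
    xs = [x0]
    for _ in range(steps):
        xs.append(a / n + b * xs[-1])
    return xs

def uniform_constant(a, b):
    """Constant C with x_k <= C/n for every k when x_0 = 0 and 0 <= b < 1."""
    if not 0.0 <= b < 1.0:
        raise ValueError("the geometric-series bound needs 0 <= b < 1")
    return a / (1.0 - b)
```

For instance, with `a = 2.0` and `b = 0.6` every iterate stays below `uniform_constant(2.0, 0.6) / n`, whatever the number of particles `n`.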

5.1.1. Regularity of $M$ in $x$, $q$ and $s$

Corollary 5.5. The sequence $\{M^k\}_{k=1}^\infty$ of Definition 5.2(ii) converges uniformly to the function $M$ of Definition 5.1, with $M(\cdot,\cdot,\cdot,x) \in \mathcal{M}^*_{0,D}(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$ for every $x \in (\mathbb{T}^d)^n$ and $M(t,s,q,x) = M(t,s,q,\bar x)$ whenever $\bar x$ is a permutation of $x$. Moreover, there is a constant $C > 0$ such that
\[
\|\nabla^2_{q x_j} M\|_\infty \le \frac{C}{n}, \qquad j = 1, \dots, n, \tag{5.8}
\]
\[
\|\nabla^2_{x_i x_j} M\|_\infty \le \frac{C}{n^2}, \qquad i \ne j, \ i, j \in \{1, \dots, n\}, \tag{5.9}
\]
\[
\|\nabla^2_{x_j x_j} M\|_\infty \le \frac{C}{n}, \qquad j = 1, \dots, n, \tag{5.10}
\]
\[
\|\nabla^2_{s x_j} M\|_\infty \le \frac{C}{n}, \qquad j = 1, \dots, n. \tag{5.11}
\]
Note. Since on $W^{2;\infty}(\mathbb{T}^d \times \mathbb{T}^d)$ the mixed partial derivatives $\nabla^2_{x_j q}$ and $\nabla^2_{q x_j}$ are equal, estimate (5.8) holds for $\nabla^2_{x_j q} M$ too.


Proof. The proof of Lemma 4.12 shows that for every $x \in (\mathbb{T}^d)^n$, $M^k(\cdot,\cdot,\cdot,x)$ converges uniformly to $\Sigma[\cdot,\mu^x](\cdot,\cdot)$. Formula (5.6) means that the sequence $M^k$ is equicontinuous, and uniformly bounded, on $[0,T] \times [0,T] \times \mathbb{T}^d \times (\mathbb{T}^d)^n$. It follows, by Ascoli's theorem, that the convergence of $M^k$ to $M$ is uniform, and $M$ also satisfies (5.6) of Lemma 5.4; to be more precise:
\[
\|\nabla_{x_j} M\|_\infty \le \frac{C}{n}. \tag{5.12}
\]
If $\bar x$ is a permutation of $x$ then $\mu^x = \mu^{\bar x}$ and so $M(t,s,q,x) = M(t,s,q,\bar x)$. The estimates above will be true of the limit function $M$ if they hold for every $M^k$. Like before, we are unable to obtain them for $M^k$ directly due to the size of the constant $\kappa$, so we again do it for $\bar M^k$ first. Differentiating (5.4) and (5.5) with respect to $q$, we get
\[
\begin{aligned}
\nabla^2_{q x_j} \bar M_1^{k+1}(t,s,q,x) = \int_s^t \big[ & (\nabla^3_{qqp} H(\bar M_1^k, \theta \bar M_2^k) \nabla_q \bar M_1^k + \theta \nabla^3_{pqp} H(\bar M_1^k, \theta \bar M_2^k) \nabla_q \bar M_2^k) \nabla_{x_j} \bar M_1^k \\
& + \nabla^2_{qp} H(\bar M_1^k, \theta \bar M_2^k) \nabla^2_{q x_j} \bar M_1^k \\
& + \theta (\nabla^3_{qpp} H(\bar M_1^k, \theta \bar M_2^k) \nabla_q \bar M_1^k + \theta \nabla^3_{ppp} H(\bar M_1^k, \theta \bar M_2^k) \nabla_q \bar M_2^k) \nabla_{x_j} \bar M_2^k \\
& + \theta \nabla^2_{pp} H(\bar M_1^k, \theta \bar M_2^k) \nabla^2_{q x_j} \bar M_2^k \big]\,d\tau,
\end{aligned}
\]
and (see footnote 7)
\[
\begin{aligned}
\nabla^2_{q x_j} \bar M_2^{k+1}(t,s,q,x) ={}& \frac{1}{\theta}\nabla^3_{qqq} g_n(\bar M_1^k(q,x), \bar M_1^k(x_1,x), \dots, \bar M_1^k(x_n,x)) \nabla_q \bar M_1^k(q,x) \nabla_{x_j} \bar M_1^k(q,x) \\
& + \frac{1}{\theta}\nabla^2_{qq} g_n(\bar M_1^k(q,x), \bar M_1^k(x_1,x), \dots, \bar M_1^k(x_n,x)) \nabla^2_{x_j q} \bar M_1^k(q,x) \\
& + \frac{1}{\theta}\sum_{l \ne j} \Big[ \nabla^3_{q x_l q} g_n(\bar M_1^k(q,x), \bar M_1^k(x_1,x), \dots, \bar M_1^k(x_n,x)) \nabla_q \bar M_1^k(q,x) \nabla_{x_j} \bar M_1^k(x_l,x) \\
& \qquad + \nabla^2_{x_l q} g_n(\bar M_1^k(q,x), \bar M_1^k(x_1,x), \dots, \bar M_1^k(x_n,x)) \nabla^2_{q x_j} \bar M_1^k(x_l,x) \Big] \\
& + \frac{1}{\theta}\nabla^3_{q x_j q} g_n(\bar M_1^k(q,x), \bar M_1^k(x_1,x), \dots, \bar M_1^k(x_n,x)) \nabla_q \bar M_1^k(q,x) \big( \nabla_q \bar M_1^k(x_j,x) + \nabla_{x_j} \bar M_1^k(x_j,x) \big) \\
& + \frac{1}{\theta}\nabla^2_{x_j q} g_n(\bar M_1^k(q,x), \bar M_1^k(x_1,x), \dots, \bar M_1^k(x_n,x)) \big[ \nabla^2_{qq} \bar M_1^k(x_j,x) + \nabla^2_{q x_j} \bar M_1^k(x_j,x) \big] \\
& - \frac{1}{\theta}\int_0^t \cdots\,d\tau.
\end{aligned}
\]
Footnote 7. For the second order gradients of $\bar K$, we will only write the part corresponding to $g_n$, knowing that the one corresponding to $F_n$ has exactly the same structure, and the expression coming from $H$ is the same as the one just displayed, except that the last subindex in the third-order gradient $\nabla^3 H$ is $q$.


Using (5.6) and the bounds on the coefficients, we get: √ C 2 ¯

∇qx M¯ k+1 ∞ ≤ 2 2T (1 + θ )h(B)(A 1 + θ A2 ) j n √ κ1 n−1 C C + 2 (E(1 + C) + E + E(E + )) θn n n n √ κ 1 n−1 C C A1 + A1 (A1 + )) + 2 T (A1 (1 + C) + θ n n n n √ √ √ 2κ 2κ

1 + θ 1 n − 1 2 k ¯ h(B) + ( +1+ )+T , + ∇qxj M¯ ∞ 2T θ θ n n θn which is an inequality of the form a 2 2

\[
\|\nabla^2_{q x_j} \bar M^{k+1}\|_\infty \le \frac{a}{n} + b\,\|\nabla^2_{q x_j} \bar M^k\|_\infty
\]
for constants $a$, $b$ with $b < 1$, because $\theta > 4$ and $T$ is small. By induction again, increasing $C$ and switching back to $M^k$ in similar fashion to (5.7), we obtain (5.8) in the limit as $k \to \infty$.

We do not present here the full calculations (the reader may refer to [37]), similar to the preceding one, that lead to (5.9) and (5.10), but we provide a short description:

Case $i \ne j$: The portion involving $g$ has the same bound as that involving $F$, except the latter will be multiplied by $T$ in the estimation, so we can focus our attention on the terms coming from $g$. We see that $1/n^2$ factors out from the terms that do not involve $\|\nabla^2_{x_j x_i} \bar M^k\|_\infty$, while the term multiplying $\|\nabla^2_{x_j x_i} \bar M^k\|_\infty$ is bounded by
\[
\frac{\kappa}{\theta} + \frac{(n-2)\kappa}{\theta n} + \frac{\kappa}{\theta n} + \frac{\kappa}{\theta n} + T(\cdots).
\]
Thus, because of the lower bound imposed on $\theta$, we get that
\[
\|\nabla^2_{x_j x_i} \bar M^{k+1}\|_\infty \le \frac{a}{n^2} + b\,\|\nabla^2_{x_j x_i} \bar M^k\|_\infty,
\]
for constants $a$, $b$ with $b < 1$. The conclusion (5.9) follows.

Case $i = j$: This time only $1/n$, not $1/n^2$, factors out from the terms not involving $\|\nabla^2_{x_j x_j} \bar M^k\|_\infty$, and the term multiplying $\|\nabla^2_{x_j x_j} \bar M^k\|_\infty$ is
\[
\frac{\kappa}{\theta} + (n-1)\frac{\kappa}{\theta n} + \frac{\kappa}{\theta n} + T(\cdots).
\]
It then follows that
\[
\|\nabla^2_{x_j x_j} \bar M^{k+1}\|_\infty \le \frac{a}{n} + b\,\|\nabla^2_{x_j x_j} \bar M^k\|_\infty
\]
with $b < 1$, for all $k \in \mathbb{Z}_+$. Therefore (5.10) holds. Going through the same process once more, this time with the operator $\nabla^2_{s x_j}$, ends up in
\[
\|\nabla^2_{s x_j} \bar M^{k+1}\|_\infty \le \frac{a}{n} + b\,\|\nabla^2_{s x_j} \bar M^k\|_\infty,
\]
with $b < 1$, from which the estimate (5.11) follows. □

5.1.2. First-order Taylor estimate

Our next step is to get an appropriate bound on the remainder of a first-order Taylor approximation of $M(\cdot, s, \cdot, \cdot)$ around $(t,s,q,x)$. Since
\[
\begin{aligned}
\partial_t M_1(t,s,q,x) &= \nabla_p H(M_1(t,s,q,x), M_2(t,s,q,x)), \\
\partial_t M_2(t,s,q,x) &= -\nabla_q H(M_1(t,s,q,x), M_2(t,s,q,x)) - \nabla_q F_n(M_1(t,s,q,x), M_1(t,s,x_1,x), \dots, M_1(t,s,x_n,x)),
\end{aligned}
\]
differentiating once more with respect to $t$ and knowing that $M(\cdot,\cdot,\cdot,x) \in \mathcal{M}^*_{0,D}(A_1, \theta A_2, \theta B, E, E_1, \theta E_2, T)$ for every $x \in (\mathbb{T}^d)^n$, we obtain
\[
\|\partial^2_{tt} M\|_\infty, \ \|\nabla^2_{qt} M\|_\infty \le 2\sqrt{2}\big( h(\theta B)(A_1 + \theta A_2) + \kappa A_1 \big), \qquad
\|\nabla^2_{x_j t} M\|_\infty \le \sqrt{2}\Big( \frac{C}{n} 4h(\theta B) + \frac{\kappa}{n}(A_1 + 2C) \Big), \tag{5.13}
\]

for $j = 1, \dots, n$. Equipped now with estimates on all the second order derivatives of $M(\cdot, s, \cdot, \cdot)$, we proceed to obtain the Taylor estimate in the following corollary.

Corollary 5.6. Let $M = M(t,s,q,x)$, $M' = M(t',s,q',x')$, with the notation of Corollary 5.2. There is a constant $C > 0$ such that
\[
|M' - M - \partial_t M (t' - t) - \nabla_q M \cdot (q' - q) - \nabla_x M \cdot (x' - x)| \le C(|t' - t|^2 + |q - q'|^2 + |x - x'|^2).
\]
The constant $C$ does not depend on $n$.

Note. The norm in the left-hand side of the latter inequality is the Euclidean norm on $\mathbb{R}^{2d}$.

Proof. Let $i \in \{1, 2\}$, $j \in \{1, \dots, d\}$. Denoting $t' - t = \Delta t$, $q' - q = \Delta q$, $x_l' - x_l = \Delta x_l$, and $|\Delta x| = (\sum_{l=1}^n |\Delta x_l|^2)^{1/2}$, the mean-value theorem implies that
\[
\begin{aligned}
& |M_i'^{(j)} - M_i^{(j)} - \partial_t M_i^{(j)} (t' - t) - \nabla_q M_i^{(j)} \cdot (q' - q) - \nabla_x M_i^{(j)} (x' - x)| \\
& \quad \le \|\partial^2_{tt} M_i^{(j)}\|_\infty |\Delta t|^2 + 2\|\nabla^2_{tq} M_i^{(j)}\|_\infty |\Delta t||\Delta q| + 2\sum_{l=1}^n \|\nabla^2_{t x_l} M_i^{(j)}\|_\infty |\Delta t||\Delta x_l| \\
& \qquad + \|\nabla^2_{qq} M_i^{(j)}\|_\infty |\Delta q|^2 + 2\sum_{l=1}^n \|\nabla^2_{q x_l} M_i^{(j)}\|_\infty |\Delta q||\Delta x_l| \\
& \qquad + \sum_{\substack{l,m=1 \\ l \ne m}}^n \|\nabla^2_{x_l x_m} M_i^{(j)}\|_\infty |\Delta x_l||\Delta x_m| + \sum_{l=1}^n \|\nabla^2_{x_l x_l} M_i^{(j)}\|_\infty |\Delta x_l|^2.
\end{aligned}
\]

Therefore, bringing in the estimates obtained in the foregoing paragraphs, we get |M  − M − ∂t M(t  − t) − ∇q M · (q  − q) − ∇x M · (x  − x)| ≤

∂tt2 M ∞ |t|2

2 + 2 ∇tq M ∞ |t||q| + 2

n 

2 2

∇tx M ∞ |t||xl | + ∇qq M ∞ |q|2 l

l=1

+2

n 

2

∇qx M ∞ |q||xl | + l

l=1

n 

(l)

∇x2l xm Mi ∞ |xl ||xm | +

l,m=1 l =m

n 

∇x2l xl M ∞ |xl |2

l=1

√ √ √ C ≤ 2 2(A1 + A2 )(κ + h(θ B))(|t|2 + 2|t||q|) + 4 2 2n (2h(θ B) + κ)|x||t| n √ √ √ √ C C C + 2(A1 + A2 )|q|2 + 2n 2|x||q| + n(n − 1) 2 2 2|x|2 + n |x|2 . n n n Thus, there is a larger constant, still denoted by C, and not depending on n, such that inequality in the corollary’s statement holds. 2 Given x, x  ∈ (T d )n , we can reorder and shift the coordinates of x  = (x1 , . . . , xn ) so that  |x − x  |2 = W 2 (μx , μx ). Thus, the inequality of Corollary 5.6 reads |M  − M − ∂t M(t  − t) − ∇q M · (q  − q) − ∇x M · (x  − x)| 

≤ C(|t  − t|2 + |q − q  |2 + W 2 (μx , μx )).

(5.14)
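The reordering step before (5.14) relies on the fact that the Wasserstein distance between two $n$-point empirical measures is attained by an optimal relabeling of the particles. The sketch below illustrates this on the real line rather than the torus (so no shifting by periods is needed), and uses the normalization $W_2^2 = \min_\pi \frac{1}{n}\sum_j |x_j - y_{\pi(j)}|^2$; both choices are assumptions made for the example, and the function names are hypothetical.

```python
import itertools

def matching_cost(xs, ys):
    """(1/n) * sum_j |x_j - y_j|^2 for the pairing given by the list order."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def w2_squared_bruteforce(xs, ys):
    """Minimise the pairing cost over all relabelings of ys (small n only)."""
    return min(matching_cost(xs, perm) for perm in itertools.permutations(ys))

def w2_squared_sorted(xs, ys):
    """On the real line the monotone (sorted) pairing is optimal."""
    return matching_cost(sorted(xs), sorted(ys))
```

In one dimension the brute-force minimum agrees with the sorted pairing, which is why a single relabeling of `ys` suffices to write the squared distance as a plain sum over matched particles.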

5.2. Regularity of the inverse of $M$

Let us define
\[
N : [0,T] \times [0,T] \times \mathbb{T}^d \times (\mathbb{T}^d)^n \longrightarrow \mathbb{T}^d, \qquad (t,s,q,x) \longmapsto X[s,\mu^x](t,q); \tag{5.15}
\]
recall that $X[s,\mu](t,\cdot)$ is the inverse of $\Sigma^1[s,\mu](t,\cdot)$. The function $N$ takes values in $\mathbb{T}^d$, so it has only one component, unlike $M = (M_1, M_2)$. Thus, $M_1$ and $N$ are related by
\[
M_1(t,s,N(t,s,q,x),x) = q, \qquad t, s \in [0,T], \ q \in \mathbb{T}^d, \ x \in (\mathbb{T}^d)^n.
\]
We are going to derive now the Lipschitz property of $X[s,\cdot](\cdot,\cdot)$ before addressing the full regularity of $\Sigma[s,\cdot](\cdot,\cdot)$. Recall estimate (4.18):
\[
\|\nabla_q X[s,\mu](t,\cdot)\|_\infty < 4(1 + \sqrt{d})^{d-1}, \qquad s, t \in [0,T], \ \mu \in \mathcal{P}(\mathbb{T}^d).
\]


Differentiating the identity $q \equiv X[s,\mu](t, \Sigma^1[s,\mu](t,q))$ with respect to $t$, we have
\[
0 = \partial_t X[s,\mu](t, \Sigma^1[s,\mu](t,q)) + \nabla_q X[s,\mu](t, \Sigma^1[s,\mu](t,q))\, \partial_t \Sigma^1[s,\mu](t,q),
\]
from which
\[
\partial_t X[s,\mu](t,q) = -\nabla_q X[s,\mu](t,q)\, v[s,\mu](t,q), \tag{5.16}
\]
at any $s, t \in [0,T]$, $q \in \mathbb{T}^d$, $\mu \in \mathcal{P}(\mathbb{T}^d)$. Therefore
\[
\|\partial_t X_t[s,\mu]\|_\infty \le \|\nabla_q X_t[s,\mu]\|_\infty \|v_t[s,\mu]\|_\infty \le 4A_1(1 + \sqrt{d})^{d-1}. \tag{5.17}
\]
For the regularity with respect to $x = (x_1, \dots, x_n) \in (\mathbb{T}^d)^n$, we use the identity $M_1(t,s,N(t,s,q,x),x) \equiv q$, which holds by definition. Taking the derivative with respect to $x_j$, $j = 1, \dots, n$, gives
\[
-\nabla_{x_j} N(t,s,q,x) = [\nabla_q M_1(t,s,N(t,s,q,x),x)]^{-1} \nabla_{x_j} M_1(t,s,N(t,s,q,x),x) = \nabla_q N(t,s,q,x)\, \nabla_{x_j} M_1(t,s,N(t,s,q,x),x).
\]
Thus, $\|\nabla_{x_j} N\|_\infty \le 4(1 + \sqrt{d})^{d-1}\frac{C}{n}$, which, increasing the value of $C$, gives
\[
\|\nabla_{x_j} N\|_\infty \le \frac{C}{n}. \tag{5.18}
\]
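Identity (5.16) is the usual formula for the time derivative of an inverse flow map. It can be checked by finite differences on a toy one-dimensional map; the map $q \mapsto q + 0.3\,t\sin q$ below is a hypothetical stand-in for $\Sigma^1[s,\mu](t,\cdot)$ chosen only because it is strictly increasing for $0 \le t \le 1$, and is not taken from the paper.

```python
import math

EPS = 0.3  # toy perturbation size; keeps d_q(flow) >= 0.7 for 0 <= t <= 1

def flow(t, q):
    """Toy stand-in for the first flow component: q -> q + EPS*t*sin(q)."""
    return q + EPS * t * math.sin(q)

def inverse_flow(t, y, iters=60):
    """Solve flow(t, q) = y by Newton's method (the map is strictly increasing)."""
    q = y
    for _ in range(iters):
        q -= (flow(t, q) - y) / (1.0 + EPS * t * math.cos(q))
    return q

def velocity(t, y):
    """Eulerian velocity: time derivative of the flow at the preimage of y."""
    return EPS * math.sin(inverse_flow(t, y))

def residual(t, y, h=1e-5):
    """Finite-difference value of d_t X + d_q X * v, which should vanish."""
    dt_X = (inverse_flow(t + h, y) - inverse_flow(t - h, y)) / (2.0 * h)
    dq_X = (inverse_flow(t, y + h) - inverse_flow(t, y - h)) / (2.0 * h)
    return dt_X + dq_X * velocity(t, y)
```

The residual is zero up to discretization error, which is the scalar analogue of (5.16) with `inverse_flow` in the role of $X$ and `velocity` in the role of $v$.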

Corollary 5.7. Let $t, t', s \in [0,T]$, $q', q \in \mathbb{T}^d$, $\mu, \nu \in \mathcal{P}(\mathbb{T}^d)$. Then there is a constant $C > 0$ such that
\[
|X[s,\nu](t',q') - X[s,\mu](t,q)| \le C(|t - t'| + |q' - q| + W(\mu,\nu)).
\]
Proof. Let $x, x' \in (\mathbb{T}^d)^n$, and $N = N(t,s,q,x)$, $N' = N(t',s,q',x')$, where $N$ is defined in (5.15). By the bounds (5.17), (4.18), (5.18), and relabeling the sequence $x_1', \dots, x_n'$ and shifting the points so that $W^2(\mu^x, \mu^{x'}) = \sum_{j=1}^n |x_j - x_j'|^2$,
\[
\begin{aligned}
|N(t',s,q',x') - N(t,s,q,x)| &\le \|\partial_t N\|_\infty |t' - t| + \|\nabla_q N\|_\infty |q' - q| + \sum_{j=1}^n \|\nabla_{x_j} N\|_\infty |x_j' - x_j| \\
&\le 4A_1(1 + \sqrt{d})^{d-1} |t' - t| + 4(1 + \sqrt{d})^{d-1} |q' - q| + \sum_{j=1}^n \frac{C}{n} |x_j' - x_j|;
\end{aligned}
\]
therefore, since $\sum_j \frac{C}{n}|x_j' - x_j| \le C(\sum_j 1/n)^{1/2}(\sum_j |x_j' - x_j|^2/n)^{1/2}$, we get, by increasing $C$,
\[
|N' - N| \le C(|t' - t| + |q - q'| + W(\mu^x, \mu^{x'})).
\]
The constant $C$ does not depend on $n$. Applying the last fact in the list of Section 2, we now extend this to the arbitrary measure case: let $\mu, \nu \in \mathcal{P}(\mathbb{T}^d)$, and $\{x(n)\}_{n=1}^\infty$, $\{x'(n)\}_{n=1}^\infty$, with $x(n), x'(n) \in (\mathbb{T}^d)^n$, sequences such that
\[
\lim_{n \to \infty} W(\mu^{x(n)}, \mu) = 0, \qquad \lim_{n \to \infty} W(\mu^{x'(n)}, \nu) = 0.
\]
Since, by definition, $N(t,s,q,x) = X[s,\mu^x](t,q)$, the latter estimate means
\[
|X[s,\mu^{x'(n)}](t',q') - X[s,\mu^{x(n)}](t,q)| \le C(|t' - t| + |q - q'| + W(\mu^{x(n)}, \mu^{x'(n)})),
\]
for every $n \in \mathbb{Z}_+$. Letting $n \to \infty$, the continuity of $X$ in all its variables finalizes the proof. □

The regularity of $\Sigma$ in all its variables simultaneously is obtained in the next paragraphs from the foregoing properties of its discretized version.

5.3. Regularity properties of $\Sigma$ and composite functions

We will follow the extension method of [24] to get the regularity of $\Sigma$ in the measure variable. The idea, roughly speaking, begins with introducing a Lipschitz extension of the derivative $\nabla_{x_j} M_1$, $j = 1, \dots, n$, that is defined at every measure $\mu$, by way of a Moreau-Yosida type of extension, which becomes closer to $n\nabla_{x_j} M_1$ the larger $n$ (the number of particles) is. When the $n$-particle ordered sets $x^n = (x_1^n, \dots, x_n^n)$ are chosen in such a way that $\delta^{x^n} \to \mu$, the extension just mentioned will reveal itself as the Wasserstein gradient in the first-order Taylor approximation derived in the preceding paragraphs; recall (2.5).

For fixed $n \in \mathbb{Z}_+$, let
\[
B := [0,T] \times [0,T] \times \mathbb{T}^d \times \{ (y_j, \mu^y) \mid y = (y_1, \dots, y_n) \in (\mathbb{T}^d)^n, \ j \in \{1, \dots, n\} \} \subset [0,T] \times [0,T] \times \mathbb{T}^d \times [(\mathbb{T}^d) \times \mathcal{P}(\mathbb{T}^d)].
\]
A typical element of $B$ is thus $(t,s,q,(y_j,\mu^y))$ where $y$ is any $n$-particle ordered set $(y_1, \dots, y_n) \in (\mathbb{T}^d)^n$ and $y_j$ is any of its component particles. If $m \in \mathbb{Z}_+$ and $f : B \to \mathbb{R}^m$ is a continuous function, let
\[
\|f\|_B := \sup\{ |f(t,s,q,(x_j,\mu^x))| : t, s \in [0,T], \ q \in \mathbb{T}^d, \ x \in (\mathbb{T}^d)^n, \ j \in \{1, \dots, n\} \}.
\]
For any continuous function $f = (f^{(1)}, \dots, f^{(m)}) : B \to \mathbb{R}^m$ such that
\[
|f(t,s,q,(y_j,\mu^y)) - f(t,s,q,(x_i,\mu^x))| \le C\Big( |x_i - y_j|_{\mathbb{T}^d} + W(\mu^x,\mu^y) + \frac{1}{n} \Big), \tag{5.19}
\]
where $t, s \in [0,T]$, $q \in \mathbb{T}^d$, $x, y \in (\mathbb{T}^d)^n$, $i, j \in \{1, \dots, n\}$, define


\[
g^{(l)}(t,s,q,z,\mu) := \inf\big\{ f^{(l)}(t,s,q,(y_j,\mu^y)) + C(|z - y_j|_{\mathbb{T}^d} + W(\mu,\mu^y)) \ \big| \ y \in (\mathbb{T}^d)^n, \ j \in \{1, \dots, n\} \big\}, \quad l = 1, \dots, m, \qquad g := (g^{(1)}, \dots, g^{(m)}),
\]
at any fixed $z \in \mathbb{T}^d$, $\mu \in \mathcal{P}(\mathbb{T}^d)$. The function $g$ is thus an extension of $f$ from $B$ to the full space $[0,T]^2 \times \mathbb{T}^d \times [\mathbb{T}^d \times \mathcal{P}(\mathbb{T}^d)]$. The following is [24, Lemma 8.10].

Proposition 5.8. Suppose that (5.19) holds, and for any $x \in (\mathbb{T}^d)^n$, $j \in \{1, \dots, n\}$, $f(\cdot,\cdot,\cdot,(x_j,\mu^x))$ is $C$-Lipschitz. Then

(i) $g$ is $\sqrt{3}C$-Lipschitz,

(ii) $\|g|_B - f\|_B \le C/n$.

As in [24], we set, for $s, t \in [0,T]$, $q \in \mathbb{T}^d$, $x = (x_1, \dots, x_n) \in (\mathbb{T}^d)^n$, $j = 1, \dots, n$,
\[
\zeta^n(t,s,q,(x_j,\mu^x)) = n\nabla_{x_j} M(t,s,q,x). \tag{5.20}
\]
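The inf-convolution formula defining $g$ before Proposition 5.8 can be illustrated in the simplest possible setting: scalar sample points on the real line with the distance $|z - y|$ in place of $|z - y_j|_{\mathbb{T}^d} + W(\mu,\mu^y)$. This is a sketch under those simplifying assumptions, with a hypothetical function name; it shows the two features used in the text, namely that $g$ is Lipschitz with the same constant and that $g$ differs from $f$ on the sample set by at most the additive slack allowed in a condition of the form (5.19).

```python
def inf_convolution_extension(samples, values, lip):
    """Return g with g(z) = min_j [values[j] + lip * |z - samples[j]|]."""
    def g(z):
        return min(v + lip * abs(z - y) for y, v in zip(samples, values))
    return g
```

If `values` is exactly `lip`-Lipschitz on `samples`, then `g` reproduces `values` there; if it is Lipschitz only up to an additive error `delta`, then `values[i] - delta <= g(samples[i]) <= values[i]`, which mirrors item (ii) of Proposition 5.8 with `delta` in the role of $C/n$.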

The periodicity of $M$ in $q$ and $x$ ensures that $\zeta^n$ is well defined on $B$.

Corollary 5.9. (Extension of $\zeta^n$) For each $n \in \mathbb{Z}_+$, there is a function
\[
\chi^n : [0,T] \times [0,T] \times \mathbb{T}^d \times \mathbb{T}^d \times \mathcal{P}(\mathbb{T}^d) \to \mathbb{R}^{d^2} \times \mathbb{R}^{d^2}
\]
such that $\chi^n|_B = \zeta^n$ and, with a larger value of $C$ than before,

(i) $\chi^n$ is $C$-Lipschitz,

(ii) $\|\chi^n|_B - \zeta^n\|_B \le \frac{C}{n}$.

Proof. We check that $f = \zeta^n$ satisfies the conditions of Proposition 5.8. The Lipschitz property in $t$ and $q$ follows from (5.13), while, in $s$, from (5.11). Hence, to obtain the Corollary, it is enough to prove that the condition (5.19) is satisfied by $f = \zeta^n$. Fix then $x, y \in (\mathbb{T}^d)^n$, $i, j \in \{1, \dots, n\}$, $s, t \in [0,T]$, $q \in \mathbb{T}^d$. Since the order in which we take the $n$ particles $x_1, \dots, x_n$, which make up $x \in (\mathbb{T}^d)^n$, does not change $M(\cdots, x)$, and $\nabla_{x_j} M(\cdots, x)$ is periodic in $x$, it can be assumed that:
\[
\sum_{k \ne i,j} |x_k - y_k|^2 \le W^2(\mu^x, \mu^y), \qquad |x_j - y_i| = |x_j - y_i|_{\mathbb{T}^d}, \qquad |x_i - y_j| = |x_i - y_j|_{\mathbb{T}^d},
\]
\[
\nabla_{x_j} M(t,s,q,y) = \nabla_{x_1} M(t,s,q,\bar y), \qquad \nabla_{x_i} M(t,s,q,x) = \nabla_{x_1} M(t,s,q,\bar x),
\]
where $\bar y$ denotes the result of shifting $y_j$ and $y_i$ to the first and second positions, respectively, in the $n$-tuple $y$, and $\bar x$ denotes the result of shifting $x_i$ and $x_j$ to the first and second positions, respectively, in the $n$-tuple $x$. Suppose, too, without loss of generality, that $i < j$. In view of these simplifications,


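\[
\begin{aligned}
|\nabla_{x_j} M(t,s,q,y) - \nabla_{x_i} M(t,s,q,x)| \le{}& \|\nabla^2_{x_1,x_1} M\|_\infty |y_j - x_i| + \|\nabla^2_{x_1,x_2} M\|_\infty |y_i - x_j| + \sum_{k=1}^{i-1} \|\nabla^2_{x_1,x_{k+2}} M\|_\infty |y_k - x_k| \\
& + \sum_{k=i+1}^{j-1} \|\nabla^2_{x_1,x_{k+1}} M\|_\infty |y_k - x_k| + \sum_{k=j+1}^{n} \|\nabla^2_{x_1,x_k} M\|_\infty |y_k - x_k|.
\end{aligned}
\]
Therefore, by the bounds of Corollary 5.5,
\[
\begin{aligned}
|\nabla_{x_j} M(t,s,q,y) - \nabla_{x_i} M(t,s,q,x)| &\le \frac{C}{n}|y_j - x_i| + \frac{C}{n^2}|y_i - x_j| + \frac{C}{n^2}\sum_{k \ne i,j} |y_k - x_k| \\
&\le \frac{C}{n}|y_j - x_i|_{\mathbb{T}^d} + \frac{C}{n^2}|y_i - x_j|_{\mathbb{T}^d} + \frac{C\sqrt{n}}{n^2}\Big( \sum_{k \ne i,j} |y_k - x_k|^2 \Big)^{1/2} \\
&\le \frac{C}{n}\Big( |y_j - x_i|_{\mathbb{T}^d} + \frac{\sqrt{d}}{2n} + W(\mu^x, \mu^y) \Big),
\end{aligned}
\]
where $\sqrt{d}/2$ is the diameter of $\mathbb{T}^d$. Thus,
\[
|n\nabla_{x_j} M(t,s,q,y) - n\nabla_{x_i} M(t,s,q,x)| \le \sqrt{d}\,C\Big( |y_j - x_i|_{\mathbb{T}^d} + W(\mu^x, \mu^y) + \frac{1}{n} \Big),
\]
which proves property (5.19) for $f = \zeta^n$, since $i$ and $j$ were arbitrary. □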

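Lemma 5.10. For every $s \in [0,T]$, the $\mathbb{T}^d \times \mathbb{R}^d$-valued map $\Sigma[s,\cdot](\cdot,\cdot)$ is differentiable on $\mathcal{P}(\mathbb{T}^d) \times [0,T] \times \mathbb{T}^d$, that is: there exists a mapping
\[
\bar\nabla_\mu \Sigma : [0,T] \times [0,T] \times \mathbb{T}^d \times \mathbb{T}^d \times \mathcal{P}(\mathbb{T}^d) \longrightarrow \mathbb{R}^{d^2} \times \mathbb{R}^{d^2}, \qquad (t,s,q,x,\mu) \longmapsto \bar\nabla_\mu \Sigma[s,\mu](t,q,x)
\]
such that, for every $s, t, t' \in [0,T]$, $q, q' \in \mathbb{T}^d$, $\mu, \nu \in \mathcal{P}(\mathbb{T}^d)$, $\gamma \in \Gamma_0(\mu,\nu)$,
\[
\begin{aligned}
\Big| \Sigma[s,\nu](t',q') - \Sigma[s,\mu](t,q) - \partial_t \Sigma[s,\mu](t,q)(t' - t) - \nabla_q \Sigma[s,\mu](t,q) \cdot (q' - q) & \\
- \int_{\mathbb{T}^d \times \mathbb{T}^d} \bar\nabla_\mu \Sigma[s,\mu](t,q,x) \cdot (y - x)\,\gamma(dx,dy) \Big| & \le C(|t' - t|^2 + |q' - q|^2 + W^2(\mu,\nu)).
\end{aligned} \tag{5.21}
\]
Moreover, the mapping $\bar\nabla_\mu \Sigma$ is Lipschitz.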


Proof. Let $\mu, \nu \in \mathcal{P}(\mathbb{T}^d)$, and let $\gamma \in \Gamma_0(\mu,\nu)$. Appealing to the last fact in the first list of the Preliminaries, there is a sequence $\{\gamma(n)\}_{n=1}^\infty$, converging narrowly to $\gamma$ in $\mathcal{P}(\mathbb{T}^d \times \mathbb{T}^d)$, such that
\[
\gamma(n) = \frac{1}{n}\sum_{j=1}^n \delta_{(x_j(n), y_j(n))},
\]
and for each $j \in \{1, \dots, n\}$, $(x_j(n), y_j(n))$ belongs to the support of $\gamma$. Due to this latter fact (see, for instance, [3, Theorem 6.1.4]), for each $n \in \mathbb{Z}_+$, the sequence $\{(x_j(n), y_j(n))\}_{j=1}^n$ is $|\cdot|_{\mathbb{T}^d}$-monotone, and, therefore,
\[
\gamma(n) \in \Gamma_0(\mu^{x(n)}, \mu^{y(n)}), \qquad n \in \mathbb{Z}_+.
\]
It is also true that
\[
\lim_{n \to \infty} W(\mu, \mu^{x(n)}) = 0, \qquad \lim_{n \to \infty} W(\nu, \mu^{y(n)}) = 0.
\]
Let $\zeta^n$ be defined as in (5.20), so that, as a consequence, for each $x(n)$ of our sequence,
\[
\zeta^n(t,s,q,(x_j(n), \mu^{x(n)})) = n\nabla_{x_j(n)} M(t,s,q,x(n)) = n\nabla_{x_j(n)} \Sigma[s,\mu^{x(n)}](t,q),
\]
$j \in \{1, \dots, n\}$. Recall the second-order estimate (5.14), now with $x = x(n)$ and $x' = y(n)$:
\[
\begin{aligned}
\Big| \Sigma[s,\mu^{y(n)}](t',q') - \Sigma[s,\mu^{x(n)}](t,q) - \partial_t \Sigma[s,\mu^{x(n)}](t,q)(t' - t) - \nabla_q \Sigma[s,\mu^{x(n)}](t,q) \cdot (q' - q) & \\
- \sum_{j=1}^n \nabla_{x_j(n)} \Sigma[s,\mu^{x(n)}](t,q) \cdot (y_j(n) - x_j(n)) \Big| & \le C(|t' - t|^2 + |q - q'|^2 + W^2(\mu^{x(n)}, \mu^{y(n)})).
\end{aligned}
\]
Since
\[
\frac{1}{n}\sum_{j=1}^n \zeta^n(t,s,q,(x_j(n), \mu^{x(n)})) \cdot (y_j(n) - x_j(n)) = \int_{\mathbb{T}^d \times \mathbb{T}^d} \zeta^n(t,s,q,(x, \mu^{x(n)})) \cdot (y - x)\,\gamma(n)(dx,dy),
\]
the latter inequality is the same as
\[
\begin{aligned}
\Big| \Sigma[s,\mu^{y(n)}](t',q') - \Sigma[s,\mu^{x(n)}](t,q) - \partial_t \Sigma[s,\mu^{x(n)}](t,q)(t' - t) - \nabla_q \Sigma[s,\mu^{x(n)}](t,q) \cdot (q' - q) & \\
- \int_{\mathbb{T}^d \times \mathbb{T}^d} \zeta^n(t,s,q,(x, \mu^{x(n)})) \cdot (y - x)\,\gamma(n)(dx,dy) \Big| & \le C(|t' - t|^2 + |q - q'|^2 + W^2(\mu^{x(n)}, \mu^{y(n)})).
\end{aligned}
\]

JID:YJDEQ AID:10108 /FLA

[m1+; v1.304; Prn:25/11/2019; 13:20] P.48 (1-68)

S. Mayorga / J. Differential Equations ••• (••••) •••–•••

48

Denote by $\chi^n$ the extension of $\zeta^n$ furnished by Corollary 5.9. The same inequality holds if we substitute $\chi^n$ for $\zeta^n$ in the previous inequality, because these functions coincide on the set $B$, which includes the support of $\gamma(n)$. But, since we will pass to the limit, we rather write
\[
\begin{aligned}
&\Big| \Sigma[s,\mu^{y(n)}](t',q') - \Sigma[s,\mu^{x(n)}](t,q) - \partial_t \Sigma[s,\mu^{x(n)}](t,q)(t'-t) - \nabla_q \Sigma[s,\mu^{x(n)}](t,q)\cdot(q'-q) - \int_{\mathbb{T}^d\times\mathbb{T}^d} \chi^n(t,s,q,(x,\mu^{x(n)}))\cdot(y-x)\,\gamma(n)(dx,dy) \Big| \\
&\quad \le C\big(|t'-t|^2 + |q'-q|^2 + W^2(\mu^{x(n)},\mu^{y(n)})\big) + \Big| \int_{\mathbb{T}^d\times\mathbb{T}^d} \big[\zeta^n(t,s,q,(x,\mu^{x(n)})) - \chi^n(t,s,q,(x,\mu^{x(n)}))\big]\cdot(y-x)\,\gamma(n)(dx,dy) \Big| \\
&\quad \le C\big(|t'-t|^2 + |q'-q|^2 + W^2(\mu^{x(n)},\mu^{y(n)})\big) + \frac{C}{n}\, W(\mu^{x(n)},\mu^{y(n)}),
\end{aligned} \tag{5.22}
\]

by Corollary 5.9(ii) and the fact that $\gamma(n) \in \Gamma_0(\mu^{x(n)},\mu^{y(n)})$. Now, by Corollary 5.9(i), each $\chi^n$, $n = 1, 2, \dots$, is $C$-Lipschitz on the bounded domain $[0,T]^2 \times (\mathbb{T}^d)^2 \times \mathcal{P}(\mathbb{T}^d)$. The functions $\chi^n$ are also pointwise uniformly bounded, because of (5.12). Thus, the sequence $\{\chi^n\}_{n=1}^\infty$ is equicontinuous and pointwise uniformly bounded, so a subsequence of it converges to a $C$-Lipschitz mapping, which we define as the mapping $\bar\nabla_\mu \Sigma$ introduced in the statement of this lemma. Passing to the limit as $n \to \infty$ in (5.22), we prove (5.21), and, in particular, that $\Sigma$ is differentiable in the $\mu$ variable. □

We will denote by $\bar\nabla_\mu \Sigma^1$ the first component (the $\mathbb{T}^d$-valued part) of $\bar\nabla_\mu \Sigma$, and by $\bar\nabla_\mu \Sigma^2$ the second component ($\mathbb{R}^d$-valued) of $\bar\nabla_\mu \Sigma$. Next we prove the analogue of Lemma 5.10 for $X = (\Sigma^1)^{-1}$.

Definition 5.11. For $t,s \in [0,T]$, $\mu \in \mathcal{P}(\mathbb{T}^d)$, $q,x \in \mathbb{T}^d$, put
\[
\bar\nabla_\mu X[s,\mu](t,q,x) := -\nabla_q X[s,\mu](t,q)\,\bar\nabla_\mu \Sigma^1[s,\mu](t,X[s,\mu](t,q),x).
\]
Before stating and proving the lemma, we recall formula (5.16):
\[
\partial_t X[s,\mu](t,q) = -\nabla_q X[s,\mu](t,q)\,\partial_t \Sigma^1[s,\mu](t,\cdot)\circ X[s,\mu](t,q).
\]
Lemma 5.12. For every $s \in [0,T]$, the $\mathbb{R}^d$-valued map $X[s,\cdot](\cdot,\cdot)$ is differentiable on $\mathcal{P}(\mathbb{T}^d) \times [0,T] \times \mathbb{T}^d$, i.e., there is a constant $C > 0$ such that for every $s,t,t' \in [0,T]$, $q,q' \in \mathbb{T}^d$, $\mu,\nu \in \mathcal{P}(\mathbb{T}^d)$, $\gamma \in \Gamma_0(\mu,\nu)$,
\[
\Big| X[s,\nu](t',q') - X[s,\mu](t,q) - \partial_t X[s,\mu](t,q)(t'-t) - \nabla_q X[s,\mu](t,q)\cdot(q'-q) - \int_{\mathbb{T}^d\times\mathbb{T}^d} \bar\nabla_\mu X[s,\mu](t,q,x)\cdot(y-x)\,\gamma(dx,dy) \Big| \le C\big(|t'-t|^2 + |q'-q|^2 + W^2(\mu,\nu)\big), \tag{5.23}
\]

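where $\partial_t X$, $\nabla_q X$, and the mapping $\bar\nabla_\mu X$ of Definition 5.11, are continuous.

Proof. The continuity of $\bar\nabla_\mu X$ is immediate from its definition, and the continuity of $\nabla_q X$ and $\partial_t X$ has been known since Lemma 4.15 and formula (5.16) respectively. Let us put
\[
\tilde q = X[s,\mu](t,q), \qquad \tilde q' = X[s,\nu](t',q'). \tag{5.24}
\]
We write out the expression on the left hand side of (5.23) and factor out $\nabla_q X[s,\mu](t,q)$, while also using the fact that $\nabla_q X[s,\mu](t,q)$ and $\nabla_q \Sigma^1[s,\mu](t,\tilde q)$ are inverses of one another:
\[
\begin{aligned}
&\Big| \nabla_q X[s,\mu](t,q) \Big\{ \nabla_q \Sigma^1[s,\mu](t,\tilde q)(\tilde q' - \tilde q) + \partial_t \Sigma^1[s,\mu](t,\tilde q)(t'-t) - \big(\Sigma^1[s,\nu](t',\tilde q') - \Sigma^1[s,\mu](t,\tilde q)\big) + \int_{\mathbb{T}^d\times\mathbb{T}^d} \bar\nabla_\mu \Sigma^1[s,\mu](t,\tilde q,x)\cdot(y-x)\,\gamma(dx,dy) \Big\} \Big| \\
&\quad = \Big| -\nabla_q X[s,\mu](t,q) \Big\{ \Sigma^1[s,\nu](t',\tilde q') - \Sigma^1[s,\mu](t,\tilde q) - \partial_t \Sigma^1[s,\mu](t,\tilde q)(t'-t) - \nabla_q \Sigma^1[s,\mu](t,\tilde q)(\tilde q' - \tilde q) - \int_{\mathbb{T}^d\times\mathbb{T}^d} \bar\nabla_\mu \Sigma^1[s,\mu](t,\tilde q,x)\cdot(y-x)\,\gamma(dx,dy) \Big\} \Big| \\
&\quad \le 4(1+\sqrt{d})^{d-1} C\big(|t'-t|^2 + |\tilde q' - \tilde q|^2 + W^2(\mu,\nu)\big) \\
&\quad = 4(1+\sqrt{d})^{d-1} C\big(|t'-t|^2 + |X[s,\nu](t',q') - X[s,\mu](t,q)|^2 + W^2(\mu,\nu)\big).
\end{aligned}
\]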

But, by Corollary 5.7, the term $|X[s,\nu](t',q') - X[s,\mu](t,q)|$ is bounded by $C(|t'-t| + |q'-q| + W(\mu,\nu))$ for some $C > 0$. Inserting this bound into the last expression, after expanding and raising the value of $C$, one obtains (5.23). □

Regularity of $\mathcal{V}$. Let us look back at the definition of $\mathcal{V}$, given by (4.20). Set now
\[
\bar\nabla_\mu \mathcal{V}[s,\mu](t,q,x) := \bar\nabla_\mu \Sigma^2[s,\mu](t,X[s,\mu](t,q),x) + \nabla_q \Sigma^2[s,\mu](t,X[s,\mu](t,q))\,\bar\nabla_\mu X[s,\mu](t,q,x), \tag{5.25}
\]
for $s,t \in [0,T]$, $q,x \in \mathbb{T}^d$, $\mu \in \mathcal{P}(\mathbb{T}^d)$.

Lemma 5.13. For every $s \in [0,T]$, the $\mathbb{R}^d$-valued map $\mathcal{V}[s,\cdot](\cdot,\cdot)$ is differentiable on $\mathcal{P}(\mathbb{T}^d) \times [0,T] \times \mathbb{T}^d$, i.e., there is a constant $C > 0$ such that for every $s,t,t' \in [0,T]$, $q,q' \in \mathbb{T}^d$, $\mu,\nu \in \mathcal{P}(\mathbb{T}^d)$, $\gamma \in \Gamma_0(\mu,\nu)$,
\[
\Big| \mathcal{V}[s,\nu](t',q') - \mathcal{V}[s,\mu](t,q) - \partial_t \mathcal{V}[s,\mu](t,q)(t'-t) - \nabla_q \mathcal{V}[s,\mu](t,q)\cdot(q'-q) - \int_{\mathbb{T}^d\times\mathbb{T}^d} \bar\nabla_\mu \mathcal{V}[s,\mu](t,q,x)\cdot(y-x)\,\gamma(dx,dy) \Big| \le C\big(|t'-t|^2 + |q'-q|^2 + W^2(\mu,\nu)\big), \tag{5.26}
\]

where the mapping $\bar\nabla_\mu \mathcal{V}$, defined by (5.25), and $\partial_t \mathcal{V}$, $\nabla_q \mathcal{V}$, are continuous.

Proof. We know that
\[
\nabla_q \mathcal{V}_t[s,\mu] = \nabla_q \Sigma^2_t[s,\mu]\,\nabla_q X_t[s,\mu], \qquad \partial_t \mathcal{V}_t[s,\mu] = \partial_t \Sigma^2_t[s,\mu] + \nabla_q \Sigma^2_t[s,\mu]\,\partial_t X_t[s,\mu].
\]
Therefore, the continuity of the functions stated in the lemma follows from that of $\nabla_q \Sigma$, $\nabla_q X$, $\partial_t \Sigma$, $\partial_t X$, and the continuity, proved above, of $\bar\nabla_\mu \Sigma$ and $\bar\nabla_\mu X$. Keeping the notation (5.24), we first write down the expression to estimate, i.e. the left hand side of (5.26), and factor out $\nabla_q \Sigma^2[s,\mu](t,\tilde q)$:
\[
\begin{aligned}
&\Big| \Sigma^2[s,\nu](t',\tilde q') - \Sigma^2[s,\mu](t,\tilde q) - \nabla_q \Sigma^2[s,\mu](t,\tilde q)\nabla_q X[s,\mu](t,q)(q'-q) - \big(\partial_t \Sigma^2[s,\mu](t,\tilde q) + \nabla_q \Sigma^2[s,\mu](t,\tilde q)\,\partial_t X[s,\mu](t,q)\big)(t'-t) \\
&\qquad - \int_{\mathbb{T}^d\times\mathbb{T}^d} \big(\bar\nabla_\mu \Sigma^2[s,\mu](t,\tilde q,x) + \nabla_q \Sigma^2[s,\mu](t,\tilde q)\,\bar\nabla_\mu X[s,\mu](t,q,x)\big)\cdot(y-x)\,\gamma(dx,dy) \Big| \\
&\quad = \Big| \Sigma^2[s,\nu](t',\tilde q') - \Sigma^2[s,\mu](t,\tilde q) - \nabla_q \Sigma^2[s,\mu](t,\tilde q)\Big( \nabla_q X[s,\mu](t,q)(q'-q) + \partial_t X[s,\mu](t,q)(t'-t) + \int_{\mathbb{T}^d\times\mathbb{T}^d} \bar\nabla_\mu X[s,\mu](t,q,x)\cdot(y-x)\,\gamma(dx,dy) \Big) \\
&\qquad - \partial_t \Sigma^2[s,\mu](t,\tilde q)(t'-t) - \int_{\mathbb{T}^d\times\mathbb{T}^d} \bar\nabla_\mu \Sigma^2[s,\mu](t,\tilde q,x)\cdot(y-x)\,\gamma(dx,dy) \Big|.
\end{aligned}
\]
Inside the large round brackets we add and subtract $\tilde q' - \tilde q = X_{t'}[s,\nu](q') - X_t[s,\mu](q)$, and apply (5.23), to obtain that the latter expression is no greater than
\[
\Big| \Sigma^2_{t'}[s,\nu](\tilde q') - \Sigma^2_t[s,\mu](\tilde q) - \nabla_q \Sigma^2_t[s,\mu](\tilde q)(\tilde q' - \tilde q) - \partial_t \Sigma^2_t[s,\mu](\tilde q)(t'-t) - \int_{\mathbb{T}^d\times\mathbb{T}^d} \bar\nabla_\mu \Sigma^2_t[s,\mu](\tilde q,x)\cdot(y-x)\,\gamma(dx,dy) \Big| + |\nabla_q \Sigma^2_t[s,\mu](\tilde q)|\, C\big(|q'-q|^2 + |t'-t|^2 + W^2(\mu,\nu)\big),
\]
which, in turn, by (5.21), is bounded above by
\[
C\big(|X[s,\nu](t',q') - X[s,\mu](t,q)|^2 + |t'-t|^2 + |q'-q|^2 + W^2(\mu,\nu)\big) + \theta A_2 C\big(|q'-q|^2 + |t'-t|^2 + W^2(\mu,\nu)\big),
\]
and, after using Corollary 5.7 again, simplifying and increasing the value of $C$, inequality (5.26) is obtained. □
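The three derivative formulas for $\mathcal{V}$ used in Lemma 5.13 can be read off from a formal chain rule. The display below is only a heuristic sketch: it uses the composition $\mathcal{V}[s,\mu](t,q) = \Sigma^2[s,\mu](t,X[s,\mu](t,q))$, which is how $\mathcal{V}$ is expanded with the notation (5.24) in the proof of Lemma 5.14, and it treats the $\mu$-derivative formally. The rigorous content is the estimate (5.26).

```latex
% Heuristic sketch; \tilde q := X[s,\mu](t,q) as in (5.24).
\begin{align*}
\nabla_q \mathcal{V}[s,\mu](t,q)
  &= \nabla_q \Sigma^2[s,\mu](t,\tilde q)\,\nabla_q X[s,\mu](t,q), \\
\partial_t \mathcal{V}[s,\mu](t,q)
  &= \partial_t \Sigma^2[s,\mu](t,\tilde q)
   + \nabla_q \Sigma^2[s,\mu](t,\tilde q)\,\partial_t X[s,\mu](t,q), \\
\bar\nabla_\mu \mathcal{V}[s,\mu](t,q,x)
  &= \bar\nabla_\mu \Sigma^2[s,\mu](t,\tilde q,x)
   + \nabla_q \Sigma^2[s,\mu](t,\tilde q)\,\bar\nabla_\mu X[s,\mu](t,q,x).
\end{align*}
```

The last line is exactly definition (5.25): the dependence on $\mu$ enters once directly through $\Sigma^2$ and once through the inner point $X[s,\mu](t,q)$.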

Regularity of $H(q,\mathcal{V})$. We finish this section by following the previous results with the regularity of what turned out to be the second function that appears in the MFG equation (1.2a), due to (4.25). Set
\[
\bar\nabla_\mu H(q,\mathcal{V}[s,\mu](t,q))(x) := \nabla_p H(q,\mathcal{V}[s,\mu](t,q))\,\bar\nabla_\mu \mathcal{V}[s,\mu](t,q,x), \tag{5.27}
\]
for $s,t \in [0,T]$, $q,x \in \mathbb{T}^d$, $\mu \in \mathcal{P}(\mathbb{T}^d)$.

Lemma 5.14. For every $s \in [0,T]$, the $\mathbb{R}$-valued map $H(\cdot,\mathcal{V}[s,\cdot](\cdot,\cdot))$ is differentiable on $\mathcal{P}(\mathbb{T}^d) \times [0,T] \times \mathbb{T}^d$, and there is a constant $C > 0$ such that for every $s,t,t' \in [0,T]$, $q,q' \in \mathbb{T}^d$, $\mu,\nu \in \mathcal{P}(\mathbb{T}^d)$, $\gamma \in \Gamma_0(\mu,\nu)$,
\[
\Big| H(q',\mathcal{V}[s,\nu](t',q')) - H(q,\mathcal{V}[s,\mu](t,q)) - (\partial_t)\big(H(q,\mathcal{V}[s,\mu](t,q))\big)(t'-t) - (\nabla_q)\big(H(q,\mathcal{V}[s,\mu](t,q))\big)\cdot(q'-q) - \int_{\mathbb{T}^d\times\mathbb{T}^d} \bar\nabla_\mu H(q,\mathcal{V}[s,\mu](t,q))(x)\cdot(y-x)\,\gamma(dx,dy) \Big| \le C\big(|t'-t|^2 + |q'-q|^2 + W^2(\mu,\nu)\big), \tag{5.28}
\]

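where the mapping $\bar\nabla_\mu H$, defined by (5.27), is continuous.

Proof. Let us abbreviate $\mathcal{V} = \mathcal{V}_t[s,\mu](q)$ and $\mathcal{V}' = \mathcal{V}_{t'}[s,\nu](q')$. Since
\[
(\partial_t)(H(q,\mathcal{V})) = \nabla_p H(q,\mathcal{V})\,\partial_t \mathcal{V}, \qquad (\nabla_q)(H(q,\mathcal{V})) = \nabla_q H(q,\mathcal{V}) + \nabla_p H(q,\mathcal{V})\,\nabla_q \mathcal{V},
\]
the left hand side of (5.28) is, after factoring out $\nabla_p H(q,\mathcal{V})$,
\[
\begin{aligned}
&\Big| H(q',\mathcal{V}') - H(q,\mathcal{V}) - \nabla_p H(q,\mathcal{V})\Big( -(\mathcal{V}' - \mathcal{V}) + \nabla_q \mathcal{V}(q'-q) + \partial_t \mathcal{V}(t'-t) + \int_{\mathbb{T}^d\times\mathbb{T}^d} \bar\nabla_\mu \mathcal{V}_t[s,\mu](q)(x)\cdot(y-x)\,\gamma(dx,dy) \Big) - \nabla_q H(q,\mathcal{V})(q'-q) - \nabla_p H(q,\mathcal{V})(\mathcal{V}' - \mathcal{V}) \Big| \\
&\quad \le \big| H(q',\mathcal{V}') - H(q,\mathcal{V}) - \nabla_q H(q,\mathcal{V})(q'-q) - \nabla_p H(q,\mathcal{V})(\mathcal{V}' - \mathcal{V}) \big| + |\nabla_p H(q,\mathcal{V})|\, C\big(|t'-t|^2 + |q'-q|^2 + W^2(\mu,\nu)\big).
\end{aligned}
\]
Remember now that $|\Sigma^2| \le \theta B$ (see Corollary 4.7(ii)) at any $t, q, s, \mu$; recall Definition 4.10. Therefore, the right-hand side of this inequality is bounded by
\[
h(\theta B)\big(|q'-q|^2 + |\mathcal{V}' - \mathcal{V}|^2\big) + l(\theta B)\, C\big(|t'-t|^2 + |q'-q|^2 + W^2(\mu,\nu)\big).
\]
To deal with the term $|\mathcal{V}' - \mathcal{V}|^2$, note that Corollary 5.7 is also valid for $\Sigma$ in place of $X$, following a similar argument. With the notation (5.24),
\[
|\mathcal{V}' - \mathcal{V}| = |\Sigma^2[s,\nu](t',\tilde q') - \Sigma^2[s,\mu](t,\tilde q)| \le C\big(|t'-t| + |\tilde q' - \tilde q| + W(\mu,\nu)\big).
\]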


Applying Corollary 5.7, and raising the value of $C$, we get $|\mathcal{V}' - \mathcal{V}| \le C(|t'-t| + |q'-q| + W(\mu,\nu))$. Substituting this into the bounding expression, and simplifying, one arrives at (5.28), for some larger value of $C$. The continuity of $\bar\nabla_\mu H$ in all its variables is clear from the definition. □

6. Solution to the master equation

Let us recall the definition of the function $u$ from Section 4.3:
\[
u(s,q,\mu) = g\big(q,\Sigma^1[s,\mu](0,\cdot)_\#\mu\big) - \int_0^s \Big( H(q,\mathcal{V}[s,\mu](\tau,q)) + F\big(q,\Sigma^1[s,\mu](\tau,\cdot)_\#\mu\big) \Big)\,d\tau. \tag{4.37}
\]

6.1. Pathwise gradients of the couplings

For any $s,t \in [0,T)$, $q \in \mathbb{T}^d$, $\mu \in \mathcal{P}(\mathbb{T}^d)$, define
\[
N^F_t[s,\mu](q)(x) := -\nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](x))\,\nabla_q \Sigma^1_t[s,\mu](x) + \int_{\mathbb{T}^d} \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](r))\,\bar\nabla_\mu \Sigma^1_t[s,\mu](r)(x)\,\mu(dr),
\]
and
\[
N^g_t[s,\mu](q)(x) := -\nabla_\mu g(q,\sigma_t)(\Sigma^1_t[s,\mu](x))\,\nabla_q \Sigma^1[s,\mu](t,x) + \int_{\mathbb{T}^d} \nabla_\mu g(q,\sigma_t)(\Sigma^1_t[s,\mu](r))\,\bar\nabla_\mu \Sigma^1_t[s,\mu](r)(x)\,\mu(dr).
\]
Likewise,
\[
(\partial_t)\big(F(q,\Sigma^1_t[s,\mu]_\#\mu)\big) := \int_{\mathbb{T}^d} \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](x))\cdot\partial_t \Sigma^1[s,\mu](t,x)\,\mu(dx),
\]
with an analogous definition for $(\partial_t)(g(q,\Sigma^1_t[s,\mu]_\#\mu))$. In preparation for Lemma 6.3 below, we are going to adopt this notation: for $t,t' \in (0,T)$, $q,q' \in \mathbb{T}^d$, $\mu,\nu \in \mathcal{P}(\mathbb{T}^d)$, $0 \le \tau \le 1$, and $\gamma \in \Gamma(\mu,\nu)$,
\[
\begin{cases}
t^\tau = (1-\tau)t + \tau t', \quad q^\tau = (1-\tau)q + \tau q', \quad \mu^\tau = ((1-\tau)\pi^1 + \tau\pi^2)_\#\gamma, \\
\sigma_{t'} = \Sigma^1_{t'}[s,\nu]_\#\nu, \quad \sigma_t = \Sigma^1_t[s,\mu]_\#\mu, \\
\sigma^\tau = \Sigma^1_{t^\tau}[s,\mu^\tau]_\#\mu^\tau.
\end{cases} \tag{6.1}
\]
Recall that, by definition, $X = (\Sigma^1)^{-1}$. In this context, we are going to need the following:

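Proposition 6.1. Let $\mu,\nu \in \mathcal{P}(\mathbb{T}^d)$, $q,q' \in \mathbb{T}^d$, $t,t' \in (0,T)$, and let the notation (6.1) be in force. Let $w^\tau$ be the velocity vector field of the geodesic $\mu^\tau$.
(i) The vector field
\[
v^\tau(y) = (t'-t)\,v_{t^\tau}[s,\mu^\tau](y) + \int_{\mathbb{T}^d} \bar\nabla_\mu \Sigma^1_{t^\tau}[s,\mu^\tau](X_{t^\tau}[s,\mu^\tau](y))(x)\,w^\tau(x)\,\mu^\tau(dx) - \nabla_q \Sigma^1_{t^\tau}[s,\mu^\tau](X_{t^\tau}[s,\mu^\tau](y))\,w^\tau(X_{t^\tau}[s,\mu^\tau](y)), \qquad y \in \mathbb{T}^d, \tag{6.2}
\]
is a velocity vector field for the path $\sigma^\tau$.
(ii) $\|v^\tau\|_{L^2(\sigma^\tau)} \le C(|t'-t| + W(\mu,\nu))$, $0 \le \tau \le 1$, for some constant $C$.

Proof. (i) Fix any $\gamma \in \Gamma_0(\mu,\nu)$ and for $0 \le \tau \le 1$, put $\mu^\tau$ as in (6.1). By definition, $v^\tau$ must satisfy
\[
\frac{d}{d\tau} \int_{\mathbb{T}^d} \varphi(y)\,\sigma^\tau(dy) = \int_{\mathbb{T}^d} \nabla\varphi(y)\cdot v^\tau(y)\,\sigma^\tau(dy) \tag{6.3}
\]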

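for every $\varphi \in C^\infty(\mathbb{T}^d)$ and for $\mathcal{L}^1$-a.e. $\tau \in (0,1)$. Computing the derivative on the left hand side,
\[
\frac{d}{d\tau} \int_{\mathbb{T}^d} \varphi(\Sigma^1[s,\mu^\tau](t^\tau,y))\,\mu^\tau(dy) = \int_{\mathbb{T}^d} \frac{d}{d\tau}\,\varphi(\Sigma^1[s,\mu^\tau](t^\tau,y))\,\mu^\tau(dy) - \int_{\mathbb{T}^d} (\nabla_y)\big[\varphi(\Sigma^1[s,\mu^\tau](t^\tau,y))\big]\,w^\tau(y)\,\mu^\tau(dy),
\]
where $w^\tau$ is the velocity vector field of the geodesic $\mu^\tau$, and we have again used the definition of velocity, since $\varphi \circ \Sigma^1$ is $C^\infty$ in $y$. Doing the differentiation with respect to $\tau$ and $y$, we get
\[
\begin{aligned}
\frac{d}{d\tau} \int_{\mathbb{T}^d} \varphi(y)\,\sigma^\tau(dy) ={}& \int_{\mathbb{T}^d} \nabla\varphi(\Sigma^1[s,\mu^\tau](t^\tau,y))\cdot\Big\{ \partial_t \Sigma^1[s,\mu^\tau](t^\tau,y)(t'-t) + \int_{\mathbb{T}^d} \bar\nabla_\mu \Sigma^1[s,\mu^\tau](t^\tau,y)(x)\,w^\tau(x)\,\mu^\tau(dx) \Big\}\,\mu^\tau(dy) \\
& - \int_{\mathbb{T}^d} \nabla\varphi(\Sigma^1[s,\mu^\tau](t^\tau,y))\cdot\big[\nabla_q \Sigma^1[s,\mu^\tau](t^\tau,y)\,w^\tau(y)\big]\,\mu^\tau(dy).
\end{aligned}
\]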

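Since $\Sigma^1_{t^\tau}[s,\mu^\tau]_\#\mu^\tau = \sigma^\tau$ by definition, we obtain (6.2) after writing the latter expression as an integral with respect to $\sigma^\tau$ and comparing against the right hand side of (6.3).
(ii) This follows by Remark 2.1(i) and the boundedness of $\partial_t \Sigma^1$, $\nabla_q \Sigma^1$, $\bar\nabla_\mu \Sigma^1$. □

Remark 6.2. It follows that, evaluated at $\Sigma^1_{t^\tau}[s,\mu^\tau](y)$, $v^\tau$ has the simpler expression:
\[
v^\tau(\Sigma^1_{t^\tau}[s,\mu^\tau](y)) = (t'-t)\,\partial_t \Sigma^1_{t^\tau}[s,\mu^\tau](y) + \int_{\mathbb{T}^d} \bar\nabla_\mu \Sigma^1_{t^\tau}[s,\mu^\tau](y)(x)\,w^\tau(x)\,\mu^\tau(dx) - \nabla_q \Sigma^1_{t^\tau}[s,\mu^\tau](y)\,w^\tau(y), \qquad y \in \mathbb{T}^d.
\]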
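For the reader's convenience, the substitution behind Remark 6.2 can be spelled out as follows. This is a sketch that uses only $X = (\Sigma^1)^{-1}$ and the relation $v_t = \partial_t \Sigma^1_t[s,\mu] \circ X_t[s,\mu]$ recalled in the proof of Theorem 6.7; the shorthand $z$ is introduced here for illustration.

```latex
% Shorthand (not in the paper): z := \Sigma^1_{t^\tau}[s,\mu^\tau](y), so that X_{t^\tau}[s,\mu^\tau](z) = y.
\begin{align*}
v_{t^\tau}[s,\mu^\tau](z)
  &= \partial_t \Sigma^1_{t^\tau}[s,\mu^\tau]\big(X_{t^\tau}[s,\mu^\tau](z)\big)
   = \partial_t \Sigma^1_{t^\tau}[s,\mu^\tau](y), \\
\bar\nabla_\mu \Sigma^1_{t^\tau}[s,\mu^\tau]\big(X_{t^\tau}[s,\mu^\tau](z)\big)(x)
  &= \bar\nabla_\mu \Sigma^1_{t^\tau}[s,\mu^\tau](y)(x), \\
\nabla_q \Sigma^1_{t^\tau}[s,\mu^\tau]\big(X_{t^\tau}[s,\mu^\tau](z)\big)\,
 w^\tau\big(X_{t^\tau}[s,\mu^\tau](z)\big)
  &= \nabla_q \Sigma^1_{t^\tau}[s,\mu^\tau](y)\,w^\tau(y).
\end{align*}
```

Inserting these three identities into (6.2), evaluated at $z$, gives the expression of Remark 6.2 term by term.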



Lemma 6.3. Let $t,t' \in (0,T)$, $\mu,\nu \in \mathcal{P}(\mathbb{T}^d)$, $q,q' \in \mathbb{T}^d$ be arbitrary, and put $\sigma_t = \Sigma^1_t[s,\mu]_\#\mu$, $\sigma_{t'} = \Sigma^1_{t'}[s,\nu]_\#\nu$. Then there exists a constant $C$ such that
\[
\Big| F(q',\sigma_{t'}) - F(q,\sigma_t) - \nabla_q F(q,\sigma_t)\cdot(q'-q) - \int_{\mathbb{T}^d} \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](x))\cdot\partial_t \Sigma^1[s,\mu](t,x)\,\mu(dx)\,(t'-t) - \int_{\mathbb{T}^d\times\mathbb{T}^d} N^F_t[s,\mu](q)(x)\cdot(y-x)\,\gamma(dx,dy) \Big| \le C\big(|t'-t|^2 + |q'-q|^2 + W^2(\mu,\nu)\big),
\]
for any $\gamma \in \Gamma_0(\mu,\nu)$. The functions $N^F_t[s,\mu](q)(x)$ and $(\partial_t)(F(q,\Sigma^1_t[s,\mu]_\#\mu))$ are continuous in all their variables. Naturally, the same result holds for $g$ in place of $F$.

Proof. The assertion about continuity follows immediately from the continuity of the functions that enter in the definition of $N^F_t[s,\mu](q)(x)$ and $(\partial_t)(F(q,\Sigma^1_t[s,\mu]_\#\mu))$. Let $\gamma \in \Gamma_0(\mu,\nu)$, and define $\mu^\tau$, $0 \le \tau \le 1$, as in Remark 2.1. Let the notation (6.1) be in effect, so that $\tau \mapsto \sigma^\tau$ is a continuous path joining $\sigma_t$ with $\sigma_{t'}$ and Proposition 6.1 holds, with $w^\tau$ as defined therein. Denote by $E$ the expression inside the bars on the left-hand side of the inequality of the lemma.

Step 1. Claim 1.
\[
\int_{\mathbb{T}^d} \nabla_\mu F(q,\sigma_t)(x)\cdot v^0(x)\,\sigma_t(dx) = \int_{\mathbb{T}^d} \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](x))\cdot\partial_t \Sigma^1_t[s,\mu](x)\,\mu(dx)\,(t'-t) + \int_{\mathbb{T}^d\times\mathbb{T}^d} N^F_t[s,\mu](q)(x)\cdot(y-x)\,\gamma(dx,dy).
\]

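Proof of Claim 1. Using Remark 6.2 with $\tau = 0$,
\[
\begin{aligned}
\int_{\mathbb{T}^d} \nabla_\mu F(q,\sigma_t)(x)\cdot v^0(x)\,\sigma_t(dx) ={}& \int_{\mathbb{T}^d} \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](x))\cdot\partial_t \Sigma^1_t[s,\mu](x)\,\mu(dx)\,(t'-t) \\
& + \int_{\mathbb{T}^d} \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](x))\cdot\Big( -\nabla_q \Sigma^1_t[s,\mu](x)\,w^0(x) + \int_{\mathbb{T}^d} \bar\nabla_\mu \Sigma^1_t[s,\mu](x)(b)\,w^0(b)\,\mu(db) \Big)\,\mu(dx).
\end{aligned} \tag{6.4}
\]
Note that
\[
\begin{aligned}
\int_{\mathbb{T}^d} \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](x))\cdot\Big( \int_{\mathbb{T}^d} \bar\nabla_\mu \Sigma^1_t[s,\mu](x)(b)\,w^0(b)\,\mu(db) \Big)\,\mu(dx)
&= \int_{\mathbb{T}^d}\int_{\mathbb{T}^d} \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](x))\cdot\bar\nabla_\mu \Sigma^1_t[s,\mu](x)(b)\,\mu(dx)\,w^0(b)\,\mu(db) \\
&= \int_{\mathbb{T}^d}\int_{\mathbb{T}^d} \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](r))\cdot\bar\nabla_\mu \Sigma^1_t[s,\mu](r)(x)\,\mu(dr)\,w^0(x)\,\mu(dx).
\end{aligned}
\]
Substituting this identity into (6.4), we see that
\[
\int_{\mathbb{T}^d} \nabla_\mu F(q,\sigma_t)(x)\cdot v^0(x)\,\sigma_t(dx) = \int_{\mathbb{T}^d} \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](x))\cdot\partial_t \Sigma^1_t[s,\mu](x)\,\mu(dx)\,(t'-t) + \int_{\mathbb{T}^d} N^F_t[s,\mu](q)(x)\cdot w^0(x)\,\mu(dx).
\]
Finally, we use Remark 2.1(ii) for the last integral in the latter expression, and the claim is proved. □

Claim 2.
\[
E = F(q',\sigma_{t'}) - F(q,\sigma_t) - \nabla_q F(q,\sigma_t)\cdot(q'-q) - \int_{\mathbb{T}^d} \nabla_\mu F(q,\sigma_t)(x)\cdot v^0(x)\,\sigma_t(dx).
\]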

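Proof of Claim 2. This follows immediately from Claim 1 and the definition of $E$.

Thus, the lemma will be proved if we show that
\[
\Big| F(q',\sigma_{t'}) - F(q,\sigma_t) - \nabla_q F(q,\sigma_t)\cdot(q'-q) - \int_{\mathbb{T}^d} \nabla_\mu F(q,\sigma_t)(x)\cdot v^0(x)\,\sigma_t(dx) \Big| \le C\big(|t'-t|^2 + |q'-q|^2 + W^2(\mu,\nu)\big) \tag{6.5}
\]
for some constant $C$.

Step 2. Before we set out to prove (6.5), we transform the expression for $E$ some more. By the chain rule, we have that
\[
F(q',\sigma_{t'}) - F(q,\sigma_t) = \int_0^1 \Big\{ \nabla_q F(q^\tau,\sigma^\tau)\cdot(q'-q) + \int_{\mathbb{T}^d} \nabla_\mu F(q^\tau,\sigma^\tau)(x)\cdot v^\tau(x)\,\sigma^\tau(dx) \Big\}\,d\tau,
\]
so
\[
E = \int_0^1 \Big\{ \big[\nabla_q F(q^\tau,\sigma^\tau) - \nabla_q F(q,\sigma_t)\big]\cdot(q'-q) + \int_{\mathbb{T}^d} \nabla_\mu F(q^\tau,\sigma^\tau)(x)\cdot v^\tau(x)\,\sigma^\tau(dx) - \int_{\mathbb{T}^d} \nabla_\mu F(q,\sigma_t)(x)\cdot v^0(x)\,\sigma_t(dx) \Big\}\,d\tau.
\]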

With our knowledge of $v^\tau$, that is, from Remark 6.2, we may rewrite $E$ as
\[
\begin{aligned}
E = \int_0^1 \Big\{ & \big(\nabla_q F(q^\tau,\sigma^\tau) - \nabla_q F(q,\sigma_t)\big)\cdot(q'-q) \\
& + \int_{\mathbb{T}^d} \nabla_\mu F(q^\tau,\sigma^\tau)(\Sigma^1_{t^\tau}[s,\mu^\tau](x))\cdot\Big[ \partial_t \Sigma^1_{t^\tau}[s,\mu^\tau](x)(t'-t) + \int_{\mathbb{T}^d} \bar\nabla_\mu \Sigma^1_{t^\tau}[s,\mu^\tau](x)(r)\,w^\tau(r)\,\mu^\tau(dr) - \nabla_q \Sigma^1_{t^\tau}[s,\mu^\tau](x)\,w^\tau(x) \Big]\,\mu^\tau(dx) \\
& - \int_{\mathbb{T}^d} \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](x))\cdot\Big[ \partial_t \Sigma^1_t[s,\mu](x)(t'-t) + \int_{\mathbb{T}^d} \bar\nabla_\mu \Sigma^1_t[s,\mu](x)(r)\,w^0(r)\,\mu(dr) - \nabla_q \Sigma^1_t[s,\mu](x)\,w^0(x) \Big]\,\mu(dx) \Big\}\,d\tau.
\end{aligned}
\]

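Now, let
\[
\gamma^\tau \in \Gamma(\mu,\mu^\tau), \qquad 0 < \tau < 1.
\]
Then $E$ takes the form
\[
\begin{aligned}
E ={}& \int_0^1 \big(\nabla_q F(q^\tau,\sigma^\tau) - \nabla_q F(q,\sigma_t)\big)\cdot(q'-q)\,d\tau \\
& + \int_0^1 \int_{\mathbb{T}^d\times\mathbb{T}^d} \Big\{ \nabla_\mu F(q^\tau,\sigma^\tau)(\Sigma^1_{t^\tau}[s,\mu^\tau](y))\cdot\Big[ \partial_t \Sigma^1_{t^\tau}[s,\mu^\tau](y)(t'-t) + \int_{\mathbb{T}^d} \bar\nabla_\mu \Sigma^1_{t^\tau}[s,\mu^\tau](y)(r)\,w^\tau(r)\,\mu^\tau(dr) - \nabla_q \Sigma^1_{t^\tau}[s,\mu^\tau](y)\,w^\tau(y) \Big] \\
& \qquad - \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](x))\cdot\Big[ \partial_t \Sigma^1_t[s,\mu](x)(t'-t) + \int_{\mathbb{T}^d} \bar\nabla_\mu \Sigma^1_t[s,\mu](x)(r)\,w^0(r)\,\mu(dr) - \nabla_q \Sigma^1_t[s,\mu](x)\,w^0(x) \Big] \Big\}\,\gamma^\tau(dx,dy)\,d\tau \\
=:{}& E_1 + E_2,
\end{aligned}
\]
where
\[
E_1 = \int_0^1 \big(\nabla_q F(q^\tau,\sigma^\tau) - \nabla_q F(q,\sigma_t)\big)\cdot(q'-q)\,d\tau, \qquad E_2 = E - E_1.
\]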

Step 3 (estimates). In the following, we will be making use of the boundedness of the quantities displayed in Section 2.2. With an abuse of notation, for each fixed $\tau \in (0,1]$, let $\tau(h) = h\tau$, $0 \le h \le 1$, so that $t^{\tau(\cdot)}$ has endpoints $t$, $t^\tau$, $q^{\tau(\cdot)}$ has endpoints $q$, $q^\tau$ and $\sigma^{\tau(\cdot)}$ has endpoints $\sigma_t$, $\sigma^\tau$. The velocity vector field of the path $h \mapsto \sigma^{\tau(h)}$ is $\tau v^{\tau(h)}$ (see, e.g., [3]). An argument similar to the one in the proof of Proposition 2.6, based on the continuity of $\nabla^2_{qq} F$ and $\nabla^2_{\mu q} F$, shows that
\[
\nabla_q F(q^\tau,\sigma^\tau) - \nabla_q F(q,\sigma_t) = \int_0^1 \Big\{ \tau \nabla^2_{qq} F(q^{\tau(h)},\sigma^{\tau(h)})(q'-q) + \tau \int_{\mathbb{T}^d} \nabla^2_{\mu q} F(q^{\tau(h)},\sigma^{\tau(h)})(x)\,v^{\tau(h)}(x)\,\sigma^{\tau(h)}(dx) \Big\}\,dh.
\]
Therefore, by Proposition 6.1(ii), we get
\[
|E_1| \le C|q'-q|\big(|q'-q| + |t'-t| + W(\mu,\nu)\big) \tag{6.6}
\]

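for some positive constant $C$. We break $E_2$ down as follows:
\[
E_2 = \int_0^1 \int_{\mathbb{T}^d\times\mathbb{T}^d} (B_1 + B_2 + B_3)\,\gamma^\tau(dx,dy)\,d\tau,
\]
where
\[
B_1 := \big[ \nabla_\mu F(q^\tau,\sigma^\tau)(\Sigma^1_{t^\tau}[s,\mu^\tau](y)) - \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](y)) \big]\cdot\Big[ \partial_t \Sigma^1_{t^\tau}[s,\mu^\tau](y)(t'-t) - \nabla_q \Sigma^1_{t^\tau}[s,\mu^\tau](y)\,w^\tau(y) + \int_{\mathbb{T}^d} \bar\nabla_\mu \Sigma^1_{t^\tau}[s,\mu^\tau](y)(r)\,w^\tau(r)\,\mu^\tau(dr) \Big],
\]
\[
\begin{aligned}
B_2 := \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](y))\cdot\Big[ & \partial_t \Sigma^1_{t^\tau}[s,\mu^\tau](y)(t'-t) - \nabla_q \Sigma^1_{t^\tau}[s,\mu^\tau](y)\,w^\tau(y) + \int_{\mathbb{T}^d} \bar\nabla_\mu \Sigma^1_{t^\tau}[s,\mu^\tau](y)(r)\,w^\tau(r)\,\mu^\tau(dr) \\
& - \partial_t \Sigma^1_t[s,\mu](x)(t'-t) + \nabla_q \Sigma^1_t[s,\mu](x)\,w^0(x) - \int_{\mathbb{T}^d} \bar\nabla_\mu \Sigma^1_t[s,\mu](x)(r)\,w^0(r)\,\mu(dr) \Big],
\end{aligned}
\]
and
\[
B_3 := \big[ \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](y)) - \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](x)) \big]\cdot\Big[ \partial_t \Sigma^1_t[s,\mu](x)(t'-t) - \nabla_q \Sigma^1_t[s,\mu](x)\,w^0(x) + \int_{\mathbb{T}^d} \bar\nabla_\mu \Sigma^1_t[s,\mu](x)(r)\,w^0(r)\,\mu(dr) \Big].
\]
To estimate $\int_0^1 \int_{(\mathbb{T}^d)^2} B_1\,\gamma^\tau\,d\tau$, we address the first square bracket in the definition of $B_1$. With $\tau w^{\tau(h)}$ being the velocity vector field of the path $h \mapsto \mu^{\tau(h)}$, we have
\[
\begin{aligned}
& \nabla_\mu F(q^\tau,\sigma^\tau)(\Sigma^1_{t^\tau}[s,\mu^\tau](y)) - \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](y)) \\
&\quad = \int_0^1 \Big\{ \tau \nabla^2_{q\mu} F(q^{\tau(h)},\sigma^{\tau(h)})(\Sigma^1_{t^{\tau(h)}}[s,\mu^{\tau(h)}](y))(q'-q) + \tau \int_{\mathbb{T}^d} \nabla^2_{\mu\mu} F(q^{\tau(h)},\sigma^{\tau(h)})(\Sigma^1_{t^{\tau(h)}}[s,\mu^{\tau(h)}](y))(b)\,v^{\tau(h)}(b)\,\sigma^{\tau(h)}(db) \\
&\qquad + \nabla^2_{x\mu} F(q^{\tau(h)},\sigma^{\tau(h)})(\Sigma^1_{t^{\tau(h)}}[s,\mu^{\tau(h)}](y))\Big[ \tau(t'-t)\,\partial_t \Sigma^1_{t^{\tau(h)}}[s,\mu^{\tau(h)}](y) + \tau \int_{\mathbb{T}^d} \bar\nabla_\mu \Sigma^1_{t^{\tau(h)}}[s,\mu^{\tau(h)}](y)(b)\,w^{\tau(h)}(b)\,\mu^{\tau(h)}(db) \Big] \Big\}\,dh,
\end{aligned}
\]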

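so
\[
|\nabla_\mu F(q^\tau,\sigma^\tau)(\Sigma^1_{t^\tau}[s,\mu^\tau](y)) - \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](y))| \le C|q'-q| + C\big(|t'-t| + W(\mu,\nu)\big) + C\big(|t'-t| + W(\mu,\nu)\big)
\]
and thus,
\[
|\nabla_\mu F(q^\tau,\sigma^\tau)(\Sigma^1_{t^\tau}[s,\mu^\tau](y)) - \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](y))| \le C\big(|t'-t| + |q'-q| + W(\mu,\nu)\big)
\]
for some constant $C$. Invoking the boundedness of $\partial_t \Sigma^1$, $\nabla_q \Sigma^1$, $\bar\nabla_\mu \Sigma^1$ and Remark 2.1(i), we get
\[
\Big| \int_0^1 \int_{\mathbb{T}^d\times\mathbb{T}^d} B_1\,\gamma^\tau(dx,dy)\,d\tau \Big| \le C\big(|t'-t| + |q'-q| + W(\mu,\nu)\big)\big(|t'-t| + W(\mu,\nu)\big). \tag{6.7}
\]
Recall that we denote by $\nabla^2_{x\mu} F(q,\mu)(x)$ the gradient at $x$ of the mapping $x \mapsto \nabla_\mu F(q,\mu)(x)$, and $\nabla^2_{x\mu} F(q,\mu)(x)$ is uniformly bounded, by assumption. Thus,
\[
|\nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](y)) - \nabla_\mu F(q,\sigma_t)(\Sigma^1_t[s,\mu](x))| \le \|\nabla^2_{x\mu} F(q,\sigma_t)\|_\infty\, |\Sigma^1_t[s,\mu](y) - \Sigma^1_t[s,\mu](x)| \le 2\,\|\nabla^2_{x\mu} F(q,\sigma_t)\|_\infty A_1 |y-x|,
\]
for any $x,y \in \mathbb{T}^d$. Therefore, for some constant $C$,
\[
\Big| \int_0^1 \int_{\mathbb{T}^d\times\mathbb{T}^d} B_3\,\gamma^\tau(dx,dy)\,d\tau \Big| \le C \int_0^1 \int_{\mathbb{T}^d\times\mathbb{T}^d} |x-y|\big(|t'-t| + W(\mu,\nu)\big)\,\gamma^\tau(dx,dy)\,d\tau \le C\,W(\mu,\nu)\big(|t'-t| + W(\mu,\nu)\big). \tag{6.8}
\]
Next, we are going to estimate $\int_0^1 \int_{(\mathbb{T}^d)^2} B_2\,\gamma^\tau\,d\tau$. To ease notation, let us make the abbreviations
\[
\varphi_2 = \Sigma^1_{t^\tau}[s,\mu^\tau], \qquad \varphi_1 = \Sigma^1_t[s,\mu], \qquad \psi_1 = \nabla_\mu F(q,\sigma_t)\circ\varphi_1.
\]
Then $\int_{(\mathbb{T}^d)^2} B_2\,\gamma^\tau(dx,dy)$ reads:
\[
\int_{\mathbb{T}^d\times\mathbb{T}^d} \psi_1(y)\cdot\Big[ \partial_t \varphi_2(y)(t'-t) - \nabla_q \varphi_2(y)\,w^\tau(y) + \int_{\mathbb{T}^d} \bar\nabla_\mu \varphi_2(y)(r_2)\,w^\tau(r_2)\,\mu^\tau(dr_2) - \partial_t \varphi_1(x)(t'-t) + \nabla_q \varphi_1(x)\,w^0(x) - \int_{\mathbb{T}^d} \bar\nabla_\mu \varphi_1(x)(r_1)\,w^0(r_1)\,\mu(dr_1) \Big]\,\gamma^\tau(dx,dy).
\]
Therefore,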

\[
\Big| \int_{\mathbb{T}^d\times\mathbb{T}^d} B_2\,\gamma^\tau(dx,dy) \Big| \le \|\psi_1\|_{L^\infty(\gamma^\tau)}\,\|D\|_{L^1(\gamma^\tau)},
\]
where, by applying Remark 2.1(iii),
\[
D = \big(\partial_t \varphi_2(y) - \partial_t \varphi_1(x)\big)(t'-t) - \big(\nabla_q \varphi_2(y) - \nabla_q \varphi_1(x)\big)\frac{y-x}{\tau} + \int_{\mathbb{T}^d\times\mathbb{T}^d} \big(\bar\nabla_\mu \varphi_2(y)(b) - \bar\nabla_\mu \varphi_1(x)(a)\big)\frac{b-a}{\tau}\,\gamma^\tau(da,db),
\]
and we are left with estimating the $L^1(\gamma^\tau)$-norm of $D$. We write
\[
\tau D = \tau D + (\varphi_2(y) - \varphi_1(x)) + (\varphi_1(x) - \varphi_2(y)),
\]
to apply (5.21), once with $\mu = \mu$, $\nu = \mu^\tau$, $t = t$, $t' = t^\tau$, $q = x$, $q' = y$, then with $\mu = \mu^\tau$, $\nu = \mu$, $t = t^\tau$, $t' = t$, $q = y$, $q' = x$, and obtain:
\[
|\tau D| \le 2C\big(\tau^2|t'-t|^2 + \tau^2 W^2(\mu,\nu) + |x-y|^2\big).
\]
Therefore
\[
\int_{\mathbb{T}^d\times\mathbb{T}^d} |D|\,\gamma^\tau(dx,dy) \le 2C\Big( \tau|t'-t|^2 + \tau W^2(\mu,\nu) + \frac{1}{\tau}\int_{\mathbb{T}^d\times\mathbb{T}^d} |x-y|^2\,\gamma^\tau(dx,dy) \Big) \le 2C\Big( \tau|t'-t|^2 + \tau W^2(\mu,\nu) + \frac{1}{\tau}\,\tau^2 W^2(\mu,\nu) \Big).
\]
Consequently, for some constant $C$,
\[
\int_0^1 \int_{\mathbb{T}^d\times\mathbb{T}^d} |B_2|\,\gamma^\tau(dx,dy)\,d\tau \le C\big(|t'-t|^2 + W^2(\mu,\nu)\big). \tag{6.9}
\]
Step 4. Note that all the estimates derived in the previous step, namely, (6.6), (6.7), (6.8), (6.9), are quadratic in the increments. Hence,
\[
|E| \le C\big(|t'-t|^2 + |q'-q|^2 + W^2(\mu,\nu)\big),
\]
which is (6.5), for some constant $C$. This concludes the proof. □
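Step 4 compresses an elementary computation. As a sketch, write $a = |q'-q|$, $b = |t'-t|$, $c = W(\mu,\nu)$ (shorthand introduced here only for illustration). The mixed products on the right-hand sides of (6.6) and (6.7) are then absorbed with the inequality $xy \le \tfrac12(x^2+y^2)$:

```latex
% Shorthand (not in the paper): a = |q'-q|, b = |t'-t|, c = W(\mu,\nu).
\begin{align*}
a(a+b+c) &= a^2 + ab + ac
  \le 2a^2 + \tfrac12 b^2 + \tfrac12 c^2
  \le 2\,(a^2+b^2+c^2), \\
(a+b+c)(b+c) &= ab + ac + b^2 + 2bc + c^2
  \le a^2 + \tfrac52 b^2 + \tfrac52 c^2
  \le 3\,(a^2+b^2+c^2).
\end{align*}
```

Since (6.9) is already quadratic, summing the bounds and enlarging $C$ gives $|E| \le C(a^2+b^2+c^2)$, which is the form required in (6.5).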

6.2. Gradient of $u(s,q,\cdot)$ and chain rule

We collect now the results on differentiability in $\mu$ of the functions $g$, $F$, $H(q,\mathcal{V})$ that constitute the full value function $u$, with the following definition and corollary. Define the $\mathbb{R}^d$-valued function
\[
\Upsilon[s,\mu](q,y) := N^g_0[s,\mu](q)(y) + \int_0^s \big[ \bar\nabla_\mu H(q,\mathcal{V}_t[s,\mu](q))(y) + N^F_t[s,\mu](q)(y) \big]\,dt, \tag{6.10}
\]
where $s \in [0,T]$, $q \in \mathbb{T}^d$, $y \in \mathbb{R}^d$, $\mu \in \mathcal{P}(\mathbb{T}^d)$.

Corollary 6.4. The function $\Upsilon$, just defined, is continuous on $[0,T] \times \mathbb{T}^d \times \mathbb{T}^d \times \mathcal{P}(\mathbb{T}^d)$, and $u(s,q,\cdot)$, defined by (4.37), is differentiable on $\mathcal{P}(\mathbb{T}^d)$, in the sense that there exists a constant $C$ such that
\[
\Big| u(s,q,\nu) - u(s,q,\mu) - \int_{\mathbb{T}^d\times\mathbb{T}^d} \Upsilon[s,\mu](q,y)\cdot(x-y)\,\gamma(dy,dx) \Big| \le C\,W^2(\mu,\nu)
\]
for every $\mu,\nu \in \mathcal{P}(\mathbb{T}^d)$, $\gamma \in \Gamma_0(\mu,\nu)$, $s \in [0,T]$, $q \in \mathbb{T}^d$.

Proof. The continuity of $\Upsilon$ is a consequence of the continuity of its parts, and combining Lemma 5.14 with Lemma 6.3 produces the stated estimate. □

We refer back to Section 2.4 for the definition of $T_\mu \mathcal{P}(\mathbb{T}^d)$. Since we do not know whether $\Upsilon[s,\mu](q,\cdot)$ belongs to the $L^2(\mu)$ closure of $\{\nabla\varphi \mid \varphi \in C_c^\infty(\mathbb{T}^d)\}$, we make the following definition.

Definition 6.5. Let $u = u(s,q,\mu)$ be as in (4.37), for $s \in [0,T]$, $q \in \mathbb{T}^d$, $\mu \in \mathcal{P}(\mathbb{T}^d)$, and $\Upsilon$ be as in (6.10). At every $s, q, \mu$, by $\nabla_\mu u(s,q,\mu)$ we will mean the projection of $\Upsilon[s,\mu](q,\cdot)$ onto $T_\mu \mathcal{P}(\mathbb{T}^d)$.

We need to note that the velocity vector fields $v[s,\mu](t,\cdot)$ are not necessarily elements of $T_{\sigma_t} \mathcal{P}(\mathbb{T}^d)$, even though this is true in the case $H(q,p) = \tfrac12|p|^2$ (see [24, Theorem 5.1]). This leads to the following definition.

Definition 6.6. At every $s,t \in [0,T]$, $\mu \in \mathcal{P}(\mathbb{T}^d)$, by $\bar v[s,\mu](t,\cdot)$ we will mean the projection of $v[s,\mu](t,\cdot)$ onto $T_{\sigma_t} \mathcal{P}(\mathbb{T}^d)$, where $\sigma_t = \Sigma^1_t[s,\mu](\cdot)_\#\mu$.
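Both definitions rely on orthogonal projection onto a closed subspace of $L^2(\mu)$. The two standard Hilbert-space facts that are used later can be sketched as follows; the symbol $P_\mu$ for the projection is a notation introduced here for illustration, not the paper's.

```latex
% Notation (not in the paper): P_\mu is the orthogonal projection of L^2(\mu;\mathbb{R}^d)
% onto the closed subspace T_\mu\mathcal{P}(\mathbb{T}^d); it is self-adjoint and idempotent.
\begin{align*}
\int_{\mathbb{T}^d} (P_\mu \xi)(x)\cdot w(x)\,\mu(dx)
  &= \int_{\mathbb{T}^d} \xi(x)\cdot (P_\mu w)(x)\,\mu(dx)
  && \text{for all } \xi, w \in L^2(\mu;\mathbb{R}^d), \\
\|P_\mu w - P_\mu w'\|_{L^2(\mu)}
  &\le \|w - w'\|_{L^2(\mu)}
  && \text{for all } w, w' \in L^2(\mu;\mathbb{R}^d).
\end{align*}
```

With $\xi = \Upsilon[s,\mu](q,\cdot)$, so that $P_\mu\xi = \nabla_\mu u(s,q,\mu)$, the first line is the mechanism behind identity (6.11); the second line is the non-expansiveness of the projection used for $\bar v_s - \bar v_{s'}$ at the end of the proof of Theorem 6.7.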

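Note that, if $w \in L^2(\mathbb{T}^d,\mu)$ is arbitrary and $\bar w$ is its projection onto $T_\mu \mathcal{P}(\mathbb{T}^d)$, then
\[
\int_{\mathbb{T}^d} \Upsilon[s,\mu](q,x)\cdot\bar w(x)\,\mu(dx) = \int_{\mathbb{T}^d} \nabla_\mu u(s,q,\mu)(x)\cdot w(x)\,\mu(dx), \tag{6.11}
\]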

which follows from (2.3). We are now ready to prove the third main statement of the paper:

Theorem 6.7. Let $H, F, g$ be as in Sections 2.1, 2.2, and $\Sigma$ be the unique solution to the system (4.3) obtained in Corollary 4.7(ii). Let $u = u(s,q,\mu)$ be defined as in (4.37). Then:
(i) For any $s \in [0,T]$, $\mu \in \mathcal{P}(\mathbb{T}^d)$, there exists $\sigma \in AC^2(0,T;\mathcal{P}(\mathbb{T}^d))$ such that $\sigma_s = \mu$ and the continuity equation
\[
\partial_t \sigma_t + \operatorname{div}\big(\nabla_p H(q,\nabla_q u(t,q,\sigma_t))\,\sigma_t\big) = 0 \qquad \text{in } \mathcal{D}'((0,T) \times \mathcal{P}(\mathbb{T}^d))
\]
holds;
(ii) The function $u$ is a classical solution to the master equation
\[
\begin{cases}
\partial_s u(s,q,\mu) + \displaystyle\int_{\mathbb{T}^d} \nabla_\mu u(s,q,\mu)(x)\cdot\nabla_p H(x,\nabla_q u(s,x,\mu))\,\mu(dx) + H(q,\nabla_q u(s,q,\mu)) + F(q,\mu) = 0 & \text{in } (0,T) \times \mathbb{T}^d \times \mathcal{P}(\mathbb{T}^d), \\
u(0,q,\mu) = g(q,\mu) & \text{on } \mathbb{T}^d \times \mathcal{P}(\mathbb{T}^d),
\end{cases} \tag{1.1}
\]

in the sense explained in Section 2.3. The full value function $u = u(s,q,\mu)$ is the value of the solution $U$ of the MFG system's Hamilton-Jacobi equation (1.2a) at the time ($t = s$) at which the terminal condition $\sigma_{t=s} = \mu$ is prescribed for the continuity equation (1.2b).

Proof. (i) Let $s \in [0,T]$, $\mu \in \mathcal{P}(\mathbb{T}^d)$. Set $\sigma_t := \Sigma^1_t[s,\mu]_\#\mu$. Then the statement follows from Proposition 4.17, Corollary 4.20, formula (4.26) and Lemma 4.24.
(ii) The regularity of $u$ in $q$ is the same as the regularity of $U$ in $q$, which was discussed in Lemma 4.19. Fix $0 < s < T$, $q \in \mathbb{T}^d$, $\mu \in \mathcal{P}(\mathbb{T}^d)$. As usual, $\sigma_t = \Sigma^1_t[s,\mu]_\#\mu$, and $v_t = \partial_t \Sigma^1_t[s,\mu] \circ X_t[s,\mu]$, $0 \le t \le T$. Set
\[
\hat\sigma_t := (\mathrm{id} + (t-s)v_s)_\#\mu, \qquad \bar\sigma_t := (\mathrm{id} + (t-s)\bar v_s)_\#\mu,
\]
where $\bar v_s$ is the projection of $v_s$ to $T_\mu \mathcal{P}(\mathbb{T}^d)$. Through $\sigma_{s+h} \times \hat\sigma_{s+h}$, we estimate $W(\sigma_{s+h},\hat\sigma_{s+h})$:
\[
W^2(\sigma_{s+h},\hat\sigma_{s+h}) \le \int_{\mathbb{T}^d\times\mathbb{T}^d} |x-y|^2\,(\sigma_{s+h} \times \hat\sigma_{s+h})(dx,dy) = \int_{\mathbb{T}^d} |\Sigma^1_{s+h}[s,\mu](y) - \Sigma^1_s[s,\mu](y) - h\,v_s[s,\mu](y)|^2\,\mu(dy).
\]

Note that $v_s[s,\mu](q) = \partial_t \Sigma^1_t[s,\mu](q)|_{t=s}$, since $X_s[s,\mu] = \mathrm{id}$. Therefore,
\[
W(\sigma_{s+h},\hat\sigma_{s+h}) \le |h|^2\,\|\partial^2_{tt} \Sigma^1\|^2_\infty. \tag{6.12}
\]
Let $\gamma_h := (\mathrm{id} \times (\mathrm{id} + h v_s))_\#\mu \in \Gamma(\mu,\hat\sigma_{s+h})$. Since, by definition, $\nabla_\mu u(s+h,q,\mu) \in T_\mu \mathcal{P}(\mathbb{T}^d)$, we apply Lemma 2.5 to write
\[
\Big| u(s+h,q,\hat\sigma_{s+h}) - u(s+h,q,\mu) - \int_{\mathbb{T}^d\times\mathbb{T}^d} \nabla_\mu u(s+h,q,\mu)(x)\cdot(y-x)\,\gamma_h(dx,dy) \Big| = o(\|\pi^2 - \pi^1\|_{\gamma_h}),
\]
which is the same as
\[
\Big| u(s+h,q,\hat\sigma_{s+h}) - u(s+h,q,\mu) - h \int_{\mathbb{T}^d} \nabla_\mu u(s+h,q,\mu)(x)\cdot v_s(x)\,\mu(dx) \Big| = o(|h|), \tag{6.13}
\]
because $o(\|\pi^2 - \pi^1\|_{\gamma_h}) = o(|h|)$, as can be easily checked. Recall now (6.11). Formula (6.10) shows that $\Upsilon[\cdot,\mu](q,y)$ is continuous, so there is a modulus of continuity $\omega$ such that
\[
\begin{aligned}
\int_{\mathbb{T}^d} \nabla_\mu u(s+h,q,\mu)(x)\cdot v_s(x)\,\mu(dx) &= \int_{\mathbb{T}^d} \Upsilon[s+h,\mu](q,x)\cdot\bar v_s(x)\,\mu(dx) \\
&= \int_{\mathbb{T}^d} \Upsilon[s,\mu](q,x)\cdot\bar v_s(x)\,\mu(dx) + \omega(|h|) \\
&= \int_{\mathbb{T}^d} \nabla_\mu u(s,q,\mu)(x)\cdot v_s(x)\,\mu(dx) + \omega(|h|).
\end{aligned}
\]
Therefore (6.13) is improved to
\[
\Big| u(s+h,q,\hat\sigma_{s+h}) - u(s+h,q,\mu) - h \int_{\mathbb{T}^d} \nabla_\mu u(s,q,\mu)(x)\cdot v_s(x)\,\mu(dx) \Big| = o(|h|) + |h|\,\omega(|h|). \tag{6.14}
\]
Corollary 6.4 shows that $u(s,q,\cdot)$ is $\kappa_1$-Lipschitz for some constant $\kappa_1$, because $[0,T] \times \mathbb{T}^d$ is compact. Using the bound (6.12), we then have
\[
|u(s+h,q,\hat\sigma_{s+h}) - u(s+h,q,\sigma_{s+h})| \le \kappa_1 h^2\,\|\partial^2_{tt} \Sigma^1\|^2_\infty. \tag{6.15}
\]

Invoking (4.42), we write
\[
\big| u(s+h,q,\sigma_{s+h}) - u(s,q,\mu) + h\big(H(q,\nabla_q u(s,q,\sigma_s)) + F(q,\sigma_s)\big) \big| = o(|h|). \tag{6.16}
\]
Finally, (6.14), (6.15) and (6.16) are needed to obtain:
\[
\begin{aligned}
& \Big| u(s+h,q,\mu) - u(s,q,\mu) + h \int_{\mathbb{T}^d} \nabla_\mu u(s,q,\mu)(x)\cdot v_s(x)\,\mu(dx) + h\big(H(q,\nabla_q u(s,q,\sigma_s)) + F(q,\sigma_s)\big) \Big| \\
&\quad = \Big| u(s+h,q,\mu) - u(s+h,q,\hat\sigma_{s+h}) + h \int_{\mathbb{T}^d} \nabla_\mu u(s,q,\mu)(x)\cdot v_s(x)\,\mu(dx) \\
&\qquad + u(s+h,q,\hat\sigma_{s+h}) - u(s+h,q,\sigma_{s+h}) + u(s+h,q,\sigma_{s+h}) - u(s,q,\mu) + h\big(H(q,\nabla_q u(s,q,\sigma_s)) + F(q,\sigma_s)\big) \Big| \\
&\quad \le o(|h|) + |h|\,\omega(|h|) + \kappa_1 h^2\,\|\partial^2_{tt} \Sigma^1\|^2_\infty + o(|h|) = o(|h|).
\end{aligned} \tag{6.17}
\]
We divide by $h$, remember that $v_s(x) = \nabla_p H(x,\nabla_q u(s,x,\mu))$, $\mu = \sigma_s$, and let $h \to 0$ to obtain
\[
-\partial_s u(s,q,\mu) = \int_{\mathbb{T}^d} \nabla_\mu u(s,q,\mu)(x)\cdot\nabla_p H(x,\nabla_q u(s,x,\mu))\,\mu(dx) + H(q,\nabla_q u(s,q,\mu)) + F(q,\mu).
\]
Let us check the continuity of $s \mapsto \partial_s u(s,q,\mu)$. Due to (4.41), $\nabla_q u(s,q,\mu) = \mathcal{V}[s,\mu](s,q) = \Sigma^2[s,\mu](s,q)$, which is continuous in $s$, and the continuity of $H$ and $F$ takes care of the non-integral term in the formula for $\partial_s u$. For the integral term, we use once again (6.11). Let $s' \in (0,T)$. Then
\[
\begin{aligned}
& \Big| \int_{\mathbb{T}^d} \Upsilon[s,\mu](q,x)\cdot\bar v_s(x)\,\mu(dx) - \int_{\mathbb{T}^d} \Upsilon[s',\mu](q,x)\cdot\bar v_{s'}(x)\,\mu(dx) \Big| \\
&\quad \le \Big| \int_{\mathbb{T}^d} \Upsilon[s,\mu](q,x)\cdot\bar v_s(x)\,\mu(dx) - \int_{\mathbb{T}^d} \Upsilon[s,\mu](q,x)\cdot\bar v_{s'}(x)\,\mu(dx) \Big| + \Big| \int_{\mathbb{T}^d} \Upsilon[s,\mu](q,x)\cdot\bar v_{s'}(x)\,\mu(dx) - \int_{\mathbb{T}^d} \Upsilon[s',\mu](q,x)\cdot\bar v_{s'}(x)\,\mu(dx) \Big| \\
&\quad \le \|\Upsilon[s,\mu](q,\cdot)\|_{L^2(\mu)}\,\|\bar v_s - \bar v_{s'}\|_{L^2(\mu)} + \|\Upsilon[s,\mu](q,\cdot) - \Upsilon[s',\mu](q,\cdot)\|_{L^2(\mu)}\,\|\bar v_{s'}\|_{L^2(\mu)}.
\end{aligned}
\]
By the fact that $\bar v_s$, $\bar v_{s'}$ are the projections of $v_s$, $v_{s'}$ on a subspace of $L^2(\mu)$, we know that
\[
\|\bar v_s - \bar v_{s'}\|_{L^2(\mu)} \le \|v_s - v_{s'}\|_{L^2(\mu)}.
\]
Letting $s' \to s$ we conclude the continuity. The continuity of $\partial_s u(s,\cdot,\mu)$ is treated in the same fashion, since $v[s,\mu](s,\cdot)$ is continuous. This completes the proof. □
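A degenerate case gives a quick consistency check of (1.1) against (4.37). The following sketch assumes $H \equiv 0$ and, as a further assumption not verified here, that the characteristic flow is then stationary, i.e. $\Sigma^1_t[s,\mu] = \mathrm{id}$ for all $t$; it is meant only as an illustration of how the terms balance.

```latex
% Assumptions (illustrative only): H \equiv 0 and \Sigma^1_t[s,\mu] = \mathrm{id}.
% Then (4.37) reduces to
\[
u(s,q,\mu) = g(q,\mu) - \int_0^s F(q,\mu)\,d\tau = g(q,\mu) - s\,F(q,\mu),
\]
% so that \partial_s u = -F(q,\mu) and u(0,q,\mu) = g(q,\mu). Since \nabla_p H \equiv 0,
% the transport term in (1.1) vanishes and
\[
\partial_s u(s,q,\mu)
 + \int_{\mathbb{T}^d} \nabla_\mu u(s,q,\mu)(x)\cdot\underbrace{\nabla_p H(x,\nabla_q u(s,x,\mu))}_{=\,0}\,\mu(dx)
 + \underbrace{H(q,\nabla_q u(s,q,\mu))}_{=\,0} + F(q,\mu)
 = -F(q,\mu) + F(q,\mu) = 0.
\]
```

In this case no information about $\nabla_\mu u$ is needed, which is why the example says nothing about the projection in Definition 6.5.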

Remark 6.8. We do not claim that the function $\nabla_\mu u(s,q,\cdot)$ is continuous on $\mathcal{P}(\mathbb{T}^d)$, which is true in the case [24] of $H(q,p) = \tfrac12|p|^2$. The reason is that we have had to define $\nabla_\mu u$ as the projection of a vector field (Definition 6.5) that, in general, is not in the tangent space $T_\mu \mathcal{P}(\mathbb{T}^d)$, whereas for the quadratic Hamiltonian, $\Upsilon[s,\mu](q,\cdot)$ and $\nabla_\mu u(s,q,\mu)$ are the same [37].

Acknowledgments

This research was partially supported by AFOSR MURI FA9550-18-1-0502. The work presented here would not have been possible at all without the careful, patient and diligent advice of my advisors, Professors Wilfrid Gangbo and Andrzej Święch, who taught me the main ideas and methods involved in it, and provided me with frequent and valuable feedback. This project was planned and executed during my time as a doctoral student at the Georgia Institute of Technology, to which I am deeply grateful.

Appendix A

Proof of Lemma 2.5. We begin by noting the following. Let $\mu,\nu \in \mathcal{P}(\mathbb{T}^d)$, and $\gamma,\bar\gamma \in \Gamma(\mu,\nu)$. Then, for any $\varphi \in C^2(\mathbb{T}^d)$,
\[
\Big| \int_{\mathbb{R}^d\times\mathbb{R}^d} \nabla\varphi(x)\cdot(y-x)\,(\gamma - \bar\gamma)(dx,dy) \Big| \le \|\nabla^2\varphi\|_\infty\,\frac{\|\pi^1 - \pi^2\|^2_\gamma + \|\pi^1 - \pi^2\|^2_{\bar\gamma}}{2}. \tag{A.1}
\]

Indeed, Taylor expansion gives a Borel function $r : \mathbb{T}^d \times \mathbb{T}^d \to [-1,1]$ such that
\[
\varphi(y) - \varphi(x) - \nabla\varphi(x)\cdot(y-x) = r(x,y)\,\|\nabla^2\varphi\|_\infty\,\frac{|x-y|^2}{2}.
\]
Integrating both sides of this equality over $\mathbb{R}^d \times \mathbb{R}^d$ once with respect to $\gamma$ and then with respect to $\bar\gamma$, remembering that $\gamma$ and $\bar\gamma$ have the same marginals $\mu$ and $\nu$, and subtracting one of the resulting expressions from the other, yields (A.1). Fix now $\mu \in \mathcal{P}(\mathbb{T}^d)$. Let $\nu \in \mathcal{P}(\mathbb{T}^d)$ and $\gamma \in \Gamma_0(\mu,\nu)$, $\bar\gamma \in \Gamma(\mu,\nu)$. Let $\varphi \in C^\infty(\mathbb{T}^d)$. Write
\[
e(\nu,\nabla_\mu \mathcal{W}(\mu),\gamma) = e(\nu,\nabla\varphi,\gamma) - \int_{\mathbb{R}^d\times\mathbb{R}^d} \big(\nabla_\mu \mathcal{W}(\mu)(x) - \nabla\varphi(x)\big)\cdot(y-x)\,\gamma(dx,dy),
\]
and the same expression holds with $\bar\gamma$ in place of $\gamma$. Subtracting one from the other and taking absolute value gives, using Hölder's inequality,
\[
|e(\nu,\nabla_\mu \mathcal{W}(\mu),\gamma) - e(\nu,\nabla_\mu \mathcal{W}(\mu),\bar\gamma)| \le |e(\nu,\nabla\varphi,\gamma) - e(\nu,\nabla\varphi,\bar\gamma)| + \|\nabla_\mu \mathcal{W}(\mu) - \nabla\varphi\|_{L^2(\mu)}\big(\|\pi^2 - \pi^1\|_\gamma + \|\pi^2 - \pi^1\|_{\bar\gamma}\big).
\]
Now, $\|\pi^2 - \pi^1\|_\gamma \le \|\pi^2 - \pi^1\|_{\bar\gamma}$, and, using (A.1),

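\[
|e(\nu,\nabla_\mu \mathcal{W}(\mu),\gamma) - e(\nu,\nabla_\mu \mathcal{W}(\mu),\bar\gamma)| \le \|\pi^1 - \pi^2\|_{\bar\gamma}\Big( \|\pi^1 - \pi^2\|_{\bar\gamma}\,\|\nabla^2\varphi\|_\infty + 2\,\|\nabla_\mu \mathcal{W}(\mu) - \nabla\varphi\|_{L^2(\mu)} \Big).
\]
Dividing by $\|\pi^2 - \pi^1\|_{\bar\gamma}$ and once again because $\|\pi^2 - \pi^1\|_\gamma \le \|\pi^2 - \pi^1\|_{\bar\gamma}$, we obtain
\[
\frac{e(\nu,\nabla_\mu \mathcal{W}(\mu),\bar\gamma)}{\|\pi^1 - \pi^2\|_{\bar\gamma}} \le \frac{e(\nu,\nabla_\mu \mathcal{W}(\mu),\gamma)}{\|\pi^1 - \pi^2\|_\gamma} + \|\pi^1 - \pi^2\|_{\bar\gamma}\,\|\nabla^2\varphi\|_\infty + 2\,\|\nabla_\mu \mathcal{W}(\mu) - \nabla\varphi\|_{L^2(\mu)}.
\]
This holds for any $\nu \in \mathcal{P}(\mathbb{T}^d)$, $\gamma \in \Gamma_0(\mu,\nu)$, $\bar\gamma \in \Gamma(\mu,\nu)$, $\varphi \in C^\infty(\mathbb{T}^d)$. Fix $r > 0$, and, on the right-hand side, fix $\bar\gamma \in \Gamma(\mu,\nu)$ such that $\|\pi^2 - \pi^1\|_{\bar\gamma} < r$. Take then the supremum on the left-hand side over $\nu \in \mathcal{P}(\mathbb{T}^d)$, $\bar\gamma \in \Gamma(\mu,\nu)$ such that $\|\pi^2 - \pi^1\|_{\bar\gamma} < r$, to obtain
\[
e[\nabla_\mu \mathcal{W}(\mu),r] \le \frac{e(\nu,\nabla_\mu \mathcal{W}(\mu),\gamma)}{\|\pi^1 - \pi^2\|_\gamma} + r\,\|\nabla^2\varphi\|_\infty + 2\,\|\nabla_\mu \mathcal{W}(\mu) - \nabla\varphi\|_{L^2(\mu)},
\]
holding for any $\nu \in \mathcal{P}(\mathbb{T}^d)$, $\gamma \in \Gamma_0(\mu,\nu)$, $\varphi \in C^\infty(\mathbb{T}^d)$. Taking now the supremum on the right-hand side over $\nu \in \mathcal{P}(\mathbb{T}^d)$, $\gamma \in \Gamma_0(\mu,\nu)$ such that $\|\pi^2 - \pi^1\|_\gamma < r$, and then letting $r \to 0^+$ on both sides yields
\[
\lim_{r\to 0^+} e[\nabla_\mu \mathcal{W}(\mu),r] \le \lim_{r\to 0^+} e_0[\nabla_\mu \mathcal{W}(\mu),r] + 2\,\|\nabla_\mu \mathcal{W}(\mu) - \nabla\varphi\|_{L^2(\mu)} = 2\,\|\nabla_\mu \mathcal{W}(\mu) - \nabla\varphi\|_{L^2(\mu)},
\]
by the hypothesis, for any $\varphi \in C^\infty(\mathbb{T}^d)$. By the fact that $\nabla_\mu \mathcal{W}(\mu)$ is an $L^2(\mu)$ limit of gradients of smooth periodic functions $\varphi$, the conclusion follows. □

Proof of Proposition 2.6. (i) Let us invoke Proposition 8.4.6 of [3], to say that there exists a subset $J \subset I$ whose measure equals that of $I$, such that, for every $V \in C(\mathbb{T}^d \times \mathbb{T}^d)$, $h_0 \in J$, we have
\[
\lim_{h\to 0} \int_{\mathbb{T}^d\times\mathbb{T}^d} V\Big(x,\frac{y-x}{h}\Big)\,\gamma_h(dx,dy) = \int_{\mathbb{T}^d} V(x,\bar v_{h_0}(x))\,\mu^{h_0}(dx), \tag{A.2}
\]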

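where $\bar v_{h_0}$ is the velocity vector field of minimal norm for $\mu_h$ at $h_0$, and $\{\gamma_h\}_{|h|>0}$ are optimal plans between $\mu_{h_0}$ and $\mu_{h_0+h}$. Let then $h_0 \in J$. For $h$ such that $h_0 + h \in I$, let $\gamma_h \in \Gamma_0(\mu_{h_0}, \mu_{h_0+h})$. By the twice differentiability of $V$, we have
\[
\begin{aligned}
\Big| &\nabla_\mu V(q^{h_0+h}, \mu_{h_0+h})(x^{h_0+h}) - \nabla_\mu V(q^{h_0}, \mu_{h_0})(x^{h_0}) \\
&- \nabla^2_{q\mu} V(q^{h_0}, \mu_{h_0})(x^{h_0})(q^{h_0+h} - q^{h_0}) - P_\gamma[\mu_{h_0}](q^{h_0}, x^{h_0}, x^{h_0+h}) \Big| \\
&\le o(|q^{h_0+h} - q^{h_0}|) + \Big( W(\mu_{h_0}, \mu_{h_0+h}) + |x^{h_0+h} - x^{h_0}| \Big) \Big( \rho(W(\mu_{h_0}, \mu_{h_0+h})) + \varepsilon(|x^{h_0+h} - x^{h_0}|) \Big).
\end{aligned}
\tag{A.3}
\]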


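Let $\bar v_{h_0}$ be the projection of $v_{h_0}$ onto $T_{\mu_{h_0}}\mathcal{P}(\mathbb{T}^d)$. Since $\nabla^2_{\mu\mu} V(q^{h_0}, \mu_{h_0})(x^{h_0}, \cdot) \in T_{\mu_{h_0}}\mathcal{P}(\mathbb{T}^d)$, (2.3) and (A.2) give us
\[
\lim_{h \to 0} \int_{\mathbb{T}^d \times \mathbb{T}^d} \nabla^2_{\mu\mu} V(q^{h_0}, \mu_{h_0})(x^{h_0}, r) \, \frac{b - r}{h} \, \gamma_h(dr, db) = \int_{\mathbb{T}^d} \nabla^2_{\mu\mu} V(q^{h_0}, \mu_{h_0})(x^{h_0}, r) \, v_{h_0}(r) \, \mu_{h_0}(dr).
\]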
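The appearance of $v_{h_0}$ rather than $\bar v_{h_0}$ on the right-hand side can be checked with a short computation. What follows is only a sketch, under the assumption that the projection in question is the orthogonal projection in $L^2(\mu_{h_0})$ onto the closed subspace $T_{\mu_{h_0}}\mathcal{P}(\mathbb{T}^d)$. The symbol $\xi$ is introduced here purely for illustration and stands for any element of that tangent space, for instance a row of $\nabla^2_{\mu\mu} V(q^{h_0}, \mu_{h_0})(x^{h_0}, \cdot)$.
\[
\int_{\mathbb{T}^d} \xi(r) \cdot \big( v_{h_0}(r) - \bar v_{h_0}(r) \big) \, \mu_{h_0}(dr) = 0
\quad \Longrightarrow \quad
\int_{\mathbb{T}^d} \xi(r) \cdot \bar v_{h_0}(r) \, \mu_{h_0}(dr) = \int_{\mathbb{T}^d} \xi(r) \cdot v_{h_0}(r) \, \mu_{h_0}(dr).
\]
Read this way, the limit supplied by (A.2), which is stated in terms of $\bar v_{h_0}$, may equally be written with $v_{h_0}$ whenever the integrand lies in the tangent space.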

Therefore, dividing both sides of inequality (A.3) by $h$, and passing to the limit as $h \to 0$, we obtain the desired formula.

(ii) Under those conditions, the formula for $\frac{d}{dh} \nabla_\mu V(q^h, \mu_h)(x^h)$ is continuous in $h$, and the claim follows. □

References

[1] D.M. Ambrose, Small strong solutions for time-dependent mean field games with local coupling, C. R. Acad. Sci. Paris, Ser. I 354 (2016) 589–594, https://doi.org/10.1016/j.crma.2016.02.006.
[2] D.M. Ambrose, Strong solutions for time-dependent mean field games with non-separable Hamiltonians, J. Math. Pures Appl. 113 (2018) 141–154, https://doi.org/10.1016/j.matpur.2018.03.003.
[3] L. Ambrosio, N. Gigli, G. Savaré, Gradient Flows in Metric Spaces and in the Space of Probability Measures, 2nd edition, Birkhäuser, 2008.
[4] Y. Averboukh, Minimax approach to first-order mean field games, arXiv:1312.6627v2, 2013.
[5] M. Bardi, M. Cirant, Uniqueness of solutions in mean field games with several populations and Neumann conditions, arXiv:1709.02158v1, 2017.
[6] J-D. Benamou, G. Carlier, F. Santambrogio, Variational Mean Field Games, Modeling and Simulation in Science, Engineering and Technology, vol. 1, Springer, Basel, 2017, pp. 141–171.
[7] A. Bensoussan, J. Frehse, P. Yam, Mean Field Games and Mean Field Type Control Theory, Springer, 2013.
[8] A. Bensoussan, J. Frehse, P. Yam, The master equation in mean field theory, J. Math. Pures Appl. 103 (6) (2015) 1441–1474, https://doi.org/10.1016/j.matpur.2014.11.005.
[9] A. Bensoussan, J. Frehse, P. Yam, On the interpretation of the master equation, Stoch. Process. Appl. 127 (7) (2017) 2093–2137, https://doi.org/10.1016/j.spa.2016.10.004.
[10] U. Bessi, Existence of solutions of the master equation in the smooth case, SIAM J. Math. Anal. 48 (1) (2016) 204–228, https://doi.org/10.1137/15M1018782.
[11] V. Bogachev, Measure Theory, vol. 2, Springer, 2007.
[12] P. Cardaliaguet, Notes on mean field games, https://www.ceremade.dauphine.fr/~cardaliaguet/MFG20130420.pdf, 2012.
[13] P. Cardaliaguet, Long time average of first order mean field games and weak KAM theory, Dyn. Games Appl. 3 (4) (2013) 473–488, https://doi.org/10.1007/s13235-013-0091-x.
[14] P. Cardaliaguet, P.J. Graber, Mean field games systems of first order, ESAIM Control Optim. Calc. Var. 21 (3) (2015) 690–722, https://doi.org/10.1051/cocv/2014044.
[15] P. Cardaliaguet, A. Porretta, D. Tonon, Sobolev regularity for the first order Hamilton-Jacobi equation, Calc. Var. 54 (3) (2015) 3037–3065, https://doi.org/10.1007/s00526-015-0893-3.
[16] P. Cardaliaguet, F. Delarue, J-M. Lasry, P.L. Lions, The Master Equation and the Convergence Problem in Mean Field Games, Princeton University Press, 2019.
[17] R. Carmona, F. Delarue, The master equation for large population equilibriums, in: Stochastic Analysis and Applications, in: Springer Proceedings in Mathematics & Statistics, vol. 100, Springer, Cham, 2014.
[18] R. Carmona, F. Delarue, Probabilistic Theory of Mean Field Games with Applications, Volume I, II, Springer, 2018.


[19] J-F. Chassagneaux, D. Crisan, F. Delarue, A probabilistic approach to classical solutions of the master equation for large population equilibria, arXiv:1411.3009v2, 2014.
[20] M. Cirant, D. Tonon, Time-dependent focusing mean-field games: the sub-critical case, arXiv:1704.04014v1, 2017.
[21] I. Fonseca, W. Gangbo, Degree Theory in Analysis and Its Applications, Oxford University Press, 1995.
[22] W. Gangbo, On some analytical aspects of mean field games, https://math.berkeley.edu/~wgangbo/math_278/MFG1-18.pdf, 2018.
[23] W. Gangbo, Y.T. Chow, A partial Laplacian as an infinitesimal generator on the Wasserstein space, arXiv:1710.10536v1, 2017.
[24] W. Gangbo, A. Święch, Existence of a solution to an equation arising from the theory of mean field games, J. Differ. Equ. 259 (11) (2015) 6573–6643, https://doi.org/10.1016/j.jde.2015.08.001.
[25] W. Gangbo, A. Tudorascu, Weak KAM theory on the Wasserstein torus with multidimensional underlying space, J. Differ. Equ. 67 (3) (2014) 408–463, https://doi.org/10.1002/cpa.21492.
[26] W. Gangbo, A. Tudorascu, On differentiability in the Wasserstein space and well-posedness for Hamilton-Jacobi equations, J. Math. Pures Appl. 457 (2018) 119–174, https://doi.org/10.1016/j.matpur.2018.09.003.
[27] N. Ghoussoub, Optimal ballistic transport and Hopf-Lax formulae on Wasserstein space, arXiv:1705.05951, 2017.
[28] D. Gomes, J. Saúde, Mean field games models – a brief survey, Dyn. Games Appl. 4 (2) (2013) 110–154, https://doi.org/10.1007/s13235-013-0099-2.
[29] D. Gomes, E. Pimentel, V. Voskanyan, Regularity Theory for Mean-Field Game Systems, Springer, 2016.
[30] P.J. Graber, A. Mészáros, Sobolev regularity for first order mean field games, Ann. Inst. Henri Poincaré 14 (1) (2018) 1557–1576, https://doi.org/10.1016/j.anihpc.2018.01.002.
[31] O. Guéant, J-M. Lasry, P.L. Lions, Mean field games and applications, in: Paris-Princeton Lectures on Mathematical Finance 2010, Springer, Berlin, 2010.
[32] M. Huang, P.E. Caines, R.P. Malhamé, Large-population cost-coupled LQG problems with nonuniform agents: individual-mass behavior and decentralized ε-Nash equilibria, IEEE Trans. Autom. Control 52 (9) (2007) 1560–1571, https://doi.org/10.1109/TAC.2007.904450.
[33] D. Lacker, A general characterization of the mean field limit for stochastic differential games, Probab. Theory Relat. Fields 165 (3–4) (2015) 581–648, https://doi.org/10.1007/s004.
[34] J-M. Lasry, P.L. Lions, Mean field games, Jpn. J. Math. 2 (1) (2007) 229–260, https://doi.org/10.1007/s11537-007-0657-8.
[35] H. Lavenant, F. Santambrogio, Optimal density evolution with congestion: $L^\infty$ bounds via flow interchange techniques and applications to variational mean field game, arXiv:1705.05658v1, 2017.
[36] P.L. Lions, Cours au Collège de France, http://www.college-de-france.fr.
[37] S. Mayorga, On a Classical Solution to the Master Equation of a First Order Mean Field Game, Doctoral dissertation, Georgia Institute of Technology, 2019, http://hdl.handle.net/1853/61760.
[38] W. Rudin, Principles of Mathematical Analysis, 3rd edition, McGraw-Hill, 1976.
[39] F. Santambrogio, Regularity via duality in calculus of variations and degenerate elliptic PDEs, J. Math. Anal. Appl. 457 (2) (2018) 1649–1674, https://doi.org/10.1016/j.jmaa.2017.01.030.
[40] H. Tran, A note on nonconvex mean field games, arXiv:1612.04725, 2016.