On the convergence of augmented Lagrangian method for optimal transport between nonnegative densities

J. Math. Anal. Appl. 485 (2020) 123811 Contents lists available at ScienceDirect Journal of Mathematical Analysis and Applications www.elsevier.com/...

Download PDF

719KB Sizes 0 Downloads 23 Views

Report

PDF Reader
Full Text

J. Math. Anal. Appl. 485 (2020) 123811

Contents lists available at ScienceDirect

Journal of Mathematical Analysis and Applications www.elsevier.com/locate/jmaa

On the convergence of augmented Lagrangian method for optimal transport between nonnegative densities Romain Hug a,∗ , Emmanuel Maitre b , Nicolas Papadakis c a b c

Univ. Artois, LML, 62307 Lens, France Univ. Grenoble Alpes, CNRS, Grenoble INP 1 , LJK, 38000 Grenoble, France Univ. Bordeaux, IMB, CNRS, UMR 5251, F-33400 Talence, France

a r t i c l e

i n f o

Article history: Received 27 November 2017 Available online 3 January 2020 Submitted by H. Frankowska Keywords: Optimal transport Augmented Lagrangian method Existence and uniqueness problem

a b s t r a c t The dynamical formulation of the optimal transport problem, introduced by J.D. Benamou and Y. Brenier [4], amounts to ﬁnd a time dependent space density and velocity ﬁeld minimizing a transport energy between two densities. In order to solve this problem, an algorithm has been proposed to estimate the saddle point of a Lagrangian. We study the convergence of this algorithm in the most general case where initial and ﬁnal densities may vanish on regions of the transportation domain. Under these assumptions, the main diﬃculty of our study is the proof of existence of a saddle point and of uniqueness of the density-momentum component, as it leads to deal with non-regular optimal transportation maps. For these reasons, a detailed study of the regularity properties of the velocity ﬁeld associated to an optimal transportation map is required. © 2020 Elsevier Inc. All rights reserved.

1. Introduction The optimal transport problem is generally formulated as follows: considering two sets of particles or probability measures, ﬁnd the allocation between those discrete or continuous objects while minimizing a given cost. This is referred to as optimal transport or optimal allocation. Even if these two denominations describe the same problem, they reﬂect two diﬀerent approaches. Indeed, while it was initially a problem of optimal displacement, the pioneer Gaspard Monge, acknowledging the fact that the optimal trajectory from one point to another is a straight line, reduced this problem to a simple allocation problem. The same holds for the formulation later given by Leonid Kantorovitch: the problem has also been reduced to a single allocation problem of the elements of a given resource to be transported. The trajectories are thus not involved in the transport cost, which only reﬂects the price to pay to move a mass from one point to another. * Corresponding author. 1

E-mail address: [email protected] (R. Hug). Institute of Engineering, Univ. Grenoble Alpes.

https://doi.org/10.1016/j.jmaa.2019.123811 0022-247X/© 2020 Elsevier Inc. All rights reserved.

2

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

The reduction of the optimal transport problem to an allocation problem ﬁrst makes it easier to tackle theoretically (see [20]). However, when it comes to describe more accurately the optimal allocation map, this formulation is less eﬃcient. Some approaches then choose to reintroduce the notion of displacement: this is the case with the method we deal with in this paper. The ﬁrst attempt to link the optimal allocation and optimal displacement problems has been proposed by R.J. McCann [23]. The continuous displacement between two measures ρ0 and ρ1 has been considered to determine a continuum of intermediate measures, so that the integral sum of local optimal costs (i.e. the distance of Wasserstein step by step) is equal to the global optimal allocation cost between ρ0 and ρ1 . For inﬁnitely close measures, there is indeed no diﬀerence between optimal allocation and optimal displacement. Relying on this continuous interpolation between densities ρ0 and ρ1 by densities ρt , t ∈ [0; 1], Y. Brenier and J.-D. Benamou introduced a continuous formulation of the optimal transport problem [4], originating from ﬂuid mechanics. This amounts to determine directly the evolution of the density ρ = t → ρt as well as the velocity ﬁeld v which advects the density, associated with the optimal map ∇φ (see (3)). Such a pair (ρ, v) therefore satisﬁes a continuity equation. The optimal map ∇φ between ρ0 and ρ1 can be estimated without relying on the intermediate densities ρt . Eﬃcient second-order algorithms have for instance been proposed [5], using resolutions of the timecontinuous Monge-Ampere equation induced by [23]. These direct approaches are nevertheless limited to non-vanishing densities deﬁned on convex supports. We refer to the book of Cuturi and Peyré [26] for an up-to-date contribution to the current state of research on computational optimal transport. As far as we know, the only numerical method handling the general case is based on the dynamic ﬂuid mechanic method introduced in [4]. The main interest of this dynamic approach is to exploit the physical formulation of the optimal transport problem. Being expressed in terms of ﬂuid mechanics quantities, it makes the model very ﬂexible, and allows generalizations to non balanced problems [22,9] which are relevant for practical applications. The introduction of new physical constraints (anisotropy of the domain, free divergence or rigidity of the velocity ﬁeld) in the dynamic problem has been a subject of study in [19]. The problem is reformulated as the minimization of a convex, proper, lower semi-continuous (l.s.c.) energy and can be solved using convex optimization tools [3,25]. Before going any further, let us ﬁrst introduce the dynamic formulation of the optimal transport problem proposed by Benamou and Brenier. 2. Formulation of the problem and presentation of the algorithm 2.1. Monge-Kantorovich problem in Rd for a quadratic cost We denote by | ·| the Euclidean norm on Rd , for all d ∈ N, and consider two nonnegative densities (ρ0 , ρ1 ) on Rd (d ∈ N ∗ ), with bounded supports and of same mass. The Monge-Kantorovich problem consists in ﬁnding an optimal transport plan T between ρ0 Ld and ρ1 Ld that minimizes d(x, T (x))2 ρ0 (x)dx,

(1)

Rd

where d(x, y) is a distance on Ω. We write T #(ρ0 Ld ) the push-forward of ρ0 Ld by a transport plan T . Having T #(ρ0 Ld ) = ρ1 Ld means that for any bounded subset A of Rd , A ρ0 = T −1 (A) ρ1 . The quadratic Wasserstein distance W2 (ρ0 , ρ1 ) is deﬁned by: W2 (ρ0 , ρ1 )2 =

d(x, T (x))2 ρ0 (x)dx.

inf

T #(ρ0 Ld )=ρ1 Ld Ω

(2)

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

3

In the Euclidean case, where d(x, y) = |x − y|, there exists a unique transport map T between ρ0 and ρ1 that can be written as the gradient of a lower semi-continuous convex potential φ (Brenier’s Theorem [29], p. 66) i.e. W2 (ρ0 , ρ1 )2 =

|∇φ(x) − x|2 ρ0 (x) dx =

Rd

inf

T #(ρ0 Ld )=ρ1 Ld Rd

|T (x) − x|2 ρ0 (x) dx.

(3)

In the dynamic formulation introduced by J.-D. Benamou and Y. Brenier in [4], this Monge problem is reformulated as a minimization of a kinetic energy depending on a mass ρ and a velocity ﬁeld v, such that ρ is transported by v from ρ0 to ρ1 . We now detail this new problem. 2.2. Convex and augmented Lagrangian formulation Let ρ0 , ρ1 ∈ L2 (Rd ) be two probability densities with bounded supports. The dynamic formulation consists in increasing the dimension of the problem by adding a temporal variable t ∈ [0, 1]. Formally, we look for a couple (ρ, v), where ρ denotes a nonnegative density, and v a vector ﬁeld with values in Rd , both deﬁned on (0, 1) × Ω, where Ω is a bounded open convex set of Rd containing supp(ρ0 ) and supp(ρ1 ). This couple is required to satisfy the continuity equation, ∂t ρ + div(ρv) = 0

(4)

with no ﬂux boundary conditions for ρv on ∂Ω × (0, 1), and with initial and ﬁnal conditions for ρ on Ω: ρ(0, x) = ρ0 (x), ρ(1, x) = ρ1 (x).

(5)

Among all couples (ρ, v) checking these conditions, we look for a minimizer of the kinetic energy E(ρ, v) = 1 (1/2) 0 Ω |v|2 ρ dx dt. As E is not convex and the constraint (4) is nonlinear, the authors of [4] proposed as a new variable m = ρv instead of v, and consider the transport cost: 1 K(ρ, m) =

|m(t, x)|2 dx dt, 2ρ(t, x)

(6)

0 Ω

with the corresponding continuity constraint: ∂t ρ + divx m = 0,

(7)

that are subject to no ﬂux boundary conditions on m and initial/ﬁnal conditions (5). The nonnegativity constraint on ρ turns to {ρ > 0 or (ρ, m) = (0, 0)} through the change of variable m = ρv. By introducing a Lagrange multiplier ψ to handle the linear constraints (7) together with boundary and initial/ﬁnal conditions (5), we can write a saddle-point formulation of the problem: ⎡ ⎢ inf sup ⎣

(ρ,m) ψ

(0,1)×Ω

|m| − ρ 2

(0,1)×Ω

(∂t ψρ − ∇x ψ · m) +

⎤ ⎥ (ψ(0, ·)ρ0 − ψ(1, ·)ρ1 )⎦ .

(8)

Ω

Another crucial idea in [4] is to encode the non-negativity constraint on ρ by introducing the Legendre transform of (ρ, m) → |m|2 /(2ρ):

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

4

F (a, b) = sup (ρ,m)

|m|2 ρa + m, b − 2ρ

⇔

F (a, b) =

0 if a ≤ −|b|2 /2 +∞ otherwise

with (a, b) ∈ R × Rd . Since the transport cost K is convex and l.s.c., it is equal to its biconjugate by the Legendre transform. We therefore have |m|2 /(2ρ) = sup(a,b) (ρa + m · b − F (a, b)). The problem is thus partially linearized with respect to the variables (ρ, m), while the non-linear part F reduces to the indicator function of a convex set. Considering sup-integral and inf-sup inversions, and setting q = (a, b), μ = (ρ, m) and Q = (0, 1) × Ω, the saddle point problem (8) is reformulated as inf (ψ,q) supμ L(ψ, p, μ) where L(ψ, p, μ) = χP (q) + G(ψ) + μ, ∇t,x ψ − qL2

(9)

with G(ψ) = Ω ψ(0, ·)ρ0 dx − Ω ψ(1, ·)ρ1 dx and χP = 0 if q ∈ P, χP (q) = +∞ otherwise, where the paraboloid P is deﬁned as P = {(a, b) ∈ L2 (Q) × L2 (Q)d , a ≤ −|b|2 /2}.

(10)

The augmented Lagrangian is ﬁnally given, for some parameter r > 0, by: r Lr (ψ, q, μ) = χP (q) + G(ψ) + μ, ∇t,x ψ − qL2 + ∇t,x ψ − q2L2 . 2

(11)

2.3. Benamou-Brenier algorithm To solve the saddle point problem associated to (11), the authors of [4] have proposed an algorithm based on a Uzawa method: the ADMM algorithm introduced by M. Fortin and R. Glowinski in [14]. This consists in performing the following steps, starting from (ψ n−1 , q n−1 , μn ): Step A: Find the unique ψ n such that Lr (ψ n , q n−1 , μn ) ≤ Lr (ψ, q n−1 , μn ), ∀ψ,

(12)

Step B: Find the unique q n = (an , bn ) such that Lr (ψ n , q n , μn ) ≤ Lr (ψ n , q, μn ), ∀q,

(13)

n+1

Step C: Update μ

n+1

= (ρ

,m

n+1

) as μ

n+1

= μ + r(∇t,x ψ − q ). n

n

n

(14)

More precisely, the algorithm breaks down as follows: Step A can be interpreted as a projection on gradient ﬁelds. We look for the unique ψ n ∈ H 1 (Q)/R such that: ∀h ∈ H 1 (Q), G(h) + μn , ∇t,x h + r∇t,x ψ n − q n−1 , ∇t,x h = 0.

This corresponds to ﬁnd ψ n solution of −rΔt,x ψ n = divt,x μn − rq n−1 , with boundary conditions r∂t ψ n (0, ·) = ρ0 − ρn (0, ·) + ran−1 (0, ·) and r∂t ψ n (1, ·) = ρ1 − ρn (1, ·) + ran−1 (1, ·), and no ﬂux boundary conditions on (0, 1) × ∂Ω. This is a kind of Helmholtz decomposition. Step B is an L2 orthogonal projection on P: q n = PP ((1/r)μn + ∇t,x ψ n ). Step C uses the computed gradient of step A to project onto the aﬃne space of constraints (4) and (5). 2.4. Objectives and related existing works The main object of this article is to propose a theoretical framework allowing to answer the three following points: existence of saddle points, uniqueness of saddle points and convergence of the considered algorithm. This study also leads to characterize rigorously some properties of the velocity ﬁeld associated to an optimal transport plan. A ﬁrst study of the Benamou-Brenier algorithm was carried out in [17] for periodic in space boundary condition: Ω = T d , where T d denotes the torus in dimension d. The strongest assumption in this work

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

5

concerned the density ρ∗ , solving the problem (6), that had to be larger than a positive constant. This assumption implies in particular a regularity of the associated transport plan. Under such conditions, the potential φ in (3) is indeed necessarily of class C 1 with Lipschitz gradient. Caﬀarelli studied in [7] and [8] the regularity of an optimal transport plan on a convex domain with respect to the regularity of the initial and ﬁnal densities ρ0 and ρ1 , additionally assumed to be positive. A special case is ρ0 and ρ1 strictly positive on T d and belonging to C α,l (T d ), for some l ∈ (0, 1), and α ∈ N d . Cordero and Erausquin have shown in [11] that in this case, the optimal transport potential φ is of class C α,l+2 and, for any t ∈ [0, 1], the density ρt also has a C α,l regularity on T d and is bounded from below by a strictly positive constant independent of t. Under the same assumptions, Guittet showed in [17] the existence of a solution (ρ∗ , m∗ ) for the dynamic formulation of optimal transportation; solution from which was proven the existence of a saddle point (ψ ∗ , q ∗ , μ∗ ) for the Lagrangian L. However, there was no uniqueness result for the density-momentum couple μ∗ = (ρ∗ , m∗ ). A convergence result of the Benamou-Brenier algorithm was also presented in [17]. Nevertheless, this did not explicitly give the strong or weak convergence of μn , the main component of the triplet (ψn , qn , μn ). Only the strong convergences in H 1 /R and L2 of the components ψn and qn were addressed and the proof suﬀers from an unclear argument related to Cauchy sequences. More recently, in parallel works to ours, Goldman and Otto [16] also addressed this regularity question for initial and ﬁnal densities with bounded support that could vanish. The authors nevertheless assume they are smooth and bounded from below by a positive constant on their support, making them generalized indicator functions. In this article, we consider a more general framework. The open set Ω will here be assumed to be convex and bounded, with no ﬂux boundary conditions on the momentum m, but more importantly, the density ρ will not be assumed to be bounded from below by a strictly positive constant. We will merely assume that the densities ρ0 and ρ1 are non-negative elements of L2 (Ω) and potentially without any required smoothness. We will show the existence of a saddle point for the Lagrangian L, solution of the problem (6), as well as the uniqueness among the set of saddle points (ψ ∗ , q ∗ , μ∗ ) of the Lagrangian L of μ∗ = (ρ∗ , m∗ ), which shows that the density corresponds to the McCann interpolation. The uniqueness result established in this article only concerns the component μ, since, as we will see at the beginning of the section 6, there is no uniqueness of the saddle points of L: in fact, the components ψ and q can vary outside the support of ρ. This study also leads us to a new characterization of a velocity ﬁeld inherent to an optimal transport in 2 d L . We give suﬃcient assumptions on a velocity ﬁeld v ∈ L∞ loc ([0, 1] × R ) so that a density transported by v is the McCann interpolation of an optimal (unique) transport (Theorem 7.1). These hypotheses are then reduced to the usual characteristics of optimal isotropic transport, in particular straight-line trajectories, constant speed and curl-free. 2.5. Organization of the paper In section 3 we present the diﬀerent challenges concerning the existence of the saddle points for L, as well as the uniqueness properties. In section 4, we carry out a preliminary study of the properties of the velocity ﬁeld. It gives crucial materials for the following three sections, in which we establish the existence of a saddle point (section 5), the uniqueness of the component μ = (ρ, m) (section 6), and the characterization a minima of a velocity ﬁeld which represents an optimal transport (section 7). In section 8, we ﬁnally show the weak and strong convergence of Benamou-Brenier algorithm towards a saddle point of the Lagrangian L. 3. Characterization of saddle points and main results The main objective of this ﬁrst part is to directly build a saddle point of L deﬁned in (11). Let ρ0 and ρ1 be two probability densities (i.e. non-negative and of integral equal to 1) of L2 (Rd ) with bounded supports,

6

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

and Ω be a bounded convex open set of Rd . We assume that Ω is piecewise of class C 1 and such that supp(ρ0 ) ∪ supp(ρ1 ) ⊂ Ω. In the remaining of this paper, we denote Q = (0, 1) × Ω. For all r > 0, we write 1 2 d+1 Lsp × r (ρ0 , ρ1 , Ω) the set of saddle points of Lagrangian Lr which are elements of S = H (Q)/R × L (Q) 2 d+1 L (Q) . In order to characterize saddle points, we now deﬁne the set of Properties (I) as follows. Properties (I). A triplet (ψ, q, μ) ∈ S satisﬁes Properties (I) if and only if: (I)1 ∀q ∈ P, μ, q − q ≤ 0, (I)2 ∀h ∈ H 1 (Q), G(h) + μ, ∇t,x h = 0, (I)3 ∇t,x ψ = q. Proposition 3.1. A saddle point (ψ ∗ , q ∗ , μ∗ ) ∈ S of Lr satisﬁes Properties (I), for all r ≥ 0. Sketch of the proof. We ﬁrst check that for a triplet (ψ ∗ , q ∗ , μ∗ ) ∈ S satisfying the Properties (I), we have the relation Lr (ψ, q, μ∗ ) ≥ Lr (ψ ∗ , q ∗ , μ∗ ) ≥ Lr (ψ ∗ , q ∗ , μ), for all (ψ, q, μ) ∈ S, which characterizes a saddle point of Lr . Conversely, for a saddle point (ψ ∗ , q ∗ , μ∗ ) ∈ S of Lr , we can check one by one Properties (I), ﬁrst by ﬁxing ψ = ψ ∗ and q = q ∗ for (I)3 , then q = q ∗ and μ = μ∗ for (I)2 , and ﬁnally μ = μ∗ and ψ = ψ ∗ for (I)1 . Notice that the saddle points of Lr are independent of r ≥ 0, and therefore the same as L = L0 . Setting μ = (ρ, m) and q = (a, b) ∈ P, see (10), we can reinterpret (I)1 and (I)2 . Property (I)1 means that if μ(t, x) is nonzero then it is orthogonal to the paraboloid at point q(t, x), i.e. co-linear to the vector (1, b(t, x)). Property (I)1 can thus be translated as follows: ρ ≥ 0, m = ρb, ρ(a + |b|2 /2) = 0. Next, (I)2 corresponds to the mass conservation equation satisﬁed by ρ and b (i.e. ∂t ρ + divx (ρv) = 0, for v = b) for the initial and ﬁnal densities ρ0 and ρ1 . We now recall that according to Brenier’s Theorem ([29], p. 66), there exists a convex potential φ verifying ρ1 Ld = ∇φ#(ρ0 Ld ). For the purpose of our study, we need to specify the properties of the potential φ: we deﬁne the property (Γ1 ) on the potential φ as follows: Property (Γ1 ). φ and φ∗ are convex, continuous and achieve a minimum on Rd . Here φ∗ is the Legendre transform of φ. We recall that a convex and continuous function on Rd is locally Lipschitz. We complete Brenier’s Theorem as follows. Proposition 3.2. Let ρ0 be a probability density Lebesgue-measurable on Rd and ν1 a probability measure on

Rd . There exists a potential φ : Rd → R, satisfying the property (Γ1 ), such that ∇φ# ρ0 Ld = ν1 . Sketch of the proof. One can ﬁrst show that the optimal transport potential φ given by the Brenier’s Theorem (convex, lower semicontinuous and gradient bounded almost everywhere on supp(ρ0 Ld )) is ﬁnite and with a bounded gradient on an open neighborhood of the support of ρ0 Ld . It is then possible to extend the restriction of φ to this neighborhood by φ, a ﬁnite convex function on Rd continuous, supralinear and sub-quadratic. The supralinearity of φ implies existence of a global minimum for the latter, and ensures ∗ that its Legendre transform φ is also ﬁnite (and thus continuous) on Rd . As φ is sub-quadratic, it ensures ∗ the supralinearity of φ , and therefore the existence of a global minimum. Deﬁnition 3.1. Let ρ0 be a probability density Lebesgue-measurable on Rd and ν1 a probability measure deﬁned on the Lebesgue σ-algebra in Rd . We denote by Φ(ρ0 Ld , ν1 ) the set of functions φ : Rd → R

satisfying the property (Γ1 ) such that ∇φ# ρ0 Ld = ν1 .

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

7

We can now deﬁne from φ the following quantities. Deﬁnition 3.2. For all t ∈ (0, 1), introducing the potential φt = (1 − t) |·|2 + tφ, we deﬁne: 2

1. The characteristic displacement at time t, Xφ (t, ·) = (1 − t) id +t∇x φ = ∇x φt

(15)

2. The associated velocity ﬁeld v, vφ (t, ·) =

id −∇x (φt )∗ , t

(16)

where (φt )∗ denotes the Legendre transform of the potential φt . 3. The density ρφ , deﬁned as the union for t ∈ [0, 1] of the McCann interpolation densities ρtφ between ρ0 Ld and ρ1 Ld [23]: ρφ (t, ·)Ld = ρtφ Ld = Xφ (t, ·)#(ρ0 Ld ).

(17)

In the following, when there is no ambiguity, we will write ρ, v and X instead of ρφ , vφ and Xφ in order to lighten the notation. 3.1. Reformulation of Properties (I) In the following, we set μ = (ρ, ρv) and q = (− 12 |v|2 , v). To construct a saddle point, μ and q should satisfy Properties (I). It is then suﬃcient to check that ρ and v deﬁned in (17) and (16) satisfy the following properties: Deﬁnition 3.3. For n ∈ N, 1 ≤ p ≤ +∞, a measurable function f : Q −→ Rn , and a functional space X, 1,∞ we write f ∈ X when each component of f is in X. We denote by Wloc ([0, 1] × Ω) the space of functions 1,∞ ψ∈W ((0, 1) × ω), for all bounded open set ω ⊂ Ω. Properties (II). (II)1 The density ρ is an element of L2 (Q). d (II)2 The velocity ﬁeld v is an element of L∞ loc (Q) . (II)3 The potential (ρ, v) satisﬁes the mass conservation equation in the distributions sense for the initial and ﬁnal conditions ρ0 and ρ1 and no ﬂux conditions: ∞ ∀h ∈ C (Q), (∂t h + v.∇h) ρ + h(0, x)ρ0 (x)dx − h(1, x)ρ1 (x)dx = 0 (18) Q

Ω

Ω

1,∞ (II)4 There exists ψ ∈ Wloc ([0, 1] × Ω) such that v = ∇x ψ. (II)5 The velocity ﬁeld v satisﬁes the Burgers equation in the sense of distributions:

1 ∂t v + ∇x |v|2 = 0. 2

(19)

Remark 3.1. These properties are stated in a general case: Ω is a convex open set of Rd (and Q = (0, 1) ×Ω), which is not necessarily bounded (see Section 7). However, until Section 6, Ω is assumed to be bounded, in context of Benamou-Brenier’s Algorithm.

8

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

Properties (II)4 and (II)5 describe the features of an isotropic optimal transport for a quadratic cost: (II)4 corresponds to a curl-free velocity ﬁeld (without crossing trajectories) and (II)5 is in line with the property of straight-line displacement. To build a triplet (ψ, q, μ) satisfying Properties (I), it is suﬃcient that ρ and v satisfy Properties (II). However, having a triplet (ψ, q, μ) satisfying the Properties (I) is not suﬃcient to build a density-velocity ﬁeld pair (ρ, v) checking Properties (II), and such that μ = (ρ, ρv) and q = (− 12 |v|2 , v). Indeed, the component q may not belong to the boundary of the paraboloid (a, b) → a + (1/2)|b|2 ≤ 0 outside the support of μ. Properties (II)1 and (II)2 ensure that the saddle point is in the correct space S. Indeed, we have q ∈ L∞ (Q)d+1 ⇔ v ∈ L∞ (Q)d and μ ∈ L2 (Q)d+1 ⇔ ρ ∈ L2 (Q) and ρv ∈ L2 (Q)d , and, for the potential ψ, we have W 1,∞ (Q) ⊂ H 1 (Q). The properties (II)2 and (II)5 imply property (I)3 . Indeed, having q deriving from a space-time potential amounts to verifying, for a dimension d ≤ 2, that curlt,x (q) = 0 (recalling that q = (− 12 |v|2 , v)) in the sense of distributions (see [15], Theorem 2.9, p. 31), so that

∂t v + 12 ∇|v|2 = 0, curlx (v) = 0 ⇔ ∃ψ ∈ D (Q), v = ∇x ψ

From Deﬁnition of v in (16), we see that the velocity derives from a potential in space in the sense of the distributions, namely:

1 1 2 ∗ | · | − (φt ) v(t, ·) = ∇x t 2

(20)

where the potential φ is an element of L1loc (Q). This proves property (I)3 , provided that the ﬁeld v is an element of L∞ (Q)d and veriﬁes the Burgers equation in the sense of distributions. Even if the notion of rotational is less easy to cope with in dimension d > 2, it is always possible to deduce property (I)3 from (II)2 and (II)5 (see [30]), whatever the dimension d is, provided that we have v ∈ L∞ (Q)d . Property (II)3 corresponds to (I)2 . Indeed, we can easily extend by a density argument the relation to h ∈ H 1 (Q) once it is established for h ∈ C ∞ (Q). Finally remark that with the above results, Property (I)1 is veriﬁed by setting m = ρv. Remark 3.2. We also notice that, according to (20), (II)2 implies (II)4 . Indeed, we can deduce from (II)2

1,∞ that the potential ψ(t, ·) = 1t 12 | · |2 − (φt )∗ is in L∞ loc (Q), and then in Wloc ([0, 1] × Ω). 3.2. Main results of existence, uniqueness and regularity We now give our main results of existence and uniqueness on saddle points. Theorem 3.1 (Existence of a saddle point). Let ρ0 and ρ1 be two probability densities of L2 (Rd ) with bounded supports, and let Ω be a suﬃciently smooth bounded convex open set of Rd such that supp(ρ0 ) ∪supp(ρ1 ) ⊂ Ω. Then, for all φ ∈ Φ(ρ0 Ld , ρ1 Ld ) (Deﬁnition 3.1), for ρ = ρφ and v = vφ (see (17) and (16) in Deﬁnition 3.2), 1,∞ and setting μ = (ρ, ρv) and q = (−(1/2)|v|2 , v), there exists ψ ∈ Wloc ([0, 1] × Ω) such that q = ∇t,x ψ, and (ψ, q, μ) is a saddle point of Lagrangian L (or at least its restriction on (0, 1) × Ω).

The density ρ is in C 0 [0, 1], L2 (Ω) , and the velocity ﬁeld v is locally Lipschitz on the space (0, 1) × Rd , and satisﬁes properties (II)4 and (II)5 . Moreover, v ∈ W 1,p ((0, 1) × Ω) for all 1 ≤ p < 2, and ∇t,x v ∈ L∞ (0, 1; L1 (Ω)).

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

9

Having ∇t,x v ∈ L∞ (0, 1; L1 (Ω)) is a result in itself, and it is not directly used to characterize an optimal transport velocity ﬁeld in L2 . Very close properties are nevertheless considered to show the statements on the uniqueness of the component (ρ, m) of the saddle points of L. Theorem 3.2 (Uniqueness of density and momentum). If the triplet (ψ ∗ , q ∗ , μ∗ ) is a saddle point of L (the assumptions on ρ0 , ρ1 and Ω being the same as in Theorem 3.1), then, for any potential φ ∈ Φ(ρ0 Ld , ρ1 Ld ), with ρ = ρφ and v = vφ , we have μ∗ = (ρ∗ , m∗ ) = (ρ, ρv), and ρ(t, ·)Ld = X(t, ·)#(ρ0 Ld ) ∈ C([0, 1], L2 (Ω)), with X(t, ·) = ∇x φt = (1 − t) id +t∇x φ. In general, the set of saddle points (ψ, q, μ) of L is not reduced to a single element: only the component μ = (ρ, m) is unique. In other words, the set of points (ψ, q, μ) of L shares the same component μ and there is uniqueness of the density ρ and the velocity ﬁeld v on the support of ρ. The components q and ψ can vary outside the support of ρ. To prove these two results, we now study the properties of a velocity ﬁeld v deﬁned as in (16), for φ satisfying property (Γ1 ). 4. First velocity ﬁeld properties In this section, we present properties on velocity ﬁelds associated to optimal transport maps and namely show (II)2 and (II)5 . These results constitute the basis of existence and uniqueness results concerning the saddle points of L, as well as the generalized results of section 7. We recall that for a function f : Rn → R proper, l.s.c. and convex, the proximal operator Proxf (x) of f at x ∈ Rn is the unique minimizer of f + 12 |x −·|2 , so that y = Proxf (x) ⇔ x −y ∈ ∂f (y). The operators Proxf and id − Proxf are non-expansive (1-Lipschitz), and the proximal operator of f ∗ , the Legendre transform of f , satisﬁes Moreau’s identity Proxγf ∗ = id −γ Proxf /γ (·/γ). The γ-Moreau envelope γf is deﬁned as the inf-convolution of f by 1/(2γ)| · |2 . 4.1. Deﬁnition and ﬁrst properties of the velocity ﬁeld From (16), the velocity ﬁeld of an optimal transport can be written as a proximal operator p. This makes easier to deal with the problems of “breaks” of the velocity ﬁeld (which are not necessarily discontinuities). Let φ : Rd → R satisfy property (Γ1 ), i.e. φ is convex and continuous at every point of Rd , and admits in each of these points a non-empty and compact sub-diﬀerential. According to the properties linking Legendre conjugation and inf-convolution, if t ∈ (0, 1), by setting φt = (1 −t)| ·|2 /2 +tφ, then the Legendre conjugation (φt )∗ is of class C 1 on Rd , and for all (t, x) ∈ [0, 1) × Rd we deﬁne the operator p by

t p(t, x) = Prox 1−t φ

x 1−t

=

id − Prox(1−t)(tφ)∗ = ∇x (φt )∗ . 1−t

(21)

The velocity v = vφ introduced in (16) can be deﬁned from p by v(t, x) = [x − p(t, x)]/t, for all t ∈ (0, 1) and x ∈ Rd . When there is no ambiguity on φ, we use v to denote the velocity ﬁeld v. For all t ∈ [0, 1), p(t, ·) is 1/(1 − t)-Lipschitz (non-expansiveness of operator Prox), and ∀x, y ∈ Rd , y = p(t, x) = ∇x (φt )∗ (x) ⇔ x ∈ ∂x (φt )(y) = (1 − t)y + t∂φ(y).

(22)

From these deﬁnitions and using X as introduced in (15), we get: y = p (t, (1 − t)y + t∇φ(y)) = p(t, X(t, y)) i.e. ∇φ(y) − y = ∂t X(t, y) = v(t, X(t, y)).

(23)

10

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

Given that X(t, ·) = ∇φt and p(t, ·) = ∇x (φt )∗ , then for any t ∈ (0, 1), and for all x ∈ Rd such that φ Frechet-diﬀerentiable in x, we have (∇x (φt )∗ ◦ ∇φt ) (x) = x. Consequently, we observe that p(t, ·) formally represents the inverse of the characteristic trace X(t, ·) = ∇x (φt ). It would really be the case if φ was of class C 1 with a Lipschitz gradient. In the general case (i.e. φ is not C 1 and only checks property (Γ1 )), p(t, ·) is nevertheless not injective on Rd . The operator p(t, ·) thus repairs the “breaks” that can be generated by a transport plan. Indeed, p(t, ·) re-concentrates the areas generated by diﬀusion (by the characteristic trajectories X(t, ·)) of the break points on these same points. Next, we deduce that for any ﬁxed t ∈ (0, 1), the velocity ﬁeld v(t, ·) is Lipschitz on Rd . It is also possible to deﬁne a Lipschitz constant that is only time dependent and does not depend on φ. Lemma 4.1. For all ∈ (0, 1), Lt = 2/(t(1 − t)) is a Lipschitz constant for v(t, ·) on Rd (for the Euclidean norm | · |), whatever φ is. With Rademacher’s Theorem, the velocity ﬁeld v is therefore continuous and Fréchet-diﬀerentiable almost everywhere in space, and thus ∇x v(t, x) is additionally uniformly bounded by Lt = 2/(t(1 − t)) for almost all x ∈ Rd , where · denotes the subordinate norm to | · |. Remark 4.1. The “break” points of the transport plan correspond to the points where the potential φ is not diﬀerentiable. Although Rademacher’s Theorem ensures that the set of such points is negligible, the diﬀusion of these breaks, and in particular the torsions/high variations of the velocity ﬁeld at these points in t = 0 (or the re-connections at t = 1 for the points of irregularity of φ∗ ) is not. Indeed, the torsions of the velocity ﬁeld in the neighborhood of break points may induce a loss of H 1 regularity of the velocity ﬁeld. Having a H 1 regularity on the potential φ would greatly simplify the study on the uniqueness of the saddle points of the Lagrangian L presented in section 6. Unfortunately, such regularity can not be obtained in general. 4.2. Velocity ﬁeld control Lemma 4.2 (Property (II)2 ). For φ satisfying property (Γ1 ), the velocity ﬁeld v = vφ (deﬁned from φ as in d d (16)) satisﬁes v ∈ L∞ loc ([0, 1] × R ) . Sketch of the proof. We can show that v(t, ·) is uniformly bounded on any bounded open set ω ⊂ Rd in the neighborhood of t = 0. Take for instance t ∈ (0, 1/2] and y ∈ ω, and let x ∈ (1 − t)y + t∂φ(y). According to (22), we have p(t, x) = y, so v(t, x) = (x − y)/t ∈ ∂φ(y) − y. Because of Lemma 4.1, for all t ∈ (0, 1), we have: x − y 2 = 4|v(t, x)|. |v(t, y) − v(t, x)| ≤ |v(t, x)| + |x − y| ≤ 4 (24) t(1 − t) t We have |v(t, y)| ≤ 5|v(t, x)| ≤ 5 (supx∈ω |∂φ(x)| + sup(ω)) < +∞. Indeed, with the property (Γ1 ), φ and φ∗ are assumed to be ﬁnite and convex on Rd and therefore locally Lipschitz. The same argument can be used in the neighborhood of t = 1 (using φ∗ in place of φ). Moreover, since v(t, ·) = (id −∇x (φt )∗ )/t, we can prove (see for instance [30]) that there exists ψ ∈ × Ω) (which is not necessarily equal to | · |2 /2 − φ∗t ), such that v = ∇x ψ. As already stated, for every t ∈ (0, 1), v(t, ·) is continuous and Lipschitz on Rd with constant 2/t(1 − t). The ﬁeld v(t, ·) is therefore Lipschitz on Rd , for a Lipschitz constant independent of t on any interval [α, β] ⊂ (0, 1). One can for instance consider the constant Mα,β = sup[α,β] 2/t(1 − t). It is therefore possible to apply the Cauchy-Lipschitz Theorem on [α, β]. 1,∞ Wloc ([0, 1]

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

11

Then, for every x ∈ Rd and t ∈ (0, 1), the Cauchy problem: yt,x = v(·, yt,x ), with yt,x (t) = x, admits a unique maximum solution over any interval (α, β), 0 < α < t < β < 1. We can then prove that there exists a unique solution deﬁned on (0, 1) and that it can be written yt,x (s) = (s − t)v(t, x) + x for all s ∈ (0, 1). Indeed, such a solution satisﬁes yt,x (t) = x, and yt,x (s) = v(t, x) = v(s, (s − t)v(t, x) + x) = v(s, yt,x (s)). By using the properties (22) of the operator p, we can show that, for all t, s ∈ (0, 1), and for all x ∈ Rd , we have

v(t, x) = v(s, (s − t)v(t, x) + x)

(25)

This relation implies that v is a time-space locally Lipschitz function: we can bring back any diﬀerence between two space-times values of v to a same time, and apply the space Lipschitz property of v for ﬁxed time. This property is also shown for example in [29]. Proposition 4.1 (Property (II)5 ). With the property (Γ1 ), v satisﬁes the Burgers’ equation (19), that is to say, in the distribution sense: 1 ∂t v + ∇x |v|2 = 0, 2 which is a generalized form of ∂t v + v · ∇x v = 0. The proof is given in Appendix A. In the case where φ is suﬃciently regular (case where there is no “breaks” in the transport), then ∇x φt is invertible for t ∈ (0, 1), and we directly ﬁnd that 0 = ∂tt ∇x φt = (∂t v + v · ∇x v)(∇x φt ), since ∇x φ − id = ∂t ∇x φt = v(t, ∇x φt ). 5. Existence of a saddle point (proof of Theorem 3.1) In order to prove the existence of a saddle point for the Lagrangian L, we have built a couple densityvelocity ﬁeld (ρ, v) satisfying conditions (II). The velocity ﬁeld v = vφ , immediately checks properties (II)2 (Lemma 4.2) and (II)5 (subsection 4.2). From (20) and (II)2 , v veriﬁes (II)4 (see Remark 3.2). We now have to build a density ρ = ρφ , verifying property (II)1 , such that the pair (ρ, v) satisﬁes the mass conservation of (II)3 . The candidate density is naturally the one formed by the set of intermediate measurements between ρ0 Ld and ρ1 Ld = (∇φ#ρ0 Ld ), which can be assimilated to a series of “optimal micro-transports” along the time scale [0, 1] and correspond to the interpolation of McCann (17), deﬁned at each time t by the density ρt = ρφt of the measure ρt Ld = ρφt Ld = [(1 − t) id +t∇φ]#(ρ0 Ld ) = ∇φt #(ρ0 Ld ).

(26)

For all 1 < p < +∞, McCann also deﬁnes in [23] the “internal energy” of the space (P2 (Rd ), Wp ) as

Fp (μ) =

⎧ ⎪ ⎨ |f (x)|p dLd (x) if μ = f.Ld ∈ P2 (Rd ), d ⎪ ⎩R +∞ otherwise,

(27)

where P2 (Rd ) is deﬁned as the space of probability measures μ on Rd satisfying the condition |x|2 dμ(x) < +∞ (any ﬁnite measure on Rd with compact support is an element of P2 (Rd )). These Rd energies are convex along the geodesics of the space (P2 (Rd ), Wp ), which are the interpolations of McCann. Thus, the function Λp : [0, 1] → [0, +∞], deﬁned for all t ∈ [0, 1] as Λp (t) = Fp (ρt Ld ), for t → ρt deﬁned in (26), is convex on [0, 1], and is moreover ﬁnite in t = 0 and t = 1, since ρ0 , ρ1 ∈ Lp (Rd ), and

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

12

ρ0 Ld , ρ1 Ld ∈ P2 (Rd ). Hence it is ﬁnite and bounded on the whole interval [0, 1]. Therefore, by deﬁnition of Λp , we have ρt ∈ Lp (Rd ) for every t ∈ [0, 1] [1]. Moreover, t → ρt Lp (Rd ) is bounded on [0, 1] by a constant M . We observe that t → ρt is weakly continuous in Lp (Rd ) on [0, 1]. Indeed, t → ρt dLd is continuous from [0, 1] to D (Rd ): it is then suﬃcient to use the density of Cc0 (Rd ) in Lq (Rd ), for q ∈ (1, +∞) such that 1/p + 1/q = 1. Since the function Λp is convex and ﬁnite on [0, 1], t → Λ(t) = ρt pLp (Rd ) is continuous on (0, 1) and admits a right limit at t = 0 and a left limit at t = 1. Thus, the application t → ρt is strongly continuous from [0, 1] to Lp (Rd ) [30]. Conversely, one can characterize the McCann interpolation by the relation 1

∀h ∈

Cc0 ([0, 1]

× R ),

h ρ dx ⊗ dt =

d

h (t, t∇φ(x) + (1 − t)x) ρ0 (x) dx dt. 0

(0,1)×Rd

(28)

Rd

Indeed, using Fubini’s Theorem, it can be shown that for any density ρ satisfying (28), there exists a family of densities (ρt )t∈[0,1) of McCann’s interpolations measures between ρ0 Ld and ρ1 Ld , such that t →

ρt ∈ C 0 [0, 1), Lp (Rd ) and (t, x) → ρt (x) be measurable, and such that ρ(t, x) = ρt (x) for almost all (t, x) ∈ [0, 1) × Rd [18]. From Brenier’s Theorem ([29], p. 66), we have supp(ρ1 ) = ∇φ(supp(ρ0 )). This property also holds for potentials satisfying the property (Γ1 ) considered in Proposition 3.2. Thus, for Ω a convex open set of Rd containing supp(ρ0 ) and supp(ρ1 ), we have the inclusion (t∇φ + (1 − t) id)(supp(ρ0 )) ⊂ Ω, for all t ∈ [0, 1]. Therefore, the candidate density ρ : (t, x) → ρt (x), deﬁned in (26), satisﬁes the condition (II)1 . The component μ = (ρ, ρv) is zero in the neighborhood of the space boundary, and thus veriﬁes the Neumann conditions implicitly included in the weak form of mass conservation (II)3 . Finally, the deﬁnition immediately implies that for all h ∈ C ∞ (Q), 1

1 (∂t h + v · ∇x h) ρ dx dt =

0 Rd

1 = 0

d dt

(∂t h(t, X(t, ·)) + ∂t X(t, ·) · ∇x h(t, X(t, ·))) ρ0 dx dt 0 Rd

h(t, X(t, ·))ρ0 dx dt =

Rd

h(1, ∇x φ(x))ρ0 (x) dx − Rd

(29)

h(0, x)ρ0 (x) dx.

Rd

Since ρ1 Ld = ∇φ#(ρ0 Ld ), we have Rd h(1, ∇x φ)ρ0 dx = Rd h(1, ·)ρ1 dx. The integrals are well deﬁned as v ∈ L∞ (Q). Then, the pair (ρ, v) satisﬁes the condition (II)3 . Moreover, by taking q ≤ 2 such that 1/p + 1/q = 1, the weak relation of mass conservation extends to the test functions h ∈ W 1,q (Q). We can now prove the Theorem. Proof of Theorem 3.1. Let us recall that an element (ψ, q, μ) of Lsp (ρ0 , ρ1 , Ω) must satisfy (ψ, q, μ) ∈ S, as well as the Properties (I) and (II).

(I)1 . From (II)1 and (II)2 , we know that v ∈ L∞ (Q)d and t → ρ(t, ·) = X(t, ·)#ρ0 ∈ C 0 [0, 1], L2 (Ω) . Then μ = (ρ, ρv), q = (−(1/2)|v|2 , v) ∈ L2 (Q), and supp(μ) ⊂ supp(ρ) ⊂ [0, 1] × Ω. Spatial homogeneous Neumann conditions of μ are thus veriﬁed. Moreover, by setting μ = (ρ, ρv) and q = (−(1/2)|v|2 , v), the condition (I)1 is naturally veriﬁed as for all q = (a, b) ∈ P = {(a, b), a ≤ −|b|2 /2}, the paraboloid deﬁned in (10), we have

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

1

μ, q − q =

1 (aρ + b · vρ) dx dt −

0 Ω

1

≤

13

1 2 |v| ρ dx dt 2

0 Ω

1 1 2 1 2 1 2 a + |b| + |v| ρ dx dt − |v| ρ dx dt ≤ 0. 2 2 2

0 Ω

0 Ω

(I)2 . The condition (I)2 results from the conservation of mass (II)3 : satisﬁed by the pair (ρ, v). (I)3 . According to Proposition 4.1, v satisﬁes the Burgers equation (II)5 in the sense of distributions. Moreover, from (II)4 , we know that v derives from a spatial potential. It gives us the existence of one ψ ∈ W 1,∞ (Q) such that q = ∇t,x ψ. Property (I)3 is thus checked. We ﬁnally obtain that the triplet (ψ, q, μ) is an element of W 1,∞ (Q)/R × L∞ (Q)d+1 × L2 (Q)d+1 ⊂ S. From the property ψ ∈ W 1,∞ (Q), the ﬁeld v = ∇x ψ then satisﬁes the properties (II)4 and (II)5 . The ﬁnal regularity properties of vφ given in Theorem 3.1 come from three other results proven in Appendix A): the velocity ﬁeld v is locally Lipschitz on the space (0, 1) × Rd , the property ∇t,x v ∈ L∞ (0, 1; L1 (Ω)), and its corollary v ∈ W 1,p ((0, 1) × Ω) for all 1 ≤ p < 2. 6. Uniqueness properties of saddle points The uniqueness of the density ρ has been proven with the energetic formulation, via the uniqueness of the geodesics in Wasserstein’s space W2 (see [28], Proposition 5.32). We nevertheless here propose a proof based on the structure of the velocity ﬁeld and its regularity properties. More precisely, we use a DiPernaLions method [12], in which the insuﬃcient regularity of the velocity ﬁeld (W 1,p , p < 2, rather than H 1 ) is compensated by the speciﬁc structure of the velocity ﬁeld of an optimal transport. The regularity of the velocity ﬁeld provides (a minima) the regularity of the solution of the dual transport equation of the mass conservation equation. 6.1. Uniqueness of the velocity ﬁeld on the density support We ﬁrst study the uniqueness of the velocity ﬁeld on the support of the diﬀerent candidate densities. We show that for any saddle points of L, denoted by (ψ ∗ , q ∗ , μ∗ ) = (ψ ∗ , q ∗ , (ρ∗ , m∗ )), the densities ρ∗ are transported with the same velocity ﬁeld v. Lemma 6.1. We consider Ω a bounded convex open set of Rd , and ρ0 , ρ1 ∈ L2 (Rd ) two densities which supports are included in Ω. If the triplet (ψ ∗ , q ∗ , μ∗ ) is an element of Lsp (ρ0 , ρ1 , Ω) with μ∗ = (ρ∗ , m∗ ), then, for all φ ∈ Φ(ρ0 Ld , ρ1 Ld ), we have m∗ = ρ∗ v, with v = vφ deﬁned in (16). Sketch of the proof. Following Fig. 1, we give a “schematic” proof of the uniqueness of the velocity ﬁeld on the union of supports of the candidate densities, which is based on the convexity of the set of saddle points and the strict convexity of the paraboloid P deﬁned in (10). For a more rigorous proof we refer to [18] (chapter 4). We recall that Q = (0, 1) × Ω. We assume (ψ1 , q1 , μ1 ) and (ψ2 , q2 , μ2 ) to be two saddle points of L. The ﬁelds μ1 and μ2 are both orthogonal in the sense of the canonical scalar product of L2 to the paraboloid P, respectively at points q1 and q2 , which imply a pointwise orthogonality almost everywhere. We will see in section 8, that the set of saddle points of L is convex so that the λ[(ψ1 , q1 , μ1 ) + (1 − λ)(ψ2 , q2 , μ2 )] is

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

14

Fig. 1. Illustration of the characterization of the saddle points of L.

also a saddle point of L, for all λ ∈ [0, 1]. Then, for almost all (t, x) ∈ Q and for all λ ∈ [0, 1], the vector [λμ1 + (1 − λ)μ2 ](t, x) is also orthogonal to P at point [λq1 + (1 − λ)q2 ](t, x) which is strictly inside P if q1 (t, x) = q2 (t, x). Then we have [λμ1 + (1 − λ)μ2 ](t, x) = 0 for all λ ∈ [0, 1], i.e. μ1 (t, x) = μ2 (t, x) = 0. Therefore, almost everywhere, when (μ1 (t, x), μ2 (t, x)) = (0, 0), we have q1 (t, x) = q2 (t, x), and then μ1 (t, x) and μ2 (t, x) are colinear. We can conclude if one of these two saddle points is the one we have constructed above. To sum up: for any ﬁxed φ ∈ Φ(ρ0 Ld , ρ1 Ld ) and for every saddle point (ψ ∗ , q ∗ , μ∗ ) of L, ρ∗ is associated with the same velocity ﬁeld v with ∂t ρ∗ + divx (ρ∗ v) = 0 with the initial and ﬁnal conditions ρ∗ (0, ·) = ρ0 and ρ∗ (1, ·) = ρ1 . In other words, for all h ∈ H 1 (Q), we have 1

(∂t h(t, x) + v(t, x).∇x h(t, x)) ρ∗ (t, x) dx dt +

0 Ω

h(0, ·)ρ0 −

Ω

h(1, ·)ρ1 = 0. Ω

We now prove that there exists a unique ρ∗ which satisﬁes these conditions, that is to say ρ∗ (t, ·) = X(t, ·)#ρ0 , with X = Xφ as deﬁned in (15), for all φ ∈ Φ(ρ0 Ld , ρ1 Ld ). 6.2. Uniqueness of density in L2 As the velocity ﬁeld v is unique on the union of the supports of the candidate densities, the uniqueness of the density ρ implies the uniqueness of the momentum m = ρv. The next proposition is the main ingredient to show the uniqueness of ρ. Proposition 6.1. Let φ : Rd → R a convex potential satisfying property (Γ1 ), ρ0 ∈ L2 (Rd ) with a bounded support, and consider a velocity ﬁeld v = vφ deﬁned from φ as (16). If ρ ∈ L2 ((0, 1) × Rd ) is a density with bounded support in [0, 1] × Rd , such that ∂t ρ + divx (ρv) = 0, ρ(0, ·) = ρ0 in the sense of distributions, i.e.

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

∀h ∈

Cc∞ ([0, 1)

1 × R ),

(∂t h + v · ∇x h) ρ dx dt +

d

15

0 Rd

h(0, ·)ρ0 dx = 0,

(30)

Rd

then ρ(t, ·) = ρφ (t, ·) = (t∇φ + (1 − t) id)#ρ0 for almost all t ∈ [0, 1] (as deﬁned in (17)). In other words: 1 ∀ϕ ∈

Cc0 ((0, 1)

× R ), d

Moreover t → ρ(t, ·) ∈ C

ϕ (t, t∇φ(x) + (1 − t)x) ρ0 (x) dx dt.

ϕρ dx dt = 0

0

1 0

Rd

(31)

Rd

[0, 1), L2 (Rd ) .

Proof. Let Ω be a bounded open set of Rd such that supp(ρ0 ) is included in Ω and supp(ρ) ⊂ [0, 1] × Ω, and let Q = (0, 1) × Ω. Let (ψ, q, μ) be a saddle point of L as deﬁned in (11), and let φ ∈ Φ(ρ0 Ld , ρ1 Ld ) (thus checking Property (Γ1 ) on Rd ). The triplet (ψ, q, μ) satisﬁes the Properties (I), which implies in particular the weak mass conservation G(h) + μ, ∇t,x h = 0 for all h ∈ H 1 (Q), as well as the linear relation between momentum and density: μ = (ρ, m) = (ρ, ρv) ∈ L2 (Q), with v = vφ deﬁned in (16) and satisfying the properties (II)4 and (II)5 (see subsection 3.2). From these properties, we deduce that for every h ∈ H 1 (Q): 1

(∂t h + v · ∇x h) ρ dx dt =

0 Ω

h(1, ·)ρ1 dx −

Ω

h(0, ·)ρ0 dx.

(32)

Ω

Let ϕ ∈ Cc∞ (0, 1) × Rd such that supp(ϕ) ⊂ (0, 1) × Ω ⊂ Q. By solving the transport problem in v and ϕ with a characteristics method, we consider the function h deﬁned for any (t, x) ∈ (0, 1) × Rd by: 1 h(t, x) = −

ϕ(s, (s − t)v(t, x) + x) ds,

(33)

t

which satisﬁes ∂t h + v · ∇x h = ϕ

and ∇t,x h = t (∇t,x v) α + β,

(34)

1 d+1 with α ∈ L∞ (Q)d , and β ∈ L∞ . We also have h(1, ·) = 0 and h(0, ·) = − 0 ϕ(t, X(t, ·)) dt with loc (Q) X(t, ·) = t∇φ + (1 − t) id. To solve the problem, it would be suﬃcient to take h in (32) as a test function. Unfortunately, the velocity ﬁeld only satisﬁes ∇t,x v ∈ L∞ (0, 1; L1 (Ω)) and v ∈ W 1,p ((0, 1) × Ω) for all 1 ≤ p < 2 (see Appendix A), thus v = vφ ∈ H := ∩1≤p<2 W 1,p (Q). The function h is thus an element of H. Since we do not have a better integrability than L2 on ρ, we cannot extend the space of test functions of (32) to a larger space than H 1 (Q) as H. The counter-example of Caﬀarelli on the strict division of the mass show that, in general, the ﬁeld v is not an element of H 1 (Q). We then choose to approximate h by regularizing the velocity ﬁeld v associated to the transport plan ∇φ with vγ = vγφ associated to the regularized transport plans ∇γφ, where γφ denotes the γ-regularization of the potential φ by a Moreau’s envelope (see beginning of section 4). This regularization has the property of erasing the breaks of the transport plan, which are responsible for the fact that v does not have regularity H 1 in the neighborhood of t = 0. With a characteristics method, it is thus possible to construct some test functions hγ ∈ H 1 (Q), uniformly zero in the neighborhood of t = 1 (independently of γ), such that: ∂t hγ + vγ · ∇x hγ = ϕ = ∂t hγ + v · ∇x hγ + (vγ − v) · ∇x hγ ,

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

16

1 and such that hγ (0, ·) converges to − 0 ϕ[t, (1 − t) id +t∇x φ] dt in L2 (Rd ) when γ tends to 0. By injecting such a function hγ in (32), one obtains 1

1 ϕ[t, (1 − t)x + t∇x φ(x)]ρ0 (x) dx dt + Rγ (ϕ), with

ϕρ dx dt = 0

1

0

Rd

(vγ − v) · ∇x hγ ρ dx dt −

Rγ (ϕ) =

(35)

Rd

0 Rd

1 hγ (0, ·)ρ0 dx −

ϕ[t, (1 − t) id +t∇x φ]ρ0 dx dt.

(36)

0 Rd

Rd

In order to prove Proposition 6.1, it is therefore necessary to show that Rγ (ϕ) converges to 0 when γ tends to 0. This makes use of the results of Appendix A. The hγ are deﬁned with respect to vγ by (33). We can then prove from (34) that we have, |∇x hγ (t, x)| ≤ (|∇x vγ (t, x)| + 1)∇x ϕL∞ ([0,1]×Rd ) , for almost all (t, x) ∈ (0, 1) × Rd .

(37)

For more details, we refer to [18] (subsection 4.2.5), in which it is proven that hγ ∈ H 1 (Q). The velocity ﬁeld vγ can be extended by continuity in t = 0. It is the same for hγ , which is continuous on [0, 1) × Ω, and for all x ∈ Rd , one has: 1 hγ (0, x) = −

1 ϕ(s, s vγ (0, x) + x) ds = −

0

ϕ(s, s∇γφ(x) + (1 − s)x) ds,

(38)

0

which coincides with the trace L2 of hγ in t = 0. We can show that ∇γφ(x) converges for almost all x ∈ Ω to ∇φ(x) (see Lemma A.4 in Appendix A), for all x where φ is diﬀerentiable. In addition, for all (s, x) ∈ (0, 1) × Ω, the term ϕ(s, s∇γφ(x) + (1 − s)x) is uniformly bounded by ϕL∞ . Thus, by dominated convergence, we have

1 hγ (0, ·)ρ0 dx +

rγ (ϕ) = Ω

ϕ(s, s∇φ(x) + (1 − s)x)ρ0 (x) dx ds −→ 0. γ→0

(39)

0 Ω

Let tm ∈ (0, 1) such that supp(ϕ) ⊂ (0, tm ) × Ω. From (37), we thus have ⎛t ⎞ m tm |Rγ (ϕ)| ≤ |rγ (ϕ)| + ∇x ϕL∞ ⎝ |v − vγ | · |∇x vγ | · |ρ| dx dt + |v − vγ | · |ρ| dx dt⎠ . 0 Ω

0 Ω

We can prove that |v − vγ | is uniformly bounded and simply converges to 0 on (0, 1) × Ω when γ tends to 0. t Thus, since ρ ∈ L2 ((0, 1) ×Ω), we conclude via dominated convergence that the term 0 m Ω |v − vγ |·|ρ| dx dt converges to 0. t Finally, to complete the proof of Proposition 6.1, we have to show that 0 m Ω |v − vγ | · |∇x vγ | · |ρ| dx dt converges to 0. This is the subject of the following Lemma 6.2, which proof will end the one of the current Proposition 6.1. Lemma 6.2. Let φ : Rd → R be a convex potential verifying the property (Γ1 ). We consider a velocity ﬁeld v = vφ deﬁned as in (16) and 0 < tm < 1. Let ρ ∈ L2 ((0, 1) × Rd ), with bounded support into [0, 1] × Rd . Let Ω a bounded open set of Rd such that supp(ρ0 ) ⊂ Ω and supp(ρ) ⊂ [0, 1] × Ω. For any γ > 0, we deﬁne

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

17

vγ = vγφ , where γφ is the Moreau envelope of φ by the parameter γ (see beginning of section 4). Then we have the result of convergence: tm |v − vγ | · |∇x vγ | · |ρ| dx dt −→ 0.

(40)

γ→0

0 Ω

Before giving a sketch of proof of the previous Lemma, let us state two results concerning the control of the regularized velocity ﬁelds vγ . The ﬁrst one is an important uniform regularity result for the velocity ﬁeld and its regularization. Proposition 6.2. We assume that φ satisﬁes the property (Γ1 ). Let R > R > 0 and a ∈ Rd such that φ(a) = inf φ and let M = sup |∂φ(x)|. Rd

x∈B(a,2(R+|a|))

Then there exists a constant C > 0, independent of φ, γ, a, R and R , such that for all t0 ∈ (0, 1) satisfying the condition t0 < min {1/2, (R − R)/(M + 2|a|)}, and by setting v0 = v, we have the property: ∀γ ≥ 0, ∀t ∈ (0, t0 ],

|∇x vγ (t, x)| dx ≤

C Ld (B(a, R )). t0 (1 − t0 )

(41)

B(a,R)

Thus ∇x vγ ∈ L∞ (0, t0 ; L1 (B(a, R))), for all γ ≥ 0. The proof of this proposition is the main object of Appendix A. Remark 6.1. Proposition 6.2 is an important result on the control of the velocity ﬁeld of a dynamical optimal transport: in particular, we have ∇t,x v ∈ Lp (0, 1; Lqloc (Rd )) for all 1 < p ≤ +∞ and 1 ≤ q < +∞ such that 1,p 1/p + 1/q > 1, especially ∇t,x v ∈ L∞ (0, 1; L1 (Ω)) and v ∈ Wloc ([0, 1] × Rd ) for all 1 ≤ p < 2. Corollary 6.1. Let φ satisfy the property (Γ1 ) and Ω ⊂ Rd be a bounded open set and 0 < tm < 1. Then there exists a constant K > 0 such that for every γ > 0 and any t ∈ [0, tm ], we have |∇x vγ (t, x)| dx ≤ K.

(42)

Ω

Proof. For t < t0 , we apply Proposition 6.2; and for tm ≥ t > t0 we use the fact that the term |∇x v(t0 , ·)|1 is bounded by c/t(1 − t), for c a constant depending only on the chosen norm. Sketch of proof of Lemma 6.2. For any convex, l.s.c. proper function f , the operator id − Proxγf is nonexpansive and ∇(γf )(x) ∈ ∂f (Proxγf (x)). Moreover, there exists a constant C > 0 such that for all t ∈ (0, 1), ∇x v(t, ·)L∞ (Rd ) ≤

C D2 φ ∞ d + d . L (R ) 1−t

Hence there exists a constant c, independent of γ, such that for every 1 ≥ γ > 0 and t ∈ (0, 1) ∇x vγ (t, ·)L∞ (Rd ) ≤

c min 1−t

1 1 , t γ

.

(43)

By partially bounding |∇x vγ |1 from above (i.e. we bound only aβ when a = aα aβ , 1 = α+β), and applying Corollary 6.1, we can show that the convergence result (40) is true for all p > 2 such that ρ ∈ Lp (Q), and that there exists a constant M such that:

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

18

tm ∀γ ∈ (0, 1],

|v − vγ | · |∇x vγ | · |ρ| dx dt ≤ M ρL2 (Q) .

(44)

0 Ω

Therefore, by density of Lp (Q) in L2 (Q), (40) is also true for p = 2. 6.3. Proof of Theorem 3.2 To ﬁnish the proof of Theorem 3.2, we only have to show that a density ρ∗ associated to one of the saddle points (ψ ∗ , q ∗ , μ∗ ) of L veriﬁes the conditions of Proposition 6.1. Proof. Let (ψ ∗ , q ∗ , μ∗ ) be a saddle point of L. According to Proposition 3.1, we have G(h) + μ∗ , ∇t,x h = 0, for all h ∈ H 1 (Q). Let φ ∈ Φ(ρ0 Ld , ρ1 Ld ), thus verifying property (Γ1 ). Following Lemma 6.1, by deﬁning v = vφ on (0, 1) × Rd as in (16), we have m∗ = ρ∗ v. Hence for all h ∈ H 1 (Q):

(∂t h + v · ∇x h) ρ∗ dx dt +

Q

h(0, ·)ρ0 dx −

Ω

h(1, ·)ρ1 dx = 0.

(45)

Ω

Let ρ∗ ∈ L2 ((0, 1) × Rd ) be the extension in 0 of ρ∗ on (0, 1) × Rd . Notice that for any test function 1 h ∈ Hloc ((0, 1)×Rd ) we have h|Q ∈ H 1 (Q) and ∇t,x h|Q = (∇t,x h)|Q . Hence the relation (45) can be extended from Q to the entire space (0, 1) × Rd . According to Proposition 6.1, we thus have the equivalence ρ∗ (t, ·) =

ρφ (t, ·) = (t∇φ +(1 −t) id)#ρ0 for almost all t ∈ [0, 1], with in addition t → ρφ (t, ·) ∈ C 0 [0, 1), L2 (Rd ) . 7. Characterization of an optimal transport velocity ﬁeld We now present a generalization of our study on the uniqueness of the component density-momentum μ that characterizes less formally an optimal transport velocity ﬁeld. The result means that the velocity ﬁeld has to satisfy properties (II)4 and (II)5 and can be understood as follows. Any density of L2 , with bounded support, and advected by a locally bounded divergence free velocity ﬁeld, whose trajectories are all straight lines (and in particular never intersect), corresponds to an optimal transport, i.e. an interpolation of McCann, and is the only solution for such a displacement. For a convex open set Ω of Rd , we deﬁne the space bL2+ ((0, 1) × Ω) of densities ρ ∈ L2 ((0, 1) × Ω) which are non-negative and with compact supports into [0, 1] × Ω. Theorem 7.1. Let Q = (0, 1) × Ω, Ω being a convex open set of Rd not necessarily bounded. Let v ∗ be a velocity ﬁeld on Ω satisfying properties (II)4 and (II)5 and ρ0 ∈ L2 (Ω), with ρ0 ≥ 0 and supp(ρ0 ) bounded in Ω. Let ρ∗ ∈ bL2+ (Q) be a solution in the sense of distributions of

∂t ρ + divx (ρv ∗ ) = 0, ρ(0, ·) = ρ0 .

(46)

Then the density ρ∗ is the unique solution of problem (46) in the space bL2+ (Q) and we have ρ∗ ∈

C 0 [0, 1), L2 (Ω) . Moreover, there exists a unique non-negative Radon measure ν1 on Ω, which support is bounded in t∈[0,1] supp(ρ∗ (t, ·)) and a convex function φ on Rd verifying property (Γ1 ), such that

ν1 = ∇φ# ρ0 Ld and ∀t ∈ [0, 1), (ρ∗ (t, ·)Ld ) = (ρ(t, ·)Ld ) = (t∇φ + (1 − t) id) #(ρ0 Ld ),

(47)

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

19

(with ρ = ρφ deﬁned in (17)) which makes the link with McCann’s interpolation. The couple (ρ∗ , v ∗ ) is then solution of

∂t ρ∗ + divx (ρ∗ v ∗ ) = 0, ρ∗ (0, ·) = ρ0 , ρ∗ (1, ·) = ν1 ,

(48)

which gives in the weak sense ∀h ∈

Cc∞ ([0, 1]

1 × Ω), 0 Ω

(∂t h + v ∗ · ∇x h) ρ∗ dx dt +

h(0, ·)ρ0 dx −

Ω

h(1, ·) dν1 = 0.

(49)

Ω

Finally, for all (t, x) ∈ supp(ρ∗ ), we have v ∗ (t, x) = vφ (t, x), deﬁned in (16). The ﬁeld v ∗ thus satisﬁes the properties of the velocity ﬁeld v on supp(ρ∗ ). In particular, v is locally Lipschitz on (0, 1) × Rd and it 1,p satisﬁes ∇t,x v ∈ L∞ (0, 1; L1loc (Rd )) and v ∈ Wloc ([0, 1] × Rd ) for all 1 ≤ p < 2. Sketch of the proof. The proof gathers elements from sections 5 and 6. It is technical, as the ﬁnal measure (46) is no longer a density measure, of type ρ1 Ld , but simply a ﬁnite Radon measure ν1 . We only give the main steps of the proof and refer to [18] (section 4.3) for details. The ﬁrst step consists in proving the existence, in the sense of distributions, of the ﬁnal measure ν1 , as deﬁned in the statement of the Theorem. For this purpose we use classical functional analysis tools such as the Riesz Representation Theorem [27]. We also show that the weak formulation (49) is always valid for test functions taken from Wc1,∞ ([0, 1] × Ω). Next we consider φ, an optimal transport potential between ρ0 Lp , and ν1 satisfying property (Γ1 ). We take a saddle point (μ, q, ψ), as done in section 5 and built in Theorem 3.1. As Brenier’s Theorem only assumes density for the initial density ρ0 Ld , Proposition 3.2 is still valid. This is also the case for all inductions done while building the velocity ﬁeld v = vφ . However, we can not obtain ρ = ρφ ∈ L2 (Q), since it requires ν1 = ρ1 Ld with ρ1 ∈ L2 (Ω). Hence, we can not extend the test functions of the weak formulation 1 of the mass conservation for the pair (ρ, v) to the space Hloc (Q), which only considers absolutely continuous 1,∞ initial measures. Extending test functions to the space Wloc (Q) is required as the potential ψ necessarily belongs to this space. We then build a second saddle point from the pair (ρ∗ , v ∗ ), i.e. a triplet (μ∗ , q ∗ , ψ ∗ ), with μ∗ = (ρ∗ , ρ∗ v ∗ ), ∗ q = (−(1/2)|v ∗ |2 , v ∗ ) and ∇t,x ψ ∗ = q ∗ . We can then, as well as for Lemma 6.1, prove the uniqueness of the velocity ﬁeld on the supports of ρ and ρ∗ , i.e. ρ∗ v ∗ = ρ∗ v. Although our triplets (μ, q, ψ) and (μ∗ , q ∗ , ψ ∗ ) are not necessarily in L2 , and we can no longer speak of “projections” and “orthogonality” as in the schematic proof of Lemma 6.1, the reasoning remains globally the same and we reach the same conclusion (Lemma 4.3-14 of [18]). Thus, according to Proposition 6.1, the density ρ∗ veriﬁes the relation

(47), with t → ρ∗ (t, ·) ∈ C 0 [0, 1), L2 (Ω) . Now, let us show that ρ∗ is the only solution with bounded support of the problem (46) in the space b 2 L+ (Q). Assume there exists two solutions ρ1 , ρ2 of (46) in bL2+ (Q). Then the density ρ = (ρ1 + ρ2 )/2 is still a solution in bL2+ (Q). Therefore, there would exist a convex function φ of Rd satisfying property (Γ1 ), such that, by deﬁning vφ as in (16), we have ρ v ∗ = ρ vφ , i.e. (ρ1 + ρ2 ) v ∗ = (ρ1 + ρ2 ) vφ . The ﬁeld v ∗ is then almost everywhere equal to the ﬁeld vφ on supp(ρ1 ) ∪ supp(ρ2 ), thus ρ1 v ∗ = ρ1 vφ and ρ2 v ∗ = ρ2 vφ . Therefore, ρ1 and ρ2 both solves problem (46), by replacing v ∗ with vφ . According to Proposition 6.1, we

thus have ρ1 = ρ2 : t → ρφ = t∇φ + (1 − t) id #ρ0 in L2 ((0, 1) × Rd )), and then in L2 (Q). 8. Convergence of the algorithm The convergence of Benamou-Brenier algorithm to a saddle point of the Lagrangian L is shown.

20

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

Proposition 8.1. (ψ, q, μ) is a saddle point of the Lagrangian L deﬁned in (9) if and only if it is a ﬁxed point of the Benamou-Brenier algorithm (12)-(14). Proof. Let (ψ, q, μ) be a saddle point of L deﬁned in (9). We denote by (ψ , q , μ ) the new triplet obtained after one iteration of the algorithm. Let us show that (ψ , q , μ ) = (ψ, q, μ) in S. From property (I)2 of (I), and taking h = ψ − ψ, we obtain ∇t,x (ψ − ψ)2 = 0 in step A (12). According to the Poincaré inequality, we get ψ = ψ in H 1 ((0, 1) ×Ω)/R. In step B (13), we look for the unique q verifying μ +∇ψ −q , p −q ≤ 0, for all p ∈ P. From Properties (I)1 and (I)3 which characterize a saddle point, we get ψ = ψ . Having q = q is a good candidate for q and therefore the only one. Finally, ∇t,x ψ = ∇t,x ψ = q = q and we get μ = μ in step C (14). Finally, let (ψ, q, μ) be a ﬁxed point of the algorithm. Let us show that it is a saddle point of L. Step C gives immediately ∇t,x ψ = q, and consequently step B gives μ, p − q ≤ 0 for all p ∈ P. From step A, we have G(h) + μ, ∇t,x h = 0 for all h ∈ H 1 (Q). Since the three Properties (I) are veriﬁed, (ψ, q, μ) is a saddle point of L from Proposition 3.1. It is now possible, according to Proposition 8.1, to reformulate the problem of convergence of the algorithm (12)-(14) to a saddle point of the Lagrangian L, as a problem of convergence to a ﬁxed point of an operator that concatenates the 3 steps of the Benamou-Brenier algorithm. For this purpose, we rely on ﬁxed points of non-expansive operators. This strategy can indeed be used to show the convergence, weak or strong, of proximal splitting algorithms [21,10,3,6]. We consider the space H = L2 (Q)d+1 × L2 (Q)d+1 , provided with the scalar product (μ1 , q1 ), (μ2 , q2 )H = μ1 , μ2 L2 + r2 q1 , q2 L2 , so that (H, ., .) is an Hilbert space. Let B : H → H be the operator which associate to (μ, q) the product (μ , q ) of steps B (13) and C (14) of Benamou-Brenier’s algorithm. Here ψ just acts as an auxiliary variable. Indeed, if (μ∗ , q ∗ , ψ ∗ ) is a saddle point of the Lagrangian L deﬁned in (9), then (μ∗ , q ∗ ) = B(μ∗ , q ∗ ). Conversely if (μ∗ , q ∗ ) is a ﬁxed point of B then (μ∗ , q ∗ , ψ ∗ ) is a saddle point of the Lagrangian, where ψ ∗ is the unique element of S which satisﬁes q ∗ = ∇ψ ∗ . We then have the following result, demonstrated in Appendix B. Proposition 8.2. Operator B is non-expansive on H and quasi-ﬁrmly non-expansive on B(H). As B is non-expansive, the Benamou-Brenier algorithm weakly converges in L2 to a ﬁxed point of B (i.e. a saddle point of L) whose existence is shown by Theorem 3.1. From the quasi-ﬁrmly non-expansiveness, we can also deﬁne a relaxed version of the algorithm with strong-L2 convergence. We refer to Appendix B and the results of H. Bauschke in [2] for more details. Notice that Proposition 8.2 also justiﬁes the convexity (and closure) of the set of saddle points of L. The set of ﬁxed points of a non-expansive operator is indeed a closed convex set. The operator B immediately gives us this convexity for the components μ and q. The characteristic (I)3 of the Properties (I) as well as the linearity of the gradient operator ∇t,x transfer this convexity to the component ψ, and therefore to the set of saddle points of L. 9. Conclusion and perspectives The starting point of our work is the study of the Lagrangian augmented algorithm of Benamou-Brenier [4]. We show in the section 8 the convergence of this algorithm to a saddle point of the Lagrangian L, which models the dynamic formulation of the optimal transport problem.

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

21

The convergence of the algorithm is conditioned by the existence of a saddle point for the Lagrangian L. In sections 4 and 5, we thus tackle the problem of existence of such saddle points. In section 6, we show the uniqueness of the couple density/momentum. Our proof is based on the properties of regularity and the dynamic structure of the velocity ﬁeld, whereas previous methods such as [1] or [28] use the energetic formulation and geodesic displacements in Wasserstein’s spaces. Our study has been carried out in the most general conditions, especially in cases where the initial and ﬁnal densities ρ0 and ρ1 vanish on some subset of the transport domain, which generally involve non-regular optimal transport plans. As far as we know, this is the ﬁrst convergence proof of the Benamou-Brenier algorithm for vanishing densities in L2 . We also present in Proposition 6.2 new results on the regularity and the control of a velocity ﬁeld associated to an optimal transport. In Theorem 7.1, we ﬁnally exhibit the minimal properties that a velocity ﬁeld has to satisfy in order to be associated to an optimal transport in L2 . In forthcoming works, we would like to analyze the convergence properties of dynamic optimal transport algorithms: stopping or distance criteria with respect to the solution, theoretical information on the speed of convergence, etc. Another perspective concerns the extension of our existence and uniqueness results in L2 to generalized dynamic optimal transport settings, especially non-isotropic domains (see [19]) or Riemannian manifolds. Acknowledgments This study has been carried out with ﬁnancial support from the French State, managed by the French National Research Agency ANR TOMMI (ANR 2011 BS01 014 01) and GOTMI (ANR-16-CE33-0010-01). The project has also received funding from Grenoble INP, through SEI grant TOSCANA. Appendix A. Burgers equation and regularity results on the velocity ﬁeld of dynamic optimal transport The aim of this ﬁrst section is to give a sketch of proof for Proposition 4.1 (Burgers equation is satisﬁed by the velocity ﬁeld v = vφ ) and Proposition 6.2 (uniform control of the gradients of ﬁelds v = vφ and vγ = vγφ ) of the main paper. With Proposition 6.2, we can state important regularity results presented in subsection A.3 below in Corollary A.1 and Corollary A.2 (Remark A.2 in the main paper): ∇t,x v ∈ Lp (0, 1; Lqloc (Rd )) for all 1 < 1,p p ≤ +∞ and 1 ≤ q < +∞ such that 1/p + 1/q > 1, and especially v ∈ Wloc ([0, 1] × Rd ) (i.e. ∀Ω ⊂ Rd , v ∈ 1,p W ((0, 1) × Ω)) for all 1 ≤ p < 2. We ﬁrst recall Hypothesis Γ1 . Property (Γ1 ). φ and φ∗ are convex, continuous and achieve a minimum on Rd . The statements of the two results we prove in this appendix are the following. Proposition A.1 (Proposition 4.1 of the main paper). With the property (Γ1 ), v satisﬁes the Burger’s equation, that is to say, in the distribution sense: 1 ∂t v + ∇x |v|2 = 0, 2

(A.1)

which is a generalized form of ∂t v + v · ∇x v = 0. Proposition A.2 (Proposition 6.2 of the main paper). We assume that φ satisﬁes the property (Γ1 ). Let R > R > 0 and a ∈ Rd such that φ(a) = inf φ and let M = sup |∂φ(x)|. Rd

x∈B(a,2(R+|a|))

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

22

Then there exists a constant C > 0 – independent of φ, γ, a, R and R , such that for all t0 ∈ (0, 1) satisfying the condition t0 < min {1/2, (R − R)/(M + 2|a|)}, and by setting v0 = v, we have the property: ∀γ ≥ 0, ∀t ∈ (0, t0 ],

|∇x vγ (t, x)| dx ≤

C Ld (B(a, R )). t0 (1 − t0 )

(A.2)

B(a,R)

Thus ∇x vγ ∈ L∞ (0, t0 ; L1 (B(a, R))), for all γ ≥ 0. In subsection A.1, we ﬁrst consider an “ideal” framework, i.e. without breaks. For this purpose, we assume that the potential φ is regular and satisﬁes the following property (Γ2 ). Property (Γ2 ). φ satisﬁes (Γ1 ), is of class C 1 , and ∇x φ is Lipschitz (i.e. φ ∈ C 1,1 (Rd )). The propositions are ﬁrst established under the assumption (Γ2 ) and then extended to the general framework for φ satisfying the property (Γ1 ). To that end, a regularization of the potential φ with Moreau envelope is considered in the subsection A.2. The ﬁnal proofs (by compilation of previous results) of Propositions A.1 and A.2 are stated in subsection A.3. In subsection A.4, we give additional results concerning the regularity of the velocity ﬁeld. These complementary results are not directly necessary for the proofs of uniqueness of the main paper. A.1. Regular case (Γ2 ) We assume that φ satisﬁes the hypothesis (Γ2 ). In this “ideal” case (without breaks), we want to show that the velocity ﬁeld v satisﬁes the Burgers equation in the sense of distributions (A.1), as well as the

1 d control of the gradient of the velocity ﬁeld: ∇t,x v ∈ L∞ loc [0, 1), L (R ) . The operator p can be interpreted as a spatial “inverse” of the operator X(t, ·) = ∇x (φt ) = (1−t) id +t∇φ. It is indeed invertible when φ satisﬁes the hypothesis (Γ2 ). In other words, we have y = p(t, x) ⇔ x = (1 − t)y + t∇φ(y).

(A.3)

As a consequence, the velocity v can be deﬁned from p by: v(t, x) = ∇φ(p(t, x)) − p(t, x) =

x − p(t, x) , t

(A.4)

and can be continuously extended on [0, 1) × Rd (i.e. in t = 0): v(0, ·) = ∇φ − id. Remark A.1. We note that φt = (1/2)(1 − t)| · |2 + tφ is of class C 1 , strictly convex and superlinear, since φ is convex and | · |2 is strictly convex and superlinear. Thus, for all t ∈ (0, 1), X(t, ·) = (1 − t) id +t∇φ = ∇φt is bijective of inverse p(t, ·) = ∇x (φt )∗ . Proposition A.3. Under the property (Γ2 ), v satisﬁes the Burgers equation (A.1) in the sense of distributions. Sketch of the proof. If the potential φ is of class C 1 , then by diﬀerentiating the advection relation ∇x φ−id = ∂t ∇x φt = v(t, ∇x φt ), we obtain (see Remark A.2): 0 = ∂tt ∇x φt = (∂t v + v · ∇x v)(∇x φt ).

1 d In order to prove the second regularity result ∇t,x v ∈ L∞ loc [0, 1), L (R ) , we now present intermediate results on the potential φ.

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

23

Proposition A.4. We assume that φ satisﬁes the property (Γ2 ). Let R > R > 0 and a ∈ Rd such that φ(a) = inf φ. Then, for all t ∈ [0, t0 ], there exists t0 ∈ (0, 1) such that for all t ∈ [0, t0 ], with Rd

t0 < min

1 R − R , 2 M + 2|a|

for

M=

sup

|∂φ(x)|,

(A.5)

x∈B(a,2(R+|a|))

we have p(t, B(a, R)) ⊂ p(t0 , B(a, R )). Sketch of the proof. Let x ∈ B(a, R) and t, t0 ∈ [0, 1) such that t0 > 0 and t ∈ [0, t0 ]. We have p(t, x) ∈ p(t0 , B(a, R )) if and only if there exists y ∈ B(a, R ) such that p(t0 , y) = p(t, x). According to the deﬁnition of p (by the equivalence (A.3)), for y ∈ Rd we have: p(t0 , y) = p(t, x) ⇔ y = (1 − t0 ) p(t, x) + t0 ∇φ(p(t, x)).

(A.6)

Let us take y = (1 − t0 ) p(t, x) + t0 ∇φ(p(t, x)) and look for a suﬃcient condition on t0 for y ∈ B(a, R ). We have p(t, (1 − t)a) = a and then | p(t, x) − a| ≤ (R + |a|)/(1 − t0 ). By taking t0 ≤ 1/2, we thus have p(t, x) ∈ B(a, 2(R + |a|)). For all t ∈ [0, t0 ], for x ∈ B(a, R) and y = (1 − t0 ) p(t, x) + t0 ∇φ(p(t, x)), we get |y − a| ≤ (1 − t0 )| p(t, x) − a| + t0 |∇φ(p(t, x)) − a| ≤ R + t0 (M + 2|a|) . If we assume t0 < min{1/2, (R − R)/(M + 2|a|)}, then for all t ∈ [0, t0 ] and x ∈ B(a, R), y = (1 − t0 ) p(t, x) + t0 ∇φ(p(t, x)) ∈ B(a, R ) i.e. p(t0 , y) = p(t, x), and therefore we have the inclusion p(t, B(a, R)) ⊂ p(t0 , B(a, R )). Lemma A.1. We assume that φ satisﬁes the property (Γ2 ). For every t ∈ [0, 1), p(t, ·) is diﬀerentiable almost everywhere on Rd . Moreover, for almost every x ∈ Rd , ∇φ is diﬀerentiable in p(t, x) and D2 φ is such that: ∇x p(t, x) = (t D2 φ(p(t, x)) + (1 − t)I)−1

(A.7)

where I ∈ Md (R) is the identity matrix. Proof. Let t ∈ [0, 1). The operator p(t, ·) is Lipschitz and bijective, and from (A.3) its inverse is p(t, ·)−1 = t∇φ + (1 − t) id = X(t, ·). Recall that by hypothesis ∇φ is assumed to be Lipschitz. Rademacher’s Theorem (see for instance [13], p. 81) states that if f : Rd −→ Rm is a locally Lipschitz function, then f is Ln -almost everywhere Fréchet-diﬀerentiable (and its diﬀerential in the sense of Fréchet coincides with its diﬀerential in the sense of distributions). According to this last Theorem, ∇φ and p(t, ·) are diﬀerentiable almost everywhere on Rd and their gradients coincide with their derivatives in the sense of distributions. Thus the set F of points in Rd where ∇φ is not diﬀerentiable is of zero Lebesgue measure. Since ∇φ is assumed to be Lipschitz, then so does X(t, ·), which gives Ld (X(t, F )) = 0 ([13], p. 75). As X(t, F ) is the set of points x ∈ Rd for which ∇φ is not diﬀerentiable in p(t, x), this means that ∇φ is diﬀerentiable in p(t, x) for almost all x ∈ Rd . Hence p(t, ·) is diﬀerentiable at almost every x ∈ Rd , ∇φ is diﬀerentiable at p(t, x) and I = (t D2 φ(p(t, x)) + (1 − t)I)∇x p(t, x). The potential φ being convex, D2 φ(p(t, x)) is symmetric positive, and hence by coercivity, t D2 φ(p(t, x)) +(1 −t)I is symmetric positive deﬁnite and therefore invertible in M2 (R), which concludes the proof. Remark A.2 (A useful example of the application of Rademacher’s Theorem). We know that v is locally Lipschitz on (0, 1) × Rd . Thus, according to Rademacher’s Theorem, v is diﬀerentiable almost everywhere

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

24

on (0, 1) × Rd , and its diﬀerential corresponds to its derivative in the sense of distributions. In particular, we have v · ∇x v =

1 ∇x |v|2 , 2

(A.8)

in the sense of Frechet (Ld+1 -almost everywhere) and in the sense of distributions. Remark A.3 (On the contribution of property (Γ2 ) in the previous proof). Note that for every t ∈ [0, 1), the operator p(t, ·) is diﬀerentiable almost everywhere on Rd , and the operator ∇x p(t, ·) is thus well deﬁned. This property is satisﬁed regardless of the regularity of φ, since the proximal operator is always Lipschitz. The additional property brought here by the regularity C 1 of φ is in fact the bijectivity of the operator p(t, ·). The fact that ∇φ is locally Lipschitz on Rd is crucial to ensure that D2 φ is well deﬁned almost everywhere. However, the fact that ∇φ is globally Lipschitz on Rd ensures that p(t, ·) does not send sets of Rd of positive measure to negligible sets (see [13], p. 75), such as sets where φ is not twice diﬀerentiable. Such global regularity ensures that the operator D2 φ(p(t, ·)) is well deﬁned almost everywhere. Then, the assumption of property (Γ2 ) allows us to consider ∇x p(t, ·) as a function of p(t, ·) almost everywhere. We now have all the elements to state the following proposition, which is one of the main results of this subsection, concerning the control of the gradient of the velocity ﬁeld v. This result is namely required to control the solutions of the transport problem generated by the ﬁeld v in the uniqueness results of Section 6. For convenience, we will use the norm | ·|1 on Md (R), deﬁned by |A|1 = i,j |aij |, instead of the operator norm associated to the Euclidean norm on Rd . Proposition A.5. We assume that φ satisﬁes the property (Γ2 ). Let R > R > 0 and a ∈ Rd such that φ(a) = inf φ. Then there exist constants C and C (independent of φ, a, R and R ) such that for all Rd

t0 ∈ (0, 1) satisfying the condition (A.5), we have the property: ∀t ∈ (0, t0 ],

|∇x v(t, x)| dx ≤ C

B(a,R)

|∇x v(t0 , x)| dx ≤ B(a,R )

C L2 (B(a, R )) t0 (1 − t0 )

(A.9)

so that ∇x v ∈ L∞ ([0, t0 ], L1 (B(a, R))). Sketch of proof. Let t ∈ (0, 1). Remember that for all x ∈ Rd , v(t, x) = ∇φ(p(t, x)) − p(t, x). Then, for almost all x ∈ Rd , v(t, ·) is diﬀerentiable on x and ∇x v(t, x) = ∇x p(t, x)(D2 φ(p(t, x)) − I) = (t D2 φ(p(t, x)) + (1 − t)I)−1 (D2 φ(p(t, x)) − I).

(A.10)

Since, v(t, ·) is Lipschitz, |∇x v(t, ·)|1 ∈ L∞ (Rd ) ⊂ L1loc (Rd ). The function X(t, ·) = t∇φ + (1 − t) id is Lipschitz and bijective (and p(t, ·) is its inverse), we can therefore apply the generalized Change of Variable Theorem ([13], p. 117) and obtain: |∇x v(t, x)|1 dx = | det(t D2 φ(y) + (1 − t)I)| × |(t D2 φ(y) + (1 − t)I)−1 (D2 φ(y) − I)|1 dy. B(a,R)

p(t,B(a,R))

(A.11) For every y ∈ Rd where ∇φ is diﬀerentiable, the matrix D2 φ is symmetric positive and can therefore be diagonalized in an orthonormal basis. We will consider λ1 (y), · · · , λd (y) ≥ 0 the eigenvalues associated with D2 φ. From the equivalence between the | · |1 norm and the Frobenius norm, we obtain the following relation:

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

|∇x v(t, ·)|1 dx ≤

C1 B(a,R)

d ! j=1

|λj (y) − 1|

d "

(tλi (y) + (1 − t))dy ≤ C2

i=1 i=j

p(t,B(a,R))

25

|∇x v(t, ·)|1 dx,

B(a,R)

(A.12) where the constants C1 and C2 only depend on the constants of equivalence between the | · |1 norm and the Frobenius norm, and are independent of t and R. Let us take t0 ∈ (0, 1) verifying the condition (A.5) of Proposition A.4. Hence, we have p(t, B(a, R)) ⊂ p(t0 , B(a, R )). The condition (A.5) also gives t0 ≤ 1/2, and so for any t ∈ (0, t0 ], we have 1 − t ≤ 1 ≤ 2(1 − t0 ). Since λi (y) ≥ 0 (i = 1, · · · , d) for almost all y ∈ Rd , we have 0 ≤ tλi (y) + (1 − t) ≤ 2(t0 λi (y) + (1 − t0 )). Thus, thanks to the inequality (A.12), we can conclude:

|∇x v(t, x)|1 dx ≤

d #

(tλi (y) + (1 − t)) ≤ 2d−1

i=1 i=j

d #

(A.13)

B(a,R )

B(a,R)

since

|∇x v(t0 , x)|1 dx,

(t0 λi (y) + (1 − t0 )). Finally, as already mentioned at the beginning of

i=1 i=j

this proof, there exists a constant c > 0 such that |∇x v(t0 , ·)|1 is bounded by c/t0 (1 − t0 ), which completes the proof of the Proposition. This proof is useful for another lemma concerning the control of v. Gathering the diﬀerent results of this subsection concerning the control of the gradient of the ﬁeld v, we can control the solutions of the transport problem and thus obtain uniqueness results. Lemma A.2. We assume that φ satisﬁes the property (Γ2 ). Then there exists a constant C > 0 such that for all t ∈ (0, 1), ∇x v(t, ·)L∞ (Rd ) ≤

C D2 φ ∞ d + d L (R ) 1−t

Proof. As in the previous proof, using a diagonalization of D2 φ and the equivalence between | · |1 and Frobenius norms, as λi (y) ≥ 0, we obtain, for every y ∈ Rd where ∇φ is diﬀerentiable: % $ d d ! ! 1 λ (y) − 1 i C1 |(t D2 φ(y) + (1 − t)I)−1 (D2 φ(y) − I)|1 ≤ λi (y) tλi (y) + (1 − t) ≤ 1 − t d + 1 i=1 i=1 C2 2 ≤ | D φ(y)|1 + d . 1−t

(A.14)

We can then conclude by injecting the equation (A.10) into this last inequality. A.2. General case (Γ1 ) In the context of our optimal transport problem, we have proved at Proposition 3.2 that we can assume that the potential φ satisﬁes the property (Γ1 ). We are now able to extend the results of the previous subsection to potentials φ which satisfy (Γ1 ), so that φ may have non-diﬀerentiability points involving breaks in the transport plan. More precisely we can extend Propositions A.3 and A.5 to the case where φ only satisﬁes (Γ1 ) in order to show Propositions A.1 and A.2. In particular we show that ∇x v is uniformly integrable in the neighborhood of t = 0, and in Corollary A.1 we argue symmetrically to show that it is the same in the neighborhood of t = 1, and then

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

26

on all [0, 1]. To that end we consider the regularization γφ of φ with the Moreau envelope [3] deﬁned for all x ∈ Rd and for all γ > 0 as γ

φ(x) = inf

1

y∈Rd 2γ

|x − y|2 + φ(y) =

1 |x − Proxγφ (x)|2 + φ(Proxγφ (x)). 2γ

(A.15)

For all γ > 0, we also deﬁne the velocity ﬁeld vγ for all t ∈ (0, 1) and x ∈ Rd by vγ (t, x) = vγφ (t, x) = ∇φ(pγφ (t, x)) − pγφ (t, x) =

t γ where pγφ (t, x) = Prox 1−t φ

x − pγφ (t, x) , t

(A.16)

x . We also recall that γφ is of class C 1 and that for all x ∈ Rd , 1−t ∇(γφ)(x) =

x − Proxγφ (x) ∈ ∂φ (Proxγφ (x)) . γ

(A.17)

The potential γφ is then γ −1 -Lipschitz. We now show that if φ satisﬁes property (Γ1 ), then γφ satisﬁes property (Γ2 ). The results of subsection A.1 are then applied on φγ , that corresponds to transport plans without breaks. Lemma A.3. If φ satisﬁes property (Γ1 ), then γφ satisﬁes property (Γ2 ) for all γ > 0. ∗

Sketch of the proof. As (γf ) = f ∗ + γ2 | · |2 , then if f ∗ is in a Hölder space C 1,1 and admits a minimum on ∗ Rd , the same holds for (γf ) . Notice on the other hand that the functions φ and γφ have the same minima on Rd . Lemma A.4. We assume that φ satisﬁes property (Γ1 ), then (∇γφ)γ>0 is locally bounded, uniformly with respect to γ > 0. Proof. We can show that for every minimum a of φ on Rd and r > 0, we have the inclusion ∇(γφ)(B(a, r)) ⊂ ∂φ (B(a, r)). To that end, we use the relation ∇(γf ) = γ −1 (id − Proxγf ) = Proxf ∗ /γ (·/γ) [3], and then, with deﬁnition of proximal operator (y = Proxf (x) ⇔ x − y ∈ ∂f (y)), we have ∇(γf )(x) ∈ ∂f (Proxγf (x)) for all x ∈ Rd . Moreover, we have the inclusion Proxγφ (B(a, r)) ⊂ B(a, r), due to the non-expansiveness of the operator Proxγφ and the fact that a is a ﬁxed point for these operators. The union of the subdiﬀerentials of φ on B(a, r), denoted by ∂φ (B(a, r)), is bounded in Rd , since the potential φ is convex and locally Lipschitz. Then, by weak compactness of Sobolev spaces, we can conclude that for all x ∈ Rd , the set of adherence values of the family (∇γφ(x))γ , when γ > 0 tends to 0 is included in ∂φ(x), and that if φ is diﬀerentiable at x, then ∇γφ(x) converges to ∇φ(x) when γ > 0 tends to 0. The functions ∇γφ converge simply almost everywhere to ∇φ when γ tends to 0.

We recall that vγ L∞ ((0,1)×ω) ≤ 5 max{Mγ , Mγ∗ } + sup(ω) , with Mγ = ∇γφL∞ (ω) and Mγ∗ = ∂(γφ)∗ L∞ (ω) . According to Lemma A.4, the constants Mγ are uniformly bounded with respect to γ > 0. ∗ The same holds for the constants Mγ∗ with respect to γ ∈ (0, γ0 ] by noticing that ∂ (γφ) = ∂φ∗ + γ id ∗ ∗ ∗ (property of conjugation of an inf-convolution: (f g) = f + g [3]). Therefore, (vγ )γ0 ≥γ>0 is uniformly bounded on (0, 1) × ω by a constant independent of γ ∈ (0, γ0 ]. We have the relation ∇x (φt )∗ = (id − Prox(1−t)(tφ)∗ )/(1 − t) for all t ∈ (0, 1). In addition, γ Prox(1−t)(tγφ)∗ = Prox(1−t)(tφ)∗ +(1−t) 2t | · |2 = Prox(1−t)(tφ)∗ (id −γ(1 − t)[id +(1 − t)vγ (t, ·)]) .

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

27

As we have p(t, ·) = ∇x (φt )∗ and vφ (t, x) = [x − p(t, x)]/t and using the non-expansiveness of the proximal operator, we get: |vγ (t, ·) − v(t, ·)| =

1 Prox(1−t)(tγφ)∗ − Prox(1−t)(tφ)∗ ≤ γ |id +(1 − t)vγ (t, ·)| . t(1 − t) t

Since the ﬁelds vγ are uniformly bounded on (0, 1) × ω independently of γ ∈ (0, γ0 ], there exists a constant M independent of γ ∈ (0, γ0 ], such that |v(t, ·) − vγ (t, ·)L∞ (ω) ≤ M (γ/t) for all γ ∈ (0, γ0 ] and all t ∈ (0, 1). As (vγ )γ0 ≥γ>0 is uniformly bounded on (0, 1) × ω, we have: ∀α ∈ [0, 1], ∀γ ∈ (0, γ0 ], ∀t ∈ (0, 1), v(t, ·) − vγ (t, ·)L∞ (ω) ≤ C

γ α t

(A.18)

for all α ∈ [0, 1], and with C a constant independent of γ. A.3. Finalization of the proof of Propositions A.1 and A.2 We now have enough elements to prove Propositions A.1 and A.2. Sketch of proof of Proposition A.1. When φ satisﬁes property (Γ1 ), a transport plan may admit “breaks”. With Proposition A.3 the Burgers equation is nevertheless satisﬁed by the vγ ’s. According to inequality (A.18), we can conclude that vγ converges to the ﬁeld v in Lp ((0, 1) × ω) when γ goes to 0, for every 1 < p < +∞ and every bounded open set ω in Rd . From the weak compactness of Sobolev spaces, we get that vγ (t, ·) converges weakly in W 1,p (ω) to v(t, ·), for all t ∈ (0, 1), since ∇x vγ (t, ·)L∞ (ω) ≤ c/t(1 − t). As the ﬁelds vγ are uniformly bounded on (0, 1) × ω independently of γ ∈ (0, γ0 ], we can apply these properties of convergence in the weak formulation of Burgers equation and conclude. Sketch of proof of Proposition A.2. As already mentioned in the proof of Lemma A.4, we have the inclusion ∇(γφ)(B(a, r)) ⊂ ∂φ (B(a, r)), and thus, by setting r = 2(R + |a|), we have Mγ = supB(a,r) |∇γφ| ≤ M . Hence, a time t0 verifying the hypothesis of the statement (independent of γ), also satisﬁes the hypothesis of Proposition A.5 for all γ > 0. We can therefore apply this last proposition for such a t0 for all vγ to obtain (A.2). For the case γ = 0 (i.e. v0 = v), the weak convergence of vγ (t, ·) to v(t, ·) in W 1,p (ω) can be used to conclude, by weak lower semi-continuity of the L1 norm on B(a, r). A.4. An independent but notable regularity result We now show that Proposition A.2 can be generalized to ∇t,x v ∈ L∞ (0, 1; L1 (Ω)) for every bounded open set Ω of Rd . This property is not required in the proof of the results presented in the main paper. It nevertheless gives a new insight of the regularity and the control of the velocity ﬁeld of an isotropic optimal transport for the L2 distance. Corollary A.1 (Corollary 6.1 of the main paper). Assume that φ satisﬁes the property (Γ1 ). Let Ω be a bounded open set Then ∇t,x v ∈ L∞ (0, 1; L1 (Ω)) and there exists K > 0 such that for all t ∈ (0, 1), |∇t,x v(t, x)| dx ≤ K.

(A.19)

Ω

Sketch of the proof. We symmetrize the result of Proposition A.2 (case γ = 0) respectively in the neighborhood of t = 0 and t = 1, and apply the upper bound c/t(1 − t) in the middle.

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

28

Corollary A.2 (Remark A.2 of the main paper). We assume that φ satisﬁes the property (Γ1 ). Let Ω be a bounded open set. For all p, q ≥ 1 such that 1/p + 1/q > 1, we have ∇t,x v ∈ Lp (0, 1; Lq (Ω)). In particular v ∈ W 1,p ((0, 1) × Ω) for all 1 ≤ p < 2. Sketch of the proof. We jointly use the result of Corollary A.1 with the estimate c/t(1 −t). We thus partially bound from above |∇x v(t0 , ·)|1 in order to be able to apply Corollary A.1. More precisely, for all ω ⊂⊂ Ω: ⎛ ⎞ pq ⎝ |∇x v(t, x)|p dx⎠ ≤ ω

with a = q(p − 1)/p, and t → take p = q < 2.

C ta (1−t)a

⎛ a

c ⎝ ta (1 − t)a

⎞ pq |∇x v(t, x)| dx⎠ ≤

ω

C , ta (1 − t)a

is integrable on (0, 1). For the particular case W 1,p , it is suﬃcient to

Corollary A.1 and Corollary A.2 give the most consistent regularity that one can have for general velocity ﬁeld v. Indeed, according to Corollary A.2, a velocity ﬁeld v deﬁned with respect to a potential φ (verifying the property (Γ1 )) for all t ∈ (0, 1) by v(t, ·) = vφ (t, ·) =

id − p(t, ·) 1 = (id −(tφ + (1 − t) id)∗ ) , t t

(A.20)

1,p is an element of Wloc ([0, 1] × Rd ). Hence the restriction of v to any bounded open set Ω of Rd is an element 1,p of W ((0, 1) × Ω), for all p < 2. The question that arises naturally is whether v = vφ could not be an 1 element of Hloc ([0, 1] × Rd ). In general, this is not the case (see for instance Caﬀarelli’s counter-example on mass splitting).

Appendix B. Convergence of Benamou-Brenier’s algorithm This section is dedicated to the proof of Proposition 8.2 in the main paper. We ﬁrst recall general results on non-expansive operators that will imply the weak and strong convergence of the algorithm. In what follows, (H, . , .) is an Hilbert space and M is an operator from H to H. Deﬁnition B.1 (Non-expansive operator). The operator M is called non-expansive if and only if it is 1Lipschitz, and ﬁrmly non-expansive if and only if we have Mx −My2 ≤ x −y, Mx −My, for all x, y ∈ H. The operator M is called quasi-ﬁrmly non-expansive on a subset A of H, containing the set of ﬁxed points of M, if and only if, for any ﬁxed point x∗ of M, we have for all x ∈ A: x −Mx2 +x −x∗ 2 ≥ Mx −x∗ 2 . Theorem B.1. Let M be a non-expansive operator on H, and quasi-ﬁrmly non-expansive on M(H) (the image of H by M). Assume that the set Fix(M) of the ﬁxed points of M is non-empty. Let (xn )n be a sequence of elements of H satisfying for every n ∈ N the estimate: M(xn ) − xn+1 ≤ n , where (n )n is a non-negative real sequence satisfying n n < +∞. Then (xn )n weakly converges in H to a ﬁxed point of M. Theorem B.2 (H. Bauschke [2]). Let M be a non-expansive operator over H and assume that the set Fix(M) of the ﬁxed points of M is non-empty. Let (λn )n≥0 be a sequence of parameters of [0, 1), converging to 0, and satisfying: n λn = +∞ and n |λn+1 − λn | < +∞. Given a and x0 in H, we deﬁne the sequence (xn ) by the recurrence xn+1 = λn a + (1 − λn )Mn xn (∀n ≥ 0), where we have for all n ∈ N the estimate Mn xn − Mxn ≤ n , with n n < +∞. Then the sequence (xn )n converges strongly to the L2 projection of a on the closed convex set Fix(M).

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

29

The sequence (n )n represents the numerical errors inherent to the implementation of the algorithm. They are here assumed to be highly controlled, which is not realistic in practice. Theorem B.1 can be shown using classical functional analysis tools (see [21]) such as the Opial’s Lemma [24]. Notice that in Theorem B.2, it is direct to prove that the set of ﬁxed points of a non-expansive operator is a closed convex set (see Lemma 2.3-7 of [18]). The detailed proofs of these two theorems can be found in [18] (section 2.3 and its appendix B). We are now able to apply the previous convergence results to the Benamou-Brenier algorithm. To that end, it is suﬃcient to show that an iteration of the algorithm corresponds to the iteration of a certain nonexpansive operator. We consider the space H = L2 (Q)d+1 × L2 (Q)d+1 , provided with the scalar product (μ1 , q1 ), (μ2 , q2 )H = μ1 , μ2 L2 + r2 q1 , q2 L2 , so that (H, ., .) is an Hilbert space. Let B : H → H be the operator which associate to (μ, q) the product (μ , q ) of the last two steps (B and C) of the algorithm Benamou-Brenier. Here ψ just acts as an auxiliary variable. Indeed, if (μ∗ , q ∗ , ψ ∗ ) is a saddle point of the Lagrangian L(ψ, p, μ) = χP (q) + G(ψ) + μ, ∇t,x ψ − qL2 , then (μ∗ , q ∗ ) = B(μ∗ , q ∗ ). Conversely if (μ∗ , q ∗ ) is a ﬁxed point of B then (μ∗ , q ∗ , ψ ∗ ) is a saddle point of the Lagrangian, where ψ ∗ is the unique element of S which satisﬁes q ∗ = ∇ψ ∗ . The potential ψ is therefore only required for computational purposes. Proposition B.1 (Proposition 8.2 of the main paper). Operator B is non-expansive on H and quasi-ﬁrmly non-expansive on B(H). Proof. (μ1 , q1 ) and (μ2 , q2 ) being given, we obtain (μ1 , q1 ) = B(μ1 , q1 ) and (μ2 , q2 ) = B(μ2 , q2 ) with the Benamou-Brenier iterations. For i = 1, 2, we look for: • Step A: Find the unique ψi ∈ (H 1 /R)(Q) such that G(h) + μi , ∇hL2 + r∇ψi − qi , ∇hL2 = 0, ∀h ∈ (H 1 /R)(Q). • Step B: Find the unique qi such that μi + r(∇ψi − qi ), p − qi L2 ≤ 0, for all p ∈ P. • Step C: Deﬁne μi by μi = μi + r(∇t,x ψi − qi ). Let us start by studying the non-expansiveness of B. Note that by injecting the equation of step C into step A and step B (for i = 1 or i = 2), we obtain the two new equations: G(h) + μi , ∇hL2 + rqi − qi , ∇hL2 = 0, μi , p

( )

( )

( )

−

qi L2

( )

( )

≤ 0, ( )

∀h ∈ H 1 (Q)/R,

(B.1)

∀p ∈ P.

(B.2)

( )

We then set μ( ) = μ2 − μ1 , q ( ) = q2 − q1 and ψ = ψ2 − ψ1 . By taking (B.1)i=2 − (B.1)i=1 with h = ψ , and by summing (B.2)i=2 with p = q1 and (B.2)i=1 with p = q2 , we respectively obtain the two following relations:

μ , ∇ψ + rq − q, ∇ψ = 0,

(B.3)

μ , q ≥ 0.

(B.4)

Summing these two relations, we have μ , ∇ψ − q + rq − q, ∇ψ ≤ 0. Observing from Step C that μ = μ + r(∇ψ − q ), we then obtain

|μ|2 − |μ |2 = μ − μ , μ + μ = −r∇ψ − q , 2(μ + r(∇ψ − q )) − r(∇ψ − q )

= −2rμ , ∇ψ − q + r2 |∇ψ − q |2 ≥ 2r2 ∇ψ , q − q + r2 |∇ψ − q |2 .

(B.5)

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

30

Moreover, we have

∇ψ , q − q = ∇ψ − q + q, q − q = ∇ψ − q, q − q − |q − q|2 +

= ∇ψ − q , q − q +

1 2 |q | − |q|2 + |q − q|2 . 2

1 2 |q | − |q|2 + |q − q|2 2

(B.6)

By re-injecting (B.6) in (B.5), we get: |μ|2 − |μ |2 ≥ r2 (|q |2 − |q|2 ) + |q − q|2 + 2∇ψ − q , q − q + |∇ψ − q |2

≥ r2 (|q |2 − |q|2 ) + r2 |∇ψ − q|2 ,

(B.7)

which leads to

r2 |∇ψ − q|2 + (|μ |2 + r2 |q |2 ) ≤ (|μ|2 + r2 |q|2 )

r2 |∇ψ − q|2 + B(μ1 , q1 ) − B(μ2 , q2 )H ≤ (μ1 , q1 ) − (μ2 , q2 )H ,

(B.8) (B.9)

so that the operator B is non-expansive. We now demonstrate the quasi-ﬁrmly non-expansiveness of B on B(H). Let (μ∗ , q ∗ ) be a ﬁxed point of B, thus included in B(H). Using equation (B.8) with (ψ, q, μ) = (ψ − ψ ∗ , q − q ∗ , μ − μ∗ ) and (ψ , q , μ ) = (ψ − ψ ∗ , q − q ∗ , μ − μ∗ ), we obtain: r2 |∇(ψ − ψ ∗ ) − (q − q ∗ )|2 + (|μ − μ∗ |2 + r2 |q − q ∗ |2 ) ≤ (|μ − μ∗ |2 + r2 |q − q ∗ |2 ).

(B.10)

For the ﬁrst term, we get r2 |∇(ψ − ψ ∗ ) − (q − q ∗ )|2 = r2 |∇ψ − q|2 = |(μ − μ) + r(q − q)|2 = |μ − μ|2 + 2rμ − μ, q − q + r2 |q − q|2 , with μ − μ, q − q ≥ 0, according to the relation (B.4). As in [14], we thus obtain: |μ − μ|2 + r2 |q − q|2 + (|μ − μ∗ |2 + r2 |q − q ∗ |2 ) ≤ (|μ − μ∗ |2 + r2 |q − q ∗ |2 ), B(μ, q) − (μ, q)2H + (μ , q ) − (μ∗ , q ∗ )2H ≤ (μ, q) − (μ∗ , q ∗ )2H so that B is quasi-ﬁrmly non-expansive. References [1] L. Ambrosio, N. Gigli, G. Savaré, Gradient Flows: in Metric Spaces and in the Space of Probability Measures, Springer Science & Business Media, 2008. [2] H. Bauschke, The approximation of ﬁxed points of compositions of nonexpansive mappings in Hilbert space, J. Math. Anal. Appl. 202 (1996) 150–159. [3] H. Bauschke, P.L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, vol. 408, Springer, 2011. [4] J.D. Benamou, Y. Brenier, A computational ﬂuid mechanics solution to the Monge-Kantorovich mass transfer problem, Numer. Math. 84 (2000) 375–393. [5] J.D. Benamou, B.D. Froese, A.M. Oberman, Numerical solution of the optimal transportation problem using the MongeAmpère equation, J. Comput. Phys. 260 (2014) 107–126. [6] R.I. Boţ, E.R. Csetnek, D. Meier, Inducing strong convergence into the asymptotic behaviour of proximal splitting algorithms in Hilbert spaces, Optim. Methods Softw. 34 (2019) 489–514. [7] L. Caﬀarelli, Boundary regularity of maps with convex potentials, Comm. Pure Appl. Math. 45 (1992) 1141–1151. [8] L. Caﬀarelli, Boundary regularity of maps with convex potentials–II, Ann. of Math. 144 (1996) 453–496.

R. Hug et al. / J. Math. Anal. Appl. 485 (2020) 123811

31

[9] L. Chizat, G. Peyré, B. Schmitzer, F.X. Vialard, Unbalanced optimal transport: dynamic and Kantorovich formulations, J. Funct. Anal. 274 (2018) 3090–3123. [10] P.L. Combettes, J.C. Pesquet, A Douglas-Rachford splitting approach to nonsmooth convex variational signal recovery, IEEE J. Sel. Top. Signal Process. 1 (2007) 564–574. [11] D. Cordero-Erausquin, Sur le transport de mesures périodiques, C. R. Acad. Sci., Sér. 1 Math. 329 (1999) 199–202. [12] R.J. DiPerna, P.L. Lions, Ordinary diﬀerential equations, transport theory and Sobolev spaces, Invent. Math. 98 (1989) 511–547. [13] L. Evans, R. Gariepy, Measure Theory and Fine Properties of Functions, CRC Press, 1992. [14] M. Fortin, R. Glowinski, Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems, Studies in Mathematics and Its Applications, Elsevier Science, 1983. [15] V. Girault, P.A. Raviart, Finite Element Methods for Navier-Stokes Equations: Theory and Algorithms, Springer-Verlag, 2011. [16] M. Goldman, F. Otto, A variational proof of partial regularity for optimal transportation maps, arXiv preprint, arXiv: 1704.05339, 2017. [17] K. Guittet, On the time-continuous mass transport problem and its approximation by augmented Lagrangian techniques, SIAM J. Numer. Anal. 41 (2004) 382–399. [18] R. Hug, Mathematical analysis and convergence of an algorithm for dynamic optimal transport: case of non-smooth transport plans, Ph.D. thesis, Université de Grenoble, 2016. [19] R. Hug, E. Maitre, N. Papadakis, Multi-physics optimal transportation and image interpolation, ESAIM Math. Model. Numer. Anal. 49 (2015) 1671–1692. [20] L. Kantorovitch, On the translocation of masses, Manage. Sci. 5 (1958) 1–4. [21] P.L. Lions, B. Mercier, Splitting algorithms for the sum of two nonlinear operators, SIAM J. Numer. Anal. 16 (1979) 964–979. [22] D. Lombardi, E. Maitre, Eulerian models and algorithms for unbalanced optimal transport, ESAIM Math. Model. Numer. Anal. 49 (2015) 1717–1744. [23] R. McCann, A convexity principle for interacting gases, Adv. Math. 128 (1997) 153–179. [24] Z. Opial, Weak convergence of the sequence of successive approximations for non-expansive mappings, Bull. Amer. Math. Soc. 73 (1967) 591–597. [25] N. Papadakis, G. Peyré, E. Oudet, Optimal transport with proximal splitting, SIAM J. Imaging Sci. 7 (2014) 212–238. [26] G. Peyré, M. Cuturi, Computational optimal transport, Found. Trends Mach. Learn. 11 (2019) 355–607. [27] W. Rudin, N. Dhombres, F. Hoﬀman, Analyse réelle et complexe, Masson, Paris, 1975. [28] F. Santambrogio, Optimal Transport for Applied Mathematicians, Birkhäuser, NY, 2015. [29] C. Villani, Topics in Optimal Transportation, American Mathematical Society, 2003. [30] C. Zuily, Éléments de distributions et d’équations aux dérivées partielles: cours et problèmes résolus, Dunod, 2002.

On the convergence of augmented Lagrangian method for optimal transport between nonnegative densities

On the convergence of augmented Lagrangian method for optimal transport between nonnegative densities

Recommend Documents