Nonlinear Analysis 74 (2011) 7455–7473
Contents lists available at SciVerse ScienceDirect
Nonlinear Analysis journal homepage: www.elsevier.com/locate/na
Alternating proximal algorithms for linearly constrained variational inequalities: Application to domain decomposition for PDE’s✩ H. Attouch a , A. Cabot a,∗ , P. Frankel a , J. Peypouquet b a
Département de Mathématiques, Université Montpellier II, CC 051 Place Eugène Bataillon, 34095 Montpellier Cedex 5, France
b
Departamento de Matemática, Universidad Técnica Federico Santa María, Avenida España 1680, Valparaíso, Chile
article
info
abstract
Article history: Received 14 June 2010 Accepted 30 July 2011 Communicated by Ravi Agarwal
Let X, Y, Z be real Hilbert spaces, let f : X → R ∪ {+∞}, g : Y → R ∪ {+∞} be closed convex functions and let A : X → Z, B : Y → Z be linear continuous operators. Let us consider the constrained minimization problem
MSC: 65K05 65K10 49J40 90C25
Given a sequence (γn ) which tends toward 0 as n → +∞, we study the following alternating proximal algorithm
Keywords: Convex minimization Alternating minimization Proximal algorithm Variational inequalities Monotone inclusions Domain decomposition for PDE’s
(P )
(A)
min{f (x) + g (y) : Ax = By}.
α 1 2 2 ‖ A ζ − By ‖ + ‖ζ − x ‖ ; ζ ∈ X x = argmin γ f (ζ ) + n Z n X n+1 n+1 2 2 1 ν yn+1 = argmin γn+1 g (η) + ‖Axn+1 − Bη‖2Z + ‖η − yn ‖2Y ; η ∈ Y , 2
2
where α and ν are positive parameters. It is shown that if the sequence (γn ) tends moderately slowly toward 0, then the iterates of (A) weakly converge toward a solution of (P ). The study is extended to the setting of maximal monotone operators, for which a general ergodic convergence result is obtained. Applications are given in the area of domain decomposition for PDE’s. © 2011 Elsevier Ltd. All rights reserved.
1. Introduction Let X, Y, Z be real Hilbert spaces respectively endowed with the scalar products ⟨., .⟩X , ⟨., .⟩Y and ⟨., .⟩Z and the corresponding norms. Let f : X → R ∪ {+∞}, g : Y → R ∪ {+∞} be closed convex proper functions and let A : X → Z, B : Y → Z be linear continuous operators. In this study, our aim is to solve convex structured minimization problems of the form
(P )
min{f (x) + g (y) : Ax = By}.
✩ With the support of the French ANR grant ANR-08-BLAN-0294-03. J. Peypouquet was partly supported by FONDECYT grant 11090023 and Basal Proyect, CMM, Universidad de Chile. ∗ Corresponding author. E-mail addresses:
[email protected] (H. Attouch),
[email protected] (A. Cabot),
[email protected] (P. Frankel),
[email protected] (J. Peypouquet).
0362-546X/$ – see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.na.2011.07.066
7456
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473
In order to find a point that minimizes the map (x, y) → Φ (x, y) = f (x) + g (y) on the subspace {(x, y) ∈ X × Y, Ax = By}, we propose the following alternating algorithm:
(A)
α 1 2 2 xn+1 = argmin γn+1 f (ζ ) + ‖Aζ − Byn ‖Z + ‖ζ − xn ‖X ; ζ ∈ X 2 2 ν 1 2 yn+1 = argmin γn+1 g (η) + ‖Axn+1 − Bη‖Z + ‖η − yn ‖2Y ; η ∈ Y , 2
2
where α, ν are positive real numbers and (γn ) is a positive sequence that tends1 toward 0 as n → +∞. Due to the structured character of the objective function Φ (x, y) = f (x) + g (y), alternating algorithms imply a reduction on the size of the subproblems to be solved at each iteration. Our particular choice of (A) is based on the following ideas. (a) Alternating algorithms with costs-to-move. Consider the convex function Φγ : X × Y → R ∪ {+∞} defined by
Φγ (x, y) = f (x) + g (y) +
1 2γ
‖Ax − By‖2Z ,
where γ is a positive real parameter. The minimization of the function Φγ is studied in [2], where the authors introduce the alternating algorithm with costs-to-move
α 1 2 2 ‖Aζ − Byn ‖Z + ‖ζ − xn ‖X ; ζ ∈ X xn+1 = argmin f (ζ ) + 2γ 2 1 ν 2 2 yn+1 = argmin g (η) + ‖Axn+1 − Bη‖Z + ‖η − yn ‖Y ; η ∈ Y , 2γ 2 α and ν being positive coefficients. If argminΦγ ̸= ∅, it is shown in [3] that the sequence (xn , yn ) converges weakly toward a minimum of Φγ . The framework of [3,2] extends the one of [4,5] from the strong coupled problem to the weak coupled problem with costs-to-change. More precisely, Q (x, y) = ‖x − y‖2Z is a strong coupling function with X = Y = Z and A = B = I while Q (x, y) = ‖Ax − By‖2Z is now a weak coupling function which allows for asymmetric and partial relations between the variables x and y. The interest of the weak coupling term is to cover many situations, ranging from decomposition methods for PDE’s to applications in game theory. In decision sciences, the term Q (x, y) = ‖Ax − By‖2Z allows to consider agents who interplay, only via some components of their decision variables. For further details, the interested reader is referred to [3]. Observing that problem (P ) corresponds formally to the minimization of the function Φγ with γ → 0, it is natural to consider a vanishing sequence (γn ) in algorithm (A). (b) Prox-penalization methods. Setting Ψ (x, y) = as
1 2
‖Ax − By‖2Z and x = (x, y) ∈ X × Y = X , we can rewrite problem (P )
min{ Φ (x) : x ∈ argminΨ }. This situation is studied in [6,7], where the authors use a diagonal proximal point algorithm combined with a penalization scheme. This kind of technique can be traced back to the pioneering work [8]. The reader is also referred to [9] where multiscale in time aspects are considered in the framework of continuous dynamics. The algorithm of [6,7] applied to our setting reads as
1 (A′ ) xn+1 ∈ argmin γn Φ (x) + Ψ (x) + ‖x − xn ‖2X . 2
Under suitable conditions on the sequence (γn ), it is shown in [6,7] that the iterates of algorithm (A′ ) converge weakly to a solution of (P ). These ideas lead us to the formulation of algorithm (A), which has the following distinctive marks. First, it uses the structured character of the objective function to reduce the size of the subproblem solved at each iteration. Second, it combines proximal iterations with a penalization scheme in a simple way, meaning that no new nonlinearities are introduced by the latter, unlike most penalization procedures available in the literature. Consider, for instance, the functions θ described in [10] (see also the references therein). The main result of the paper asserts that if the solution set is nonempty and if (γn ) tends moderately slowly toward 0, then the iterates of (A) weakly converge toward a solution of (P ). When the space R(A) + R(B) is closed in Z, the above condition on (γn ) is satisfied if the sequence
1
γn+1
−
1
γn
is bounded from above and if (γn ) ∈ l2 .
We apply our abstract results to the framework of splitting methods for PDE’s. For that purpose, we consider a domain
Ω ⊂ RN that can be decomposed into two non overlapping subdomains Ω1 , Ω2 with a common interface Γ . The functional spaces are X = H 1 (Ω1 ), Y = H 1 (Ω2 ) and Z = L2 (Γ ), the operators A : X → Z and B : Y → Z being, respectively, 1 In another direction, algorithm (A) has been recently studied in [1] in the case of a sequence (γ ) increasing toward +∞ as n → +∞. n
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473
the trace operators on Γ . The term Au − Bv corresponds to the jump of the map w =
7457
u
v
on Ω1 through on Ω2
the interface
Γ . It is shown that algorithm (A) allows to solve some given boundary value problem on Ω by solving separately mixed Dirichlet–Neumann problems on Ω1 and Ω2 . Finally, observe that by writing down the optimality conditions satisfied by the iterates of algorithm (A), we obtain 0 ∈ γn+1 ∂ f (xn+1 ) + A∗ (Axn+1 − Byn ) + α(xn+1 − xn ) 0 ∈ γn+1 ∂ g (yn+1 ) − B∗ (Axn+1 − Byn+1 ) + ν(yn+1 − yn ). This suggests to extend the previous study to the framework of maximal monotone operators, by replacing respectively the subdifferential operators ∂ f and ∂ g with two maximal monotone operators M and N. Indeed, in this more general setting we are able to prove the convergence of the sequence of weighted averages. The paper is organized as follows. Section 2 is devoted to fix the general setting and notations that are used throughout the paper. In Section 3, we prove a general result of weak ergodic convergence for the iterates of (A) in a maximal monotone setting. The key conditions are the closedness of the space R(A) + R(B) and the assumption (γn ) ∈ l2 \ l1 . The subdifferential case is analyzed in Section 4, where we establish a result of weak convergence toward a solution of (P ). We also discuss on the robustness of the algorithm with respect to computational errors. Section 5 presents further convergence results for the strongly coupled problem without costs-to-move. Finally, the applications to domain decomposition for PDE’s are illustrated in Section 6. 2. General setting and notations We recall that X, Y, Z are real Hilbert spaces respectively endowed with the scalar products ⟨., .⟩X , ⟨., .⟩Y , ⟨., .⟩Z and the corresponding norms. Let M : X ⇒ X, N : Y ⇒ Y be maximal monotone operators such that dom M ̸= ∅, dom N ̸= ∅. Let A : X → Z, B : Y → Z be linear continuous operators with adjoints A∗ : Z → X and B∗ : Z → Y. Let (γn ) be a positive sequence such that limn→+∞ γn = 0. Given positive coefficients α, ν > 0 and initial data (x0 , y0 ) ∈ X × Y, let us consider the alternating proximal algorithm defined implicitly by
(A)
0 ∈ γn+1 Mxn+1 + A∗ (Axn+1 − Byn ) + α(xn+1 − xn ) 0 ∈ γn+1 Nyn+1 − B∗ (Axn+1 − Byn+1 ) + ν(yn+1 − yn ).
Observe that the linear continuous operator A∗ A is maximal monotone, hence the operator γn+1 M + A∗ A, is also maximal monotone; see for example [11]. Therefore, the iterate xn+1 is uniquely defined by Minty’s theorem. The same holds true for the iterate yn+1 . Remark 2.1 (Strong Coupling Without Costs-to-Move). Assume that X = Y = Z and that A = B = I, which corresponds to a situation of strong coupling. In this case, algorithm (A) is well-defined even if α = ν = 0. We denote by (A0 ) the corresponding algorithm
(A0 )
0 ∈ γn+1 Mxn+1 + xn+1 − yn 0 ∈ γn+1 Nyn+1 + yn+1 − xn+1 ,
that can be equivalently rewritten as
xn+1 = (I + γn+1 M )−1 yn yn+1 = (I + γn+1 N )−1 xn+1 .
It ensues that the sequences (xn ) and (yn ) satisfy the following recurrence formulas xn+1 = (I + γn+1 M )−1 (I + γn N )−1 xn ,
yn+1 = (I + γn+1 N )−1 (I + γn+1 M )−1 yn .
This scheme consisting of a double backward step has been previously studied by Passty [12]. Algorithm (A) can be viewed as an extension of iteration (A0 ), so that our present paper appears as a continuation of the seminal work [12]. Let X = X × Y and denote by V the closed subspace {(x, y) ∈ X , Ax = By}. The normal cone operator NV takes the constant value NV ≡ V ⊥ on its domain V . Setting x = (x, y), define the monotone operators M : X ⇒ X and T : X ⇒ X , respectively, by Mx = (Mx, Ny) and Tx = Mx + NV (x) =
Mx + V ⊥
∅
if x ∈ V if x ̸∈ V .
We denote by S = T−1 0 the null set of T. It is also convenient to define the bounded linear operator A
:
X
(x, y)
→ →
Z
Ax − By,
7458
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473
and the map
Ψ
:
X
→
(x, y)
→
R 1 2
‖Ax − By‖2Z .
Recall that the Fenchel conjugate Ψ ∗ : X → R ∪ {+∞} of the map Ψ is defined by Ψ ∗ (p) = supx∈X {⟨p, x⟩X − Ψ (x)} for every p ∈ X . The next proposition shows that dom Ψ ∗ = R(A∗ ) and gives the expression of the function Ψ ∗ on its domain. Proposition 2.2. With the same notations as above, we have dom Ψ ∗ = R(A∗ ) and Ψ ∗ (A∗ z ) = z ∈ Z.
1 2 d 2 Z
(z , Ker(A∗ )) for every
Proof. Let us fix p ∈ X . From the definition of Ψ and Ψ ∗ , we have Ψ ∗ (p) = supx∈X ⟨p, x⟩X − 21 ‖Ax‖2Z . This maximization problem can be reformulated as
− inf {F (x) + G(Ax)},
(1)
x∈X
where F : X → R and G : Z → R are, respectively, defined by F (x) = −⟨p, x⟩X and G(z ) = Let us introduce the following minimization problem inf {F ∗ (−A∗ z ∗ ) + G∗ (z ∗ )} = inf
z ∗ ∈Z
z ∗ ∈Z
=
inf
1 2
‖z ‖2Z for every x ∈ X , z ∈ Z.
1 δ{−p} (−A∗ z ∗ ) + ‖z ∗ ‖2Z
z ∗ ∈Z A∗ z ∗ =p
(2)
2
1 2
‖z ∗ ‖2Z .
(3)
Since the functions F and G are convex and continuous, problems (1)–(2) are dual to each other; see for example [13, Chap. III]. Observing that the Moreau–Rockafellar qualification condition is satisfied, we derive from [13, Theorem 4.1, p. 59] that the infimum values of problems (1)–(2) are simultaneously finite and in this case they coincide. Expression (3) shows that the infimum in (2) is finite if and only if p ∈ R(A∗ ). Coming back to problem (1), we deduce that p ∈ dom Ψ ∗ if and only if p ∈ R(A∗ ). Now assume that p = A∗ z for some z ∈ Z. Then, we have inf
1
z ∗ ∈Z A∗ z ∗ =p
2
‖z ∗ ‖2Z =
which ends the proof.
inf
1
z ∗ ∈Z z ∗ −z ∈Ker(A∗ )
2
‖z ∗ ‖2Z =
1 2
d2Z z , Ker(A∗ ) ,
3. Maximal monotone framework: ergodic convergence results The notations and hypotheses are the same as in the previous section. Given any initial point (x0 , y0 ) ∈ X , the iterates generated by algorithm (A) are denoted by (xn , yn ), n ∈ N. 3.1. Preliminary results Let us start with an estimation that is at the core of the convergence analysis. For (x, y) ∈ X set hn (x, y) = α‖xn − x‖2X + ν‖yn − y‖2Y + ‖Byn − By‖2Z .
(4)
Then, we have the following. Lemma 3.1. For every (x, y) ∈ V and (ζ , η) ∈ T(x, y), there exists p ∈ V ⊥ such that hn+1 (x, y) − hn (x, y) + 2γn+1 ⟨ζ , xn+1 − x⟩X + ⟨η, yn+1 − y⟩Y
+ α‖xn+1 − xn ‖2X + ν‖yn+1 − yn ‖2Y + ‖Axn+1 − Byn ‖2Z ≤ 2γn2+1 Ψ ∗ (p). Proof. To simplify the notation, we set hn = hn (x, y). The definition of (xn+1 ) gives
α γn+1
(xn+1 − xn ) +
1
γn+1
A∗ (Axn+1 − Byn ) ∈ −Mxn+1 .
On the other hand, since (ζ , η) ∈ T(x, y), there exists p = (p, q) ∈ V ⊥ such that
ζ ∈ Mx + p and η ∈ Ny + q.
(5)
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473
7459
In particular, we have p − ζ ∈ −Mx, which by the monotonicity of M implies
α 1 ⟨xn+1 − xn , xn+1 − x⟩X + ⟨A∗ (Axn+1 − Byn ), xn+1 − x⟩X ≤ ⟨p − ζ , xn+1 − x⟩X . γn+1 γn+1 This is equivalent to
α‖xn+1 − x‖2X + α‖xn+1 − xn ‖2X ≤ α‖xn − x‖2X − 2⟨Axn+1 − Byn , Axn+1 − Ax⟩Z + 2γn+1 ⟨p, xn+1 − x⟩X − 2γn+1 ⟨ζ , xn+1 − x⟩X . In a similar way, we obtain
ν‖yn+1 − y‖2Y + ν‖yn+1 − yn ‖2Y ≤ ν‖yn − y‖2Y − 2⟨Byn+1 − Axn+1 , Byn+1 − By⟩Z + 2γn+1 ⟨q, yn+1 − y⟩Y − 2γn+1 ⟨η, yn+1 − y⟩Y . Using the properties of the inner product and the fact that Ax = By, we let the reader check that
−2⟨Axn+1 − Byn , Axn+1 − Ax⟩Z − 2⟨Byn+1 − Axn+1 , Byn+1 − By⟩Z = ‖Byn − By‖2Z − ‖Byn+1 − By‖2Z − ‖Axn+1 − Byn ‖2Z − ‖Axn+1 − Byn+1 ‖2Z . Since (x, y) ∈ V and p = (p, q) ∈ V ⊥ , we have
⟨p, xn+1 − x⟩X + ⟨q, yn+1 − y⟩Y = ⟨p, xn+1 ⟩X + ⟨q, yn+1 ⟩Y = ⟨p, (xn+1 , yn+1 )⟩X . Gathering all this information and writing cn = hn+1 − hn + 2γn+1 ⟨ζ , xn+1 − x⟩X + ⟨η, yn+1 − y⟩Y + α‖xn+1 − xn ‖2X + ν‖yn+1 − yn ‖2Y + ‖Axn+1 − Byn ‖2Z
we deduce that cn ≤ 2γn+1 ⟨p, (xn+1 , yn+1 )⟩X − ‖Axn+1 − Byn+1 ‖2Z = 2 [⟨γn+1 p, (xn+1 , yn+1 )⟩X − Ψ (xn+1 , yn+1 )] . By the definition of Ψ ∗ , the term between brackets is majorized by Ψ ∗ (γn+1 p). Since Ψ ∗ (γn+1 p) = γn2+1 Ψ ∗ (p), inequality (5) immediately follows. In order to exploit inequality (5), we may assume that Ψ ∗ (p) < +∞ for every p ∈ V ⊥ . In view of Proposition 2.2, this amounts to saying that V ⊥ ⊂ dom Ψ ∗ = R(A∗ ). Since V ⊥ = Ker(A)⊥ = R(A∗ ), this condition is equivalent to the closedness of the space R(A∗ ), which in turn is equivalent to the closedness of R(A) in Z. From now on, we assume in this section that R(A) = R(A) + R(B) is closed in Z. By using Lemma 3.1, we now prove the boundedness of the sequence (xn , yn ) along with the summability of the sequences (‖xn+1 − xn ‖2X ), (‖yn+1 − yn ‖2Y ) and (‖Axn − Byn ‖2Z ). Proposition 3.2. Assume that the space R(A) is closed in Z and that (γn ) ∈ l2 . Suppose that the set S is nonempty and let (x, y) ∈ S . We have the following (i) limn→+∞ hn (x, y) exists, hence the sequence (xn , yn ) is bounded; (ii) the sequences (‖xn+1 − xn ‖2X ), (‖yn+1 − yn ‖2Y ) and (‖Axn − Byn ‖2Z ) are summable. In particular, lim ‖xn+1 − xn ‖X = lim ‖yn+1 − yn ‖Y = lim ‖Axn − Byn ‖Z = 0
n→+∞
n→+∞
n→+∞
(6)
and every weak cluster point of the sequence (xn , yn ) lies in V . Proof. (i) Taking (ζ , η) = (0, 0) in inequality (5) and setting hn = hn (x, y), we obtain hn+1 − hn + α‖xn+1 − xn ‖2X + ν‖yn+1 − yn ‖2Y + ‖Axn+1 − Byn ‖2Z ≤ 2γn2+1 Ψ ∗ (p).
(7)
In particular, hn+1 − hn ≤ 2γn2+1 Ψ ∗ (p). Since (γn ) ∈ l2 and Ψ ∗ (p) < +∞, the following lemma shows that (hn ) converges, which in turn implies that the sequence (xn , yn ) is bounded. Lemma 3.3. Let (an ) and (εn ) be two real sequences. Assume that (an ) is minorized, that (εn ) ∈ l1 and that an+1 ≤ an + εn for every n ∈ N. Then (an ) converges. (ii) Let us sum up inequality (7) from n = 0 to +∞. Recalling that (γn ) ∈ l2 , that Ψ ∗ (p) < +∞ and that hn ≥ 0, we immediately deduce the summability of the sequences (‖xn+1 − xn ‖2X ), (‖yn+1 − yn ‖2Y ) and (‖Axn+1 − Byn ‖2Z ). Since
‖Axn − Byn ‖2Z ≤ 2‖Axn+1 − Byn ‖2Z + 2‖Axn+1 − Axn ‖2Z , the sequence (‖Axn − Byn ‖2Z ) is also summable. For the last part, use the fact that limn→+∞ ‖Axn − Byn ‖2Z = 0 and the weak lower-semicontinuity of the function (x, y) → ‖Ax − By‖2Z .
7460
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473
Remark 3.4. Proposition 3.2 still holds if one assumes only that R(A∗ ) ∩ R(M) ⊂ R(A∗ ), a condition that is weaker than the closedness of R(A). The reason is that one uses Lemma 3.1 for (x, y) = (x, y) ∈ S . 3.2. Ergodic convergence From now on, we assume that (γn ) ∈ l2 \ l1 . Condition (γn ) ̸∈ l1 is standard and common to most proximal-type algorithms.2 For practical purposes it states that the sequence (γn ) does not tend to 0 too fast as n → +∞. The condition (γn ) ∈ l2 \ l1 expresses that the sequence (γn ) tends moderately slowly toward 0. This kind of assumption appears in several works related to proximal algorithms involving maximal monotone operators with alternating features. See for example the seminal work [12] (or also [6,7]). In some particular cases, it is possible to obtain convergence of our algorithm under the sole assumption that (γn ) ̸∈ l1 (see Remark 5.8) but this relies strongly on the geometry of the problem. Finding the general conditions for convergence when (γn ) ̸∈ l2 is an interesting and challenging open question. ∑n Setting σn = k=1 γk , let us define the averages
xn =
n 1 −
σn
γk xk and yn =
k=1
n 1 −
σn
γk yk ,
(8)
k=1
and prove that the sequence ( xn , yn ) converges weakly to a point in S . Theorem 3.5. Assume that the space R(A) is closed in Z and that (γn ) ∈ l2 \ l1 . Assume moreover that the operator T is maximal monotone with S = T−1 0 ̸= ∅. Then the sequence ( xn , yn ) of averages converges weakly as n → +∞ to a point in S . Proof. Let us first prove that every weak cluster point of the sequence ( xn , yn ) is in S . Fix (ζ , η) ∈ T(x, y). By summing up inequality (5) of Lemma 3.1 for k = 0, . . . , n − 1, we obtain
⟨ζ , xn − x⟩X + ⟨η, yn − y⟩Y ≤
1 2σn
h0 (x, y) + 2Ψ (p) ∗
n −
γ
2 k
.
k=1
Let ( x∞ , y∞ ) be a weak cluster point of ( xn , yn ) as n → +∞ and let n tend to +∞ in the above inequality. By using the fact that Ψ ∗ (p) < +∞ and that (γn ) ∈ l2 \ l1 , we deduce that
⟨ζ , x∞ − x⟩X + ⟨η, y∞ − y⟩Y ≤ 0.
(9)
Since this holds whenever (ζ , η) ∈ T(x, y) we conclude ( x∞ , y∞ ) ∈ S by maximality of the monotone operator T. Now observe that the sequence ( xn , yn ) is bounded by Proposition 3.2. In order to establish the weak convergence of the sequence ( xn , yn ), it suffices to prove that it has at most one weak cluster point.3 Indeed, let (x, y) and (x′ , y′ ) be two such points, which must belong to S . Define the quantity Q (u, v) = α‖u‖2X + ν‖v‖2Y + ‖Bv‖2Z for every (u, v) ∈ X . From Proposition 3.2(i), the limits
ℓ(x, y) = lim Q (xn − x, yn − y) and ℓ(x′ , y′ ) = lim Q (xn − x′ , yn − y′ ) n→+∞
n→+∞
exist. Observe that Q (xn − x, yn − y) = Q (xn − x′ , yn − y′ ) + Q (x − x′ , y − y′ ) + 2α⟨xn − x′ , x′ − x⟩X + 2ν⟨yn − y′ , y′ − y⟩Y
+ 2⟨B(yn − y′ ), B(y′ − y)⟩Z .
(10)
Taking the average and letting ( xnk , ynk ) ⇀ (x , y ) as k → +∞, we obtain ′
′
ℓ(x, y) = ℓ(x′ , y′ ) + Q (x − x′ , y − y′ ). In a similar fashion, we deduce that
ℓ(x′ , y′ ) = ℓ(x, y) + Q (x − x′ , y − y′ ) and hence Q (x − x′ , y − y′ ) = 0 which implies (x, y) = (x′ , y′ ).
3.3. Links with Passty theorem Assume that X = Y = Z and that A = B = I, along with α = ν = 0. This induces a situation of strong coupling without costs-to-move. The corresponding algorithm is denoted by (A0 ); see Remark 2.1. Since R(A) = X, the closedness of R(A) is automatically satisfied. It is immediate that 0 ∈ T(x, y) if and only if x = y and Mx + Nx ∋ 0. Therefore, we have
S = T−1 0 = (x, x) ∈ X2 , x ∈ (M + N )−1 0 .
2 When the proximal point algorithm is seen as a discretization of the differential inclusion −˙x(t ) ∈ Ax(t ), the partial sums σ = n interpretation as discrete times. In this setting, the condition (γn ) ̸∈ l1 is an analogue for t → +∞. 3 This idea, inspired by the Opial lemma [14] (see Lemma 4.9), can also be found in [12].
∑n
k=1
γk have a natural
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473
7461
Assume that the operator M + N is maximal monotone with (M + N )−1 0 ̸= ∅. Let (γn ) be a positive sequence such that (γn ) ∈ l2 \ l1 . By arguing as in the proof of Proposition 3.2, we obtain that (a) limn→+∞ ‖yn − y‖2X exists for every y ∈ (M + N )−1 0; (b) the sequence ‖xn+1 − yn ‖2X is summable, hence limn→+∞ ‖xn+1 − yn ‖X = 0. Take ζ = 0 and x = y in the proof of Theorem 3.5. Observe that (0, η) ∈ T (y, y) holds if and only if η ∈ (M + N )y. Hence formula (9) implies that ⟨η, y∞ − y⟩Y ≤ 0 for every η ∈ (M + N )y. We deduce that y∞ ∈ (M + N )−1 0 by maximality of the monotone operator M + N. This proves that every weak cluster point of the sequence ( yn ) lies in (M + N )−1 0. Then we prove that the sequence ( yn ) has at most one weak cluster point. It suffices to adapt the proof of Theorem 3.5 by invoking point (a) above and by using the quantity Q (v) = ‖v‖2X (instead of Q (u, v)). We obtain that the sequence ( yn ) weakly converges toward some y∞ ∈ (M + N )−1 0. By using point (b) above, we infer that the sequence ( xn ) weakly converges toward x∞ = y∞ . As a conclusion, we recover the following result Theorem (Passty [12]). Assume that the operator M + N is maximal monotone with (M + N )−1 0 ̸= ∅. Let (γn ) be a positive sequence such that (γn ) ∈ l2 \ l1 and let (xn , yn ) be any sequence generated by algorithm (A0 ). Then there exists x ∞ ∈ ( M + N ) −1 0 such that both sequences of averages ( xn ) and ( yn ) converge weakly toward x∞ . 3.4. Strong monotonicity Under strong monotonicity assumptions, we are able to prove the strong convergence of the sequence (xn , yn ) itself (not only in average). Let us recall that the operator M is said to be strongly monotone with parameter a if, for every x1 , x2 ∈ dom M and every ξ1 ∈ Mx1 , ξ2 ∈ Mx2 , we have
⟨ξ2 − ξ1 , x2 − x1 ⟩X ≥ a‖x2 − x1 ‖2X . Assuming in the same way that the operator N is strongly monotone, we obtain that the operators M and T = M + NV are strongly monotone. Hence if the set S = T−1 0 is nonempty it must be reduced to a single point, say S = {(x, y)}. Proposition 3.6. Assume that the space R(A) is closed in Z and that (γn ) ∈ l2 \ l1 . If the operators M and N are strongly monotone and if S ̸= ∅ then the sequence (xn , yn ) converges strongly to the unique (x, y) ∈ S . Proof. Let us suppose that the operators M and N are strongly monotone, respectively, with parameters a, b > 0. We let the reader check that this assumption leads to a stronger form of inequality (5), which in turn implies hn+1 (x, y) − hn (x, y) + 2aγn+1 ‖xn+1 − x‖2X + 2bγn+1 ‖yn+1 − y‖2Y ≤ 2γn2+1 Ψ ∗ (p). Since (γn ) ∈ l2 and Ψ ∗ (p) < +∞, and recalling that hn (x, y) ≥ 0, the summation of the above inequality implies +∞ −
γn ‖xn − x‖2X + ‖yn − y‖2Y < +∞
n =1
and hence +∞ −
γn hn (x, y) ≤ α
n =1
+∞ −
γn ‖xn − x‖2X + (ν + ‖B‖2 )
n =1
+∞ −
γn ‖yn − y‖2Y < +∞.
n =1
Since (γn ) ̸∈ l1 and since limn→+∞ hn (x, y) exists, this limit must be equal to 0 and we deduce that limn→+∞ (xn , yn ) = (x, y). Remark 3.7. Observe that the maximality of the operator T does not come into play in the previous proof. Notice also that if the operator T is both maximal and strongly monotone, then condition S ̸= ∅ is automatically satisfied; see for example [11, Cor. 2.4] or [15, Prop. 12. 54]. 4. The subdifferential case: weak convergence results 4.1. Preliminaries Let f : X → R ∪{+∞}, g : Y → R ∪{+∞} be closed convex proper functions. Define the maximal monotone operators M and N, respectively, by M = ∂ f and N = ∂ g. The operator M coincides with the subdifferential of the function Φ defined by Φ (x, y) = f (x) + g (y) for every (x, y) ∈ X . Observe that the monotone operator T = ∂ Φ + NV = ∂ Φ + ∂δV is maximal if, and only if,
∂ Φ + ∂δV = ∂ (Φ + δV ) .
7462
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473
Maximality is guaranteed if one assumes some qualification condition such as the Moreau–Rockafellar one [16,17] or the Attouch–Brézis one [18]. In order to cover various applications to PDE’s (see paragraph 6.2), we assume the following Attouch–Brézis qualification condition
(QC )
λ(dom f × dom g − V ) is a closed subspace of X × Y.
λ>0
Under (QC ), the following claim shows that the set S = T−1 0 can be interpreted as the set of minima of a suitable function. Claim 4.1. We have
S ⊂ argminV Φ = argmin{f (x) + g (y) : Ax = By}. If condition (QC ) is satisfied, the above inclusion holds true as an equality. Proof. First recall that the inclusion ∂ Φ + ∂δV ⊂ ∂ (Φ + δV ) is always satisfied. It ensues immediately that
S = T−1 0 = [∂ Φ + ∂δV ]−1 0 ⊂ [∂ (Φ + δV )]−1 0 = argminV Φ . If condition (QC ) is satisfied, the set λ>0 λ(dom Φ − dom δV ) is a closed subspace of X . This classically implies that ∂ Φ + ∂δV = ∂(Φ + δV ) and the conclusion follows.
Recall that the iterate (xn+1 , yn+1 ) of algorithm (A) is implicitly defined by 0 ∈ γn+1 ∂ f (xn+1 ) + A∗ (Axn+1 − Byn ) + α(xn+1 − xn ) 0 ∈ γn+1 ∂ g (yn+1 ) − B∗ (Axn+1 − Byn+1 ) + ν(yn+1 − yn ).
(11)
These are the optimality conditions associated to the following minimization problems
(A)
1 α 2 2 xn+1 = argmin γn+1 f (ζ ) + ‖Aζ − Byn ‖Z + ‖ζ − xn ‖X ; ζ ∈ X 2 2 ν 1 yn+1 = argmin γn+1 g (η) + ‖Axn+1 − Bη‖2Z + ‖η − yn ‖2Y ; η ∈ Y . 2
2
For each (x, y) ∈ X , define hn (x, y) = α‖xn − x‖2X + ν‖yn − y‖2Y + ‖Byn − By‖2Z
(12)
as in Section 3. Define also the sequence (ϕn ) by
ϕn = f (xn ) + g (yn ) +
1 2γn
‖Axn − Byn ‖2Z .
(13)
Lemma 4.2. With the above notations and hypotheses, we have the following.4 (i) For every (x, y) ∈ argminV Φ and for every n ≥ 0,
hn+1 (x, y) − hn (x, y) + 2γn+1 f (xn+1 ) + g (yn+1 ) − min Φ + ‖Axn+1 − Byn+1 ‖2Z V
+ ‖Axn+1 − Byn ‖Z + α‖xn+1 − xn ‖X + ν‖yn+1 − yn ‖2Y ≤ 0. 2
2
(14)
‖Axn − Byn ‖2Z .
(15)
(ii) For every n ≥ 0,
ϕn+1 − ϕn ≤
1 2
1
γn+1
−
1
γn
Proof. In view of the optimality conditions (11), for all (x, y) ∈ X × Y we can write the subdifferential inequalities
γn+1 (f (x) − f (xn+1 )) ≥ −⟨Axn+1 − Byn , Ax − Axn+1 ⟩Z − α⟨xn+1 − xn , x − xn+1 ⟩X
(16)
γn+1 (g (y) − g (yn+1 )) ≥ ⟨Axn+1 − Byn+1 , By − Byn+1 ⟩Z − ν⟨yn+1 − yn , y − yn+1 ⟩Y .
(17)
and
4 Inequalities (5) and (14) are closely related, even if they rely on different techniques (monotonicity in the first case and subdifferential inequalities in the second one).
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473
7463
Using the properties of the inner product the reader can check that
‖By − Byn ‖2Z − ‖By − Byn+1 ‖2Z = ‖Byn+1 − Axn+1 ‖2Z − ‖By − Ax‖2Z + ‖By − Ax − (Byn − Axn+1 )‖2Z + 2⟨By − Byn+1 , Byn+1 − Axn+1 ⟩Z + 2⟨Byn − Axn+1 , Axn+1 − Ax⟩Z . Combining (16) and (17) we deduce that
‖By − Byn ‖2Z − ‖By − Byn+1 ‖2Z ≥ ‖Byn+1 − Axn+1 ‖2Z − ‖By − Ax‖2Z + ‖By − Ax − (Byn − Axn+1 )‖2Z + 2γn+1 [f (xn+1 ) − f (x) + g (yn+1 ) − g (y)] + 2α⟨xn+1 − xn , xn+1 − x⟩X + 2ν⟨yn+1 − yn , yn+1 − y⟩Y = ‖Axn+1 − Byn+1 ‖2Z − ‖Ax − By‖2Z + ‖By − Ax − (Byn − Axn+1 )‖2Z +2γn+1 (f (xn+1 ) + g (yn+1 ) − f (x) − g (y)) + α(‖xn+1 − xn ‖2X + ‖xn+1 − x‖2X − ‖xn − x‖2X ) + ν(‖yn+1 − yn ‖2Y + ‖yn+1 − y‖2Y − ‖yn − y‖2Y ). We infer that for all (x, y) ∈ X × Y, hn (x, y) − hn+1 (x, y) ≥ 2γn+1 (f (xn+1 ) + g (yn+1 ) − f (x) − g (y)) − ‖Ax − By‖2Z
+ ‖Axn+1 − Byn+1 ‖2Z + ‖By − Ax − (Byn − Axn+1 )‖2Z + α‖xn+1 − xn ‖2X + ν‖yn+1 − yn ‖2Y .
(18)
Now let (x, y) ∈ argminV Φ . Then Ax = By and f (x) + g (y) = minV Φ so that inequality (18) becomes (14). On the other hand, by using inequality (18) with x = xn and y = yn , we infer that 2γn+1 (f (xn+1 ) + g (yn+1 ) − f (xn ) − g (yn )) + ‖Axn+1 − Byn+1 ‖2Z ≤ ‖Axn − Byn ‖2Z . We finally divide by 2γn+1 and rearrange the terms to obtain (15).
(19)
4.2. Weak convergence Assuming that argminV Φ ̸= ∅, let us set
ωn =
inf
(x,y)∈X×Y
1 2
‖Ax − By‖2Z + γn f (x) + g (y) − min Φ V
= inf Ψ (x) + γn Φ (x) − min Φ x∈X
V
.
(20)
Denote by (ωn− ) the negative part of (ωn ). In the sequel, we will assume the key condition
(ωn− ) ∈ l1 . This kind of hypothesis was introduced by the second author in [7]. Proposition 4.3. Assuming that argminV Φ ̸= ∅, consider the following assertions. (i) (γn ) ∈ l2 , the space R(A) is closed in Z and condition (QC ) is satisfied. (ii) (γn ) ∈ l2 and there exists x ∈ V and p ∈ R(A∗ ) such that −p ∈ ∂ Φ (x).5 (iii) (ωn− ) ∈ l1 . Then, we have the implications (i) H⇒ (ii) H⇒ (iii). Proof. (i) H⇒ (ii) Let x ∈ argminV Φ . Since condition (QC ) is satisfied, we deduce from Claim 4.1 that x ∈ S = −1 ∂ Φ + V ⊥ 0. Hence there exists p ∈ V ⊥ such that −p ∈ ∂ Φ (x). The closedness of R(A) implies the closedness of R(A∗ ), hence we have V ⊥ = Ker(A)⊥ = R(A∗ ) and finally p ∈ R(A∗ ). (ii) H⇒ (iii) The subdifferential inequality gives for every x ∈ X ,
Φ (x) − Φ (x) ≥ ⟨−p, x − x⟩X = ⟨−p, x⟩X , where the last equality is a consequence of p ∈ R(A∗ ) ⊂ V ⊥ and x ∈ V . Since Φ (x) = minV Φ , we deduce that
Ψ (x) + γn Φ (x) − min Φ ≥ Ψ (x) − γn ⟨p, x⟩X . V
5 The corresponding z ∈ Z such that p = A∗ z plays the role of a Lagrange multiplier.
7464
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473
Taking the infimum over x ∈ X , we find
ωn ≥ − sup {γn ⟨p, x⟩X − Ψ (x)} = −Ψ ∗ (γn p) = −γn2 Ψ ∗ (p). x∈X
It ensues that ωn− ≤ γn2 Ψ ∗ (p). Since p ∈ R(A∗ ) = dom Ψ ∗ (see Proposition 2.2), the conclusion follows from the summability of (γn2 ). Notice that in infinite dimensional spaces, conditions (ii) or (iii) can be satisfied even if the space R(A) is not closed. An example will be provided in the last section. Let us now state the main result of the paper. Theorem 4.4. Let f : X → R ∪ {+∞} and g : Y → R ∪ {+∞} be closed convex proper functions. Let A : X → Z and B : Y → Z be linear continuous operators. Assume that condition (QC ) holds and that argmin{f (x) + g (y) : the qualification Ax = By} ̸= ∅. Let (γn ) be a positive sequence such that
1
γn+1
−
is majorized by some M > 0. Finally suppose that condition
1
γn
(ωn ) ∈ l holds, where the sequence (ωn ) is defined by (20). Then, we have −
1
(i) (xn , yn ) converges weakly to a point (x∞ , y∞ ) ∈ argmin{f (x) + g (y) : Ax = By}. (ii) limn→+∞ f (xn ) = f (x∞ ) and limn→+∞ g (yn ) = g (y∞ ). Proof. Let us start with several preliminary claims. Claim 4.5. For every (x, y) ∈ argminV Φ , lim α‖xn − x‖2X + ν‖yn − y‖2Y + ‖Byn − By‖2Z
n→+∞
exists in R.
Proof of Claim 4.5. Fix (x, y) ∈ argminV Φ and set hn = α‖xn − x‖2X + ν‖yn − y‖2Y + ‖Byn − By‖2Z as in (12). From inequality (14) we deduce that hn+1 − hn + 2ωn+1 + ‖Axn+1 − Byn ‖2Z + α‖xn+1 − xn ‖2X + ν‖yn+1 − yn ‖2Y ≤ 0. This implies hn+1 − hn + ‖Axn+1 − Byn ‖2Z + α‖xn+1 − xn ‖2X + ν‖yn+1 − yn ‖2Y ≤ 2ωn−+1 . It ensues that hn+1 − hn ≤ 2ωn−+1 . Since (ωn− ) ∈ l1 , owing to Lemma 3.3, we conclude that limn→+∞ hn exists.
(21)
Claim 4.6. The sequence (‖Axn − Byn ‖2Z ) is summable, and therefore limn→+∞ ‖Axn − Byn ‖Z = 0. Proof of Claim 4.6. Let us sum up inequalities (21) which are obtained for n = 0 to +∞. Recalling that (ωn− ) ∈ l1 and that hn ≥ 0, we immediately deduce the summability of the sequences (‖xn+1 − xn ‖2X ), (‖yn+1 − yn ‖2Y ) and (‖Axn+1 − Byn ‖2Z ). Since ‖Axn − Byn ‖2Z ≤ 2‖Axn+1 − Byn ‖2Z + 2‖Axn+1 − Axn ‖2Z , the sequence (‖Axn − Byn ‖2Z ) is also summable.
Claim 4.7. Setting ϕn = f (xn ) + g (yn ) + 2γ1 ‖Axn − Byn ‖2Z as in (13), we have limn→+∞ ϕn = minV Φ . n Proof of Claim 4.7. Since
ϕn+1 − ϕn ≤
M 2
1 γn+1
−
1
γn
≤ M, we derive from inequality (15) that
‖Axn − Byn ‖2Z .
(22)
From the previous claim, the sequence (‖Axn − Byn ‖2Z ) is summable. By applying Lemma 3.3, we deduce that the sequence (ϕn ) converges. Let us now set aN = 2
N −
1
γn f (xn ) + g (yn ) − min Φ + ‖Axn − Byn ‖Z . V
n =0
2
2
From inequality (14), the sequence (hN + aN ) is nonincreasing. Moreover the assumption (ωn− ) ∈ l1 allows us to assert that, for all n ∈ N, aN ≥ −2
+∞ − n=0
ωn− > −∞.
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473
7465
Thus the sequence (hN + aN ) is bounded from below, hence convergent. As a consequence, lim aN =
N →+∞
lim 2
N →+∞
N −
γn ϕn − min Φ V
n=0
exists in R.
Since γ 1 − γ1 ≤ M for every n ≥ 0, we deduce that γn ≥ n n+1 we infer from (23) that limn→+∞ ϕn = minV Φ .
(23)
1 Mn+ γ1
, hence (γn ) ̸∈ l1 . Recalling that limn→+∞ ϕn exists in R,
0
Claim 4.8. limn→+∞ Φ (xn , yn ) = minV Φ . Proof of Claim 4.8. Let (x, y) ∈ argminV Φ . Since condition (QC ) holds, we deduce from Claim 4.1 that (x, y) ∈ S = T−1 0. Hence there exists (p, q) ∈ V ⊥ such that −(p, q) ∈ ∂ Φ (x, y). The convex subdifferential inequality then gives
Φ (xn , yn ) ≥ Φ (x, y) + ⟨−(p, q), (xn , yn ) − (x, y)⟩X×Y
= min Φ − ⟨(p, q), (xn , yn )⟩X×Y .
(24)
V
Let us prove that limn→+∞ ⟨(p, q), (xn , yn )⟩X×Y = 0. From Claim 4.5, the sequence (xn , yn ) is bounded, hence it suffices to prove that 0 is the unique limit point of ⟨(p, q), (xn , yn )⟩X×Y . Let ⟨(p, q), (xnk , ynk )⟩X×Y be a convergent subsequence. We can extract a subsequence of (xnk , ynk ), still denoted by (xnk , ynk ), which weakly converges toward (x, y). The weak lower semicontinuity of the function (x, y) → ‖Ax − By‖2Z combined with Claim 4.6 implies that
‖Ax − By‖2Z ≤ lim inf ‖Axnk − Bynk ‖2Z = lim ‖Axn − Byn ‖2Z = 0, n→+∞
k→+∞
hence (x, y) ∈ V . Recalling that (p, q) ∈ V , we infer that ⊥
lim ⟨(p, q), (xnk , ynk )⟩X×Y = ⟨(p, q), (x, y)⟩X×Y = 0.
k→+∞
We immediately deduce that the whole sequence ⟨(p, q), (xn , yn )⟩X×Y converges toward 0. Hence from (24), we obtain that lim infn→+∞ Φ (xn , yn ) ≥ minV Φ . On the other hand, since Φ (xn , yn ) ≤ ϕn , we have in view of Claim 4.7
lim sup Φ (xn , yn ) ≤ lim ϕn = min Φ . n→+∞
n→+∞
V
We conclude that limn→+∞ Φ (xn , yn ) = minV Φ .
The proof of (i) relies on Opial’s lemma [14], that we recall for the sake of completeness. Lemma 4.9 (Opial). Let H be a Hilbert space endowed with the norm N. Let (ξn ) be a sequence of H such that there exists a nonempty set Ξ ⊂ H which verifies (a) for all ξ ∈ Ξ , limn→+∞ N (ξn − ξ ) exists, (b) if (ξnk ) ⇀ ξ weakly in H as k → +∞, we have ξ ∈ Ξ . Then the sequence (ξn ) weakly converges in H as n → +∞ toward a point of Ξ . Let us define the norm N (u, v) =
α‖u‖2X + ν‖v‖2Y + ‖Bv‖2Z
1/2
on the space X × Y. Since the linear operator B is
continuous, the norm N is equivalent to the canonical norm on X × Y. In view of Claim 4.5, the quantity N (xn − x, yn − y) does have a limit as n → +∞ for every (x, y) ∈ argminV Φ , which shows point (a). Let (xnk , ynk ) be a subsequence of (xn , yn ) which weakly converges toward (x, y). The weak lower semicontinuity of the function (x, y) → ‖Ax − By‖2Z combined with Claim 4.6 implies that
‖Ax − By‖2Z ≤ lim inf ‖Axnk − Bynk ‖2Z = lim ‖Axn − Byn ‖2Z = 0, k→+∞
n→+∞
hence (x, y) ∈ V . In the same way, using Claim 4.8 and the weak lower semicontinuity of Φ , we infer that (x, y) ∈ argminV Φ . This shows point (b) of Opial’s lemma and ends the proof of (i). Let us now prove that limn→+∞ f (xn ) = f (x∞ ). Using the weak lower semicontinuity of f , we have f (x∞ ) ≤ lim infn→+∞ f (xn ). On the other hand, we deduce from Claim 4.8 that lim sup f (xn ) = lim sup(f (xn ) + g (yn ) − g (yn )) n→+∞
n→+∞
= f (x∞ ) + g (y∞ ) − lim inf g (yn ). n→+∞
By the weak lower semicontinuity of g, we have g (y∞ ) ≤ lim infn→+∞ g (yn ). We infer that lim supn→+∞ f (xn ) ≤ f (x∞ ), and finally limn→+∞ f (xn ) = f (x∞ ). In the same way, we have limn→+∞ g (yn ) = g (y∞ ), which ends the proof of (ii).
7466
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473
4.3. Inexact computation of the iterates Due to the implicit character of the iterations it is important to account for possible computation errors in their implementation. The optimality conditions (11) defining xn+1 and yn+1 can be relaxed without losing the convergence properties of the algorithm. Suppose the sequences ( xn ) and ( yn ) satisfy xn+1 ) + A∗ (A xn+1 − B yn ) + α( xn+1 − xn ) 0 ∈ γn+1 ∂εn+1 f ( yn+1 ) − B∗ (A xn+1 − B yn+1 ) + ν( yn+1 − yn ), 0 ∈ γn+1 ∂εn+1 g (
(25)
where ∂ε denotes the approximate ε -subdifferential.6 The arguments in the proof of Lemma 4.2 give an additional term 4γn+1 εn+1 on the right-hand side of∑ inequality (14) and ∞ an additional 2εn+1 on the right-hand side of (15). As a consequence, Claims 4.5 and 4.6 remain true if n=1 γn εn < +∞. ∑∞ Claims 4.7 and 4.8 also hold if n=1 εn < +∞. Corollary 4.10. Theorem 4.4 holds under the same hypotheses if the iterates satisfy (25) instead of (11) provided (εn ) ∈ l1 . There is a rich literature regarding the treatment of errors in proximal-type algorithms. The interested reader may consult [19,20] or [21]. See also [22] for an alternative approach to computational errors. 5. Further convergence results for strongly coupled problems In this section, we assume that X = Y = Z and that A = B = I, along with α = ν = 0. Given closed convex functions f , g : X → R ∪ {+∞}, consider the following particular case7 of algorithm (A)
(A0 )
1 2 ‖ζ − y ‖ ; ζ ∈ X x = argmin γ f (ζ ) + n X n +1 n+1 2 1 yn+1 = argmin γn+1 g (η) + ‖xn+1 − η‖2X ; η ∈ X . 2
Using the same notations as in the previous sections, we have
V = {(x, x); x ∈ X} and argminV Φ = {(x, x); x ∈ argmin(f + g )}. Let us first start with an example. Example 5.1. Take X = R and define the functions f , g : R → R, respectively, by f (x) = 12 (x − 1)2 and g (y) = 12 (y + 1)2 . We then have argmin(f + g ) = {0}. By writing down the optimality conditions for algorithm (A0 ), we immediately obtain the following recurrence formulas (see also Remark 2.1)
γn+1 (xn+1 − 1) + xn+1 − yn = 0 γn+1 (yn+1 + 1) + yn+1 − xn+1 = 0.
We infer that yn+1 =
1
y − 2 n
(1 + γn+1 )
γn2+1 . (1 + γn+1 )2 γ2
Let us set an = (1+γ1 )2 and bn = (1+γn+1 )2 . We deduce from the above equality that |yn+1 | ≤ an |yn | + bn . To prove the n+1 n+1 convergence of the sequence (yn ), we use the following lemma borrowed from [23, Lemma 3, p. 45]. Lemma 5.2. Let (an ) and (bn ) be real sequences such that 0 ≤ an < 1 and bn ≥ 0 for every n ∈ N. Assume moreover that (1 − an ) ̸∈ l1 and that limn→+∞ 1−bnan = 0. Let (un ) be a real sequence such that un+1 ≤ an un + bn for every n ∈ N. Then, we have lim supn→+∞ un ≤ 0. It is easy to check that, if the sequence (γn ) is not summable, then the sequence (1 − an ) is not summable. Moreover, γ bn we have limn→+∞ 1− = limn→+∞ 2+γn+1 = 0. Thus the previous lemma implies that lim supn→+∞ |yn | ≤ 0, hence a n
n+1
limn→+∞ yn = 0. Finally, we have proved that if (γn ) ̸∈ l1 then limn→+∞ (xn , yn ) = (0, 0). It is worthwhile noticing that the assumption (γn ) ∈ l2 does not come into play in the above example. This is in fact a consequence of a general result that will be brought to light by Theorem 5.5(i); see also Remark 5.8. Before stating Theorem 5.5, we need the following preliminary result. 6 In Hilbert space H , given ε > 0 and F : H → R ∪ {+∞}, the approximate ε -subdifferential of F at ξ is defined by ∂ F (ξ ) = {ξ ∗ ∈ H : F (ζ ) ≥ ε F (ξ ) + ⟨ξ ∗ , ζ − ξ ⟩ − ε ∀ζ ∈ H }. 7 See Remark 2.1, where algorithm (A ) has been introduced in the framework of maximal monotone operators. 0
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473
7467
Proposition 5.3. Let f , g : X → R ∪ {+∞} be closed convex functions which are bounded from below and such that dom f ∩ dom g ̸= ∅. Let (γn ) be a positive nonincreasing sequence such that limn→∞ γn = 0. Then any sequence (xn , yn ) generated by (A0 ) satisfies limn→+∞ ‖xn − yn ‖X = 0. Proof. Let us define the sequence (ψn ) by
1 ψn = γn f (xn ) + g (yn ) + ‖xn − yn ‖2X .
(26)
2
We have 1
ψn ≥ γn inf Φ + ‖xn − yn ‖2X ,
(27)
2
hence the sequence (ψn − γn inf Φ ) is nonnegative. By using inequality (19) with A = B = I, we deduce that, for every n ∈ N,
ψn+1 − ψn ≤ (γn+1 − γn ) inf Φ . This shows that the sequence (ψn − γn inf Φ ) is nonincreasing, hence convergent. Since limn→+∞ γn = 0, the sequence (ψn ) also converges. Let us apply inequality (18) with A = B = I, α = ν = 0 and x = y; we find for all x ∈ dom f ∩ dom g ̸= ∅, 2ψn+1 − 2γn+1 (f (x) + g (x)) ≤ ‖yn − x‖2X − ‖yn+1 − x‖2X . By summing the above inequalities for n = 0, . . . , N, we obtain 2
N −
[ψn+1 − γn+1 (f (x) + g (x))] ≤ ‖y0 − x‖2X .
n=0
Since this is true for every N ∈ N, we derive that lim inf [ψn+1 − γn+1 (f (x) + g (x))] ≤ 0. n→+∞
But both terms are convergent, so we have limn→+∞ ψn ≤ 0. From (27) we immediately deduce that limn→+∞ ‖xn − yn ‖2X = 0. The approach that we now develop relies on topological ingredients that can already be found in [24–26,7]. The result below shows that if (γn ) ̸∈ l1 and limn→+∞ γn = 0, the iterates xn , yn of algorithm (A0 ) approach the set argmin(f + g ) as n → +∞. Weak convergence is obtained under the extra assumption (ωn− ) ∈ l1 , where
ωn =
inf
(x,y)∈X2
1 2
‖x − y‖X + γn f (x) + g (y) − min Φ 2
V
.
Remark 5.4. By Proposition 4.3, (ωn− ) ∈ l1 for instance if (γn ) ∈ l2 , R(A) + R(B) is closed and condition (QC ) is satisfied. Here R(A) + R(B) = R(I) = X, so the closedness of the space R(A) + R(B) is trivially fulfilled. On the other hand, in the present setting it is easy to check that condition (QC ) is satisfied if and only if
λ(dom f − dom g ) is a closed subspace of X.
λ>0
This is precisely the Attouch–Brézis condition, which ensures that ∂ f + ∂ g = ∂(f + g ) and hence (∂ f + ∂ g )−1 0 = argmin(f + g ). In the next statement, we denote by dX ·, argmin(f + g ) the distance function to the set argmin(f + g ).
Theorem 5.5. Let f , g : X → R ∪ {+∞} be closed convex functions which are bounded from below. Assume that either f or g is inf-compact.8 Let (γn ) be a positive nonincreasing sequence such that (γn ) ̸∈ l1 and limn→+∞ γn = 0. Finally, let (xn , yn ) be a sequence generated by (A0 ). Then (i) limn→+∞ dX xn , argmin(f + g ) = limn→+∞ dX yn , argmin(f + g ) = 0; (ii) If (ωn− ) ∈ l1 , then the sequence (xn , yn ) converges weakly to a point (x, x) with x ∈ argmin(f + g );
(iii) If moreover the sequence
1
γn+1
−
1
γn
is majorized by some M > 0, then the sequence (xn , yn ) converges strongly in X.
8 Recall that a function is said to be inf-compact if its sublevel sets are relatively compact.
7468
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473
Proof. First, if dom f ∩ dom g = ∅ then argmin(f + g ) = X, condition (QC ) cannot hold and the result is trivial. Hence we assume dom f ∩ dom g ̸= ∅. We can also assume that the function f is inf-compact. Since the function g is bounded from below, we derive that the function f + g is inf-compact; hence argmin (f + g ) ̸= ∅. (i) In view of Proposition 5.3, it suffices to prove that limn→+∞ dX yn , argmin(f + g ) = 0. Set A = B = I and α = ν = 0 in inequality (14) to deduce that for every y ∈ argmin(f + g ) and every n ∈ N, we have
‖yn+1 − y‖2X − ‖yn − y‖2X + 2γn+1 Φ (xn+1 , yn+1 ) − min Φ + ‖xn+1 − yn+1 ‖2X ≤ 0. V
(28)
Let P denote the projection operator onto the closed convex set argmin(f + g ) and take y = P (yn ). Setting un = d2X yn , argmin(f + g ) , we derive from (28) that
un+1 − un + 2γn+1 Φ (xn+1 , yn+1 ) − min Φ
V
≤ 0.
(29)
We now follow the same arguments as those used by the second author in [7, Theorem 3.1]. We distinguish two cases. (a) There exists n0 ∈ N such that for all n ≥ n0 , Φ (xn , yn ) > minV Φ . (b) For all n0 ∈ N, there exists n ≥ n0 such that Φ (xn , yn ) ≤ minV Φ . Case (a). Assume there exists n0 ∈ N such that for all n ≥ n0 , Φ (xn , yn ) > minV Φ . From inequality (29), we deduce that the sequence (un )n≥n0 is nonincreasing and convergent. We must prove that limn→+∞ un = 0. Using again inequality (29), we can assert that the sequence (γn (Φ (xn , yn ) − minV Φ )) is summable. Moreover, since (γn ) ̸∈ l1 , we have lim infn→+∞ Φ (xn , yn ) = minV Φ . Consider a subsequence of (xn , yn ), still denoted by (xn , yn ), such that limn→+∞ Φ (xn , yn ) = minV Φ . Since the function g is bounded from below, the sequence (f (xn )) is majorized. Using the inf-compactness of the map f , we obtain that the sequence (xn ) is relatively compact in X. Thus there exist a subsequence (xnk ) along with x ∈ X such that limk→+∞ xnk = x strongly in X. In view of Proposition 5.3, we also have limk→+∞ ynk = x strongly in X. The closedness of the function Φ allows to assert that Φ (x, x) ≤ lim infk→+∞ Φ (xnk , ynk ) = minV Φ . Hence (x, x) ∈ argminV Φ , i.e. x ∈ argmin(f + g ). Thus lim unk = lim d2X (ynk , argmin(f + g )) = d2X (x, argmin(f + g )) = 0.
k→+∞
n→+∞
Recalling that the sequence (un ) is convergent, we conclude that limn→+∞ un = 0. Case (b). We assume that for all n0 ∈ N there exists n ≥ n0 such that Φ (xn , yn ) ≤ minV Φ . Let us define
τN = max{n ∈ N, n ≤ N and Φ (xn , yn ) ≤ min Φ }. V
The integer τN is well-defined for N large enough and limN →+∞ τN = ∞. If τN < N inequality (29) implies un+1 ≤ un whenever τN ≤ n ≤ N − 1. In particular, uN ≤ uτ N .
(30)
Notice that if τN = N this inequality is still true. Therefore, it suffices to prove that limn→+∞ uτn = 0. First observe that Φ (xτn , yτn ) ≤ minV Φ for all sufficiently large n by definition. We deduce, as before, that the sequence (xτn ) is relatively compact, hence bounded in X. In view of Proposition 5.3, the sequence (yτn ) is also bounded in X, whence the boundedness of the real sequence (uτn ). The proof will be complete if we verify that every convergent subsequence of (uτn ) must vanish. Indeed, assume that limk→+∞ uτnk exists. We may assume, upon passing to a subsequence if necessary, that limk→+∞ xτnk = limk→+∞ yτnk = x for some x ∈ X. The closedness of Φ then gives
Φ (x, x) ≤ lim inf Φ (xτnk , yτnk ) ≤ min Φ , V
k→∞
which implies (x, x) ∈ argminV Φ . As before, this implies limk→+∞ uτnk = 0 and we deduce that the whole sequence (uτn ) converges toward 0. Then inequality (30) shows that limn→+∞ un = 0. (ii) Let y ∈ argmin(f + g ). From inequality (28), we obtain
‖yn+1 − y‖2X − ‖yn − y‖2X ≤ 2ωn−+1 . Since (ωn− ) ∈ l1 this implies in view of Lemma 3.3 that
∀y ∈ argmin(f + g ),
lim ‖yn − y‖2X exists.
n→+∞
(31)
On the other hand, recalling that limn→+∞ dX (yn , argmin(f + g )) = 0, every weak cluster point of the sequence (yn ) lies in argmin(f + g ). We infer from Lemma 4.9 that the sequence (yn ) weakly converges toward some point in argmin(f + g ). Finally, Proposition 5.3 shows that (xn ) and (yn ) have the same weak limit.
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473
7469
(iii) Let us first prove that the sequence (ϕn ) defined by formula (13) is bounded. By applying inequality (18) with A = B = I, α = ν = 0 and (x, y) = (xn , yn ), we easily find
ϕn+1 − ϕn +
1 2γn+1
1 ‖xn+1 − xn ‖2X + ‖yn+1 − yn ‖2X ≤ 2
1
γn+1
−
1
γn
‖xn − yn ‖2X .
(32)
Observe that this inequality is slightly more precise than (15), where two terms were omitted. Since γ 1 − γ1 ≤ M by n n+1 assumption and since ‖xn − yn ‖2X ≤ 2‖xn+1 − yn ‖2X + 2‖xn+1 − xn ‖2X , inequality (32) implies
ϕn+1 − ϕn +
1 2γn+1
‖xn+1 − xn ‖2X ≤ M ‖xn+1 − yn ‖2X + ‖xn+1 − xn ‖2X .
From the fact that limn→+∞ γn = 0, we immediately derive that for n large enough
ϕn+1 − ϕn ≤ M ‖xn+1 − yn ‖2X .
(33)
Recall that the sequence (ωn− ) is summable. The summability of (‖xn+1 − yn ‖2X ) is then an immediate consequence of inequality (21), with A = B = I and α = ν = 0. In view of (33), we infer from Lemma 3.3 that the sequence (ϕn ) is convergent, hence bounded. Since the function g is bounded from below, the sequence (f (xn )) is majorized. The inf-compactness of f allows to deduce that the sequence (xn ) is relatively compact in X. Hence there exists x ∈ X along with a subsequence (xnk ) such that limk→+∞ xnk = x strongly in X. From Proposition 5.3, we also have limk→+∞ ynk = x strongly in X. In view of (i), it is clear that x ∈ argmin(f + g ). Taking y = x in assertion (31), we deduce that limn→+∞ ‖yn − x‖X = 0. Owing to Proposition 5.3, we conclude that limn→+∞ xn = limn→+∞ yn = x strongly in X. Remark 5.6. No qualification condition is required in the proof of Theorem 5.5, which is a distinctive mark with respect to the proof of Theorem 4.4 (see specially Claim 4.8). The preceding analysis can be used to establish the classical weak convergence result by Brézis and Lions [27, Théorème 9] for the proximal point algorithm. In the context of Theorem 5.5, assume g ≡ 0 and argminf ̸= ∅. Then Φ (x, y) − minV Φ = f (x) − minX f ≥ 0 for all (x, y) and yn = xn for all n ∈ N. Whence ϕn = Φ (xn , yn ) = f (xn ) and (xn ) coincides with the sequence generated by the proximal point algorithm:
xn+1 = argmin γn+1 f (ζ ) +
1 2
‖ζ − xn ‖2X ; ζ ∈ X
for all n ∈ N.
(34)
Moreover ωn ≡ 0 so that (ωn− ) ∈ l1 . If (γn ) ̸∈ l1 then inequality (28) implies both that (‖xn − y‖X ) converges for each y ∈ argminf and that lim infn→+∞ f (xn ) = minX f . Since yn = xn the right-hand side of inequality (32) is zero and so the sequence (f (xn )) is nonincreasing. We deduce that limn→+∞ f (xn ) = minX f . This makes the inf-compactness assumption unnecessary in this setting. Lemma 4.9 then gives the following proposition. Proposition 5.7. Let f : X → R ∪ {+∞} be a closed convex function with argminf ̸= ∅. If (γn ) ̸∈ l1 then any sequence (xn ) satisfying (34) converges weakly to a point in argminf . Remark 5.8. Recall from Remark 2.1 that the iterates of algorithm (A0 ) satisfy the following equalities xn+1 = (I + γn+1 ∂ f )−1 (I + γn ∂ g )−1 xn , yn+1 = (I + γn+1 ∂ g )−1 (I + γn+1 ∂ f )−1 yn . This corresponds to a double resolvent scheme studied by Passty in [12]. In this reference, weak ergodic convergence of such sequences is established for general maximal monotone operators such that the sum is itself maximal, provided that (γn ) ∈ l2 \ l1 . Under some inf-compactness assumption, Theorem 5.5(ii) (resp. (iii)) shows that weak ergodic convergence is replaced by weak (resp. strong) convergence in the subdifferential framework. Hence our result is an improvement of the Passty Theorem when applied to subdifferential operators. Weak convergence of the alternating algorithm (A0 ) assuming only that (γn ) ̸∈ l1 is an open question. Observe, however, that if argmin(f + g ) = {ξ }, Theorem 5.5(i) shows that any sequence generated by (A0 ) converges strongly to (ξ , ξ ), even if (γn ) ̸∈ l2 (see Remark 5.4). 6. Application to domain decomposition for PDE’s Let us consider a bounded domain Ω ⊂ RN with C 2 boundary. Assume that the set Ω is decomposed in two nonoverlapping Lipschitz subdomains Ω1 and Ω2 with a common interface Γ . This situation is illustrated in the next figure.
7470
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473
6.1. Neumann problem Given a function h ∈ L2 (Ω ), let us consider the following Neumann boundary value problem on Ω
−1w = h ∂w =0 ∂n
on Ω on ∂ Ω ,
where ∂w = ∇w.⃗n and n⃗ is the unit outward normal to ∂ Ω . We assume that Ω h = 0, which is a necessary and ∂n sufficient condition for the existence of a solution. The weak solutions of the above Neumann problem satisfy the following minimization problem
∫ 1
min
2
∫
|∇w| −
2
Ω
hw; w ∈ H (Ω ) , 1
Ω
(35)
see for example [28–31]. Moreover, denoting by w a particular solution, the solution set of (35) is of the form { w + C , C ∈ R}. Assuming that Ω is of class C 2 , we know from the regularity theory of weak solutions that w ∈ H 2 (Ω ); see for instance [32–34]. Notice that, if w ∈ H 1 (Ω ) then the restrictions u = w|Ω1 and v = w|Ω2 belong respectively to H 1 (Ω1 ) and H 1 (Ω2 ) and moreover u|Γ = v|Γ . Conversely, if u ∈ H 1 (Ω1 ), v ∈ H 1 (Ω2 ) and if u|Γ = v|Γ , then the function w defined by u on Ω1 v on Ω2 belongs
w=
(P )
to H 1 (Ω ). As a consequence, problem (35) can be reformulated as
min f (u) + g (v); (u, v) ∈ H 1 (Ω1 ) × H 1 (Ω2 ) and u|Γ = v|Γ ,
where f (u) =
1 2
∫
|∇ u|2 −
∫
Ω1
hu
and
Ω1
g (v) =
1 2
∫
|∇v|2 −
Ω2
∫ Ω2
hv.
(36)
1 Let us show how the algorithm with the (A) can be applied so as to solve problem (P ). The set X = H (Ω1 ) is equipped scalar product ⟨u1 , u2 ⟩X = Ω (∇ u1 .∇ u2 + u1 u2 ) and the corresponding norm. The same holds for Y = H 1 (Ω2 ) by replacing
1 Ω1 with Ω2 . The set Z = L2 (Γ ) is equipped with the scalar product ⟨z1 , z2 ⟩Z = Γ z1 z2 and the corresponding norm. The operators A : X → Z and B : Y → Z are respectively the trace operators on Γ , which are well-defined by the Lipschitz character of the boundaries of Ω1 and Ω2 (see [35, Theorem II.46] or [36, Theorem 2]). Algorithm (A) runs as follows 1 α un+1 = argmin γn+1 f (u) + ‖Au − Bvn ‖2Z + ‖u − un ‖2X ; u ∈ X 2 2 ν 1 2 vn+1 = argmin γn+1 g (v) + ‖Aun+1 − Bv‖Z + ‖v − vn ‖2Y ; v ∈ Y ,
2
2
where α and ν are fixed positive parameters. An elementary directional derivative computation shows that the weak variational formulation of algorithm (A) is given by
γn+1
∫
γn+1
∫
Ω1
∇ un+1 .∇ u + α
∫
∇vn+1 .∇v + ν
∫
Ω1
(∇ un+1 − ∇ un ).∇ u + α
∫
(∇vn+1 − ∇vn ).∇v + ν
∫
(un+1 − un )u +
∫
(vn+1 − vn )v +
∫
Ω1
Γ
(Aun+1 − Bvn )Au = γn+1
∫ hu Ω1
and Ω2
Ω2
Ω2
Γ
(Bvn+1 − Aun+1 )Bv = γn+1
∫ Ω2
hv
for all u ∈ X and v ∈ Y. These are the variational weak formulations of the following mixed Dirichlet–Neumann boundary value problems respectively on Ω1
−(γn+1 + α)1un+1 + α un+1 = γn+1 h − α 1un + α un ∂ un ∂ un + 1 (γn+1 + α) =α ∂n ∂n (γn+1 + α) ∂ un+1 + un+1 = α ∂ un + vn ∂n ∂n
on Ω1 on ∂ Ω1 ∩ ∂ Ω on Γ ,
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473
7471
and Ω2
−(γn+1 + ν)1vn+1 + νvn+1 = γn+1 h − ν 1vn + νvn ∂vn ∂vn+1 =ν (γn+1 + ν) ∂n ∂n (γn+1 + ν) ∂vn+1 + vn+1 = ν ∂vn + un+1 ∂n ∂n
on Ω2 on ∂ Ω2 ∩ ∂ Ω on Γ .
Let us now check the validity of the assumptions of Theorem 4.4. The qualification condition (QC ) is automatically satisfied since dom f = X and dom g = Y. In view of Proposition 4.3, assumption (ωn− ) ∈ l1 is verified9 if (γn ) ∈ l2 and if there exist ( u, v) ∈ X × Y such that u| Γ = v|Γ along with z ∈ Z satisfying
− A∗ z ∈ ∂ f ( u) and B∗ z ∈ ∂ g ( v).
(37)
respectively to Ω1 and Ω2 . Let us multiply the equality −1 u = h by v = w |Ω2 the restrictions of w Take u = w |Ω1 and u u ∈ H 1 (Ω1 ) and integrate on Ω1 . Using Green’s formula and the fact that ∂ = 0 on ∂ Ω ∩ ∂ Ω , we obtain 1 ∂n ∀u ∈ H 1 (Ω1 ),
∫ Ω1
∇ u.∇ u −
∫ Γ
∂ u u= ∂n
∫ Ω1
hu.
Hence we deduce that for every u ∈ H 1 (Ω1 ), f ( u) =
1
∫
2
|∇ u|2 −
∫
Ω1
Ω1
∇ u.∇ u +
∫ Γ
∂ u u ∂n
and therefore
∂ u (u − u) 2 Ω1 Γ ∂n ∫ ∂ u ∂ u (u − u) = A ∗ , u − u . ≥ ∂n Γ ∂n X
f (u) − f ( u) =
1
∫
|∇ u − ∇ u|2 +
∫
v u v , condition (37) is proved This shows that A∗ ∂ ∈ ∂ f ( u) and we find in the same way B∗ ∂ ∈ ∂ g ( v). Since ∂∂nu |Γ = − ∂ ∂n ∂n ∂ n |Γ ∂ v 2 2 with z = ∂ n , which belongs to L (Γ ) since v ∈ H (Ω2 ).
We conclude from Theorem 4.4(i) and the preceding argument that if γ 1 − γ1 is bounded from above and if (γn ) ∈ l2 , n n+1 then any sequence (un , vn ) generated by (A) weakly converges in H 1 (Ω1 ) × H 1 (Ω2 ) to a minimum point ( u + C, v + C ), (C ∈ R) of problem (P ). Without loss of generality, we can assume that C = 0. Since Ω1 and Ω2 are Lipschitz domains, the injections H 1 (Ω1 ) ↩→ L2 (Ω1 ) and H 1 (Ω2 ) ↩→ L2 (Ω2 ) are compact by the Rellich–Kondrachov Theorem (see [37, Theorem 6.2] or [35, Theorem II.55]). It ensues that the sequence (un , vn ) converges to ( u, v) strongly in L2 (Ω1 ) × L2 (Ω2 ). Moreover, from Theorem 4.4 (ii), we have limn→+∞ f (un ) = f ( u) and limn→+∞ g (vn ) = g ( v), hence limn→+∞ Ω |∇ un |2 = Ω |∇ u| 2 and limn→+∞
Ω2
|∇vn |2 =
Ω2
1
|∇ v|2 . As a consequence, we have
1
lim ‖(un , vn )‖H 1 (Ω1 )×H 1 (Ω2 ) = ‖( u, v)‖H 1 (Ω1 )×H 1 (Ω2 ) .
n→+∞
Since (un , vn ) weakly converges in H 1 (Ω1 ) × H 1 (Ω2 ) toward ( u, v), the convergence is strong in H 1 (Ω1 ) × H 1 (Ω2 ). We have proved the following. Theorem 6.1. Let Ω ⊂ RN be a bounded domain which can be decomposed in two nonoverlapping Lipschitz subdomains Ω1 and Ω2 with a common interface Γ . We assume that the set Ω is of class C 2 . Let h ∈ L2 (Ω ) be such that Ω h = 0 and define the 1 2 functions f : H 1 (Ω 1 ) → R and g : H (Ω2 ) → R by formulas (36). Assume that (γn ) is a positive sequence such that (γn ) ∈ l and the sequence γ 1 − γ1 n n+1
is bounded from above. Then any sequence (un , vn ) generated by algorithm (A) strongly converges
u
in H (Ω1 ) × H (Ω2 ) and the limit ( u, v) is such that the map w = v 1
1
on Ω1 on Ω2
is a solution of the Neumann problem (35).
Algorithm (A) allows to solve the initial Neumann problem on Ω by solving separately mixed Dirichlet–Neumann problems on Ω1 and Ω2 . A similar method is developed in [38], where the authors consider alternating minimization algorithms based on augmented Lagrangian approach.
9 Observe that we have R(A) = R(B) = H 1/2 (Γ ). Hence the set R(A) + R(B) = H 1/2 (Γ ) is dense in Z = L2 (Γ ) and the condition (ω− ) ∈ l1 cannot be n verified by using assertion (i) of Proposition 4.3. This remark may suggest to take Z = H 1/2 (Γ ) endowed with the corresponding norm. In this case, the closedness of the set R(A) + R(B) is automatically ensured. However, the practical implementation of algorithm (A) will be more complicated due to the use of the H 1/2 (Γ ) norm. The details are out of the scope of the paper.
7472
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473
6.2. Problem with an obstacle As a model situation, let us consider the variational problem with an obstacle constraint
∫ min
1
2
|∇w|2 −
∫
Ω
Ω
hw; w ∈ H 1 (Ω ), w ≥ 0 on Ω .
(38)
It can be cast into our framework by taking f (u) =
1 2
∫
|∇ u|2 −
∫
Ω1
Ω1
hu + δC1 (u) and g (v) =
1 2
∫
|∇v|2 −
∫ Ω2
Ω2
hv + δC2 (v),
where δC1 is the indicator function of the convex set C1 = {u ≥ 0; u ∈ H 1 (Ω1 )} and δC2 is the indicator function of the convex set C2 = {v ≥ 0; v ∈ H 1 (Ω2 )}. Problem (38) can be reformulated as
(P )
min f (u) + g (v); (u, v) ∈ H 1 (Ω1 ) × H 1 (Ω2 ) and u|Γ = v|Γ .
Let us show that the Attouch–Brézis qualification condition (QC ) is satisfied in this situation (by contrast with the Moreau–Rockafellar condition which fails to be satisfied for N ≥ 2). Indeed, we are going to verify that dom f × dom g − V = H 1 (Ω1 ) × H 1 (Ω2 ). To that end, we introduce two trace lifting operators r1 : H 1/2 (Γ ) → H 1 (Ω1 ) r2 : H 1/2 (Γ ) → H 1 (Ω2 ) such that for every z ∈ H 1/2 (Γ ), z ≥ 0 ⇒ ri (z ) ≥ 0, i = 1, 2. Such operators can be easily obtained by taking any lifting operator and then taking its positive part. Precisely, we use that for any u ∈ H 1 (Ω1 ), u+ = max{u, 0} ∈ H 1 (Ω1 ), u− = max{0, −u} ∈ H 1 (Ω1 ) and u = u+ − u− . Similarly, for any v ∈ H 1 (Ω2 ), v + ∈ H 1 (Ω2 ), v − ∈ H 1 (Ω2 ) and v = v + − v − . For any u ∈ H 1 (Ω1 ) and v ∈ H 1 (Ω2 ), we denote respectively by u|Γ and v|Γ their Sobolev traces on Γ . Let us now perform the following decomposition: for any (u, v) ∈ H 1 (Ω1 ) × H 1 (Ω2 )
(u, v) = (u+ − u− , v) = u+ , v + r2 (u− )|Γ − u− , r2 (u− )|Γ . (39) − − Let us notice that u , r2 (u )|Γ belongs to V because u− and r2 (u− )|Γ have the same trace on Γ . Let us perform once more this operation: set v = v + r2 (u− )|Γ which belongs to H 1 (Ω2 ). (u+ , v) = (u+ , v+ − v− ) = u+ + r1 (v− )|Γ , v+ − r1 (v− )|Γ , v− .
(40)
Combining (39) and (40) we finally obtain
(u, v) = u+ + r1 (v− )|Γ ), v+ − r1 (v− )|Γ , v− + u− , r2 (u− )|Γ . − − By construction of the trace lifting operator, and by v ≥ 0 we have r1 (v )|Γ ≥ 0. Thus, we have obtained a decomposition of (u, v) as a difference of an element of H 1 (Ω1 )+ × H 1 (Ω2 )+ and an element of H 1 (Ω ). The decomposition algorithm can now be developed in a very similar way as in the unconstrained case. References [1] A. Cabot, P. Frankel, Alternating proximal algorithms with asymptotically vanishing coupling. Application to domain decomposition for PDE’s, Optimization (in press). [2] H. Attouch, P. Redont, A. Soubeyran, A new class of alternating proximal minimization algorithms with costs-to-move, SIAM J. Optim. 18 (2007) 1061–1081. [3] H. Attouch, J. Bolte, P. Redont, A. Soubeyran, Alternating proximal algorithms for weakly coupled convex minimization problems. Applications to dynamical games and PDE’s, J. Convex Anal. 15 (3) (2008) 485–506. [4] F. Acker, M.A. Prestel, Convergence d’un schéma de minimisation alternée, Ann. Fac. Sci. Toulouse Math. (5) 2 (1980) 1–9. [5] H.H. Bauschke, P.L. Combettes, S. Reich, The asymptotic behavior of the composition of two resolvents, Nonlinear Anal. 60 (2005) 283–301. [6] H. Attouch, M.-O. Czarnecki, J. Peypouquet, Prox-penalization and splitting methods for constrained variational problems, SIAM J. Optim. 21 (2011) 149–173. [7] A. Cabot, Proximal point algorithm controlled by a slowly vanishing term: applications to hierarchical minimization, SIAM J. Optim. 15 (2) (2005) 555–572. [8] A. Auslender, J.-P. Crouzeix, P. Fedit, Penalty-proximal methods in convex programming, J. Optim. Theory Appl. 55 (1987) 1–21. [9] H. Attouch, M.-O. Czarnecki, Asymptotic behavior of coupled dynamical systems with multiscale aspects, J. Differential Equations 248 (2010) 1315–1344. [10] R. Cominetti, M. Courdourier, Coupling general penalty schemes for convex programming with the steepest descent method and the proximal point algorithm, SIAM J. Optim. 13 (2002) 745–765. [11] H. Brézis, Opérateurs maximaux monotones dans les espaces de Hilbert et équations d’évolution, in: Lecture Notes, vol. 5, North Holland, 1972.
H. Attouch et al. / Nonlinear Analysis 74 (2011) 7455–7473 [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] [32] [33] [34] [35] [36] [37] [38]
7473
G. Passty, Ergodic convergence to a zero of the sum of monotone operators in Hilbert space, J. Math. Anal. Appl. 72 (2) (1979) 383–390. I. Ekeland, R. Temam, Convex analysis and variational problems, SIAM Classics Appl. Math. 28 (1999). Z. Opial, Weak convergence of the sequence of successive approximations for nonexpansive mappings, Bull. Amer. Math. Soc. 73 (1967) 591–597. R.T. Rockafellar, R. Wets, Variational Analysis, Springer, Berlin, 1998. J.J. Moreau, Fonctionnelles convexes, Cours Collège de France 1967, new edition CNR, Facoltà di Ingegneria di Roma, 2003. R.T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, 1970. H. Attouch, H. Brézis, Duality for the sum of convex functions in general Banach spaces, in: J. Barroso (Ed.), Aspects of Mathematics and its Applications, North-Holland, Amsterdam, 1986, pp. 125–133. R.T. Rockafellar, Monotone operators and the proximal point algorithm, SIAM J. Control Optim. 14 (1976) 877–898. A. Auslender, Numerical methods for nondifferentiable convex optimization, Math. Program. Stud. 30 (1987) 102–126. M.-V. Solodov, B.-F. Svaiter, An inexact hybrid extragradient-proximal point algorithm using the enlargement of a maximal monotone operator, Set-Valued Anal. 7 (1999) 323–345. F. Álvarez, J. Peypouquet, Asymptotic almost-equivalence of Lipschitz evolution systems in Banach spaces, Nonlinear Anal. 73 (9) (2010) 3018–3033. B.T. Polyak, Introduction to Optimization, Opimization Software, Inc. Publications Division, New York, 1987. F. Álvarez, R. Cominetti, Primal and dual convergence of a proximal point exponential penalty method for linear programming, Math. Program. 93 (2002) 87–96. H. Attouch, M.-O. Czarnecki, Asymptotic control and stabilization of nonlinear oscillators with non isolated equilibria, J. Differential Equations 179 (2002) 278–310. J.-B. Baillon, R. Cominetti, A convergence result for non-autonomous subgradient evolution equations and its application to the steepest descent exponential penalty trajectory in linear programming, J. Funct. Anal. 187 (2001) 263–273. H. Brézis, P.L. Lions, Produits infinis de résolvantes, Israel J. Math. 29 (1978) 329–345. H. Attouch, G. Buttazzo, G. Michaille, Variational analysis in Sobolev and BV spaces. Applications to PDE’s and optimization, in: MPS/SIAM Series on Optimization, vol. 6, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2006. H. Brézis, Analyse Fonctionnelle: Théorie et Applications, Dunod, Paris, 1999. J.-L. Lions, E. Magenes, Non-homogeneous boundary value problems and applications, in: Die Grundlehren der Mathematischen Wissenschaften, Springer-Verlag, New York, Heidelberg, 1972. P.-A. Raviart, J.-M. Thomas, Introduction à l’Analyse Numérique des Équations aux Dérivées Partielles, Masson, Paris, 1993. S. Agmon, A. Douglis, L. Nirenberg, Estimates near the boundary for solutions of elliptic partial differential equations satisfying general boundary conditions I, Comm. Pure Appl. Math. 12 (1959) 623–727. S. Agmon, A. Douglis, L. Nirenberg, Estimates near the boundary for solutions of elliptic partial differential equations satisfying general boundary conditions II, Comm. Pure Appl. Math. 17 (1964) 35–92. D. Gilbarg, N. Trudinger, Elliptic Partial Differential Equations of Second Order, Springer-Verlag, Berlin, 1977. D. Azé, Eléments d’Analyse Convexe et Variationnelle, Ellipses, Paris, 1997. J. Marschall, The trace of Sobolev–Slobodeckij spaces on Lipschitz domains, Manuscripta Math. 58 (1987) 47–65. R. Adams, Sobolev Spaces, Academic Press, New York, 1975. H. Attouch, M. Soueycatt, Augmented Lagrangian and proximal alternating direction methods of multipliers in Hilbert spaces. Applications to games, PDE’s and control, Pac. J. Optim. 5 (1) (2009) 17–37.