Available online at www.sciencedirect.com
Applied Mathematics and Computation 202 (2008) 446–452 www.elsevier.com/locate/amc
On the difference of two maximal monotone operators: Regularization and algorithmic approaches

Abdellatif Moudafi

Université des Antilles et de la Guyane, DSI-GRIMAAG, 97275 Schoelcher, Martinique
Abstract

Relying on the Yosida approximate, we investigate the problem of finding zeroes of a difference of two maximal monotone operators in Hilbert spaces. These zeroes are compared to those of the corresponding regularized and dual problems. The behavior of the regularized operator is also studied, and a splitting algorithm involving the resolvents of both operators is suggested via a fixed-point formulation of the regularized problem. Particular attention is given to the DC programming case. © 2008 Elsevier Inc. All rights reserved.

Keywords: Maximal monotone operators; Splitting proximal algorithms; Regularization; Duality; DC programming
1. Introduction and preliminaries

Given a nonmonotone operator defined as the difference of two maximal monotone operators $B$ and $C$, we study the problem of finding its zeroes, relying on the corresponding regularized problem as well as the associated dual problem. To begin with, let us recall the following concepts, which are in common use in convex and nonlinear analysis; see, for example, Rockafellar–Wets [18]. Throughout, $H$ is a real Hilbert space, $\langle \cdot, \cdot \rangle$ denotes the associated scalar product and $\|\cdot\|$ stands for the corresponding norm. An operator $A$ with domain $D(A)$ and range $R(A)$ is said to be monotone if

$$\langle u - v, x - y \rangle \geq 0 \quad \text{whenever } u \in A(x),\ v \in A(y).$$

It is said to be maximal monotone if, in addition, its graph, $\mathrm{gph}\,A := \{(x, y) \in H \times H : y \in A(x)\}$, is not properly contained in the graph of any other monotone operator. It is well known that, for a maximal monotone operator $A$, for each $x \in H$ and $\lambda > 0$ there is a unique $z \in H$ such that $x \in (I + \lambda A)z$. The single-valued operator $J_\lambda^A := (I + \lambda A)^{-1}$ is called the resolvent of $A$ of parameter $\lambda$. It is a nonexpansive mapping which is everywhere defined and is related to its Yosida approximate, namely

$$A_\lambda(x) := \frac{x - J_\lambda^A(x)}{\lambda},$$

by the relation $A_\lambda(x) \in A(J_\lambda^A(x))$. Finally, recall that the inverse $A^{-1}$ of $A$ is the operator defined by $x \in A^{-1}(y) \iff y \in A(x)$.
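To make these definitions concrete, here is a minimal numerical sketch for one specific operator, chosen here purely as an illustration: $A = \partial|\cdot|$ on the real line, whose resolvent is the soft-thresholding map. The function names and the sample points are assumptions introduced for this sketch. It checks the relation $A_\lambda(x) \in A(J_\lambda^A(x))$ on a few points.

```python
def resolvent_abs(x: float, lam: float) -> float:
    """Resolvent J_lam^A of A = subdifferential of |.| on the real line (soft-thresholding)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def yosida_abs(x: float, lam: float) -> float:
    """Yosida approximate A_lam(x) = (x - J_lam^A(x)) / lam."""
    return (x - resolvent_abs(x, lam)) / lam


def in_subdifferential_abs(u: float, z: float, tol: float = 1e-12) -> bool:
    """Check u in A(z), where A(z) = {sgn z} for z != 0 and A(0) = [-1, 1]."""
    if z > 0:
        return abs(u - 1.0) <= tol
    if z < 0:
        return abs(u + 1.0) <= tol
    return -1.0 - tol <= u <= 1.0 + tol


lam = 0.5  # illustrative regularization parameter
samples = [-2.0, -0.3, 0.0, 0.2, 1.7]

# For every sample x, the Yosida approximate at x must belong to A(J_lam^A(x)).
checks = [
    in_subdifferential_abs(yosida_abs(x, lam), resolvent_abs(x, lam))
    for x in samples
]
print(checks)
```

The same pattern (a resolvent function plus the quotient defining the Yosida approximate) applies to any operator whose resolvent is available in closed form.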
E-mail address: abdellatif.moudafi@martinique.univ-ag.fr
0096-3003/$ - see front matter © 2008 Elsevier Inc. All rights reserved.
doi:10.1016/j.amc.2008.01.024
In what follows, we will focus our attention on the problem of finding a zero of the difference of two maximal monotone operators $B$ and $C$ on a real Hilbert space $H$, namely

$$(P)\qquad \text{find } x \in H \text{ such that } B(x) - C(x) \ni 0;$$
such a problem has not been fully explored up to now, but substantial progress has recently been made in the DC programming context [10,11], in the prox-regularity setting [15] and in the case of co-hypomonotone variational inequalities [16]. We would like to emphasize that examples abound from practitioners needing algorithms for nonconvex problems, for instance in crystallography, astronomy and, more recently, inverse scattering. Until now, they have been forced to rely on convex heuristics to justify certain strategies. It is worth mentioning that when $B = \partial f$ and $C = \partial g$, where $f$ and $g$ are two convex lower semicontinuous functions, $(P)$ amounts to finding critical points of the function $f - g$. This problem has many applications, such as multicommodity networks, image restoration, discrete tomography and clustering, and seems particularly well suited to modeling several nonconvex industrial problems (portfolio optimization, fuel mixture, molecular biology, phylogenetic analysis, ...); see, for example, An and Pham [1]. One of the fundamental approaches to solving $(P)$, see Hiriart-Urruty [12] in the context of DC programming, is to consider a regularized version obtained by replacing the operators by their Yosida approximates, namely

$$(R)\qquad \text{find } x \in H \text{ such that } B_\lambda(x) - C_\lambda(x) = 0,$$
where $\lambda$ is a regularization parameter and $D_\lambda$ stands for the Yosida approximate of $D$. Another approach, see, for example, Attouch and Théra [2], consists in considering the dual problem, namely

$$(D)\qquad \text{find } y \in H \text{ such that } C^{-1}(y) - B^{-1}(y) \ni 0.$$
The major interest of this transformation is that the operators $B^{-1}$ and $C^{-1}$ may enjoy some compactness properties which are not satisfied by the initial operators $B$ and $C$. In contrast with $(P)$, problem $(D)$ may be well posed. It is worth mentioning that when $B = \partial f$ and $C = \partial g$, where $f$ and $g$ are two convex lower semicontinuous proper functions, then $B^{-1} = \partial f^*$, $C^{-1} = \partial g^*$, and $(P)$ is nothing but a DC optimization problem and $(D)$ is its dual in Toland's sense. Here, $h^*(y) = \sup_{x \in H} \{\langle x, y \rangle - h(x)\}$ stands for the conjugate function of a function $h$.

2. Regularization aspect

Let us first compare the zeroes of $(P)$ and $(R)$. It is easy to verify the following.

Proposition 2.1. If $x$ is a solution to $(R)$, then $y = J_\lambda^B x$ is a solution to $(P)$.

Proof. Indeed, since $x$ is a solution of $(R)$, we have $B_\lambda(x) = C_\lambda(x)$, or equivalently $J_\lambda^B(x) = J_\lambda^C(x)$. Now, according to the fact that $B_\lambda(x) \in B(J_\lambda^B(x))$ and

$$B_\lambda(x) = C_\lambda(x) \in C(J_\lambda^C(x)) = C(J_\lambda^B(x)), \qquad (2.1)$$

we infer that $J_\lambda^B(x)$ solves $(P)$ and $B_\lambda(x)$ solves $(D)$. The proof is completed. □
It should be mentioned that in the particular case where $\lambda = 1$, $C$ is a single-valued continuous mapping, $B = N_K$ is the normal cone to a closed convex set $K$ and $P_K := (I + N_K)^{-1}$ is the metric projector on $K$, relation (2.1) reads $x - P_K(x) = C(P_K(x))$, that is,

$$C(P_K(x)) - (x - P_K(x)) = 0,$$

a problem which arises frequently in optimization and equilibrium analysis and is related to the so-called Wiener–Hopf equations (normal mapping). Now, if $C$ is a set-valued map, $(P)$ can be viewed as a modification of the problem of finding the zeroes of $C$ in such a way that the operator $N_K - C$ satisfies Aubin's tangential condition, namely

$$\forall x \in K, \quad (C(x) - N_K(x)) \cap T_K(x) \neq \emptyset,$$

where $T_K(x)$ is the negative polar cone of $N_K(x)$.
Moreover, by definition of the normal cone to $K$ at $x$, the inclusion $(P)$ in this case is equivalent to the variational inequality

$$\text{find } x \in K,\ \exists v \in C(x) \text{ such that } \langle v, z - x \rangle \leq 0 \quad \forall z \in K.$$
In the finite-dimensional setting, $C$ is an upper semicontinuous set-valued map on $\mathrm{int}\,K$ with convex compact values and thus, if $K$ is compact, by a result of Aubin ([3], Theorem 9.9), the variational inequality has a solution on $\mathrm{int}\,K$ which in this case is a zero of the operator $C$. The proof of the following result is immediate.

Proposition 2.2. If $x$ is a solution to $(P)$, then $x + \lambda z$ is a solution to $(R)$, where $z$ is a solution to $(D)$.

Thus, the solutions $x_P$, $x_R$ and $x_D$ of problems $(P)$, $(R)$ and $(D)$ are related by

$$x_R = \lambda x_D + x_P.$$

This clearly shows that $\lim_{\lambda \to 0} x_R = x_P$. Now, let us say something about the uniqueness of solutions.

Proposition 2.3. If $(D)$ admits a unique solution and if, in addition, $B$ is strongly monotone with constant $\alpha$, then $(P)$ has a unique solution and so does $(R)$.

Proof. Let $x$ be the solution of $(D)$, namely $0 \in C^{-1}(x) - B^{-1}(x)$. Then there exists $z \in H$ with $z \in C^{-1}(x) \cap B^{-1}(x)$, or equivalently $x \in C(z) \cap B(z)$, so that $z$ solves $(P)$. Consequently, $\lambda x + z$ solves $(R)$. Assume that $(R)$ has two solutions

$$y_1 = \lambda x + z_1 \quad \text{and} \quad y_2 = \lambda x + z_2;$$

then $J_\lambda^B y_1$ and $J_\lambda^B y_2$ are solutions of $(P)$, and $B_\lambda(y_1)$ and $B_\lambda(y_2)$ solve $(D)$. The uniqueness of the solution of $(D)$ ensures that $B_\lambda(y_1) = B_\lambda(y_2)$, or equivalently $y_1 - y_2 = J_\lambda^B y_1 - J_\lambda^B y_2$. Using the $\alpha$-strong monotonicity of $B$ and taking into account the fact that its resolvent is $\frac{1}{1 + \lambda\alpha}$-Lipschitz continuous, we infer that

$$\|y_1 - y_2\| \leq \frac{1}{1 + \lambda\alpha} \|y_1 - y_2\|,$$

which yields $y_1 = y_2$ and $z_1 = z_2$. This completes the proof. □
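The proof of Proposition 2.2, described above as immediate, can be spelled out as follows. This is a sketch under the assumption that $z$ is chosen in $B(x) \cap C(x)$, a set which is nonempty precisely because $x$ solves $(P)$; such a $z$ satisfies $x \in B^{-1}(z) \cap C^{-1}(z)$ and therefore solves $(D)$.

```latex
\begin{align*}
z \in B(x) &\iff x + \lambda z \in (I + \lambda B)(x) \iff J_\lambda^B(x + \lambda z) = x, \\
z \in C(x) &\iff x + \lambda z \in (I + \lambda C)(x) \iff J_\lambda^C(x + \lambda z) = x, \\
\text{hence}\quad B_\lambda(x + \lambda z) &= \frac{(x + \lambda z) - x}{\lambda} = z = C_\lambda(x + \lambda z).
\end{align*}
```

The last line says exactly that $x + \lambda z$ solves $(R)$, which is the relation $x_R = \lambda x_D + x_P$.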
Properties described in Propositions 2.1 and 2.2 are illustrated in the following simple examples.

Example 1. Let $H = \mathbb{R}$, $B = I$, $C = \partial|\cdot|$, namely $C(x) = \mathrm{sgn}(x)$, in particular $C(0) = [-1, +1]$. The critical points of $A := I - \partial|\cdot|$ are $-1$, $0$ and $1$. The resolvent operators of $B$ and $C$ are given, respectively, by

$$\forall x \in \mathbb{R}, \quad J_\lambda^B x = \frac{x}{1 + \lambda}$$

and

$$J_\lambda^C(x) = 0 \ \text{ if } |x| \leq \lambda, \qquad J_\lambda^C(x) = x - \lambda \ \text{ if } x \geq \lambda, \qquad J_\lambda^C(x) = x + \lambda \ \text{ if } x \leq -\lambda.$$

Thus the critical points of $B_\lambda - C_\lambda$ are $-1 - \lambda$, $0$ and $1 + \lambda$, and those of the dual problem are $-1$, $0$ and $1$.

Example 2. Let $K$ be a nonempty closed convex subset of $H$, let $C$ be a maximal monotone operator and consider the problem $N_K(x) - C(x) \ni 0$. According to our methodology, a way of approximating these critical points is to consider critical points of the regularized version of $N_K(x) - C(x)$, that is, $\frac{1}{2\lambda} \partial d_K^2(x) - C_\lambda(x)$, where $d_K$ denotes the distance function to the set $K$, and $x$ is a critical point of the initial operator if and only if $J_\lambda^C(x) = P_K(x)$, the projection of $x$ on $K$.

Now let us give some properties of the operator $B_\lambda - C_\lambda$ relying on variational convergence. To begin with, let us note that for a sequence of subsets $D_n$, $\lim_{n \to +\infty} D_n$ will stand for convergence in the Painlevé–Kuratowski sense, i.e. $\limsup_{n \to +\infty} D_n = D = \liminf_{n \to +\infty} D_n$. Identifying the operators with their graphs and using the fact that, for a given maximal monotone operator $D$, we have $\lim_{\lambda \to 0} D_\lambda = D$, we immediately obtain

$$\liminf_{\lambda \to 0} B_\lambda - \limsup_{\lambda \to 0} C_\lambda \subset \liminf_{\lambda \to 0} (B_\lambda - C_\lambda), \quad \text{i.e.,} \quad B - C \subset \liminf_{\lambda \to 0} (B_\lambda - C_\lambda).$$
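Returning to Example 1, the relations between the solutions of $(P)$, $(R)$ and $(D)$ can be checked numerically. The following is a minimal sketch: the function names and the value $\lambda = 0.25$ are illustrative assumptions. It verifies that $-(1+\lambda)$, $0$ and $1+\lambda$ solve $(R)$, and recovers the primal and dual solutions from them as in Proposition 2.1.

```python
def resolvent_B(x: float, lam: float) -> float:
    """Resolvent of B = I: solves z + lam * z = x."""
    return x / (1.0 + lam)


def resolvent_C(x: float, lam: float) -> float:
    """Resolvent of C = subdifferential of |.| (soft-thresholding)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def yosida(resolvent, x: float, lam: float) -> float:
    """Yosida approximate (x - J_lam(x)) / lam built from a given resolvent."""
    return (x - resolvent(x, lam)) / lam


lam = 0.25  # illustrative regularization parameter
x_reg = [-(1.0 + lam), 0.0, 1.0 + lam]  # candidate solutions of (R)

# Residual of (R): B_lam(x) - C_lam(x) should vanish at each candidate.
residuals = [yosida(resolvent_B, x, lam) - yosida(resolvent_C, x, lam) for x in x_reg]

# Proposition 2.1: J_lam^B(x) solves (P) and B_lam(x) solves (D).
x_primal = [resolvent_B(x, lam) for x in x_reg]
x_dual = [yosida(resolvent_B, x, lam) for x in x_reg]

for xr, xp, xd in zip(x_reg, x_primal, x_dual):
    # The last column illustrates the relation x_R = lam * x_D + x_P.
    print(f"x_R={xr:+.2f}  x_P={xp:+.2f}  x_D={xd:+.2f}  lam*x_D+x_P={lam * xd + xp:+.2f}")
```

Changing `lam` moves the regularized solutions while the primal and dual ones stay fixed, in line with $\lim_{\lambda \to 0} x_R = x_P$.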
The whole convergence will be obtained, see, for example, Penot and Zalinescu [17], if we assume in addition an asymptotic compactness assumption on one of the operators and a condition of the type $B_\infty \cap C_\infty = \{0\}$, where $D_\infty$ stands for the recession (or asymptotic) operator associated to $D$. It is well known that, for a maximal monotone operator $D$, the set $D(x)$ is closed and convex and that, for any $x \in \mathrm{dom}(D)$, we have $\lim_{\lambda \to 0} D_\lambda(x) = D^0(x)$, the element of minimal norm in $D(x)$. Now, suppose that $H$ is finite-dimensional with $\mathbb{R}_+(B(x) - C(x)) = H$; by applying ([17], Proposition 27), we obtain the following result:

$$\lim_{\lambda \to 0} B_\lambda(x) \cap C_\lambda(x) = B^0(x) \cap C^0(x) \quad \text{for any } x \in D(B) \cap D(C).$$
The convergence result still holds true for the bounded Hausdorff topology. Finally, it is easy to check that the operator $(B_\lambda - C_\lambda) + \lambda^{-1} I$ is monotone. This follows directly by combining the next inequalities:

$$\langle B_\lambda(x) - B_\lambda(y), x - y \rangle \geq 0 \quad \text{and} \quad \langle C_\lambda(x) - C_\lambda(y), x - y \rangle \leq \lambda^{-1} \|x - y\|^2.$$
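For completeness, the combination of the two inequalities can be written out in one chain. The only ingredient added here is the standard fact that a Yosida approximate of parameter $\lambda$ is Lipschitz continuous with constant $\lambda^{-1}$, which yields the second inequality through the Cauchy–Schwarz inequality.

```latex
\begin{align*}
&\big\langle (B_\lambda - C_\lambda)(x) - (B_\lambda - C_\lambda)(y) + \lambda^{-1}(x - y),\; x - y \big\rangle \\
&\qquad = \langle B_\lambda(x) - B_\lambda(y), x - y \rangle
   - \langle C_\lambda(x) - C_\lambda(y), x - y \rangle
   + \lambda^{-1} \|x - y\|^2 \\
&\qquad \geq 0 - \lambda^{-1} \|x - y\|^2 + \lambda^{-1} \|x - y\|^2 = 0 .
\end{align*}
```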
3. Algorithmic aspect

To begin with, let us recall some definitions and a preliminary result which will be used in the sequel. An operator $A$ is said to be cocoercive if there exists $c > 0$ such that, for all $x, y \in D(A)$,

$$\langle A(x) - A(y), x - y \rangle \geq c \|A(x) - A(y)\|^2,$$

and is firmly nonexpansive if $c = 1$.

Lemma 3.1 ([6], Theorem 3.1). Let $X$ be a Banach space, $D$ be a closed convex subset of $X$, and $A$, $B$ be two operators on $D$ into $X$ such that $Ax + By \in D$ for every pair $x, y \in D$. Suppose $A$ is a strict contraction, $B$ is continuous and $R(B)$ is contained in a compact set. Then, there is a point $x$ in $D$ such that $Ax + Bx = x$.

Now, according to Proposition 2.1, we can solve $(P)$ via $(R)$. To this end, we will use the following fixed-point formulation of the regularized problem $(R)$:

$$(F)\qquad \text{find } x \in H \text{ such that } x = J_\lambda^B x + (I - J_\lambda^C)x.$$
Using Lemma 3.1, we directly infer the following.

Proposition 3.2. Let $B$ and $C$ be two maximal monotone operators in a Hilbert space $H$. Suppose that either
1. $B$ is strongly monotone and $R(C)$ is contained in a compact set, or
2. $C$ is cocoercive and $D(B)$ is contained in a compact set.
Then, there is a point $x \in H$ such that $x = J_\lambda^B x + (I - J_\lambda^C)x$.

Proof. The fact that $B$ is strongly monotone ensures that $J_\lambda^B$ is a strict contraction. On the other hand, the operator $I - J_\lambda^C$ is firmly nonexpansive, hence continuous, and $R(I - J_\lambda^C) = R(C)$. The result then follows thanks to Lemma 3.1. The second assertion follows by taking into account the facts that $(I - J_\lambda^C)x = \lambda J_{\lambda^{-1}}^{C^{-1}}(\lambda^{-1} x)$, that $R(J_\lambda^B) = D(B)$ and that the cocoercivity of $C$ is nothing but the strong monotonicity of its inverse $C^{-1}$. In both cases, $x$ is a solution of the regularized problem $(R)$ and consequently $y = J_\lambda^B x$ solves the initial problem $(P)$. The proof is completed. □

Finding a solution of $(R)$ is equivalent to finding $x \in H$ such that $J_\lambda^B x = J_\lambda^C x$. We can use the following iterative process which generates, from an initial point $x_0$, a sequence $(x_n)$ by

$$J_{\lambda_n}^C x_{n+1} = J_{\lambda_n}^B x_n \quad \forall n \in \mathbb{N}. \qquad (3.1)$$

Note that if $C = 0$, we recover the classical proximal algorithm. Furthermore,

$$(3.1) \iff y_n = J_{\lambda_n}^B x_n \ \text{ and } \ x_{n+1} \in y_n + \lambda_n C y_n \implies y_{n+1} = J_{\lambda_n}^B (y_n + \lambda_n C y_n).$$
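The recursion $y_{n+1} = J_{\lambda_n}^B(y_n + \lambda_n C y_n)$ can be sketched on the real line. The example below is only an illustration under stated assumptions: it takes $B = I$ and $C = \partial|\cdot|$ (the operators of Example 1), a constant parameter $\lambda_n = \lambda$, and a hypothetical selection rule that picks $0 \in C(0)$ at the origin. None of these choices is prescribed by the text.

```python
def sign_selection(y: float) -> float:
    """One element of C(y) for C = subdifferential of |.|; picks 0 in C(0) = [-1, 1]."""
    if y > 0:
        return 1.0
    if y < 0:
        return -1.0
    return 0.0


def resolvent_identity(x: float, lam: float) -> float:
    """Resolvent of B = I, i.e. the proximal map of g(y) = y**2 / 2."""
    return x / (1.0 + lam)


def difference_iteration(y0: float, lam: float, steps: int) -> float:
    """Iterate y_{n+1} = J_lam^B(y_n + lam * c_n) with c_n in C(y_n)."""
    y = y0
    for _ in range(steps):
        w = y + lam * sign_selection(y)  # forward step on C
        y = resolvent_identity(w, lam)   # backward (proximal) step on B
    return y


lam = 1.0
limit_from_right = difference_iteration(y0=0.5, lam=lam, steps=80)
limit_from_left = difference_iteration(y0=-3.0, lam=lam, steps=80)
limit_from_zero = difference_iteration(y0=0.0, lam=lam, steps=80)
print(limit_from_right, limit_from_left, limit_from_zero)
```

In this toy case each step contracts the distance to the nearest nonzero critical point by the factor $1/(1+\lambda)$, so the iterates started at $0.5$ and $-3$ approach the critical points $1$ and $-1$ of $I - \partial|\cdot|$, while the iterate started at the critical point $0$ stays there.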
In the interesting case of DC programming, i.e. $B = \partial g$, $C = \partial h$ and $f = g - h$, where $g$ and $h$ are two convex lower semicontinuous functions, this algorithm is nothing but

$$y_{n+1} = \mathrm{prox}_{\lambda_n g}(w_n) := \arg\min_{y \in H} \left\{ g(y) + \frac{1}{2\lambda_n} \|y - w_n\|^2 \right\} \quad \text{with } w_n \in y_n + \lambda_n \partial h(y_n).$$

It was shown in Moudafi and Maingé [14] that the sequence $(y_n)$ weakly converges to a critical point of $f$ (in other words, a solution of the problem $(P)$). More precisely, we have the next convergence result.

Theorem 3.3. Suppose that the function $f := g - h$ is bounded from below and assume that $\lambda_n \geq c > 0$ for any $n \in \mathbb{N}$; then the sequence $(f(y_n))_{n \in \mathbb{N}}$ is convergent and $\lim_{n \to +\infty} \lambda_n^{-1} \|y_n - y_{n+1}\| = 0$. Moreover, if the sequences $(y_n)$ and $(z_n)$, with $z_n \in \partial h(y_n)$, are bounded, then all cluster points $y_\infty$ and $z_\infty$ of the sequences $(y_n)$ and $(z_n)$ are solutions of $(P)$ and $(D)$, respectively.

It is worth mentioning that, in the context of DC programming, Fernandez-Cara and Moreno [9] proposed and studied a semiexplicit scheme: given $y_n$, compute $x_{n+1}$ and $y_{n+1}$ by

$$y_n \in \partial g(x_{n+1}) \quad \text{and} \quad y_{n+1} = \lambda^{-1} (I - \mathrm{prox}_{\lambda g})(x_{n+1} + \lambda y_n).$$
We would like to emphasize that Hiriart-Urruty has noted that this formulation does not take into account the convexity of both $g$ and $h$. It is in fact a kind of forward–backward splitting method, somewhat similar to that developed in the context of finding a zero of the sum of two maximal monotone mappings; see, for example, Tseng [19]. It is worth mentioning that, for the problem of finding a zero of the sum of two maximal monotone operators, algorithms which take into account the maximal monotonicity of the two mappings have been developed with success; see, for instance, [7,13]. More recently, based on the notions of A-monotonicity and H-monotonicity, some algorithms were proposed in [8,20] which not only generalize the maximal monotonicity notion, but also give a new edge to the so-called splitting algorithms.

In what follows, and in the general context of maximal monotone operators, we propose a splitting method which uses the proximal mappings of both $B$ and $C$ and is based on the fixed-point formulation $(F)$ of the regularized problem $(R)$. This algorithm works as follows: it generates, from $x_0 \in H$, a sequence $(x_n)_{n \in \mathbb{N}}$ by the rule

$$x_{n+1} = J_\lambda^B x_n + (I - J_\lambda^C) x_n \quad \forall n \in \mathbb{N}, \qquad (3.2)$$
which reduces, in the context of DC programming, to

$$x_{n+1} = \mathrm{prox}_{\lambda g}\, x_n + (I - \mathrm{prox}_{\lambda h}) x_n = \mathrm{prox}_{\lambda g}\, x_n + \lambda\, \mathrm{prox}_{\lambda^{-1} h^*}(\lambda^{-1} x_n).$$

It is worth mentioning that the weak convergence of $(x_n)$ can be achieved if the operator $G(\lambda)(x) := J_\lambda^B x + (I - J_\lambda^C)x$ is firmly nonexpansive (which is an open question). If this is the case, the result follows thanks to Browder and Petryshyn's theorem [4]. We end this section by stating a convergence result based on the Krasnoselskii–Mann theorem and by deriving two corollaries.

Proposition 3.4. Assume that $(R)$ has at least one solution, $B$ is $\alpha$-strongly monotone, $C$ is $\beta$-cocoercive and $r_n \in (0, 1)$ for all $n \in \mathbb{N}$. Then, the sequence generated by the rule

$$x_{n+1} = r_n x_n + (1 - r_n) G(\lambda) x_n \qquad (3.3)$$

weakly converges to a solution of $(R)$ provided that $\alpha\beta \geq 1$ and $\sum_n r_n (1 - r_n) = +\infty$.

Proof. Since $(I - J_\lambda^C)x = \lambda J_{\lambda^{-1}}^{C^{-1}}(\lambda^{-1} x)$ and taking into account the fact that $J_\lambda^B$ (resp. $\lambda J_{\lambda^{-1}}^{C^{-1}}(\lambda^{-1}\,\cdot)$) is a contraction with constant $\frac{1}{1 + \alpha\lambda}$ (resp. $\frac{1}{1 + \beta\lambda^{-1}}$), it is then easy to check that $G(\lambda)$ is nonexpansive provided that $\alpha\beta \geq 1$, and the result follows by applying the Krasnoselskii–Mann theorem; see, for example, [21]. □
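The relaxed rule (3.3) can be tried on a one-dimensional toy problem. The sketch below rests on assumptions made only for illustration: $B(x) = 2x - 1$ (so $\alpha = 2$), $C(x) = x$ (cocoercive with $\beta = 1$, hence $\alpha\beta \geq 1$), $\lambda = 1$ and a constant relaxation $r_n = 0.5$, for which $\sum_n r_n(1 - r_n) = +\infty$. The unique zero of $B - C$ is then $x_P = 1$, with dual solution $x_D = 1$ and regularized solution $x_R = \lambda x_D + x_P = 2$.

```python
ALPHA = 2.0    # strong monotonicity constant of B (illustrative)
OFFSET = 1.0   # B(x) = ALPHA * x - OFFSET
SLOPE_C = 1.0  # C(x) = SLOPE_C * x, cocoercive with constant 1 / SLOPE_C


def resolvent_B(x: float, lam: float) -> float:
    """Solve z + lam * (ALPHA * z - OFFSET) = x for z."""
    return (x + lam * OFFSET) / (1.0 + lam * ALPHA)


def resolvent_C(x: float, lam: float) -> float:
    """Solve z + lam * SLOPE_C * z = x for z."""
    return x / (1.0 + lam * SLOPE_C)


def G(x: float, lam: float) -> float:
    """Fixed-point map G(lam)(x) = J_lam^B(x) + (I - J_lam^C)(x)."""
    return resolvent_B(x, lam) + (x - resolvent_C(x, lam))


def krasnoselskii_mann(x0: float, lam: float, relaxation: float, steps: int) -> float:
    """Iterate x_{n+1} = r * x_n + (1 - r) * G(lam)(x_n) with a constant r in (0, 1)."""
    x = x0
    for _ in range(steps):
        x = relaxation * x + (1.0 - relaxation) * G(x, lam)
    return x


lam = 1.0
x_reg = krasnoselskii_mann(x0=10.0, lam=lam, relaxation=0.5, steps=500)
y_primal = resolvent_B(x_reg, lam)  # Proposition 2.1: J_lam^B(x_reg) solves (P)
print(x_reg, y_primal)
```

With these constants $G(\lambda)$ is Lipschitz with constant $\frac{1}{1+\alpha\lambda} + \frac{1}{1+\beta\lambda^{-1}} = \frac{1}{3} + \frac{1}{2} \leq 1$, so the iterates approach $x_R = 2$ and $J_\lambda^B x_R = 1$ is the zero of $B - C$.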
Corollary 3.5. Suppose that $(R)$ has at least one solution and assume that $J_\lambda^B$ is weakly closed; then $(y_n) := (J_\lambda^B x_n)$ weakly converges to $y_\infty = J_\lambda^B x_\infty$, which solves the initial problem $(P)$.
Proof. Indeed, the sequence $(x_n)$ weakly converges to some $x_\infty$ and, as the sequence $(y_n)$ is bounded, for any subsequence $(y_m)$ weakly converging to some $\tilde{y} \in H$, we have $\tilde{y} = J_\lambda^B x_\infty = y_\infty$; hence the weak convergence of the whole sequence to $y_\infty$. □

Corollary 3.6. Suppose that $(R)$ has at least one solution and assume that the operator $J_\lambda^B$ is compact; then the sequence $(y_n) := (J_\lambda^B x_n)$ strongly converges to $y_\infty := J_\lambda^B x_\infty$, which is a solution of problem $(P)$.

Proof. Indeed, since the sequence $(y_n)$ remains in a compact set, for any convergent subsequence $(y_m) \to y_\infty$ we have $y_\infty = J_\lambda^B x_\infty$, because $J_\lambda^B$ is maximal monotone and hence its graph is weakly–strongly closed. From this we infer the strong convergence of the whole sequence to $y_\infty$. □

4. Conclusion

It is tempting to extend the above results on critical points to "approximate critical points", which we will define with the help of the notion of $\varepsilon$-enlargement introduced in [5], instead of the exact operators. Given $\varepsilon > 0$, $x$ is said to be an $\varepsilon$-critical point of $B - C$ if $B^\varepsilon(x) \cap C^\varepsilon(x)$ is nonempty, where $B^\varepsilon$ is the $\varepsilon$-enlargement of $B$, namely

$$B^\varepsilon(x) := \{u \in H : \langle u - v, x - y \rangle \geq -\varepsilon \quad \forall y \in H,\ v \in B(y)\}.$$
1. If $x$ is an $\varepsilon$-critical point of $B - C$ and $y \in B^\varepsilon(x) \cap C^\varepsilon(x)$, then $y$ is an $\varepsilon$-critical point of $B^{-1} - C^{-1}$, because for any operator $D$ it is easy to check that $(D^\varepsilon)^{-1}(x) = (D^{-1})^\varepsilon(x)$. In contrast, results associating $\varepsilon$-critical points of $B_\lambda - C_\lambda$ with $\varepsilon$-critical points of $B - C$ are not as complete and pleasant as those for the limiting case $\varepsilon = 0$.
2. In the particular case of DC programming, if $x$ is an $\varepsilon$-critical point of $g - h$, namely $\partial_\varepsilon g(x) \cap \partial_\varepsilon h(x)$ is nonempty, then $x$ is an $\varepsilon$-critical point of $\partial g - \partial h$. This follows thanks to the fact that, for any lower semicontinuous convex function $f$, the following inclusion holds true: $\partial_\varepsilon f(x) \subset (\partial f)^\varepsilon(x)$.
3. The algorithmic procedure that we propose constitutes substantial progress since it takes into account the maximal monotonicity of both $B$ and $C$.

To conclude, substantial progress is being made in finding the zeroes of the difference of two maximal monotone operators, a problem which has not been fully explored up to now. We think that the results obtained in this paper may inspire and pave the way to future research in this field.

Acknowledgements

The author would like to thank the anonymous reviewers and the editors for their suggestions of relevant up-to-date references.

References

[1] L.T.H. An, D.T. Pham, The DC programming and DCA revised with DC Models of real world nonconvex optimization problems, Annals of Operations Research (2005) 25–46.
[2] H. Attouch, M. Théra, A duality principle for the sum of two operators, Journal of Convex Analysis 3 (N1) (1996) 1–24.
[3] J.-P. Aubin, Optima and Equilibria, Springer, 1998.
[4] F.E. Browder, W.V. Petryshyn, Construction of fixed points of nonlinear mappings in Hilbert spaces, Journal of Mathematical Analysis and Applications 20 (1967) 197–228.
[5] R.S. Burachik, A.N. Iusem, B.F. Svaiter, Enlargement of maximal monotone operators with application to variational inequalities, Set-Valued Analysis 5 (1997) 159–180.
[6] G.L. Cain Jr., M.Z. Nashed, Fixed points and stability for a sum of two operators in locally convex spaces, Pacific Journal of Mathematics 39 (N3) (1971) 581–591.
[7] J. Eckstein, D.P. Bertsekas, On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators, Mathematical Programming 55 (3) (1992) 293–318.
[8] Y.P. Fang, N.J. Huang, H-monotone operators and system of variational inclusions, Communications on Applied Nonlinear Analysis 11 (1) (2004) 93–101.
[9] E. Fernandez-Cara, C. Moreno, Critical point approximation through exact regularization, Mathematics of Computation 50 (N181) (1988) 139–153.
[10] A. Hamdi, A Moreau-Yosida regularization of a difference of two convex functions, Applied Mathematics E-Notes 5 (2005) 164–170.
[11] A. Hamdi, A modified Bregman proximal scheme to minimize the difference of two convex functions, Applied Mathematics E-Notes 6 (2006) 132–140.
[12] J.-B. Hiriart-Urruty, How to regularize a difference of convex functions, Journal of Mathematical Analysis and Applications 162 (1991) 196–209.
[13] P.-L. Lions, B. Mercier, Splitting algorithms for the sum of two nonlinear operators, SIAM Journal of Numerical Analysis 16 (N6) (1979) 964–979.
[14] A. Moudafi, P.-E. Maingé, On the convergence of an approximate proximal method for DC functions, Journal of Computational Mathematics 24 (N4) (2006) 475–480.
[15] A. Moudafi, An algorithmic approach to prox-regular variational inequalities, Applied Mathematics and Computation 155 (3) (2004) 845–852.
[16] T. Pennanen, Local convergence of the proximal point algorithm and multiplier methods without monotonicity, Mathematics of Operations Research 27 (1) (2002) 170–191.
[17] J.-P. Penot, C. Zalinescu, Continuity of usual operations and variational convergence, Set-Valued Analysis (2003) 1–32.
[18] R.T. Rockafellar, R. Wets, Variational Analysis, Springer, Berlin, 1988.
[19] P. Tseng, A modified forward–backward splitting method for maximal monotone mappings, SIAM Journal of Control and Optimization 38 (2000) 431–446.
[20] R.U. Verma, Approximation solvability of a class of nonlinear set-valued variational inclusions involving $(A, \eta)$-monotone mappings, Journal of Mathematical Analysis and Applications 337 (2008) 969–975.
[21] Q. Yang, J. Zhao, Generalized KM theorems and their applications, Inverse Problems 22 (2006) 833–844.