Operations Research Letters 14 (1993) 111-120 North-Holland
September 1993
Modified descent methods for solving the monotone variational inequality problem *
D.L. Zhu and P. Marcotte
Département de Mathématiques, Collège Militaire Royal de Saint-Jean, Richelain, Qué. J0J 1R0, Canada
Received March 1992
Revised June 1993
Recently, Fukushima proposed a differentiable optimization framework for solving strictly monotone and continuously differentiable variational inequalities. The main result of this paper is to show that Fukushima's results can be extended to monotone (not necessarily strictly monotone) and Lipschitz continuous (not necessarily continuously differentiable) variational inequalities, if one is willing to modify slightly the basic algorithmic scheme. The modification applies also to a general descent scheme introduced by Zhu and Marcotte.

Keywords: variational inequalities; descent methods; projection; global convergence.

1. Introduction
Let $C$ be a nonempty, closed and convex subset of $\mathbb{R}^n$ and let $F$ be a mapping from $\mathbb{R}^n$ into $\mathbb{R}^n$. We consider the variational inequality problem (VIP): Find $x^* \in C$ such that
$$\langle F(x^*), x^* - x \rangle \le 0 \quad \text{for all } x \text{ in } C, \tag{1}$$
where $\langle \cdot, \cdot \rangle$ denotes the standard Euclidean inner product in $\mathbb{R}^n$.

Traditionally, algorithms for solving variational inequalities have relied on the fixed point formulation of the problem, and their convergence properties were derived from topological arguments. For instance, the differences of successive iterates $x^k - x^{k+1}$ generated by the classical projection algorithm define a contracting sequence, whenever the mapping $F$ is strongly monotone (see Pang and Chan [5]). If the mapping $F$ is only monotone, a variation of the projection algorithm known as the extragradient method (see Khobotov [2]) generates a sequence of feasible points whose (unknown!) distance to the set of equilibrium solutions decreases at each iteration.

The advantage of a descent approach to the VIP is to minimize, from iteration to iteration, a measure of proximity of the current iterate to an equilibrium solution. The idea is not new: back in 1951, J. Robinson [6] proved the convergence of the fictitious play method for solving 2-person zero-sum games, an instance of skew-symmetric linear VIPs, by showing that the iterates were minimizing, although not monotonically, a certain merit function. More recently, researchers have investigated the direct minimization of merit, or 'gap' functions. In particular, Marcotte [4] has proposed a nondifferentiable optimization algorithm for minimizing the gap function
$$g(x) = \max_{y \in C} \langle F(x), x - y \rangle \tag{2}$$
under a monotonicity assumption for $F$.

* Research supported by NSERC (grant A5789) and DND-ARP (grant FUHBO).
Correspondence to: Prof. P. Marcotte, Département de Mathématiques, Collège Militaire Royal de Saint-Jean, Richelain, Qué. J0J 1R0, Canada.
0167-6377/93/$06.00 © 1993 - Elsevier Science Publishers B.V. All rights reserved
Volume 14, Number 2
OPERATIONS RESEARCH LETTERS
September 1993
Recently, Fukushima [1] reformulated the VIP as the unconstrained mathematical program
$$\min_{x \in C} \; g_\alpha(x), \tag{3}$$
where
$$g_\alpha(x) = \max_{y \in C} \; \langle F(x), x - y \rangle - \frac{1}{2\alpha} \|x - y\|_G^2 \tag{4}$$
and $\|x\|_G = \sqrt{\langle x, Gx \rangle}$ denotes the norm induced by the symmetric, positive definite matrix $G$. The main advantage of Fukushima's gap function over (2) is that if $F$ is continuously differentiable, so is $g_\alpha$, whenever $\alpha$ is finite. Fukushima [1] developed an algorithm for solving the VIP based on the direct minimization of $g_\alpha$, for finite values of the parameter $\alpha$. However, the strict descent property could only be demonstrated when $F$ was strictly monotone and continuously differentiable on $C$. It is the main objective of this paper to show that an easily implementable modification of the basic scheme can relax those two assumptions.

The paper is organized as follows: first we give some definitions and basic properties of the function $g_\alpha$; then we show how the basic algorithmic scheme of Fukushima can be modified to accommodate monotone, but not necessarily strictly monotone mappings; finally, we adapt the convergence results to a very general class of descent methods for solving the VIP.
2. Definitions and some basic properties of $g_\alpha$
In this section we recall some definitions about variational inequalities and prove some properties of the function $g_\alpha$. Throughout the paper, it will be assumed that the solution set of the VIP is nonempty. This will be the case if the set $C$ is compact, among other sufficient conditions (see Hartmann and Stampacchia [3]).

Definition 1. The mapping $F : \mathbb{R}^n \to \mathbb{R}^n$ is monotone on $C$ if
$$\langle F(y) - F(x), y - x \rangle \ge 0 \quad \text{for all } x, y \text{ in } C,$$
strictly monotone on $C$ if the above inequality is strict whenever $y \ne x$, and strongly monotone with modulus $\mu$ if
$$\langle F(y) - F(x), y - x \rangle \ge \mu \|x - y\|_2^2 \quad \text{for all } x, y \text{ in } C.$$

If $F$ is differentiable, then it is monotone on $C$ if and only if its Jacobian matrix $\nabla F(x)$ is positive semidefinite for all $x$ in $C$. Respectively, it is strongly monotone on $C$ if and only if
$$\langle \nabla F(x)(x - y), x - y \rangle \ge \mu \|x - y\|_2^2 \quad \text{for all } x, y \text{ in } C.$$
If the set $C$ is compact, the latter condition is satisfied if the Jacobian matrix $\nabla F(x)$ is positive definite for all $x$ in $C$.
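For an affine map $F(x) = Mx + b$ the Jacobian is the constant matrix $M$, so the conditions above reduce to an eigenvalue test on the symmetric part of $M$. The following is a minimal sketch for the 2x2 case only; the function names and the tolerance `tol` are illustrative assumptions and do not come from the paper.

```python
def symmetric_part(M):
    """Return (M + M^T) / 2 for a 2x2 matrix given as nested lists."""
    return [[(M[i][j] + M[j][i]) / 2.0 for j in range(2)] for i in range(2)]


def classify_affine_map(M, tol=1e-12):
    """Classify F(x) = M x + b using the smallest eigenvalue of sym(M).

    Returns (label, smallest_eigenvalue). For an affine map, a positive
    smallest eigenvalue is a valid strong monotonicity modulus with
    respect to the Euclidean norm.
    """
    S = symmetric_part(M)
    trace = S[0][0] + S[1][1]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    # Eigenvalues of a symmetric 2x2 matrix: (trace +/- sqrt(trace^2 - 4 det)) / 2.
    disc = max(trace * trace - 4.0 * det, 0.0) ** 0.5
    smallest = (trace - disc) / 2.0
    if smallest > tol:
        return "strongly monotone", smallest
    if smallest >= -tol:
        return "monotone", smallest
    return "not monotone", smallest


if __name__ == "__main__":
    print(classify_affine_map([[1.0, 1.0], [1.0, 1.0]]))    # positive semidefinite
    print(classify_affine_map([[0.0, -1.0], [1.0, 0.0]]))   # skew-symmetric (rotation)
    print(classify_affine_map([[2.0, 1.0], [-1.0, 2.0]]))   # symmetric part is 2*I
```

The test is global (it holds on all of $\mathbb{R}^2$), so it is sufficient for monotonicity on any subset $C$.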
Definition 2. The mapping $F$ is Lipschitz continuous on $C$ with Lipschitz constant $L$ if
$$\|F(x) - F(y)\|_2 \le L \|x - y\|_2 \quad \text{for all } x, y \text{ in } C.$$

It is clear that any strictly monotone mapping $F$ is monotone and that any strongly monotone mapping is strictly monotone. Let $\alpha$ denote a positive, finite parameter. For any point $x$ in $C$ we have:
$$g_\alpha(x) = \max_{y \in C} \; \langle F(x), x - y \rangle - \frac{1}{2\alpha} \|x - y\|_G^2 \tag{5}$$
$$\phantom{g_\alpha(x)} = \langle F(x), x - p_\alpha(x) \rangle - \frac{1}{2\alpha} \|x - p_\alpha(x)\|_G^2, \tag{6}$$
where $p_\alpha(x) = \mathrm{proj}_{C,G}(x - \alpha G^{-1}F(x))$ is the projection of the point $x - \alpha G^{-1}F(x)$ onto the set $C$, with respect to the $G$-norm, which can also be characterized as a solution of the variational inequality
$$\langle x - \alpha G^{-1}F(x) - p_\alpha(x), \; G(y - p_\alpha(x)) \rangle \le 0 \quad \text{for all } y \text{ in } C. \tag{7}$$
It is readily checked that $g_\alpha$ is nonnegative on $C$, and that $g_\alpha(x)$ is equal to zero if and only if $x$ is a solution of the VIP, i.e. $x = p_\alpha(x)$. If $F$ is continuously differentiable on $C$, we have:
$$\nabla g_\alpha(x) = F(x) - (\nabla F(x) - G/\alpha)(p_\alpha(x) - x). \tag{8}$$
Fukushima showed that if $F$ is also strictly monotone on $C$, the feasible direction $p_\alpha(x) - x$ is a descent direction for $g_\alpha$ at $x$, i.e.
$$g'_\alpha(x; p_\alpha(x) - x) < 0. \tag{9}$$
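The projection $p_\alpha(x)$ and the value $g_\alpha(x)$ in (6) are cheap to evaluate when the projection onto $C$ has a closed form. The sketch below assumes $G = I$ and a box-shaped set $C$ (so the projection is componentwise clipping); the map `F_example`, the box bounds and all function names are hypothetical choices made for illustration.

```python
def project_box(z, lo, hi):
    """Euclidean projection of z onto the box {y : lo <= y <= hi}."""
    return [min(max(zi, l), h) for zi, l, h in zip(z, lo, hi)]


def p_alpha(F, x, alpha, lo, hi):
    """p_alpha(x) = proj_C(x - alpha * F(x)), assuming G = I."""
    Fx = F(x)
    return project_box([xi - alpha * fi for xi, fi in zip(x, Fx)], lo, hi)


def g_alpha(F, x, alpha, lo, hi):
    """Regularized gap function evaluated through formula (6), assuming G = I."""
    Fx = F(x)
    p = p_alpha(F, x, alpha, lo, hi)
    diff = [xi - pi for xi, pi in zip(x, p)]
    inner = sum(fi * di for fi, di in zip(Fx, diff))
    return inner - sum(di * di for di in diff) / (2.0 * alpha)


def F_example(x):
    """Hypothetical strongly monotone map F(x) = x - c with c = (2, 0.5)."""
    return [x[0] - 2.0, x[1] - 0.5]


if __name__ == "__main__":
    lo, hi = [0.0, 0.0], [1.0, 1.0]
    # (1, 0.5) solves the VIP on this box, so the gap vanishes there.
    print(g_alpha(F_example, [1.0, 0.5], 1.0, lo, hi))
    # At a non-solution the gap is strictly positive.
    print(g_alpha(F_example, [0.0, 0.0], 1.0, lo, hi))
```

For a general positive definite $G$ or a general convex set $C$, the projection step would have to be replaced by a quadratic programming solve, which this sketch does not attempt.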
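The next two propositions and corollary will refine this result.

Proposition 1. Let $F$ be Lipschitz continuous (with constant $L$) and strongly monotone on $C$ (with modulus $\mu$). Then $p_\alpha(x) - x$ is a strict descent direction for $g_\alpha$ at $x$, whenever $x$ is not a solution of the VIP.

Proof. Let $x_t = x + t(p_\alpha(x) - x)$ be any point on the segment $[x, p_\alpha(x)]$. Define the function $r$ as:
$$r(u) = \min_{z \in C} \tfrac{1}{2}\|z - u\|_G^2 \tag{10}$$
$$\phantom{r(u)} = \tfrac{1}{2}\|\mathrm{proj}_{C,G}(u) - u\|_G^2. \tag{11}$$
The function $r$ is convex and continuously differentiable on $C$ with gradient $\nabla r(u) = G(u - \mathrm{proj}_{C,G}(u))$, hence it satisfies the convexity inequality
$$r(v) - r(u) \ge \langle v - u, \; G(u - \mathrm{proj}_{C,G}(u)) \rangle.$$
If one applies the previous inequality to the function
$$\psi(x) = \tfrac{1}{2}\|p_\alpha(x) - [x - \alpha G^{-1}F(x)]\|_G^2$$
one obtains
$$\psi(y) - \psi(x) \ge \langle y - x - \alpha G^{-1}F(y) + \alpha G^{-1}F(x), \; G[x - \alpha G^{-1}F(x) - p_\alpha(x)] \rangle.$$
Expanding the square in $\psi$ and comparing with (6) gives $g_\alpha(x) = \frac{\alpha}{2}\|F(x)\|_{G^{-1}}^2 - \frac{1}{\alpha}\psi(x)$. Therefore:
$$\begin{aligned}
g_\alpha(y) - g_\alpha(x) \le{} & \frac{\alpha}{2}\|F(y)\|_{G^{-1}}^2 - \frac{\alpha}{2}\|F(x)\|_{G^{-1}}^2 + \langle F(x), y - x \rangle + \langle F(y) - F(x), x - p_\alpha(x) \rangle \\
& - \frac{1}{\alpha}\langle y - x, G(x - p_\alpha(x)) \rangle + \alpha \langle F(x) - F(y), G^{-1}F(x) \rangle \\
={} & \frac{\alpha}{2}\|F(y) - F(x)\|_{G^{-1}}^2 + \langle F(x), y - x \rangle + \langle F(y) - F(x), x - p_\alpha(x) \rangle - \frac{1}{\alpha}\langle y - x, G(x - p_\alpha(x)) \rangle.
\end{aligned}$$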
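Replacing $y$ by $x_t$ in the above inequality yields:
$$\frac{1}{t}\bigl(g_\alpha(x_t) - g_\alpha(x)\bigr) \le \frac{\alpha}{2t}\|F(x_t) - F(x)\|_{G^{-1}}^2 - \frac{1}{t^2}\langle F(x_t) - F(x), x_t - x \rangle + \langle F(x), p_\alpha(x) - x \rangle + \frac{1}{\alpha}\|x - p_\alpha(x)\|_G^2 = T_1 + T_2,$$
where $T_1$ denotes the sum of the first two terms and $T_2$ the sum of the last two. Since $F$ is strongly monotone and Lipschitz continuous on $C$, we can write
$$T_1 \le \left(\frac{\alpha L^2 \|G^{-1}\|_2}{2}\, t - \mu\right)\|x - p_\alpha(x)\|_2^2.$$
On the other hand, putting $y = x$ in (7) and rearranging terms, one obtains:
$$\langle F(x), x - p_\alpha(x) \rangle \ge \frac{1}{\alpha}\|x - p_\alpha(x)\|_G^2,$$
which implies that $T_2$ is nonpositive. Finally we obtain:
$$g'_\alpha(x; p_\alpha(x) - x) = \lim_{t \to 0^+} \frac{1}{t}\bigl(g_\alpha(x_t) - g_\alpha(x)\bigr) \le -\mu\|x - p_\alpha(x)\|_2^2, \tag{12}$$
which implies the desired result. □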
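The difference-quotient bound used in the proof of Proposition 1 can be checked numerically on a small instance. The sketch below assumes $G = I$ (so $\|G^{-1}\|_2 = 1$), a unit box for $C$, and a hypothetical affine map whose matrix has symmetric part $2I$ (modulus $\mu = 2$) and spectral norm $\sqrt{5}$ (Lipschitz constant $L$); none of these data come from the paper.

```python
import math

A = [[2.0, 1.0], [-1.0, 2.0]]   # hypothetical matrix: symmetric part is 2*I
B = [-1.0, -1.0]
MU = 2.0                        # strong monotonicity modulus of F
LIP = math.sqrt(5.0)            # A^T A = 5*I, hence ||A||_2 = sqrt(5)
LO, HI = [0.0, 0.0], [1.0, 1.0] # assumed box C


def F(x):
    return [A[0][0] * x[0] + A[0][1] * x[1] + B[0],
            A[1][0] * x[0] + A[1][1] * x[1] + B[1]]


def p_alpha(x, alpha):
    """Projection of x - alpha*F(x) onto the box (G = I)."""
    return [min(max(xi - alpha * fi, lo), hi)
            for xi, fi, lo, hi in zip(x, F(x), LO, HI)]


def g_alpha(x, alpha):
    """Regularized gap function through formula (6)."""
    Fx, p = F(x), p_alpha(x, alpha)
    diff = [xi - pi for xi, pi in zip(x, p)]
    return (sum(f * d for f, d in zip(Fx, diff))
            - sum(d * d for d in diff) / (2.0 * alpha))


def quotient_and_bound(x, alpha, t):
    """Return ((g(x_t) - g(x)) / t, (alpha*t*L^2/2 - mu) * ||p - x||^2)."""
    p = p_alpha(x, alpha)
    d = [pi - xi for pi, xi in zip(p, x)]
    xt = [xi + t * di for xi, di in zip(x, d)]
    quotient = (g_alpha(xt, alpha) - g_alpha(x, alpha)) / t
    bound = (alpha * t * LIP ** 2 / 2.0 - MU) * sum(di * di for di in d)
    return quotient, bound


if __name__ == "__main__":
    for t in (0.01, 0.1, 0.5, 1.0):
        print(t, quotient_and_bound([1.0, 1.0], 1.0, t))
```

As $t$ decreases, the bound tends to $-\mu\|x - p_\alpha(x)\|_2^2$, which is the estimate (12).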
The following definition characterizes "strong" descent directions and will be required in the convergence analysis of our algorithms.

Definition 3. The direction vector $d$ is a $\gamma$-strong descent direction for $g_\alpha$ at $x$ if
$$g'_\alpha(x; d) \le -\gamma g_\alpha(x) \tag{13}$$
for some positive parameter $\gamma$.

Proposition 2. Let $F$ be Lipschitz continuous and monotone on $C$. Then, for every $x$ in $C$, either the vector $p_\alpha(x) - x$ is a strict descent direction for $g_\alpha$ at $x$, or we have:
$$g_\alpha(x) = \frac{1}{2\alpha}\|x - p_\alpha(x)\|_G^2.$$

Proof. As in the proof of Proposition 1, we have:
$$\frac{1}{t}\bigl(g_\alpha(x_t) - g_\alpha(x)\bigr) \le T_1 + T_2.$$
From the monotonicity of $F$ one obtains:
$$T_1 \le \frac{\alpha}{2t}\|F(x_t) - F(x)\|_{G^{-1}}^2 \le \frac{\alpha t L^2 \|G^{-1}\|_2}{2}\|p_\alpha(x) - x\|_2^2, \tag{14}$$
whose limit is 0 as $t$ goes to 0. By definition of $p_\alpha(x)$, the second term is
$$T_2 = \langle F(x), p_\alpha(x) - x \rangle + \frac{1}{\alpha}\|p_\alpha(x) - x\|_G^2 = -g_\alpha(x) + \frac{1}{2\alpha}\|p_\alpha(x) - x\|_G^2$$
and it follows that:
$$g'_\alpha(x; p_\alpha(x) - x) \le -g_\alpha(x) + \frac{1}{2\alpha}\|p_\alpha(x) - x\|_G^2 \le 0.$$
Finally we conclude that either $g'_\alpha(x; p_\alpha(x) - x)$ is negative or
$$-g_\alpha(x) + \frac{1}{2\alpha}\|p_\alpha(x) - x\|_G^2 = 0,$$
as required. □
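The second alternative of Proposition 2 does occur for maps that are monotone but not strictly monotone. The sketch below exhibits such a point for a hypothetical affine map with matrix of all ones, taking $G = I$ and an assumed box $C = [0, 2] \times [-3/2, 3/2]$; these data and the function names are illustrative assumptions. Along the direction $p_\alpha(x) - x$ the function $g_\alpha$ stays constant, and its value equals $\frac{1}{2\alpha}\|x - p_\alpha(x)\|^2$.

```python
LO, HI = [0.0, -1.5], [2.0, 1.5]   # assumed box C (hypothetical)


def F(x):
    """Hypothetical monotone affine map F(x) = [[1, 1], [1, 1]] x + (0, -1)."""
    s = x[0] + x[1]
    return [s, s - 1.0]


def p_alpha(x, alpha):
    """Projection of x - alpha*F(x) onto the box (G = I)."""
    return [min(max(xi - alpha * fi, lo), hi)
            for xi, fi, lo, hi in zip(x, F(x), LO, HI)]


def g_alpha(x, alpha):
    """Regularized gap function through formula (6)."""
    Fx, p = F(x), p_alpha(x, alpha)
    diff = [xi - pi for xi, pi in zip(x, p)]
    return (sum(f * d for f, d in zip(Fx, diff))
            - sum(d * d for d in diff) / (2.0 * alpha))


def equality_case(x, alpha):
    """Return (g_alpha(x), ||x - p_alpha(x)||^2 / (2 alpha))."""
    p = p_alpha(x, alpha)
    dist2 = sum((xi - pi) ** 2 for xi, pi in zip(x, p))
    return g_alpha(x, alpha), dist2 / (2.0 * alpha)


if __name__ == "__main__":
    x, alpha = [2.0, -1.5], 2.0
    print(equality_case(x, alpha))          # both entries coincide
    p = p_alpha(x, alpha)
    half_way = [xi + 0.5 * (pi - xi) for xi, pi in zip(x, p)]
    print(g_alpha(half_way, alpha))         # unchanged along p - x
```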
Corollary 1. Let $F$ be Lipschitz continuous and monotone on $C$. Then, for every $x$ in $C$, either the vector $p_\alpha(x) - x$ is a $\gamma$-strong descent direction for $g_\alpha$ at $x$, or we have:
$$g_\alpha(x) \le \frac{1}{2(1 - \gamma)\alpha}\|p_\alpha(x) - x\|_G^2. \tag{15}$$

Proof. As in the proof of Proposition 2, we have:
$$g'_\alpha(x; p_\alpha(x) - x) \le -g_\alpha(x) + \frac{1}{2\alpha}\|p_\alpha(x) - x\|_G^2.$$
If $-g_\alpha(x) + \frac{1}{2\alpha}\|p_\alpha(x) - x\|_G^2 \le -\gamma g_\alpha(x)$, then $p_\alpha(x) - x$ is a $\gamma$-strong descent direction for $g_\alpha$ at $x$. Otherwise we must have that
$$g_\alpha(x) \le \frac{1}{2(1 - \gamma)\alpha}\|p_\alpha(x) - x\|_G^2.$$
□
3. A modified Fukushima algorithm

In this section we introduce a variation on Fukushima's theme, where the parameter $\alpha$ may vary from iteration to iteration. Whenever the mapping $F$ is monotone and the set $C$ is compact, the algorithm will generate subsequences converging to solutions of the VIP.

Algorithm MF (Modified Fukushima)

Step 1. (Initialization) Let $x^0$ be a point in $C$. Let $\epsilon$ be a predetermined tolerance factor. Let $\alpha_0$, $\Delta\alpha$ and $\gamma$ be positive parameters ($\gamma < 1$). Set $k$ to zero.
Step 2. (Stopping criterion) if $g_{\alpha_k}(x^k) < \epsilon$ stop.
Step 3. (Linesearch or null step)
    if $g_{\alpha_k}(x^k) < \frac{1}{2(1 - \gamma)\alpha_k}\|p_{\alpha_k}(x^k) - x^k\|_G^2$ then (null step)
        $\alpha_{k+1} = \alpha_0 + k\Delta\alpha$
        $x^{k+1} = x^k$
    else (linesearch)
        $\alpha_{k+1} = \alpha_k$
        $d^k = p_{\alpha_k}(x^k) - x^k$
        $x^{k+1} = x^k + t_k d^k$, where $t_k \in \arg\min_{t \in [0,1]} g_{\alpha_k}(x^k + t d^k)$.
Step 4. (Incrementation) Let $k = k + 1$ and return to Step 2.

As an illustration of the above algorithm, consider the affine VIP with
$$C = \{\, x \mid 0 \le \cdots \,\} \quad \text{and} \quad F(x) = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} x + \begin{pmatrix} 0 \\ -1 \end{pmatrix}.$$
Let $G = I$. It is easily checked that $F$ is monotone on $C$ and that the equilibrium point is the point $x^* = (0, 1)^T$. Let $\alpha$ be strictly less than 4. At $x^0 = (2, -3/2)^T$ we have $p_\alpha(x^0) = (2 - \alpha/2, -3/2 + \alpha/2)^T$ and $\nabla g_\alpha(x^0) = (0, 0)$. Thus, $p_\alpha(x^0) - x^0$ is not a strict descent direction for $g_\alpha$ at $x^0$. Now set $\alpha = 6$. We have: $p_6(x^0) = (0, 3/2)^T$, $g_6(x^0) = 17/12$ and $\nabla g_6(x^0) = (-5/6, -1)$. Thus, $\langle \nabla g_6(x^0), p_6(x^0) - x^0 \rangle < 0$ and the direction $d^1 = p_6(x^0) - x^0 = (-2, 3)^T$ is a descent direction for the gap function $g_6$ at $x^0$. Minimization of $g_6$ along $d^1$ yields $x^1 = (0, 3/2)^T$, $g_6(x^1) = 3/4$, $p_6(x^1) = (0, -3/2)^T$, $\nabla g_6(x^1) = (9/2, 3)$ and $\langle \nabla g_6(x^1), p_6(x^1) - x^1 \rangle < 0$. Thus, the direction $d^2 = p_6(x^1) - x^1 = (0, -3)^T$ is also a descent direction. At $x^1$, minimization in the direction $d^2$ leads directly to the solution $x^* = (0, 1)^T$.

The remainder of the section will be devoted to the convergence proof of algorithm MF. We first introduce the notation
$$h(\beta, x) = \max_{y \in C} \; \langle F(x), x - y \rangle - \frac{\beta}{2}\|x - y\|_G^2 \tag{16}$$
and prove two simple lemmas.

Lemma 1. Let $C$ be compact and $F$ be Lipschitz continuous on $C$. Then the function $h$ is uniformly Lipschitz continuous in the variable $\beta$.

Proof. Clearly, $h$ is continuously differentiable with respect to $\beta$, for fixed $x$, and:
$$\nabla_\beta h(\beta, x) = -\tfrac{1}{2}\|x - p_{1/\beta}(x)\|_G^2.$$
Since $C$ is compact, $\nabla_\beta h(\beta, x)$ is uniformly bounded on $C$, and the result follows. □
Lemma 2. Let $C$ be compact, $F$ Lipschitz continuous on $C$ and consider a sequence $\{x^k\}$ of points in $C$ converging to $\bar{x}$. Then we have:
$$\lim_{\alpha_k \to +\infty} g_{\alpha_k}(x^k) = g(\bar{x}), \tag{17}$$
where $g$ is the gap function (2).

Proof. Let $\epsilon$ denote a (small) positive number. As a consequence of Lemma 1, there exists an index $k_1$ such that, for $k$ larger than $k_1$, we have:
$$|g_{\alpha_k}(x^k) - g(x^k)| < \tfrac{1}{2}\epsilon.$$
Since $g$ is continuous, there also exists an index $k_2$ such that
$$|g(x^k) - g(\bar{x})| < \tfrac{1}{2}\epsilon$$
for all indices $k$ larger than $k_2$. Therefore, if $k > \max\{k_1, k_2\}$ we obtain
$$|g_{\alpha_k}(x^k) - g(\bar{x})| < \epsilon.$$
□
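Before turning to the convergence theorem, Algorithm MF stated earlier in this section can be sketched in a few lines. The code below is a sketch under several assumptions that are not confirmed by the paper: $G = I$; the map is $F(x) = \begin{pmatrix}1&1\\1&1\end{pmatrix}x + (0,-1)^T$; the set $C$ is the hypothetical box $[0, 2] \times [-3/2, 3/2]$, chosen because it is consistent with the iterates quoted in the illustration; and the exact linesearch is replaced by a golden-section search, which is only reliable when $g_{\alpha_k}$ is unimodal along the search segment. All function names are illustrative.

```python
LO, HI = [0.0, -1.5], [2.0, 1.5]          # ASSUMED box C (hypothetical)


def F(x):
    """ASSUMED affine monotone map F(x) = [[1, 1], [1, 1]] x + (0, -1)."""
    s = x[0] + x[1]
    return [s, s - 1.0]


def p_alpha(x, alpha):
    """Projection of x - alpha*F(x) onto the box C (G = I)."""
    return [min(max(xi - alpha * fi, lo), hi)
            for xi, fi, lo, hi in zip(x, F(x), LO, HI)]


def g_alpha(x, alpha):
    """Regularized gap function, evaluated through formula (6)."""
    Fx, p = F(x), p_alpha(x, alpha)
    diff = [xi - pi for xi, pi in zip(x, p)]
    return (sum(f * d for f, d in zip(Fx, diff))
            - sum(d * d for d in diff) / (2.0 * alpha))


def golden_section(phi, tol=1e-6):
    """Approximate minimizer of phi on [0, 1]; assumes phi is unimodal there."""
    inv = (5.0 ** 0.5 - 1.0) / 2.0
    a, b = 0.0, 1.0
    while b - a > tol:
        c, d = b - inv * (b - a), a + inv * (b - a)
        if phi(c) < phi(d):
            b = d
        else:
            a = c
    # Also compare with the endpoints, since the minimum may sit at t = 0 or 1.
    return min((0.0, 0.5 * (a + b), 1.0), key=phi)


def mf_step(x, alpha, k, alpha0, dalpha, gamma):
    """One pass through Step 3: returns (next x, next alpha, kind of step)."""
    p = p_alpha(x, alpha)
    d = [pi - xi for pi, xi in zip(p, x)]
    threshold = sum(di * di for di in d) / (2.0 * (1.0 - gamma) * alpha)
    if g_alpha(x, alpha) < threshold:
        return list(x), alpha0 + k * dalpha, "null"
    t = golden_section(
        lambda s: g_alpha([xi + s * di for xi, di in zip(x, d)], alpha))
    return [xi + t * di for xi, di in zip(x, d)], alpha, "linesearch"


def modified_fukushima(x0, alpha0, dalpha, gamma, eps=1e-8, max_iter=200):
    """Steps 1 to 4 of Algorithm MF, with an iteration cap as a safeguard."""
    x, alpha = list(x0), alpha0
    for k in range(max_iter):
        if g_alpha(x, alpha) < eps:          # Step 2: stopping criterion
            break
        x, alpha, _ = mf_step(x, alpha, k, alpha0, dalpha, gamma)
    return x, alpha


if __name__ == "__main__":
    print(modified_fukushima([2.0, -1.5], alpha0=6.0, dalpha=1.0, gamma=0.1))
```

With these assumed data the first iteration is a linesearch that moves from $(2, -3/2)$ to approximately $(0, 3/2)$, and the second is a null step that raises $\alpha$. The iteration cap `max_iter` is a safeguard added for the sketch and is not part of the algorithm as stated.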
Theorem 1. Let the set $C$ be a nonempty, compact and convex subset of $\mathbb{R}^n$, and $F$ be a Lipschitz continuous, monotone mapping from $\mathbb{R}^n$ into $\mathbb{R}^n$. Then, from any initial point $x^0$ in $C$, either Algorithm MF stops at a solution of the VIP, or the limit of any convergent subsequence generated by the algorithm is a solution of the VIP.

Proof. Since $x^k$ and $x^k + d^k$ both belong to $C$, it follows from the convexity of the set $C$ that the sequence $\{x^k\}$ is feasible. Now, if algorithm MF stops, it clearly does so at a solution of the VIP.
Otherwise, assume that an infinite sequence is generated and consider any convergent subsequence $\{x^{k'}\}_{k' \in K'}$, with limit $\bar{x}$. We consider two cases:

(i) First case: $\lim_{k' \to \infty} \alpha_{k'} = +\infty$. In this case we must have that the condition
$$g_{\alpha_{k'}}(x^{k'}) - \frac{1}{2(1 - \gamma)\alpha_{k'}}\|p_{\alpha_{k'}}(x^{k'}) - x^{k'}\|_G^2 < 0$$
is satisfied for infinitely many indices. From Proposition 2, the compactness of $C$ and the incrementation procedure for $\alpha$, it follows that $\lim_{k' \to +\infty} g_{\alpha_{k'}}(x^{k'}) = 0$. We can conclude, from Lemma 2, that $g(\bar{x}) = 0$, and $\bar{x}$ is a solution of the variational inequality.

(ii) Second case: there exists an index $k_1$ such that $\alpha_{k'} = \alpha_{k_1}$ for all $k' > k_1$. In this case, $d^{k'} = p_{\alpha_{k'}}(x^{k'}) - x^{k'}$ is a $\gamma$-strong descent direction for all $k' > k_1$. To prove the convergence of our algorithm we will show that the assumptions of Zangwill's convergence Theorem A [7, page 91] are fulfilled. Conditions 1 and 2 of the theorem hold trivially in our case. Let us consider condition 3, namely the closedness of the algorithmic mapping, which decomposes into a direction-finding mapping and a linesearch. As stated, our linesearch is exact and its closedness results from the continuity of $g_\alpha$. We will now study the closedness of the direction mapping: $x^k \mapsto D(x^k) = d^k$. Let
$$x^{k'} \to x^\infty, \qquad d^{k'} \in D(x^{k'}), \qquad d^{k'} \to d^\infty.$$
By continuity of the projection, $d^\infty = p_{\alpha_{k_1}}(x^\infty) - x^\infty$. By construction (Step 3 of algorithm MF) we have
$$g_{\alpha_{k_1}}(x^{k'}) \ge \frac{1}{2(1 - \gamma)\alpha_{k_1}}\|p_{\alpha_{k_1}}(x^{k'}) - x^{k'}\|_G^2 \quad \text{for all } k' > k_1.$$
As in the proof of Proposition 2, we obtain
$$\frac{1}{t}\bigl(g_{\alpha_{k_1}}(x_t^{k'}) - g_{\alpha_{k_1}}(x^{k'})\bigr) \le \frac{\alpha_{k_1} t L^2 \|G^{-1}\|_2}{2}\|p_{\alpha_{k_1}}(x^{k'}) - x^{k'}\|_2^2 - \gamma g_{\alpha_{k_1}}(x^{k'}),$$
where $x_t^{k'} = x^{k'} + t(p_{\alpha_{k_1}}(x^{k'}) - x^{k'})$ for some $t \in (0, 1]$. Taking the limit in the above inequality, one can write:
$$\frac{1}{t}\bigl(g_{\alpha_{k_1}}(x_t^\infty) - g_{\alpha_{k_1}}(x^\infty)\bigr) \le \frac{\alpha_{k_1} t L^2 \|G^{-1}\|_2}{2}\|p_{\alpha_{k_1}}(x^\infty) - x^\infty\|_2^2 - \gamma g_{\alpha_{k_1}}(x^\infty),$$
where $x_t^\infty = x^\infty + t(p_{\alpha_{k_1}}(x^\infty) - x^\infty)$. Therefore:
$$g'_{\alpha_{k_1}}(x^\infty; d^\infty) \le -\gamma g_{\alpha_{k_1}}(x^\infty),$$
and if $x^\infty$ is not a solution to the VIP, $d^\infty$ is a $\gamma$-strong descent direction for $g_{\alpha_{k_1}}$ at $x^\infty$. Consequently the algorithmic mapping $D$ is closed and convergence follows from Zangwill's Theorem A. □

Remark. If the set $C$ is not compact the preceding result may fail to hold. For instance, let $F(x) = (-x_2, x_1)^T$ and $C = \mathbb{R}^2$. $F$ is monotone on $C$ and the corresponding VIP has the unique solution $x^* = (0, 0)^T$. Set $G = I$. We have:
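$$p_\alpha(x) = x - \alpha F(x), \qquad x - p_\alpha(x) = \alpha F(x),$$
$$\nabla g_\alpha(x) = \alpha (x_1, x_2)^T, \qquad \langle \nabla g_\alpha(x), p_\alpha(x) - x \rangle = 0,$$
$$g_\alpha(x) = \frac{1}{2\alpha}\|p_\alpha(x) - x\|_2^2 = \frac{\alpha}{2}\|F(x)\|_2^2 = \frac{\alpha}{2}\|x\|_2^2,$$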
and
$$g_\alpha(x + t(p_\alpha(x) - x)) = \frac{\alpha}{2}\|x - t\alpha F(x)\|_2^2 = \frac{\alpha}{2}(1 + t^2\alpha^2)\|x\|_2^2 = (1 + t^2\alpha^2)\, g_\alpha(x).$$
Hence, for any nonzero vector $x$, the gap function $g_\alpha$ increases along the direction specified by algorithm MF.
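The identity in this remark is easy to confirm numerically. The sketch below evaluates $g_\alpha$ for the rotation map with $C = \mathbb{R}^2$ and $G = I$, so that the projection is the identity; the function names and the sample point are illustrative choices.

```python
def F(x):
    """Rotation map from the remark: F(x) = (-x2, x1)."""
    return [-x[1], x[0]]


def p_alpha(x, alpha):
    """With C = R^2 and G = I the projection is the identity."""
    return [xi - alpha * fi for xi, fi in zip(x, F(x))]


def g_alpha(x, alpha):
    """Regularized gap function through formula (6)."""
    Fx, p = F(x), p_alpha(x, alpha)
    diff = [xi - pi for xi, pi in zip(x, p)]
    return (sum(f * d for f, d in zip(Fx, diff))
            - sum(d * d for d in diff) / (2.0 * alpha))


def g_along_direction(x, alpha, t):
    """Value of g_alpha at x + t * (p_alpha(x) - x)."""
    p = p_alpha(x, alpha)
    return g_alpha([xi + t * (pi - xi) for xi, pi in zip(x, p)], alpha)


if __name__ == "__main__":
    x, alpha = [3.0, 4.0], 0.5
    for t in (0.0, 0.5, 1.0):
        # The ratio should equal 1 + t^2 * alpha^2.
        print(t, g_along_direction(x, alpha, t) / g_alpha(x, alpha))
```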
4. Adaptation of the modified algorithm to a general descent framework

In [8], Zhu and Marcotte extended Fukushima's framework, retaining his strong monotonicity assumption for $F$. We will now show that this assumption can be relaxed, as was the case for the basic scheme. For the sake of completeness, we recall some definitions and results from [8].

Definition 4. Let $\alpha$ be a positive parameter. We define the generalized gap function $G_\alpha$ associated with the VIP as
$$G_\alpha(x) = \max_{y \in C} \; \langle F(x), x - y \rangle - \frac{1}{\alpha}\phi(x, y) \tag{18}$$
where $\phi(x, y) : C \times C \subset \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ has the following properties:
- it is continuously differentiable on $C \times C$.
- it is strictly convex in $y$ on $C$.
- its Hessian matrix with respect to $y$ is symmetric on $C$.

If $F$ is continuously differentiable, then $G_\alpha$ is also continuously differentiable and its gradient is given by
$$\nabla G_\alpha(x) = F(x) + \nabla F(x)(x - W_\alpha(x)) - \frac{1}{\alpha}\nabla_x \phi(x, W_\alpha(x)) \tag{19}$$
where $W_\alpha(x)$ is the unique solution of the optimization problem in $y$ on the righthand side of (18), and plays the role played by $p_\alpha(x)$ in the preceding sections of this paper. With $\phi(x, y) = \frac{1}{2}\|x - y\|_G^2$, formula (19) coincides with (8).

Proposition 3. Let the mapping $F : \mathbb{R}^n \to \mathbb{R}^n$ be continuously differentiable and monotone on $C$. Then, for each $x \in C$, either the vector $d_\alpha = W_\alpha(x) - x$ is a descent direction for $G_\alpha$ at $x$, or we have
$$G_\alpha(x) = \frac{1}{\alpha}\bigl\{\langle \nabla_x \phi(x, W_\alpha(x)), x - W_\alpha(x) \rangle - \phi(x, W_\alpha(x))\bigr\}.$$

Proof. Using the expression (19) for the gradient of $G_\alpha$ we obtain:
$$\langle \nabla G_\alpha(x), W_\alpha(x) - x \rangle = \Bigl\langle F(x) - \frac{1}{\alpha}\nabla_x \phi(x, W_\alpha(x)), W_\alpha(x) - x \Bigr\rangle - \langle \nabla F(x)(W_\alpha(x) - x), W_\alpha(x) - x \rangle.$$
Since $F$ is monotone on $C$:
$$\langle \nabla G_\alpha(x), W_\alpha(x) - x \rangle \le -G_\alpha(x) + \frac{1}{\alpha}\bigl\{\langle \nabla_x \phi(x, W_\alpha(x)), x - W_\alpha(x) \rangle - \phi(x, W_\alpha(x))\bigr\}$$
and the result follows. □

Proposition 3 can be used to prove the validity of the following algorithm, under the assumption that the set $C$ be compact.
General algorithm

Step 1. (Initialization) Let $x^0 \in C$. Let $\epsilon$ be a predetermined tolerance factor. Let $\alpha_0$, $\Delta\alpha$ be positive parameters. Set $k$ to zero.
Step 2. (Solve the auxiliary maximization problem)
    $W_{\alpha_k}(x^k) \in \arg\max_{y \in C} \; \langle F(x^k), x^k - y \rangle - (1/\alpha_k)\phi(x^k, y)$
    $d^k = W_{\alpha_k}(x^k) - x^k$.
Step 3. (Stopping criterion)
    if $G_{\alpha_k}(x^k) < \epsilon$ then stop.
    if $G_{\alpha_k}(x^k) \le \ldots$ then
        $\alpha_{k+1} = \alpha_0 + k\Delta\alpha$
        $k = k + 1$
        go to Step 2.
    else
        $\alpha_{k+1} = \alpha_k$
Step 4. (Armijo-type linesearch) Select parameters $\beta \in (0, 1)$ and $\sigma \in (0, 1)$
    if $G_{\alpha_k}(x^k + d^k) \le G_{\alpha_k}(x^k)$ then
        $t = 1$
    else
        $m = 0$
        while $G_{\alpha_k}(x^k) - G_{\alpha_k}(x^k + \beta^m d^k) < -\sigma\beta^m \langle \nabla G_{\alpha_k}(x^k), d^k \rangle$ do
            $m = m + 1$
        endwhile
        $t = \beta^m$
    endif
    $x^{k+1} = x^k + t d^k$
Step 5. (Incrementation) Let $k = k + 1$ and go to Step 2.

The next theorem summarizes our work.

Theorem 2. Assume that the set $C$ is a nonempty, compact and convex subset of $\mathbb{R}^n$, that the mapping $F : \mathbb{R}^n \to \mathbb{R}^n$ is continuously differentiable and monotone on $C$, and that $\phi$ is defined as above. Then, for any starting point $x^0$ in $C$, either the algorithm stops at a solution of the VIP, or the limit of any convergent subsequence is a solution of the VIP.

Proof. The arguments are similar to those used in the proof of Theorem 1. □
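The Armijo-type rule of Step 4 of the general algorithm can be sketched independently of the choice of $\phi$. The function below takes the merit function as a callable and the directional slope $\langle \nabla G_{\alpha_k}(x^k), d^k \rangle$ as a number; the names `armijo_step` and `merit`, the default parameter values, and the cap `max_backtracks` are assumptions added for the sketch (the cap guards against an endless loop when the direction is not a descent direction).

```python
def armijo_step(merit, x, d, slope, beta=0.5, sigma=0.1, max_backtracks=50):
    """Return the step length t chosen by the Armijo-type rule of Step 4.

    merit : callable returning the merit value at a point (list of floats)
    slope : inner product of the merit gradient at x with the direction d
    """
    def shifted(t):
        return [xi + t * di for xi, di in zip(x, d)]

    value_at_x = merit(x)
    # Full step is accepted as soon as it does not increase the merit function.
    if merit(shifted(1.0)) <= value_at_x:
        return 1.0
    m = 0
    # Backtrack while the achieved decrease is below the required fraction.
    while (value_at_x - merit(shifted(beta ** m)) < -sigma * beta ** m * slope
           and m < max_backtracks):
        m += 1
    return beta ** m


if __name__ == "__main__":
    # Hypothetical merit function 0.5 * ||x||^2, whose gradient at x is x itself.
    merit = lambda x: 0.5 * sum(xi * xi for xi in x)
    x, d = [1.0, 0.0], [-4.0, 0.0]
    slope = sum(xi * di for xi, di in zip(x, d))
    print(armijo_step(merit, x, d, slope))
```

In the general algorithm the callable would be $G_{\alpha_k}$ and the direction would be $d^k = W_{\alpha_k}(x^k) - x^k$; computing $W_{\alpha_k}(x^k)$ for a general $\phi$ requires solving the auxiliary problem of Step 2, which is outside the scope of this sketch.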
Remark. If we set $\phi(x, y) = \frac{1}{2}\|x - y\|_G^2$ then the general algorithm reduces to the modified projection algorithm of section 3.
References
[1] M. Fukushima, "Equivalent differentiable optimization problems and descent methods for asymmetric variational inequality problems", Math. Programming 53, 99-110 (1992).
[2] E.N. Khobotov, "Modification of the extragradient method for solving variational inequalities and certain optimization problems", USSR Computational Math. Physics 27, 120-127 (1987).
[3] Hartmann and Stampacchia, "On some nonlinear elliptic differential functional equations", Acta Mathematica 115, 153-188 (1966).
[4] P. Marcotte, "A new algorithm for solving variational inequalities with application to the traffic assignment problem", Math. Programming 33, 339-351 (1985).
[5] J.S. Pang and D. Chan, "Iterative methods for variational and complementarity problems", Math. Programming 24, 284-313 (1984).
[6] J. Robinson, "An iterative method of solving a game", Ann. Math. 54, 296-301 (1951).
[7] W.I. Zangwill, Nonlinear Programming: A Unified Approach, Prentice-Hall, Englewood Cliffs, NJ, 1969.
[8] D.L. Zhu and P. Marcotte, "An extended descent framework for monotone variational inequalities", publication 917, Centre de Recherche sur les Transports, Université de Montréal, Montréal, Québec, Canada (1993), forthcoming in JOTA.