U.S.S.R. Comput. Maths. Math. Phys. Vol. 19, pp. 57-71 © Pergamon Press Ltd. 1980. Printed in Great Britain.
0041-5553/79/0201/0057$07.50/0
A GRADIENT METHOD FOR THE MODIFIED LAGRANGE FUNCTION*
G. D. MAISTROVSKII
Kharkov
(Received 4 October 1977)
A GRADIENT method with an adaptive procedure for choosing the step length is applied to search for the saddle point of the modified Lagrange function of a convex programming problem. It is shown that the process is convergent to a saddle point. When sufficient conditions for a strict regular maximum are satisfied, the rate of convergence is exponential.

Consider in $R^n$ the convex programming problem

$$\max f(x), \qquad g_i(x) \ge 0, \quad i = 1, 2, \dots, m, \tag{0.1}$$
with non-empty set $Z^* = X^* \times Y^*$ of saddle points of the Lagrange function $L$. We shall assume that the concave functions $f, g_1, \dots, g_m$ are differentiable, and that their derivatives satisfy a Lipschitz condition in any compactum.

A gradient method for seeking a point of the set $Z^*$ was proposed in [1], and is usually known as the method of Lagrange multipliers. This method is not in general convergent to the set $Z^*$, even when the standard sufficient conditions for a second-order maximum are satisfied along with the conditions for strict regularity. Convergence holds only when extra conditions on the spectrum of the Hessian of the function $L$ are satisfied, assuming, of course, that the step length of the method is sufficiently small. If the set of saddle points is stable with respect to $x$, the method is convergent to the $\varepsilon$-neighbourhood of the set $Z^*$ for any $\varepsilon > 0$, provided that the step length is sufficiently small compared with $\varepsilon$ (see [2]). In the absence of stability (this is the situation e.g. in a linear programming problem), even $\varepsilon$-convergence cannot be guaranteed.

The method described in [1] is not an algorithm, inasmuch as there is no procedure for finding the step length. Selection of the step factors is difficult, for the following reasons. A fixed step multiplier may not ensure the required smallness of the step. If a sequence of step multipliers $\{\tau^k\}_0^\infty$ is chosen, so that the conditions

$$\lim_{k\to\infty}\tau^k = 0, \qquad \sum_{k=0}^{\infty}\tau^k = \infty, \qquad \max_k \tau^k \le \bar\tau$$

are satisfied, convergence to the set $Z^*$ is ensured (provided that the set of saddle points is stable with respect to $x$), though only in the case when $\bar\tau$ is sufficiently small [2]. But it is not obvious a priori how this latter quantity should be chosen, since no effective estimates are available. Moreover, if we use a method in which the step multiplier tends to zero, we inevitably get slow convergence.

In the present paper, the method described in [1], supplemented by an adaptive procedure for choosing the step length, is applied to a modified Lagrange function, instead of to the classical function itself. We will show that the method is convergent to the set $Z^*$, while $\lim_{k\to\infty}\tau^k > 0$. In the case when the standard conditions for a maximum, and the conditions of strict regularity, are satisfied, the method converges at the rate of a geometric progression. In Sections 1 and 2, the method is studied in the context of a class of concave-convex functions which are not in general connected with problem (0.1); and in Section 3 our results are illustrated by taking the example of the modified function, studied in [4-7] and some other papers.

*Zh. vychisl. Mat. mat. Fiz., 19, 1, 56-69, 1979.
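The failure of the plain multiplier method with a fixed step can be seen on a very small example. The sketch below is an illustration only: the bilinear function $\psi(x, y) = xy$, the starting point and the step value are assumptions chosen for the demonstration and do not come from the paper. For this function each step multiplies the distance to the saddle point $(0, 0)$ by $\sqrt{1 + \tau^2}$, so the iterates move away from the saddle point for every fixed $\tau > 0$.

```python
import math

def plain_gradient_step(x, y, tau):
    # Hypothetical test function psi(x, y) = x * y, so psi_x = y and psi_y = x.
    # Ascent in x, descent in y, with a fixed step multiplier tau.
    return x + tau * y, y - tau * x

def distance_after(steps, tau, x=1.0, y=0.0):
    # Distance from the saddle point (0, 0) after the given number of steps.
    for _ in range(steps):
        x, y = plain_gradient_step(x, y, tau)
    return math.hypot(x, y)

start = distance_after(0, 0.1)
end = distance_after(100, 0.1)
growth = end / start  # equals (1 + tau**2) ** (steps / 2) for this function
```

Reducing the fixed step slows the growth but does not remove it, which is in line with the remark above that a fixed step multiplier may not ensure convergence.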
1. Description of the method and proof of convergence
Let $\psi$ be a concave-convex differentiable function in $R^n \times R^m$, with a derivative that satisfies a Lipschitz condition in any compactum. Let $\psi'(z)$ denote the value of the derivative at an arbitrary point $z = (x, y)$, and $\psi_x(x, y)$, $\psi_y(x, y)$ the partial derivatives with respect to $x$ and $y$. Hence, $\psi'(z) = (\psi_x(x, y), \psi_y(x, y))$. We shall assume that the set $Z^* = X^* \times Y^*$ of saddle points of $\psi$ is non-empty. Given any concave-convex function, we have the inequality (see [1, 8])

$$(x - x^*, \psi_x(x, y)) - (y - y^*, \psi_y(x, y)) \le 0, \qquad (x, y) \in R^n \times R^m, \quad (x^*, y^*) \in Z^*. \tag{1.1}$$

We shall impose a stronger condition on $\psi$: a $\gamma > 0$ exists such that

$$(x - x^*, \psi_x(x, y)) - (y - y^*, \psi_y(x, y)) \le -\gamma\|\psi_y(x, y)\|^2, \qquad (x, y) \in R^n \times R^m, \quad (x^*, y^*) \in Z^*. \tag{1.2}$$

Here and below, $\|\cdot\|$ denotes the Euclidean norm. Let $z^0 = (x^0, y^0)$ be an arbitrary point of the space $R^n \times R^m$, and $\tau^0$ any positive number satisfying the inequality $\tau^0 \le \min\{\gamma, 2^{-1}\|\psi'(z^0)\|^{-2}\}$. Consider the sequence $\{z^k\}_0^\infty$, $z^k = (x^k, y^k) \in R^n \times R^m$, and the numerical sequence $\{\tau^k\}_0^\infty$, defined respectively by the recurrence relations

$$x^{k+1} = x^k + \tau^k\psi_x(x^k, y^k), \qquad y^{k+1} = y^k - \tau^k\psi_y(x^k, y^k), \tag{1.3}$$

$$\tau^{k+1} = \min\{\tau^k(1 - \tau^k\|\psi'(z^k)\|^2),\ 2^{-1}\|\psi'(z^{k+1})\|^{-2}\}. \tag{1.4}$$
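The recurrence relations (1.3), (1.4) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the author's implementation: the test function $\psi(x, y) = -x^2/2 + xy$, the names `psi_grad` and `adaptive_process`, and the starting point are all hypothetical. For this function the left-hand side of (1.2) equals $-x^2 = -\|\psi_y\|^2$, so (1.2) holds with $\gamma = 1$, and the only saddle point is the origin.

```python
def psi_grad(x, y):
    # Hypothetical test function psi(x, y) = -x**2 / 2 + x * y.
    # Returns (psi_x, psi_y).
    return -x + y, x

def adaptive_process(x, y, gamma, steps):
    # The starting point must not be a saddle point (the gradient must be non-zero).
    gx, gy = psi_grad(x, y)
    n = gx * gx + gy * gy                      # ||psi'(z^0)||^2
    tau = min(gamma, 0.5 / n)                  # admissible initial step tau^0
    taus = [tau]
    for _ in range(steps):
        x, y = x + tau * gx, y - tau * gy      # relation (1.3)
        gx_new, gy_new = psi_grad(x, y)
        n_new = gx_new * gx_new + gy_new * gy_new
        if n_new == 0.0:                       # exact saddle point reached
            break
        # relation (1.4): shrink the step, keeping it below 1 / (2 ||psi'(z^{k+1})||^2)
        tau = min(tau * (1.0 - tau * n), 0.5 / n_new)
        gx, gy, n = gx_new, gy_new, n_new
        taus.append(tau)
    return x, y, taus

x_final, y_final, taus = adaptive_process(1.0, 1.0, gamma=1.0, steps=2000)
```

In this run the step multipliers form a non-increasing sequence that stays away from zero, while the iterates approach the saddle point, which is the behaviour that Theorem 1 below asserts in general.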
To avoid the need to make formal remarks regarding division by a quantity that vanishes for $z^k \in Z^*$, we shall exclude from our discussion the trivial case of convergence to the set $Z^*$ after a finite number of steps.

Theorem 1

1. There exists a number $k_0$ such that

$$\tau^{k+1} = \tau^k(1 - \tau^k\|\psi'(z^k)\|^2), \qquad k > k_0.$$

2. $$\lim_{k\to\infty}\tau^k > 0. \tag{1.5}$$

3. The sequence $\{z^k\}_0^\infty$ is convergent to a point of the set $Z^*$.
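For the proof, we need three preliminary lemmas. In accordance with the accepted terminology, we shall say that process (1.3), (1.4) is compact if the sequence $\{\|z^k\|\}_0^\infty$ is bounded.

Lemma 1

Process (1.3), (1.4) is compact.

Proof. Let $z^* = (x^*, y^*)$ be an arbitrary point of the set $Z^*$. It follows from (1.3) that

$$\|z^{k+1} - z^*\|^2 = \|z^k - z^*\|^2 + 2\tau^k[(x^k - x^*, \psi_x(z^k)) - (y^k - y^*, \psi_y(z^k))] + (\tau^k)^2\|\psi'(z^k)\|^2.$$

Hence we find, using inequality (1.1),

$$\|z^{k+1} - z^*\|^2 \le \|z^k - z^*\|^2 + (\tau^k)^2\|\psi'(z^k)\|^2. \tag{1.6}$$

On the other hand, it follows from (1.4) that

$$\tau^{k+1} \le \tau^k - (\tau^k)^2\|\psi'(z^k)\|^2. \tag{1.7}$$

Adding (1.6) and (1.7) we get

$$\|z^{k+1} - z^*\|^2 + \tau^{k+1} \le \|z^k - z^*\|^2 + \tau^k,$$

and hence

$$\|z^k - z^*\|^2 + \tau^k \le \|z^0 - z^*\|^2 + \tau^0.$$

Now observe that, from relation (1.4), we have $\tau^k \le 2^{-1}\|\psi'(z^k)\|^{-2}$, $k = 1, 2, \dots$. Hence, by the same expression, $\tau^{k+1} \ge 2^{-1}\min\{\tau^k, \|\psi'(z^{k+1})\|^{-2}\} > 0$, $k = 0, 1, \dots$. Hence

$$\|z^k - z^*\|^2 \le \|z^0 - z^*\|^2 + \tau^0.$$

This proves the lemma.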
Lemma 2

A number $k_0$ exists such that

$$\tau^{k+1} = \tau^k(1 - \tau^k\|\psi'(z^k)\|^2), \qquad k > k_0. \tag{1.8}$$

Proof. Assume that the lemma is false. Then, it follows from (1.4) that, for an infinite subsequence of numbers $k$, we have

$$\tau^{k+1} = 2^{-1}\|\psi'(z^{k+1})\|^{-2}.$$

To arrive at a contradiction, it is sufficient to show that, in fact,

$$\lim_{i\to\infty}\tau^i\|\psi'(z^i)\|^2 = 0. \tag{1.9}$$

If the series

$$\sum_{i=0}^{\infty}\tau^i\|\psi'(z^i)\|^2 \tag{1.10}$$

is convergent, then Eq. (1.9) is satisfied. Now assume that series (1.10) is divergent. From inequality (1.7) we have

$$\tau^{k+1} \le \tau^0\prod_{i=0}^{k}(1 - \tau^i\|\psi'(z^i)\|^2).$$

Since the series (1.10) is divergent, it follows from the last inequality that

$$\lim_{k\to\infty}\tau^k = 0. \tag{1.11}$$

But process (1.3), (1.4) is compact. Hence

$$\sup_k\|\psi'(z^k)\| < \infty.$$

Equation (1.9) follows from the last two relations. This proves the lemma.

Fix a point $z^* = (x^*, y^*) \in Z^*$ and consider the function $G$ (where $\gamma$ is the quantity appearing in inequality (1.2)). We choose a convex compactum $K$ containing all points $z^k$. The possibility of such a choice is ensured by the compactness of the process. Let $M$ be the Lipschitz constant for the derivative of the function $G$ in the set $K$, and $\bar\tau$ any positive number satisfying
? <27&I-‘.
Lemma 3

If $\tau^k \le \bar\tau$, then

(1.12)

Proof. Using Taylor’s formula, we arrive at the inequality
G (F’) i.e.
+
~ll~~+~-zI/l~,
G (zk+‘)
Hence,
using
(&-x*,
y”) ~~‘-~t~v(~,
$,(r’, 3”) ii’>+
y”)) -
(y”--y’, $v(t’, y”) ) 1
$(Tk)zj/‘$%k)
\I’*
inequality (1.2), and collecting like terms, we obtain
This proves the lemma.

Proof of Theorem 1. Para. 1 is already proved. To prove Para. 2, we start by assuming that series (1.10) is divergent. We showed in the proof of Lemma 2 that this assumption implies that Eq. (1.11) is satisfied. Hence a number $k_1$ exists such that, for all $k \ge k_1$, we have $\tau^k \le \bar\tau$, where $\bar\tau$ is the quantity appearing in the statement of Lemma 3. Hence, for $k \ge k_1$, inequalities (1.12) hold. On adding these inequalities, we obtain

$$\sum_{k=k_1}^{\infty}\tau^k\|\psi'(z^k)\|^2 \le G(z^{k_1}) - \liminf_{k\to\infty}G(z^k).$$

Since the process is compact, the sequence $\{G(z^k)\}_0^\infty$ is bounded. Hence series (1.10) is in fact convergent. Hence, from Eq. (1.8) and the well-known property of an infinite product, we obtain inequality (1.5).

Let us now prove Para. 3 of the theorem. It follows from inequality (1.5) and the convergence of series (1.10) that

$$\sum_{k=0}^{\infty}\|\psi'(z^k)\|^2 < \infty. \tag{1.13}$$

Hence

$$\lim_{k\to\infty}\psi'(z^k) = 0. \tag{1.14}$$

Since the process is compact, it has at least one limit point $z^*$. It follows from (1.14) that $\psi'(z^*) = 0$, and hence $z^* \in Z^*$. We choose an arbitrary $\varepsilon > 0$. In view of inequality (1.13), a number $k_2$ exists such that

(1.15)

Since $z^*$ is a limit point of the process, a number $k_3 > k_2$ exists, such that

(1.16)

But it follows from inequality (1.16) that, for any number $k > k_3$,

(1.17)

On combining inequalities (1.15)-(1.17), we obtain

$$\|z^k - z^*\| < \varepsilon, \qquad k > k_3.$$

Since $\varepsilon$ is an arbitrary positive number, it now follows that

$$\lim_{k\to\infty}z^k = z^*.$$

This proves the theorem.
2. Rate of convergence

Consider the conditions under which the convergence of the process at the rate of a geometric progression can be guaranteed. We shall use the following standard notation. If $T: R^n \to R^m$ is a linear operator, then $T^*$ is the adjoint operator to $T$, $\operatorname{Im} T$ is the image of $T$, and $\operatorname{Ker} T$ is the kernel of $T$. We denote by $I$ the identity mapping in any space. By an eigenvalue of a real operator we mean an eigenvalue of its complex extension. The term “spectrum” will be used in the corresponding way.

Theorem 2

Let the set $Z^*$ consist of a single point $z^* = (x^*, y^*)$, let the function $\psi$ be thrice differentiable at this point, let the operator $\psi_{xx}(x^*, y^*)$ be negative definite, and the operator $\psi_{yy}(x^*, y^*)$ non-negative, and let $\operatorname{Ker}\psi_{yy}(x^*, y^*) = \operatorname{Im}\psi_{yx}(x^*, y^*)$. Then process (1.3), (1.4) is convergent to the point $z^*$ at the rate of a geometric progression.

We consider in $R^{n+m}$ the mapping $F$, defined by the equation $F(z) = (\psi_x(x, y), -\psi_y(x, y))$, $z = (x, y)$. Let $\tau = \lim_{k\to\infty}\tau^k$. By Theorem 1, $\tau > 0$, the sequence $\{z^k\}_0^\infty$ is convergent to the point $z^*$, while, for all sufficiently large $k$, we have the equations

$$z^{k+1} = z^k + \tau^k F(z^k), \qquad \tau^{k+1} = \tau^k(1 - \tau^k\|F(z^k)\|^2). \tag{2.1}$$

We put $D = I + \tau F'(z^*)$. Then the operator $D$ has the matrix representation

$$D = \begin{pmatrix} I - A & B^* \\ -B & I - C \end{pmatrix}, \tag{2.2}$$

where $A = -\tau\psi_{xx}(x^*, y^*)$, $B = \tau\psi_{yx}(x^*, y^*)$, $C = \tau\psi_{yy}(x^*, y^*)$. It can be shown that, if $\|A\| < 2$ and $\|C\| < 2$, i.e.

$$\tau\|\psi_{xx}(x^*, y^*)\| < 2, \qquad \tau\|\psi_{yy}(x^*, y^*)\| < 2, \tag{2.3}$$

then the spectral radius of operator $D$ is less than 1. In this situation, our theorem can easily be derived from Lyapunov’s theorem. The difficulty thus lies in the fact that satisfaction of inequalities (2.3) cannot in general be guaranteed. (In particular, this means that the unit circle does not necessarily split up the spectrum of operator $D$.) A similar difficulty was overcome in [9, 10] when proving the exponential convergence of the iterative process $\{z^k\}_0^\infty$ satisfying the condition for finiteness of the path length, i.e.
$$\sum_{k=0}^{\infty}\|z^{k+1} - z^k\| < \infty. \tag{2.4}$$

Admittedly, our process (2.1) is not iterative with respect to the sequence $\{z^k\}_0^\infty$, since the step multiplier is not constant. Also, instead of condition (2.4), we can only guarantee a priori for it the weaker condition

$$\sum_{k=0}^{\infty}\|z^{k+1} - z^k\|^2 < \infty.$$

Yet in spite of these differences, it turns out that the scheme developed in [9, 10] is applicable to our present situation. We shall therefore use this scheme. As a preliminary, we shall examine the spectral properties of the operator $D$.

Lemma 4

Let $A$, $B$, $C$ be linear operators, $A: R^n \to R^n$, $B: R^n \to R^m$, $C: R^m \to R^m$, satisfying the conditions

$$A = A^* > B^*B, \qquad C = C^* \ge 0, \qquad \operatorname{Ker} C = \operatorname{Im} B, \tag{2.5}$$

while $D: R^{n+m} \to R^{n+m}$ is the operator defined by Eq. (2.2). Then:

(1) all the real eigenvalues of the operator $D$ are less than 1,
(2) all the non-real eigenvalues of the operator $D$ are inside the unit circle,
(3) the Jordan cells corresponding to the real eigenvalues $\mu \le -1$ are one-dimensional.

Proof. Let $\mu + i\lambda$ be an eigenvalue of the operator $D$, and $(u + ir, s + iv)$ the corresponding non-zero eigenvector. Then,

$$D(u + ir, s + iv) = (\mu + i\lambda)(u + ir, s + iv).$$
We rewrite this equation as

$$(I - A)u + B^*s = \mu u - \lambda r, \tag{2.6}$$

$$-Bu + (I - C)s = \mu s - \lambda v, \tag{2.7}$$

$$(I - A)r + B^*v = \lambda u + \mu r, \tag{2.8}$$

$$-Br + (I - C)v = \lambda s + \mu v. \tag{2.9}$$

We shall make use of the consequences of these equations obtained from them in the same basic way: we form the scalar product of each equation with a certain vector, and then add the resulting equations. As the four sets of four vectors we take

$$(u, s, r, v), \quad (-(1-\mu)u + \lambda r,\ Bu,\ -(1-\mu)r - \lambda u,\ Br), \quad (r, -v, -u, s), \quad (0, Cv, 0, -Cs).$$

As a result, we obtain respectively:

$$(1-\mu)(\|u\|^2 + \|r\|^2 + \|s\|^2 + \|v\|^2) = (Au, u) + (Ar, r) + (Cs, s) + (Cv, v), \tag{2.10}$$

$$[\lambda^2 - (1-\mu)^2](\|u\|^2 + \|r\|^2) + (1-\mu)((Au, u) + (Ar, r)) - (\|Bu\|^2 + \|Br\|^2) = 0, \tag{2.11}$$

$$\lambda(\|s\|^2 + \|v\|^2 - \|u\|^2 - \|r\|^2) = 0, \qquad \lambda((Cs, s) + (Cv, v)) = 0. \tag{2.12}$$

Here we have made use of the equation $CB = 0$, which follows from the third of conditions (2.5).

Let us prove para. (1) of the lemma. Let the real number $\mu$ be an eigenvalue of operator $D$. This implies that $\lambda = 0$, $r = 0$, $v = 0$. It follows from (2.10) that $\mu \le 1$. Let $\mu = 1$. Since the operator $A$ is positive definite, and the operator $C$ is non-negative definite, we find from (2.10) that $u = 0$ and $Cs = 0$. Then, (2.6) gives us $B^*s = 0$. Since $\operatorname{Ker} C = \operatorname{Im} B$, the equations $Cs = 0$ and $B^*s = 0$ imply that $s = 0$. Hence $(u + ir, s + iv)$ is a zero vector; and this contradicts our hypothesis.

Turn to para. (2). Let $\mu + i\lambda$ be a non-real eigenvalue. Since $\lambda \ne 0$, it follows from (2.12) that $\|s\|^2 + \|v\|^2 = \|u\|^2 + \|r\|^2 \ne 0$ and $Cv = Cs = 0$. Then, we can rewrite Eq. (2.10) as

$$2(1-\mu)(\|u\|^2 + \|r\|^2) = (Au, u) + (Ar, r). \tag{2.13}$$

From (2.11) and (2.13) we obtain

$$\mu^2 + \lambda^2 = 1 - \frac{((A - B^*B)u, u) + ((A - B^*B)r, r)}{\|u\|^2 + \|r\|^2}.$$

Since, by hypothesis, $A - B^*B > 0$, the last equation implies that $\mu^2 + \lambda^2 < 1$.

We now prove para. (3). Let a Jordan cell of the operator $D$ which is not one-dimensional correspond to the real eigenvalue $\mu$. Let $w_1 = (u_1, s_1)$ be the non-zero eigenvector, and $w_2 = (u_2, s_2)$ the associated first-order vector, of this cell. Then,

$$Dw_1 = \mu w_1, \qquad Dw_2 = w_1 + \mu w_2,$$

i.e.

$$(I - A)u_1 + B^*s_1 = \mu u_1, \tag{2.14}$$

$$-Bu_1 + (I - C)s_1 = \mu s_1, \tag{2.15}$$

$$(I - A)u_2 + B^*s_2 = u_1 + \mu u_2, \tag{2.16}$$

$$-Bu_2 + (I - C)s_2 = s_1 + \mu s_2. \tag{2.17}$$

We multiply Eqs. (2.14)-(2.17) by the vectors $(1-\mu)^2u_2 - (1-\mu)u_1$, $2Bu_1 - (1-\mu)Bu_2$, $-(1-\mu)^2u_1$, $(1-\mu)Bu_1$ respectively and add. Recalling that $CB = 0$, we obtain

$$(1-\mu)(Au_1, u_1) = 2\|Bu_1\|^2. \tag{2.18}$$

In the same way, multiplying (2.14)-(2.17) by the vectors $u_2$, $-s_2$, $-u_1$, $s_1$ and adding, we obtain $\|s_1\|^2 = \|u_1\|^2$. Since $w_1 \ne 0$, it follows from the last equation that $u_1 \ne 0$ also. Then, the condition $A > B^*B$ gives $(Au_1, u_1) > \|Bu_1\|^2$. On combining this inequality with (2.18), we find that $\mu > -1$. The lemma is proved.
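The statements of Lemma 4 can be checked numerically in the simplest case $n = m = 1$. The sketch below is illustrative only: the helper names and the three parameter triples are assumptions, not taken from the paper. With scalar blocks and $b \ne 0$, the condition $\operatorname{Ker} C = \operatorname{Im} B$ forces $c = 0$, and the first of conditions (2.5) reads $a > b^2$.

```python
import cmath

def eigenvalues_of_D(a, b, c):
    # D = [[1 - a, b], [-b, 1 - c]] for scalar blocks A = a, B = b, C = c.
    trace = 2.0 - a - c
    det = (1.0 - a) * (1.0 - c) + b * b
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0

def satisfies_lemma_4(mu, tol=1e-12):
    # Non-real eigenvalues must lie inside the unit circle,
    # real eigenvalues must be less than 1.
    if abs(mu.imag) > tol:
        return abs(mu) < 1.0
    return mu.real < 1.0

# Hypothetical triples (a, b, c) with a > b**2 and c = 0.
cases = [(0.5, 0.5, 0.0), (1.5, 0.3, 0.0), (3.0, 1.0, 0.0)]
results = [all(satisfies_lemma_4(mu) for mu in eigenvalues_of_D(*case)) for case in cases]
smallest_real_part = min(mu.real for mu in eigenvalues_of_D(3.0, 1.0, 0.0))
```

The third triple has $a > 2$, so inequalities (2.3) fail, and one eigenvalue lies below $-1$. The lemma still holds for it, which shows why the lemma is stated in terms of paras. (1)-(3) rather than as a bound on the spectral radius.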
Proof of Theorem 2. By the hypothesis of the theorem, the mapping $F$ is twice differentiable at the point $z^*$. Moreover, $F(z^*) = 0$. To simplify the notation, we shall assume during our proof that $z^* = 0$. This obviously implies no loss of generality. Then, in the neighbourhood of zero,

$$F(z) = F'(0)z + o(\|z\|).$$

Since $\tau = \lim_{k\to\infty}\tau^k$, the first of Eqs. (2.1) can be rewritten as

$$z^{k+1} = Dz^k + o(\|z^k\|). \tag{2.19}$$

We decompose the space $R^{n+m}$ into the direct sum of subspaces $U$, $V$, and $W$, corresponding to the parts of the spectrum of the operator $D$ lying respectively inside, outside, and on the unit circle. Let the corresponding decomposition for the operator $D$ be $D = D_1 + D_2 + D_3$. Since the spectral radius of the operator $D_1$ is less than 1, we can introduce into the space $U$ a norm such that $D_1$ is a contraction operator. Denote this norm by $|\cdot|$. Hence,

$$|D_1| \le q < 1. \tag{2.20}$$

Similarly, in the appropriate norm in space $V$:

$$|D_2^{-1}| \le q < 1. \tag{2.21}$$
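In space $W$ we choose any norm $|\cdot|$. By $|z|$, with $z \in R^{n+m}$, we shall mean the quantity $|z| = |u| + |v| + |w|$, where $z = u + v + w$ is the direct resolution of the vector $z$.

The conditions of Lemma 4 are satisfied for the operator $D$. For, expanding the vectors $\psi_x(x, y^*)$, $\psi_y(x, y^*)$ in Taylor series with respect to $x$ in the neighbourhood of $x^*$, we can rewrite condition (1.2), with $y = y^*$, as

$$-(\psi_{xx}(x^*, y^*)(x - x^*), x - x^*) \ge \gamma\|\psi_{yx}(x^*, y^*)(x - x^*)\|^2 + o(\|x - x^*\|^2),$$

i.e.

$$(A(x - x^*), x - x^*) \ge \frac{\gamma}{\tau}\|B(x - x^*)\|^2 + o(\|x - x^*\|^2).$$

Since $x$ is any point of the neighbourhood of $x^*$, this last inequality is equivalent to $A \ge \gamma B^*B/\tau$. Since $\tau^0 \le \gamma$, and the sequence $\{\tau^k\}_0^\infty$ is decreasing, we have $\gamma > \tau$. Moreover, by the hypothesis of the theorem, $A > 0$. The first of conditions (2.5) follows from these facts. The other conditions of Lemma 4 are obviously satisfied. By the lemma, all the eigenvalues of the operator $D$ lying on the unit circle are equal to $-1$, and the corresponding Jordan cells are one-dimensional. This means that

$$D_3 = -I. \tag{2.22}$$

Using relations (2.20)-(2.22), we find from Eq. (2.19) that

$$|u^{k+1}| \le q|u^k| + \varepsilon^k|z^k|, \qquad |v^{k+1}| \ge q^{-1}|v^k| - \varepsilon^k|z^k|, \qquad |w^{k+1}| \ge |w^k| - \varepsilon^k|z^k|, \tag{2.23}$$

where $u^k + v^k + w^k = z^k$, while $\{\varepsilon^k\}_0^\infty$ is a sequence of non-negative numbers, convergent to zero. We put $\alpha^k = |z^k|^{-1}(|v^k| + |w^k|)$. Then, $|u^k| = (1 - \alpha^k)|z^k|$. It follows from (2.23) that

$$|u^{k+1}| \le (q(1 - \alpha^k) + \varepsilon^k)|z^k|, \qquad |v^{k+1}| + |w^{k+1}| \ge (\alpha^k - 2\varepsilon^k)|z^k|.$$

On cross-multiplying these inequalities, we obtain

$$(\alpha^k - 2\varepsilon^k)|u^{k+1}| \le (q(1 - \alpha^k) + \varepsilon^k)(|v^{k+1}| + |w^{k+1}|).$$

Hence we obtain

$$\alpha^{k+1} \ge \frac{\alpha^k - 2\varepsilon^k}{(1 - q)\alpha^k + q}. \tag{2.24}$$

Let us now show that there exists

$$\alpha = \lim_{k\to\infty}\alpha^k, \tag{2.25}$$

where $\alpha = 0$ or 1. We put

$$\alpha = \limsup_{k\to\infty}\alpha^k. \tag{2.26}$$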
Since $0 \le \alpha^k \le 1$, then also $0 \le \alpha \le 1$. If $\alpha = 0$, our assertion is proved. Let $\alpha > 0$. We choose an arbitrary $\nu \in (0, \alpha)$. Since $\varepsilon^k \to 0$, a number $k_1$ exists such that

$$\nu(1 - \nu)(1 - q) \ge 2\varepsilon^k, \qquad k \ge k_1. \tag{2.27}$$

By definition (2.26) of $\alpha$, a $k_2 \ge k_1$ exists such that $\alpha^{k_2} \ge \nu$. Then, from (2.24),

$$\alpha^k \ge \nu, \qquad k \ge k_2. \tag{2.28}$$

For, if (2.28) holds for some $k \ge k_2$, we obtain from (2.24) and (2.27):

$$\alpha^{k+1} \ge \frac{\nu - 2\varepsilon^k}{(1 - q)\nu + q} \ge \nu.$$

Since $\nu$ is an arbitrary number less than $\alpha$, we derive (2.25) from (2.26) and (2.28). Let us now pass to the limit in (2.24):

$$\alpha \ge \frac{\alpha}{(1 - q)\alpha + q}.$$

Since $0 < \alpha \le 1$, this gives $\alpha = 1$. Hence, $\alpha = 0$ or 1.

We put $\beta^k = |z^k|^{-1}(|u^k| + |w^k|)$. Using the same method as when proving (2.25), we can show that there exists

$$\beta = \lim_{k\to\infty}\beta^k,$$

where $\beta = 0$ or 1. In short, four cases are logically possible: $\alpha = 0$ and $\beta = 0$, $\alpha = 1$ and $\beta = 0$, $\alpha = 1$ and $\beta = 1$, and $\alpha = 0$ and $\beta = 1$. The first case is not realised, since $\alpha^k + \beta^k = 1 + |z^k|^{-1}|w^k| \ge 1$ and hence $\alpha + \beta \ge 1$. The second case implies that

$$z^k = v^k + o(|v^k|), \tag{2.29}$$

the third, that

$$z^k = w^k + o(|w^k|), \tag{2.30}$$

and the fourth, that

$$z^k = u^k + o(|u^k|). \tag{2.31}$$

Let us show that, in fact, either of relations (2.29) or (2.30) contradicts the convergence to zero of the sequence $\{z^k\}_0^\infty$. In fact, it follows from (2.29) and the second of inequalities (2.23), that

$$|v^{k+1}| \ge (q^{-1} + o(1))|v^k|,$$

and hence, for all sufficiently large $k$, we have $|z^{k+1}| > |z^k|$.
Now assume that relations (2.30) hold. From the Taylor expansion

$$F(z) = F'(0)z + \tfrac{1}{2}F''(0)[z, z] + o(\|z\|^2)$$

we obtain

$$z^{k+1} = z^k + \tau^kFz^k = z^k + \tau^kF'(0)z^k + \tfrac{1}{2}\tau^kF''(0)[z^k, z^k] + o(|z^k|^2). \tag{2.32}$$

We put $\varepsilon_k = \tau^k - \tau$ and observe that $F'(0) = \tau^{-1}(D - I)$. Hence we can rewrite (2.32) as

$$z^{k+1} = Dz^k + \frac{\varepsilon_k}{\tau}(D - I)z^k + \tfrac{1}{2}\tau^kF''(0)[z^k, z^k] + o(|z^k|^2).$$

Since $D_3 = -I$, this equation gives in the $W$-component:

$$w^{k+1} = -\left(1 + \frac{2\varepsilon_k}{\tau}\right)w^k + R[z^k, z^k] + o(|z^k|^2),$$

where $R$ is the projection of the mapping $\tfrac{1}{2}\tau F''(0)$ on to the subspace $W$. Using relation (2.30), we rewrite the last equation in the final form

$$w^{k+1} = -\left(1 + \frac{2\varepsilon_k}{\tau}\right)w^k + R[w^k, w^k] + \sigma_k, \qquad \sigma_k = o(|w^k|^2). \tag{2.33}$$

We shall first show that, by virtue of Eq. (2.33), a constant $C_1 > 0$ exists such that

$$|w^k| \le C_1\varepsilon_k. \tag{2.34}$$

In fact, (2.33) implies that

$$|w^{k+1}| \ge |w^k| - C_2|w^k|^2, \qquad C_2 > 0. \tag{2.35}$$

On the other hand,

$$\tau^{k+1} = \tau^k - (\tau^k)^2\|Fz^k\|^2 \le \tau^k - \tau^2\|Fz^k\|^2. \tag{2.36}$$

By Lemma 4, unity is not a point of the spectrum of operator $D$. This means that 0 is not a point of the spectrum of the operator $F'(0)$. Hence a constant $C_3 > 0$ exists such that $\|Fz^k\| \ge C_3\|z^k\|$. But, in finite-dimensional space, all norms are equivalent. Hence the last inequality gives $\|Fz^k\| \ge C_4|z^k| \ge C_4|w^k|$, where $C_4 > 0$. In short, from (2.36) we obtain

$$\tau^{k+1} \le \tau^k - \tau^2C_4^2|w^k|^2. \tag{2.37}$$

On comparing inequalities (2.35) and (2.37), we obtain

$$|w^k| - |w^{k+1}| \le C_1(\tau^k - \tau^{k+1}), \qquad C_1 = \frac{C_2}{\tau^2C_4^2}.$$
On adding these inequalities, from an arbitrary fixed superscript $k$ to infinity, we arrive at relation (2.34).

The mapping $R$ occurring in Eq. (2.33) is quadratic. Hence, after elementary working, we can obtain from (2.33) the following recurrence relation of depth 2:

$$w^{k+2} = \left(1 + \frac{2\varepsilon_{k+1}}{\tau}\right)\left(1 + \frac{2\varepsilon_k}{\tau}\right)w^k + s_k, \tag{2.38}$$

where $s_k = o(|w^k|^2)$. In the last relation, there is no quadratic term. This circumstance is decisive. For, it follows from (2.38) and (2.34) that

$$|w^{k+2}| \ge |w^k| + \frac{2}{\tau C_1}|w^k|^2 + o(|w^k|^2),$$

i.e., for all sufficiently large $k$, we have $|w^{k+2}| > |w^k|$. The inequality obtained obviously contradicts the convergence to zero of the sequence $\{w^k\}_0^\infty$.

In short, in the conditions of the theorem, relations (2.31) are necessarily satisfied. Hence it follows from (2.19) that $u^{k+1} = D_1u^k + o(|u^k|)$. From this and (2.20) we find that $|u^{k+1}| \le q|u^k| + o(|u^k|)$. Consequently,

$$\limsup_{k\to\infty}(|u^k|)^{1/k} \le q.$$

It only remains to use relation (2.31) a second time. Theorem 2 is proved.
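The geometric rate asserted by Theorem 2 can be observed numerically. The following sketch rests on assumptions that are not in the paper: it reuses the hypothetical test function $\psi(x, y) = -x^2/2 + xy$, for which $D = I + \tau F'(z^*)$ has the complex eigenvalues of modulus $\sqrt{1 - \tau + \tau^2}$, and it compares this modulus with the empirical quantity $(\|z^k\| / \|z^0\|)^{1/k}$.

```python
import math

def grad(x, y):
    # Hypothetical test function psi(x, y) = -x**2 / 2 + x * y: returns (psi_x, psi_y).
    return -x + y, x

def empirical_rate(x, y, gamma, steps):
    norm0 = math.hypot(x, y)
    gx, gy = grad(x, y)
    n = gx * gx + gy * gy
    tau = min(gamma, 0.5 / n)
    for _ in range(steps):
        x, y = x + tau * gx, y - tau * gy              # relation (1.3)
        gx_new, gy_new = grad(x, y)
        n_new = gx_new * gx_new + gy_new * gy_new
        tau = min(tau * (1.0 - tau * n), 0.5 / n_new)  # relation (1.4)
        gx, gy, n = gx_new, gy_new, n_new
    # k-th root of the contraction achieved over k steps, and the last step multiplier.
    return (math.hypot(x, y) / norm0) ** (1.0 / steps), tau

rate, tau_last = empirical_rate(1.0, 1.0, gamma=1.0, steps=2000)
predicted = math.sqrt(1.0 - tau_last + tau_last * tau_last)
```

The last step multiplier is used here as a stand-in for the limit $\tau$, so the agreement between `rate` and `predicted` is only approximate.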
3. Application to a problem of convex programming

Now let $\psi$ be the modified Lagrangian of problem (0.1): i.e.

$$\psi(x, y) = f(x) + \frac{1}{2\gamma}\sum_{i=1}^{m}\left(y_i^2 - [y_i - \gamma g_i(x)]_+^2\right). \tag{3.1}$$

Here, $[a]_+ = (a + |a|)/2$ is the positive part of the number $a$. The function $\psi$ is concave-convex. The set of its saddle points in $R^n \times R^m$ is the same as the set $X^* \times Y^*$ of saddle points of the classical Lagrangian $L$ in $R^n \times R_+^m$. Under our assumptions about problem (0.1), the function $\psi$ is differentiable and its derivative satisfies in any compactum a Lipschitz condition (see [3-6]). It is easily shown by direct working that, given any points $(x, y) \in R^{n+m}$ and $(x^*, y^*) \in Z^*$, we have the equation

$$(x - x^*, \psi_x(x, y)) - (y - y^*, \psi_y(x, y)) = [(x - x^*, L_x(x, u)) - (u - y^*, L_y(x, u))] - \gamma\|\psi_y(x, y)\|^2 + \gamma^{-1}(y^*, v), \tag{3.2}$$
where $u_i = [y_i - \gamma g_i(x)]_+$, $v_i = [y_i - \gamma g_i(x)]_-$.

The first term on the right-hand side of (3.2) is non-positive, since inequality (1.1) holds for the function $L$ in $R^n \times R_+^m$, while the third term is also non-positive, since $y^* \ge 0$ and $v \le 0$. On discarding these terms, we arrive at inequality (1.2). This inequality was also obtained for the modified Lagrangian in [7]. In short, all the assumptions made in Section 1 hold for the modified Lagrange function. Hence we obtain from Theorem 1:
Corollary 1

If $\psi$ is the modified function (3.1) of problem (0.1), the following statements are true for process (1.3), (1.4):

1) a number $k_0$ exists, such that $\tau^{k+1} = \tau^k(1 - \tau^k\|\psi'(z^k)\|^2)$, $k > k_0$;

2) $\lim_{k\to\infty}\tau^k > 0$;

3) the sequence $\{z^k\}_0^\infty$ is convergent to a point of the set $Z^*$.

Now let $z^* = (x^*, y^*)$ be a saddle point of the Lagrange function $L$ of problem (0.1) and let the functions $f$ and $g_i$ be thrice differentiable at the point $x^*$. For clarity, we shall assume that the constraints $g_i(x) \ge 0$, $i = 1, 2, \dots, p$, are active at the point $x^*$, and are passive for $i = p + 1, p + 2, \dots, m$. Let $L^0$ be the classical Lagrange function corresponding to the active constraints, i.e.

$$L^0(x, y) = f(x) + \sum_{i=1}^{p}y_ig_i(x).$$

We shall assume that the sufficient conditions for a second-order maximum, and the conditions of strict regularity, are satisfied at the point $(x^*, y^*)$. Recall that this implies the following:

(a) the operator $L^0_{yx}(x^*, y^*)$ has rank $p$ (i.e. the gradients of the active constraints are linearly independent),

(b) $y_i^* > 0$, $i = 1, 2, \dots, p$,

(c) there is no non-zero vector $h$ satisfying the conditions $L^0_{yx}(x^*, y^*)h = 0$ and $L^0_{xx}(x^*, y^*)h = 0$.

Under these assumptions, $z^*$ is the unique saddle point of the function $\psi$ in $R^n \times R^m$. If $i = 1, 2, \dots, p$, then, by condition (b), $y_i^* - \gamma g_i(x^*) = y_i^* > 0$. If $i = p + 1, p + 2, \dots, m$, then $y_i^* - \gamma g_i(x^*) = -\gamma g_i(x^*) < 0$. Hence, in some neighbourhood of the point $z^*$, we have $y_i - \gamma g_i(x) > 0$ for $i \le p$ and $y_i - \gamma g_i(x) < 0$ for $i > p$. Hence Eq. (3.1) takes the form in this neighbourhood:
$$\psi(x, y) = L^0(x, y) - \frac{\gamma}{2}\sum_{i=1}^{p}g_i^2(x) + \frac{1}{2\gamma}\sum_{i=p+1}^{m}y_i^2.$$

Then,

$$\psi_{xx} = L^0_{xx} - \gamma L^0_{xy}L^0_{yx}, \qquad \psi_{yx} = L^0_{yx}, \qquad \psi_{yy} = \gamma^{-1}P.$$

Here, all the derivatives are calculated at the point $(x^*, y^*)$, while $P$ is the orthogonal projector in $R^m$ onto the coordinate subspace corresponding to the coordinates numbers $p + 1, p + 2, \dots, m$. The operator $\psi_{xx}$ is obviously non-positive, while it follows from condition (c) that it is negative definite. Condition (a) implies that $\operatorname{Im}\psi_{yx}$ is the same as the coordinate subspace corresponding to the coordinates numbers $1, 2, \dots, p$. Hence $\operatorname{Ker}\psi_{yy} = \operatorname{Im}\psi_{yx}$. The conditions of Theorem 2 are thus satisfied.

Corollary 2

Assume that the sufficient conditions for a second-order maximum, and the condition of strict regularity, are satisfied at the saddle point $z^* = (x^*, y^*)$ of the Lagrange function of problem (0.1), and that the functions $f$ and $g_i$ are thrice differentiable at the point $x^*$. Then process (1.3), (1.4) is convergent to the point $z^*$ at the rate of a geometric progression.

Translated by D. E. Brown
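The application described in Section 3 can be sketched on a one-dimensional problem. Everything specific in this sketch is an assumption made for illustration: the toy problem $\max -(x - 2)^2/2$ subject to $1 - x \ge 0$ (with saddle point $x^* = 1$, $y^* = 1$), the value $\gamma = 1$, the starting point, and the form of the modified function, which is taken as $\psi(x, y) = f(x) + (2\gamma)^{-1}(y^2 - [y - \gamma g(x)]_+^2)$.

```python
def modified_lagrangian_grad(x, y, gamma):
    # Hypothetical toy problem: f(x) = -(x - 2)**2 / 2, g(x) = 1 - x >= 0.
    u = max(0.0, y - gamma * (1.0 - x))   # u = [y - gamma * g(x)]_+
    psi_x = (2.0 - x) - u                 # f'(x) + u * g'(x), with g'(x) = -1
    psi_y = (y - u) / gamma               # derivative of the assumed psi in y
    return psi_x, psi_y

def solve(x, y, gamma, steps):
    gx, gy = modified_lagrangian_grad(x, y, gamma)
    n = gx * gx + gy * gy
    tau = min(gamma, 0.5 / n)             # admissible initial step
    for _ in range(steps):
        x, y = x + tau * gx, y - tau * gy                   # relation (1.3)
        gx_new, gy_new = modified_lagrangian_grad(x, y, gamma)
        n_new = gx_new * gx_new + gy_new * gy_new
        if n_new == 0.0:                                    # exact saddle point reached
            break
        tau = min(tau * (1.0 - tau * n), 0.5 / n_new)       # relation (1.4)
        gx, gy, n = gx_new, gy_new, n_new
    return x, y, tau

x_opt, y_opt, tau_last = solve(0.0, 0.0, gamma=1.0, steps=20000)
```

Note that the multiplier $y$ is not projected onto the non-negative orthant: the process runs in the whole of $R^n \times R^m$, as in Corollary 1.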
REFERENCES

1. UZAWA, H., Iterative methods of concave programming, in: Studies in linear and non-linear programming, Stanford U.P., 1960 (Russian translation, IIL, Moscow, 1962), pp. 228-245.
2. MAISTROVSKII, G. D., A gradient method for finding saddle points, Ekonomika matem. metody, 12, No. 5, 917-929, 1976.
3. WIERZBICKI, A. P., A penalty function shifting method in constrained static optimization and its convergence properties, Arch. automat. telemech., 16, No. 4, 395-416, 1971.
4. ROCKAFELLAR, R. T., The multiplier method of Hestenes and Powell applied to convex programming, J. Optimiz. Theory Appl., 12, No. 6, 555-562, 1973.
5. TRET’YAKOV, N. V., The method of penalty estimates for problems of convex programming, Ekonomika matem. metody, 9, No. 3, 526-540, 1973.
6. GOL’SHTEIN, E. G. and TRET’YAKOV, N. V., A gradient method of minimization and convex programming algorithms connected with modified Lagrange functions, Ekonomika matem. metody, 11, No. 4, 730-742, 1975.
7. GOL’SHTEIN, E. G., The convergence of a gradient method for seeking saddle points of modified Lagrange functions, Ekonomika matem. metody, 13, No. 2, 322-329, 1977.
8. ZANGWILL, W. I., Nonlinear programming (Russian translation, Sov. radio, Moscow, 1973).
9. LYUBICH, Yu. I., The rate of convergence of stationary gradient relaxation, Zh. vychisl. Mat. mat. Fiz., 6, No. 2, 356-359, 1966.
10. OL’KHOVSKII, Yu. G., The rate of convergence of iterative processes, in: Computational mathematics and computing techniques (Vychisl. matem. i vychisl. tekhn.), No. II, FTINT Akad. Nauk UkSSR, Kharkov, 1971, pp. 7-10.