U.S.S.R. Comput. Maths. Math. Phys., Vol. 23, No. 2, pp. 36-44, 1983. Printed in Great Britain.

TWO MODIFICATIONS OF THE LINEARIZATION METHOD IN NON-LINEAR PROGRAMMING*

A. I. GOLIKOV and V. G. ZHADAN

*Zh. vychisl. Mat. mat. Fiz., 23, 2, 314-325, 1983.

In Pshenichnyi's linearization method for solving the general problem of non-linear programming, an auxiliary quadratic programming problem is solved at each step. In the two modifications of the linearization method described below, auxiliary problems of linear programming are solved instead. The properties of these auxiliary problems are studied, the convergence of the methods to the solution of the non-linear programming problem is proved, and features of their numerical realization are discussed.

1. Introduction.

We consider the non-linear programming problem

$$\min_{x\in X} f(x), \qquad X = \{x \in E_n \mid g^i(x) \le 0, \ i = 1, 2, \dots, m\},$$   (1.1)

where E_n is n-dimensional Euclidean space, and f(x), g^i(x), i = 1, 2, ..., m, are continuously differentiable functions given in E_n. Let X_* be the set of solutions of problem (1.1), and R the set of points x_* at which the first-order necessary conditions for a minimum are satisfied, i.e.,

$$f_x(x_*) + \sum_{i=1}^{m} u_*^i g_x^i(x_*) = 0, \qquad g^i(x_*) \le 0,$$   (1.2)

$$u_*^i g^i(x_*) = 0, \qquad u_*^i \ge 0, \qquad i = 1, 2, \dots, m.$$   (1.3)

Here f_x, g_x^i are the column vectors of gradients of the functions f and g^i, and u_* = [u_*^1, ..., u_*^m] is the vector of Lagrange multipliers. We assume that X_* is not empty and that X_* ⊂ R. We introduce the following notation:

g^0(x) is the function identically equal to zero;

$$\varphi(x) = \max_{0\le i\le m} g^i(x);$$

$$T(a) = \sum_{i=1}^{m} a^i, \quad \text{where } a = [a^1, \dots, a^m];$$

$$J_\delta(x) = \{1 \le i \le m \mid g^i(x) \ge \varphi(x) - \delta\}, \qquad J_\delta^0(x) = \{0 \le i \le m \mid g^i(x) \ge \varphi(x) - \delta\},$$

where δ is a non-negative parameter;

$$\langle a, b\rangle = \sum_{j=1}^{n} a^j b^j$$

is the scalar product of the vectors a = [a^1, ..., a^n], b = [b^1, ..., b^n] in E_n, and ‖a‖ = ⟨a, a⟩^{1/2} is the Euclidean norm in E_n. We shall also use the norm

$$\|a\|_1 = \sum_{j=1}^{n} |a^j|$$

and the Chebyshev norm

$$\|a\|_\infty = \max_{1\le j\le n} |a^j|.$$
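In computational terms this notation translates directly into code. The following minimal Python sketch (NumPy assumed; the function names and the 0-based indexing are ours, not the paper's) fixes the conventions used in the later sketches; g_vals stands for a hypothetical array holding g^1(x), ..., g^m(x).

```python
import numpy as np

def phi(g_vals):
    """phi(x) = max over 0 <= i <= m of g^i(x), with g^0(x) identically zero."""
    return max(0.0, float(np.max(g_vals)))   # g_vals = [g^1(x), ..., g^m(x)]

def T(u):
    """T(u) = sum of the components of u = [u^1, ..., u^m]."""
    return float(np.sum(u))

def J_delta(g_vals, delta):
    """J_delta(x) = {i : g^i(x) >= phi(x) - delta}, returned 0-based."""
    thr = phi(g_vals) - delta
    return [i for i, gi in enumerate(g_vals) if gi >= thr]
```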

We know that the linearization method described in /1/ can be treated as a method of minimizing the unsmooth penalty function

$$G(x, t) = f(x) + t\varphi(x),$$   (1.4)

where t is a fairly large positive parameter. This interpretation is extremely useful for elucidating the properties and features of the method, and essential use will be made of it throughout. We call [x, t] ∈ E_{n+1} a stationary point of the function G(x, t) if

$$\inf_{\|p\|\le 1} \frac{\partial G(x,t)}{\partial p} \ge 0, \qquad \frac{\partial G(x,t)}{\partial t} = 0.$$   (1.5)

Here ∂G(x,t)/∂p is the derivative of G(x,t) with respect to the direction p ∈ E_n. In accordance with /2/,

$$\frac{\partial G(x,t)}{\partial p} = \langle f_x(x), p\rangle + t\max_{i\in J_0^0(x)} \langle g_x^i(x), p\rangle, \qquad \frac{\partial G(x,t)}{\partial t} = \varphi(x).$$

There is a close connection between the points of the set R and the stationary points of G(x,t).

Lemma 1. Let x_* ∈ R; then [x_*, t] is a stationary point of the function G(x,t) for all t ≥ T(u_*). Conversely, if [x_*, t'] is a stationary point of G(x,t), then x_* ∈ R.

Proof. Assume for simplicity that T(u_*) > 0 and that u_*^i = 0 for i ∉ J_0(x_*). Put v^i = u_*^i/t for i ∈ J_0(x_*) and v^0 = 1 − T(u_*)/t, and take t ≥ T(u_*). Then, noting that g^i(x_*) = φ(x_*) = 0 for i ∈ J_0^0(x_*), we obtain from (1.2):

$$\sum_{i\in J_0^0(x_*)} v^i \left[f_x(x_*) + t\, g_x^i(x_*)\right] = 0,$$   (1.6)

where v^i ≥ 0, i ∈ J_0^0(x_*), and

$$\sum_{i\in J_0^0(x_*)} v^i = 1.$$   (1.7)

Relations (1.6), (1.7) are equivalent to the first condition (1.5) (see /2/). The second condition (1.5) is satisfied since the point x_* is admissible. Next assume that [x_*, t'] is a stationary point of G(x,t). Since then φ(x_*) = 0, we have x_* ∈ X. Moreover, if we put u_*^i = t'v^i, i ∈ J_0(x_*), and u_*^i = 0, i ∉ J_0(x_*), then (1.6), (1.7) lead to (1.2), (1.3). The lemma is proved.

If (1.1) is a problem of convex programming, then the function G(x,t) is convex with respect to x, so that [x_*, t], where x_* ∈ X_* and t is sufficiently large, is not only a stationary point of G(x,t): x_* is a minimum point of G(x,t) with respect to x. Under certain assumptions this property is also retained in the non-convex case. The second-order sufficient conditions for an isolated local solution /3/ are satisfied at x_* if x_* ∈ R and, in addition, ⟨y, L_{xx}(x_*, u_*)y⟩ > 0 for any non-zero vector y ∈ E_n such that ⟨g_x^i(x_*), y⟩ ≤ 0, i ∈ J_0(x_*), with ⟨g_x^i(x_*), y⟩ = 0 when u_*^i > 0. Here

$$L(x, u) = f(x) + \sum_{i=1}^{m} u^i g^i(x)$$

is the Lagrange function and L_{xx} is the matrix of its second derivatives.
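As a small worked illustration of conditions (1.2), (1.3) (not taken from the paper; all names are ours), the residuals of these conditions can be checked numerically for a candidate pair (x_*, u_*):

```python
import numpy as np

def kkt_residuals(f_grad, g_vals, g_grads, u):
    """Residuals of (1.2), (1.3): g_grads is the m-by-n matrix whose rows
    are the gradients g_x^i(x); u is the vector of Lagrange multipliers."""
    stationarity = f_grad + g_grads.T @ u                   # f_x + sum_i u^i g_x^i
    feasibility = float(np.max(np.maximum(g_vals, 0.0)))    # violation of g^i(x) <= 0
    complementarity = float(np.max(np.abs(u * g_vals)))     # u^i g^i(x) = 0
    return float(np.linalg.norm(stationarity)), feasibility, complementarity
```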

Theorem 1. Let f(x), g^i(x), i = 1, 2, ..., m, be twice continuously differentiable functions, and let the second-order sufficient conditions for an isolated local solution of (1.1) be satisfied at x_*. Then, for any t > T(u_*), a strict local minimum of the function G(x,t) with respect to x is attained at x_*.

The question of when the solutions of unconstrained minimization problems for unsmooth penalty functions coincide with the solutions of a convex programming problem has been thoroughly studied (see, e.g., /4, 5/); for the general problem of non-linear programming it was considered in /6/. Theorem 1 is an obvious consequence of the results obtained in /6/.

2. Auxiliary problem. The first modification of the linearization method is based on solving, at each step of the iterative process, at the current point x, the auxiliary linear programming problem

$$\min_{p,\sigma}\ \langle f_x(x), p\rangle + M(x)\sigma,$$   (2.1a)

$$\langle g_x^i(x), p\rangle + g^i(x) - \sigma \le 0, \qquad i \in J_\delta(x),$$   (2.1b)

$$|p^j| - \sigma \le 1, \quad j = 1, 2, \dots, n, \qquad \sigma \ge 0,$$   (2.1c)

where p = [p^1, ..., p^n], M(x) is a continuous function in E_n such that M(x) ≥ 1 + n^{1/2}‖f_x(x)‖, and δ is a positive constant.

We shall give some lemmas relevant to problem (2.1). For brevity, we denote the target function of the problem by h(q), where q = [p, σ] ∈ E_{n+1}. In general, the solution of the problem may not be unique. Denote by Q(x) the set of its solutions, and by q(x) = [p(x), σ(x)] any solution from Q(x). It follows from the form of the function h(q) that σ(x) takes the same value for all q(x) ∈ Q(x).
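Problem (2.1) is an ordinary LP and can be assembled mechanically. The following is a hedged sketch using scipy.optimize.linprog (a generic solver standing in for whatever simplex implementation the authors used; the helper name and the variable order z = [p, σ] are our conventions):

```python
import numpy as np
from scipy.optimize import linprog

def auxiliary_lp(f_grad, g_vals, g_grads, delta):
    """Solve problem (2.1) at the point x: variables z = [p^1..p^n, sigma]."""
    n = f_grad.size
    M = 1.0 + np.sqrt(n) * np.linalg.norm(f_grad)   # M(x) >= 1 + n^(1/2) ||f_x(x)||
    c = np.append(f_grad, M)                        # objective <f_x, p> + M sigma
    phi = max(0.0, float(np.max(g_vals)))
    J = [i for i, gi in enumerate(g_vals) if gi >= phi - delta]
    # (2.1b): <g_x^i, p> - sigma <= -g^i(x), i in J_delta(x)
    A1 = np.hstack([g_grads[J], -np.ones((len(J), 1))])
    # (2.1c): p^j - sigma <= 1 and -p^j - sigma <= 1, j = 1..n
    I = np.eye(n)
    A2 = np.hstack([I, -np.ones((n, 1))])
    A3 = np.hstack([-I, -np.ones((n, 1))])
    A = np.vstack([A1, A2, A3])
    b = np.concatenate([-g_vals[J], np.ones(2 * n)])
    bounds = [(None, None)] * n + [(0, None)]       # p free, sigma >= 0
    res = linprog(c, A_ub=A, b_ub=b, bounds=bounds, method="highs")
    return res.x[:n], res.x[n], res                 # p(x), sigma(x), full result
```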

Lemma 2. If the admissible set in problem (2.1) is not empty, then the solution of the problem exists and is finite.

Proof. Assume the contrary. Then there is a sequence of points q_s = [p_s, σ_s], satisfying the constraints of problem (2.1), such that lim_{s→∞} ‖q_s‖ = ∞ and

$$h(q_{s+1}) \le h(q_s), \qquad s = 1, 2, \dots.$$   (2.2)

Here, naturally, lim_{s→∞} σ_s = ∞. It follows from (2.2) that lim_{s→∞} h(q_s) ≤ h(q_1). On the other hand, in view of the inequality ‖p_s‖ ≤ n^{1/2}(1 + σ_s), we have

$$h(q_s) \ge M(x)\sigma_s - \|f_x(x)\|\,\|p_s\| \ge M(x)\sigma_s - n^{1/2}(1+\sigma_s)\|f_x(x)\| \ge \sigma_s - n^{1/2}\|f_x(x)\|,$$

and hence lim_{s→∞} h(q_s) = ∞, which is impossible. The lemma is proved.

The dual problem to (2.1) is

$$\max_{u,\mu_+,\mu_-}\ \sum_{i\in J_\delta(x)} u^i g^i(x) - \sum_{j=1}^{n} \mu_+^j - \sum_{j=1}^{n} \mu_-^j,$$   (2.3a)

$$\sum_{i\in J_\delta(x)} u^i g_x^i(x) + \mu_+ - \mu_- = -f_x(x), \qquad \sum_{i\in J_\delta(x)} u^i + \sum_{j=1}^{n} \mu_+^j + \sum_{j=1}^{n} \mu_-^j \le M(x),$$   (2.3b)

$$\mu_+ \ge 0, \qquad \mu_- \ge 0, \qquad u \ge 0.$$   (2.3c)

Here μ_+ = [μ_+^1, ..., μ_+^n], μ_− = [μ_−^1, ..., μ_−^n], and u = {u^i, i ∈ J_δ(x)}. Denote its solution by [u(x), μ_+(x), μ_−(x)]. By the duality theorem, the optimal values of the target functions in (2.1) and (2.3) are the same:

$$\langle f_x(x), p(x)\rangle + M(x)\sigma(x) = \sum_{i\in J_\delta(x)} u^i(x)\, g^i(x) - \sum_{j=1}^{n} \mu_+^j(x) - \sum_{j=1}^{n} \mu_-^j(x).$$   (2.4)
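The dual solution need not be computed separately: an LP solver usually reports it. A sketch under the assumption that (2.1) was solved by the HiGHS backend of the sketch above, whose ineqlin.marginals field carries non-positive multipliers for '≤' rows of a minimization, with the rows (2.1b) listed first:

```python
def dual_multipliers(res, num_J):
    """Recover u^i(x), i in J_delta(x), from the linprog result of (2.1).

    For '<=' rows HiGHS reports non-positive marginals, so u = -marginals."""
    marg = res.ineqlin.marginals
    u = -marg[:num_J]          # multipliers of the rows (2.1b)
    return u, float(u.sum())   # u(x) and T_delta(u(x)) = sum of u^i(x)
```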

Let us show that the direction p(x), found by solving problem (2.1), is a direction in which the function G(x,t) decreases for sufficiently large t. We put

$$T_\delta(u(x)) = \sum_{i\in J_\delta(x)} u^i(x).$$

Lemma 3. If x ∉ R, then, given any t > T_δ(u(x)), the direction p(x) is a direction in which the function G(x,t) decreases.

Proof.

The derivative of G(x,t) with respect to the direction p(x) is

$$\frac{\partial G(x,t)}{\partial p(x)} = \langle f_x(x), p(x)\rangle + t\max_{i\in J_0^0(x)} \langle g_x^i(x), p(x)\rangle.$$

With the constraints (2.1b), for i ∈ J_0^0(x) we have

$$\max_{i\in J_0^0(x)} \langle g_x^i(x), p(x)\rangle \le \sigma(x) - \varphi(x).$$   (2.5)

Hence

$$\frac{\partial G(x,t)}{\partial p(x)} \le \langle f_x(x), p(x)\rangle + t[\sigma(x) - \varphi(x)].$$   (2.6)

From (2.4), noting that

$$t\varphi(x) \ge \sum_{i\in J_\delta(x)} u^i(x)\, g^i(x),$$   (2.7)

we arrive at

$$\frac{\partial G(x,t)}{\partial p(x)} \le -[M(x)-t]\sigma(x) - \sum_{j=1}^{n} \mu_+^j(x) - \sum_{j=1}^{n} \mu_-^j(x).$$   (2.8)

Moreover, inequality (2.7), and hence (2.8) also, is strict when x ∉ X. Hence we at once conclude that ∂G(x,t)/∂p(x) < 0 if x ∉ X.

Now let x ∈ X. In this case (2.5) simplifies, and from (2.6), (2.8) we have ∂G(x,t)/∂p(x) ≤ 0. Let us show that the equality sign is possible only if x ∈ R. In fact, if ∂G(x,t)/∂p(x) = 0, then, in accordance with (2.8),

$$M(x)\sigma(x) + \sum_{j=1}^{n} \mu_+^j(x) + \sum_{j=1}^{n} \mu_-^j(x) = 0,$$

i.e., σ(x) = 0, μ_+(x) = μ_−(x) = 0. Hence we obtain from (2.3):

$$f_x(x) + \sum_{i\in J_\delta(x)} u^i(x)\, g_x^i(x) = 0,$$

while we obtain from (2.4):

$$\sum_{i\in J_\delta(x)} u^i(x)\, g^i(x) = 0.$$

Hence it follows that u^i(x) = 0 when i ∈ J_δ(x) and g^i(x) < 0. Putting all the remaining u^i = 0, i ∉ J_δ(x), we arrive at (1.2), (1.3), i.e., x ∈ R. The lemma is proved.

Corollary. If ∂G(x,t)/∂p(x) = 0, and also t > T_δ(u(x)), then x ∈ R.

Denote by H(x,t) the function ⟨f_x(x), p(x)⟩ − tφ(x). It follows from the above arguments that H(x,t) ≤ 0 for any t ≥ T_δ(u(x)); moreover, it is strictly negative if x ∉ R. By (2.6), the value of H(x,t) is an upper bound for the derivative of the function G(x,t) with respect to the direction p(x) at the point x.

Notice also that the direction p(x), found from the solution of problem (2.1) with δ = 0, is the same as the direction of steepest descent of the function G(x,t) for t = T_0(u(x)) (see /2/). This can be shown by comparing the Kuhn-Tucker conditions for problem (2.1) with the corresponding Kuhn-Tucker conditions for the problem of choosing the direction of steepest descent:

$$\min \lambda,$$   (2.9a)

$$\langle f_x(x), p\rangle + t\langle g_x^i(x), p\rangle \le \lambda, \quad i \in J_0^0(x), \qquad \|p\|_\infty \le 1.$$   (2.9b)

The latter reduces to a problem of linear programming. For small non-zero δ, the direction p(x) is close to the direction of ε-steepest descent of the function G(x,t) for t = T_δ(u(x)), but is not identical with it. It becomes precisely the direction of ε-steepest descent if, in the constraints of (2.1), we substitute φ(x) for g^i(x) and take δ = ε.

Lemma 4. If [x,t] is a stationary point of the function G(x,t), then among the solutions of problem (2.1) there is the zero solution q(x) = [0, 0]. Conversely, if [0, 0] ∈ Q(x), then [x,t] is a stationary point of G(x,t) for any t ≥ T_δ(u(x)).

Proof. Let [x,t] be a stationary point of G(x,t). Then the point q_0 = [0, 0] is admissible for problem (2.1). By Lemma 1, x ∈ R, i.e., conditions (1.2), (1.3) hold. Hence it follows that the point {u_*^i, i ∈ J_δ(x); μ_+ = μ_− = 0} is admissible for the dual problem (2.3). Since the values of the target functions are the same at these points, q_0 ∈ Q(x).

Now assume that [0, 0] ∈ Q(x) and that t ≥ T_δ(u(x)). Since the zero solution is admissible for problem (2.1), we have g^i(x) ≤ 0, i ∈ J_δ(x), and hence φ(x) = max_{0≤i≤m} g^i(x) = 0. The second of conditions (1.5) for a stationary point thus holds. Moreover, from the complementary slackness conditions for (2.1) it follows that μ_+(x) = μ_−(x) = 0 and u^i(x) = 0 when g^i(x) < 0. In this case, from the constraints of problem (2.3) we have the equation

$$f_x(x) + \sum_{i\in J_0(x)} u^i(x)\, g_x^i(x) = 0.$$   (2.10)

If we put v^i = u^i(x)/t, i ∈ J_0(x), and v^0 = 1 − T_0(u(x))/t, then from (2.10) we obtain (1.6), (1.7), guaranteeing that the first of conditions (1.5) for a stationary point is satisfied for t ≥ T_δ(u(x)). The lemma is proved.

Let us indicate the connection between the dual problem (2.3) and the corresponding dual problem in the linearization method of /1/. It is easily shown that the solution of problem (2.3), i.e., the vector u(x), is unchanged if the first constraint in (2.3b) is replaced by the system of inequalities

$$-\mu_+ \le f_x(x) + \sum_{i\in J_\delta(x)} u^i g_x^i(x) \le \mu_-.$$

Hence problem (2.3) is equivalent (in the sense of having the same solution u(x)) to the problem

$$\max \Big\{ \sum_{i\in J_\delta(x)} u^i g^i(x) - \Big\| f_x(x) + \sum_{i\in J_\delta(x)} u^i g_x^i(x) \Big\|_1 \;:\; u^i \ge 0, \ i \in J_\delta(x) \Big\},$$   (2.11)

where u = {u^i, i ∈ J_δ(x)}.

The target function of this problem is the same as the target function of the dual problem in the linearization method of /1/, except that different vector norms are used.

3. First version of the linearization method. We shall now give a version of the linearization method based on the solution of auxiliary problem (2.1). Let x_0 ∈ E_n, and let constants N > 0, δ > 0 exist such that the following assumptions hold.

A1. The set Ω_N(x_0) = {x : G(x, N) ≤ G(x_0, N)} is bounded.

A2. The gradients of the functions f(x) and g^i(x), i = 1, 2, ..., m, satisfy a Lipschitz condition with constant l in Ω_N(x_0):

$$\|f_x(x_1) - f_x(x_2)\| \le l\|x_1 - x_2\|, \qquad \|g_x^i(x_1) - g_x^i(x_2)\| \le l\|x_1 - x_2\|, \quad i = 1, 2, \dots, m.$$

A3. Given any x ∈ Ω_N(x_0), the linear programming problem (2.1) is solvable, and T_δ(u(x)) ≤ N − 1 for all q(x) ∈ Q(x).

Further, given the initial step ᾱ > 0 and a parameter 0 < ε < 1, the iterations of the method are performed according to the recurrence scheme

$$x_{k+1} = x_k + \alpha_k p_k,$$   (3.1)

where p_k = p(x_k) is the direction found by solving auxiliary problem (2.1), and α_k is the descent step, chosen by halving the initial step ᾱ until we satisfy the condition

$$G(x_k + \alpha_k p_k, N) \le G(x_k, N) + \varepsilon\alpha_k\left[\langle f_x(x_k), p_k\rangle - N\varphi(x_k)\right].$$   (3.2)

Using Taylor expansions of the functions f(x) and g^i(x), we obtain, in the same way as in /1/, the lower bound for α_k:

$$\alpha_k \ge \frac{1}{2}\min\left\{\bar\alpha,\ \frac{\delta}{\varphi(x_k) + K\|p_k\|},\ \frac{(1-\varepsilon)\left[N\varphi(x_k) - \langle f_x(x_k), p_k\rangle\right]}{(N+1)\, l\, \|p_k\|^2}\right\},$$   (3.3)

where K is a quantity bounding ‖f_x(x)‖ in Ω_N(x_0). It is clear from (3.3) that, if x_k ∉ R, then ᾱ only has to be halved a finite number of times in order to obtain α_k.
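Scheme (3.1) with step rule (3.2) is a straightforward backtracking loop. The following sketch assumes the auxiliary_lp helper above; f, g, f_grad, g_grads are hypothetical callables supplying the problem data, and max_halvings is a safeguard of ours, not of the paper:

```python
import numpy as np

def linearization_step(x, f, g, f_grad, g_grads, N, delta,
                       alpha_bar=1.0, eps=0.5, max_halvings=40):
    """One iteration of scheme (3.1) with the step rule (3.2)."""
    def G(y):                                  # penalty function (1.4) with t = N
        return f(y) + N * max(0.0, float(np.max(g(y))))

    p, sigma, _ = auxiliary_lp(f_grad(x), g(x), g_grads(x), delta)
    H = float(f_grad(x) @ p) - N * max(0.0, float(np.max(g(x))))  # <f_x,p> - N*phi
    alpha, Gx = alpha_bar, G(x)
    for _ in range(max_halvings):              # halve alpha until (3.2) holds
        if G(x + alpha * p) <= Gx + eps * alpha * H:
            break
        alpha *= 0.5
    return x + alpha * p
```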

Lemma 5. Let assumptions A1 and A2 hold. Then

$$\sup_{x\in \Omega_N(x_0)} \sigma(x) < \infty.$$   (3.4)

Proof. Given any x ∈ Ω_N(x_0), we have the chain of inequalities

$$h(q(x)) \ge M(x)\sigma(x) - \|f_x(x)\|\,\|p(x)\| \ge M(x)\sigma(x) - n^{1/2}(1+\sigma(x))\|f_x(x)\| \ge \sigma(x) - n^{1/2}\|f_x(x)\| \ge \sigma(x) - n^{1/2}K.$$   (3.5)

On the other hand, by (2.4), we have the estimate

$$h(q(x)) \le \sum_{i\in J_\delta(x)} u^i(x)\, g^i(x) \le N\varphi(x) \le NC, \qquad C = \max_{x\in \Omega_N(x_0)} \varphi(x).$$   (3.6)

Since the function φ(x) is continuous on the compact set Ω_N(x_0), the constant C exists. From (3.5) and (3.6) we obtain σ(x) ≤ NC + n^{1/2}K, whence (3.4) follows. The lemma is proved.

We will now state a convergence theorem for process (3.1). Let Ξ be the set of limit points of the sequence {x_k}. It is not empty since, by (3.2), the function G(x, N) decreases during the iterations, and hence the sequence {x_k} belongs to the bounded set Ω_N(x_0).

Theorem 2. If assumptions A1-A3 hold, then Ξ ⊂ R ∩ Ω_N(x_0).

Proof. We extract a convergent subsequence from {x_k}: lim x_{k_s} = x̄ ∈ Ω_N(x_0). Since the continuous function G(x, N) is lower bounded on the compactum Ω_N(x_0), then, by (3.2), α_{k_s}(Nφ(x_{k_s}) − ⟨f_x(x_{k_s}), p_{k_s}⟩) → 0. Then, necessarily,

$$N\varphi(x_{k_s}) - \langle f_x(x_{k_s}), p_{k_s}\rangle \to 0.$$   (3.7)

For, if (3.7) were false, then α_{k_s} → 0. But it is clear from (3.3) that this is only possible when (3.7) holds, since, by Lemma 5, the p_k are bounded in aggregate in Ω_N(x_0).

We shall first show that φ(x̄) = 0, i.e., x̄ ∈ X. In fact, it can be assumed without loss of generality that the sequence {x_{k_s}} is taken such that the set J_δ(x_{k_s}) is the same for all x_{k_s}; we denote it by J̄_δ. By Lemma 7.1 of /2/, J̄_δ ⊆ J_δ(x̄). From the sequence {x_{k_s}}, sequences of solutions of the direct and dual problems (2.1), (2.3) are defined: {q(x_{k_s})} and {[u(x_{k_s}), μ_+(x_{k_s}), μ_−(x_{k_s})]}. They are all bounded, in view of Lemma 5, assumption A3 and the constraints of problem (2.3). Hence we can extract from them convergent subsequences. We shall assume for simplicity that q(x_{k_s}) → q̄ = [p̄, σ̄], u^i(x_{k_s}) → ū^i, i ∈ J̄_δ, μ_+(x_{k_s}) → μ̄_+, μ_−(x_{k_s}) → μ̄_−. On passing to the limit in the dual relation (2.4), and noting that the functions f_x(x) and g^i(x) are continuous, we obtain

$$\langle f_x(\bar x), \bar p\rangle + M(\bar x)\bar\sigma = \sum_{i\in \bar J_\delta} \bar u^i g^i(\bar x) - \sum_{j=1}^{n} \bar\mu_+^j - \sum_{j=1}^{n} \bar\mu_-^j.$$   (3.8)

The right-hand side in (3.8) is non-positive. But, by (3.7),

$$\langle f_x(\bar x), \bar p\rangle - N\varphi(\bar x) = 0.$$   (3.9)

If we now assume that x̄ ∉ X, then we arrive at a contradiction from (3.8) and (3.9). Hence x̄ ∈ X and ⟨f_x(x̄), p̄⟩ = 0. Moreover, Eq. (3.8) is then only possible if σ̄ = 0, μ̄_+ = μ̄_− = 0, and ū^i = 0 for g^i(x̄) < 0.

Now consider problems (2.1) and (2.3) with x = x̄ and let us show that the solutions of (2.1) include the zero solution; then, by Lemmas 4 and 1, we can at once conclude that x̄ ∈ R ∩ Ω_N(x_0). If we take as μ_+, μ_− and u^i, i ∈ J̄_δ, respectively μ̄_+, μ̄_−, ū^i, and also put u^i = 0, i ∈ J_δ(x̄)\J̄_δ, then it is easily shown that this point is admissible for problem (2.3). On the other hand, the point q_0 = [0, 0] is admissible for problem (2.1). Since the values of the target functions in (2.1) and (2.3) are then identical, we see by the duality theorem that q_0 is the optimal solution of (2.1). The theorem is proved.

Among our assumptions guaranteeing convergence of method (3.1) to the set R, the most burdensome are A1 and A3. Hence it is interesting to indicate cases where they are certainly satisfied. By Theorem 1, when the second-order sufficient conditions hold at the point x_* ∈ X_*, the set Ω_N(x_0) will necessarily be compact, provided that we choose x_0 sufficiently close to x_* and N satisfies the inequality N ≥ T(u_*) + 1; the convergence of method (3.1) to x_* is in this case merely local. If (1.1) is a problem of convex programming, then assumptions A1 and A3 will likewise in general be satisfied. As applied to the linearization method of /1/, this question is examined in detail in /1, 7/. According to, e.g., /7/, it is sufficient for this that the admissible set X satisfy Slater's regularity condition and be bounded. Let us show that this result is retained for the present linearization method.

Theorem 3. Let the functions f(x), g^i(x), i = 1, 2, ..., m, in problem (1.1) be convex and continuously differentiable in E_n, and let the set X satisfy Slater's regularity condition and be bounded. Then: a) given any x_0 ∈ E_n, an N(x_0) > 0 exists such that, for all N ≥ N(x_0), the sets Ω_N(x_0) are uniformly bounded; b) for every x ∈ E_n, the auxiliary problem (2.1) has a solution q(x), where

$$T_\delta(u(x)) \le D + 1, \qquad x \in \Omega_N(x_0), \quad N \ge N(x_0),$$   (3.10)

D > 0 being a constant.

Proof. Assertion a) is proved in /7/, so we shall only give the proof of assertion b); it is similar to the proof of Theorem 5.2 of /1/. Since X satisfies Slater's regularity condition, an x̃ ∈ X exists such that g^i(x̃) < 0, i = 1, 2, ..., m. Hence it follows that, whatever x ∈ E_n and δ > 0, the system of constraints in problem (2.1) is always compatible, since, taking p̃ = x̃ − x, we find, in view of the convexity of g^i(x), that

$$g^i(x) + \langle g_x^i(x), \tilde p\rangle \le g^i(\tilde x) < 0, \qquad i = 1, 2, \dots, m.$$   (3.11)

Hence problem (2.1) is always solvable. We now show that inequality (3.10) holds. Let W be a compact set containing all the Ω_N(x_0), N ≥ N(x_0). Using (3.11) and the Kuhn-Tucker theorem on the saddle point of the Lagrange function, we arrive at the inequality

$$h(q(x)) \le h(\tilde q) + \sum_{i\in J_\delta(x)} u^i(x)\left[\langle g_x^i(x), \tilde p\rangle + g^i(x)\right] + \sum_{j=1}^{n} \mu_+^j(x)(\tilde p^j - \tilde\sigma - 1) + \sum_{j=1}^{n} \mu_-^j(x)(-\tilde p^j - \tilde\sigma - 1) - v(x)\tilde\sigma \le h(\tilde q) + \sum_{i\in J_\delta(x)} u^i(x)\, g^i(\tilde x),$$

where q̃ = [p̃, σ̃], σ̃ = max_j |p̃^j|, and v(x) is the Lagrange multiplier corresponding to the constraint σ ≥ 0. Hence

$$u^i(x) \le \frac{h(\tilde q) - h(q(x))}{-g^i(\tilde x)}.$$   (3.12)

The numerator on the right-hand side of (3.12) is bounded in W, since the function M(x) is continuous in E_n, while f(x) is continuously differentiable; moreover, by (3.5), we have the inequality h(q(x)) ≥ −n^{1/2}‖f_x(x)‖. Hence we can indicate a constant D > 0 such that T_δ(u(x)) ≤ D + 1, and we arrive at (3.10). The theorem is proved.

4. Second version of the linearization method. The second version of the linearization method differs only in the type of auxiliary problem solved at each step of iterative process (3.1). The auxiliary problem is

$$\min_{p_+, p_-, \sigma}\ \langle f_x(x), p_+ - p_-\rangle + M(x)\sigma,$$   (4.1a)

$$\langle g_x^i(x), p_+ - p_-\rangle + g^i(x) - \sigma \le 0, \qquad i \in J_\delta(x),$$   (4.1b)

$$\sum_{j=1}^{n} p_+^j + \sum_{j=1}^{n} p_-^j - \sigma \le 1, \qquad p_+ \ge 0, \quad p_- \ge 0, \quad \sigma \ge 0.$$   (4.1c)

Here p_+ = [p_+^1, ..., p_+^n], p_− = [p_−^1, ..., p_−^n], and M(x) is a continuous function in E_n such that M(x) ≥ 1 + ‖f_x(x)‖_∞. The dual problem to (4.1) is

$$\max_{u,\eta}\ \sum_{i\in J_\delta(x)} u^i g^i(x) - \eta,$$   (4.2a)

$$-\sum_{i\in J_\delta(x)} u^i g_x^i(x) - \eta e \le f_x(x), \qquad \sum_{i\in J_\delta(x)} u^i g_x^i(x) - \eta e \le -f_x(x),$$   (4.2b)

$$u^i \ge 0, \quad i \in J_\delta(x), \qquad 0 \le \eta \le M(x),$$   (4.2c)

where η ∈ E_1, and e is the n-dimensional vector all of whose components are unity. It is easily shown that problem (4.2) can be rewritten as

$$\max \Big\{ \sum_{i\in J_\delta(x)} u^i g^i(x) - \Big\| f_x(x) + \sum_{i\in J_\delta(x)} u^i g_x^i(x) \Big\|_\infty \;:\; u^i \ge 0, \ i \in J_\delta(x) \Big\},$$   (4.3)

where u = {u^i, i ∈ J_δ(x)}.
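Problem (4.1) is again a plain LP, now with the l1-type constraint (4.1c) expressed through the split p = p_+ − p_−. A sketch in the same hedged style as before (generic scipy solver, illustrative names):

```python
import numpy as np
from scipy.optimize import linprog

def auxiliary_lp_v2(f_grad, g_vals, g_grads, delta):
    """Solve problem (4.1): variables z = [p_+, p_-, sigma], all >= 0."""
    n = f_grad.size
    M = 1.0 + float(np.max(np.abs(f_grad)))     # M(x) >= 1 + ||f_x(x)||_inf
    c = np.concatenate([f_grad, -f_grad, [M]])  # <f_x, p_+ - p_-> + M sigma
    phi = max(0.0, float(np.max(g_vals)))
    J = [i for i, gi in enumerate(g_vals) if gi >= phi - delta]
    GJ = g_grads[J]
    # (4.1b): <g_x^i, p_+ - p_-> - sigma <= -g^i(x), i in J_delta(x)
    A1 = np.hstack([GJ, -GJ, -np.ones((len(J), 1))])
    # (4.1c): sum_j p_+^j + sum_j p_-^j - sigma <= 1
    A2 = np.concatenate([np.ones(2 * n), [-1.0]])[None, :]
    res = linprog(c, A_ub=np.vstack([A1, A2]),
                  b_ub=np.concatenate([-g_vals[J], [1.0]]),
                  bounds=[(0, None)] * (2 * n + 1), method="highs")
    p = res.x[:n] - res.x[n:2 * n]              # p(x) = p_+(x) - p_-(x)
    return p, res.x[-1], res
```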

Notice that (4.3) differs from (2.11) only in using a different (Chebyshev) norm.

Problem (4.1) has properties similar to those of problem (2.1). Denote by [p_+(x), p_−(x), σ(x)] the solution of (4.1) for fixed x, and by [u(x), η(x)] the solution of (4.2). As the direction of descent in iterative process (3.1) of the second version of the method we take p_k = p(x_k), where p(x) = p_+(x) − p_−(x). It can be shown, in the same way as in Lemma 3, that p(x) is a direction in which the function G(x,t) decreases for any t > T_δ(u(x)), provided that x ∉ R. If [x,t] is a stationary point of G(x,t), then the solutions of (4.1) include one for which p(x) = 0, σ(x) = 0. Conversely, if p(x) = 0, σ(x) = 0, then [x,t] is a stationary point of G(x,t) for any t ≥ T_δ(u(x)). The descent step in (3.1) in the second version is also found from condition (3.2), and estimate (3.3) holds for it. As in the first version, convergence is proved under assumptions A1-A3 in the same way as in Theorem 2; naturally, assumption A3 is here restated in the context of problem (4.1).

5. Computational aspects of the methods. In the numerical realization of our versions of the linearization method, the main volume of computation falls on solving the auxiliary problems (2.1) and (4.1).

Consider first the first version. The number of variables in problem (2.1) is n+1, and the number of constraints is r_k + 2n, where r_k is the number of indices in the set J_δ(x_k). Hence the number of constraints in (2.1) always exceeds the number of variables. Consequently, if the simplex method is used to solve the linear programming problems, it is more convenient to solve the dual problem (2.3) rather than problem (2.1) itself.

We now turn to the second version. In (4.1) the number of variables is 2n+1, and the number of constraints is r_k + 1. Since the labour of solving a linear programming problem by the simplex method depends mainly on the number of constraints, the following a priori conclusion can be drawn by comparing problems (2.3) and (4.1): when the number of variables in problem (1.1) is substantially less than the number of constraints, so that on average it does not exceed the number of active constraints at each step of process (3.1), the first version is the more effective; conversely, if the number of variables in (1.1) is greater than the number of constraints, it is better to use the second version. The second version is also preferable when the number of active constraints close to the solution of problem (1.1) is small; this case is often realized in practice.

Notice that the auxiliary problem in the first version can be somewhat simplified by eliminating certain constraints from (2.1), thereby reducing the number of variables in the dual problem (2.3). We do this by replacing the inequalities

$$|p^j| - \sigma \le 1, \qquad j = 1, 2, \dots, n,$$   (5.1)

by the inequalities (a code sketch of this device follows below)

$$p^j - \sigma \le 1 \ \text{ if } f_x^j(x) < 0; \qquad |p^j| - \sigma \le 1 \ \text{ if } f_x^j(x) = 0; \qquad p^j + \sigma \ge -1 \ \text{ if } f_x^j(x) > 0, \qquad j = 1, 2, \dots, n.$$
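A sketch of this device for the rows (2.1c) of the earlier LP sketch (the helper name and row layout are ours): only the bound that can be active for the sign of f_x^j(x) is kept.

```python
import numpy as np

def reduced_norm_rows(f_grad):
    """Keep, for each j, only the side of |p^j| - sigma <= 1 that can be
    active when minimizing <f_x, p>: the upper bound if f_x^j < 0, the
    lower bound if f_x^j > 0, both if f_x^j = 0."""
    n = f_grad.size
    rows, rhs = [], []
    for j in range(n):
        e = np.zeros(n + 1); e[j], e[-1] = 1.0, -1.0      # p^j - sigma <= 1
        if f_grad[j] <= 0.0:
            rows.append(e.copy()); rhs.append(1.0)
        if f_grad[j] >= 0.0:
            e[j] = -1.0                                    # -p^j - sigma <= 1
            rows.append(e); rhs.append(1.0)
    return np.array(rows), np.array(rhs)
```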

In the most favourable case we can dispense with n constraints in this way. This device is exactly similar to the introduction of the normalization N3 used in the method of feasible directions /8/.

Computational experience shows that, to accelerate the convergence of both versions, it is useful to vary during the iterations the constraints imposed on the vector norms. We can arrange this by introducing into auxiliary problem (2.1), instead of inequalities (5.1), the inequalities

$$|p^j| - \sigma \le \beta_k, \qquad j = 1, 2, \dots, n,$$

where {β_k} is a sequence of positive numbers with β_k → 0 as k → ∞.

The number N considerably affects the rate of convergence: the more N exceeds T(u_*), the greater the number of iterations needed to obtain the solution of problem (1.1) with a given accuracy. Hence it is useful to reduce N from time to time, while watching for satisfaction of the condition N > T_δ(u(x_k)). Obviously, the convergence of the method is retained if this is done a finite number of times.

As an example, we quote the results of solving the following problem from /9/.

Test problem:

$$f(x) = (x^1 + 3x^2 + x^3)^2 + 4(x^1 - x^2)^2 - 1.8310995,$$

$$g^1(x) = x^1 + x^2 + x^3 - 1 = 0,$$

$$g^2(x) = -x^1 \le 0, \qquad g^3(x) = -x^2 \le 0, \qquad g^4(x) = -x^3 \le 0.$$
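For reproduction, the test problem can be coded as follows; since the superscripts in the available text are partly illegible, the second quadratic term and the splitting of the equality constraint into two inequalities are assumptions of this sketch. The paper computes all derivatives numerically, which a forward-difference helper imitates:

```python
import numpy as np

def f(x):        # target function of the test problem, shifted so min ~ 0
    return (x[0] + 3*x[1] + x[2])**2 + 4*(x[0] - x[1])**2 - 1.8310995

def g(x):        # the equality g^1 = 0 is split into two inequalities
    h = x[0] + x[1] + x[2] - 1.0
    return np.array([h, -h, -x[0], -x[1], -x[2]])

def num_grad(fun, x, h=1e-6):   # forward differences; for vector-valued
    fx = fun(x)                 # fun this yields the transposed Jacobian
    return np.array([(fun(x + h*e) - fx) / h for e in np.eye(x.size)])

x0 = np.array([0.1, 0.7, 0.2])  # initial vector from the paper
# usage with the earlier sketches: g_grads = num_grad(g, x0).T
```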
The minimal value of f(x) on the admissible set is roughly equal to zero. The initial vector is x_0 = (0.1, 0.7, 0.2). All the derivatives of the functions f(x), g^i(x) are found numerically. The initial values are N = 10, δ = 0.5. Using the first version, after 8 iterations the target function was reduced to near its minimal value and both the equality constraint g^1(x) and the inequality constraints were satisfied to high accuracy; altogether 282 computations of the functions f and g were required, and a time of 5 sec. Correspondingly, the second version required 10 iterations, 342 computations of the functions and 5 sec, while the linearization method of /1/ required 10 iterations, 315 computations of the functions and 15 sec, satisfying the equality constraint to within 10^{-5} and the inequality constraints to within 10^{-7}.

Let us also quote the results of solving the problem of /10/:

$$f(x) = 0.5(x^1 + x^2)^2 + 50(x^2 - x^1)^2 + (x^3)^2,$$

$$g^1(x) = \sin(x^1 + x^2) - x^3 = 0,$$

$$g^2(x) = (x^1 - 1)^2 + (x^2 - 1)^2 + (x^3 - 1)^2 - 1.5 \le 0.$$

When computing from the initial point x_0 = (−1, 4, 5) with the initial values N = 10, δ = 0.1, the results of the second version are better than those of the first or of the linearization method of /1/: after 125 iterations an accuracy of 10^{-6} was achieved with respect to the equality constraint, with the inequality constraint also satisfied; this required 2049 computations of the functions and a time of 8 sec. For this example it proved essential to introduce into the algorithm the possibility of reducing the value of the penalty coefficient. Doubling N whenever the condition N > T_δ(u(x_k)) was infringed led in this example, even after 10 iterations, to the value N ≈ 150; if this value is retained henceforth, the total number of iterations increases substantially. This seems to explain the results quoted in /10/ for the linearization method of /1/, which required 740 iterations to obtain a solution satisfying the constraints to the required accuracy.

Our modifications of the linearization method are now included in the library of the dialogue system of optimization (DISO), developed in the Computing Centre of the Academy of Sciences of the USSR.

REFERENCES

1. PSHENICHNYI, B.N. and DANILIN, Yu.M., Numerical Methods in Extremal Problems (Chislennye metody v ekstremal'nykh zadachakh), Nauka, Moscow, 1975.
2. DEM'YANOV, V.F. and MALOZEMOV, V.N., Introduction to Minimax (Vvedenie v minimaks), Nauka, Moscow, 1972.
3. FIACCO, A.V. and McCORMICK, G.P., Nonlinear Programming, Wiley, New York, 1968.
4. EREMIN, I.I. and ASTAF'EV, N.N., Introduction to the Theory of Linear and Convex Programming (Vvedenie v teoriyu lineinogo i vypuklogo programmirovaniya), Nauka, Moscow, 1976.
5. FEDOROV, V.V., Numerical Methods of Maximin (Chislennye metody maksimina), Nauka, Moscow, 1979.
6. HAN, S.P. and MANGASARIAN, O.L., Exact penalty functions in nonlinear programming, Math. Programming, 17, 251-269, 1979.
7. PANIN, V.M., On a convergence condition for the linearization method in convex programming problems, in: Optimal Decision Theory (Teoriya optimal'nykh reshenii), IK Akad. Nauk UkSSR, Kiev, 65-71, 1978.
8. ZOUTENDIJK, G., Methods of Feasible Directions, Elsevier, Amsterdam, 1960.
9. MOISEEV, N.N., IVANILOV, Yu.P. and STOLYAROVA, E.M., Optimization Methods (Metody optimizatsii), Nauka, Moscow, 1978.
10. PSHENICHNYI, B.N. and SOBOLENKO, L.A., Acceleration of the convergence of the linearization method for constrained minimization problems, Zh. vychisl. Mat. mat. Fiz., 20, No. 3, 605-614, 1980.

Translated by D.E.B.