U.S.S.R. Comput. Maths. Math. Phys., Vol. 25, No. 6, pp. 127-138, 1985. Printed in Great Britain
0041-5553/85 $10.00+0.00 Pergamon Journals Ltd.
SOME METHODS OF STOCHASTIC PROGRAMMING IN HILBERT SPACE*

N.M. NOVIKOVA

Stochastic algorithms are proposed for minimizing a functional, written as the integral with respect to a measure in Hilbert space, for different cases of specifying the measure. Convergence conditions are obtained for the algorithms in the class of convex problems.

1. Consider the following problem of stochastic programming in Hilbert space: it is required to find the value $I^0$ and the realization $z^0\in Z$ of the infimum
$$I^0=\inf_{z\in Z} I(z),\qquad I(z)=\int_U w(z,u)\,\mu(z,du),\tag{1}$$
where
$$Z=\{z\in Z^0\subset L_2(Y)\mid G(z(y))\le 0\ \text{a.s.}\ \forall y\in Y\},$$
$$U=\{u\in U^0\subset L_2(X)\mid H(u(x))\le 0\ \text{a.s.}\ \forall x\in X\}$$
(a.s. = almost surely); $U$ is $\mu$-measurable $\forall z\in Z^0$; $U^0$, $Z^0$ are weakly compact; $H$ and $G$ are continuous functionals respectively in $U^0$ and $Z^0$; $X\subset R^s$, $Y\subset R^r$ are finite-dimensional domains; $w(z,\cdot)$ is $\mu(z,\cdot)$-integrable $\forall z\in Z^0$. We assume that the functional $w$ can be written as
$$w(z,u)=\varphi\Bigl(\int_Y g(z(y))\,dy,\ \int_X h(u(x))\,dx,\ \int_{Y\times X} f(z(y),u(x))\,d(y,x)\Bigr),$$
where $\varphi(\cdot,\cdot,\cdot)$ is a function of three variables, and $g(\cdot)$, $h(\cdot)$, and $f(\cdot)$ are functionals respectively in $Z^0$, $U^0$, $Z^0\times U^0$. The measure $\mu(z,\cdot)$ is specified in $U^0$, is continuously dependent on $z$, and is a probability measure for any $z\in Z^0$. In the cases when $w(z,\cdot)$ is integrable also with respect to a finitely additive measure (weak distribution), we do not require that $\mu(z,\cdot)$ be denumerably additive. For such $z$ we assume that $\exists c<\infty$: $w(z,\cdot)\ge -c$.

When studying stochastic optimization problems, it is assumed that the measure $\mu(z,\cdot)$ is not specified explicitly and that we can only make judgements about it by observing the realization $u\in U^0$ $\forall z\in Z^0$. However, such an observation is highly problematic in infinite-dimensional space, so that it seems natural to rely on it only for finite-dimensional projections of the measure $\mu(z,\cdot)$. We shall therefore assume that, for all $z\in Z^0$, the measure $\mu(z,\cdot)$ is specified by its values on the algebra of cylindrical sets of $U^0$ (see /1, pp.218-219/). For this, we fix the orthonormal basis $\xi=\{\xi_i\}$ in $U^0\subset L_2(X)$ and introduce for it the sequence of matched measures $\mu_n(z,\cdot)$ in $R^n$, $n\in N=\{1,2,\dots\}$, corresponding to $\mu(z,\cdot)$ in the sense:
$$\mu_n(z,Q)=\mu(z,\{a\xi\in U^0\mid [a]_n\in Q\})$$
for every $\mu_n(z,\cdot)$-measurable $Q\subset R^n$. Here and henceforth, $[\cdot]_n$ denotes, for the vector in the brackets, its first $n$ components, while for any set $V\subseteq U^0$
$$[V]_n=\{a^n\in A^n\subseteq R^n\mid \exists a\xi\in V:\ a^n=[a]_n\},$$
where $[a]_n\in A^n$ $\forall a\xi\in U^0$; the scalar product is not distinguished by parentheses, and when the element $u\in L_2(X)$ is expanded with respect to the basis $\xi$, we write
$$u=a\xi=\sum_{i=1}^{\infty}a_i\xi_i.$$

*Zh. vychisl. Mat. mat. Fiz., 25, 12, 1795-1813, 1985
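The truncation notation $[a]_n$ and the expansion $u=a\xi$ can be illustrated numerically. The following is a minimal sketch, assuming $X=[0,1]$ and a cosine basis; the basis, the test function `u`, and all helper names are hypothetical choices made only for illustration and do not come from the paper.

```python
import math

def basis(i, x):
    """Orthonormal cosine basis on [0, 1] (an assumed choice of xi_i)."""
    return 1.0 if i == 1 else math.sqrt(2.0) * math.cos((i - 1) * math.pi * x)

def coefficients(u, n, grid=2000):
    """First n coefficients a_i = (u, xi_i), approximated by the midpoint rule."""
    xs = [(k + 0.5) / grid for k in range(grid)]
    return [sum(u(x) * basis(i, x) for x in xs) / grid for i in range(1, n + 1)]

def truncate(a, n):
    """[a]_n: the first n components of the coefficient vector a."""
    return a[:n]

def evaluate(a, x):
    """Value at x of the element a*xi = sum_i a_i xi_i(x)."""
    return sum(a_i * basis(i, x) for i, a_i in enumerate(a, start=1))

def l2_error(u, a, grid=2000):
    """Discrete L2 distance between u and the finite expansion a*xi."""
    xs = [(k + 0.5) / grid for k in range(grid)]
    return math.sqrt(sum((u(x) - evaluate(a, x)) ** 2 for x in xs) / grid)

u = lambda x: x * (1.0 - x)          # hypothetical element of L2(X)
a = coefficients(u, 8)
errors = [l2_error(u, truncate(a, n)) for n in (1, 2, 4, 8)]
```

The list `errors` holds the approximation error of $[a]_n\xi$ for growing $n$; for an orthonormal basis it cannot increase with $n$.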
We assume that $\forall n\in N$, $\forall a\xi\in U^0$: $[a]_n\xi\in U^0$. By the definition of an integral for weak distributions /2, pp.17-19/ and for cylindrical measure in Hilbert space, we have the equation
$$I(z)=\lim_{n\to\infty}\Bigl\{\int_{[U]_n}\{c+w(z,a^n\xi)\}\,\mu_n(z,da^n)-c\mu(z,U)\Bigr\}\quad\forall z\in Z^0.$$
We will denote the expression under the limit sign by $I_n(z)$. We will assume that the functional $I$ is weakly lower semicontinuous in $Z^0$ and that the set $Z$ is weakly compact. Then, $z^0\in Z$ (the infimum in (1) is reached) and, if $w(z,\cdot)$ is uniformly bounded with respect to $u\in U^0$ and $z\in Z^0$, we obtain
$$I^0=\min I(z)=\lim\min I_n(z).\tag{2}$$
For, since $w(z,\cdot)\ge -c$, the sequence of integrals $I_n^+(z)=I_n(z)+c\mu(z,U)$ is not increasing, because $\forall n$
$$V_n=\{a\xi\in U^0\mid [a]_n\in[U]_n\}=\{a\xi\in U^0\mid \exists a'\xi\in U:\ [a']_n=[a]_n\}\supseteq V_{n+1}$$
and $I(z)+c\mu(z,U)=\inf I_n^+(z)$. Hence
$$I(z)=\inf I_n(z).$$
Thus the minimum with respect to $z$ and the limit with respect to $n$ can be interchanged.

To construct algorithms for the numerical solution of problem (2), it is natural to try to use stochastic gradient methods in $Z$ (e.g., those of /1/, or if $Z$ is finite-dimensional, those of /3/), while combining them with an increase in the dimensionality $n$ of the approximating space for $u$. Problems are then possible in which evaluation of the gradient of the functional $w$ with respect to $z\in Z$ in the infinite-dimensional space $L_2(Y)$ poses a serious problem, and cases are possible when the measure $\mu(z,\cdot)$ can be defined only at points $z$ which have a finite number of non-zero components of their expansion with respect to the basis of the space $L_2(Y)$. In such situations it is realistic to use only finite-dimensional approximations of the set $Z$, and if they are unknown, then we have to approximate the set $Z^0$, taking account of the constraints on $Z$ by means of the penalty method. Notice also that finite-dimensional integrals over the sets $X$, $Y$, and $Y\times X$ participate in the definition of $w$, and in the general case it is necessary to use (in the light of the gradients of $w$) the appropriate approximate values. In particular, the Monte Carlo method is used below to evaluate the finite-dimensional integrals. We thus obtain the following algorithm for solving problem (1), combining the Ritz method with a composite method of penalties and of stochastic quasigradients, based on the scheme given in /4/.
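The Monte Carlo evaluation of a finite-dimensional integral over $Y$ mentioned above can be sketched as follows. This is only an illustration under stated assumptions: $Y=[0,1]$, and the functions `z` and `g`, the sampler, and the helper name `mc_integral` are hypothetical and not taken from the paper.

```python
import random

def mc_integral(g, z, sample_y, volume, R, rng):
    """Estimate the integral of g(z(y)) over Y by (d(Y)/R) * sum_r g(z(y_r))."""
    total = 0.0
    for _ in range(R):
        y = sample_y(rng)          # independent realization, uniform in Y
        total += g(z(y))
    return volume * total / R

rng = random.Random(0)
z = lambda y: 2.0 * y              # hypothetical element z of L2(Y), Y = [0, 1]
g = lambda v: v * v                # hypothetical integrand g
estimate = mc_integral(g, z, lambda r: r.random(), 1.0, 20000, rng)
# The exact value of the integral of (2y)^2 over [0, 1] is 4/3.
```

The error of such an estimate decreases like $R^{-1/2}$, which is why the number of sample points has to grow along the iterations.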
Algorithm 1. We fix the orthonormalized basis $\zeta=\{\zeta_j\}$ in $L_2(Y)$. We specify the numerical control sequences $l_t, n_t, C_t, R_t, S_t\uparrow\infty$, $\beta_t,\theta_t\downarrow 0$ and the initial approximation $b^1\in[Z^0]_{l_1}$ to the vector $b^0$ such that $z^0=b^0\zeta$. We define the future approximations $b^t\in[Z^0]_{l_t}$, $t\in N$, by means of the iterative procedure
$$[b^{t+1}]_{l_t}=\pi_t\Bigl\{b^t-\beta_t\,\mathrm{grad}_b\Bigl\{\varphi\Bigl(\frac{d(Y)}{R_t}\sum_{r=1}^{R_t}g(b^t\zeta(y_r^t)),\ \frac{d(X)}{S_t}\sum_{s=1}^{S_t}h(a^t\xi(x_s^t)),\ \frac{d(Y\times X)}{R_tS_t}\sum_{r=1}^{R_t}\sum_{s=1}^{S_t}f(b^t\zeta(y_r^t),a^t\xi(x_s^t))\Bigr)+C_t[\max(0,G(b^t\zeta(y_1^t)))]^2\Bigr\}\Bigr\},$$
$$b_j^{t+1}=0\quad\forall j=l_t+1,\dots,l_{t+1}.$$
Here, $\pi_t$ is the operator of projection onto $[Z^0]_{l_t}$; $\mathrm{grad}_b\{\cdot\}$ denotes the generalized gradient with respect to $b$ of the expression in braces; as the vector $a^t$ at each step $t$ we choose the first of the independent realizations of the random quantity distributed in $[U^0]_{n_t}$ in accordance with the measure $\mu_{n_t}(b^t\zeta,\cdot)$, such that we have the condition
$$H(a^t\xi(x_s^t))\le\theta_t\quad\forall x_s^t,\ s=1,2,\dots,S_t;$$
the sequences $\{y_r^t\}_{r=1}^{R_t}$, $\{x_s^t\}_{s=1}^{S_t}$ for $t\in N$ are sequences of independent realizations of random quantities uniformly distributed in $Y$, $X$ respectively. If the sets $Y$ (or $X$) are specified by their constraints in $Y^0$ (or $X^0$), in order to construct the required sequences of values $y_r^t$ (or $x_s^t$), we use sources of a uniform distribution in $Y^0$ (or $X^0$), and select from their generated sequences the values which satisfy the constraints. With respect to the
numerical sequences $R_t$ and $S_t$, we assume that $\alpha_t,\gamma_t\downarrow 0$ exist, for which we have
$$\prod(1-\alpha_t)=1-\alpha\ge P_0^{1/2},\qquad (R_t\alpha_t)^{-1/2}\downarrow 0,\tag{3a}$$
$$\prod(1-\gamma_t)=1-\gamma\ge P_0^{1/2},\qquad (S_t\gamma_t)^{-1/2}\downarrow 0.\tag{3b}$$
We also assume sufficiently slow growth of the penalty constants $C_t\uparrow\infty$:
$$C_t\sum_{j>l_t}|b_j^0|^2\to 0,\qquad \sum C_t^2\beta_t^2<\infty,\qquad \sum\beta_t=+\infty.\tag{4}$$
Here and below, we omit the limits of multiplication, summation, and minimization (including infimum), and also of union and intersection with respect to $t\in N$.

Algorithm 1 may be used, not only in stochastic programming, but also for a numerical search for (1) in cases when an analytic expression is known for the measure $\mu$, while the analytic expression for the integral $I(\cdot)$ is unknown, or the gradient of $I(\cdot)$ is unknown.

Let us study the convergence of the algorithm.
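The structure of the iteration (a projected stochastic quasigradient step with a growing penalty) can be tried on a toy finite-dimensional problem. The sketch below is not the author's implementation: the quadratic objective, the constraint $G(b)=b_1-0.5\le 0$, the box standing in for $[Z^0]_l$, and the power-law sequences `beta` and `C` are all assumptions chosen so that $\sum\beta_t=\infty$ and $\sum C_t^2\beta_t^2<\infty$.

```python
import random

def project_box(b, lo, hi):
    """Projection onto a box, standing in for the projection operator pi_t."""
    return [min(max(v, lo), hi) for v in b]

def run(steps=20000, seed=1):
    rng = random.Random(seed)
    b = [0.0, 0.0]                                        # initial approximation
    for t in range(1, steps + 1):
        beta = t ** -0.8                                  # step size, sum beta = inf
        C = t ** 0.2                                      # penalty, sum (C*beta)^2 < inf
        a = [rng.gauss(1.0, 1.0), rng.gauss(-1.0, 1.0)]   # one random realization
        grad = [2.0 * (b[j] - a[j]) for j in range(2)]    # gradient of |b - a|^2 in b
        violation = max(0.0, b[0] - 0.5)                  # max(0, G(b)), G(b) = b[0] - 0.5
        grad[0] += 2.0 * C * violation                    # gradient of C * max(0, G(b))^2
        b = project_box([b[j] - beta * grad[j] for j in range(2)], -2.0, 2.0)
    return b

b_final = run()
```

In this toy problem the constrained minimizer of the expected value of $|b-a|^2$ is $(0.5,-1)$; because the penalty constant grows slowly, the first component approaches 0.5 from above only gradually.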
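Put
$$\psi(b,a)=w(b\zeta,a\xi),\qquad x^t=\{x_s^t\}_{s=1}^{S_t},\qquad y^t=\{y_r^t\}_{r=1}^{R_t},$$
$$\varphi_t(b,a\mid y^t,x^t)=\varphi\Bigl(\frac{d(Y)}{R_t}\sum_{r=1}^{R_t}g(b\zeta(y_r^t)),\ \frac{d(X)}{S_t}\sum_{s=1}^{S_t}h(a\xi(x_s^t)),\ \frac{d(Y\times X)}{R_tS_t}\sum_{r=1}^{R_t}\sum_{s=1}^{S_t}f(b\zeta(y_r^t),a\xi(x_s^t))\Bigr),$$
$$\chi(b)=\frac{1}{d(Y)}\int_Y\{\max(0,G(b\zeta(y)))\}^2\,dy,\qquad \chi_t(b\mid y^t)=\{\max[0,G(b\zeta(y_1^t))]\}^2.$$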
We require that the functions introduced, and their generalized gradients, be bounded in the following sense (i.e., uniformly in the finite-dimensional projections of $U^0$ and $Z^0$): $\forall l,n\in N$, $\forall [b]_l\in[Z^0]_l$, $\forall [a]_n\in[U^0]_n$, $\forall y\in Y$, $\forall x\in X$, $\forall y^t, x^t$, $\forall t$
X,([b],lv)<~K,(1), Igrad~x,([b],ly) t<~K.(l), Igrad.H([a].~(x))l<~K.(n), [g([b]~(y))l<~K,(l), ]h([a].~(x))l<~K,,(n), I[([b]~(g), [a].~(±))[<~K (l, n). I*([bl,, [a].)l<~K,(l, n), Igradb~,([b],, [a],l)", x')l~K.(l, n). We match the control sequences of the a l g o r i t h m slow growth of dimensionalities l~. n~:
with these constraints,
x C,,~,,{K,,(t,)+K,,(Z,)}
(a,)*)'~
by choosing
sufficiently
(Sa)
(5b)
C,x(tb°],,)--O,
ZC,~,'K.(I,)K,(I,. n,)
(5c)
Z ~,'{K,'(l,, n,)+K,'(l,, n,)}<,~,
(5d)
K,(I,) 40, o, o, = (n,~,)"--------7
Kdn,)
(s,'~,) "
~.0, x,
K,(l,, n )
(n,S,'h) '~
,~0.
(6)
If the constraints are independent of the order of approximation of $Z^0$ and $U^0$, conditions (5), (6) are contained in (4), (3). Notice that the first condition in (5b) need not be guaranteed for all $H$, $U^0$. For its satisfaction, it suffices that $H$ be Lipschitz. Moreover, all the conditions on $H$ can be replaced by
$$\forall n\in N,\ \forall a\xi\in U:\quad H([a]_n\xi(x))\le 0\ \text{a.s.}\ \forall x\in X.$$
The presence in problem (1) of inequalities that define the domain of integration $U$, and which may be taken into account only approximately, demands the introduction of a further assumption. In fact, we shall assume that the functional $w$ is non-negative; otherwise, we have to assume continuity, uniform with respect to $z$, of the measure $\mu(z,\cdot)$, i.e., that it is denumerably additive (or that $U^0=U$).
Theorem 1. Let $I(\cdot)$ be a strongly convex continuous functional in $Z^0$; $G(\cdot)$ is continuous and convex in $Z^0$; $H(\cdot)$ is continuous, and convex or Lipschitz, in $U^0$; the functional $w(z,u)$ is convex with respect to $z$ in $Z^0$, and continuous and uniformly lower bounded in $Z^0\times U^0$; $Z^0$, $U^0$ are weakly compact in $L_2$; $\varphi(\cdot,\cdot,\cdot)$ is Lipschitz with respect to the set of its arguments; and $d(X\times Y)<\infty$. Then, under our assumptions and with conditions (3)-(6) on the numerical control sequences of algorithm 1, the sequence $z^t=b^t\zeta$ generated by them converges to $z^0=b^0\zeta$ in the strong metric of the space $L_2(Y)$ with probability $P_0^2$ in the set of random sequences $\{y^t, x^t, a^t\}_{t\in N}$.
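Proof. By the construction of $b^t$ and the properties of the projection operator, we have the inequality
$$|b^{t+1}-[b^0]_{l_{t+1}}|^2=|[b^{t+1}]_{l_t}-[b^0]_{l_t}|^2+\sum_{j=l_t+1}^{l_{t+1}}(b_j^0)^2\le|b^t-[b^0]_{l_t}|^2+\sum_{j=l_t+1}^{l_{t+1}}(b_j^0)^2$$
$$-2\beta_t\,\mathrm{grad}_b\{\varphi_t(b^t,a^t\mid y^t,x^t)+C_t\chi_t(b^t\mid y^t)\}(b^t-[b^0]_{l_t})+\beta_t^2\,|\mathrm{grad}_b\{\varphi_t(b^t,a^t\mid y^t,x^t)+C_t\chi_t(b^t\mid y^t)\}|^2.$$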
The last expression on the right-hand side, in view of conditions (5a, b, d), does not exceed the element of an absolutely convergent series. Notice that, from the inclusion $z^0\in L_2(Y)$ we have the convergence of the series $\sum(b_j^0)^2$, i.e., the second term on the right-hand side of our inequality is also an element of an absolutely convergent series. Now, by the property of generalized gradients of convex functions, we can write
$$|b^{t+1}-[b^0]_{l_{t+1}}|^2\le|b^t-[b^0]_{l_t}|^2-2\beta_t\{\varphi_t(b^t,a^t\mid y^t,x^t)+C_t\chi_t(b^t\mid y^t)-\varphi_t([b^0]_{l_t},a^t\mid y^t,x^t)-C_t\chi_t([b^0]_{l_t}\mid y^t)\}+\nu_t,$$
where $\sum\nu_t<\infty$, as shown above. Under our assumptions, $\chi_t^2([b]_{l_t}\mid y^t)\le K_1^2(l_t)$; then, since the Lebesgue measure of the set $Y$ is bounded, the expectation with respect to $y^t$ is also bounded: $M\chi_t^2([b]_{l_t}\mid y^t)\le K_1^2(l_t)$.
Hence, by what was proved in /5, pp.102-104/, for mutually independent random quantities, the series of corresponding random quantities $\sum\beta_tC_t\{\chi_t(b^t\mid y^t)-\chi(b^t)\}$ converges with probability 1 in the set of random sequences $\{y^t\}$, in view of condition (5a) for the convergence of the series $\sum\beta_t^2C_t^2K_1^2(l_t)$, bounding the variance of these random quantities, and in view of the vanishing of their expectation values. Similarly, we have the convergence of the series $\sum\beta_tC_t\{\chi_t([b^0]_{l_t}\mid y^t)-\chi([b^0]_{l_t})\}$ with probability 1. In view of all these converging series, we have
$$|b^{t+1}-[b^0]_{l_{t+1}}|^2\le|b^t-[b^0]_{l_t}|^2-2\beta_tC_t\{\chi(b^t)-\chi([b^0]_{l_t})\}-2\beta_t\{\varphi_t(b^t,a^t\mid y^t,x^t)-\varphi_t([b^0]_{l_t},a^t\mid y^t,x^t)\}+\nu_t',$$
where $\sum\nu_t'<\infty$ a.s. in the set of random sequences $\{y^t\}$.
I¢(b,a)-w,(b, aly',x') I
d(X) Z Ri
$~
h(a (z))ez I., e(r×x)v
i.I
1,×x
Since we assume
that the growth of
g, h, /
is bounded and that
¥
¥
x
x
d(YXX)
is bounded,
P([b]z~(Y),[a].i(x))d(Y,x)¥×x
{i
T×X
Here and henceforth, $K$ denotes any constants. From the Chebyshev inequality, we obtain the following inequalities for the probabilities at the $t$-th step of realizations $a^t$ of the random vectors $y^t$, $x^t$, for each $t\in N$:
P{KI ~t
St
d(X×Y) ~~. ~ , . f ( [ b ] , Z;(y,'), [a],,,~.(x.'))-
P'{K
T ×.,t
Then, for the probability of the intersection of all these events with respect to all $t\in N$, which is considered in the set of infinite random sequences $\{y^t,x^t\}_{t\in N}$, the lower bound of $P_0$ holds, by virtue of conditions (3), (6), and by virtue of the continuity of the probability. Hence
P {] ¢ ([ b°]~ ,, a') - ¢ , ( [ b°lt,, a' Iy', x') I < K (p,+o,+T,) WreN} >Po, P{[*(b', a')-~;,(b', a'[y', x')l--
Vt~N}~>P o.
p0z we have
I b .... [ b°], ,+,
12<]b'-[bOl, ' [=-2g,{,(b ', t') --xp([ b°],,, a')} -
2~,C,{7.(b')-x([b°],,)}+~,K(p,+o,+'r,)+~,, Let us e s t i m a t e the difference between ~(b', a') and write, for U'={a'~[U°].,tlH(a'%(x,'))<~@t ~ s = 1 , 2 ..... S,},
X{;,<~.
l(z'). B y our assumptions, we can
• .. S,'(b',at)~t.,(z',dag...~=,(z',dat)<~Ktt(l,,nt) u*
Consequently,
Vt~N.
fit
from c o n d i t i o n
(5d), u s i n g in t h e same way as above t h e r e s u l t
o f / 5 , Theorem
2.3/, w e o b t a i n the c o n v e r g e n c e of series Z~,l]t(b')-~(b',at)[ a.s. in the set of random sequences {a'}, g e n e r a t e d by the procedure i n d i c a t e d in a l g o r i t h m i, "where ],(b') is used to d e n o t e the c o n d i t i o n a l e x p e c t a t i o n of ~(b', a ~) with fixed seqnence {(b', a'),..., (b'-', a'-')) with r e s p e c t to a', d i s t r i b u t e d in U t in accordance w i t h the m e a s u r e M.t(z', .). In the same way, ~,]],([b°]tt)--~([b°],, a ' ) l < ~ a.s. in the set of random sequences {a'}.... w h e r e
],(b)=M.,{,(b,a') [a', .... a'-'}=~ *(b,a')it., (b~,da')
Vb.
Vt
Hence, w i t h p r o b a b i l i t y
p2
V t ~ N , we have
Ib .... [b°]t,+, 1-~
(7)
It remains to find the limit relations between $I(z^t)$ and $j_t(b^t)$ and between $I(z^0)$ and $j_t([b^0]_{l_t})$, since, inasmuch as the bounds on $U$ are taken into account approximately in the algorithm, $j_t(b)\ne I_{n_t}(z)$ and (2) cannot be used. Put $U(t)=\{a\xi\in U^0\mid H(a\xi(x_s^t))\le\theta_t\ \forall s=1,2,\dots,S_t\}$; then $[U(t)]_{n_t}\supseteq U^t$ and $\bigcap U(t)\supseteq U$. Assume that $\bigcap U(t)\ne U$, i.e., $\exists a\xi\in U^0$: $H(a\xi(x'))>0$ $\forall x'\in X'$, $d(X')>0$, and $a\xi\in U(t)$ $\forall t\in N$; then, since the Lebesgue measure $d(\cdot)$ is continuous, it follows that $\exists\theta>0$: $H(a\xi(x''))\ge\theta$ $\forall x''\in X''$, $d(X'')>0$, and on choosing $t_0$ in such a way that $\theta_{t_0}<\theta$, we find that $x_s^t\notin X''$ $\forall t>t_0$, $\forall s=1,2,\dots,S_t$, i.e., a contradiction with the fact that the distribution of $x_s^t$ in $X$ is uniform. Hence $\bigcap U(t)=U$. Since $I(\cdot)$ is continuous with respect to $z$, and $[z]_l\zeta\to z$ in $L_2(Y)$, then $I([z]_l)$ converges to $I(z)$ as $l\to\infty$. Hence a sequence $\varepsilon_l\downarrow 0$ exists, for which $I([z^0]_l)+\varepsilon_l\downarrow I(z^0)$ as $l\to\infty$, and recalling that the measure of the set $U\subseteq U^0$ is bounded, we have
{~'([z°l,,u)+s,}t~([z°],,du)~ ~ w(zO, u)tt(z°,du), u
Since w is uniformly
l~®.
u
l o w e r bounded, we have
w(z, u)~--c, ¢~0, and
I(z)= Sw(z, u)~(z, du)=S {w(z,u)+c}~t(z, du)-c~t(z, U). U
In turn,
U
from the r e l a t i o n s o b t a i n e d between
Vn~N t-*~ ; then,
l (zO)+cg (z *, U)= inf I
it follows that
{,(b,a")+c}~t~(z, da~)~ S {~(b,a")+c}~t.(z, da~)
~ [Vtt)].
m o n o t o n i c a l l y as
U(t) and U, [U Ia
from the above, we can w r i t e
~{w([z°]. u)+c+s,}~([z*l,,du)= u
infinf ~ {'~([b°],,a")+c+e~)gd[z°],,da")= infinfiM {t'(t)]l
The last integral is monotonically non-increasing with respect to $l$, $n$ and $t$, so that its infimum with respect to $l$, $n$, $t$ can be equated to its infimum with respect to $t$:
I {q:([b°l/t, a') + c ~ ettl txn, ([z°], :, da') = t I "-'di ]nt
inf
i
lira
{~°([b°]:v at) + c + ett} ~"t ([zO]tt' dat)"
[L'I/)]nt
t ~
This limit exists, being the limit of a monotonically non-increasing sequence, lower-bounded by zero, and in accordance with the relations between $[U(t)]_{n_t}$ and $U^t$, its value is not less than the upper limit as $t\to\infty$:
lim sup $ (¢i([b°],,,a')+c+e,,}g.,
([z"],,,da')>
u,
limsup { ~ *(Ib°l,,.a')..,([zO],,.
da') +crt., ([z°l,,, U')~ .
trl
Hence
I (z °)/> lim sup {L ( [ b° ], ,) +cla., ( [z°], ,, U') } - c g (z °, U) = lira sup {i,([b°],, )+cg,,([z°],,, U ' ) } - lira cg([z°]q, U ) = lira sup{j,([b°],,)+c{g([z"],,, {a~,~U°l [a], ~ U ' } ) p([z°]t,,U)}}~>limsupj,([b°],,) a . s . in the set of random s e q u e n c e s {x'},.~. W h e n ¢,&0, the last i n e q u a l i t y is o b t a i n e d on the basis of the a.s. inclusion Uc-U, ~ {a%~U°[[a],,EU'}, w h i c h is g u a r a n t e e d b y choosing 0, in accordance with (5b) . For, given any
a~EU,
w e have
H(a~(x))<~O
a.s.
Vz~X,
so that
H(a%(x,'))<.O Vs=l, 2..... S, a.s., i.e.,
H([a].,~(x.') )<.H([a].,~(z.9 )-H(a%(x.') )<~ . . . . .
hence [a].t=~U'. From these r e l a t i o n s b e t w e e n U and a.s., since a.s. f
s,,
a.s.
l(z')<<-L(b')+cg(z',Ut\U)
U,, we see that
1 (z') +c~ (z', U) ~< ] {w (z', u) +c} ~.(z', du) = Ur
Ct then
{#?(b'. a') +c} It,, (z', da') =i.,(b') +cp., (z', U').
We shall show that, if the m e a s u r e g(z', U , \ U ) ~ 0 , t~oo. Obviously,
g(z, .) is u n i f o r m l y continuous with respect to z,
g(:', U ) = inf la. (z', [U].) =p.. (z', [Ul,,)-e", where
e"$O.
Hence
g(z', U , \ U ) = g . , (z', U ' ) - ~ (z', U)~
limsup{/~(b'~)+C~x(b~)}<~limsup
{/.([b°],
)+Cmx([b°],
)}~
I ° + lim sup C.7. ( [ b°],., ) = i o since C,x([b°],t)--O,t ~ , by (5b) . This functional is w e a k l y lower semicontinuous, in view of the fact that G(.) and hence also, max[G(.), 0] and X(') are convex, inasmuch as X(') is continuous. H e n c e we find that any weak limit point Z of the s e q u e n c e { z ' } satisfies the constraints on Z. For, as we have just proved, j.(b')+C~x(b") ~ are upper b o u n d e d b v 1°, w h i l e /~(b"*)>~--c. H e n c e C ~ x ( b ~ ) < ~ and X(b')--0 , m - ~ . Thus x(b)--~0 As a result, liminf/(z')>P, m--~, since otherwise, for the limit Z of the w e a k l y c o n v e r g e n t s u b s e q u e n c e of the sequence {z~}, realizing lim inf/(z~) (such a s u b s e q u e n c e exists, since Z ° is w e a k l y compact), we should find from the w e a k lower s e m i c o n t i n u i t y of ](.). that ;(Z)~<|im/(z~p)
limsupl(z~)~limsup.i~(b'"),
]iminfl(z~)~liminf/~(b'~).
We have thereby shown that, in the set of measure 1 of sequences $\{x^t\}_{t\in N}$, for which $U\subseteq U_t$ $\forall t\in N$, in the limit as $m\to\infty$, $m\in M$, we have
l(z ~) <~lira inf j~ (b TM)~ lim inf {1".(b") + C . x ( b~ ) } ~< {]~(b~)C~x(b")}~l°~lim s u p / ( z ~ ) ~ < l i m s u p / , . ( b ' ~ ) ~ lim sup {j~ ( b") +C~x ( b~) } <~l°, lim l ( z~) =lim {j~ ( b~')+C~x ( b~) } =l °,
P ~ l i m inf lira sup
i.e., the sequence $\{z^m\}_{m\in M}$ is minimizing for the functional $I(\cdot)$. And since the target functional is strongly convex in the convex weak compactum $Z$, then, by the result of /7, p.55/, problem (1) is correct in Tikhonov's sense, i.e., any minimizing sequence is strongly convergent to the solution. Hence $\|z^m-z^0\|\to 0$, $|b^m-[b^0]_{l_m}|^2\le\Delta_m\to 0$, $m\to\infty$. From inequality (7) it now follows that, with probability $P_0^2$,
I b '+ ' - [ b°]z ,÷, 12• max {~', i b ' - [ b°lq 12- 2[~,6, +Kt~, ( p , + c , + z , ) + v , ' } , where the first c o n s t r a i n t under the m a x i m u m holds for
t~M,
and the second,
for
t~N\M.
The
quantity &m=A'-2,5,{/,(6"')+C,.x(b'")-/.,([b°]~)-C~x([b°],.)}+K~(p=+o,+x~)+v= ' d o e s not e x c e e d an i n f i n i t e s i m a l as m-~o% since ~'--0, v~'--0. ~m;0, pm+O=+X~--0, C~x([b0],,)~0, fl,(b'")+C~x(b')--F, limsupim([b°]t.)~
large number of such observation with respect to for $\mu(z,\cdot)$
$$D_k(z)\le\int_U w_k(z,u)\,\mu(z,du)\le F_k(z),\qquad k=1,2,\dots,m.\tag{8}$$
We assume that, with $k=1$, the condition $\mu(z,U)=1$ is contained in (8). There are not sufficient inequalities (8) to define $\mu(z,\cdot)$ uniquely in all $U$. Denote by $\mathfrak M(z)$ $\forall z\in Z^0$ the set of probability measures in $U^0$ which satisfy (8). If $w(z,\cdot)$ and $w_k(z,\cdot)$ are integrable with respect to finitely additive measures (weak distributions), the latter are also included in $\mathfrak M(z)$. When we are thus informed about the conditions of problem (1), we can write an a priori estimate $I_0$ of the result of (1) as
$$I_0=\inf_{z\in Z}\bar I(z)=\bar I(\tilde z),\qquad \bar I(z)=\sup_{\mu(z,\cdot)\in\mathfrak M(z)}\int_U w(z,u)\,\mu(z,du).\tag{9}$$
To seek the value of $I_0$ and some realization of it $\tilde z\in Z^0$, we reduce the inner optimization problem in $\mathfrak M(z)$ to the simpler optimization problem in $U$, by means of the approach described in /9, pp.106-114/ for measures in Euclidean space. We assume that, for $k>l(z)$, we have only the left-hand inequalities in (8). We denote summation over $i=1,2,\dots,m$ by $\sum'$.
Theorem 2. Assume that $U$ is weakly compact and $\forall z\in Z^0$ with respect to $u$ we have the following: $w, w_k\ge -c$ $\forall k=1,2,\dots,m$; $w_1(z,\cdot),\dots,w_l(z,\cdot)$ are weakly continuous; $w(z,\cdot), w_{l+1}(z,\cdot),\dots,w_m(z,\cdot)$ are weakly upper semicontinuous. Then $\forall z\in Z^0$ there exists a measure $\mu(z,\cdot)\in\mathfrak M(z)$, which gives the maximum value $\bar I(z)$ to the integral in (9) and is concentrated on not more than $m$ elements $u^1,\dots,u^m\in U$.

Proof. We shall use the scheme given in /3, pp.72-74/ for finite-dimensional $U$, though dispensing with the condition that the images be closed. Also, up to the end of the proof, we omit the argument $z$ everywhere, including in $\mathfrak M(z)$, on the assumption that our entire discussion refers to arbitrary $z\in Z^0$; $\mathfrak N=\{\mu\mid\mu(U)=1\}$. Denote by $V=\{v=(v_1,\dots,v_m)\mid\exists u\in U:\ v_1=w(u),\ v_i=w_i(u),\ i=2,3,\dots,m\}$ the image of the vector functional $W(\cdot)=\{w(\cdot),w_2(\cdot),\dots,w_m(\cdot)\}$, specified in $U$. Similarly, $Q=\{q=(q_1,\dots,q_m)\mid\exists\mu\in\mathfrak N:\ q_i=J_i(\mu)\ \forall i=1,2,\dots,m\}$ is the domain of values in $\mathfrak N$ (under the condition $\mu(U)=1$) of the vector of integrals
$$J_1(\mu)=\int_U w(u)\,\mu(du),\qquad J_i(\mu)=\int_U w_i(u)\,\mu(du),\quad i=2,3,\dots,m.$$
Then, Q=c-o-V, where the bar denotes closure i n R'". F o r , s i n c e ] , + ( I t ) - c , i = | , 2 . . . . . m, w h e r e , f o r t h e d e n u m e r a b l y a d d i t i v e measures
p.(U)=I, bt~l,
we p u t
V~t~DI J , ( p . ) =
],÷(tt)= lira Z
{w(u,")+c}P'(U'`),
]~÷ (It) = lim Z
{w, (u,") +c} tt (U,"),
k = 2 , 3 , . . . , m;
.+!
Vt-~I
UU,"ffiu,
U,'AU,"=¢,
U~~ are ~ - m e a s u r a b l e (we choose as divisions {U~"}~., V n ~ N the intersection of the respective d i v i s i o n s p a r t i c i p a t i n g in the d e f i n i t i o n of each integral ]~+(~).k=1, 2,..., /n). Given any f i n i t e l y - a d d i t i v e m e a s u r e ~ t ~ , we can construct a sequence of measures ~'~ in such a w a y that VcQ is obvious.
A ÷ ( ~ ) = l i m A ÷ ( ~ ') for
It=l, 2 ..... m.
Qcco
Hence
Put B--BX[D,, F,]X...X[D~, P,]×[D~.,, +~)×...X[D,, +~).
V~
and the inclusion to
The p r o b l e m of seeking
c0
1(z) reduces
to finding the s u p r e m u m sup
qs-~-
~QN B
max q,:
max
vl----- m a x
~:co ~'-'~NS
~QN S
v,.
~coVAB
The last m a x i m u m is r e a c h e d at the e x t r e m e points of the set sncoV. Hence, denoting b y W({u'}) the l i m i t i n g value of the sequence of vectors {|¥(u')},.~, and using C a r a t h e o d o r y ' s theorem, w e see that I is equal to the m a x i m u m of ~'w({u,'))p, over all the sets of numbers p~0 and of s e q u e n c e s {u,t~U},.s, i = t , 2 .... , m, for w h i c h {]V(ut)},~ are convergent and w h i c h s a t i s f y the s y s t e m of conditions
k=2, 3. . . . . m,
D,~E'w~({u,'})p,<~F~, Using our a s s u m p t i o n s a b o u t
U, w,, and
w
E'p,=t.
the theorem now follows.
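The statement that the extremal measure is concentrated on at most $m$ points can be illustrated for $m=2$ (normalization plus one moment constraint). The sketch below makes several assumptions that are not in the paper: $U$ is replaced by a finite list of candidate points, `w` and `w2` hold the values of $w$ and $w_2$ at those points, and only measures supported on one or two points are searched.

```python
from itertools import combinations

def best_two_point_measure(w, w2, D, F):
    """Maximize sum_i w[i]*p[i] over p >= 0, sum p = 1, D <= sum_i w2[i]*p[i] <= F,
    searching only measures concentrated on at most two points."""
    best = None
    n = len(w)
    candidates = [(i, i) for i in range(n)] + list(combinations(range(n), 2))
    for i, j in candidates:
        # Mixture with weight p on point i and weight 1 - p on point j.
        d = w2[i] - w2[j]
        if d == 0.0:
            if not (D <= w2[j] <= F):
                continue
            lo, hi = 0.0, 1.0
        else:
            p1, p2 = (D - w2[j]) / d, (F - w2[j]) / d
            lo, hi = max(0.0, min(p1, p2)), min(1.0, max(p1, p2))
            if lo > hi:
                continue
        for p in (lo, hi):            # a linear objective is maximal at an endpoint
            value = p * w[i] + (1.0 - p) * w[j]
            if best is None or value > best[0]:
                measure = {i: 1.0} if i == j else {i: p, j: 1.0 - p}
                best = (value, measure)
    return best

# Hypothetical data: points u = 0, 1, 2 with w(u) = u^2 and w2(u) = u,
# and the moment constraint 0 <= E[u] <= 1.
best = best_two_point_measure([0.0, 1.0, 4.0], [0.0, 1.0, 2.0], 0.0, 1.0)
```

In this example the best value 2 is reached by putting weight 1/2 on each of the points $u=0$ and $u=2$, which agrees with the solution of the full linear program over all three points.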
Corollary. If, in the statement of Theorem 2, we dispense with the condition that $w_k$ be weakly continuous, it can be used to find the supremum of $J_1(\mu)$ with respect to $\mu\in\mathfrak M(z)$. In this case, the measure $\mu(z,\cdot)$ which realizes $\bar I$ need not necessarily satisfy constraints (8) for the corresponding $k$, though all the $m$ elements on which it is concentrated belong to $U$.

Under the conditions of Theorem 2, the problem of finding $I_0$ and the value of $\tilde z$ in (9) is equivalent to seeking
$$\inf_{z\in Z}\ \max_{u^i\in U,\ i=1,2,\dots,m\,\mid\,A(u^1,\dots,u^m,z)\ne\emptyset}\Phi(u^1,\dots,u^m,z),\tag{10}$$
where
$$A(u^1,\dots,u^m,z)=\{p=(p^1,\dots,p^m)\ge 0\mid D_k(z)\le{\sum}'w_k(z,u^i)p^i\le F_k(z)\ \forall k=1,2,\dots,m\},$$
$$\Phi(u^1,\dots,u^m,z)=\max_{p\in A(u^1,\dots,u^m,z)}{\sum}'w(z,u^i)p^i,$$
$p=(p^1,\dots,p^m)$, $u=(u^1,\dots,u^m)$. If the emptiness or otherwise of the set $A$ is independent of $z$, we can solve problem (10) by using an algorithm similar to 1, and based on the method in /4/ for solving max-min problems in Hilbert space. Otherwise we can attempt to use the result of /10/. To evaluate the generalized gradients with respect to $b$ of the function $\Phi(u,b\zeta)$, we can write the expression of /11, Note 1/ for our case:
$\partial\Phi(u,b\zeta)/\partial b_j=$
max{V' Ow(b~,u') '+m:xE'u,(b~,u')~l'} r ' Ob~ P
(ll)
where
p~Arg
max E ' w ( b ~ , u g f ,
n ~ { n = ( n ...... n ~ ) l V k = l , 2 . . . . . m,
if
Z'u'h(b~,u')p ~=
t
Fh(b~)(or
Dh(b~)), t h e n E'~u,h(b~,u')~l'+
Ow(b~, u,) p, }<~0 Obj
( o r >/0)}.
We obtain this simple form of the directional derivative of the maximum function (with constraints dependent on the argument of differentiation), because the maximized function is linear with respect to the maximizing variable. As a result, to seek the value of (11), we only need to apply the simplex method twice, the dimensionality $m$ of the problem being usually small. Expression (11) holds provided the constraints are regular and that the mapping specifying these constraints is continuous. When these conditions are not satisfied, the generalized gradient can be replaced by its difference analogue. To solve problem (9), we give
Algorithm 2. Let $\zeta$ be an orthonormal basis in $L_2(Y)$. We assume that, for any $l\in N$, we know a reasonably simple set $B^l\ni[\tilde b]_l$, where $\tilde b\zeta=\tilde z$ of (9). Let $\omega=(J,b)=(\omega_0,\omega_1,\dots)$ be a vector whose first component $J$ is the approximation for $I_0$ of (9). We have
$$J\in[J_0,J^0],\qquad J_0\le\inf\bar I(z),\qquad J^0\ge\sup\bar I(z).$$
We put formally $[\omega]_l=(J,[b]_l)=(\omega_0,\dots,\omega_l)$ $\forall l\in N$, $\tilde\omega=(I_0,\tilde b)$. We take an orthonormal basis $\xi$ in $L_2(X)$ and assume that, for any $n\in N$, we know a set $A^n$ of simple structure, such that $[U]_n\subseteq A^n\subseteq[U^0]_n$, $\forall a\in A^{n+1}$ $[a]_n\in A^n$. We also choose a continuous measure $\nu$ in $U^0$, whose projection onto $[U^0]_n$ we call $\nu_n$. In particular, if $\forall n\in N$ $A^{n+1}\cap R^n=A^n$ and $d_n(A^n)<\infty$, we can take as the measure $\nu_n(\cdot)$ the value $d_n(\cdot)/d_n(A^n)$, where $d_n(\cdot)$ is the Lebesgue measure in the space $R^n$. We specify any initial approximation $\omega^1=(J^1,b^1)$ for $\tilde\omega$ and the numerical control sequences $l_t,n_t,R_t,S_t,C_t\uparrow\infty$, $\beta_t,\kappa_t\downarrow 0$ for $t\in N$. We define the further approximations $\omega^t=(J^t,b^t)$ $\forall t\in N$ by the iterative procedure
$$[\omega^{t+1}]_{l_t}=\pi_t\Bigl\{\omega^t-\beta_t\,\mathrm{grad}_\omega\Bigl\{J^t+C_t\Bigl[\min\Bigl[0,\ J^t-\max_{p\in A_t(a^t,b^t\mid y^t,x^t)}{\sum}'\varphi_t(b^t,a^{it}\mid y^t,x^t)p^i\Bigr]\Bigr]^2+C_t\chi_t(b^t\mid y^t)\Bigr\}\Bigr\},$$
$$\omega_j^{t+1}=0\quad\forall j=l_t+1,\dots,l_{t+1},$$
where
$$A_t(a,b\mid y^t,x^t)=\Bigl\{p\ge 0\ \Bigm|\ \forall k=1,2,\dots,m:\ D_k(b\zeta)-\kappa_t\le{\sum}'\varphi_t^k(b,a^i\mid y^t,x^t)p^i\le F_k(b\zeta)+\kappa_t\Bigr\},\quad t\in N;$$
$\pi_t$ is the projection onto $[J_0,J^0]\times B^{l_t}$. Here we take as $a^{1t},\dots,a^{mt}$ the independent realizations of the random quantity distributed in $A^{n_t}$ in accordance with the measure $\nu_{n_t}$ (in particular, it may be uniformly distributed), for which $\exists a\xi\in U^0$: $a^{it}=[a]_{n_t}$ and
$$H(a\xi(x_s^t))\le 0\quad\forall s=1,2,\dots,S_t.$$
(If the functional $H$ is Lipschitz the same check as in algorithm 1, namely, comparison with $\theta_t$, may be used.) The notation $\varphi_t^k$ corresponds to $w_k$, $k=1,2,\dots,m$, in the same way as $\varphi_t$ corresponds to $w$. The remaining notation was introduced earlier.

To simplify the arguments when seeking $u^1,\dots,u^m$ we take the same basis, the same constraints $A^n$ on the Fourier coefficients, and the same measure in the set of constraints. However, the properties of algorithm 2 do not deteriorate if they are all taken to be different for different $u^k$, $k=1,2,\dots,m$. In addition, it always makes practical sense to choose different (not necessarily in order) penalty constants $C_t$ for penalizing departure beyond the constraint $z\in Z$ and for the penalty resulting from reduction of the min-max problem to a problem of limiting minimization. At any rate, the respective parameters need to be adjusted separately (in turn).

Algorithm 2 is based on the method, proposed in /9, p.256/, of replacing the min-max search problem by a minimization problem with an infinite number of constraints, which are removed by means of an integral penalty. The essence of the method (assuming that $A(u,z)\ne\emptyset$ $\forall z$) consists in the following replacement:
$$\inf_z\max_u\Phi=\inf_{z,J}\{J\mid J=\max_u\Phi\}=\inf_{z,J:\ J\ge\Phi\ \forall u}J=\lim_{C\to\infty}\inf_{z,J}\Bigl\{J+C\int_U\dots\int_U[\min(0,J-\Phi)]^2\,\nu(du^1)\dots\nu(du^m)\Bigr\}.$$
It is justified for the case of functional spaces in /10/, under the condition on the measure $\nu$:
$$\forall\varepsilon>0,\ \forall u^0\in U^0:\quad\nu(\{u\in U^0\mid\|u^0-u\|\le\varepsilon\})\ge\delta(\varepsilon)>0.$$
A possible way of dispensing with the evaluation of the integral in the penalty was devised in /12/ (for finite-dimensional problems), where it was proposed to combine the penalty method with the method of stochastic gradients /3/. The resulting method does not fit into the scheme of the stochastic gradients method because the penalty parameter has to approach infinity.

Let us make the same assumptions regarding the functions $\psi^k$, $\varphi_t^k$ and their generalized gradients as are made regarding $\psi$, $\varphi_t$, $\mathrm{grad}_b\varphi_t$, while marking the respective functions $K$ by a superscript $k=1,2,\dots,m$. To match the control parameters of algorithm 2, we replace conditions (5) and (6) by
Z~,~C,~(K,'(l,)+K2'(l,)+[K~(l,,
n , ) ] ' + [K,~(/,, n,)]'}
~t
k = 1, 2, ..., m,
j>tt
(12)
Let us examine the convergence of algorithm 2.
Theorem 3. Under the assumptions above, and those of Theorems 1 and 2, except for $H$ being Lipschitz, let the functional $I(\cdot)$ be continuous and strongly convex, and let the functional $\Phi(u, z)$ be convex with respect to $z$ and continuous, and $\forall z \in Z_\varepsilon$, $u^1, \ldots, u^m \in U^0$ we have: the set $A(u, z) \ne \emptyset$, the vectors $(w^k(z, u^1), \ldots, w^k(z, u^m))$ are linearly independent $\forall k \in \{k = 1, 2, \ldots, m \mid \sum_i w^k(z, u^i) p^i = D_k(z)\ \forall p \ge 0: \sum_i w(z, u^i) p^i = \Phi(u, z)\} \cup \{k = 1, 2, \ldots, m \mid \sum_i w^k(z, u^i) p^i = F_k(z)\ \forall p \ge 0: \sum_i w(z, u^i) p^i = \Phi(u, z)\}$, and the functions $w^k([b\varphi]_l, u^i)$, $w([b\varphi]_l, u^i)$ are continuously differentiable with respect to $b$ $\forall l \in N$: then, with probability $P_0^\infty$, under conditions (3), (4), (6), (12), (13) on the control parameters of algorithm 2, the sequence $z^t = b^t\varphi$ generated by them is strongly convergent to $\hat z$ in $L_2(Y)$ and the numerical sequence $J^t$ is convergent to $I^0$.
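As a rough illustration of the integral-penalty idea on which algorithm 2 is based, the following Python sketch applies a projected stochastic-gradient step to the penalized function $J + C_t[\min(0, J - \Phi(u, z))]^2$, drawing one random $u$ per step instead of evaluating the integral. Everything specific here is an assumption made for the example and is not taken from the paper: the toy function $\Phi(u, z) = (z - u)^2$ with $u$ uniform on $[-1, 1]$, the projection box, the schedules $\beta_t = 0.5/t$ and $C_t = t^{1/2}$, and all function names.

```python
import random


def phi(u, z):
    """Toy payoff (assumed for illustration): Phi(u, z) = (z - u)^2."""
    return (z - u) ** 2


def penalized_grad(J, z, u, C):
    """Gradient in (J, z) of J + C * min(0, J - Phi(u, z))^2 for one sample u."""
    r = min(0.0, J - phi(u, z))          # violation of the constraint J >= Phi
    dJ = 1.0 + 2.0 * C * r
    dz = 2.0 * C * r * (-2.0 * (z - u))  # chain rule: d(J - Phi)/dz = -2 (z - u)
    return dJ, dz


def clip(x, lo, hi):
    """Projection of a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, x))


def run(steps=20000, seed=0):
    """Projected stochastic-gradient iterations with a growing penalty constant."""
    rng = random.Random(seed)
    J, z = 0.0, 0.5                      # arbitrary initial approximation
    for t in range(1, steps + 1):
        beta = 0.5 / t                   # step size decreasing to zero (assumed)
        C = t ** 0.5                     # penalty constant growing without bound (assumed)
        u = rng.uniform(-1.0, 1.0)       # one realization replaces the integral
        dJ, dz = penalized_grad(J, z, u, C)
        J = clip(J - beta * dJ, 0.0, 5.0)
        z = clip(z - beta * dz, -2.0, 2.0)
    return J, z


if __name__ == "__main__":
    print(run())
```

For this toy problem the exact min-max value is 1, attained at $z = 0$. The sketch only shows the structure of one iteration (sample, penalized gradient, projection); it makes no claim about the rate at which the iterates approach that value.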
Proof.
Put
s = ( a I,....
a'),
qr(a, J, b)=[min[O,J-@(a~,b~)]] 2,
~,(s, 1, bly', x')= {rain[O, J - In
where
mbx
A~(a, bly t, X1)
Z'~t(b,a'lf, x')/]}'.
the same way as when proving Theorem l, we find by construction that Zv'<~
Ib .... [b], ,+, I'~; Ib'-[b],, 1'-21~2, grad, {W, (a', 2, b'l y',x') +z,(b' I~/)} (b'-t[~],,)+v', from conditions (12) and z~L~(Y), while also
,1'+'-I°l'<.l'-IoJ'-2~,{
i+C, O~- ~F,(a',I',b'ly',x')(1'-I °) }+v',
Here and henceforth, the arguments y' and point. In view of the convexity, we have
x' of the functions
Y..o'<,o
tI~,, *,, ~,~ are replaced by a
Ib .... [b],,+, I'<---Ib'-[b],, I'+v'-2~,C,{W,fs',Y',b'l ")W,(a',.~', [b],, I.)+z,(b'ly')-X,(['b],, lU')}, I1.... I°1'~ I1'--I o1~-2~,{1'-I°+C,{~,(a ', .t', b'l ")-
u:,(a',l°,b'l.)}}+v '.
Similarly,
I ~, .... [~3], ,+, I'< I~'--[g,],, 1'-2~,{1'-I°+c,{W,(a',~/I
%(s', [~], ,I )+X(b')-x([b], ,))}+~',
.)-
X~'<~
a.s.
in the set of sequences {y'},,,. The passage from x,(bly') to x(b) is based on the same arguments as in the proof of Theorem I. Following this proof further, we obtain with probability P0 :~ the existence of constants K such that V t ~ N . X'{l~P,(b',a"l-)-,(b',a't) l÷l~,([b],,,a" I-)-
v:([b'],,, ,") I + ~ {l*?(b'. ~"1 )-vp'(b', a") I+ I~:([£1, ,,a"l .)-~'([b], ,,a")I })
I
max
At(af, bll) '1, x 1)
Z',t (bt, a i'
I' ) pi
--
max Z'~t (b', a" I.) P~l < K (p, ~ o, - tt ÷ x,), A(al~,, bt~) I max Z',t(b l, a i'l.)p~-~D(a'~, b'~)l-~K(p,~-otn-t d. a(st~,bid The same holds if b' is replaced by
[b]~t'
Hence we have, with probability
IW,(a',1',b' I .)-W (a', 1', b')I+1 W,(a',] ', [b],,
P.",
I')--
u,, (a',.". [/,], ,) I ~g (p,+o,+t,+~,). I~ .... [~],,+, I'~ I o/+[~],, I ~-213,{~"-1°+C,( q' (a', ~')+[a', [&], ,)+z(b')-z(tb], ,)+K(p,+a,+~,+×,) }} +~' Put
nt
Ut ----{a~ ~ U ° IH (a~.(.r/))-<.0 Vs----I, 2,...,S. [a],~~ .4 }_=U, [l_'tjnt
[[ tint
Then, by construction of the sequences {a"},.n, it follows from conditions way as we treated ],(b) when proving Theorem i, that
zc,p,{l~,(~,)-~ (s,,~,) i+ i~,([~],,)-~ (a'.[~],,) I}<® in the set of random sequences
{a'},..~, generated by algorithm 2.
I'~ .... [Q],,+, I'~1~"-- ['~1,, 1'-2t~,{g'-I°+C,{/~(~/)-~,([&
a.s. As a result.
] ' ,)+x (b') -z ([/~]' ,) +K(p,+o,+t,+x,)
Z V,'<~ with probability P0" (in {y',x'},.N for almost all {a'},,n). There exists 6,40: Z~,5,=~, C,x,=0(6,); the latter is possible from (13). set of indices t ~ N for which
~'-I°+c,{j,(~/)-7,([6],,
(12) , in the same
)+z(b')-z([b], ,)} <~,;
} } +v,',
Denote by T the
the set T is infinite.
From the condition that
X(')
be convex, we have inequality
x([b]l,)--
X(b)~s&X(.[b]l,)([b]it-b), while since X(b~)=0, X([bL,)~0, it follows from the assumptions boundedness made in Theorem 1 that, in accordance with (12),
of
a>h
Under these assumptions, the maximum is reached in (9), i.e.,
b~Z, i.e.,
l'm~(a~, be), iV(a,
~)=
0 Va'tGU. By definition of the integral with respect to the measure ~, and by the conditions that be non-negative and that ] be continuous, we see that there exists ~,~0 for which
~2
0 = ~ ... I {{~°--CD(u , ~)},}'v(du')...v(d~")= in! ~... ~{e t + {[?0 .__ ~ (u, [~],)]-}'}v (dul)... v (du') > t~N bI~:N I~:N n~:N iUt]n
| t t]n
v . (da m) = l i r a ], ([&14) > O, U~flU,. in the same way as was proved for U(t) i n T h e o r e m 1 ; t h e l a s t equation to limit holds because the last minimized repeated integral is monotonically non-increasing
since
respect to
l,,, and t.
~([~]i,)+x(b'))-~<0.
Hence
]~([~]'t)--0, t~°°.
Noting that ~' is bounded,
t~T, t- ~ ,
For
that
], and
we
have
the
with
J'-2e~-C,{]t(~')-
X are non-negative,
and that
C,#~,
we see that x(b'),],(~')~0, t- ~ , t~T. As a result, given any limit point ~' of the sequence {¢~'},., (where, when writing b', we have ~n mind that b'~ is a weak limit point of {z')), we obtain b'~Z, as shown in the proof of Theorem i, and I'>_-~(t~ b'~) V R ~ U . For, since • is convex with respect to z, we see that and hence
]~
is convex with respect to
u Hence
b, ~,
u
U, DU.
Hence
and
](~')-~0
O<~](Q)')~lim]((~')=O, i.e.,
for
1(~')=0
tET,
(since
~ r , t-~®).
For ~' we can now write as
is convex with respect to
](.) is weakly lower semicontinuous
since @~lim](s')~-lim],((0'), because
),(~')-*0,
•
Since ~ is continuous with respect to z, we have
~{(J-¢(.,,....~',bU}-}'~(d-')~(d~').
;(~)= ~ is likewise continuous.
~.
/'~°+limC,],([~],,)for
t~T.
Andsince themodulus of
grad),([&],~),
also of gra&{~(a'~, [b~],t~) }: can be bounded by K{[K,(l,,n,) ]'+[K~(/,,n,) ]'}, it follows from (13)
that
|imC,],([~]l/)=0,
The set of resulting
in the same way as the similar relation inequalities
j, _]0 and
J'>~(~,b'~)
for l[mC,x([b]~t )was proved above.
Vu'~U,
recalling that
b'~Z,
imply
that $\omega' = \hat\omega$, i.e., any limiting value of the sequence $\{J^t\}_{t \in T}$ is equal to $I^0$, and any weak limiting value of the sequence $\{b^t\varphi\}_{t \in T}$ is equal to $\hat z$. Further, inasmuch as $I$ is strongly convex, the solution $\hat z$ is unique, and we can conclude that the sequence $\{z^t\}_{t \in T}$ is weakly convergent to $\hat z$. Hence it follows, since the functional $I(\cdot)$ is weakly lower semicontinuous, that $\liminf I(z^t) \ge I^0$ for $t \in T$. By the equation $\lim j(\omega^t) = 0$ proved above, we see that, for $t \in T$,
$$\lim \int \cdots \int \{(\Phi(u, z^t) - J^t)^+\}^2\, \nu(du^1) \cdots \nu(du^m) = 0,$$
i.e.,
$$0 = \lim_{t \in T} \max_{u \in U} \{\Phi(u, z^t) - J^t\}^+ = \lim_{t \in T} \max\{0, \max_{u \in U} \Phi(u, z^t) - J^t\} \ge \lim_{t \in T} \{\max_{u \in U} \Phi(u, z^t) - J^t\} = \lim_{t \in T} \max_{u \in U} \Phi(u, z^t) - I^0 = \lim_{t \in T} I(z^t) - I^0,$$
so that $I^0 \ge \limsup I(z^t)$, $t \in T$.
Consequently, $\lim I(z^t) = I^0$, $t \in T$; thus the sequence $\{z^t\}_{t \in T}$ is minimizing, and, since the functional $I(\cdot)$ is strongly convex, the problem of its minimization is well-posed in the sense of Tikhonov, so that, for $t \in T$, we have $z^t \to \hat z$ in $L_2(Y)$ and $|\omega^t - [\hat\omega]_{l_t}|^2 \to 0$, $t \to \infty$. By the definition
for t~T
of T, it follows from
with probability
P0:"
we have
;,=]o that |imC,(]~(~')+x(b')}=0
for I~T,
~A,-+0: [~).... [~],:+, 12
i.e., t_~."
Hence, with probability P0:" v t ~ N I~ .... [~]~t., l'~
is Hilbert,
$\|\omega^t - \hat\omega\| \to 0$, which it was required to prove.
Notes. 1. Instead of assuming that the functionals being minimized are strongly convex, it is sufficient to make the weaker assumption of strict uniform convexity. In the case of simple convexity of the minimized functionals, algorithms 1 and 2 require regularization.
2. Algorithms 1 and 2 can be simplified somewhat by randomly drawing at each step $t$, instead of $S_t$ values $x^t$ and $H_t$ values $y^t$, just one or several auxiliary values, and computing the integral approximately, with respect to all preceding realizations, i.e., putting $x^t = (x^1, \ldots, x^t)$, $y^t = (y^1, \ldots, y^t)$.
We can conclude from Theorems 1 and 3 that the solution of convex problems of constrained optimization in Hilbert space is in a sense equivalent to a denumerable number of finite-dimensional iterations, amounting in most problems to elementary operations. The specific feature of infinite-dimensionality seems to be essential only for the non-convex case, when the conditions of convergence of algorithms of type 1 to a stationary point are not yet obtainable. To study the theoretical convergence, we considered a very simple version of algorithms 1 and 2, which may be regarded as having a common scheme. On the basis of this scheme we can obtain algorithms with better properties of practical convergence, without detriment to the theoretical convergence. Our testing of algorithm 1 shows that its properties are similar to those of stochastic algorithms for solving simpler problems, see /4, 12/. The main merits of the combined method of penalties and stochastic gradients lie in the speed with which the first approximation to the solution is obtained, and in the simplicity of realization, which does not demand analytic evaluation of auxiliary functional characteristics of the problem. For subsequent refinement of the solution, it makes sense to increase the accuracy of evaluating the integrals in an iteration, i.e., to use, not one, but several vectors $a^t$ at each step $t$ (e.g., if the measure $\nu$ is independent of $z$, to use $a^{t-1}, a^{t-2}, \ldots$
) and to choose a suitably averaged value of ~ as an approximation to ],. We can treat a' in algorithm 2 in a similar way. But this improvement in accuracy of integration resulting from the accumulated random realizations, if it is made during preliminary adjustment of the algorithm parameters or during primary localization of the solution, can lead to loss of practical convergence, due to the insufficient number of iterations performed. For preliminary computation by algorithm i, with ~500 iterations, we recommend the following parameters: ~t=~/t, Ct=Ot °'~, ~'/2, C