European Journal of Operational Research 174 (2006) 1368–1379 www.elsevier.com/locate/ejor
Continuous Optimization
New class of multiplicative algorithms for solving of entropy-linear programs

Yu. S. Popkov *

Institute for Systems Analysis, 9 Prospect 60 let Octyabrya, Moscow 117312, Russia

Received 16 January 2003; accepted 4 January 2005; available online 27 June 2005
Abstract

The general entropy-linear program (ELP) is considered. Two types of new coordinate-wise multiplicative algorithms with p active variables are proposed for solving the problem: algorithms with feedback control with respect to the dual variables, and algorithms of mixed type (with respect to the dual and primal variables simultaneously). The convergence study of the algorithms is based on a stability analysis of auxiliary differential equations that are continuous analogues of the algorithms.
© 2005 Elsevier B.V. All rights reserved.

Keywords: Entropy maximization; Multiplicative algorithms; Row-action procedures; Active variables; Feedback control; G-convergence; Exponential Lagrange multipliers
1. Introduction

Entropy and the classical variational principle of statistical physics are effective tools for modelling and solving many applied problems. There are many definitions of "entropy" functions; the book by Kapur (1989) contains some of them. The classical definition of the physical entropy was introduced by Boltzmann in 1871 (Boltzmann, 1871) and was developed by Fermi and Dirac and by Bose and Einstein (Landau and Lifshitz, 1964). The information entropies of Shannon and Kullback, introduced much later, and the generalized information entropies of Fermi–Dirac and Bose–Einstein type (Popkov, 1995) are close to the definition of the physical entropy (they are Stirling approximations of the Boltzmann, Fermi or Einstein entropies).
* Tel.: +7 095 135 42 22; fax: +7 095 938 22 08. E-mail address: [email protected]
0377-2217/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.ejor.2005.01.069
It is also necessary to remark that the notion of entropy as a measure of uncertainty is the basic one, although the analytical expression of this measure can vary, depending on the uncertainty model. We will use the definitions of the generalized information entropies introduced in Popkov (1995). Such entropies and their particular forms are used as natural functionals for modelling transportation flows (Wilson, 1970; Imelbaev and Shmulyan, 1978) and stationary states of macrosystems (Popkov, 1984, 1986), and for urban and regional planning (Leonardy, 1978; Erlander, 1980; Batten and Roy, 1982). Entropy maximization is applied to image reconstruction from projections (Lent, 1977; Herman, 1980, 1982; Gull and Skilling, 1984; Byrne, 1993). A large number of applications of the entropy maximization principle is contained in Kapur (1989) and Fang et al. (1997). Entropy and entropy-like functions have some useful properties that motivate their application in numerical procedures. In particular, one direction in the theory of ill-posed problems is based on entropy regularization methods (Erlander, 1981; Smith and Grandy, 1985).

In this paper we consider the entropy maximization problem on polyhedral sets, which describes most of the applied models named above. We will call such problems entropy-linear programs (ELP). Multiplicative procedures represent one class of numerical procedures applied to ELP solving. Apparently, the first general approach to the synthesis of such procedures was proposed in Dubov et al. (1983), where a simple multiplicative algorithm was applied to the minimization of strictly convex functions on the nonnegative orthant. Later, multiplicative algorithms with respect to dual variables were used for solving conditional minimization and mathematical programming problems (Aliev et al., 1985; Popkov, 1988, 1995a), as well as nonlinear equations (Popkov, 1996). Multiplicative procedures for finding nonnegative solutions of minimization problems over the nonnegative orthant were proposed again in Iusem et al. (1996). Some types of multiplicative algorithms may be derived from the approach based on Bregman functions and generalized projections with Shannon's entropy. In this case one obtains the so-called row-action algorithms, whose iterations have a multiplicative form. The first algorithm of this type was proposed in Gordon et al. (1970) for three-dimensional image reconstruction from projections. Row-action algorithms were further developed and modified in Herman (1982), Censor (1981), Censor and Segman (1987), Byrne (1996), and Censor and Zenios (1997).

It is necessary to note that in most of the cited works the multiplicative algorithms are applied to entropy maximization problems with linear equality constraints. In this paper we consider the ELP problem where the feasible set is described by a system of equalities and inequalities. A regular procedure for the design of multiplicative algorithms with p active variables is proposed for solving this problem. On the basis of this procedure we synthesize algorithms with respect to the dual variables, and with respect to mixed (dual and primal) variables simultaneously. The choice of the active variables is implemented by feedback control with respect to the current state of the iterative process.
The convergence study of the multiplicative algorithms is based on continuous analogues of the algorithms and on the equivalence of the iterative sequences generated by the dual and the mixed-type algorithms.
2. The ELP problem and optimality conditions

The ELP is a problem of mathematical programming with an entropy goal function and a polyhedral feasible set. We will consider the ELP with two types of entropy goal functions, namely

$$H_B(y) = -\sum_{n=1}^{m} y_n \ln \frac{y_n}{a_n e}, \qquad (1)$$

where $e \approx 2.71$ and

$$Y_+ = \{\, y : y_n \ge 0,\ n = 1,\ldots,m \,\}, \qquad (2)$$
and

$$H_F(y) = -\sum_{n=1}^{m}\left[\, y_n \ln \frac{y_n}{a_n} + (b_n - y_n)\ln(b_n - y_n) \right], \qquad (3)$$

where

$$Y = \{\, y : 0 \le y_n \le b_n,\ n = 1,\ldots,m \,\}. \qquad (4)$$
In these expressions a and b are the parameters of the entropy functions: they have the meaning of the prior probabilities and of the maximum values of the components of the vector y, respectively. The polyhedral feasible set takes the form

$$y \in D = \{\, y : Ty = v,\ \tilde{T}y \le \tilde{v} \,\}, \qquad (5)$$
where the matrices $T = [t_{kn};\ k = 1,\ldots,l;\ n = 1,\ldots,m]$ and $\tilde{T} = [t_{(k+l),n};\ k = 1,\ldots,w;\ n = 1,\ldots,m]$ and the vectors $v = [v_k,\ k = 1,\ldots,l]$ and $\tilde{v} = [v_{k+l},\ k = 1,\ldots,w]$ have nonnegative elements. The matrices $T$ and $\tilde{T}$ have ranks $l$ and $w$, respectively, where $r = l + w < m$. Assume that the sets

$$\tilde{Y}_+ = D \cap Y_+ \ne \emptyset, \qquad \tilde{Y} = D \cap Y \ne \emptyset \qquad (6)$$

contain more than one element. Hence, the ELP is formulated in the following form:

$$H(y) \Rightarrow \max_{y \in D}. \qquad (7)$$
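To make the problem concrete, the following minimal sketch solves a small instance of (7) with the Boltzmann-type objective (1) using a general-purpose solver as a reference baseline. The instance data (a, T, v, and the inequality part) are invented for illustration and are not taken from the paper.

```python
# Minimal reference sketch of the ELP (7) with the Boltzmann-type entropy (1).
# The data below are an invented toy instance; a general-purpose solver is used
# only as a baseline against which the multiplicative algorithms can be checked.
import numpy as np
from scipy.optimize import minimize

a = np.array([0.2, 0.3, 0.5])       # prior probabilities a_n
T = np.array([[1.0, 1.0, 1.0]])     # equality part: T y = v
v = np.array([1.0])
Tt = np.array([[2.0, 1.0, 0.5]])    # inequality part: T~ y <= v~
vt = np.array([1.2])

def neg_entropy(y):
    # -H_B(y) = sum_n y_n ln(y_n / (a_n e)); minimizing it maximizes H_B
    return np.sum(y * np.log(y / (a * np.e)))

cons = [{'type': 'eq',   'fun': lambda y: T @ y - v},
        {'type': 'ineq', 'fun': lambda y: vt - Tt @ y}]
res = minimize(neg_entropy, x0=np.full(3, 1/3), bounds=[(1e-9, None)] * 3,
               constraints=cons, method='SLSQP')
print(res.x, -res.fun)   # maximizer y* and the attained entropy H_B(y*)
```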
It is necessary to make some remarks about the entropy functions (1) and (3). These functions have various interpretations. One of them is connected with a random distribution of particles among subsets of close states in the phase space. If the average numbers of particles in the subsets are small compared with the capacities of the subsets, then we have Boltzmann statistics, and the generalized Boltzmann entropy (1) characterizes such a random distribution process. Here the parameters $a_1,\ldots,a_m$ are the prior probabilities of the distribution process. If there is information on the capacities of the subsets, namely $b_1,\ldots,b_m$, and each state in a subset can be occupied by one particle only, then we have Fermi statistics, and the generalized Fermi–Dirac entropy (3) characterizes this distribution process. Also we can see that the conditions $0 \le y_n \le b_n$, $n = 1,\ldots,m$, are satisfied automatically for the function (3).

Consider the Lagrange function for this problem:

$$L(y,\lambda,\mu) = H(y) + \sum_{k=1}^{l}\lambda_k\left(v_k - \sum_{n=1}^{m} t_{kn}y_n\right) + \sum_{k=1}^{w}\mu_k\left(v_{k+l} - \sum_{n=1}^{m} t_{(k+l),n}y_n\right), \qquad (8)$$

where $\lambda_1,\ldots,\lambda_l$ and $\mu_1,\ldots,\mu_w$ are Lagrange multipliers. Assume that the Slater condition is valid, i.e., there exists a vector $y^0$ belonging to the set $Y_+$ or $Y$ such that $Ty^0 = v$, $\tilde{T}y^0 < \tilde{v}$. According to Polyak (1987), the following expressions give the necessary and sufficient conditions of optimality of the triple $(y^*, \lambda^*, \mu^*)$ for the problem (7), where $H(y)$ and $D$ are defined by (1)–(6):
$$\nabla_{y_n} L(y^*,\lambda^*,\mu^*) = 0, \quad n = 1,\ldots,m, \qquad (9)$$

$$\nabla_{\lambda_k} L(y^*,\lambda^*,\mu^*) = 0, \quad k = 1,\ldots,l, \qquad (10)$$

$$\nabla_{\mu_k} L(y^*,\lambda^*,\mu^*) \ge 0, \quad \mu_k^*\,\nabla_{\mu_k} L(y^*,\lambda^*,\mu^*) = 0, \quad \mu_k^* \ge 0, \quad k = 1,\ldots,w. \qquad (11)$$
In these expressions the following designations are used:

$$\nabla_{y_n} L(y,\lambda,\mu) = \frac{\partial H(y)}{\partial y_n} - \sum_{k=1}^{l}\lambda_k t_{kn} - \sum_{k=1}^{w}\mu_k t_{(k+l),n},$$

$$\nabla_{\lambda_k} L(y,\lambda,\mu) = v_k - \sum_{n=1}^{m} t_{kn}\,y_n, \qquad (12)$$

$$\nabla_{\mu_k} L(y,\lambda,\mu) = v_{k+l} - \sum_{n=1}^{m} t_{(k+l),n}\,y_n.$$
Consider the problem (7) with the entropy function $H_B(y)$ (1). From (10)–(12) we obtain

$$y_n(z,\mu) = u_n^B(z)\,x_n^B(\mu),$$

$$u_n^B(z) = \tilde{a}_n\prod_{k=1}^{l} z_k^{-t_{kn}}, \qquad x_n^B(\mu) = \tilde{a}_n\exp\left(-\sum_{k=1}^{w}\mu_k t_{(k+l),n}\right), \qquad \tilde{a}_n = \sqrt{a_n}, \quad n = 1,\ldots,m, \qquad (13)$$

where $z_k = \exp(\lambda_k)$. The multipliers $z$ and $\mu$ must satisfy the following system of equations and inequalities:

$$B_k(z,\mu) = \frac{1}{v_k}\sum_{n=1}^{m} t_{kn}\,u_n^B(z)\,x_n^B(\mu) = 1, \quad k = 1,\ldots,l,$$

$$H_{k+l}(z,\mu) = v_{k+l} - \sum_{n=1}^{m} t_{(k+l),n}\,u_n^B(z)\,x_n^B(\mu) \ge 0, \quad k = 1,\ldots,w, \qquad (14)$$

$$\mu_k\,H_{k+l}(z,\mu) = 0, \qquad z_i \ge 0,\ i = 1,\ldots,l; \quad \mu_k \ge 0,\ k = 1,\ldots,w.$$
Now consider the problem (7) with the entropy function $H_F(y)$ (3). From the optimality conditions (10)–(12) we have

$$y_n(z,\mu) = u_n^F(z)\,x_n^F(\mu),$$

$$u_n^F(z) = \tilde{b}_n\left[1 + \frac{1}{\tilde{a}_n}\prod_{k=1}^{l} z_k^{t_{kn}}\right]^{-1}, \qquad x_n^F(\mu) = \tilde{b}_n\left[1 + \frac{1}{a_n}\exp\left(\sum_{k=1}^{w}\mu_k t_{(k+l),n}\right)\right]^{-1}, \qquad (15)$$

$$\tilde{b}_n = \sqrt{b_n}, \quad n = 1,\ldots,m.$$

In this case the multipliers $z$ and $\mu$ must satisfy the following system of equations and inequalities:

$$F_k(z,\mu) = \frac{1}{v_k}\sum_{n=1}^{m} t_{kn}\,u_n^F(z)\,x_n^F(\mu) = 1, \quad k = 1,\ldots,l,$$

$$C_{k+l}(z,\mu) = v_{k+l} - \sum_{n=1}^{m} t_{(k+l),n}\,u_n^F(z)\,x_n^F(\mu) \ge 0, \quad k = 1,\ldots,w, \qquad (16)$$

$$\mu_k\,C_{k+l}(z,\mu) = 0, \qquad z_i \ge 0,\ i = 1,\ldots,l; \quad \mu_k \ge 0,\ k = 1,\ldots,w.$$
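As a computational aside, the following sketch assembles the primal reconstruction (13) and the residuals of the system (14) for the Boltzmann case; the analogous functions for (16) follow the same pattern with $u_n^F$, $x_n^F$ from (15). The array shapes and names are my own conventions, not the paper's, and the signs follow the reconstruction of (13) given above.

```python
# Sketch of the dual-system residuals (14): given z > 0 and mu >= 0, rebuild the
# primal vector y(z, mu) from (13) and evaluate B_k and the slacks H_{k+l}.
# Conventions: T has shape (l, m), Tt has shape (w, m); illustrative only.
import numpy as np

def primal_boltzmann(z, mu, a, T, Tt):
    # y_n = a_n * prod_k z_k^(-t_kn) * exp(-sum_k mu_k t_(k+l),n), per (13)
    return a * np.exp(-(T.T @ np.log(z))) * np.exp(-(Tt.T @ mu))

def residuals_14(z, mu, a, T, Tt, v, vt):
    y = primal_boltzmann(z, mu, a, T, Tt)
    B = (T @ y) / v       # B_k(z, mu): equals 1 at the solution
    H = vt - Tt @ y       # H_{k+l}(z, mu): >= 0, with mu_k * H_{k+l} = 0
    return B, H
```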
3. Multiplicative algorithms

We apply a coordinate-wise version of the multiplicative procedure proposed in Dubov et al. (1983) to find nonnegative solutions of the systems (14) or (16). Introduce the following designations:

$$X_k(z,\mu) = \begin{cases} B_k(z,\mu) & \text{for the problem (1), (7)},\\ F_k(z,\mu) & \text{for the problem (3), (7)},\end{cases} \qquad (17)$$

where $k = 1,\ldots,l$, and

$$W_k(z,\mu) = \begin{cases} H_{k+l}(z,\mu) & \text{for the problem (1), (7)},\\ C_{k+l}(z,\mu) & \text{for the problem (3), (7)},\end{cases} \qquad (18)$$
where $k = 1,\ldots,w$.

(1) Multiplicative algorithms with respect to dual variables. Let us define the notion of the active variables at the sth iteration: the dual variables $z_{k_1(s)},\ldots,z_{k_p(s)}$ and $\mu_{t_1(s)},\ldots,\mu_{t_q(s)}$ are active if they vary at the sth iteration. The number of active variables is $p + q \le r$. The multiplicative algorithm with $(p+q)$ active dual variables, called the $(p+q)$-MD-algorithm, can be written in the following form:

(a) initial step:

$$z_i^0 > 0, \quad i = 1,\ldots,l; \qquad \mu_j^0 > 0, \quad j = 1,\ldots,w;$$

(b) iterative step:

$$z_{k_1(s)}^{s+1} = z_{k_1(s)}^{s}\,X_{k_1(s)}^{\gamma}(z^s,\mu^s), \quad \ldots, \quad z_{k_p(s)}^{s+1} = z_{k_p(s)}^{s}\,X_{k_p(s)}^{\gamma}(z^s,\mu^s),$$
$$z_k^{s+1} = z_k^{s}, \quad k = 1,\ldots,l,\ k \ne k_1(s),\ldots,k_p(s); \qquad (19)$$

$$\mu_{t_1(s)}^{s+1} = \mu_{t_1(s)}^{s}\left[1 - a\,W_{t_1(s)}(z^s,\mu^s)\right], \quad \ldots, \quad \mu_{t_q(s)}^{s+1} = \mu_{t_q(s)}^{s}\left[1 - a\,W_{t_q(s)}(z^s,\mu^s)\right],$$
$$\mu_t^{s+1} = \mu_t^{s}, \quad t = 1,\ldots,w,\ t \ne t_1(s),\ldots,t_q(s). \qquad (20)$$

In these expressions the parameters $\gamma$ and $a$ are the step coefficients, and the functions $X$ and $W$ are defined by the expressions (17) and (18).
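A compact sketch of one iterative step (19), (20) for the Boltzmann case may clarify the structure. The active index sets are taken as given here (their feedback selection is the subject of item (3) below), and the helper primal_boltzmann from the previous sketch is reused; all names and shapes are my conventions.

```python
# One (p+q)-MD step, eqs. (19)-(20): multiplicative update of the active z's,
# feedback update of the active mu's; all other dual variables stay frozen.
import numpy as np

def md_step(z, mu, active_k, active_t, a_pri, T, Tt, v, vt,
            gamma=0.5, a_step=0.05):
    y = primal_boltzmann(z, mu, a_pri, T, Tt)   # primal reconstruction (13)
    X = (T @ y) / v                             # X_k = B_k, designation (17)
    W = vt - Tt @ y                             # W_k = H_{k+l}, designation (18)
    z_new, mu_new = z.copy(), mu.copy()
    z_new[active_k] *= X[active_k] ** gamma         # eq. (19): z <- z * X^gamma
    mu_new[active_t] *= 1.0 - a_step * W[active_t]  # eq. (20): mu <- mu * (1 - a W)
    return z_new, mu_new
```

Small step coefficients keep the multipliers positive, which matches the role of the intervals (0, a(G)) and (0, b(G)) in the convergence definition of Section 4.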
(2) Multiplicative algorithms of the mixed type. The algorithms (19), (20) with respect to dual variables have an identical structure for the problems (1), (7) and (3), (7). For the design of the mixed-type algorithms (using dual and primal variables), the form of the entropy function plays a significant role, so we write the algorithms for (1) and (3) separately.

Consider the case (1), (7) and write the expressions for the elements of the sequence of the primal variables at the sth and the (s+1)th iterations using the expressions (13):

$$y_n^s = u_n^B(z^s)\,x_n^B(\mu^s),$$

$$y_n^{s+1} = \frac{1}{a_n}\,u_n^B([\tilde{z}^p]^{s+1})\,u_n^B([\tilde{z}^{l-p}]^{s+1})\,x_n^B([\tilde{\mu}^q]^{s+1})\,x_n^B([\tilde{\mu}^{w-q}]^{s+1}), \quad n = 1,\ldots,m.$$

In these expressions $\tilde{z}^p = \{z_{k_1},\ldots,z_{k_p}\}$ and $\tilde{\mu}^q = \{\mu_{t_1},\ldots,\mu_{t_q}\}$ are the vectors of the active dual variables; the components of the vectors $\tilde{z}^{l-p}$ and $\tilde{\mu}^{w-q}$ are the nonactive dual variables. Substituting the expressions for the dual variables from (19), (20) into the latter equations, we obtain the mixed-type multiplicative algorithm $(p+q)$-MDP-B for the problem (1), (7):

(a) initial step:

$$y_n^0 = u_n^B(z^0)\,x_n^B(\mu^0), \qquad z_i^0 > 0,\ i = 1,\ldots,l; \quad \mu_j^0 > 0,\ j = 1,\ldots,w;$$

(b) iterative step:

$$y_n^{s+1} = y_n^s\,U_{k_1(s),\ldots,k_p(s)}(y^s)\,\exp\left[\tilde{U}_{t_1(s),\ldots,t_q(s)}([\tilde{\mu}^q]^s, y^s)\right], \quad n = 1,\ldots,m, \qquad (21)$$

$$\mu_{t_h}^{s+1} = \mu_{t_h}^s\left[1 - a\,Q_{t_h}(y^s)\right], \quad h = 1,\ldots,q. \qquad (22)$$
In these expressions the following designations are used:

$$U_{k_1,\ldots,k_p}(y) = \prod_{h=1}^{p}\left(\frac{1}{v_{k_h}}\sum_{n=1}^{m} t_{k_h,n}\,y_n\right)^{-\gamma t_{k_h,n}},$$

$$\tilde{U}_{t_1,\ldots,t_q}(\tilde{\mu}^q, y) = a\sum_{h=1}^{q}\mu_{t_h}^s\,Q_{t_h}(y)\,t_{(t_h+l),n}, \qquad (23)$$

$$Q_k(y) = v_{k+l} - \sum_{n=1}^{m} t_{(k+l),n}\,y_n,$$

where $\tilde{\mu}^q = \{\mu_{t_1},\ldots,\mu_{t_q}\}$ and $\gamma$, $a$ are the step factors.
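Under the same assumed conventions, a sketch of the mixed-type step (21)–(23) for the MDP-B algorithm follows; the sign of the per-component exponent in U mirrors the reconstruction of (23) above and should be checked against the original typesetting.

```python
# One (p+q)-MDP-B step, eqs. (21)-(23): the primal vector is updated
# multiplicatively; only the active multipliers mu_t change.
import numpy as np

def mdp_b_step(y, mu, active_k, active_t, T, Tt, v, vt,
               gamma=0.5, a_step=0.05):
    U = np.ones_like(y)
    for k in active_k:                   # U_{k1..kp}(y) from (23), with the
        B_k = (T[k] @ y) / v[k]          # componentwise exponent -gamma*t_{k,n}
        U *= B_k ** (-gamma * T[k])
    Q = vt - Tt @ y                      # Q_k(y) from (23)
    U_tilde = a_step * sum(mu[t] * Q[t] * Tt[t] for t in active_t)
    y_new = y * U * np.exp(U_tilde)                    # eq. (21)
    mu_new = mu.copy()
    mu_new[list(active_t)] *= 1.0 - a_step * Q[list(active_t)]   # eq. (22)
    return y_new, mu_new
```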
Consider the problem (7) with the entropy function (3). Its solution can be represented in the multiplicative form (15). We will use the variables $u_n(z)$ and $x_n(\mu)$ from (15) without the superscript F to simplify the notation. The algorithm (19), (20) generates the sequences of the dual variables $z^1, z^2,\ldots$ and $\mu^1, \mu^2,\ldots$. According to the equalities (15) we can construct the sequences of the auxiliary variables $u_n(z)$, $x_n(\mu)$ and of the primal variables $y_n(z,\mu)$. Writing the expressions for the elements of these sequences at the sth and (s+1)th iterations, substituting the values of the dual variables $z^{s+1}$, $\mu^{s+1}$ from the algorithm (19), (20), and using the transformation

$$f_n = d_n - 1, \qquad d_n = \tilde{b}_n / y_n, \quad n = 1,\ldots,m,$$
we obtain the mixed-type multiplicative algorithm $(p+q)$-MDP-F for the problem (3), (7):

(a) initial step:

$$u_n^0 = \tilde{b}_n\left[1 + \frac{1}{\tilde{a}_n}\prod_{i=1}^{l}(z_i^0)^{t_{in}}\right]^{-1}, \qquad x_n^0 = \tilde{b}_n\left[1 + \frac{1}{a_n}\exp\left(\sum_{j=1}^{w}\mu_j^0\,t_{(j+l),n}\right)\right]^{-1}, \qquad (24)$$

$$y_n^0 = u_n^0 x_n^0, \quad n = 1,\ldots,m; \qquad z_i^0 > 0,\ i = 1,\ldots,l; \quad \mu_j^0 > 0,\ j = 1,\ldots,w;$$

(b) iterative step:

$$y_n^s = u_n^s x_n^s,$$
$$u_n^{s+1} = \tilde{b}_n u_n^s\left[u_n^s + (\tilde{b}_n - u_n^s)\,U_{k_1(s),\ldots,k_p(s)}(y^s)\right]^{-1},$$
$$x_n^{s+1} = \tilde{b}_n x_n^s\left[x_n^s + (\tilde{b}_n - x_n^s)\exp\left(\tilde{U}_{t_1(s),\ldots,t_q(s)}([\tilde{\mu}^q]^s, y^s)\right)\right]^{-1}, \quad n = 1,\ldots,m, \qquad (25)$$
$$\mu_{t_h}^{s+1} = \mu_{t_h}^s\left[1 - a\,Q_{t_h}(y^s)\right], \quad h = 1,\ldots,q,$$
where the functions $U_{k_1,\ldots,k_p}(y)$, $\tilde{U}_{t_1,\ldots,t_q}(\tilde{\mu}^q, y)$, $Q_k(y)$ and the vector $\tilde{\mu}^q$ are defined by the expressions (23).

(3) Active variables. To choose the active variables we will use a feedback control with respect to the errors

$$\vartheta_k(z^s) = \left|1 - X_k(z^s)\right|, \quad k = 1,\ldots,l, \qquad (26)$$
$$e_t(y^s,\mu^s) = \left|\mu_t^s\,Q_t(y^s)\right|, \quad t = 1,\ldots,w.$$

Using these errors, at each iterative step s it is necessary to select the p maximal errors $(\vartheta_{k_1},\ldots,\vartheta_{k_p})$ and the q maximal errors $(e_{t_1},\ldots,e_{t_q})$; the numbers $(k_1,\ldots,k_p)$ and $(t_1,\ldots,t_q)$ belong to the intervals $[1,l]$ and $[1,w]$, respectively. To do so, we apply the following procedure (a code sketch is given after this subsection).

s-iterative step. Find the maximal error $\vartheta_{k_1(s)}(z^s)$ among the l errors $\vartheta_1(z^s),\ldots,\vartheta_l(z^s)$. Exclude the error $\vartheta_{k_1(s)}(z^s)$ from this set. Find the maximal error $\vartheta_{k_2(s)}(z^s)$ among the remaining $l-1$ errors $\vartheta_1(z^s),\ldots,\vartheta_{k_1-1}(z^s),\vartheta_{k_1+1}(z^s),\ldots,\vartheta_l(z^s)$, and so on, until all p maximal errors $\vartheta_{k_1(s)}(z^s),\ldots,\vartheta_{k_p(s)}(z^s)$ are found. The selection of the maximal errors $e_{t_1(s)}(y^s,\mu^s),\ldots,e_{t_q(s)}(y^s,\mu^s)$ is implemented similarly.

Introduce the following designations to formalize this procedure:

$$l = Ip + d, \quad \rho = s\ (\mathrm{mod}\ (I+1)), \quad I = \left\lfloor \frac{l}{p}\right\rfloor, \quad 0 \le d \le p-1, \qquad (27)$$

$$w = Jq + \omega, \quad \beta = s\ (\mathrm{mod}\ (J+1)), \quad J = \left\lfloor \frac{w}{q}\right\rfloor, \quad 0 \le \omega \le q-1. \qquad (28)$$
Consider the index sets

$$K = \{1,\ldots,l\}, \qquad K_h(s) = \{k_1(s),\ldots,k_h(s)\},$$
$$\tilde{K} = \{1,\ldots,w\}, \qquad \tilde{K}_\sigma(s) = \{t_1(s),\ldots,t_\sigma(s)\}, \qquad (29)$$

where

$$h = \begin{cases}1,\ldots,p & \text{if } \rho < I,\\ 1,\ldots,d & \text{if } \rho = I,\\ 0 & \text{if } d = 0,\end{cases} \qquad (30)$$

$$\sigma = \begin{cases}1,\ldots,q & \text{if } \beta < J,\\ 1,\ldots,\omega & \text{if } \beta = J,\\ 0 & \text{if } \omega = 0.\end{cases} \qquad (31)$$

Denote

$$P_{h-1}(s) = \left[\bigcup_{v=1}^{\rho} K_p(s-v)\right] \cup K_{h-1}(s), \qquad G_{h-1}(s) = K \setminus P_{h-1}(s), \qquad (32)$$

$$\tilde{P}_{\sigma-1}(s) = \left[\bigcup_{v=1}^{\beta} \tilde{K}_q(s-v)\right] \cup \tilde{K}_{\sigma-1}(s), \qquad \tilde{G}_{\sigma-1}(s) = \tilde{K} \setminus \tilde{P}_{\sigma-1}(s). \qquad (33)$$

Here the numbers $h$ and $\sigma$ are determined by the expressions (30), (31), and

$$K_0(s) = \tilde{K}_0(s) = P_0(s) = G_0(s) = \tilde{P}_0(s) = \tilde{G}_0(s) = \emptyset \quad \text{for all } s.$$

Now we introduce the rule of the $(p+q)$-maximal error in the following form:

$$k_h(s) = \arg\max_{i \in G_{h-1}(s)} \vartheta_i(z^s), \qquad t_\sigma(s) = \arg\max_{j \in \tilde{G}_{\sigma-1}(s)} e_j(y^s,\mu_j^s). \qquad (34)$$

According to this rule we have the following chains of inequalities:

$$\vartheta_{k_p(s)}(z^s) < \vartheta_{k_{p-1}(s)}(z^s) < \cdots < \vartheta_{k_1(s)}(z^s),$$
$$e_{t_q(s)}(y^s,\mu_{t_q}^s) < e_{t_{q-1}(s)}(y^s,\mu_{t_{q-1}}^s) < \cdots < e_{t_1(s)}(y^s,\mu_{t_1}^s). \qquad (35)$$

So, the $(p+q)$ active variables $z_{k_1(s)},\ldots,z_{k_p(s)}$ and $\mu_{t_1(s)},\ldots,\mu_{t_q(s)}$ are generated at each iteration s. We can see that all dual variables $z_1,\ldots,z_l$; $\mu_1,\ldots,\mu_w$ are sequentially transformed into active ones during $I+J+2$ iterations, and this is repeated with a period of $I+J+2$.
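The selection rule (26)–(34) can be sketched compactly. This simplified version keeps a memory set playing the role of the sets P in (32), (33), so that every index becomes active once per period, as described above; the finer split between the $\rho < I$ and $\rho = I$ sweeps is abstracted away.

```python
# Simplified sketch of the (p+q)-maximal-error rule: pick the p largest errors
# among the candidates not yet activated in the current period (sets G in (32)).
import numpy as np

def select_active(errors, p, used):
    # 'used' accumulates indices activated earlier in the current period
    candidates = [i for i in np.argsort(-np.asarray(errors)) if i not in used]
    chosen = candidates[:p]
    used.update(chosen)
    if len(used) >= len(errors):   # the whole period has passed:
        used.clear()               # every index has been active once
    return chosen

# usage per iteration: used_k = set(); active_k = select_active(theta, p, used_k)
```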
4. Convergence of the multiplicative algorithms

(1) Algorithm $(p+q)$-MD. Consider the sequence $(\lambda^{(I+J+2)s}, \mu^{(I+J+2)s})$, $s = 0, 1, 2,\ldots$, whose elements are generated by the algorithm (19), (20):

$$\lambda_k^{sM+M} = \lambda_k^{sM} + \gamma\ln\tilde{X}_k(\lambda^{sM},\mu^{sM}), \quad k = 1,\ldots,l, \qquad (36)$$

$$\mu_t^{sM+M} = \mu_t^{sM}\left[1 - a\,\tilde{W}_t(\lambda^{sM},\mu^{sM})\right], \quad t = 1,\ldots,w, \qquad (37)$$

where $M = I + J + 2$, $\lambda_k^{(I+J+2)s} = \ln z_k^{(I+J+2)s}$, $k = 1,\ldots,l$, and

$$\tilde{X}_k(\lambda,\mu) = X_k(\exp(\lambda),\mu), \qquad \tilde{W}_t(\lambda,\mu) = W_t(\exp(\lambda),\mu). \qquad (38)$$
Definition. An algorithm is called G-convergent if there exist a set $G \subset R^l \times R^w_+$ and positive numbers $a(G)$, $b(G)$ such that for all $\gamma \in (0, a(G))$, $a \in (0, b(G))$ and $(\lambda^0, \mu^0) \in G$ the algorithm converges to the point $(\lambda^*, \mu^*)$, with the convergence being linear. In this case, the point $(\lambda^*, \mu^*)$ is the solution of the system (14) or (16), where $\lambda_k = \ln z_k$.

The convergence study is based on the continuous analogue of the algorithm (36), (37), namely on the auxiliary system of differential equations derived from (36), (37) as $\gamma \to 0$, $a \to 0$:

$$\frac{d\lambda_k}{dt} = \ln\tilde{X}_k(\lambda,\mu), \quad k = 1,\ldots,l, \qquad (39)$$

$$\frac{d\mu_t}{dt} = -\mu_t\,\tilde{W}_t(\lambda,\mu), \quad t = 1,\ldots,w. \qquad (40)$$
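Viewed numerically, (36), (37) is the Euler discretization of (39), (40) with steps $\gamma$ and $a$; the sketch below integrates the ODEs with a small common step to trace the continuous analogue. The callables log_X and W_t, standing for $\ln\tilde{X}_k$ and $\tilde{W}_t$ built from (14) or (16), are assumptions of this sketch.

```python
# Euler integration of the continuous analogue (39), (40); as h -> 0 the
# trajectory approaches the ODE solution whose stability Theorem 1 relies on.
import numpy as np

def integrate_ode(lam, mu, log_X, W_t, h=1e-3, steps=20000):
    for _ in range(steps):
        dl = log_X(lam, mu)          # d lambda_k / dt = ln X~_k, eq. (39)
        dm = -mu * W_t(lam, mu)      # d mu_t / dt = -mu_t W~_t, eq. (40)
        lam, mu = lam + h * dl, mu + h * dm
    return lam, mu
```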
The convergence conditions of the iterative process (36), (37) are formulated in the following theorem.

Theorem 1. Let

(a) $\tilde{X}_1,\ldots,\tilde{X}_l$, $\tilde{W}_1,\ldots,\tilde{W}_w$ be twice-differentiable functions,
(b) the Jacobian of the system (39), (40) at the point $(\lambda^*, \mu^*)$ be a Hurwitz matrix,¹
(c) $G$ be a compact set, and
(d) there exist $\varepsilon > 0$ such that $\|\lambda(t) - \lambda^*\| + \|\mu(t) - \mu^*\| \le \varepsilon$ for all $(\lambda^0, \mu^0) \in G$ with some $t > 0$, where $(\lambda(t), \mu(t))$ is the solution of the equations (39), (40) with initial state $\lambda(0) = \lambda^0$, $\mu(0) = \mu^0$.
Then the $(p+q)$-MD algorithm (19), (20), or (36), (37), is G-convergent.

According to the proof idea of similar theorems (see Popkov, 1995, p. 147), the key points of the proof are the stability of the equations (39), (40) (condition (d)) and the Hurwitz property of the Jacobian (condition (b)). We formulate these results in the following lemmas.

Lemma 1. The point $(\lambda^*, \mu^*)$ is a singular point of the differential equations (39), (40) that is asymptotically stable for any initial conditions $\lambda^0 \in R^l$, $\mu^0 \in R^w_+$.

Lemma 2. The Jacobian of the system (39), (40) is a Hurwitz matrix at the point $(\lambda^*, \mu^*)$.

Proofs of these lemmas are given in Appendix A. According to these lemmas, the conditions (b) and (d) of the main theorem are valid.

(2) Algorithms $(p+q)$-MDP-B and $(p+q)$-MDP-F.

Theorem 2. Let the $(p+q)$-MD algorithm (19), (20) be G-convergent to the solution $(y^*, \lambda^*, \mu^*)$ of the problem (1), (7) or (3), (7), respectively. Then the algorithms (21), (22) and (24), (25) G-converge to this solution.

The proof of the theorem is based on checking the equivalence of the sequence generated by the $(p+q)$-MD algorithm with initial points $(\lambda^0, \mu^0) \in G$, together with the expressions (13) or (15), and the sequence generated by the $(p+q)$-MDP-B or $(p+q)$-MDP-F algorithm with the same initial points.
¹ A Hurwitz matrix is a matrix whose eigenvalues have negative real parts.
Appendix A

(1) Dual Lagrange function. Consider the Lagrange function (8) and the dependencies (13) or (15) of the primal variables y on the dual variables $\lambda$ and $\mu$, which are obtained from the stationarity conditions of the Lagrange function with respect to the primal variables. Substituting $y_n(\lambda,\mu)$ from (13) or (15) into (8), we obtain the dual Lagrange function

$$\tilde{L}(\lambda,\mu) = L[y(\lambda,\mu),\lambda,\mu]. \qquad (41)$$
The components of its gradient are as follows:

$$\nabla_{\lambda_k}\tilde{L}(\lambda,\mu) = v_k - \sum_{n=1}^{m} t_{kn}\,y_n(\lambda,\mu), \quad k = 1,\ldots,l,$$

$$\nabla_{\mu_t}\tilde{L}(\lambda,\mu) = v_{t+l} - \sum_{n=1}^{m} t_{(t+l),n}\,y_n(\lambda,\mu), \quad t = 1,\ldots,w. \qquad (42)$$
ð43Þ
t
~ ; l Þ lt rlt Lðk
¼ 0;
lt
P 0;
t ¼ 1; . . . ; w.
~ Consider the Hessian of the function L: " # ~ lÞ rki ;l Lðk; ~ lÞ rki ;kk Lðk; t ; N ðk; lÞ ¼ ~ lÞ rl ;l Lðk; ~ lÞ rlj ;kk Lðk; j t where (i, k) = 1, . . . , l, (j, t) = 1, . . . , w; m m X X oy oy ~ lÞ ¼ ~ lÞ ¼ rki ;kk Lðk; tkn n ; rki ;lt Lðk; tðtþlÞ;n n ; ok oki i n¼1 n¼1 m m X X oy oy ~ lÞ ¼ ~ lÞ ¼ rlj ;kk Lðk; tkn n ; rlj ;lt Lðk; tðtþlÞ;n n . ol olj j n¼1 n¼1
ð44Þ
ð45Þ
From the expressions (13) or (15) it is seen that the derivatives

$$\frac{\partial y_n}{\partial\lambda_i} \le 0, \qquad \frac{\partial y_n}{\partial\mu_j} \le 0$$

for all $(\lambda,\mu) \in R^l \times R^w_+$. Hence, the elements of the Hessian (44) are nonnegative. Using the multiplicative construction of the expressions (13) or (15), it can be shown that the quadratic form $K = \langle N\xi, \xi\rangle$ is strictly positive definite for any $(\lambda,\mu)$, $\xi \in R^l \times R^w_+$ (see Popkov, 1995). Hence, the function $\tilde{L}$ is strictly convex. From the equations (14), (16) it follows that $(|\lambda^*|, |\mu^*|) \le c < \infty$ and $\tilde{L} \ge \tilde{L}_{\min} > -\infty$.

(2) Proof of Lemma 1. Consider the dual function $\tilde{L}(\lambda,\mu)$ and determine its derivative along the trajectories of the system (39), (40). Taking into account the expressions (20), (38) and (42), we obtain:
$$\frac{d\tilde{L}}{dt} = \sum_{k=1}^{l}\nabla_{\lambda_k}\tilde{L}\,\ln\left(1 - \frac{1}{v_k}\nabla_{\lambda_k}\tilde{L}\right) - \sum_{j=1}^{w}\left(\nabla_{\mu_j}\tilde{L}\right)^2\mu_j. \qquad (46)$$
We can see that

$$\frac{d\tilde{L}}{dt} < 0 \quad \text{for all } (\lambda,\mu) \ne (\lambda^*,\mu^*),\ \lambda \in R^l,\ \mu \in R^w_+,$$
$$\frac{d\tilde{L}}{dt} = 0 \quad \text{for } (\lambda^*,\mu^*). \qquad (47)$$

Hence, $\tilde{L}$ is a Lyapunov function for the system (39), (40). □

(3) Proof of Lemma 2. The Jacobian of the system (39), (40) at the point $(\lambda^*, \mu^*)$ can be represented in the following form:

$$J(\lambda^*,\mu^*) = \begin{bmatrix}-\dfrac{1}{v_k}\nabla_{\lambda_i,\lambda_k}\tilde{L}(\lambda^*,\mu^*) & -\dfrac{1}{v_k}\nabla_{\mu_j,\lambda_k}\tilde{L}(\lambda^*,\mu^*)\\[4pt] -\mu_t^*\,\nabla_{\lambda_i,\mu_t}\tilde{L}(\lambda^*,\mu^*) & J_{\mu_j,\mu_t}\end{bmatrix}, \qquad (48)$$

where

$$J_{\mu_j,\mu_t} = \begin{cases}-\nabla_{\mu_t}\tilde{L}(\lambda^*,\mu^*) - \mu_t^*\,\nabla^2_{\mu_t}\tilde{L}(\lambda^*,\mu^*) & \text{for } j = t,\\ -\mu_t^*\,\nabla_{\mu_j,\mu_t}\tilde{L}(\lambda^*,\mu^*) & \text{for } j \ne t.\end{cases} \qquad (49)$$

In these expressions $(i,k) = 1,\ldots,l$; $(j,t) = 1,\ldots,w$. Introduce the following designations:

$$S(\lambda^*,\mu^*) = \begin{bmatrix}\mathrm{diag}\left(\dfrac{1}{v_k}\right) & 0\\ 0 & \mathrm{diag}(\mu_t^*)\end{bmatrix}, \qquad (50)$$

$$W(\lambda^*,\mu^*) = \begin{bmatrix}0 & 0\\ 0 & \mathrm{diag}\left(\nabla_{\mu_t}\tilde{L}(\lambda^*,\mu^*)\right)\end{bmatrix}. \qquad (51)$$

Then we can represent the Jacobian (48) in the following form:

$$J(\lambda^*,\mu^*) = -S(\lambda^*,\mu^*)\,N(\lambda^*,\mu^*) - W(\lambda^*,\mu^*).$$

As the dual function $\tilde{L}(\lambda,\mu)$ is strictly convex, the Jacobian $J(\lambda^*,\mu^*)$ is a Hurwitz matrix for $\mu^* \in R^w_+$. □
References

Aliev, A.S., Dubov, Yu.A., Izmailov, R.N., Popkov, Yu.S., 1985. Convergence of multiplicative algorithms for solving convex programming problems. In: Dynamics of Nonhomogeneous Systems, Proceedings of VNIISI. Nauka, Moscow, pp. 59–67 (in Russian).
Batten, D.F., Roy, J.R., 1982. Entropy and economic modelling. Environ. Planning A 14, 1037–1061.
Boltzmann, L., 1871. On the link between the second law of the mechanical theory of heat and probability theory in theorems of thermal equilibrium. In: Classics of Science. Nauka, Moscow.
Byrne, C., 1993. Iterative image reconstruction algorithms based on cross-entropy minimization. IEEE Trans. Image Process. 2 (1), 96–103.
Byrne, C., 1996. Block-iterative methods for image reconstruction from projections. IEEE Trans. Image Process. 8, 275–291.
Censor, Y., 1981. Row-action methods for huge and sparse systems and their application. SIAM Rev. 23 (4), 444–464.
Censor, Y., Segman, J., 1987. On block-iterative entropy maximization. J. Inform. Optim. Sci. 8, 275–291.
Censor, Y., Zenios, S.A., 1997. Parallel Optimization: Theory, Algorithms, and Applications. Oxford University Press, New York.
Dubov, Yu.A., Imelbaev, Sh.S., Popkov, Yu.S., 1983. Multiplicative schemes for iterative optimization algorithms. Sov. Math. Dokl. 22 (2), 524–526.
Erlander, S., 1980. Optimal Spatial Interaction and Gravity Models. Springer-Verlag, Berlin.
Erlander, S., 1981. Entropy in linear programming. Math. Program. 21, 137–151.
Fang, S.-C., Rajasekera, J.R., Tsao, H.-S.J., 1997. Entropy Optimization and Mathematical Programming. Kluwer Academic Publishers, Dordrecht.
Gordon, R., Bender, R., Herman, G., 1970. Algebraic reconstruction techniques (ART) for three-dimensional electron microscopy and X-ray photography. J. Theoret. Biol. 29, 471–481.
Gull, S.F., Skilling, J., 1984. Maximum entropy method in image processing. IEEE Proc. 131 (Pt. F, 6), 646–659.
Herman, G.T., 1980. Image Reconstruction from Projections: The Fundamentals of Computerized Tomography. Academic Press, New York.
Herman, G.T., 1982. Mathematical optimization versus practical performance: A case study based on the maximum entropy criterion in image reconstruction. Math. Program. Study 20, 96–112.
Imelbaev, Sh.S., Shmulyan, B.L., 1978. Modelling of stochastic communication systems. In: Wilson, A.G. (Ed.), Entropy Methods in Complex System Modelling. Nauka, Moscow, pp. 230–274 (in Russian).
Iusem, A.N., Svaiter, B.F., Teboulle, M., 1996. Multiplicative interior gradient methods for minimization over the nonnegative orthant. SIAM J. Control Optim. 34 (1), 319–446.
Kapur, J.N., 1989. Maximum Entropy Models in Science and Engineering. John Wiley & Sons, New York.
Landau, L.D., Lifshitz, E.M., 1964. Statistical Physics. Nauka, Moscow.
Lent, A., 1977. A convergent algorithm for maximum entropy image reconstruction, with a medical X-ray application. In: Shaw, R. (Ed.), SPSE Conference Proceedings, pp. 247–449.
Leonardy, G., 1978. Optimal facility location by accessibility maximization. Environ. Planning A 10, 1287–1305.
Polyak, B.T., 1987. Introduction to Optimization. Optimization Software, New York.
Popkov, Yu.S., 1984. Mathematical models of stationary states of macrosystems. Autom. Remote Control 45 (6, Part 2), 795–802.
Popkov, Yu.S., 1986. Mathematical models of macrosystem stationary states: Principles of formulation, analysis, and algorithms. Large Scale Syst. 10, 1–20.
Popkov, Yu.S., 1988. A multiplicative algorithm for Fermi models of macrosystems. Autom. Remote Control 49 (7, Part 1), 920–927.
Popkov, Yu.S., 1995. Macrosystem Theory and its Applications (Springer Series LNCIS). Springer-Verlag, London.
Popkov, Yu.S., 1995a. Multiplicative algorithms for finding nonnegative solutions of convex programming problems. Control Cybernet. 24 (2), 155–169.
Popkov, Yu.S., 1996. Multiplicative algorithms for determining positive solutions of nonlinear equations. J. Comput. Appl. Math. 69, 27–37.
Smith, C.P., Grandy, W.T. (Eds.), 1985. Maximum-Entropy and Bayesian Methods in Inverse Problems. D. Reidel Publ., Dordrecht.
Wilson, A.G., 1970. Entropy in Urban and Regional Modelling. Pion Ltd., London.