Chapter VI. Submodular Function Minimization

The present chapter deals with developments in algorithms for submodular function minimization. Readers are also referred to the nice survey [McCormick05] on submodular function minimization.

13. Symmetric Submodular Function Minimization: Queyranne's Algorithm

H. Nagamochi and T. Ibaraki [Nagamochi+Ibaraki92] devised an efficient algorithm for finding a minimum cut in a capacitated undirected network without using any flow algorithms, where a cut means a nonempty proper subset $U$ of the vertex set $V$, the capacity of a cut $U$ is the sum of the capacities of the edges between $U$ and $V - U$, and a minimum cut is a cut of minimum capacity. Then A. Frank [Frank94a] and M. Stoer and F. Wagner [Stoer+Wagner95] independently gave simple proofs of the validity of the Nagamochi-Ibaraki min-cut algorithm. Based on the results of [Frank94a] and [Stoer+Wagner95], M. Queyranne [Queyranne95] extended the Nagamochi-Ibaraki algorithm to a combinatorial polynomial algorithm for minimizing symmetric submodular functions. Although the problem of minimizing symmetric submodular functions is quite different from that of minimizing general submodular functions, Queyranne's result recaptured researchers' attention to submodular function minimization.

Let $V$ be a nonempty finite set of cardinality $|V| = n \ge 2$ and let $f: 2^V \to \mathbf{R}$ be a submodular function. A submodular function $f$ is called symmetric if $f$ satisfies

$$f(X) = f(V - X) \qquad (X \subseteq V). \qquad (13.1)$$

Throughout this section we assume that $f: 2^V \to \mathbf{R}$ is a symmetric submodular function with $f(\emptyset) = f(V) = 0$. Note that under this assumption $f(X) \ge 0$ $(X \subseteq V)$, since $2f(X) = f(X) + f(V - X) \ge f(V) + f(\emptyset) = 0$. Hence $\emptyset$ and $V$ are minimizers of $f$ on $2^V$. The problem of minimizing a

symmetric submodular function $f$ is to minimize $f$ over $2^V - \{\emptyset, V\}$. We call such a minimizer a min-cut of $f$.

Remark: The term "symmetric submodular function" was introduced in [Fuji83]. But the author thought that the term "symmetric" was confusing, and hence tacitly avoided the description of symmetric submodular functions in the first edition of this monograph. (In the ordinary terminology a set function $h: 2^V \to \mathbf{R}$ is called symmetric if for each $X \subseteq V$ the value of $h(X)$ depends only on the cardinality of $X$.) However, since Queyranne's paper [Queyranne95] appeared, the term "symmetric submodular function" has widely been accepted, so the author also adopts it here.

For any distinct $s, t \in V$, a set $U \subseteq V$ with $|\{s, t\} \cap U| = 1$ is called an $s$-$t$ cut, and a minimum $s$-$t$ cut $U$ is an $s$-$t$ cut of minimum value $f(U)$. In the same way as Nagamochi and Ibaraki's min-cut algorithm, Queyranne's algorithm picks up elements of $V$ one by one, which determines a linear ordering $(v_1, v_2, \cdots, v_n)$ of $V$ in such a way that $\{v_n\}$ is a minimum $v_{n-1}$-$v_n$ cut. The ordering $(v_1, v_2, \cdots, v_n)$ is called a maximum-adjacency ordering (or an MA-ordering for short) in Nagamochi and Ibaraki's algorithm, so we also call this ordering an MA-ordering (also see [Nagamochi00, 04] and [Nagamochi+Ibaraki02] for related topics). The following argument is based on [Fuji98].

While determining an MA-ordering $(v_1, v_2, \cdots, v_n)$, we define $U_k = \{v_1, v_2, \cdots, v_k\}$ $(k = 0, 1, \cdots, n)$ and also for each $k = 1, 2, \cdots, n-1$ define $w_k: 2^V \to \mathbf{R}$ by

$$w_k(C) = f(C) - (1/2)\{f(C \cap \bar{U}_{k-1}) + f(\bar{C} \cap \bar{U}_{k-1}) - f(U_{k-1})\} \qquad (13.2)$$

for all $C \subseteq V$, where $\bar{X}$ for $X \subseteq V$ denotes the complement of $X$ in $V$. It should be noted that $w_k$ is symmetric (i.e., $w_k(C) = w_k(V - C)$) but not necessarily submodular.

Queyranne's Algorithm for Finding an MA-ordering
Step 0: Choose an element $v_1 \in V$. Put $U_1 \leftarrow \{v_1\}$ and $k \leftarrow 2$.
Step 1: Let $v_k$ be an element of $V - U_{k-1}$ that attains the maximum value of $w_k(\{u\})$ over $u \in V - U_{k-1}$, where $w_k$ is defined by (13.2).
Step 2: If $k = n$, then return $(v_1, v_2, \cdots, v_n)$. Otherwise put $U_k \leftarrow U_{k-1} \cup \{v_k\}$ and $k \leftarrow k + 1$, and go to Step 1. (End)
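As a concrete illustration, the selection loop above can be sketched in Python (a minimal sketch, not the book's code; the graph cut function below is an illustrative stand-in for a symmetric submodular $f$ with $f(\emptyset) = f(V) = 0$, and the selection rule uses the single-element value $w_k(\{u\}) = (f(\{u\}) + f(U_{k-1}) - f(U_{k-1} \cup \{u\}))/2$, which follows from (13.2) and the symmetry of $f$):

```python
from itertools import combinations

def cut(adj, X):
    """Cut capacity of (X, V - X): a symmetric submodular function with
    f(empty) = f(V) = 0.  `adj` maps edges (a, b) to weights."""
    X = set(X)
    return sum(w for (a, b), w in adj.items() if (a in X) != (b in X))

def ma_ordering(V, f):
    """Queyranne's MA-ordering (Steps 0-2 above): starting from V[0],
    repeatedly append an element u maximizing w_k({u})."""
    order, rest = [V[0]], sorted(V[1:])
    while rest:
        fU = f(order)  # f(U_{k-1})
        u = max(rest, key=lambda x: (f([x]) + fU - f(order + [x])) / 2.0)
        order.append(u)
        rest.remove(u)
    return order
```

By Theorem 13.2 below, the last two elements of the returned ordering form a pendant pair: $\{v_n\}$ is a minimum $v_{n-1}$-$v_n$ cut, which a brute-force check over all $v_{n-1}$-$v_n$ cuts confirms on small instances.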


Then we have the following.

Lemma 13.1: For any $k = 1, 2, \cdots, n-1$ and any $u \in V - U_k$, $\{u\}$ minimizes $w_k(C)$ over $u$-$v_k$ cuts $C$.

(Proof) We show this lemma by induction. For $k = 1$ we have $U_0 = \emptyset$, so that $w_1(C) = 0$ for all $C \subseteq V$. Hence the statement for $k = 1$ holds. Now, suppose that it holds for $k = l$ with $1 \le l < n-1$. Consider any $u \in V - U_{l+1}$ and any $u$-$v_{l+1}$ cut $C$.

If $C$ is a $u$-$v_l$ cut, we assume without loss of generality that $\{u, v_l, v_{l+1}\} \cap C = \{v_l, v_{l+1}\}$. Then, by the induction hypothesis ($w_l(C) \ge w_l(\{u\})$) and by the definitions of $w_l$ and $w_{l+1}$ we have

$$w_{l+1}(C) - w_{l+1}(\{u\}) \ge w_l(C) - w_l(\{u\}) \ge 0, \qquad (13.3)$$

where the first inequality follows from the symmetry and submodularity of $f$ as

$$2\{w_{l+1}(C) - w_{l+1}(\{u\}) - w_l(C) + w_l(\{u\})\} = f((C \cap \bar{U}_l) \cup \{v_l\}) - f(C \cap \bar{U}_l) - f(\bar{U}_{l-1} - \{u\}) + f(\bar{U}_l - \{u\}) \ge f(\bar{U}_{l-1} - \{u\}) + f(C \cap \bar{U}_l) - f(C \cap \bar{U}_l) - f(\bar{U}_{l-1} - \{u\}) = 0. \qquad (13.4)$$

(Here the inequality is submodularity applied to $(C \cap \bar{U}_l) \cup \{v_l\}$ and $\bar{U}_l - \{u\}$, whose union is $\bar{U}_{l-1} - \{u\}$ and whose intersection is $C \cap \bar{U}_l$.)

Next, if $C$ is a $v_l$-$v_{l+1}$ cut, we assume that $\{u, v_l, v_{l+1}\} \cap C = \{v_{l+1}\}$. Then, by the symmetry and submodularity of $f$ we have

$$w_{l+1}(C) - w_l(C) \ge w_{l+1}(\{v_{l+1}\}) - w_l(\{v_{l+1}\}), \qquad (13.5)$$

since

$$2\{w_{l+1}(C) - w_l(C) - w_{l+1}(\{v_{l+1}\}) + w_l(\{v_{l+1}\})\} = f(\bar{C} \cap \bar{U}_{l-1}) + f(\bar{U}_l - \{v_{l+1}\}) - f(\bar{C} \cap \bar{U}_l) - f(\bar{U}_{l-1} - \{v_{l+1}\}) \ge 0. \qquad (13.6)$$

It follows from the induction hypothesis ($w_l(C) \ge w_l(\{v_{l+1}\})$) and (13.5) that $w_{l+1}(C) \ge w_{l+1}(\{v_{l+1}\})$. Moreover, by the definition of $v_{l+1}$ we have

$$w_{l+1}(\{v_{l+1}\}) \ge w_{l+1}(\{u\}). \qquad (13.7)$$

Hence, from (13.7) we get

$$w_{l+1}(C) \ge w_{l+1}(\{u\}). \qquad (13.8)$$

Consequently, the statement for $k = l + 1$ holds. Q.E.D.
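The symmetry of $w_k$ and the statement of Lemma 13.1 are easy to check numerically on a small example (a hedged Python sketch; the graph cut function is an illustrative symmetric submodular $f$, and `w_key` implements (13.2) with complements taken in $V$):

```python
from itertools import combinations

def cut(adj, X):
    """Cut capacity: a symmetric submodular function with f(empty) = f(V) = 0."""
    X = set(X)
    return sum(w for (a, b), w in adj.items() if (a in X) != (b in X))

def w_key(f, V, U_prev, C):
    """w_k(C) of (13.2), where U_prev = U_{k-1} and bars denote complements in V."""
    V, C, U = set(V), set(C), set(U_prev)
    Ubar = V - U
    return f(C) - (f(C & Ubar) + f((V - C) & Ubar) - f(U)) / 2.0

adj = {(0, 1): 3.0, (1, 2): 1.0, (2, 3): 3.0, (0, 3): 1.0}
V = [0, 1, 2, 3]
f = lambda X: cut(adj, X)

# w_k is symmetric: w_k(C) == w_k(V - C) for every C and every prefix U_{k-1}.
for U_prev in ([0], [0, 1], [0, 1, 2]):
    for r in range(len(V) + 1):
        for C in combinations(V, r):
            a = w_key(f, V, U_prev, C)
            b = w_key(f, V, U_prev, set(V) - set(C))
            assert abs(a - b) < 1e-9
```

For the MA-ordering $(0, 1, 2, 3)$ of this graph one can also confirm Lemma 13.1 directly; for instance, $\{3\}$ attains the minimum of $w_2$ over all $3$-$v_2$ cuts.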


For $k = n - 1$ and any $v_{n-1}$-$v_n$ cut $C$ we have

$$w_{n-1}(C) = f(C) - (1/2)\{f(\{v_{n-1}\}) + f(\{v_n\}) - f(\{v_{n-1}, v_n\})\}. \qquad (13.9)$$

Hence from Lemma 13.1 we have

Theorem 13.2: For the MA-ordering $(v_1, v_2, \cdots, v_n)$ obtained by Queyranne's algorithm, $\{v_n\}$ is a minimum $v_{n-1}$-$v_n$ cut for $f$.

After finding a minimum $v_{n-1}$-$v_n$ cut $C_0 = \{v_n\}$ for $f$, we restrict the domain $2^V$ of $f$ to $\mathcal{D}_1 = \{X \mid X \subseteq V,\ |\{v_{n-1}, v_n\} \cap X| \ne 1\}$. Denote the restriction of $f$ to $\mathcal{D}_1$ by $f_1$. The new symmetric submodular system $(\mathcal{D}_1, f_1)$ is the aggregation of $(2^V, f)$ by the partition $\{\{v_1\}, \cdots, \{v_{n-2}\}, \{v_{n-1}, v_n\}\}$ (see Section 3.1.d). Considering the simplification $(\tilde{\mathcal{D}}_1, \tilde{f}_1)$ of $(\mathcal{D}_1, f_1)$, or regarding $\{v_{n-1}, v_n\}$ as a single element, we apply Queyranne's algorithm described above to the simplification $(\tilde{\mathcal{D}}_1, \tilde{f}_1)$ to find a minimum cut $\tilde{C}_1$ of $\tilde{f}_1$. The minimum cut $\tilde{C}_1$ gives a cut $C_1$ of $f$ that attains the minimum of $f(C)$ over $C \in \mathcal{D}_1$. We repeat this process $n - 1$ times to get a set of cuts $C_0, C_1, \cdots, C_{n-2}$ of $f$, where each $C_{i-1}$ $(i = 1, 2, \cdots, n-1)$ is a cut obtained by the $i$th application of Queyranne's algorithm. A cut $C_{i^*}$ that attains the minimum of $f(C_0), f(C_1), \cdots, f(C_{n-2})$ is a minimum cut of the original $f$.

Theorem 13.3: Queyranne's algorithm finds a minimum cut of $f$ in $O(n^3)$ time, where we assume a function evaluation oracle for $f$.

Remark: Queyranne [Queyranne95] has also shown that for a symmetric submodular function $f: 2^V \to \mathbf{R}$ and a pair of distinct $s, t \in V$, finding a minimum $s$-$t$ cut is as hard as minimizing a (general nonsymmetric) submodular function. Nagamochi and Ibaraki [Nagamochi+Ibaraki98] have shown that Queyranne's algorithm also works for submodular functions satisfying $f(X) + f(Y) \ge f(X - Y) + f(Y - X)$ $(X, Y \subseteq V)$, which are slightly more general than symmetric submodular functions (also see [Rizzi00]). Also see [Baiou+Barahona+Mahjoub00] for an application of Queyranne's algorithm. It may be worth pointing out that an MA-ordering algorithm provides us with a new maximum-flow algorithm for directed capacitated networks ([Fuji03b], [Fuji+Isotani03]).

14. Submodular Function Minimization


Grötschel, Lovász and Schrijver devised the first weakly polynomial algorithm for submodular function minimization [Grötschel+Lovász+Schrijver81] and also the first strongly polynomial algorithm [Grötschel+Lovász+Schrijver88], both based on the ellipsoid method [Khachiyan79, 80] (see Section 7.1.a). It had been an open problem since 1981 to find a 'combinatorial' polynomial algorithm for minimizing submodular functions (see earlier work of [Cunningham84, 85a], [Narayanan95] and [Sohoni92]). This long-standing open problem was resolved in 1999 independently by S. Iwata, L. Fleischer and S. Fujishige [IFF01] and A. Schrijver [Schrijver00] in different ways, but based on the same framework due to W. H. Cunningham [Cunningham84, 85a]. We describe their polynomial algorithms for minimizing submodular functions.

It should be noted that there are problems that heavily rely on algorithms for general submodular function minimization. Such problems are found in [Hoppe+Tardos00] for dynamic flows, in [Tamir93] for facility location, in [Han79] for multiterminal source coding (also see [Fuji78c]), etc.

Let $V$ be a nonempty finite set and consider a submodular function $f: 2^V \to \mathbf{R}$ with $f(\emptyset) = 0$. The following min-max relation due to Edmonds [Edm70] is essential in submodular function minimization.

Lemma 14.1:
$$\min\{f(X) \mid X \subseteq V\} = \max\{x(V) \mid x \le 0,\ x \in \mathrm{P}(f)\}, \qquad (14.1)$$
where $\mathrm{P}(f)$ is the submodular polyhedron associated with $f$. (Moreover, if $f$ is integer-valued, there exists an integral maximizer $x$ of the right-hand side of (14.1).)

The min-max relation (14.1) was explicitly stated in [Fuji84c] for the first time in the literature, to the author's knowledge, though it easily follows from [Edm70]. Equivalently, it can be rewritten in terms of the associated base polyhedron $\mathrm{B}(f)$ as follows.

Lemma 14.2:
$$\min\{f(X) \mid X \subseteq V\} = \max\{x^-(V) \mid x \in \mathrm{B}(f)\}, \qquad (14.2)$$
where $x^-$ for $x \in \mathbf{R}^V$ is a vector defined by
$$x^-(v) = \min\{0, x(v)\} \qquad (v \in V). \qquad (14.3)$$
(Moreover, if $f$ is integer-valued, there exists an integral maximizer $x$ of the right-hand side of (14.2).)
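Since both sides of (14.1)-(14.2) concern the minimum of $f$ over all of $2^V$, a tiny brute-force reference minimizer is handy for testing the polynomial algorithms described below (an exponential-time Python sketch for small ground sets only; the example in the usage note, a cut function minus a modular weight, is an illustrative submodular $f$ with $f(\emptyset) = 0$ and is not taken from the text):

```python
from itertools import combinations

def brute_force_sfm(V, f):
    """Reference minimizer of f over 2^V by plain enumeration.
    Exponential time: intended only as a test oracle for tiny V."""
    best_X, best = set(), f(set())
    for r in range(1, len(V) + 1):
        for X in combinations(V, r):
            if f(set(X)) < best:
                best_X, best = set(X), f(set(X))
    return best_X, best
```

For example, with $f(X) = \kappa(X) - \sum_{v \in X} w(v)$ for a cut function $\kappa$ and nonnegative weights $w$ (a submodular function that can take negative values), the returned pair can be checked against hand computation.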


Note that the min-max relation (14.1) (or (14.2)) is a strong duality. We have the following easy weak duality:

Lemma 14.3:
$$\forall X \subseteq V,\ \forall x \in \mathrm{B}(f):\ f(X) \ge x^-(V). \qquad (14.4)$$

Because of this weak duality, if $f$ is integer-valued and the duality gap $f(X) - x^-(V)$ is less than one, then we see that $X$ is a minimizer of $f$. This lays a basis for obtaining the weakly polynomial algorithm for submodular function minimization of Iwata, Fleischer and Fujishige [IFF01].

Given a base $x$, consider a set $X \subseteq V$ of nonpositive components of $x$ such that
$$\{v \in V \mid x(v) < 0\} \subseteq X \subseteq \{v \in V \mid x(v) \le 0\}. \qquad (14.5)$$
Then we see from (14.2) or (14.4) that if $X$ is $x$-tight, i.e.,
$$x(X) = f(X), \qquad (14.6)$$
then $f(X) = \min\{f(Y) \mid Y \subseteq V\}$, i.e., $X$ is a minimizer of $f$. If there is no $x$-tight set $X$ satisfying (14.5), we can increase some negative component of the base $x$ and simultaneously decrease some positive component of $x$ by the same amount to get a new base. We may repeat this process until we eventually get an $x$-tight set $X$ satisfying (14.5). This is, however, a generic algorithm that is not easy to implement efficiently. Hence, instead of directly treating a base, we express a base as a convex combination of extreme bases, since extreme bases are easy to transform. This is the framework of Cunningham ([Cunningham84, 85a]).

Let $L = (v_1, v_2, \cdots, v_n)$ be a linear ordering of $V$. For each $i = 1, 2, \cdots, n$ denote by $L(v_i)$ the set of the first $i$ elements of $L$, i.e., $L(v_i) = \{v_1, v_2, \cdots, v_i\}$. Then, the linear ordering $L$ determines an extreme base $y \in \mathrm{B}(f)$ by the greedy algorithm of Edmonds [Edm70] and Shapley [Shapley71] as
$$y(v_i) = f(L(v_i)) - f(L(v_{i-1})) \qquad (i = 1, 2, \cdots, n), \qquad (14.7)$$
where $L(v_0) = \emptyset$ (see Section 3.2.b). We see from the greedy algorithm that each initial segment $L(v_i)$ of $L$ is $y$-tight, i.e., $y(L(v_i)) = f(L(v_i))$, for $i = 1, 2, \cdots, n$.

When a base $x$ is expressed as a convex combination of extreme bases $y_i$ $(i \in I)$ as
$$x = \sum_{i \in I} \lambda_i y_i \qquad (14.8)$$


with $\lambda_i \ge 0$ $(i \in I)$ and $\sum_{i \in I} \lambda_i = 1$, we can easily see that for a set $W \subseteq V$
$$W \text{ is } x\text{-tight} \iff \forall i \in I:\ W \text{ is } y_i\text{-tight}. \qquad (14.9)$$
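The greedy construction (14.7), the tightness of initial segments, and the weak duality (14.4) are easy to verify on a small instance (a hedged Python sketch; the cut function of a 3-vertex graph is an illustrative submodular $f$, not an example from the text):

```python
from itertools import combinations, permutations

def cut(adj, X):
    """Cut function of a weighted graph: submodular with f(empty) = 0."""
    X = set(X)
    return sum(w for (a, b), w in adj.items() if (a in X) != (b in X))

def greedy_extreme_base(f, L):
    """Extreme base of B(f) from a linear ordering L via (14.7):
    y(v_i) = f(L(v_i)) - f(L(v_{i-1}))."""
    y, prefix, prev = {}, [], 0.0   # prev = f(L(v_0)) = f(empty) = 0
    for v in L:
        prefix.append(v)
        cur = f(prefix)
        y[v] = cur - prev
        prev = cur
    return y

adj = {("a", "b"): 2.0, ("b", "c"): 1.0, ("a", "c"): 1.0}
V = ["a", "b", "c"]
f = lambda X: cut(adj, X)

for L in permutations(V):
    y = greedy_extreme_base(f, list(L))
    # initial segments are y-tight: y(L(v_i)) = f(L(v_i))
    for i in range(1, len(L) + 1):
        assert abs(sum(y[v] for v in L[:i]) - f(L[:i])) < 1e-9
    # weak duality (14.4): f(X) >= y^-(V) for every X
    y_minus = sum(min(0.0, t) for t in y.values())
    for r in range(len(V) + 1):
        for X in combinations(V, r):
            assert f(X) >= y_minus - 1e-9
```

Note that the maximum in (14.2) need not be attained at an extreme base: for a single edge of weight 1, every extreme base has $y^-(V) = -1$ while $\min f = 0$. This is exactly why the algorithms below maintain a convex combination (14.8) of extreme bases rather than a single one.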

14.1. The Iwata-Fleischer-Fujishige Algorithm

In this section we give the algorithm devised by Iwata, Fleischer and Fujishige [IFF01] (also see [Fuji03a]). We assume that we are given an oracle for the function evaluation of a submodular function $f$, i.e., computing $f(X)$ for any $X \subseteq V$ requires $O(1)$ time.

(a) A weakly polynomial algorithm

We describe the weakly polynomial algorithm for submodular function minimization of [IFF01]. The key techniques are augmenting-path and scaling techniques [Iwata97] and an exchange operation technique to search for augmenting paths [Fleischer+Iwata+McCormick02], both developed for submodular flows. The former technique of [Iwata97] overcomes the difficulty arising in rounding base polyhedra, and the latter technique of [Fleischer+Iwata+McCormick02] avoids exchange operations on an augmenting path. It should be mentioned that a technique related to the former was also proposed in [Narayanan95] and one related to the latter in [Goldfarb+Jin99].

We first consider an integer-valued submodular function $f$ defined on $2^V$. Let $\mathcal{N}_V$ be the complete directed network with vertex set $V$ and arc set $V \times V$. For a given parameter $\delta > 0$ a flow $\varphi: V \times V \to \mathbf{R}$ in $\mathcal{N}_V$ is called $\delta$-feasible if it satisfies
$$0 \le \varphi(u, v) \le \delta \qquad (u, v \in V). \qquad (14.10)$$
For any $\delta$-feasible flow $\varphi$ in $\mathcal{N}_V$ we assume that $\varphi(v, v) = 0$ for $v \in V$ and
$$\forall u, v \in V:\ (\varphi(u, v) > 0 \implies \varphi(v, u) = 0). \qquad (14.11)$$
Furthermore, define
$$\partial\Phi_\delta = \{\partial\varphi \mid \varphi:\ \text{a } \delta\text{-feasible flow in } \mathcal{N}_V\}, \qquad (14.12)$$
where $\partial\varphi$ is the boundary of the flow $\varphi$ in $\mathcal{N}_V$. Recall that the flow boundary polyhedron $\partial\Phi_\delta$ is the base polyhedron associated with the cut function


$\kappa_\delta$ of the network $\mathcal{N}_V$ with a uniform capacity $\delta$, i.e., $\kappa_\delta(X) = \delta|X||V - X|$ $(X \subseteq V)$ (see (2.65)).

Now, instead of directly treating the dual pair of min-max problems in (14.2), we consider a perturbed dual pair of problems associated with the Minkowski sum (vector sum) of the original base polyhedron $\mathrm{B}(f)$ and the flow boundary polyhedron $\partial\Phi_\delta$ $(= \mathrm{B}(\kappa_\delta))$. Note that the Minkowski sum $\mathrm{B}(f) + \partial\Phi_\delta$ is equal to $\mathrm{B}(f + \kappa_\delta)$. We try to solve (approximately) the following dual min-max problems:

Minimize $f(X) + \kappa_\delta(X)$ subject to $X \subseteq V$, (14.13)

Maximize $(x + \partial\varphi)^-(V)$ subject to $x \in \mathrm{B}(f)$ and $\varphi$: a $\delta$-feasible flow, (14.14)

by repeating a $\delta$-augmentation (precisely defined below). Each $\delta$-augmentation increases a negative component of $x + \partial\varphi$ by $\delta$ and simultaneously decreases a positive component of it by $\delta$. The details of the $\delta$-augmentation are described below.

Suppose that we are given a base $x \in \mathrm{B}(f)$ and a $\delta$-feasible flow $\varphi$ in $\mathcal{N}_V$. Define the residual graph $G(\varphi)$ to be a graph $(V, E(\varphi))$ with arc set
$$E(\varphi) = \{(u, v) \mid u, v \in V,\ u \ne v,\ \varphi(u, v) = 0\}. \qquad (14.15)$$
The arcs of the residual graph $G(\varphi)$ are exactly those (non-selfloop) arcs on which the flow $\varphi$ can be increased by $\delta$ without destroying the $\delta$-feasibility of $\varphi$. Also define
$$S = \{v \in V \mid x(v) + \partial\varphi(v) \le -\delta\}, \qquad (14.16)$$
$$T = \{v \in V \mid x(v) + \partial\varphi(v) \ge \delta\}. \qquad (14.17)$$
A directed path from $S$ to $T$ in the residual graph $G(\varphi)$ is called a $\delta$-augmenting path. If there exists a $\delta$-augmenting path $P$, augment the current flow $\varphi$ by $\delta$ along the path $P$ as
$$\varphi(u, v) \leftarrow \delta - \varphi(v, u), \qquad (14.18)$$
$$\varphi(v, u) \leftarrow 0 \qquad (14.19)$$
for each arc $(u, v)$ in $P$. This results in increasing $(x + \partial\varphi)^-(V)$ in (14.14) by $\delta$, and the updated flow $\varphi$ is $\delta$-feasible. This operation is called a $\delta$-augmentation.
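A $\delta$-augmentation per (14.15) and (14.18)-(14.19) can be sketched directly (a minimal Python sketch; flows are kept as a sparse dict, and the stored values are assumed to satisfy (14.10)-(14.11)):

```python
def residual_arcs(V, phi):
    """E(phi) of (14.15): non-selfloop arcs (u, v) with phi(u, v) = 0."""
    return [(u, v) for u in V for v in V
            if u != v and phi.get((u, v), 0.0) == 0.0]

def boundary(V, phi):
    """The boundary d(phi): net outflow minus inflow at each vertex."""
    return {v: sum(phi.get((v, w), 0.0) for w in V)
               - sum(phi.get((w, v), 0.0) for w in V) for v in V}

def augment(path, phi, delta):
    """delta-augmentation (14.18)-(14.19): for each arc (u, v) on the path,
    raise the net flow u -> v by delta while keeping 0 <= phi <= delta
    and the skew condition (14.11)."""
    for u, v in zip(path, path[1:]):
        assert phi.get((u, v), 0.0) == 0.0     # path must use residual arcs
        phi[(u, v)] = delta - phi.get((v, u), 0.0)
        phi[(v, u)] = 0.0
    return phi
```

Augmenting along a path from $S$ to $T$ changes the boundary by $+\delta$ at the first vertex and $-\delta$ at the last vertex while leaving interior vertices unchanged, which is exactly the change that increases $(x + \partial\varphi)^-(V)$ by $\delta$.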


If there does not exist such a $\delta$-augmenting path, then let $W$ be the set of vertices in the residual graph $G(\varphi)$ that are reachable along directed paths from $S$. If $W$ is $x$-tight (i.e., $x(W) = f(W)$), then we finish what we call the $\delta$-scaling phase for the current $\delta > 0$ (and, if necessary, we put $\delta \leftarrow \delta/2$ and start the next scaling phase). Otherwise, because of (14.9) there is an $i \in I$ such that $W$ is not $y_i$-tight, and hence some $W$-element appears immediately after some non-$W$-element in the linear ordering $L_i$.

Figure 14.1: An example of exchanging (the dark circles are elements of $W$ and the light circles are of $V - W$).

We shift forward a $W$-element $u$ by interchanging a non-$W$-element $v$ placed immediately before $u$ in $L_i$. By this transformation we get a new linear ordering $L_i'$ and a new extreme base $y_i'$. Its $u$ component is increased and its $v$ component is decreased by the same amount, say $\alpha$ $(> 0)$, as
$$y_i' = y_i + \alpha(\chi_u - \chi_v), \qquad (14.20)$$


which can easily be seen from the greedy algorithm (see Section 3.2). Here, $\alpha$ can be computed as
$$\alpha = f(L_i(u) - \{v\}) - f(L_i(u)) + y_i(v) \qquad (14.21)$$

(recall that this is equal to the exchange capacity $c(y_i, u, v)$ (see (2.38))). Also note that if $y_i \ne y_i'$, then $y_i$ and $y_i'$ are adjacent vertices (extreme bases) of the base polyhedron $\mathrm{B}(f)$.

We keep $x + \partial\varphi$ invariant and also keep the $\delta$-feasibility of $\varphi$. To achieve this we compute the new $x$ and $\varphi$ as follows (we denote the procedure by Double-Exchange). Putting $\beta = \min\{\delta, \lambda_i\alpha\}$,
$$x \leftarrow x + \beta(\chi_u - \chi_v) \qquad (14.22)$$
and
$$\varphi(v, u) \leftarrow \max\{0,\ \beta - \varphi(u, v)\}, \qquad (14.23)$$
$$\varphi(u, v) \leftarrow \max\{0,\ \varphi(u, v) - \beta\}. \qquad (14.24)$$

If $\lambda_i\alpha \le \delta$, the new $y_i$ replaces the old $y_i$: we put
$$y_i \leftarrow y_i + \alpha(\chi_u - \chi_v) \qquad (14.25)$$
and update $L_i$ by interchanging $u$ and $v$. In this case the present operation of updating $x$, $\varphi$, $y_i$ and $L_i$ is called a saturating push.

If $\lambda_i\alpha > \delta$, then we need both the new and the old extreme bases to express the new $x$ in (14.22) as a convex combination of currently available extreme bases. We put
$$k \leftarrow \text{a new index}, \qquad (14.26)$$
$$I \leftarrow I \cup \{k\}, \qquad (14.27)$$
$$\lambda_k \leftarrow \lambda_i - \beta/\alpha, \qquad (14.28)$$
$$\lambda_i \leftarrow \beta/\alpha, \qquad (14.29)$$
$$y_k \leftarrow y_i, \qquad (14.30)$$
$$L_k \leftarrow L_i \qquad (14.31)$$
and update $y_i$ by (14.25) and $L_i$ by interchanging $u$ and $v$, where note that we have $\alpha > 0$. This operation is called a nonsaturating push.
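The core of Double-Exchange, the exchange capacity (14.21) and the base update (14.20)/(14.25), can be sketched as follows (a Python sketch; $f$, the orderings and the example values are illustrative, and the $\beta = \min\{\delta, \lambda_i\alpha\}$ bookkeeping of (14.22)-(14.31) is indicated only in comments):

```python
def exchange_capacity(f, L, y, u, v):
    """alpha of (14.21), for v immediately preceding u in the ordering L:
    alpha = f(L(u) - {v}) - f(L(u)) + y(v)."""
    i = L.index(u)
    assert L[i - 1] == v                       # v must sit just before u
    Lu = L[: i + 1]                            # initial segment L(u)
    return f([w for w in Lu if w != v]) - f(Lu) + y[v]

def apply_exchange(L, y, u, v, alpha):
    """(14.20)/(14.25): y' = y + alpha*(chi_u - chi_v); L' swaps u and v.
    Double-Exchange would also move the base x by beta = min(delta,
    lambda_i * alpha) in the same direction and adjust phi by
    (14.23)-(14.24)."""
    y2 = dict(y)
    y2[u] += alpha
    y2[v] -= alpha
    i = L.index(u)
    L2 = list(L)
    L2[i - 1], L2[i] = L2[i], L2[i - 1]
    return L2, y2
```

For a cut-function example one can check that the updated pair $(L_i', y_i')$ again satisfies the greedy relation (14.7), i.e. $y_i'$ is the extreme base of the new ordering, illustrating the adjacency remark after (14.21).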


(1) After a nonsaturating push, the value of $\varphi(u, v)$ becomes zero by (14.24). Hence the arc $(u, v)$ appears in the updated residual graph $G(\varphi)$ and $W$ gets enlarged. Consequently, there are at most $n$ nonsaturating pushes before the next $\delta$-augmentation or the end of the current scaling phase.

(2) $|I|$ is increased by one after a nonsaturating push, and hence $|I| \le 2n$, where initially, and every time we finish a $\delta$-augmentation or a scaling phase, we get $|I| \le n$ by expressing the current base $x$ as a convex combination of affinely independent extreme bases. (We denote the procedure to reduce $|I|$ by Reduce($x, I$).)

(3) There are $O(n^2)$ saturating/nonsaturating pushes for each $y_i$ $(i \in I)$, since for each $i$ every $W$-element is only shifted forward in the list (linear ordering) $L_i$.

(4) Each saturating push requires $O(1)$ time and each nonsaturating push $O(n)$ time.

It follows that we find a $\delta$-augmenting path (or finish the $\delta$-scaling phase) in $O(n^3)$ time. As mentioned in (2) above, after a $\delta$-augmentation (or at the end of a $\delta$-scaling phase) we perform Reduce($x, I$), i.e., we express the current base $x = \sum_{i \in I} \lambda_i y_i$ as a convex combination of an affinely independent subset of $\{y_i \mid i \in I\}$, which requires $O(n^3)$ time by Gaussian elimination.

Let us now examine how many $\delta$-augmentations and how many scaling phases we need to get a minimizer of $f$. We have the following relaxed weak duality.

Theorem 14.4: For any base $x \in \mathrm{B}(f)$ and for any $\delta$-feasible flow $\varphi$ we have
$$(x + \partial\varphi)^-(V) \le f(X) + n^2\delta/4 \qquad (X \subseteq V). \qquad (14.32)$$

(Proof) For any $X \subseteq V$ we have
$$(x + \partial\varphi)^-(V) \le (x + \partial\varphi)^-(X) \le (x + \partial\varphi)(X) = x(X) + \partial\varphi(X) \le f(X) + \delta|X||V - X| \le f(X) + n^2\delta/4. \qquad (14.33)$$
Q.E.D.

We also have a relaxed strong duality.


Theorem 14.5: At the end of each $\delta$-scaling phase we get a set
$$X = \begin{cases} \emptyset & \text{if } S = \emptyset, \\ V & \text{if } T = \emptyset, \\ W & \text{otherwise} \end{cases} \qquad (14.34)$$
such that
$$(x + \partial\varphi)^-(V) \ge f(X) - n\delta, \qquad (14.35)$$
which also implies
$$x^-(V) \ge f(X) - n^2\delta. \qquad (14.36)$$

(Proof) Let $X$ be the set defined by (14.34). If $S = \emptyset$, it follows from the definition of $S$ in (14.16) that $(x + \partial\varphi)(v) > -\delta$ $(v \in V)$. Hence we have (14.35) for $X = \emptyset$. If $T = \emptyset$, then we see from (14.17) that $(x + \partial\varphi)^-(V) > (x + \partial\varphi)(V) - n\delta = f(V) - n\delta$. If $S \ne \emptyset$ and $T \ne \emptyset$, then $S \subseteq W \subseteq V - T$. Hence, putting $z = x + \partial\varphi$, we have $z^-(V) = z^-(W) + z^-(V - W) \ge z(W) - \delta|W| - \delta|V - W| = x(W) + \partial\varphi(W) - n\delta \ge f(W) - n\delta$, where the last inequality follows from the fact that $W$ is $x$-tight and $\partial\varphi(W) \ge 0$. This shows (14.35). Moreover, since $|\partial\varphi(v)| \le (n-1)\delta$ for each $v \in V$, we have from (14.35) $x^-(V) \ge z^-(V) - n(n-1)\delta \ge f(X) - n^2\delta$. Q.E.D.

It follows from (14.36) that if $\delta < 1/n^2$, then the set $X$ obtained after the current $\delta$-scaling phase through (14.34) is a minimizer of $f$. If $\delta \ge 1/n^2$, then we proceed to the next scaling phase by putting $\delta \leftarrow \frac{1}{2}\delta$ and $\varphi \leftarrow \frac{1}{2}\varphi$. In the beginning of the next $\delta$-scaling phase we have from Theorems 14.4 and 14.5
$$f(X) - 2n\delta - n^2\delta/4 \le (x + \partial\varphi)^-(V) \le f(X) + n^2\delta/4. \qquad (14.37)$$

This implies that the current duality gap is at most $(2n + n^2/2)\delta$, so that there are $O(n^2)$ $\delta$-augmentations in the $\delta$-scaling phase. Moreover, we initialize the input so that the number of $\delta$-augmentations is $O(n^2)$ in the initial scaling phase as well, as follows. We put
$$L \leftarrow \text{a linear ordering of } V, \qquad (14.38)$$
$$x \leftarrow \text{an extreme base determined by } L, \qquad (14.39)$$
$$\delta \leftarrow \min\{|x^-(V)|,\ x^+(V)\}/n^2, \qquad (14.40)$$
$$I \leftarrow \{k\}, \quad y_k \leftarrow x, \qquad (14.41)$$
$$\varphi \leftarrow 0, \quad \lambda_k \leftarrow 1, \quad L_k \leftarrow L, \qquad (14.42)$$


where $x^+$ is a vector defined by $x^+(v) = \max\{0, x(v)\}$ $(v \in V)$. It follows from (14.40) that there are $O(n^2)$ $\delta$-augmentations in the initial scaling phase.

Summing up the above arguments, we describe the IFF algorithm as follows.

The Weakly Polynomial IFF Algorithm SFM($f$)
Step 0: Initialize $L, x, \delta, I, \varphi$ as in (14.38)-(14.42).
Step 1: While $\delta \ge 1/n^2$, do the following (1)-(6):
(1) $S \leftarrow \{v \mid x(v) + \partial\varphi(v) \le -\delta\}$.
(2) $T \leftarrow \{v \mid x(v) + \partial\varphi(v) \ge \delta\}$.
(3) $W \leftarrow$ the set of vertices reachable from $S$ in $G(\varphi)$.
(4) While $W \cap T \ne \emptyset$ or there is an active triple, do:
While $W \cap T = \emptyset$ and there is an active triple, do:
Apply Double-Exchange to an active triple $(i, u, v)$. Update $W$.
If $W \cap T \ne \emptyset$, then:
Augment the flow $\varphi$ along a $\delta$-augmenting path $P$.
Update $G(\varphi)$, $S$, $T$, $W$. Apply Reduce($x, I$).
(5) $\delta \leftarrow \delta/2$.
(6) $\varphi \leftarrow \varphi/2$.
Step 2: Return $W$.

Now the complexity of the IFF algorithm is given as follows.

Theorem 14.6: Define $M = \max\{f(X) \mid X \subseteq V\}$ and suppose $M > 1$. There are $O(\log M)$ scaling phases till $\delta < 1/n^2$. In each $\delta$-scaling phase (with $\delta \ge 1/n^2$) there are $O(n^2)$ $\delta$-augmentations. Each $\delta$-augmentation requires $O(n^3)$ time. Hence we can find a minimizer of $f$ in $O(n^5 \log M)$ time.

(Proof) It suffices to show the first statement; the others follow from the above argument. From the definition of the initial $\delta$ in (14.40) we see that the initial $\delta$ satisfies $n^2\delta = \min\{|x^-(V)|, x^+(V)\} \le x^+(V) \le M$. Hence after $O(\log M)$ scaling phases $\delta$ becomes less than $1/n^2$ and we find a minimizer of $f$. Q.E.D.

Remark: Without Gaussian eliminations the Iwata-Fleischer-Fujishige (IFF) algorithm is still a polynomial algorithm and requires $O(n^7 \log M)$


time. On the other hand, Schrijver's algorithm described later does not enjoy this property. This is a crucial point in Iwata's fully combinatorial polynomial algorithm [Iwata02] for submodular function minimization derived from the IFF algorithm.

(b) A strongly polynomial algorithm

In the IFF paper [IFF01] a technique is given to obtain a strongly polynomial algorithm for submodular function minimization by using the weakly polynomial algorithm as a subroutine: the weakly polynomial algorithm is modified so that we perform $O(\log n)$ scaling phases, and we compute a minimizer of $f$ after invoking the weakly polynomial IFF algorithm $O(n^2)$ times. Hence the strongly polynomial IFF algorithm runs in $O(n^7 \log n)$ time.

Let $f: 2^V \to \mathbf{R}$ be a real-valued submodular function. Note that we can perform the weakly polynomial IFF algorithm for $f$, discarding the stopping criterion $\delta < 1/n^2$. We denote this procedure by SFM($f$). The following lemma is crucial for getting a strongly polynomial algorithm for submodular function minimization.

Lemma 14.7: At the end of a $\delta$-scaling phase in SFM($f$) the following two statements hold:
(a) If $x(w) < -n^2\delta$, then $w$ is contained in every minimizer of $f$.
(b) If $x(w) > n^2\delta$, then $w$ is not contained in any minimizer of $f$.

(Proof) Let $X$ be the set and $x$ the base appearing in Theorem 14.5. Then, for any minimizer $Y$ of $f$ we have
$$f(X) \ge f(Y) \ge x(Y) \ge x^-(Y). \qquad (14.43)$$
It follows from (14.43) and Theorem 14.5 that at the end of the $\delta$-scaling phase
$$x^-(V) \ge f(X) - n^2\delta \ge x^-(Y) - n^2\delta. \qquad (14.44)$$
Hence, if $x(w) < -n^2\delta$, then $w \in Y$ (for otherwise $x^-(V) \le x^-(Y) + x(w) < x^-(Y) - n^2\delta$, a contradiction). On the other hand we have
$$x^-(Y) \ge x^-(V) \ge f(X) - n^2\delta \ge x(Y) - n^2\delta. \qquad (14.45)$$
Therefore, if $x(w) > n^2\delta$, then $w \notin Y$ (for otherwise $x(Y) \ge x^-(Y) + x(w) > x^-(Y) + n^2\delta$, a contradiction). Q.E.D.


Lemma 14.7 will be employed to get information about elements that are contained in every minimizer of $f$ and about a binary relation $R \subseteq V \times V$ such that $(v, w) \in R$ implies that any minimizer of $f$ containing $v$ also contains $w$. We keep
(1) a set $X \subseteq V$ that is included in every minimizer of $f$,
(2) a partition $\Pi = \{V_1, V_2, \cdots, V_l\}$ of $V - X$ and a set $U = \{u_1, u_2, \cdots, u_l\}$ such that each $u_i \in U$ represents the component $V_i \in \Pi$,
(3) a submodular function $\tilde{f}$ on $2^U$,
(4) a directed acyclic graph $D = (U, F)$, where the existence of an arc $(v, w) \in F$ means that any minimizer of $\tilde{f}$ containing $v$ also contains $w$.
Initially, we have
$$X = \emptyset, \quad \Pi = \{\{v\} \mid v \in V\}, \quad U = V, \quad F = \emptyset, \quad \tilde{f} = f. \qquad (14.46)$$

For each $u \in U$ let $R(u)$ be the set of the vertices of $D$ that are reachable from the vertex $u$ by directed paths in $D$. Also let $\tilde{f}_u$ be the contraction of $\tilde{f}$ by $R(u)$, i.e., for each $Z \subseteq U - R(u)$
$$\tilde{f}_u(Z) = \tilde{f}(Z \cup R(u)) - \tilde{f}(R(u)). \qquad (14.47)$$

A linear ordering $(u_1, u_2, \cdots, u_l)$ of $U$ is called consistent with $D$ if $(u_i, u_j) \in F$ implies $j < i$. The extreme base $x \in \mathrm{B}(\tilde{f})$ determined by a consistent linear ordering is also called consistent with $D$. It should be noted that such an extreme base $x$ is an extreme base of a submodular system $(\mathcal{D}, \tilde{f})$ on $U$, where $\mathcal{D}$ is the set of the (lower) ideals of the poset corresponding to $D$ and the domain of $\tilde{f}$ is restricted to $\mathcal{D}$. Hence:

Lemma 14.8: Any extreme base $x \in \mathrm{B}(\tilde{f})$ consistent with $D$ satisfies $x(u) \le \tilde{f}(R(u)) - \tilde{f}(R(u) - \{u\})$ for each $u \in U$.
(Proof) See (3.89)-(3.91). Q.E.D.

Suppose that we are given a scaling parameter $\eta > 0$ and an extreme base $x \in \mathrm{B}(\tilde{f})$ consistent with $D$, and that $\tilde{f}(U) > \eta/3$ or there is a set $Y \subseteq U$ such that $\tilde{f}(Y) < -\eta/3$. Starting with $\delta = \eta$, we repeat the scaling phase of the weakly polynomial IFF algorithm $O(\log n)$ times until $\delta < \eta/(3n^3)$. Denote this procedure by Fix($\tilde{f}, D, \eta$). We can easily see the following.


(i) When $\tilde{f}(U) > \eta/3$, at least one element $w \in U$ satisfies $x(w) > n^2\delta$ at the end of the last scaling phase (since $x(U) = \tilde{f}(U) > \eta/3 > n^3\delta$ and $|U| \le n$). It follows from Lemma 14.7(b) that such an element $w$ is not contained in any minimizer of $\tilde{f}$. (Then we can delete $w$ and some other possible elements from $U$ and restrict $\tilde{f}$ to a smaller domain.)

(ii) When $\tilde{f}(Y) < -\eta/3$ for some $Y \subseteq U$, at least one element $w \in Y$ satisfies $x(w) < -n^2\delta$ at the end of the last scaling phase (since $x(Y) \le \tilde{f}(Y) < -\eta/3 < -n^3\delta$ and $|Y| \le n$). By Lemma 14.7(a), such an element $w$, and hence the elements in $R(w)$, are contained in every minimizer of $\tilde{f}$. (Then we can restrict our attention to $U - R(w)$ by considering the contraction of $\tilde{f}$ by $R(w)$. Accordingly we update the set $X$.)

Now, define
$$\eta = \max\{\tilde{f}(R(u)) - \tilde{f}(R(u) - \{u\}) \mid u \in U\}. \qquad (14.48)$$
It follows from Lemma 14.8 that $x(u) \le \eta$ $(u \in U)$ for any extreme base $x \in \mathrm{B}(\tilde{f})$ consistent with $D$.

(I) If $\eta \le 0$, then any extreme base $x \in \mathrm{B}(\tilde{f})$ consistent with $D$ satisfies $x \le 0$, which implies that $U$ is a minimizer of $\tilde{f}$. Hence, defining $\hat{U}$ as the subset of $V$ that is the union of the components corresponding to all $u \in U$, we have a minimizer $X \cup \hat{U}$ of the original $f$.

(II) If $\eta > 0$, then let $u$ be an element of $U$ that attains the maximum in the right-hand side of (14.48). Since
$$\eta = \tilde{f}(R(u)) - \tilde{f}(R(u) - \{u\}) = \tilde{f}(U) - \tilde{f}(R(u) - \{u\}) + (\tilde{f}(R(u)) - \tilde{f}(U)), \qquad (14.49)$$
we have
$$\max\{\tilde{f}(U),\ -\tilde{f}(R(u) - \{u\}),\ \tilde{f}(R(u)) - \tilde{f}(U)\} \ge \eta/3. \qquad (14.50)$$
Hence there are the following three, not necessarily exclusive, subcases (II-1), (II-2) and (II-3) to consider.

(II-1) [ $\tilde{f}(U) \ge \eta/3$ ]
Apply Fix($\tilde{f}, D, \eta$) to find a new element $w \in U$ that is not in any minimizer of $\tilde{f}$. For such an element $w$, any element $v$ with $w \in R(v)$ does not belong to any minimizer of $\tilde{f}$, so we delete $\{v \mid w \in R(v)\}$ from $U$.


(II-2) [ $\tilde{f}(R(u) - \{u\}) \le -\eta/3$ ]
Apply Fix($\tilde{f}, D, \eta$) to find a new element $w \in R(u) - \{u\}$ that is contained in every minimizer of $\tilde{f}$. For such an element $w$, $R(w)$ is also contained in every minimizer of $\tilde{f}$, so we put $U \leftarrow U - R(w)$, $\tilde{f} \leftarrow \tilde{f}_w$ and $X \leftarrow X \cup Q$, where $Q$ is the subset of $V$ corresponding to $R(w)$ and $\tilde{f}_w$ is the contraction of $\tilde{f}$ by $R(w)$ as in (14.47).

(II-3) [ $\tilde{f}_u(U - R(u)) = \tilde{f}(U) - \tilde{f}(R(u)) \le -\eta/3$ ]
Perform Fix($\tilde{f}_u, D_u, \eta$), where $D_u$ is the subgraph of $D$ induced by $U - R(u)$. Then we get an element $w \in U - R(u)$ that is contained in every minimizer of $\tilde{f}_u$. It follows that every minimizer of $\tilde{f}$ containing $u$ also contains $w$. Hence we add a new arc $(u, w)$ to $F$. If this creates a cycle in $D$ (i.e., $u \in R(w)$), let $Z = \{v \mid v \in R(w),\ u \in R(v)\}$ (i.e., $Z$ is the vertex set of the strongly connected component of $D$ that contains $u$ and $w$). Shrink $Z$ into a single vertex to obtain a new directed acyclic graph $D$, and update $U$ and $\tilde{f}$, regarding $Z$ as a singleton.

We repeat (I) and (II), updating $\eta$ by (14.48), till we get a minimizer by (I) or $U$ becomes empty. When $U$ becomes empty, the current $X$ is a minimizer of $f$. Procedure Fix($\tilde{f}, D, \eta$) for Cases (II-1) and (II-2) is performed $O(n)$ times, and Fix($\tilde{f}_u, D_u, \eta$) for Case (II-3) $O(n^2)$ times. In each of Cases (II-1), (II-2) and (II-3) each Fix carries out $O(\log n)$ scaling phases, and each scaling phase requires $O(n^5)$ time. Hence we have:

Theorem 14.9: The algorithm described above finds a minimizer of $f$ in $O(n^7 \log n)$ time.

It should be noted that since the algorithm keeps a set $X \cup \hat{U} \subseteq V$ that includes all the minimizers of $f$, the finally obtained $X \cup \hat{U}$, an output of the algorithm, is the unique maximal minimizer of $f$.

(c) Modification with multiple exchanges

We can modify the weakly polynomial IFF algorithm as follows ([Fuji02], [Fuji03a]). In searching for a $\delta$-augmenting path, the original IFF algorithm interchanges adjacent $W$- and non-$W$-elements to shift forward each $W$-element



(see Fig. 14.1). Instead of repeating such an interchanging, here we simultaneously shift all $W$-elements forward to make $W$ an initial segment of a new linear ordering $L_i'$ (see Fig. 14.2), keeping the orders within $W$ and within $V - W$.

Figure 14.2: A multiple exchange ($W$ is not $y_i$-tight; $W$ is $y_i'$-tight).

This new linear ordering gives a new extreme base $y_i'$ as
$$y_i' = y_i + \sum_{u \in W} \sum_{v \in Z} \psi(u, v)(\chi_u - \chi_v), \qquad (14.51)$$
where $Z = V - W$. The additional part in (14.51) can be expressed as the boundary of a flow $\psi: V \times V \to \mathbf{R}_+$ in a forest:
$$y_i' - y_i = \partial\psi, \qquad (14.52)$$
where $\{(u, v) \mid \psi(u, v) > 0\}$ forms a forest. Such a flow $\psi$ can be constructed in a greedy way or by the so-called north-west corner rule (see Fig. 14.3). Then, to keep $x + \partial\varphi$ invariant, we update $\varphi$ based on this $\psi$ so that the new $\varphi$ cancels the additional part in (14.51) multiplied by $\lambda_i$. Here, to keep also the $\delta$-feasibility of $\varphi$, we have a saturating push or a nonsaturating push similarly as in the IFF algorithm. In the case of a nonsaturating push, we need both the new and the old extreme bases, and $W$ gets enlarged. Moreover, while $W$


Figure 14.3: An example of a flow $\psi$ in a forest and its boundary $\partial\psi$.

remains the same, there is at most one saturating push for each $i \in I$. This simplifies the IFF algorithm, but the complexity is the same as that of the original IFF algorithm. It should also be noted that the new extreme base $y_i'$ computed through (14.51) is in general not adjacent to $y_i$.

(d) Submodular functions on distributive lattices

Let $f: \mathcal{D} \to \mathbf{Z}$ be a submodular function on a distributive lattice $\mathcal{D} \subseteq 2^V$ with $\emptyset, V \in \mathcal{D}$ and $f(\emptyset) = 0$. We cannot directly apply the IFF algorithm to such a submodular function $f$ by formally defining $f(X) = +\infty$ for $X \in 2^V - \mathcal{D}$, since $M$ in Theorem 14.6 becomes $+\infty$. We describe a modification of the IFF algorithm indicated in [IFF01]. Suppose that $\mathcal{D}$ is the set of all (lower) ideals of a poset $\mathcal{P} = (V, \preceq)$, and let $G(\mathcal{P}) = (V, A(\mathcal{P}))$ be an acyclic graph representing the poset $\mathcal{P}$, i.e., $(u, v) \in A(\mathcal{P}) \iff v \prec u$. Any base $x \in \mathrm{B}(f)$ is expressed as

x = Σ_{i∈I} λ_i y_i + ∂ψ,   (14.53)

where the first term is a convex combination of extreme bases y_i (i ∈ I) of B(f) and the second is the boundary of a nonnegative flow ψ in the graph G(P). Note that each extreme base y_i of B(f) is determined by a linear extension L_i of the poset P = (V, ⪯) and that the characteristic cone of B(f) is the set of boundaries of all nonnegative flows in G(P) (see Theorem 3.26). We keep the expression of a base x as (14.53) instead of (14.8). We shall show how to adapt the weakly polynomial IFF algorithm for minimization
of the submodular function f : D → Z. As in the original IFF algorithm, we consider a δ-feasible flow φ in the complete directed network N_V = (V, V × V) and also consider the residual graph G(φ) = (V, E(φ)). The initialization is the same as in the IFF algorithm, except that in addition we put ψ ← 0, a zero flow in G(P). If there exists a δ-augmenting path P in the residual graph G(φ), then we augment the flow φ along P. Otherwise let W be the set of vertices that are reachable from S in G(φ). We try to enlarge W by modifying the extreme bases y_i (i ∈ I) and the nonnegative flow ψ in G(P) as follows. If there is an arc (u, v) ∈ A(P) with u ∈ W and v ∉ W, then put

ψ(u, v) ← ψ(u, v) + δ,   (14.54)
φ(v, u) ← δ − φ(u, v),   (14.55)
φ(u, v) ← 0.   (14.56)

We call this operation a nonsaturating push (for ψ). This results in enlarging W. If a δ-augmenting path P appears in the updated residual graph G(φ), we carry out a δ-augmentation along P. Hence let us assume that all such possible nonsaturating pushes have been performed for the current ψ, so that there is no arc (u, v) ∈ A(P) with u ∈ W and v ∉ W, i.e., W is a (lower) ideal of P (or W ∈ D). Furthermore, if there is an arc (v, u) ∈ A(P) such that u ∈ W, v ∉ W and ψ(v, u) > 0, then put

γ ← min{ψ(v, u), φ(u, v)},   (14.57)
ψ(v, u) ← ψ(v, u) − γ,   (14.58)
φ(u, v) ← φ(u, v) − γ.   (14.59)

If φ(u, v) becomes zero, we call this operation a nonsaturating push (for ψ) and W gets enlarged. Otherwise ψ(v, u) becomes zero and we call this operation a saturating push (for ψ). Hence let us further suppose that all such possible nonsaturating/saturating pushes for ψ have been made, so that W ∈ D and that for each arc (v, u) ∈ A(P) with u ∈ W and v ∉ W we have ψ(v, u) = 0. (Note that this implies ∂ψ(W) = 0.) Suppose that W ∈ D and ψ(v, u) = 0 for all such arcs (v, u) ∈ A(P). If W is an initial segment of each L_i (i ∈ I), then we finish the δ-scaling phase. Otherwise suppose that u ∈ W is immediately after v ∉ W in a list L_i (see Figure 14.1). Then we have the following.


Lemma 14.10: Suppose as above. Interchanging u and v in L_i, we get a linear extension L'_i of P.

(Proof) Since W, L_i(u), L_i(v) − {v} ∈ D and {u, v} ∩ W = {u}, we have

L_i(u) − {v} = (L_i(u) ∩ W) ∪ (L_i(v) − {v}) ∈ D,   (14.60)

where recall that L_i is a linear extension of P and that L_i(u) is the set of elements in the initial segment of L_i up to u (including u). Hence the present lemma holds. Q.E.D.

Because of this lemma we can modify extreme bases as in the original IFF algorithm when W ∈ D. To sum up, defining M = max{|f(X)| : X ∈ D}, we have the following.

(1) By the initialization we have

min{|x⁻(V)|, x⁺(V)} ≤ x⁺(V) ≤ a⁺(V),   (14.61)

where a⁺(v) = f(D(v)) − f(D(v) − {v}) (v ∈ V) (see (3.89)). Hence we have

a⁺(V) ≤ 2nM.   (14.62)

It follows that there are O(log nM) scaling phases until δ < 1/n².

(2) After a nonsaturating push, W gets enlarged. Hence there are at most n nonsaturating pushes for extreme bases and for the flow ψ before the next δ-augmentation or the end of the current scaling phase.

(3) There are O(n²) saturating/nonsaturating pushes for each y_i (i ∈ I) and for ψ before the next δ-augmentation or the end of the current scaling phase.

Relaxed weak duality (Theorem 14.4) and relaxed strong duality (Theorem 14.5) hold for submodular functions on distributive lattices with a minor modification: 'X ⊆ V' in (14.32) should read 'X ∈ D.' The proofs of the two theorems are also valid mutatis mutandis. Note that ∂ψ(X) ≤ 0 for any X ∈ D and that when we finish a δ-scaling phase without W being enlarged, we have W ∈ D and ∂ψ(W) = 0. Consequently, we finish each δ-scaling phase after O(n²) δ-augmentations. Since each δ-augmentation requires O(n³) time, the total running time is O(n⁵ log nM). Furthermore, making this weakly polynomial algorithm
strongly polynomial results in an O(n⁷ log n) algorithm, where in the beginning of the algorithm the graph D = (U, F) that keeps information about the minimizers of f coincides with G(P) = (V, A(P)). It should also be noted that the modification with multiple exchanges described in Section 14.1.c can also be adapted for minimization of submodular functions on distributive lattices.
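The greedy construction used throughout this section, an extreme base of B(f) obtained from a linear ordering by taking marginal values along prefixes, can be sketched as follows. This is an illustrative sketch, not code from the book; the coverage function f and the ordering below are invented for the example, and any submodular function could be substituted.

```python
from itertools import combinations

# Example submodular function: a coverage function f(X) = |union of S_v, v in X|.
# (The sets below are invented; any submodular function could be used instead.)
S = {1: {'a', 'b'}, 2: {'b', 'c'}, 3: {'c', 'd'}, 4: {'d'}}
V = [1, 2, 3, 4]

def f(X):
    covered = set()
    for v in X:
        covered |= S[v]
    return len(covered)

def extreme_base(L):
    """Greedy extreme base of B(f) for a linear ordering L:
    y(u) = f(L(u)) - f(L(u) - {u}), taken along prefixes of L."""
    y, prefix = {}, []
    for u in L:
        y[u] = f(prefix + [u]) - f(prefix)
        prefix.append(u)
    return y

y = extreme_base([2, 1, 4, 3])

# y is a base: y(V) = f(V), and y(X) <= f(X) for every subset X.
assert sum(y.values()) == f(V)
for r in range(len(V) + 1):
    for X in combinations(V, r):
        assert sum(y[v] for v in X) <= f(X)
```

The final loop checks membership in the base polytope by enumeration, which is affordable only for tiny ground sets; the point of the greedy rule is precisely that it avoids such enumeration.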

14.2. Schrijver's Algorithm

Schrijver [Schrijver00] devised a combinatorial, strongly polynomial algorithm for submodular function minimization, independently of and differently from the IFF algorithm [IFF01]. Schrijver's algorithm also takes Cunningham's approach. That is, we assume that a current base x ∈ B(f) is expressed as a convex combination of extreme bases y_i (i ∈ I) corresponding to linear orderings L_i (i ∈ I):

x = Σ_{i∈I} λ_i y_i,   (14.63)

where λ_i > 0 (i ∈ I) and Σ_{i∈I} λ_i = 1. For each i ∈ I we denote by ≤_i the linear order on V determined by L_i, and define

(s, t]_i = {v ∈ V | s <_i v ≤_i t}   (14.64)

for s, t ∈ V and i ∈ I. For some i ∈ I and s, t ∈ V with s <_i t, we compute an elementary transformation

x' = x + η(χ_t − χ_s)   (14.65)

(for some η > 0) of the current base x by generating new extreme bases as follows. This is a key procedure of Schrijver's algorithm. For each u ∈ (s, t]_i let L_i^u be the linear ordering of V obtained from L_i by moving u to the position immediately before s, and denote by ≤_i^u the linear order corresponding to L_i^u. Also denote by y_i^u the extreme base determined by the linear ordering L_i^u. Then, from the submodularity of f we can easily see the following lemma.


Lemma 14.11: For each u ∈ (s, t]_i we have

y_i^u(v) − y_i(v) = { −  if s ≤_i v <_i u;  +  if v = u;  0  otherwise },   (14.66)

where z = − means z ≤ 0 and z = + means z ≥ 0.

It follows from this lemma that:

(i) If for some u ∈ (s, t]_i we have y_i^u(u) − y_i(u) = 0, then y_i^u = y_i. We replace y_i and L_i by y_i^u and L_i^u.

(ii) If y_i^u(u) − y_i(u) > 0 for all u ∈ (s, t]_i, then χ_t − χ_s is uniquely expressed as a linear combination of y_i^u − y_i (u ∈ (s, t]_i) with positive coefficients. Then, for a (unique) δ > 0, δ(χ_t − χ_s) is expressed as a convex combination of y_i^u − y_i (u ∈ (s, t]_i), i.e.,

δ(χ_t − χ_s) = Σ_{u∈(s,t]_i} μ_u (y_i^u − y_i)   (14.67)

with μ_u ≥ 0 (u ∈ (s, t]_i) and Σ_{u∈(s,t]_i} μ_u = 1. Adding to (14.63) the above (14.67) multiplied by λ_i yields an expression (14.65) with η = δλ_i > 0.

Note that after the operations of (i) and (ii) the extreme base y_i disappears from the expression of the transformed (new) base, and that the length of the interval (s, t]_i^u in Case (i) and the lengths of all the intervals (s, t]_i^u (u ∈ (s, t]_i) in Case (ii) decrease by one from that of (s, t]_i.

For the current y_i and L_i (i ∈ I) we define a directed graph G = (V, A) with a vertex set V and an arc set

A = {(u, v) | ∃ i ∈ I : u <_i v}.   (14.68)

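The objects used below in Steps 1 and 2, the arc set A of (14.68), the BFS distances d(v) from P, and the lexicographically maximum triple (d(t), t, s), are straightforward to compute. The following hedged sketch uses invented orderings and an invented sign pattern x (a stand-in for the signs of a current base, not an actual base of any particular f) purely to illustrate the bookkeeping.

```python
from collections import deque

# Hypothetical data: two linear orderings L_i on V = {1,...,5} and a sign
# pattern for the current base x (both invented for illustration).
V = [1, 2, 3, 4, 5]
orderings = [[3, 1, 4, 2, 5], [1, 3, 2, 5, 4]]
x = {1: 2, 2: 0, 3: 0, 4: 0, 5: -2}

# Arc set (14.68): (u, v) is an arc iff u precedes v in some ordering L_i.
A = set()
for L in orderings:
    for i, u in enumerate(L):
        for v in L[i + 1:]:
            A.add((u, v))

P = {v for v in V if x[v] > 0}
N = {v for v in V if x[v] < 0}

# BFS distances d(v) from P, as in Step 2.
d = {v: (0 if v in P else float('inf')) for v in V}
queue = deque(P)
while queue:
    u = queue.popleft()
    for (a, b) in A:
        if a == u and d[b] == float('inf'):
            d[b] = d[u] + 1
            queue.append(b)

# Lexicographically maximum (d(t), t, s) with t in N, d(t) finite,
# (s, t) in A and d(s) + 1 = d(t).
candidates = [(d[t], t, s) for t in N if d[t] < float('inf')
              for s in V if (s, t) in A and d[s] + 1 == d[t]]
dt, t, s = max(candidates)
```

With these invented data a directed path from P = {1} to N = {5} exists, so the algorithm would be in Step 2 rather than terminating in Step 1.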
Now, Schrijver's algorithm is described as follows. We assume V = {1, 2, ..., n}.

Schrijver's Algorithm

Step 0: Choose a linear ordering L₁ and let y₁ be the extreme base determined by L₁. Put I ← {1} and λ₁ ← 1. Also let G = (V, A) be the directed graph associated with the current L₁.

Step 1: Define P = {v ∈ V | x(v) > 0} and N = {v ∈ V | x(v) < 0}. If there exists no directed path from P to N in G = (V, A), then let U be the set of vertices in G from which we can reach N by a directed path, and return U (U is a minimizer of f).

Step 2: Let d(v) (v ∈ V) be the distance in G from P to v, i.e., the minimum number of arcs in a directed path from P to v. Choose s ∈ V and t ∈ N such that the ordered triple (d(t), t, s) satisfying d(t) < +∞, (s, t) ∈ A and d(s) + 1 = d(t) is lexicographically maximum, where the order in V = {1, 2, ..., n} is the ordinary order for integers. Then let i be an index in I that attains the maximum of the lengths |(s, t]_j| over j ∈ I. For the chosen s, t and i, compute an elementary transformation of the current base x by the procedure described above. Let x' be the new base.

(2-1): If x'(t) < 0, then from the expression of x' as a convex combination of the current extreme bases, compute an expression of x' as a convex combination of affinely independent extreme bases, put x ← x' and update I, y_i, L_i, λ_i (i ∈ I) and G = (V, A). Go to Step 1.

(2-2): If x'(t) ≥ 0, then let x'' be a convex combination of x and x' such that x''(t) = 0. Compute an expression of x'' as a convex combination of affinely independent extreme bases chosen from among the current extreme bases, put x ← x'' and update I, y_i, L_i, λ_i (i ∈ I) and G = (V, A). Go to Step 1. (End)

We will show the validity and the complexity of the algorithm. The following arguments are based on [Schrijver00] and [Vygen03]. When the algorithm terminates at Step 1, it is easy to see that the obtained U is y_i-tight for each i ∈ I, so that U is x-tight. Since x(v) ≤ 0 (v ∈ U) and x(v) ≥ 0 (v ∈ V − U), we have f(U) = x(U) = x⁻(V) and hence U is a minimizer of f.

Consider an execution of Step 2. Define α = |(s, t]_i| for the chosen i ∈ I and define β as the number of indices j ∈ I such that |(s, t]_j| = α. Then let x', d', G', A', P', N', t', s', α', β' be the objects x, d, G, A, P, N, t, s, α, β after the execution of Step 2.

Fact 1: If (u, v) ∈ A' − A, then s ≤_i v <_i u ≤_i t for the index i chosen in Step 2.

Fact 2: No distance d(v) (v ∈ V) decreases, so the distances change O(n²) times in total.

(Proof) By Fact 1, for any new arc (u, v) ∈ A' − A we have

d(v) ≤ d(s) + 1 = d(t) ≤ d(u) + 1. Hence, adding arc (u, v) to G does not decrease the distance from P to v. Moreover, since we have P' ⊆ P and removing arcs does not decrease the distance, this completes the proof. Q.E.D.

Fact 3: The number of consecutive iterations of Step 1 and Step 2 with the same pair (t, s) is O(n²).

(Proof) For each u ∈ (s, t]_i we have |(s, t]_i^u| < |(s, t]_i|, so that α' ≤ α. Moreover, during the consecutive iterations with the same pair (t, s) we have x(t) < 0. Hence, if α' = α, then β' < β because y_i disappears (since x'(t) < 0). It follows that (α, β) decreases lexicographically, and the number of such iterations is O(n²). Q.E.D.

Fact 4: While all distances d(v) (v ∈ V) remain the same, max{d(v) | v ∈ N} does not increase; furthermore, if max{d(v) | v ∈ N} also remains the same, the set of its maximizers remains the same or becomes smaller.

(Proof) A vertex v becomes a new element of N only if v = s. Hence the new element v satisfies d(v) = d(t) − 1. Q.E.D.

Fact 5: For each t* ∈ V there are O(n²) executions of Step 2 with t = t* and x'(t) = 0.

(Proof) When we get x'(t*) = 0, there holds t* ∉ N'. When x(t*) becomes negative next time, letting d'', s'', t'', N'' be the then objects d, s, t, N, we have s'' = t* and

max{d(v) | v ∈ N} = d(t*) ≤ d''(t*) ≤ d''(t'') = max{d''(v) | v ∈ N''},

due to Fact 2 and the definitions of s'' (= t*) and t''. It follows from Fact 4 that for some v ∈ V we have d(v) < d''(v). Hence Fact 5 follows from Fact 2. Q.E.D.

Fact 6: For u, v ∈ V, call u v-boring if (u, v) ∉ A or d(v) ≤ d(u). Let s*, t* ∈ V, and consider a sequence of consecutive iterations of Step 1 and Step 2 starting with s = s* and t = t* and ending with the change of d(t*). Then any v with v > s* is t*-boring in these iterations, and if s* becomes t*-boring in some of these iterations, it remains t*-boring until d(t*) changes.

(Proof) At the beginning of the sequence of the iterations, any v with v > s* is t*-boring due to the choice of s = s*.
Since d(t*) remains the same, it follows from Fact 2 that a t*-boring v can become non-t*-boring only if an arc (v, t*) newly appears in A. Suppose that v ≥ s* is t*-boring and (v, t*) newly appears in A when t and s are chosen. Then we have s ≤_i t* <_i v ≤_i t by Fact 1. If s ≥ v, then we have d(t*) ≤ d(s) because t* = s or s was t*-boring and (s, t*) ∈ A. If
s < v, then we have d(t) ≤ d(v) because t = v or (v, t) ∈ A by the choice of s. It follows that d(t*) ≤ d(v) and that v remains t*-boring. Q.E.D.

Fact 7: Call the sequence of consecutive iterations described in Fact 3 a block. There are O(n³) blocks throughout Schrijver's algorithm.

(Proof) A block can end only in one of the following three cases:
(a) Some distance d(v) for v ∈ V changes. (This occurs O(n²) times due to Fact 2.)
(b) t is removed from N. (From Fact 5, this occurs O(n³) times.)
(c) (s, t) disappears from A. (This occurs O(n³) times, since d(t) changes before the next block with the same pair (s, t) because of Fact 6.) Q.E.D.

It follows from Fact 3 and Fact 7 that there are O(n⁵) iterations of Step 1 and Step 2, each of which requires O(n³) arithmetic operations and O(n²) function evaluations. If we assume that invoking the function evaluation oracle takes time γ, then the complexity of Schrijver's algorithm is O(n⁸ + γn⁷). Note that the weakly polynomial IFF algorithm runs in O(n⁵ log M) time and its strongly polynomial version in O(n⁷ log n) time, counting each function evaluation as one step.

Remark: In Schrijver's algorithm, updating the expression of a current base as a convex combination of affinely independent extreme bases is essential for achieving polynomiality of the algorithm, since without such a reduction of the size of the set of extreme bases the number of extreme bases expressing a current base would become exponential before the algorithm terminates. Recall that the IFF algorithm also performs such a reduction operation, but there the reduction is performed only to make the algorithm faster; without such a reduction operation the IFF algorithm remains polynomial.
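The reduction discussed in this remark is a Caratheodory-type step: find an affine dependency among the current points and move along it until some coefficient vanishes. The following generic sketch, in exact rational arithmetic and with invented points in the plane rather than extreme bases in R^V, illustrates the operation; it is not the implementation used in [Schrijver00].

```python
from fractions import Fraction

def affine_dependency(pts):
    """Return mu with sum(mu) = 0, sum(mu_j * p_j) = 0 and mu != 0, or None if
    the points are affinely independent (elimination on lifted columns (p, 1))."""
    m = len(pts)
    n = len(pts[0]) + 1                       # lifted dimension
    # Row-major matrix: coordinate rows plus a row of ones; m columns.
    M = [[Fraction(pts[j][i]) for j in range(m)] for i in range(n - 1)]
    M.append([Fraction(1)] * m)
    pivots, row = {}, 0
    for col in range(m):
        pr = next((r for r in range(row, n) if M[r][col] != 0), None)
        if pr is None:                        # free column: build a dependency
            mu = [Fraction(0)] * m
            mu[col] = Fraction(1)
            for c, r in pivots.items():
                mu[c] = -M[r][col] / M[r][c]
            return mu
        M[row], M[pr] = M[pr], M[row]
        for r in range(n):
            if r != row and M[r][col] != 0:
                factor = M[r][col] / M[row][col]
                M[r] = [M[r][k] - factor * M[row][k] for k in range(m)]
        pivots[col] = row
        row += 1
        if row == n and col + 1 < m:          # every further column is free
            fc = col + 1
            mu = [Fraction(0)] * m
            mu[fc] = Fraction(1)
            for c, r in pivots.items():
                mu[c] = -M[r][fc] / M[r][c]
            return mu
    return None

def reduce_to_affinely_independent(points, weights):
    """Rewrite x = sum w_j p_j (w_j > 0, sum w_j = 1) as a convex combination
    of affinely independent points without changing x."""
    pts = [list(map(Fraction, p)) for p in points]
    w = list(map(Fraction, weights))
    while True:
        dep = affine_dependency(pts)
        if dep is None:
            return pts, w
        # Move along the dependency until some weight hits zero; since
        # sum(dep) = 0, the weights keep summing to 1 and stay nonnegative.
        theta = min(w[j] / dep[j] for j in range(len(w)) if dep[j] > 0)
        w = [w[j] - theta * dep[j] for j in range(len(w))]
        keep = [j for j in range(len(w)) if w[j] != 0]
        pts, w = [pts[j] for j in keep], [w[j] for j in keep]

# Demo: four points in the plane are necessarily affinely dependent.
pts, w = reduce_to_affinely_independent([(0, 0), (2, 0), (0, 2), (1, 1)],
                                        [Fraction(1, 4)] * 4)
assert len(pts) == 3 and sum(w) == 1
```

In the algorithms of this chapter the same operation is applied to extreme bases in R^V, which keeps |I| bounded by n + 1; the demo merely checks that the combined point is preserved while one point is eliminated.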
Another feature of Schrijver's algorithm is that it finds not only a minimizer of f but also a maximizer, a base x = Σ_{i∈I} λ_i y_i, of max{x⁻(V) | x ∈ B(f)}, expressed as a convex combination of extreme bases y_i (i ∈ I). By the same procedure given by (7.24)-(7.27) we can efficiently construct the poset that expresses the distributive lattice D(x) of all the x-tight sets. Then the x-tight sets X satisfying {v ∈ V | x(v) < 0} ⊆ X ⊆ {v ∈ V | x(v) ≤ 0} are exactly the minimizers of f. It should
also be noted that the obtained maximizer x may not be integral even when f is integer-valued.

14.3. Further Progress in Submodular Function Minimization

In this subsection we give a short description of further progress in submodular function minimization. After the IFF algorithm and Schrijver's appeared, Iwata [Iwata02] devised a fully combinatorial strongly polynomial algorithm for submodular function minimization based on the IFF algorithm. Iwata's fully combinatorial algorithm performs arithmetic operations of addition, subtraction and comparison only. (In the IFF algorithm we need multiplications and divisions to compute the coefficients appearing in a convex combination of extreme bases in the representation of a current base.) His algorithm takes advantage of some flexibility in determining the values of saturating and nonsaturating pushes in the IFF weakly polynomial algorithm while keeping the polynomiality of the algorithm; this flexibility is a key property of the IFF algorithm. In the kth scaling phase we treat only rational numbers that are integral multiples of 1/2^k for the positive integer k, where the actual computations in [Iwata02] are performed for such numbers multiplied by 2^k so that only integers appear. The property that the IFF algorithm remains polynomial without Gaussian eliminations is another crucial key to obtaining the fully combinatorial algorithm, since because of it we can avoid the multiplications and divisions required in Gaussian elimination. The problem of finding a fully combinatorial (strongly) polynomial algorithm for submodular function minimization was thus solved by [Iwata02]. However, since the algorithm in [Iwata02] parallels the IFF algorithm, it treats integers of size polynomial in n. Queyranne's algorithm for symmetric submodular function minimization, for example, is a fully combinatorial polynomial algorithm that treats only numbers given as inputs, without worrying about the lengths of the numbers.
It is still open to devise a 'fully' combinatorial polynomial algorithm for submodular function minimization in this sense. Concerning the progress in the complexity of submodular function minimization algorithms, Fleischer and Iwata [Fleischer+Iwata03] improved Schrijver's algorithm by using the push-relabel technique of Goldberg and Tarjan [Goldberg+Tarjan88]; its complexity, however, turned out to be the same as Schrijver's, as shown by Vygen [Vygen03]. Iwata [Iwata03]
also improved the IFF algorithm by combining the scaling technique of IFF and the push-relabel framework of [Fleischer+Iwata03] to get an O((n⁴γ + n⁵) log M) weakly polynomial algorithm, an O((n⁶γ + n⁷) log n) strongly polynomial algorithm, and an O((n⁷γ + n⁸) log n) fully combinatorial algorithm, where γ is the time required for an oracle call evaluating f(X) for any specified X ⊆ V and M is equal to max{|f(X)| | X ⊆ V}. These are currently the best bounds. The multiple-exchange technique described in Section 14.1.c was also adopted in [Iwata03]. Moreover, assuming an oracle for membership in base polyhedra, Fujishige and Iwata [Fuji+Iwata02] showed an O(n²) algorithm for submodular function minimization. Practically, the minimum-norm-point algorithm for submodular function minimization given in Section 7.1.b performs well (see [Isotani03]). The behavior of the algorithm seems to be worth further investigation. Moreover, the bisubmodular functions discussed in Section 3.5.b are a generalization of submodular functions. Combinatorial polynomial algorithms for minimizing bisubmodular functions are given in [Fuji+Iwata01] and [McCormick+Fuji05].
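As a practical aside, not from the text: when experimenting with implementations of any of the algorithms surveyed in this chapter, a brute-force minimizer is a useful correctness baseline on tiny ground sets. The cut function below is a standard example of a (symmetric) submodular function; the graph is invented.

```python
from itertools import chain, combinations

def cut_value(edges, X):
    """Capacity of the cut (X, V - X) in an undirected capacitated graph."""
    X = set(X)
    return sum(c for (u, v, c) in edges if (u in X) != (v in X))

def brute_force_sfm(V, f):
    """Minimize f over all 2^|V| subsets by enumeration; exponential, so it
    serves only as a correctness baseline on tiny instances."""
    subsets = chain.from_iterable(combinations(V, r) for r in range(len(V) + 1))
    return min(subsets, key=f)

# Tiny example: path 1 - 2 - 3 with edge capacities 3 and 1.
edges = [(1, 2, 3), (2, 3, 1)]
X = brute_force_sfm([1, 2, 3], lambda X: cut_value(edges, X))
assert cut_value(edges, X) == 0  # the empty set always minimizes a cut function

# Restricted to nonempty proper subsets (the min-cut setting of Section 13),
# the minimum cut of this path has capacity 1 (e.g., {3} or {1, 2}).
proper = chain.from_iterable(combinations([1, 2, 3], r) for r in (1, 2))
best = min(proper, key=lambda Y: cut_value(edges, Y))
assert cut_value(edges, best) == 1
```

The two calls illustrate the distinction drawn in this chapter: general submodular function minimization ranges over all subsets (where ∅ and V are trivial minimizers of a symmetric function), while the min-cut problem of Section 13 ranges over nonempty proper subsets.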