Computers & Operations Research 39 (2012) 2634–2641
A faster polynomial algorithm for the constrained maximum flow problem

Cenk Çalışkan

Department of Finance and Economics, Woodbury School of Business, Utah Valley University, Orem, UT 84058, United States
Available online 2 February 2012

Abstract
The constrained maximum flow problem is a variant of the classical maximum flow problem in which the flow from a source node to a sink node is maximized in a directed capacitated network with arc costs, subject to the constraint that the total cost of the flow must be within a budget. The problem is important to study because it has applications in logistics, telecommunications and computer networks, and because it is related to variants of classical problems such as the constrained shortest path problem, the constrained transportation problem, and the constrained assignment problem, all of which have important applications as well. In this research, we present an O(n²m log(nC)) time cost scaling algorithm and compare its empirical performance against the two existing polynomial combinatorial algorithms for the problem: the capacity scaling and the double scaling algorithms. We show that the cost scaling algorithm is on average 25 times faster than the double scaling algorithm, and 32 times faster than the capacity scaling algorithm.
Keywords: Network flows; Maximum flow; Minimum cost network flow; Scaling; Computational complexity
1. Introduction

Let G = (N, A) be a directed network consisting of a set N of nodes and a set A of arcs. In this network, the flow on each arc (i,j) is represented by the nonnegative variable x_ij with a cost c_ij and capacity u_ij. The constrained maximum flow problem is to send the maximum possible flow from a source node s ∈ N to a sink node t ∈ N, where the total cost of the flow is constrained by the budget D. For simplicity of exposition, we add an arc (t,s) ∈ A with c_ts = 0 and $u_{ts} = \min\{\sum_{(s,i)\in A} u_{si},\ \sum_{(i,t)\in A} u_{it}\}$, an upper bound on the maximum flow from s to t. The problem can then be formulated as follows:

\begin{align}
\max \quad & x_{ts} \tag{1.1}\\
\text{s.t.} \quad & \sum_{(i,j)\in A} x_{ij} - \sum_{(j,i)\in A} x_{ji} = 0 \qquad \forall i \in N, \tag{1.2}\\
& \sum_{(i,j)\in A} c_{ij} x_{ij} \le D, \tag{1.3}\\
& 0 \le x_{ij} \le u_{ij} \qquad \forall (i,j) \in A. \tag{1.4}
\end{align}
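To make the data of an instance concrete, the following minimal C++ sketch (ours, not from the paper; all type and function names are illustrative) stores the arc list with costs and capacities and adds the return arc (t,s) with the capacity bound defined above.

```cpp
#include <vector>
#include <algorithm>
#include <cstdint>

// Illustrative instance representation; names are not from the paper.
struct Arc {
    int from, to;
    int64_t cost;      // c_ij
    int64_t capacity;  // u_ij
};

struct Instance {
    int n;                  // number of nodes
    int s, t;               // source and sink
    int64_t budget;         // D
    std::vector<Arc> arcs;  // arc set A
};

// Add the return arc (t,s) with c_ts = 0 and
// u_ts = min( total capacity out of s, total capacity into t ).
inline void addReturnArc(Instance& inst) {
    int64_t outS = 0, inT = 0;
    for (const Arc& a : inst.arcs) {
        if (a.from == inst.s) outS += a.capacity;
        if (a.to == inst.t)   inT += a.capacity;
    }
    inst.arcs.push_back({inst.t, inst.s, 0, std::min(outS, inT)});
}
```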
It is important to study this problem because it has important applications, such as in logistics, telecommunications and computer networks; and because it is related to variants of classical problems such as the constrained shortest path problem, the constrained transportation problem, and the constrained assignment
problem, all of which have important applications as well. In this research, we propose a combinatorial (non-LP based) algorithm that runs in O(n²m log(nC)) time. Our algorithm is based on the cost scaling algorithm for the minimum cost flow problem due to Goldberg and Tarjan [11], which in turn is an improved version of the algorithm by Röck [17]. We also provide the results of computational tests that show that the new algorithm significantly outperforms the existing combinatorial algorithms: the capacity scaling algorithm by Ahuja and Orlin [2] and the double scaling algorithm by Çalışkan [5]. The proposed cost scaling algorithm is on average 25 times faster than the double scaling algorithm, and 32 times faster than the capacity scaling algorithm.

The rest of the paper is organized as follows: we describe the motivation in Section 2, review the previous work in Section 3, provide the required preliminary information and definitions for the algorithm in Section 4, describe the proposed cost scaling algorithm in Section 5, establish the worst case complexity of the algorithm in Section 6, and provide the results of the computational experiments in Section 7; finally, we provide the concluding remarks in Section 8.
2. Motivation

The constrained maximum flow problem is related to some important problems such as the knapsack problem. In the knapsack problem, a hiker must decide which items to include in a knapsack that can carry only a limited weight, so as to maximize the total utility of the knapsack, where each item has an associated utility
and a weight. If all items have equal utility, we can formulate this as a constrained integer maximum flow problem as follows: (1) create a node for each item, (2) connect node s to each item node with a directed unit capacity arc with cost equal to the item weight, (3) connect each item node to node t with a directed unit capacity arc of zero cost. An extended version of this problem is as follows: consider another set of items with zero utility but nonzero cost. Selection of one item may force the hiker to select another item from a set of alternatives, which in turn may force her to select from another set of alternative items, and so on. This forms a tree rooted at each original item node with all arcs directed away from the root. If we disconnect the original item nodes from t and connect each leaf node to t with a directed unit capacity arc of cost zero, the resulting problem is again a constrained integer maximum flow problem. This problem can also be stated as the maximum number of paths problem, which can be expressed as follows: given a directed graph with associated arc costs, find the maximum number of paths (and the selected paths) from s to t such that the total cost of the paths does not exceed the budget.

The constrained maximum flow problem has many potential applications. We provide a few such applications here. Consider an agency that would like to allocate its funds to a number of activities with associated costs. The objective is to maximize the number of selected activities subject to a budget. Each activity has an associated start and end time, and some activities may require selecting from a number of subsequent activities. This can be formulated as the extended knapsack problem described above. Another example is as follows: consider an agency that would like to allocate its funds to train its employees. The objective is to maximize the number of employees trained. There are a number of courses with associated costs and class size limits, as well as start and end times. Selecting a course may require a subsequent one from a number of alternatives. This can also be formulated similarly to the problem explained above, except that capacities do not have to be one.

Finally, assume that we would like to determine the capacity of a distribution network. Some distribution centers are origins, and others are destinations for the flow of goods. Each arc represents a mode of transportation between two distribution centers and has an associated cost and capacity. The objective is to determine the maximum flow between the origin and destination nodes subject to the transportation budget. This problem can be formulated as the constrained maximum flow problem by introducing a super source and a super sink node and connecting the super source to the origins and the destinations to the super sink by directed arcs of unlimited capacity and zero cost. The same framework can be used to model computer networks where the objective is to maximize the packets transferred between sources and sinks while the total transfer cost is constrained.
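As an illustration of the equal-utility knapsack reduction above, the following C++ sketch (ours; the record type and function name are illustrative, not from the paper) builds the arc list of the corresponding constrained maximum flow instance; the budget D is simply the knapsack's weight limit.

```cpp
#include <vector>
#include <cstdint>

// Hypothetical arc record for the reduction; not taken from the paper.
struct KArc { int from, to; int64_t cost, capacity; };

// Build the arc list for the equal-utility knapsack as a constrained max flow:
// items 1..k, source s = 0, sink t = k + 1. Each item i gets a unit-capacity
// arc (s, i) with cost equal to its weight and a unit-capacity, zero-cost arc
// (i, t); the maximum flow value under budget D is the number of items that fit.
std::vector<KArc> knapsackAsFlow(const std::vector<int64_t>& weight) {
    const int k = static_cast<int>(weight.size());
    const int s = 0, t = k + 1;
    std::vector<KArc> arcs;
    for (int i = 1; i <= k; ++i) {
        arcs.push_back({s, i, weight[i - 1], 1});  // cost = item weight
        arcs.push_back({i, t, 0, 1});              // zero cost into the sink
    }
    return arcs;
}
```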
3. Literature review

The constrained maximum flow problem is a variant of the classical maximum flow problem, which is probably one of the most studied problems in operations research. Fulkerson and Dantzig first introduced the classical maximum flow problem, and the first algorithm to solve the problem was proposed by Ford and Fulkerson [8]. The maximum flow problem has many direct applications, as well as applications where it appears as a subproblem within the context of larger problems in transportation, logistics, telecommunications, and many other fields. Many algorithms have been proposed to solve the maximum flow problem to date. Ahuja et al. [3] describe a number of applications for the classical maximum flow problem, as well as many of the important algorithmic developments on the problem.

On the other hand, very few papers have appeared in the literature on variants of the classical maximum flow problem. Fulkerson [9] first describes a convex cost version of the constrained maximum flow problem as a network expansion problem. Malek-Zavarei and Frisch [15] study the version of the maximum flow problem in which additional linear constraints are imposed on some of the nodes. They propose a Lagrangian relaxation based heuristic solution procedure for the problem. For the constrained maximum flow problem that we study in this research, Ahuja and Orlin [2] propose an O(m S(n,m,nC) log U) capacity scaling algorithm, where n is the number of nodes in the network; m, the number of arcs; C, the largest arc cost; U, the largest arc capacity; and S(n,m,nC), the time to find a shortest path from a single source to all other nodes when nonnegative reduced costs are used as arc lengths. Using the best worst-case time bounds for the single source shortest paths problem with nonnegative arc lengths, this translates into the minimum of O((m² + mn√(log n)) log U) (Raman [16]), O(m² log log n log U) (Thorup [19]), O((m² + mn log log n) log U) (Thorup [20]), O(m² log log C log U) (Johnson [14]), O((m² + mn√(log C)) log U) (Hagerup [13]), and O((m² + mn(log C log log C)^(1/3)) log U) (Ahuja et al. [1]). Çalışkan [6] shows that the capacity scaling algorithm of Ahuja and Orlin [2] may sometimes terminate with an infeasible flow and proposes modifications to the algorithm to avoid this problem. Çalışkan [5] proposes a double scaling algorithm that runs in O(nm log(n²/m) log m log U log(nC)) time with the use of the dynamic trees due to Goldberg and Tarjan [12], and Çalışkan [7] proposes a specialized network simplex algorithm for the problem. Empirically, the specialized network simplex algorithm is the fastest algorithm for the constrained maximum flow problem, but it is not polynomial, i.e. its worst-case complexity is exponential. The cost scaling algorithm that we propose in this research is empirically the fastest combinatorial (non-LP based) polynomial algorithm to date, as our computational experiments indicate.

4. Preliminaries

A flow is a function x : A → R₊ ∪ {0} that satisfies Eqs. (1.2) and (1.4). The value of x_ts is called the value of the flow x. If a flow x also satisfies the budget constraint (Eq. (1.3)), then we call it a feasible flow. The constrained maximum flow problem is to find a feasible flow of maximum value. For simplicity of analysis, we make the following assumptions. None of these assumptions is restrictive, as we describe below:

Assumption 1. All arc capacities and all arc costs are integers.

Assumption 2. All arc capacities and all arc costs are nonnegative.

Assumption 3. For every optimal solution x*, cx* = D.

Assumption 4. Every arc in the network is on at least one path from s to t.

Assumption 1 can be satisfied by multiplying all arc capacities and costs by a suitable constant. Admittedly, this is some loss of generality; and because the algorithm's complexity depends on C, it may seemingly pose a problem if the cost data is ill-conditioned, i.e. the difference in magnitudes between the largest and smallest arc costs is large. In real world applications this is usually not the case, as is also mentioned in Gabow and Tarjan [10]. In any case, since the algorithmic complexity is
logarithmic in C, the effect of multiplying by a constant is rather small. For example, if the smallest arc cost is 10⁻⁶ and we need to multiply the costs by 10⁶, the worst case complexity only increases about 20-fold (log₂ 10⁶ ≈ 19.93). Assumption 2 can be satisfied by applying a straightforward network transformation, as explained in Ahuja and Orlin [2]. If Assumption 3 does not hold, the maximum flow is not constrained, i.e. the budget constraint is redundant; in this case, any standard maximum flow algorithm could be used to solve the constrained maximum flow problem. Assumption 4 can also be satisfied by a straightforward network transformation. An arc that is not on any path from s to t cannot have a positive flow in a feasible solution, and we can easily remove such arcs by repeatedly pruning leaf nodes (nodes that have no incoming arcs or no outgoing arcs) together with their incident arcs, until no such node exists. In the resulting network, every arc is on at least one path from s to t.

A pseudoflow is a function x : A → R₊ ∪ {0} that satisfies only Eq. (1.4). For a pseudoflow x, we define the imbalance of node i ∈ N as follows:

\[
e(i) = \sum_{(j,i)\in A} x_{ji} - \sum_{(i,j)\in A} x_{ij} \qquad \text{for all } i \in N. \tag{4.1}
\]
If e(i) > 0 for some node i, we call node i an excess node and we call e(i) the excess of node i. If e(i) < 0 for some node i, we call node i a deficit node and we call e(i) the deficit of node i. If e(i) = 0 for some node i, we call node i a balanced node. A pseudoflow with e(i) = 0 for all i ∈ N is a flow. We also call an excess node active, and all other nodes inactive.

Given a pseudoflow x, we define the corresponding residual network G(x) as follows: we replace each arc (i,j) ∈ G with two arcs, (i,j) and its reversal (j,i), where c_ji = −c_ij. The residual capacity of (i,j) is defined as r_ij = u_ij − x_ij, and the residual capacity of (j,i) as r_ji = x_ij. We only include in G(x) arcs with positive residual capacity. For each node i ∈ N, we define F(i,x) = {j ∈ N | (i,j) ∈ G(x)} and call it the forward star of i in G(x); and we define B(i,x) = {j ∈ N | (j,i) ∈ G(x)} and call it the backward star of i in G(x). We associate a node potential π(i) with each node i ∈ N. With respect to a set of node potentials π, we define the reduced cost of an arc (i,j) as c^π_ij = c_ij − π(i) + π(j).

The following theorem from Ahuja and Orlin [2] is crucial for our cost scaling algorithm:

Theorem 1. Let x* be an optimal solution to the minimum cost network flow problem with the same costs as the constrained maximum flow problem and with source supply and sink demand equal to the optimal flow value of the constrained maximum flow problem. Then, x* is also an optimal solution to the constrained maximum flow problem if cx* = D.

Proof. See [2, p. 91]. □
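The residual network, reduced costs and the ε-optimality test used below can be made concrete with the following C++ sketch (ours; the adjacency-list layout with paired forward and reverse residual arcs is a common convention we assume, not a detail prescribed by the paper).

```cpp
#include <vector>
#include <cstdint>

// Residual arc; the reversal of arc e is stored at index (e ^ 1).
struct ResArc {
    int to;            // head node j
    int64_t residual;  // r_ij = u_ij - x_ij on forward arcs, x_ij on reversals
    int64_t cost;      // c_ij on forward arcs, -c_ij on reversals
};

struct ResidualNetwork {
    std::vector<std::vector<int>> adj;  // forward star F(i,x): arc indices out of i
    std::vector<ResArc> arcs;
    std::vector<double> pi;             // node potentials pi(i)

    explicit ResidualNetwork(int n) : adj(n), pi(n, 0.0) {}

    void addArc(int i, int j, int64_t cost, int64_t capacity) {
        adj[i].push_back((int)arcs.size()); arcs.push_back({j, capacity,  cost});
        adj[j].push_back((int)arcs.size()); arcs.push_back({i, 0,        -cost});
    }

    // Reduced cost c^pi_ij = c_ij - pi(i) + pi(j).
    double reducedCost(int i, int e) const {
        return (double)arcs[e].cost - pi[i] + pi[arcs[e].to];
    }

    // Definition 1 below: x is eps-optimal if every residual arc has c^pi_ij >= -eps.
    bool isEpsOptimal(double eps) const {
        for (int i = 0; i < (int)adj.size(); ++i)
            for (int e : adj[i])
                if (arcs[e].residual > 0 && reducedCost(i, e) < -eps) return false;
        return true;
    }
};
```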
Based on Theorem 1, the constrained maximum flow problem can be solved by modifying an existing algorithm for the minimum cost network flow problem. We propose a modified cost scaling algorithm that is both theoretically and empirically fast. We use the concept of ε-optimality of the minimum cost network flow problem in our cost scaling algorithm. This concept is due to Bertsekas [4], and independently, Tardos [18].

Definition 1. A flow x or a pseudoflow x is called ε-optimal for some ε and π if c^π_ij ≥ −ε for all (i,j) in the residual network G(x).

It is well known that an ε-optimal flow is optimal for the minimum cost network flow problem if ε = 0. When the arc costs are integers, any ε-optimal flow is optimal for ε < 1/n, as the following lemma establishes:

Lemma 1. For a minimum cost network flow problem with integer arc costs, if ε ≥ C, where C is the largest arc cost in the network, any flow x is ε-optimal. If ε < 1/n, then any ε-optimal flow is an optimal flow.

Proof. See [3, p. 363]. □

We call an ε-optimal flow x that also satisfies the condition cx = D an ε-optimal solution (to the constrained maximum flow problem). We call an arc (i,j) in the residual network G(x) an admissible arc if −ε/2 ≤ c^π_ij < 0, and a path consisting entirely of admissible arcs, an admissible path. We define the admissible network of a given residual network as the network consisting solely of admissible arcs. We represent admissible paths with predecessor indices: for a given admissible path, pred(i) represents the node on the path that comes immediately before node i.

5. The cost scaling algorithm

A scaling algorithm either solves a series of approximate versions of the original problem, or obtains a series of approximate solutions, with an increasing degree of accuracy. The cost scaling algorithm obtains a series of ε-optimal solutions with decreasing values of ε, starting with ε = C. The algorithm performs cost scaling phases by iteratively executing the improve_approximation procedure, which transforms an ε-optimal solution into an ε/2-optimal solution. Initially, the condition cx = D is not satisfied since x = 0. At the end of the first scaling phase, cx = D is achieved and the solution is a C/2-optimal solution to the constrained maximum flow problem. At the end of every ε-scaling phase, the flow is an ε/2-optimal solution to the constrained maximum flow problem, and at the end of the last scaling phase ε < 1/n, so by Lemma 1 and Theorem 1 the algorithm terminates with an optimal solution to the constrained maximum flow problem. The algorithm and its procedures are formally described in Figs. 1–3.

Fig. 1. The cost scaling algorithm.
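Since Fig. 1 itself is not reproduced here, the following C++ sketch (ours) gives a hedged rendering of the outer scaling loop exactly as it is described in the text; it reuses the illustrative ResidualNetwork type from the sketch in Section 4, and improve_approximation is itself sketched after Fig. 3 below.

```cpp
#include <cstdint>

// Converts an eps-optimal solution into an eps/2-optimal one (see the sketch
// following the description of Figs. 2 and 3).
void improve_approximation(ResidualNetwork& G, double eps, int s, int t, int64_t D);

// Outer loop of the cost scaling algorithm as described in Section 5:
// start with eps = C and halve it until eps < 1/n.
void costScaling(ResidualNetwork& G, int64_t C, int s, int t, int64_t D) {
    const int n = (int)G.adj.size();
    double eps = (double)C;                      // any flow is C-optimal (Lemma 1)
    while (eps >= 1.0 / n) {
        improve_approximation(G, eps, s, t, D);  // solution is now eps/2-optimal
        eps /= 2.0;                              // hence eps-optimal for the new eps
    }
    // eps < 1/n: by Lemma 1 and Theorem 1 the current flow is optimal.
}
```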
At the beginning of an ε-scaling phase (after the initial scaling phase), the solution (x, π) is ε-optimal for the constrained maximum flow problem since the ε-optimality condition (Definition 1) and cx = D are satisfied. The procedure improve_approximation converts this ε-optimal solution to an ε/2-optimal solution in two stages: (i) by saturating the arcs that violate ε/2-optimality to convert the flow to an ε/2-optimal pseudoflow, and then (ii) by pushing flows on admissible arcs to gradually convert the ε/2-optimal pseudoflow to a flow, and by sending flows on admissible paths from s to t to restore the condition cx = D. Thus, at the end of an ε-scaling phase, (x, π) is an ε/2-optimal solution to the constrained maximum flow problem.

The procedure improve_approximation first creates an ε/2-optimal pseudoflow from an ε-optimal flow by saturating the arcs that violate ε/2-optimality, but this may create imbalances at some nodes. The procedure then pushes flows from active nodes to inactive nodes to convert the pseudoflow into a flow. If a push on an arc (i,j) is equal to its residual capacity r_ij, we call it a saturating push; otherwise, a nonsaturating push. These pushes are carried out on admissible arcs in order to maintain ε/2-optimality. If there is no admissible arc emanating from an active node i, we increase π(i) by an amount equal to the smallest of the reduced costs of the arcs in F(i,x) plus ε/2, which guarantees that there will be at least one admissible arc emanating from that node after the change in π(i). This is called relabeling. After each push, if cx < D, the procedure find_admissible_path identifies an admissible path and we push the required amount of flow from s to t to restore cx = D.

The procedure find_admissible_path starts constructing an admissible path at s, and adds admissible arcs to the path at each step by advancing to the next node via the first incident admissible arc. If at any point there is no admissible arc incident to the current node, the procedure increases the node's potential by an amount equal to the smallest of the reduced costs of the arcs emanating from it plus ε/2, to create at least one admissible arc, and retreats to the previous node. The procedure terminates when it reaches node t. Fig. 2 formally describes the procedure improve_approximation, and Fig. 3 describes the procedure find_admissible_path.

Fig. 2. The procedure improve_approximation of the cost scaling algorithm.

Fig. 3. The procedure find_admissible_path of the cost scaling algorithm.
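Because Figs. 2 and 3 are not reproduced here, the following C++ sketch (ours, built on the illustrative ResidualNetwork type above) only indicates how the two procedures could be organized from the description in the text: saturate the arcs that violate ε/2-optimality, then push/relabel active nodes over admissible arcs, and build admissible s–t paths by advance/retreat steps. The budget-restoring push itself is left as a stub, because the exact amount of flow to push is specified only in the unreproduced Fig. 2.

```cpp
#include <vector>
#include <cstdint>
#include <algorithm>
#include <limits>

// Admissibility (Section 4): positive residual capacity and -eps/2 <= c^pi_ij < 0.
bool admissible(const ResidualNetwork& G, int i, int e, double eps) {
    double rc = G.reducedCost(i, e);
    return G.arcs[e].residual > 0 && rc < 0.0 && rc >= -eps / 2.0;
}

int firstAdmissible(const ResidualNetwork& G, int i, double eps) {
    // A real implementation keeps a "current arc" pointer per node to obtain the
    // stated O(nm) arc-scanning bound; a plain scan keeps this sketch short.
    for (int e : G.adj[i]) if (admissible(G, i, e, eps)) return e;
    return -1;
}

// Relabel: raise pi(i) by (min reduced cost over residual arcs out of i) + eps/2,
// which creates at least one admissible arc out of i.
void relabel(ResidualNetwork& G, int i, double eps) {
    double minRc = std::numeric_limits<double>::max();
    for (int e : G.adj[i])
        if (G.arcs[e].residual > 0) minRc = std::min(minRc, G.reducedCost(i, e));
    G.pi[i] += minRc + eps / 2.0;
}

// Advance/retreat construction of an admissible s-t path (sketch of Fig. 3),
// returned as a sequence of residual arc indices (predecessor information).
std::vector<int> find_admissible_path(ResidualNetwork& G, int s, int t, double eps) {
    std::vector<int> pathArcs;
    int i = s;
    while (i != t) {
        int e = firstAdmissible(G, i, eps);
        if (e >= 0) { pathArcs.push_back(e); i = G.arcs[e].to; }   // advance
        else {                                                     // relabel + retreat
            relabel(G, i, eps);
            if (!pathArcs.empty()) pathArcs.pop_back();
            i = pathArcs.empty() ? s : G.arcs[pathArcs.back()].to;
        }
    }
    return pathArcs;
}

// Sketch of improve_approximation (Fig. 2): stage (i) saturates the arcs that
// violate eps/2-optimality; stage (ii) eliminates the resulting excesses by
// pushes on admissible arcs and relabels. The incoming x is a flow, so all
// imbalances start at zero.
void improve_approximation(ResidualNetwork& G, double eps, int s, int t, int64_t D) {
    const int n = (int)G.adj.size();
    std::vector<int64_t> excess(n, 0);

    // Stage (i): saturating the violating arcs yields an eps/2-optimal pseudoflow.
    for (int i = 0; i < n; ++i)
        for (int e : G.adj[i])
            if (G.arcs[e].residual > 0 && G.reducedCost(i, e) < -eps / 2.0) {
                int64_t delta = G.arcs[e].residual;
                G.arcs[e].residual -= delta; G.arcs[e ^ 1].residual += delta;
                excess[i] -= delta; excess[G.arcs[e].to] += delta;
            }

    // Stage (ii): push/relabel until no active (positive-excess) node remains.
    std::vector<int> active;
    for (int i = 0; i < n; ++i) if (excess[i] > 0) active.push_back(i);
    while (!active.empty()) {
        int i = active.back();
        if (excess[i] <= 0) { active.pop_back(); continue; }
        int e = firstAdmissible(G, i, eps);
        if (e < 0) { relabel(G, i, eps); continue; }             // no admissible arc
        int j = G.arcs[e].to;
        int64_t delta = std::min(excess[i], G.arcs[e].residual); // = r_ij: saturating
        G.arcs[e].residual -= delta; G.arcs[e ^ 1].residual += delta;
        excess[i] -= delta; excess[j] += delta;
        if (excess[j] > 0) active.push_back(j);
        // Whenever cx < D after a push, Fig. 2 restores cx = D by pushing flow
        // from s to t along find_admissible_path(G, s, t, eps); that step is
        // omitted here because its flow amount is given only in the figure.
    }
    (void)s; (void)t; (void)D;  // used only by the omitted budget-restoring step
}
```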
6. Analysis of the algorithm

In this section, we analyze the algorithm, prove that it terminates with an optimal solution to the constrained maximum flow problem, and establish its worst case time complexity. The following lemma proves that the procedure improve_approximation starts with an ε-optimal solution and ends with an ε/2-optimal solution.

Lemma 2. The procedure improve_approximation converts an ε-optimal solution to an ε/2-optimal solution.

Proof. At the beginning of the procedure, all arcs in G(x) with reduced costs less than −ε/2 are saturated. This removes from G(x) all arcs that previously violated ε/2-optimality. The arcs with reduced costs of at least −ε/2 already satisfy ε/2-optimality, so they are not changed. The resulting pseudoflow is ε/2-optimal. We now show that push, relabel and retreat/relabel steps maintain ε/2-optimality. When we push flow on an admissible arc (i,j), we may introduce the reverse arc (j,i) into G(x). But since −ε/2 ≤ c^π_ij < 0 (by the definition of admissibility), c^π_ji = −c^π_ij > 0 and the resulting pseudoflow still satisfies ε/2-optimality. If we increase the node potential π(i) of a node i, the reduced costs of all outgoing arcs are lowered by the amount min_{j∈F(i,x)} c^π_ij + ε/2. Before the change, c^π_ij ≥ min_{j∈F(i,x)} c^π_ij for all outgoing arcs (i,j), and after the change, c^π_ij ≥ −ε/2. The reduced costs of all incoming arcs are increased by the positive amount min_{j∈F(i,x)} c^π_ij + ε/2. Thus, the ε/2-optimality conditions are satisfied for all arcs after the node potential change. Furthermore, the push operations convert the pseudoflow into a flow by eliminating excesses and deficits, and the pushes from s to t restore cx = D. Therefore, the resulting flow is an ε/2-optimal solution. □

We will now prove that the procedure find_admissible_path always terminates with an admissible path. The procedure
find_admissible_path starts at node s, iteratively forms a path out of s, and terminates when it reaches node t. Only if it cycled through a series of nodes could it fail to terminate within a finite number of advance steps. But the admissible network stays acyclic, as the following lemma establishes; thus, the procedure always terminates after a finite number of steps.

Lemma 3. The admissible network is acyclic throughout the cost scaling algorithm.

Proof. This is true at the beginning of the algorithm, as the initial reduced costs are all nonnegative and there are no admissible arcs in the residual network. We always push flow on an arc (i,j) when c^π_ij < 0, so if we add the reversal arc (j,i), c^π_ji > 0 and the reversal arc is not admissible. Thus, pushes do not add admissible arcs to the residual network and the admissible network stays acyclic. When we increase the node potential of node i, we create at least one admissible outgoing arc from i. However, since we increase π(i) by min_{j∈F(i,x)} c^π_ij + ε/2, all incoming arcs of i become inadmissible. Thus, increasing the potential of a node cannot create a directed cycle passing through the node. Hence, neither pushes nor node potential increases can create directed cycles, and the admissible network remains acyclic. □

We rely on the following lemma from [12] to establish a bound on the number of relabel and retreat/relabel steps during an execution of the procedure improve_approximation.

Lemma 4 (Goldberg and Tarjan [12]). Let x be a pseudoflow and x′ be a flow. For any node v with e(v) > 0, there exists a node w with e(w) < 0 and a sequence of distinct nodes v = v₀, v₁, …, v_{l−1} = w such that (v_i, v_{i+1}) ∈ G(x) and (v_{i+1}, v_i) ∈ G(x′) for 0 ≤ i < l−1.

Proof. See [12, p. 445]. □
Lemma 5. During an execution of the procedure improve_approximation, the number of potential increases for any node is O(n).

Proof. Let x be the ε/2-optimal pseudoflow and x′ be the ε-optimal flow at the end of the previous cost scaling phase, and let π and π′ be the corresponding node potentials. By Lemma 4, for every active node v there exists a deficit node w and a directed path P from v to w in G(x), whereas its reversal P′ is a directed path in G(x′). Thus, the arcs on P satisfy ε/2-optimality and the arcs on P′ satisfy ε-optimality. Since a path can contain at most n−1 arcs, it follows that

\[
\sum_{(i,j)\in P} c^{\pi}_{ij} \ge -(n-1)\varepsilon/2
\]

and

\[
\sum_{(j,i)\in P'} c^{\pi'}_{ji} \ge -(n-1)\varepsilon.
\]

Substituting c^π_ij = c_ij − π(i) + π(j) and c^{π′}_ji = −c_ij − π′(j) + π′(i) in the above expressions results in

\[
\sum_{(i,j)\in P} c_{ij} - \pi(v) + \pi(w) \ge -(n-1)\varepsilon/2 \tag{6.1}
\]

and

\[
-\sum_{(i,j)\in P} c_{ij} - \pi'(w) + \pi'(v) \ge -(n-1)\varepsilon. \tag{6.2}
\]

Rearranging the terms in Eq. (6.2), we obtain

\[
(n-1)\varepsilon - \pi'(w) + \pi'(v) \ge \sum_{(i,j)\in P} c_{ij}. \tag{6.3}
\]

Substituting Eq. (6.3) in Eq. (6.1) results in

\[
\pi(v) - \pi'(v) \le \tfrac{3}{2}(n-1)\varepsilon + \bigl(\pi(w) - \pi'(w)\bigr). \tag{6.4}
\]

Now observe that π(w) = π′(w) throughout the algorithm, since we never increase node potentials for inactive nodes. Since the algorithm increases a node potential by at least ε/2 units at a time, a node can be relabeled at most 3(n−1) times, or O(n) times. □

We now determine upper bounds on the numbers of saturating and nonsaturating pushes during an execution of the procedure improve_approximation. The following two lemmas establish those bounds.

Lemma 6. The procedure improve_approximation performs O(nm) saturating pushes.

Proof. Consider a saturating push on an arc (i,j). From the definition of admissibility, c^π_ij < 0 before the saturation. Before we can saturate this arc again, we must send back some flow on the reverse arc (j,i). In order to do that, (j,i) must be admissible, so c^π_ji < 0, i.e. c^π_ij > 0. But this is only possible if we have relabeled node j. Then, in order to saturate (i,j) again, we must have c^π_ij < 0, which is only possible if we have relabeled node i. By Lemma 5, the number of relabelings of any node is O(n). Thus, we can saturate any arc at most O(n) times, which makes the total number of saturating pushes of the procedure improve_approximation O(nm). □

Lemma 7. The procedure improve_approximation performs O(n²m) nonsaturating pushes.

Proof. Let f(i) be the number of nodes that are reachable from node i in the admissible network (including i itself), and let F = Σ_{i active} f(i). Throughout the procedure there can be at most n−1 active nodes in the network, since there must be at least one deficit node. At any time, f(i) ≤ n for each active node, which bounds F by O(n²). A saturating push on an arc (i,j) might make node j active, which would increase F by f(j) ≤ n; Lemma 6 then implies that the total increase in F due to saturating pushes is O(n²m). A relabeling operation on node i may create new admissible arcs, which may increase f(i) by at most n units. Since relabeling node i makes all incoming arcs of i inadmissible, f(k) does not change for any other node k; Lemma 5 then implies that the total increase in F due to relabeling operations is O(n³). Now consider the effect of nonsaturating pushes on F. After a nonsaturating push on arc (i,j), node i becomes inactive and node j might become active. Thus, F may increase by at most f(j) − f(i) units after a nonsaturating push. But f(i) ≥ f(j) + 1, since every node that is reachable from j is also reachable from i and, by Lemma 3, node i is not reachable from node j. Thus, F decreases by at least one unit after each nonsaturating push. Hence, the number of nonsaturating pushes throughout the procedure improve_approximation is bounded by O(n²) + O(n²m) + O(n³) = O(n²m). □

Lemma 8. The procedure improve_approximation runs in O(n²m) time.

Proof. By Lemma 6, the number of saturating pushes is O(nm), and by Lemma 7, the number of nonsaturating pushes is O(n²m). A push from s to t consists of saturating and nonsaturating pushes, so the total time for those is also bounded by O(n²m + nm). An execution of the procedure find_admissible_path consists of advance and retreat/relabel steps. By Lemma 5, the number of retreat/relabel steps is O(n). The advance steps can be classified into two types: those that are later cancelled and those that are not. The first type constitute the retreat part of the retreat/relabel steps, so they are bounded by O(n). The second type are also bounded by O(n), because the length of a path cannot exceed n−1. Finally, the total time to find an admissible arc for all nodes in the network before one relabeling is O(m), as one only has to go through every outgoing arc once before a relabeling. Since each node is relabeled at most O(n) times, the total time for finding admissible arcs is O(nm). Thus, the overall complexity of the procedure improve_approximation is O(n²m + nm + n²m + nm + n + n + nm) = O(n²m). □
We will now show that the algorithm terminates finitely and establish its worst case complexity. The algorithm starts with ε = C and halves it until ε < 1/n. Therefore, the algorithm performs 1 + log C − log(1/n) = 1 + log(nC) = O(log(nC)) cost scaling phases, which means that the number of executions of the improve_approximation procedure is also O(log(nC)). During an execution of improve_approximation, the first stage is to saturate the arcs that violate ε/2-optimality, which takes O(m) time. The second stage of improve_approximation is to push flows from active nodes to inactive nodes (towards deficit nodes), and it takes O(n²m) time by Lemmas 6 and 7. The only part of the algorithm that seems to have the possibility of not terminating finitely is find_admissible_path. The admissible network changes throughout the algorithm, but Lemma 3 proves that it always stays acyclic. If it is always acyclic, then during an execution of find_admissible_path it is not possible to come back to a previously visited node via advance steps and cycle indefinitely through the same set of nodes. Thus, find_admissible_path always terminates in a finite number of steps as well. Since the number of scaling phases is O(log(nC)), we have thus established the following theorem:

Theorem 2. The cost scaling algorithm for the constrained maximum flow problem runs in O(n²m log(nC)) time.

7. Computational results

In order to test the empirical performance of the cost scaling algorithm, we generated problem instances using a random network generator similar to the one described in Çalışkan [5]. In the computational tests, we compared the cost scaling algorithm to the existing polynomial combinatorial algorithms: the capacity scaling and the double scaling algorithms. This also provided a sanity check for our code. All algorithms were coded in the same programming style, using the same network representation and data structures, so that there was no performance variation due to differences in implementation details. We coded the cost scaling algorithm in C++ and compiled it with Microsoft Visual C++ 7.1 using the optimization option /O2, and the same was true for the other two algorithms. We conducted the computational tests on a personal computer with a 2.0 GHz Intel Core 2 Duo processor and 2.0 GB of memory.

The generator that we used in the computational tests generates a random network given the user defined lower and upper bounds on costs and capacities, the number of nodes, the desired network density, and a seed for the random numbers. The density d of the network is defined as m/n. The generator first randomly designates one of the nodes as node s, and another one as node t. It then randomly generates a path from s to t to ensure feasibility. This is done by first randomly generating an arc out of s as the first arc of the path. The generator then extends this partial path by randomly generating an arc from the path's last node to another node, and this process repeats until the path finally reaches node t. After generating the first path, the generator connects the remaining nodes of the network to the nodes of the first generated path by generating "side paths." This is done by randomly selecting a node that is not on the first generated path and connecting this node to the path nodes by an arc from a path node to the selected node and by another arc from the selected node to another path node. After all nodes are connected this way, the generator continues to generate arcs randomly until the user-defined network density is reached. During this process, every time an arc is generated, its cost and capacity are drawn from uniform distributions between the lower and upper bounds on costs and capacities, respectively. Finally, the budget is determined as $D = \min\{\sum_{(s,i)\in A} u_{si},\ \sum_{(i,t)\in A} u_{it}\} \cdot C/2$, where C is the maximum arc cost. This choice was determined empirically, and it successfully constrained the maximum flow in all instances of the problem, i.e. it created problems in which the budget constraint was tight at optimality.

The network generator creates networks in which every node is on at least one directed path from s to t. This principle was incorporated in the network generator because arcs that are not part of any directed path from s to t cannot carry nonzero flow in any feasible solution, due to flow conservation at the nodes. If there are such arcs in a network, they can easily be removed in O(m) time by a breadth-first search from each of the nodes that have no incoming arcs or no outgoing arcs (other than s and t).

We generated networks with up to 16,384 nodes and 524,288 arcs in our experiments. The number of nodes was chosen as n = 2^x, where x ∈ {8, 9, 10, 11, 12, 13, 14}. The network density d was in the range {8, 16, 32}, the arc capacities were uniformly distributed in [1, 10⁴], and the arc costs were uniformly distributed in [1, 10²]. We generated 10 random instances for each combination of the parameters, making the total number of instances 210.
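The generation scheme described above can be summarized by the following C++ sketch (ours; the record names, the use of std::mt19937, and the fact that duplicate arcs are not filtered are our assumptions, not details from the paper).

```cpp
#include <vector>
#include <random>
#include <algorithm>
#include <cstdint>

// Illustrative arc and instance records; names are ours.
struct GenArc { int from, to; int64_t cost, capacity; };
struct GenInstance { int n, s, t; int64_t budget; std::vector<GenArc> arcs; };

// Random generator following the description in the text: designate s and t,
// build a random s-t path, attach the remaining nodes via "side paths", then
// add random arcs until density d = m/n is reached.
GenInstance generate(int n, int d, int64_t cmin, int64_t cmax,
                     int64_t umin, int64_t umax, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> node(0, n - 1);
    std::uniform_int_distribution<int64_t> cost(cmin, cmax), cap(umin, umax);

    GenInstance g; g.n = n;
    g.s = node(rng);
    do { g.t = node(rng); } while (g.t == g.s);
    auto addArc = [&](int i, int j) { g.arcs.push_back({i, j, cost(rng), cap(rng)}); };

    // 1. A random walk from s to t guarantees feasibility.
    std::vector<char> onPath(n, 0);
    std::vector<int> path{g.s};
    onPath[g.s] = 1;
    for (int cur = g.s; cur != g.t; ) {
        int nxt;
        do { nxt = node(rng); } while (nxt == cur);
        addArc(cur, nxt);
        if (!onPath[nxt]) { onPath[nxt] = 1; path.push_back(nxt); }
        cur = nxt;
    }

    // 2. Attach every remaining node v by a "side path": path node -> v -> path node.
    std::uniform_int_distribution<size_t> pick(0, path.size() - 1);
    for (int v = 0; v < n; ++v)
        if (!onPath[v]) { addArc(path[pick(rng)], v); addArc(v, path[pick(rng)]); }

    // 3. Add random arcs until the requested density m = d * n is reached.
    while ((int)g.arcs.size() < d * n) {
        int i = node(rng), j = node(rng);
        if (i != j) addArc(i, j);
    }

    // 4. Budget D = min( total capacity out of s, total capacity into t ) * C / 2,
    //    where C is the maximum generated arc cost.
    int64_t outS = 0, inT = 0, C = 0;
    for (const GenArc& a : g.arcs) {
        if (a.from == g.s) outS += a.capacity;
        if (a.to == g.t)   inT += a.capacity;
        C = std::max(C, a.cost);
    }
    g.budget = std::min(outS, inT) * C / 2;
    return g;
}
```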
Table 1 shows the CPU times of the cost scaling algorithm vs. the capacity scaling and the double scaling algorithms. We stopped the algorithms after 1400 s elapsed, as this was about four times as long as the longest CPU time of the cost scaling algorithm in the experiments. In Table 1, we did not report CPU times if there were instances for a particular network size that were not solved to optimality within the maximum time limit, because the CPU times for the solved ones would not represent the average CPU time for that group. The numbers of solved instances for those missing entries in Table 1 were as follows: n = 8192 and d = 32, 1 for capacity scaling; n = 16,384 and d = 8, 7 for double scaling and 4 for capacity scaling; n = 16,384 and d = 16, 0 for double scaling and 2 for capacity scaling; n = 16,384 and d = 32, 0 for both double and capacity scaling.

Table 1
CPU times of the cost scaling, double scaling and capacity scaling algorithms.

  Network size             CPU times (s)                                      CPU time ratios
  n        d    m          Cost          Double         Capacity              (ii/i)   (iii/i)
                           scaling (i)   scaling (ii)   scaling (iii)

  256      8    2048          0.012         0.470           0.140               38       11
  512      8    4096          0.060         1.495           0.495               25        8
  1024     8    8192          0.188         4.258           2.466               23       13
  2048     8    16,384        0.773        14.544          10.857               19       14
  4096     8    32,768        2.895        67.509          93.971               23       32
  8192     8    65,536       13.111       297.225         611.961               23       47
  16,384   8    131,072     155.013           a               a                  –        –

  256      16   4096          0.017         0.941           0.208               56       12
  512      16   8192          0.078         2.206           0.569               28        7
  1024     16   16,384        0.272         7.487           7.607               27       28
  2048     16   32,768        1.110        26.189          37.608               24       34
  4096     16   65,536        4.352       104.308         294.300               24       68
  8192     16   131,072      19.102       490.647        1341.136               26       70
  16,384   16   262,144     242.528           a               a                  –        –

  256      32   8192          0.041         1.923           0.413               47       10
  512      32   16,384        0.122         4.603           1.639               38       13
  1024     32   32,768        0.480        14.395          13.725               30       29
  2048     32   65,536        1.687        49.539         189.025               29      112
  4096     32   131,072       6.298       167.725        1088.305               27      173
  8192     32   262,144      26.549       670.588             a                 25        –
  16,384   32   524,288     325.738           a               a                  –        –

  a The program was stopped after 1400 s elapsed.
Fig. 4. CPU times of the cost scaling algorithm vs. the capacity scaling and the double scaling algorithms when d = 8 (CPU time in seconds, log scale, against n, the number of nodes, as a power of 2).

Fig. 5. CPU times of the cost scaling algorithm vs. the capacity scaling and the double scaling algorithms when d = 16 (same axes).

Fig. 6. CPU times of the cost scaling algorithm vs. the capacity scaling and the double scaling algorithms when d = 32 (same axes).
The results in Table 1 and in Figs. 4–6 show that the cost scaling algorithm is computationally superior to both the capacity scaling and the double scaling algorithms. In the experiments, the cost scaling algorithm was up to 56 times faster than the double scaling algorithm, with an average of 25 times; and up to 173 times faster than the capacity scaling algorithm, with an average of 32 times. Furthermore, it was significantly faster than both algorithms on every instance of the test problems, being at least 19 times faster than the double scaling algorithm and at least seven times faster than the capacity scaling algorithm. The specialized network simplex algorithm proposed by Çalışkan [7] runs faster than the cost scaling algorithm presented in this research. Empirically, the specialized network simplex algorithm is the fastest algorithm for the constrained maximum flow problem, but it is not polynomial. Thus, the cost scaling algorithm is empirically the fastest combinatorial polynomial algorithm for the constrained maximum flow problem.
8. Conclusions

In this research we propose a polynomial combinatorial (non-LP based) algorithm for the constrained maximum flow problem that runs in O(n²m log(nC)) time. The constrained maximum flow problem is important to study as it is related to important combinatorial optimization and network flow problems, and it has many potential applications. Our computational tests show that the proposed algorithm is significantly faster than the existing polynomial combinatorial algorithms for the problem: on average, 25 times faster than the double scaling algorithm, and 32 times faster than the capacity scaling algorithm, making it the fastest combinatorial algorithm for the constrained maximum flow problem.

References
[1] Ahuja RK, Mehlhorn K, Orlin JB, Tarjan RE. Faster algorithms for the shortest path problem. J ACM 1990;37:213–23.
[2] Ahuja RK, Orlin JB. A capacity scaling algorithm for the constrained maximum flow problem. Networks 1995;25:89–98.
[3] Ahuja RK, Magnanti TL, Orlin JB. Network flows: theory, algorithms and applications. Englewood Cliffs, NJ: Prentice-Hall; 1993.
[4] Bertsekas DP. A distributed algorithm for the assignment problem. Working Paper. Laboratory for Information and Decision Systems. Cambridge, MA: MIT; 1979.
[5] Çalışkan C. A double scaling algorithm for the constrained maximum flow problem. Comput Operat Res 2008;35(4):1138–50.
[6] Çalışkan C. On a capacity scaling algorithm for the constrained maximum flow problem. Networks 2009;53(3):229–30.
[7] Çalışkan C. A specialized network simplex algorithm for the constrained maximum flow problem. Eur J Operat Res 2011;210(2):137–47.
[8] Ford LR, Fulkerson DR. Maximal flow through a network. Can J Math 1956;8:399–404.
[9] Fulkerson DR. Increasing the capacity of a network: the parametric budget problem. Manage Sci 1959;5(1):473–83.
[10] Gabow HN, Tarjan RE. Faster scaling algorithms for network problems. SIAM J Comput 1989;18:1013–36.
[11] Goldberg AV, Tarjan RE. Solving minimum cost flow problem by successive approximation. In: Proceedings of the 19th ACM symposium on the theory of computing; 1987. p. 7–18. [Full paper in Math Operat Res 15 (1990) 430–66].
[12] Goldberg AV, Tarjan RE. Finding minimum cost circulations by successive approximation. Math Operat Res 1990;15:430–66.
[13] Hagerup T. Improved shortest paths on the word RAM. In: Proceedings of the 27th international colloquium on automata, languages, and programming, Geneva, Switzerland; 2000. p. 61–72.
[14] Johnson DB. A priority queue in which initialization and queue operations take O(log log D) time. Math Syst Theory 1982;15:295–309.
[15] Malek-Zavarei M, Frisch IT. A constrained maximum flow problem. Int J Control 1971;14(3):549–60.
[16] Raman R. Priority queues: small, monotone and trans-dichotomous. In: Proceedings of the fourth annual European symposium on algorithms. Lecture notes in computer science, vol. 1136. Springer-Verlag; 1996. p. 121–37.
[17] Röck H. Scaling techniques for minimal cost network flows. In: Discrete structures and algorithms. Munich: Carl Hanser; 1980. p. 181–91.
[18] Tardos É. A strongly polynomial minimum cost circulation algorithm. Combinatorica 1985;5:247–55.
[19] Thorup M. On RAM priority queues. SIAM J Comput 2000;30:86–109.
[20] Thorup M. Integer priority queues with decrease key in constant time and the single source shortest paths problem. In: Proceedings of the 35th annual ACM symposium on theory of computing (STOC), San Diego, CA. New York: ACM; 2003. p. 149–58.