Computers & Operations Research 39 (2012) 164–178
A three-stage approach for the resource-constrained shortest path as a sub-problem in column generation

Xiaoyan Zhu a,*, Wilbert E. Wilhelm b

a Department of Industrial and Information Engineering, The University of Tennessee, Knoxville, TN 37996-2100, USA
b Industrial and Systems Engineering Department, Texas A&M University, College Station, TX 77843-3131, USA
Article history: Available online 29 March 2011

Abstract
This paper focuses on the common scenario in which the resource-constrained shortest path problem (RCSP) on an acyclic graph is a sub-problem in the context of column generation. It proposes a pseudo-polynomial time, three-stage solution approach. Stages 1 (preprocessing) and 2 (setup) are implemented once to transform RCSP into a shortest path problem, which is solved by stage 3 (iterative solution) at each column generation iteration, each time with different arc costs. This paper analyzes certain properties related to each stage as well as algorithm complexity. Computational tests compare the performance of this method, a state-of-the-art label-setting algorithm, and CPLEX optimization software on four classes of instances, each of which involves either one or multiple resources, and show that the new method is effective. The new method outperforms the label-setting algorithm when resource limitations are tight, as can be expected in practice, and outperforms CPLEX on all tested instances. The label-setting algorithm outperforms CPLEX on all single-resource RCSP instances and on almost all multiple-resource RCSP instances.

© 2011 Elsevier Ltd. All rights reserved.
Keywords: Integer programming; Column generation; Resource-constrained shortest-path problem; Pseudo-polynomial time; Three-stage solution approach
1. Introduction

This paper considers a scenario in which a resource-constrained shortest path problem (RCSP) on an acyclic graph is used as a sub-problem in column generation. This problem finds many applications, for example: scheduling [24,3,13]; designing assembly systems [27]; global supply chain design [32]; prescribing the content and timing of product upgrades [31]; optimizing picking [29,33] and placing [30] operations on dual-head placement machines, which are used to assemble circuit cards; and balancing transfer lines [9]. Wilhelm [28] provided a recent survey on column generation.

This paper focuses on RCSP on an acyclic, directed graph G = (V, A) of n = |V| nodes. We assume that the nodes are in topological order (i.e., each arc (i, j) ∈ A satisfies 1 ≤ i < j ≤ n) and that v1 and vn are the start and end nodes, respectively. Let path v1–vj denote a series of consecutive arcs from v1 to vj (such a path may not be unique). An index set of resources, R, is available, each in limited supply Tr, which is positive and discrete-valued for each r ∈ R. Each arc (i, j) ∈ A has a real-valued cost cij and a discrete-valued vector wij = (wij1, …, wij|R|) representing the resources required to traverse it. A path is (resource) feasible if
* Corresponding author. Tel.: +1 865 974 7655; fax: +1 865 974 0588. E-mail addresses: [email protected] (X. Zhu), [email protected] (W.E. Wilhelm). doi:10.1016/j.cor.2011.03.008
the requirement for resource r on the path (i.e., the sum of the requirements for resource r over all arcs in that path) is no greater than Tr for all r ∈ R. RCSP is to find a least-cost, resource-feasible v1–vn path in G. We denote RCSP with a single resource constraint (|R| = 1) as SRCSP, and RCSP with multiple resource constraints (|R| > 1) as MRCSP. Time, distance, capacity, money, workload, and reliability requirements are typical resources of interest.

In column generation, a sub-problem (here, an RCSP) must be solved a number of times in the search for an optimal solution of a master problem. Each time, the resource constraints remain the same but the arc costs are updated with the current (optimal) values of the dual variables of the master problem. The research objectives of this paper are: (1) an effective, pseudo-polynomial time method to solve RCSP on an acyclic graph in the column-generation context; (2) characterizations of certain properties of the associated, specialized techniques as well as algorithm complexity; and, importantly, (3) tests to identify computational characteristics and to benchmark relative to a state-of-the-art label-setting algorithm and to CPLEX, a powerful optimization software package for solving integer programming problems.

The remainder of the paper is structured as follows. Section 2 reviews related literature and describes the contributions that this paper makes relative to results obtained by prior research. Section 3 details our proposed three-stage approach (TSA). Section 4 reviews the label-setting algorithm of Dumitrescu and
Boland [12] and extends it to deal with multiple resources for use as a benchmarking method. Section 5 presents our computational tests and compares the performance of our approach with that of our benchmarking method and CPLEX. Section 6 provides conclusions and discusses future research.
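As a concrete rendering of the problem data defined in the introduction, the following sketch (our own illustration; the class name and the unit arc costs are hypothetical, while the resource vectors and the limits T1 = T2 = 2 are taken from the paper's Example 1 in Section 3.2) stores an acyclic RCSP instance with topologically ordered nodes and checks the resource feasibility of a path.

```python
from dataclasses import dataclass

@dataclass
class RCSPInstance:
    """RCSP on an acyclic graph with nodes 1..n in topological order.
    arcs maps (i, j), i < j, to (cost c_ij, resource vector w_ij)."""
    n: int
    T: tuple            # resource limits T_r, r = 1..|R|
    arcs: dict

    def is_feasible(self, path):
        """A path is resource feasible if, for every resource r, the summed
        requirement over its arcs does not exceed T_r."""
        use = [0] * len(self.T)
        for i, j in zip(path, path[1:]):
            _, w = self.arcs[(i, j)]
            for r, w_r in enumerate(w):
                use[r] += w_r
        return all(u <= t for u, t in zip(use, self.T))

# Resource vectors of the paper's Example 1; the costs (first component of
# each arc tuple) are ours, purely for illustration.
inst = RCSPInstance(
    n=7, T=(2, 2),
    arcs={(1, 2): (1.0, (0, 0)), (1, 3): (1.0, (1, 1)), (2, 3): (1.0, (0, 0)),
          (3, 4): (1.0, (1, 0)), (3, 5): (1.0, (0, 1)), (4, 5): (1.0, (0, 0)),
          (5, 6): (1.0, (2, 0)), (5, 7): (1.0, (0, 2)), (6, 7): (1.0, (0, 0))})
```

For instance, the v1–v7 path 1-2-3-5-6-7 consumes (2, 1) ≤ (2, 2) and is feasible, while 1-3-4-5-6-7 consumes (4, 1) and violates T1.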
2. Literature review

RCSP is NP-hard [10,16,20], even if the graph is acyclic, only one resource constraint is involved, and all resource requirements and costs are positive [12]. In particular, it is NP-hard in the ordinary sense on an acyclic graph [2]. Two families of exact algorithms for solving RCSP have been proposed: one involves using Lagrangian or linear relaxation (e.g., [16,2,23]) and the other uses dynamic programming or a labeling algorithm developed from a dynamic programming formulation (e.g., [4,6,7,11,12,14,17–19]). The method proposed by Dumitrescu and Boland [12] for solving SRCSP modifies the label-setting algorithm of Desrochers and Soumis [7], using information provided by preprocessing. Dumitrescu and Boland [12] evaluated their method computationally on SRCSP instances in grid networks and showed that it outperforms three scaling algorithms [11,17,22] in terms of both run times and memory requirements. However, this method is restricted to SRCSP and cannot be used directly to solve MRCSP. In this paper, we extend this method to deal with multiple resources for use as a benchmarking method on MRCSP instances.

As a sub-problem in column generation, RCSP must be solved at each of a number of column generation iterations. At each iteration, the resource constraints remain the same but the arc costs are updated with different reduced costs. Thus, the RCSP instances at these iterations differ from each other but have the same underlying graph and resource constraints. Previous algorithms do not exploit this context: to reoptimize at each column generation iteration, they must be employed from scratch (i.e., from the first step onwards). However, several approaches have been devised to exploit this context for other types of sub-problems. Considering a shortest path problem (SPP) with several resource-window constraints on each node in an acyclic graph, Jaumard et al.
[21] described a pseudopolynomial, two-phase approach. The first phase constructs an expanded graph through which the second phase prescribes an optimal path using a shortest path algorithm. The expanded graph need be constructed only once, and SPP is solved at each column generation iteration in an attempt to generate an improving column. However, they did not report any computational results. Our TSA is specialized to solve the RCSP on an acyclic graph as a sub-problem in column generation; that is, it can effectively optimize a set of RCSP instances that have the same underlying graph and resource constraints but different arc costs, each instance corresponding to a column generation iteration. The framework of TSA is analogous to some of the methodology that
was developed previously (e.g., [21]) for solving different, but related, problems. Our paper contributes by devising specialized techniques and proving certain properties (Section 3) upon which the success of our method depends. Although not our primary focus, this paper also contributes by extending the label-setting algorithm of Dumitrescu and Boland [12] and evaluating its computational characteristics in solving MRCSP instances when using it as a benchmarking method. This paper presents a computational evaluation, which shows that our TSA outperforms the benchmarking method and CPLEX.

In addition, preprocessing techniques can reduce the size of an instance (i.e., remove nodes and arcs or tighten resource limitations) using either resource-based reduction (e.g., [1,5]) or joint resource- and cost-based reduction (e.g., [2,12]). For an SPP with resource windows on each node, Desrochers et al. [5] developed a resource-based method that examines and tightens the resource windows of nodes cyclically, until no further tightening is possible. Dumitrescu and Boland [12] presented a preprocessing algorithm that interleaves resource- and cost-based reduction, and demonstrated through computational tests that it can be surprisingly effective in reducing problem size. Recently, Sellmann [25], Gellermann et al. [15], and Sellmann et al. [26] proposed shorter-path filtering algorithms combined with Constraint-Programming-based Lagrangian relaxation to reduce the size of an instance, particularly for SRCSP. Their filtering method is a joint resource- and cost-based network reduction that filters during the optimization of the Lagrangian dual. Gellermann et al. [15] and Sellmann et al. [26] showed that this method outperforms the methods of Aneja et al. [1] and Beasley and Christofides [2] in terms of reducing instance size.
Our paper develops a different resource-based preprocessing stage for RCSP, which removes nodes and arcs that cannot be on any feasible path and then prescribes a window for each type of resource at each remaining node.
3. TSA

As depicted in Fig. 1, TSA comprises three stages: preprocessing, setup, and iterative solution. Stage 1 (preprocessing) revises graph G = (V, A) to form reduced graph GR = (VR, AR) and then formulates a window for each resource at each node in the reduced graph. Stage 2 (setup) uses an expansion procedure to construct an equivalent classical SPP on expanded graph GE = (VE, AE). Stages 1 and 2 are each implemented one time before the column generation iterations, and stage 3 (iterative solution) is implemented at each column generation iteration to find the least-cost path, each time using a new set of arc costs.

3.1. Stage 1 algorithm: preprocessing

The stage 1 algorithm deals exclusively with resource constraints, which do not change from one column generation iteration
Fig. 1. TSA.
to the next; it does not use any arc-cost parameters because arc costs differ at each iteration. We now introduce the notation used in the stage 1 algorithm (see Fig. 2):

FS(vj) (BS(vj))  set of the successors, i.e., forward star (predecessors, i.e., backward star), of vj
f̲jr (f̄jr)  the minimum (maximum) amount of resource r ∈ R required for a v1–vj path
b̲jr (b̄jr)  the minimum (maximum) amount of resource r left over for a v1–vj path after subtracting the maximum (minimum) amount of resource r required for a vj–vn path from the total resource availability Tr
t̲jr (t̄jr)  the lower (upper) bound (i.e., window) for resource r at vj

Step 1 first makes a backward pass from vn to v1 and then a forward pass from v1 to vn, iteratively identifying and deleting bottlenecks (nodes or arcs), which, by definition, cannot be on any feasible v1–vn path. That is, any v1–vn path containing a bottleneck requires an amount of at least one resource r ∈ R that exceeds the resource availability Tr. The backward pass determines b̄jr (step 1(i)) and identifies as bottlenecks any node vj with b̄jr < 0 (step 1(ii)) and any arc (i, j) with wijr > b̄jr (step 1(iii)). The forward pass determines f̲jr (step 1(iv)) and identifies as bottlenecks any node vj with f̲jr > b̄jr (step 1(v)) and any arc (i, j) with f̲ir + wijr > b̄jr (step 1(vi)). In short, the backward pass calculates b̄jr and uses it to identify bottlenecks, and the forward pass calculates f̲jr and uses it, along with b̄jr, in a tighter set of conditions to identify bottlenecks. Step 1 reduces G to GR. RCSP is resource-infeasible (i.e., no v1–vn path satisfies all resource constraints) if step 1 eliminates all nodes from G, so that VR = ∅ (see step 2).

Steps 3(i) and 3(iii) calculate b̲jr and f̄jr, respectively. Note that, by the definitions of b̲1r and f̄nr, b̲1r ≥ 0 (equivalently, f̄nr ≤ Tr) implies that every v1–vn path is feasible relative to resource r; thus, the constraint limiting resource r is redundant and can be eliminated (see step 3(ii)). If all resource constraints are redundant, then RCSP reduces to SPP. Step 3(iv) formulates the window [t̲jr, t̄jr] for each resource r ∈ R at each node vj in GR such that any v1–vj path with a requirement for resource r larger than t̄jr is infeasible and, if its requirement is less than t̲jr, that requirement can be increased to t̲jr. By the definitions of b̄jr, f̲jr, b̲jr, and f̄jr, we have

t̄jr = min{b̄jr, f̄jr}        (1)

Fig. 2. Stage 1 algorithm: preprocessing.
X. Zhu, W.E. Wilhelm / Computers & Operations Research 39 (2012) 164–178
and

t̲jr = max{b̲jr, f̲jr}   if f̄jr > b̲jr,
t̲jr = f̄jr              otherwise.        (2)

For Eq. (2), when f̄jr > b̲jr and, further, f̲jr > b̲jr, then t̲jr = f̲jr, and no v1–vj path has a requirement for resource r less than t̲jr. When f̄jr > b̲jr and, further, f̲jr ≤ b̲jr, then t̲jr = b̲jr, and any v1–vj path with a requirement for resource r between f̲jr and b̲jr can be treated as if its requirement for resource r were equal to b̲jr (the minimum amount of resource r left for a v1–vj path), because the v1–vj path can still be augmented by any vj–vn path to form a v1–vn path that is feasible relative to resource r. If f̄jr ≤ b̲jr, then every v1–vn path containing vj is feasible relative to resource r, and t̲jr can take its largest possible value, t̲jr = f̄jr; i.e., this resource-r window contains only one value. Note that t̲1r = t̄1r = 0 and t̲nr = t̄nr = Tr for all r ∈ R.

In RCSP, a difficulty arises because it is not possible to determine whether the resources required exceed the limitations until an entire v1–vn path has been defined. The stage 1 algorithm overcomes this difficulty by converting the resource-limitation constraints on v1–vn paths into resource-window constraints at each node of the reduced graph.

3.1.1. Properties and computational complexity

Lemma 1 and Proposition 1 below are valid only for SRCSP. The proof of Proposition 1 uses Lemma 1. We suppress index r in these proofs because only one type of resource is involved.

Lemma 1. For SRCSP, if step 1 of the stage 1 algorithm, which traverses graph G once in the backward direction and once in the forward direction, is executed a second time, it will not change the values of b̄j and f̲j, calculated by the first iteration of step 1, for any remaining node vj, 1 ≤ j ≤ n.

Proof. Clearly, Lemma 1 is true in the trivial case in which the first iteration of step 1 does not reduce the graph. Now, suppose that the first iteration identifies and deletes arc (p, q). We show that the values of b̄j and f̲j will not be changed for any remaining node vj, 1 ≤ j ≤ n, if a second iteration recalculates them based on the graph without (p, q).

b̄j for p+1 ≤ j ≤ n and f̲j for 1 ≤ j ≤ q−1. Because the calculations of b̄j for p+1 ≤ j ≤ n and of f̲j for 1 ≤ j ≤ q−1 do not involve (p, q), the deletion of (p, q) will not change their values.

f̲j for q ≤ j ≤ n. In the first iteration, f̲j for q ≤ j ≤ n are calculated after the deletion of (p, q) (i.e., based on the graph without (p, q)); thus, f̲j for q ≤ j ≤ n will not be changed in the second iteration.

b̄j for 1 ≤ j ≤ p. If arc (p, q) is deleted in the backward pass, then, in the first iteration, b̄j for 1 ≤ j ≤ p are calculated based on the graph without (p, q) and thus will not be changed. Now, suppose that arc (p, q) is deleted in the forward pass, so that f̲p + wpq > b̄q. By way of contradiction, assume that some b̄j value is changed in the second iteration owing to the deletion of arc (p, q). Then, in the first iteration, arc (p, q) must be on the vj–vn path with minimum resource requirement and, consequently, b̄p = b̄q − wpq. Because f̲p + wpq > b̄q, we have f̲p > b̄p, so vp is a bottleneck. Because vp is on the vj–vn path with minimum resource requirement, vj is a bottleneck, which contradicts the fact that vj is not a bottleneck (otherwise it would have been deleted and would not remain in the graph). Thus, b̄j for 1 ≤ j ≤ p cannot be changed in the second iteration.

Thus, we have shown that deleting a bottleneck arc in the first iteration of step 1 does not affect the calculations of b̄j and f̲j for any remaining node vj, 1 ≤ j ≤ n. Similarly, we can show the same for a bottleneck node. This completes the proof. □

Proposition 1. For SRCSP, graph G need be traversed only once in the backward direction and only once in the forward direction to delete all bottleneck nodes and arcs.

Proof. By Lemma 1, one iteration of step 1 (i.e., one backward and one forward pass) permanently determines b̄j and f̲j for all remaining nodes vj, 1 ≤ j ≤ n. The detection of bottlenecks in step 1 is completely determined by the values of b̄j and f̲j for 1 ≤ j ≤ n; thus, Proposition 1 is true. □

In contrast, for MRCSP, it may be necessary to traverse the graph a number of times in each direction for each resource type in order to identify and delete all bottlenecks. To see this, suppose that step 1 completes backward and forward passes for resource r first and for resource r′ second (r′ ≠ r), and that, in this process, some bottlenecks are identified and deleted relative to the limitation on resource r′. Then, the values of b̄jr and f̲jr may change when they are recalculated by step 1 on this revised, reduced graph. Consequently, it may be possible to identify additional bottlenecks relative to the limitation on resource r. Thus, for MRCSP, step 1 should be repeated until no more bottlenecks can be deleted (see step 1(vii)). Note that, for MRCSP, the final outcome of stage 1 does not depend on the order in which resources are treated: step 1 is repeated until no more bottlenecks can be identified, and each iteration of step 1 processes all resource types again as long as some bottleneck has been identified and deleted.
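The backward/forward bottleneck elimination of stage 1 can be sketched as follows (a simplified rendering of the idea in our own names, not the paper's exact step numbering; for MRCSP the passes repeat until no further deletion occurs, mirroring step 1(vii)). The example data are the arc resource vectors of the paper's Example 1; the tightened limits T = (1, 2) are our own variation for illustration.

```python
import math

def stage1(n, arcs, T):
    """Sketch of the stage 1 idea: repeatedly make a backward and a forward
    pass per resource, deleting bottleneck nodes/arcs, until stable.
    arcs: dict {(i, j): w_ij} with nodes 1..n in topological order."""
    nodes = set(range(1, n + 1))
    arcs = dict(arcs)                      # work on a copy

    def drop_node(j):
        nodes.discard(j)
        for a in [a for a in arcs if j in a]:
            del arcs[a]

    changed = True
    while changed:
        changed = False
        for r in range(len(T)):
            # Backward pass: bbar[j] = T_r minus the minimum amount of
            # resource r required on any v_j - v_n path.
            bbar = {j: (T[r] if j == n else -math.inf) for j in nodes}
            for j in sorted(nodes, reverse=True):
                for i, k in [a for a in arcs if a[1] == j]:
                    bbar[i] = max(bbar[i], bbar[j] - arcs[(i, k)][r])
            for j in list(nodes):
                if bbar[j] < 0:                       # node bottleneck
                    drop_node(j); changed = True
            for a in list(arcs):
                if arcs[a][r] > bbar[a[1]]:           # arc bottleneck
                    del arcs[a]; changed = True
            # Forward pass: fmin[j] = minimum amount of resource r required
            # on any v_1 - v_j path; apply the tighter conditions.
            fmin = {j: (0 if j == 1 else math.inf) for j in nodes}
            for j in sorted(nodes):
                for i, k in [a for a in arcs if a[1] == j]:
                    fmin[j] = min(fmin[j], fmin[i] + arcs[(i, k)][r])
            for j in list(nodes):
                if fmin[j] > bbar[j]:                 # node bottleneck
                    drop_node(j); changed = True
            for a in list(arcs):
                if fmin[a[0]] + arcs[a][r] > bbar[a[1]]:
                    del arcs[a]; changed = True
    return nodes, arcs

# Example 1 arc resource vectors (read from the paper's worked calculations).
ARCS = {(1, 2): (0, 0), (1, 3): (1, 1), (2, 3): (0, 0), (3, 4): (1, 0),
        (3, 5): (0, 1), (4, 5): (0, 0), (5, 6): (2, 0), (5, 7): (0, 2),
        (6, 7): (0, 0)}
nodes_full, arcs_full = stage1(7, ARCS, (2, 2))    # deletes nothing (GR = G)
nodes_tight, arcs_tight = stage1(7, ARCS, (1, 2))  # our tightened variant
```

With T = (2, 2) nothing is deleted, consistent with GR = G in Example 1; tightening to T = (1, 2) leaves only the single feasible chain 1-2-3-4-5-7.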
In addition, step 3 deals with each resource independently. Also, for MRCSP, stage 1 may not identify and delete all bottlenecks, because it identifies bottlenecks based on the values of b̄jr and f̲jr for each individual resource, which is not sufficient when multiple resources are involved. However, for SRCSP, Proposition 1 shows that stage 1 identifies and deletes all bottlenecks; thus, each node and arc in GR must be on some v1–vn path(s) that satisfy the resource windows determined by the stage 1 algorithm.

Let g be the number of times (i.e., iterations) step 1 is repeated. For SRCSP, g = 1 by Proposition 1. For MRCSP, g ≤ |A|, as illustrated in the proof of Proposition 2; however, our computational results in Section 5 show that, on average, g is much smaller than |A|.

Proposition 2. The stage 1 algorithm runs in O(|R||A|²) time for MRCSP and O(|R||A|) time for SRCSP.

Proof. Assume |A| > |V| = n. Step 1 runs in O(|R||A|) time per iteration because every arc in G can be processed in constant time for each r ∈ R; similarly, step 3 runs in O(|R||A|) time. Thus, the complexity of the stage 1 algorithm is O(|R||A|g). For MRCSP, in the worst case g = |A|: each iteration of step 1 identifies and deletes only one bottleneck arc, and the maximum number of bottleneck arcs is |A|. For SRCSP, g = 1 by Proposition 1, so the stage 1 algorithm runs in O(|R||A|) time. □

3.2. Stage 2 algorithm: setup

Using an expansion procedure, stage 2 transforms the SPP with resource windows [t̲jr, t̄jr] for each r ∈ R and vj ∈ VR in GR, which is
obtained from stage 1, into an unconstrained SPP on expanded graph GE. We now introduce the notation used in the stage 2 algorithm (see Fig. 3):

Sj ⊆ VE  set of nodes in expanded graph GE associated with node vj ∈ VR in GR
s_j^k ∈ Sj  the kth node in set Sj, k = 1, 2, …, |Sj|
RL_jr = {d_jr^1, …, d_jr^{L_jr}}  ordered set with d_jr^1 = t̲jr, d_jr^{L_jr} = t̄jr, and d_jr^{ℓ−1} < d_jr^ℓ for ℓ = 2, …, L_jr, where L_jr = |RL_jr|
y_j^k = (y_j1^k, …, y_j|R|^k)  vector of (adjusted) cumulative resource requirements for r ∈ R at s_j^k ∈ Sj
Yj  set of vectors y_j^k for s_j^k ∈ Sj such that t̲j ≤ y_j^k ≤ t̄j

As the stage 2 algorithm expands GR to form GE, it defines a set of nodes Sj associated with each vj ∈ VR in GR. Then, VE = ∪_{vj∈VR} Sj, where Sj ∩ Sj′ = ∅ for vj, vj′ ∈ VR, j ≠ j′. The stage 2 algorithm processes the nodes vj ∈ VR in order of increasing index, calculating y_j^k, associating a node s_j^k ∈ Sj with each unique vector y_j^k ∈ Yj, and connecting s_i^h to s_j^k if y_j^k is calculated from y_i^h (steps 3(ii)–(iv)). The core of the algorithm is our specialized method for calculating y_j^k, which we illustrate below.

The ordered set RL_jr is needed to calculate y_jr^k and contains all possible values for y_jr^k. For each r ∈ R, step 1 initializes RL_nr = {Tr} and calculates RL_jr according to (3) for each vj ∈ VR\{vn} in decreasing order of index:

RL_jr = {d_jr | d_jr = d_pr − w_jpr, t̲jr ≤ d_jr ≤ t̄jr, d_pr ∈ RL_pr, (j, p) ∈ AR} ∪ {t̲jr, t̄jr}.        (3)

Each element d_jr of RL_jr specifies an amount of resource r available for v1–vj paths and is calculated by subtracting the resource requirement on arc (j, p), w_jpr, from the amount of resource r available for v1–vj–vp paths, d_pr. RL_jr includes element d_jr if d_jr = d_pr − w_jpr lies in the range [t̲jr, t̄jr]; thus, RL_jr is a subset of [t̲jr, t̄jr]. Starting from RL_nr = {Tr} for end node vn, with Tr the amount of resource r available for v1–vn paths, Eq. (3) calculates RL_jr in decreasing order of vj index, ultimately determining RL_1r for start node v1. For each vj, (3) determines RL_jr considering each arc (j, p) ∈ AR and each element d_pr ∈ RL_pr.

Step 2 initializes Y1 = {y_1^1 = 0}. For each vj ∈ VR\{v1}, in order of increasing index, calculate y_jr^k as follows. First, compute g_jr^k as the sum of a particular y_ir^h at node i and the requirement for resource r on arc (i, j), w_ijr:

g_jr^k = y_ir^h + w_ijr   for (i, j) ∈ AR and r ∈ R.        (4)

Then, round the value of g_jr^k up, using the ordered elements d_jr^ℓ of RL_jr, to determine y_jr^k:

y_jr^k = d_jr^1   if g_jr^k ≤ d_jr^1 = t̲jr,        (5a)
y_jr^k = d_jr^ℓ   if d_jr^{ℓ−1} < g_jr^k ≤ d_jr^ℓ for some ℓ = 2, …, L_jr,        (5b)
y_jr^k = +∞      if g_jr^k > d_jr^{L_jr} = t̄jr.        (5c)

Fig. 4 illustrates (5a)–(5c) relative to resource r for the v1–vj paths. If the g_jr^k for a given v1–vj path falls in the interval (d_jr^{ℓ−1}, d_jr^ℓ], this v1–vj path can be augmented with any vj–vn path that has a resource requirement less than or equal to Tr − d_jr^ℓ to form a feasible v1–vn path. Note that there is no vj–vn path having a resource requirement larger than Tr − d_jr^ℓ and less than Tr − d_jr^{ℓ−1}. Thus, expression (5b) rounds all values of g_jr^k in the interval (d_jr^{ℓ−1}, d_jr^ℓ] up to y_jr^k = d_jr^ℓ. Expression (5a) deals with the lower bound of the resource window, t̲jr (see step 3 of the stage 1 algorithm and its illustration). Expression (5c) excludes all paths whose requirement for resource r exceeds the upper bound of the resource window, t̄jr, because only vectors y_j^k such that t̲j ≤ y_j^k ≤ t̄j can be added to set Yj and be associated with a unique node in GE (see step 3(ii)).
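The construction of the RL sets and the round-up rule can be sketched as follows (our own rendering of Eqs. (3) and (5a)–(5c); function and variable names are ours). The data reproduce the resource-1 sets of the paper's Example 1, with the windows taken from Table 1.

```python
import bisect
import math

def build_RL(n, arcs, r, t_lo, t_hi, T):
    """Sketch of Eq. (3): RL[j] collects candidate 'amount of resource r
    still available' values for v1-vj paths, built from RL[n] = {T[r]}
    in decreasing node order; values outside the window are discarded."""
    RL = {j: set() for j in range(1, n + 1)}
    RL[n] = {T[r]}
    for j in range(n - 1, 0, -1):
        for (a, p), w in arcs.items():
            if a == j:
                for d_p in RL[p]:
                    d = d_p - w[r]
                    if t_lo[j] <= d <= t_hi[j]:
                        RL[j].add(d)
        RL[j] |= {t_lo[j], t_hi[j]}
    return {j: sorted(s) for j, s in RL.items()}

def round_up(g, rl):
    """Rules (5a)-(5c): the smallest element of the ordered set rl that is
    >= g; +inf when g exceeds the largest element (path is excluded)."""
    i = bisect.bisect_left(rl, g)
    return rl[i] if i < len(rl) else math.inf

# Resource 1 of the paper's Example 1: arc vectors read from the worked
# calculations, windows [t_lo, t_hi] from Table 1, T1 = T2 = 2.
ARCS = {(1, 2): (0, 0), (1, 3): (1, 1), (2, 3): (0, 0), (3, 4): (1, 0),
        (3, 5): (0, 1), (4, 5): (0, 0), (5, 6): (2, 0), (5, 7): (0, 2),
        (6, 7): (0, 0)}
t_lo = {1: 0, 2: 0, 3: 0, 4: 1, 5: 0, 6: 2, 7: 2}
t_hi = {1: 0, 2: 0, 3: 1, 4: 2, 5: 2, 6: 2, 7: 2}
RL1 = build_RL(7, ARCS, 0, t_lo, t_hi, (2, 2))
```

This yields, e.g., RL1[5] = [0, 2], matching RL5,1 = {0, 2} used in Example 1, and round_up(1, [0, 2]) = 2, the rounding applied there to g = (1, 0).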
Fig. 3. Stage 2 algorithm: expanding GR = (VR, AR) to form GE = (VE, AE).
Fig. 4. y_jr^k is calculated using our rounding technique.
Remark 1. We refer to our method for calculating y_j^k (i.e., expressions (3)–(5)) as the rounding technique (i.e., using RL_jr to round g_jr^k up to y_jr^k). Let 𝓛_jr be the set of values taken by y_jr^k for r ∈ R and vj ∈ VR. If some interval (e.g., (d_jr^ℓ, d_jr^{ℓ+1}] in Fig. 4) does not contain any g_jr^k, then d_jr^{ℓ+1} ∉ 𝓛_jr. Thus, 𝓛_jr ⊆ RL_jr ⊆ {t̲jr, …, t̄jr} and |𝓛_jr| ≤ |RL_jr| ≤ t̄jr − t̲jr + 1 ≤ Tr + 1. That is, the rounding technique reduces the feasible values in the state space of cumulative resource-requirement vectors associated with each vj ∈ VR, perhaps significantly. Consequently, the rounding technique increases the opportunity to find a vector in Yj matching y_j^k in step 3(iii), and thus helps to manage the size of expanded graph GE, because each vector in Yj is represented by a single node in Sj (see steps 3(iii) and (iv)). Further, the rounding technique also reduces the computational effort of stage 3 of TSA in solving the SPP on GE, because that effort depends on the size of GE (see Section 3.3).

Wilhelm et al. [31] reported a methodology analogous to TSA, but it calculates the y_j^k vectors using Eq. (4) rather than our rounding technique. Our preliminary computational tests (not included in this paper to save space) show that our approach is substantially superior in terms of storage requirement (i.e., the size of GE) and the run time required to construct GE.

Example 1. Fig. 5(a) depicts an MRCSP example with seven nodes and limitations relative to two resources: T1 = T2 = 2. In Fig. 5(a), the label on each arc is the resource-requirement vector. Results of the stage 1 algorithm are presented in Table 1, and GR = G. Note that, although arc (1, 3) is a bottleneck, the stage 1 algorithm does not identify it as such (see Section 3.1.1). Following stage 1, the stage 2 algorithm first (step 1) calculates the RL_jr's listed in Table 1. Then, initializing y_1^1 = (0, 0) (step 2), the vectors y_j^k, in increasing order of j = 2, …, 7, are calculated (step 3) as: y_2^1 = g_2^1 = y_1^1 + (0, 0) = (0, 0); y_3^1 = g_3^1 = y_2^1 + (0, 0) = (0, 0) and y_3^2 = g_3^2 = y_1^1 + (1, 1) = (1, 1); y_4^1 = g_4^1 = y_3^1 + (1, 0) = (1, 0) and y_4^2 = g_4^2 = y_3^2 + (1, 0) = (2, 1). For the y_5^k's, g_5^1 = y_4^1 + (0, 0) = (1, 0) is rounded to y_5^1 = (2, 0) because RL5,1 = {0, 2}; similarly, g_5^2 = y_3^1 + (0, 1) = (0, 1) is rounded to y_5^2 = (0, 2); g_5^3 = y_3^2 + (0, 1) = (1, 2) is rounded to y_5^3 = (2, 2); and g_5^4 = y_4^2 + (0, 0) = (2, 1) is rounded to the same y_5^3 = (2, 2). After that, y_6^1 = g_6^1 = y_5^2 + (2, 0) = (2, 2) and y_7^1 = g_7^1 = y_5^1 + (0, 2) = y_6^1 + (0, 0) = (2, 2). Associating a node s_j^k with each unique vector y_j^k, and connecting s_i^h to s_j^k if y_j^k is calculated from y_i^h (steps 3(ii)–(iv)), Fig. 5(b) shows the expanded graph that steps 1–3 of the stage 2 algorithm create, in which the label on node s_j^k is the associated vector y_j^k. As shown in Fig. 5(b), node s_5^3 does not have any successor and is not on any s_1^1–s_7^1 path; thus, for MRCSP, step 4 of the stage 2 algorithm is needed to delete the "dangling" nodes and arcs in the dashed box (including arc (s_1^1, s_3^2)) to complete GE.

Fig. 5. An example of expanded graph GE.

3.2.1. Properties and computational complexity

Because Y1 = {y_1^1 = 0} and Yn = {y_n^1 = (T1, …, T|R|)} (RL_nr = {Tr} for all r ∈ R), S1 = {s_1^1} and Sn = {s_n^1}; that is, there is one start node (s_1^1)
Table 1
Example results.

            r = 1                             r = 2
j           1   2   3   4   5   6   7         1   2   3   4   5   6   7
b̄jr         2   2   2   2   2   2   2         2   2   2   2   2   2   2
f̲jr         0   0   0   1   0   2   0         0   0   0   0   0   0   0
b̲jr        -2  -1  -1   0   0   2   2        -2  -1  -1   0   0   2   2
f̄jr         0   0   1   2   2   4   4         0   0   1   1   2   2   4
t̲jr         0   0   0   1   0   2   2         0   0   0   0   0   2   2
t̄jr         0   0   1   2   2   2   2         0   0   1   1   2   2   2
RLjr      {0} {0} {0,1} {1,2} {0,2} {2} {2}  {0} {0} {0,1} {0,1} {0,2} {2} {2}
and one end node (s_n^1) in GE. By Proposition 3 below, step 4 of the stage 2 algorithm is not needed for SRCSP. However, for MRCSP, steps 1–3 may create "dangling" nodes and arcs that are not on any s_1^1–s_n^1 path, and step 4 is needed to remove them.

Proposition 3. For SRCSP, every node that steps 1–3 of the stage 2 algorithm create in GE is on some s_1^1–s_n^1 path.

Proof. Consider |R| = 1. Note that every node in GE (except start node s_1^1) has one or more predecessors, because GE is constructed in increasing order of the vj ∈ VR index. By way of contradiction, suppose the proposition is not true. Then, steps 1–3 create a node s_j^k that is associated with y_j1^k and does not have any successor in GE, implying y_j1^k > b̄j1. However, steps 1–3 create node s_j^k only if y_j1^k ≤ t̄j1 ≤ b̄j1, establishing a contradiction. This completes the proof. □

We analyze the size of expanded graph GE (Proposition 4), which determines storage requirements. We also determine the complexity of the stage 2 algorithm (Proposition 5).

Proposition 4. The number of nodes (arcs) in GE is bounded by n ∏_{r∈R}(Tr + 1) (|A| ∏_{r∈R}(Tr + 1)).

Proof. Recall that 𝓛_jr is the set of values taken by y_jr^k for vj ∈ VR, r ∈ R, and |𝓛_jr| ≤ Tr + 1 (see Remark 1). Then, |Sj| ≤ ∏_{r∈R}|𝓛_jr| ≤ ∏_{r∈R}(Tr + 1); that is, the number of nodes associated with any vj ∈ VR is bounded by ∏_{r∈R}(Tr + 1). Thus, the total number of nodes in GE is bounded by |VR| ∏_{r∈R}(Tr + 1) ≤ n ∏_{r∈R}(Tr + 1). Because each node s_j^k ∈ Sj ⊆ VE has at most |FS(vj)| successors, the total number of arcs in GE is bounded by ∑_{vj∈VR}|FS(vj)| ∏_{r∈R}(Tr + 1) ≤ |A| ∏_{r∈R}(Tr + 1). □

Because |RL_jr| ≤ t̄jr − t̲jr + 1 (see Remark 1), RL_jr for vj ∈ VR, r ∈ R can be stored in a one-dimensional array of size (t̄jr − t̲jr + 1). Using this data structure for RL_jr, step 3(ii) of the stage 2 algorithm can calculate each y_j^k in O(|R|) time according to our rounding technique.
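To make the construction concrete, the following sketch (our own; arc vectors read from Example 1's worked calculations and RL sets from Table 1) replays steps 2–3 of stage 2 for the example, collecting the unique vector sets Yj; the set sizes respect the Proposition 4 bound ∏_{r∈R}(Tr + 1) = 9 per node.

```python
import math

# Example 1 data: arc resource vectors and ordered sets RL_jr (from Table 1),
# given per node as (RL for r = 1, RL for r = 2).
ARCS = {(1, 2): (0, 0), (1, 3): (1, 1), (2, 3): (0, 0), (3, 4): (1, 0),
        (3, 5): (0, 1), (4, 5): (0, 0), (5, 6): (2, 0), (5, 7): (0, 2),
        (6, 7): (0, 0)}
RL = {1: ([0], [0]), 2: ([0], [0]), 3: ([0, 1], [0, 1]), 4: ([1, 2], [0, 1]),
      5: ([0, 2], [0, 2]), 6: ([2], [2]), 7: ([2], [2])}

Y = {1: {(0, 0)}}                      # step 2: Y_1 = {y_1^1 = 0}
for j in range(2, 8):                  # step 3: nodes in increasing order
    Y[j] = set()
    for (i, k), w in ARCS.items():
        if k != j:
            continue
        for y in Y[i]:
            g = [y[r] + w[r] for r in range(2)]                # Eq. (4)
            # Rules (5a)-(5c): round each component up to the next element
            # of the ordered set RL_jr; +inf means the path is excluded.
            y_new = tuple(min((d for d in RL[j][r] if d >= g[r]),
                              default=math.inf) for r in range(2))
            if math.inf not in y_new:
                Y[j].add(y_new)        # one expanded-graph node per vector
```

This yields Y[5] = {(2, 0), (0, 2), (2, 2)} and Y[7] = {(2, 2)}, matching the node labels of Fig. 5(b); the step 4 deletion of dangling nodes such as s_5^3 is not sketched here.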
Once all successors of vj ∈ VR are processed, set Yj can be eliminated, freeing the memory used to store it.

Proposition 5. The stage 2 algorithm runs in O(|R||A| ∏_{r∈R}(Tr + 1) + |A| ∑_{r∈R}(Tr + 1)) time in the worst case.

Proof. Constructing the sets RL_jr for all vj ∈ VR can be done in O(∑_{(i,j)∈AR}(t̄jr − t̲jr + 1)) ≤ O(|A|(Tr + 1)) time for each r ∈ R. Thus, constructing RL_jr for all vj ∈ VR and all r ∈ R (step 1) can be done in O(|A| ∑_{r∈R}(Tr + 1)) time. Step 2 can be done in constant time. Considering (i, j) ∈ AR, each y_j^k can be calculated from each y_i^h ∈ Yi in O(|R|) time. Because |Yi| ≤ ∏_{r∈R}(Tr + 1), the calculation of y_j^k from all y_i^h ∈ Yi can be done in O(|R| ∏_{r∈R}(Tr + 1)) time. Inserting all resulting y_j^k into Yj in lexicographic order can be done in O(|R| ∏_{r∈R}(Tr + 1)) time, because the vectors y_i^h ∈ Yi are stored and processed in lexicographic order. Thus, step 3 can be done in O(∑_{(i,j)∈AR} |R| ∏_{r∈R}(Tr + 1)) = O(|R||A| ∏_{r∈R}(Tr + 1)) time. For MRCSP, deleting dangling nodes and arcs that are not on any s_1^1–s_n^1 path in step 4 can be done in O(|A| ∏_{r∈R}(Tr + 1)) time, because |AE| is bounded by |A| ∏_{r∈R}(Tr + 1) by Proposition 4. Hence, the total run time is O(|R||A| ∏_{r∈R}(Tr + 1) + |A| ∑_{r∈R}(Tr + 1)). □

The factor ∏_{r∈R}(Tr + 1) in Propositions 4 and 5 may increase rapidly with the number of resources. On the other hand, it is likely that more bottlenecks can be identified and deleted to reduce the size of G as the number of resources increases.

3.3. Stage 3 algorithm: iterative solution

Stage 3 finds a least-cost path from s_1^1 to s_n^1 in GE with respect to the cij's, which are updated at each column generation iteration. The cost on arc (s_i^h, s_j^k) ∈ AE in GE is the cost on arc (i, j) ∈ A, cij. If arc (s_i^h, s_j^k) is on the solution path of the SPP on GE, then arc (i, j) is on the solution path of RCSP on G.

The advantage of TSA (Fig. 1) is demonstrated when RCSP is solved repetitively as a sub-problem in column generation. One implementation of stages 1 and 2 makes GE available for all column generation iterations, and only stage 3 is needed at each column generation iteration to optimize the SPP on GE. Stage 3 runs in O(|AE|) = O(|A| ∏_{r∈R}(Tr + 1)) time (see Proposition 4), since each arc can be processed in constant time (e.g., [8]). Using Propositions 2 and 5, Proposition 6 shows that TSA runs in pseudo-polynomial time.

Proposition 6. Initialization of TSA (the preprocessing and setup stages) requires O(|R||A|² + |R||A| ∏_{r∈R}(Tr + 1) + |A| ∑_{r∈R}(Tr + 1)) time for MRCSP and O(|R||A| + |R||A| ∏_{r∈R}(Tr + 1) + |A| ∑_{r∈R}(Tr + 1)) time for SRCSP. Each iterative solution requires O(|A| ∏_{r∈R}(Tr + 1)) time.

4. Extended label-setting algorithm (EMLSA)

In this section, we extend the label-setting algorithm of Dumitrescu and Boland [12] and refer to the resulting algorithm as EMLSA. EMLSA can deal with MRCSP with a real-valued cost and integer-valued resource requirements on each arc. In Section 5, we use EMLSA as a method to benchmark the performance of TSA. For our extension, we define labels that deal with multiple resources. The following definitions were initially proposed by Desrochers and Soumis [7] for SRCSP and generalized by Desrochers [4] for MRCSP.

Definition 1. Define label (Wj,1, Wj,2, …, Wj,|R|, Cj) for a path from v1 to vj, where Wj,r is the total amount of resource r required by the arcs in the path and Cj is the cumulative cost at vj along the path. We use Wj to represent the vector (Wj,1, Wj,2, …, Wj,|R|).

Definition 2. Let (W′j, C′j) and (Wj, Cj) be labels corresponding to two different paths from v1 to vj. Then, label (W′j, C′j) dominates (Wj, Cj), written (W′j, C′j) ≺ (Wj, Cj), if and only if C′j ≤ Cj, W′j,r ≤ Wj,r for all r = 1, …, |R|, and (W′j, C′j) ≠ (Wj, Cj). A label (Wj, Cj) is non-dominated (efficient) if it is not dominated by any other label at vj.

We adopt the notation used by Dumitrescu and Boland [12] to describe EMLSA:

U  upper bound on the least cost of an optimal v1–vn path
Q_i^c  least-cost path from vi to vn with respect to the arc costs cij
Q_i^{wr}  least-cost path from vi to vn with respect to arc costs given by the resource requirements wijr
c(P)  total cost associated with path P, i.e., the sum of the costs on the arcs of P
wr(P)  total requirement of resource r along path P, i.e., the sum of the requirements for resource r on the arcs of P
Ij  index set of labels at node vj; for each k ∈ Ij, there is a v1–vj path with label (W_j^k, C_j^k)

Fig. 6 details EMLSA. Phase 1 (steps 0–4) implements the preprocessor devised by Dumitrescu and Boland [12]. Phase 2 (steps 5 and 6) starts with the label (0,0) on node v1 and extends
the set of all efficient labels by "treating" an existing label on a node, that is, by extending the corresponding path along each outgoing arc. Note that phase 1 (the preprocessor) makes U, Qjn^{wr}, and Qjn^c for each vj ∈ V\{vn} available for use in phase 2. In application to solving RCSP sub-problems in column generation, all steps (i.e., 0 through 6) of EMLSA are executed with respect to the updated arc costs at each column generation iteration.

Fig. 6. EMLSA.
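The phase-1 quantity Qjn^{wr} (the least total requirement of resource r on any vj–vn path) can be obtained by a single backward pass over the topologically ordered nodes. The sketch below is our own illustration, not the paper's code; the function name and the adjacency representation (a dict mapping i to a list of (j, c_ij, w_ij) triples) are assumptions.

```python
def min_usage_to_sink(n, arcs, r):
    """Backward DP over topologically ordered nodes 1..n computing, for each
    node j, the least total requirement of resource r along any j -> v_n path
    (the quantity phase 1 makes available).  arcs maps i -> list of
    (j, c_ij, w_ij) with j > i; unreachable nodes keep the value infinity."""
    INF = float("inf")
    Q = [INF] * (n + 1)
    Q[n] = 0                              # the sink itself needs no resource
    for i in range(n - 1, 0, -1):         # reverse topological order
        for j, _c, w in arcs.get(i, []):
            Q[i] = min(Q[i], w[r] + Q[j])
    return Q                              # Q[j] = least resource-r usage from v_j to v_n
```

Phase 1 can then discard a node vj as soon as the least usage of resource r from v1 to vj plus Q[j] exceeds the limit Tr.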
Our extension differs from the method of Dumitrescu and Boland [12] in that it uses labels (Wj^k, Cj^k) for multiple resources as in Definitions 1 and 2, deals with the nodes in topological order (step 6) because graph G is acyclic, selects the labels at a node in lexicographic order (step 6(i)), and processes the labels by checking the conditions for each type of resource (step 6(ii), treatment of labels).
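The per-resource dominance test of Definition 2 can be stated compactly; the following is a minimal sketch (the function name and the (W, C) tuple representation are our own, not the paper's):

```python
from typing import Tuple

def dominates(label_a: Tuple[Tuple[int, ...], float],
              label_b: Tuple[Tuple[int, ...], float]) -> bool:
    """Return True if label_a = (W_a, C_a) dominates label_b = (W_b, C_b) at
    the same node: C_a <= C_b, W_a,r <= W_b,r for every resource r, and the
    two labels are not identical (Definition 2)."""
    W_a, C_a = label_a
    W_b, C_b = label_b
    if C_a > C_b:
        return False
    if any(wa > wb for wa, wb in zip(W_a, W_b)):
        return False
    return (W_a, C_a) != (W_b, C_b)
```

Note that two labels may be incomparable (neither dominates the other), which is why a node can carry many efficient labels.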
Note that both EMLSA and TSA incorporate preprocessing techniques; however, they differ in whether they use cost parameters. The EMLSA preprocessor calculates the least-cost paths Q1i^c and Qin^c at step 1; continues to improve the upper bound on cost, U, at steps 0, 2(i), and 3(iii); and uses this cost information in steps 3(i) and 3(ii) to reduce the network; thus, it must be implemented at each column generation iteration. In contrast, TSA does not use arc cost parameters in its preprocessing (stage 1) because arc costs change at each column generation iteration; consequently, it implements its preprocessor (stage 1) only once, before the first column generation iteration. EMLSA uses the Dumitrescu and Boland [12] preprocessing phase, which solves fewer than |R||A| SPPs on G(V,A). The solution of an SPP on G can be obtained in O(|A|) time; thus, the preprocessing phase has complexity O(|R||A|^2). The complexity analysis for phase 2 with |R| = 1 generalizes readily to the case |R| > 1; the only difference is that each calculation or comparison of labels now requires O(|R|) operations instead of O(1) when an array is used to store the labels. By Desrochers and Soumis [7] and Dumitrescu and Boland [12], the complexity of phase 2 of the method proposed by Dumitrescu and Boland [12] is bounded by O(n^2 Σ_{r∈R} Tr). Thus, phase 2 runs in O(n^2 |R| Σ_{r∈R} Tr) time, and EMLSA has complexity O(|R||A|^2 + n^2 |R| Σ_{r∈R} Tr) for each solution.
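As a concrete illustration of the phase-2 extension step (treating nodes in topological order, extending each label along outgoing arcs, and pruning by dominance), here is a simplified, self-contained sketch. It is our own code under assumed names, and it omits the preprocessor's bound tests on U and the Q values:

```python
import bisect

def emlsa_phase2(n, arcs, T):
    """Label-extension sketch on an acyclic graph whose nodes 1..n are
    topologically ordered.  arcs maps i -> list of (j, c_ij, w_ij) with
    j > i; T is the vector of resource limits.  Labels (W, C) are kept in
    lexicographic order and pruned by dominance.  Returns the least cost of
    a resource-feasible v_1 -> v_n path (infinity if none exists)."""
    INF = float("inf")
    labels = {i: [] for i in range(1, n + 1)}
    labels[1] = [((0,) * len(T), 0.0)]
    for i in range(1, n + 1):                       # treat nodes in topological order
        for j, c, w in arcs.get(i, []):
            for W, C in labels[i]:
                W2 = tuple(wr + dr for wr, dr in zip(W, w))
                if any(wr > tr for wr, tr in zip(W2, T)):
                    continue                        # resource-infeasible extension
                cand = (W2, C + c)
                if any(Cb <= cand[1] and all(a <= b for a, b in zip(Wb, cand[0]))
                       for Wb, Cb in labels[j]):
                    continue                        # dominated by an existing label
                labels[j] = [(Wb, Cb) for Wb, Cb in labels[j]
                             if not (cand[1] <= Cb and
                                     all(a <= b for a, b in zip(cand[0], Wb)))]
                bisect.insort(labels[j], cand)      # keep lexicographic order
    return min((C for _, C in labels[n]), default=INF)
```

Each label comparison touches all |R| resource components, which is the O(|R|) factor in the phase-2 complexity noted above.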
5. Computational evaluation

This section identifies computational characteristics of TSA and benchmarks it against the state-of-the-art EMLSA and CPLEX optimization software (version 11.1). We programmed TSA and EMLSA in C/C++ and conducted all experiments on a Dell PC (OptiPlex GX620) with an Intel 3.2 GHz dual-core CPU, 2 GB RAM, and a 160 GB hard drive.

5.1. Test problems

We study four classes of test instances, all involving acyclic graphs. Instances in Classes 1 and 2, on graphs ranging in size from 100 nodes and 959 arcs to 500 nodes and 4978 arcs, are adapted from Beasley and Christofides [2]. The Class 1 instances are SRCSPs, while the Class 2 instances are MRCSPs with |R| = 10 resources.

Instances in Classes 3 and 4 are randomly generated. To construct their underlying acyclic graphs, we specify the number of nodes n and use q (1 < q ≤ n) to restrict the span of each arc (i,j) (i.e., j − i) to be j − i ≤ q, so that the optimal path contains at least n/q arcs. In this study, q ≈ n/4 when n ≥ 100 and q ≈ n when n < 100. Arc (i,j) is included in the graph with probability p for integer j on [i+1, min(n, i+q)]; the expected number of arcs is thus pq(n − (q+1)/2). The generated Class 3 instances range in size from 100 nodes and 946 arcs to 3000 nodes and 29,013 arcs, and the Class 4 instances range from 20 nodes and 155 arcs to 1000 nodes and 10,435 arcs. According to the number of resources and the way that we assign resource requirements to arcs, the Class 3 instances are SRCSPs and comprise two types (s1 and s2), and the Class 4 instances are MRCSPs with |R| = 4 resources and comprise three types (m1, m2, and m3):

type s1: wij is generated independently from the discrete uniform distribution DU[1,10];
type s2: wij = [(j − i)R] (i.e., wij is positively related to the span of arc (i,j));
type m1 (m2): for each resource r, wijr is assigned independently as for type s1 (s2);
type m3: the wijr are mutually independently drawn from DU[1,100], and w_{ij,r+|R|/2} = [2500R/wijr] (i.e., inversely related to resource r) for r = 1, ..., |R|/2, where [·] denotes the nearest integer and R denotes a random number from the uniform distribution U(0,1.0).

For each instance in Classes 3 and 4, we determine the limitation for resource r ∈ R using Tr = Tmin,r + (Tmax,r − Tmin,r)η, where Tmin,r (Tmax,r) denotes the requirement for resource r on the v1–vn path(s) that require(s) the minimum (maximum) amount of resource r, and we specify the parameter η on (0, 1.0] to control the tightness of the resource limitations. Tmin,r (Tmax,r) can be obtained by setting cij = wijr (cij = −wijr) and applying a classical shortest path algorithm on G for each resource r ∈ R.

5.2. Computational tests

Recall that TSA is designed to optimize a set of RCSP instances that have the same underlying graph and resource constraints but different arc costs, each instance corresponding to a column generation iteration. Therefore, we test TSA and the benchmark methods EMLSA and CPLEX in solving each set of 100 instances to simulate 100 iterations of column generation. We use 100 instances per set (i.e., 100 column generation iterations) to obtain accurate average performance, but note that in a column generation application, each sub-problem may be solved many – perhaps several thousand – times; thus, the run time for a single iteration is important. It is worth mentioning that the goal of our computational tests is to evaluate the performance of TSA in solving a set of RCSP instances rather than column generation itself. Therefore, we choose to solve randomly generated column generation iterations because they do not depend upon the master problem structure of any particular application and, often, randomly generated instances are more challenging than those encountered in practice.
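The limitation rule Tr = Tmin,r + (Tmax,r − Tmin,r)η from Section 5.1 amounts to a shortest-path and a longest-path pass over the topologically ordered DAG for each resource. The sketch below is our own illustration (names and the arc representation are assumptions; the paper's Tr is integer-valued, and rounding is omitted here):

```python
def resource_limits(n, arcs, num_r, eta):
    """Compute T_r = T_min,r + (T_max,r - T_min,r) * eta for each resource.
    T_min,r (T_max,r) is the minimum (maximum) total requirement of resource
    r over all v_1 -> v_n paths, obtained with arc 'costs' w_ijr (-w_ijr),
    i.e., shortest and longest paths on the topologically ordered DAG.
    arcs maps i -> list of (j, c_ij, w_ij)."""
    INF = float("inf")
    limits = []
    for r in range(num_r):
        lo = [INF] * (n + 1)
        hi = [-INF] * (n + 1)
        lo[1] = hi[1] = 0
        for i in range(1, n + 1):            # forward pass in topological order
            if lo[i] == INF:
                continue                     # node unreachable from v_1
            for j, _c, w in arcs.get(i, []):
                lo[j] = min(lo[j], lo[i] + w[r])
                hi[j] = max(hi[j], hi[i] + w[r])
        limits.append(lo[n] + (hi[n] - lo[n]) * eta)
    return limits
```

With η near 0 the limits are close to the minimum feasible usage (tight), and with η = 1.0 every v1–vn path is resource-feasible.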
Arc costs in RCSP are reduced costs for a sub-problem in column generation and can take any real value; thus, we generate arc costs randomly from U(−100.0, 100.0) for each instance in Classes 3 and 4. However, to be consistent with the cost structure of Beasley and Christofides [2], we generate arc costs from DU[0,5] for each instance in Classes 1 and 2. TSA, EMLSA, and CPLEX are all exact methods that prescribe an optimal solution; thus, our tests focus on run times.

5.3. Computational results: tables and measures

Tables 2 and 3 present results of tests on Class 1 and 2 instances, respectively. Column 1 gives the designation of the sets of instances; columns (1a)–(1c) give |R|, |V|, and |A|, respectively; and column 2 gives η. Columns 3–12 detail TSA results: respectively, g, the number of iterations of step 1 in the stage 1 algorithm; the numbers of bottleneck nodes and arcs; the size of the expanded graph GE, |VE| and |AE|; the run times for stages 1 and 2; and the average, minimum, and maximum run times for stage 3 over the 100 instances in each set. Columns 13–15 give the average, minimum, and maximum run times per instance, respectively, required by CPLEX. Columns 16–24 detail EMLSA results: respectively, #inst, the number of instances (out of 100) that are solved to optimality by the EMLSA preprocessor (phase 2 – the label-setting algorithm – is not used at all in solving these instances); the average DN/DA, the average numbers of nodes and arcs per instance deleted by the EMLSA preprocessor; the average, minimum, and maximum numbers of efficient labels; the average (avg-1),
Table 2
Results of solving each set of 100 Class 1 instances.

TSA and CPLEX:
Set     |R|  |V|  |A|   avg η  g  # bn.  # bn.  |VE|  |AE|     Run time (s): stage 1, stage 2,   CPLEX run time (s):
                                  nodes  arcs                  stage 3 avg/min/max               avg/min/max
rcsp3    1   100   959  0.08   1    3     24    1016    7662   0, 0.140, 0.00016/0/0.016         0.042/0.01/0.2
rcsp4    1   100   959  0.07   1    3     32     849    5962   0, 0.125, 0.00016/0/0.016         0.034/0.02/0.06
rcsp11   1   200  1971  0.10   1    7     56    3744   33,961  0, 0.625, 0.00093/0/0.016         0.067/0.03/0.16
rcsp12   1   200  1971  0.09   1    7     56    3262   28,866  0, 0.516, 0.00094/0/0.016         0.056/0.03/0.13
rcsp19   1   500  4978  0.06   1   28    296    6992   59,336  0, 1.094, 0.00297/0/0.016         0.179/0.08/0.36
rcsp20   1   500  4978  0.05   1   28    310    5739  416,158  0, 0.891, 0.00187/0/0.016         0.311/0.08/1.05

EMLSA:
Set     #inst  DN/DA    # eff. labels   Run time (s):                    threshold-1  threshold-2
               (avg)    avg/min/max     avg-1/min/max, avg-2
rcsp3    97    23/732   140/106/159     0.016/0.015/0.016, 0.005          9            30
rcsp4    86    23/736   152/72/291      0.011/0/0.031, 0.005             11            25
rcsp11  100     3/669   –               –, 0.004                          –           222
rcsp12  100    10/1024  –               –, 0.006                          –           100
rcsp19  100    37/3151  –               –, 0.006                          –           411
rcsp20  100    72/4265  –               –, 0.006                          –           203

A '–' in a row means that the measures of the label-setting phase of EMLSA are not applicable because the EMLSA preprocessor finds optimal solutions for all 100 instances. The same notation is used in Tables 4 and 5.
Table 3
Results of solving each set of 100 Class 2 instances.

TSA and CPLEX:
Set     |R|  |V|  |A|   avg η  g  # bn.      # bn.       |VE|  |AE|  Run time (s): stage 1, stage 2,     CPLEX run time (s):
                                  nodes^a    arcs^b                  stage 3 avg/min/max                 avg/min/max
rcsp7   10   100   999  0.10   1  10(28)     160(779)     419   594  0, 0.906, 0/0/0                     0.489/0.08/1.66
rcsp8   10   100   999  0.08   3  11(82)     293(974)      31    40  0, 0.188, 0.00016/0/0.016           1.875/0.19/3.25
rcsp15  10   200  1960  0.07   2  20(136)    259(1848)    153   199  0, 1.594, 0/0/0                     2.716/0.30/9.53
rcsp16  10   200  1960  0.06   3  24(191)    362(1950)      9    10  0, 0.313, 0/0/0                     9.656/0.86/20.86
rcsp23  10   300  4868  0.07   1  31(127)    309(3545)   3515  4755  0.016, 99.642, 0.00109/0/0.016     14.423/0.44/35.56
rcsp24  10   300  4868  0.06   2  31(411)    379(4735)    142   187  0.016, 7.843, 0/0/0                27.613/7.47/48.69

EMLSA:
Set     #inst  DN/DA     # eff. labels        Run time (s):                      threshold-1  threshold-2
               (avg)     avg/min/max          avg-1/min/max, avg-2
rcsp7    33    24/446    2204/137/3082        0.294/0.016/0.406, 0.198            3            5
rcsp8    10    14/362    772/580/855          0.094/0.063/0.110, 0.085            2            2
rcsp15   22    34/596    4966/310/6068        0.656/0.047/0.922, 0.513            2            3
rcsp16    2    20/374    1500/1007/1571       0.195/0.140/0.204, 0.192            2            2
rcsp23   42   138/2517   69,694/89/134,023    10.717/0.047/20.907, 6.224          9           16
rcsp24    6    45/660    23,214/2921/25,720   3.164/0.375/3.672, 2.975            2            3

^a Number of bottleneck nodes identified by the stage 1 algorithm; the term in parentheses gives the total number of bottleneck nodes.
^b Number of bottleneck arcs identified by the stage 1 algorithm; the term in parentheses gives the total number of bottleneck arcs.
minimum, and maximum run times per instance calculated over the (100 − #inst) instances that are solved by phase 2 of EMLSA (i.e., not solved by its preprocessor); and avg-2, the run time per instance averaged over the total of 100 instances.

Beasley and Christofides [2] specified the resource limitations for all Class 1 and 2 instances. We reverse-engineered their limitations to calculate the average value of η over all resources for each set of instances in order to judge the tightness of their limitations. As shown in column 2 of Tables 2 and 3, the average value of η for all sets of Class 1 and 2 instances is less than 0.1, indicating that their resource limitations are tight.

Note that we implement stages 1 and 2 of TSA once before solving the 100 instances in each set, and stage 3 for each instance. In contrast, EMLSA must be implemented in its entirety for each instance. Therefore, to compare the run times of TSA and EMLSA over each set of instances, we calculate the threshold that defines the break-even point in the number of instances: when the number of instances solved (i.e., the number of column generation iterations) is greater than or equal to the threshold, the total time needed by TSA is less than that of EMLSA. Based on the respective run times, it is defined as

threshold = ⌈[cpu(stage 1 of TSA) + cpu(stage 2 of TSA)] / [average-cpu(EMLSA) − average-cpu(stage 3 of TSA)]⌉,   (6)

where ⌈·⌉ denotes the smallest integer greater than or equal to its argument. The numerator in (6) is the run time that TSA expends (one time) in stages 1 and 2. If we consider the instances that are solved by phase 2 of EMLSA and use avg-1 for average-cpu(EMLSA) in (6), we obtain threshold-1 in column 25 of Tables 2 and 3; if we consider all 100 instances and use avg-2 for average-cpu(EMLSA), we obtain threshold-2 in column 26.
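The break-even rule in Eq. (6) is simple enough to state as a few lines of code; this is our own sketch of the formula (names are assumptions), returning None in the negative-denominator case discussed next:

```python
import math

def threshold(stage1_cpu, stage2_cpu, avg_emlsa, avg_stage3):
    """Break-even number of instances from Eq. (6): once at least this many
    instances (column generation iterations) are solved, TSA's one-time
    stage 1 + stage 2 cost is repaid by its faster per-instance stage 3.
    Returns None when EMLSA is faster per instance (negative denominator)."""
    saving = avg_emlsa - avg_stage3
    if saving <= 0:
        return None                 # EMLSA wins regardless of the iteration count
    return math.ceil((stage1_cpu + stage2_cpu) / saving)
```

For example, plugging in the rcsp3 figures from Table 2 (stage 1 = 0 s, stage 2 = 0.140 s, avg-1 = 0.016 s, stage 3 average = 0.00016 s) gives ⌈0.140/0.01584⌉ = 9, matching threshold-1 for that set.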
Note that the threshold will be negative if the average cpu time of EMLSA is less than the average cpu time of stage 3 of TSA, meaning that EMLSA is faster no matter how many column generation iterations (i.e., instances) are performed. Because it is entirely fortuitous for the EMLSA preprocessor to prescribe an optimal solution, and no criterion can determine a priori whether the preprocessor will give an optimal solution, we use avg-1 as the average run time of EMLSA (unless otherwise stated) and threshold-1 as a key measure to compare the relative performances of TSA and EMLSA.

Tables 4 and 5 present results of tests on Class 3 and 4 instances, respectively. Column 1 gives the designation of the sets of instances (a combination (class type, |V|); for example, s1-100 denotes the set of SRCSP instances of type s1 with 100 nodes). Column 2 indicates the value of η, ranging from tight to loose resource limitation(s) (small to large values of η). We test Class 3 instances using three values of η ∈ {0.1, 0.5, 0.9} and Class 4 instances using a subset of η ∈ {0.05, 0.1, 0.5, 0.9, 0.95}. The other columns correspond to those in Tables 2 and 3.

From our tests on all four classes of instances, the run time required by stage 1 of TSA is negligible. For all SRCSP instances, the number of iterations of step 1 in the stage 1 algorithm, g, is 1, consistent with Proposition 1. For MRCSP, 1 ≤ g ≤ |A| (Proposition 2); but, in general, g is much smaller than |A|. In fact, g ≤ 6 for all MRCSP instances in Classes 2 and 4. Further, the stage 1 algorithm identifies and deletes all bottlenecks for the SRCSP instances, while, for the MRCSP instances, it cannot identify some bottlenecks, verifying the analysis in Section 3.1.1.

For TSA, stage 1 is resource-based and conducted only once for a set of instances that have the same underlying graph and resource constraints; thus, we report the numbers of bottleneck nodes and arcs for each set of instances in Tables 2 and 3. These bottlenecks cannot be on any feasible solution, no matter what the arc costs are. However, the EMLSA preprocessor is based jointly on resources and cost and must be implemented for each instance in the set, and the numbers of nodes and arcs deleted by the EMLSA preprocessor differ from instance to instance.
Thus, for each set, Tables 2 and 3 report the average numbers of nodes and arcs per instance deleted by the EMLSA preprocessor. Because of the difference between the mechanisms of TSA and EMLSA, this comparison may not be completely fair, but it is interesting. Since the EMLSA preprocessor uses cost parameters as
Table 4
Results of solving each set of Class 3 instances.

Set      η    TSA: |VE|   |AE|        Stage 2  Stage 3   CPLEX run  EMLSA: #inst  # eff.   avg-1    avg-2    threshold-1  threshold-2
                                      (s)      (s)       time (s)                 labels   (s)      (s)
s1-100   0.1  1797        15,306      0.359    0.0002    0.318      0             831      0.0725   0.0725    5            5
s1-100   0.5  5868        61,395      1.328    0.0014    0.078      80            250      0.0101   0.0047    152          402
s1-100   0.9  1154        11,956      0.171    0.0003    0.032      100           –        –        0.0022    –            92
s2-100   0.1  525         2078        0.062    0         0.161      0             388      0.0195   0.0195    3            3
s2-100   0.5  1985        18,465      0.328    0         0.061      44            242      0.0114   0.0078    29           42
s2-100   0.9  703         6976        0.109    0.0002    0.038      100           –        –        0.0022    –            54
s1-500   0.1  15,005      152,044     2.938    0.0049    1.734      0             5477     0.5283   0.5283    6            6
s1-500   0.5  44,861      526,350     8.718    0.0125    0.530      63            523      0.0238   0.0161    773          2415
s1-500   0.9  9485        111,743     1.625    0.0025    0.274      100           –        –        0.0086    –            267
s2-500   0.1  11,605      57,040      1.313    0.0019    1.288      0             4122     0.2288   0.2288    6            6
s2-500   0.5  44,532      469,984     9.203    0.0125    0.729      73            402      0.0197   0.0128    1288         31,734
s2-500   0.9  11,105      125,874     1.938    0.0031    0.248      100           –        –        0.0088    –            344
s1-1000  0.1  38,269      378,134     6.734    0.0114    3.124      0             14,100   1.4331   1.4331    5            5
s1-1000  0.5  111,872     1,239,197   20.36    0.0300    1.876      57            1180     0.0487   0.0320    1090         9980
s1-1000  0.9  21,040      239,205     3.453    0.0055    0.740      100           –        –        0.0164    –            316
s2-1000  0.1  37,379      177,380     4.344    0.0069    2.020      0             9363     0.5099   0.5099    9            9
s2-1000  0.5  173,035     1,761,164   39.032   0.0484    1.200      72            523      0.0301   0.0225    <0           <0
s2-1000  0.9  40,907      447,040     7.328    0.0102    0.593      100           –        –        0.0163    –            1211
s1-3000  0.1  78,963      705,546     13.125   0.0229    16.209     0             21,633   1.7310   1.7310    8            8
s1-3000  0.5  248,658     2,591,222   42.719   0.0686    6.459      26            338      0.0743   0.0684    7453         <0
s1-3000  0.9  9503        94,517      1.593    0.0025    3.520      100           –        –        0.0497    –            34
s2-3000  0.1  90,781      352,494     9.954    0.0155    12.492     0             16,094   0.6899   0.6899    15           15
s2-3000  0.5  1,126,408   10,827,859  362.911  0.3106    7.640      97            220      0.0627   0.0527    <0           <0
s2-3000  0.9  75,915      828,642     12.422   0.0223    3.106      100           –        –        0.0500    –            449
Table 5
Results of solving each set of Class 4 instances.

[Table 5 tabulates, for the nine sets of Class 4 instances (m1-20, m2-20, and m3-20 with η ∈ {0.1, 0.5, 0.9}; m1-100, m2-100, and m3-100 with η ∈ {0.05, 0.1 (0.25 for m2-100), 0.95}; and m1-1000, m2-1000, and m3-1000 with η ∈ {0.05, 0.95}), the same measures as Table 4: |VE|, |AE|, and the stage 2 and stage 3 run times for TSA; the CPLEX run time; and #inst, the number of efficient labels, avg-1, avg-2, threshold-1, and threshold-2 for EMLSA. The individual entries are not reproduced here because they could not be recovered reliably from the source; one m-1000 entry is recorded as "low memory". Representative values (e.g., for the m3-100 sets) are quoted in Sections 5.5 and 5.6.]

A '–' in a row means that no performance measures are reported because the TSA preprocessor (stage 1) identifies the instances with the specified η to be resource-infeasible.
well as resource limitations, it may delete more nodes and arcs than stage 1 of TSA identifies using only resource limitations. The key aspect lies in the comparison of run times, which we carry out thoroughly in the tables and the rest of this section. We detail these observations in Tables 2 and 3 but do not do so in Tables 4 and 5 to save space.

5.4. Results: comparison with CPLEX

For all sets of Class 1, 2, and 3 instances, EMLSA is faster than CPLEX on average, especially for large instances (e.g., s1-3000 and s2-3000) or when the resource limitations are medium to loose (e.g., η = 0.5 and 0.9). For most sets of Class 4 instances, EMLSA is faster than CPLEX; however, for the instance sets m1-100 with η = 0.1, m2-100 with η = 0.25, and m3-100 with η = 0.1, CPLEX is slightly faster than EMLSA. It is worth mentioning that for large Class 4 instances (e.g., m1-1000, m2-1000, and m3-1000), EMLSA is still faster than CPLEX. On the whole, EMLSA is faster than CPLEX with a few limited exceptions. Therefore, we analyze threshold values hereafter only to compare TSA with EMLSA. In addition, for all four classes of instances, stage 3 of TSA is faster than CPLEX, by up to several orders of magnitude; thus, TSA is faster than CPLEX for solving a sufficient number of instances that have the same underlying graph and resource constraints but different arc costs.

5.5. Results: comparison of TSA and EMLSA

For a set of RCSP instances having the same underlying graph and resource constraints, TSA is faster than EMLSA when the number of instances solved (i.e., column generation iterations) is larger than threshold-1. As shown in Tables 2 and 3, for all sets of Class 1 and 2 instances, threshold-1 is at most 11. Among these sets, stage 3 of TSA takes at most 0.003 s (rcsp19) on average per instance, which is always faster than EMLSA and sometimes
even several orders of magnitude faster (e.g., rcsp8). For some sets of Class 1 instances (e.g., rcsp19), the EMLSA preprocessor solves all 100 instances. Even in such a case, the average time per instance required by stage 3 of TSA is still smaller than that of (the preprocessor of) EMLSA (avg-2), but threshold-2 is relatively large (e.g., 411 for rcsp19).

As shown in Table 4, for all sets of Class 3 instances with η = 0.1, threshold-1 ≤ 15, and so TSA outperforms EMLSA. When η = 0.5, stage 2 of TSA takes a relatively long time to expand a graph, which may be rather large. Thus, more instances (29 ≤ threshold-1 ≤ 7453) are needed to accumulate the time saving (the difference between the average run times per instance of EMLSA and stage 3 of TSA) that offsets the time consumed in stage 2. On some sets of instances, TSA is, on average, slower per instance than EMLSA (threshold-1 < 0). When η = 0.9, the EMLSA preprocessor fortuitously prescribes the optimal solution for all 100 instances in each set. Intuitively, an optimal path in graph G can be found easily if the resource limitation is loose. However, it is challenging to find even a resource-feasible path in G when the limitation is tight; for example, the EMLSA preprocessor cannot find any optimal solution for the sets of instances with a tight resource limitation (η = 0.1).

Table 5 shows that, when η is small (≤ 0.25) for the sets of Class 4 instances, threshold-1 is less than 100, except for the set m3-100 with η = 0.1, for which threshold-1 = 648. For this exception, the average time per instance of stage 3 of TSA is two orders of magnitude smaller than that of EMLSA; however, the time that stage 2 of TSA requires to set up GE must be offset by the time saved over threshold-1 = 648 instances. When η = 0.5 and the underlying graph is small (e.g., the m-20 series comprising the m1-20, m2-20, and m3-20 sets of instances), TSA outperforms EMLSA if the number of instances solved is large enough (4 ≤ threshold-1 ≤ 117).
However, for a medium value of η and a large graph (e.g., the m-1000 series), TSA may not outperform EMLSA (threshold-1 < 0 for some sets of instances) because of the relatively long run time of its stage 2 and the large size of the resulting graph GE; thus, we only tabulate tests for small and large values of η on large instances. For each set of Class 4 instances with a loose resource limitation (η ≥ 0.9), the EMLSA preprocessor prescribes the optimal solution for all 100 instances.

5.6. Results: effects of the number of resources and resource limitations

Both TSA and EMLSA are more effective on SRCSP (Classes 1 and 3) than on MRCSP (Classes 2 and 4) because the size of the state space increases with the number of resources (see Remark 1). For a given graph size, if η is small (e.g., 0.1), TSA can identify and delete more bottlenecks for MRCSP than for SRCSP. However, as η increases (e.g., 0.5 and 0.9), multiple resources do not help in identifying more bottlenecks. On the other hand, EMLSA generates more efficient labels when multiple resources are involved.

The performances of both TSA and EMLSA are closely related to the resource limitation parameter η. The number of efficient labels and the run times required by EMLSA decrease monotonically as η increases (i.e., as the resource limitations become looser). However, the size of GE and the total run time incurred by TSA for η = 0.5 are much larger than those for either small or large values of η. If η is small (e.g., 0.1), many bottlenecks can be identified and deleted, so that few feasible v1–vn paths exist. On the other hand, if η is large (e.g., 0.9), some paths are always feasible relative to the resource limitations, and GE includes them. The extreme case is η = 1.0, for which GE = GR. For MRCSP, the run time of stage 2 of TSA and the size of the resulting GE are quite sensitive to η. For example, for the m3-100 instance sets, as η increases from 0.05 to 0.1, the size of GE increases from 76 nodes and 104 arcs to 201,708 nodes and 377,057 arcs, and the run time of stage 2 of TSA increases from 0.031 to 2496 s.
As η increases further to 0.9, the size of GE decreases to 6042 nodes and 56,656 arcs, and the run time of stage 2 decreases to 2.58 s. The run time of TSA is more sensitive to η than is that of EMLSA.
6. Conclusions and recommendations for future research

Tests show that, in general, TSA is faster than EMLSA, which, in turn, is faster than CPLEX. TSA requires little run time to solve an instance with either a small or a large value of η, but its run time increases for mid-range values of η; in contrast, EMLSA's run time decreases as η increases. Thus, when the resource limitation is tight – as can be expected in practice – TSA is faster. Stage 2 of TSA may require a substantial run time, but stage 3 of TSA runs very quickly to optimize an RCSP instance at each column generation iteration; in fact, it can be several orders of magnitude faster than EMLSA. For a medium-to-loose resource limitation, TSA outperforms EMLSA on graphs of small-to-medium size, while EMLSA may do better in solving instances on large graphs.

Another point is that TSA is more stable than CPLEX and EMLSA. The run time of its stage 3 depends on the size of the expanded graph GE and is almost the same for all instances in a given set (having the same underlying graph and resource constraints). However, the run time of CPLEX, as well as the number of efficient labels and the run time of EMLSA, differ markedly from one instance to another in a given set. As an example, in solving rcsp23 over 58 (i.e., 100 − #inst = 58) instances, the average, minimum, and maximum numbers of efficient labels are 69,694/89/134,023, and the corresponding run times are 10.717/0.047/20.907 s. Tables 2 and 3 detail all average, minimum, and maximum values; to save space, Tables 4 and 5 present only the corresponding average values.

Future research could further specialize algorithms to deal with particular issues that arise in the column generation context.
One special issue is reoptimizing when only a few arc costs change from one column generation iteration to the next. Another issue is that arc (i,j) in G may be forbidden or prescribed (i.e., the branching rule may fix the associated decision variable xij to 0 or 1, respectively). Considering the need to solve a sub-problem repetitively, future research may be directed to developing effective methods to deal with these issues. Other future work could embed TSA in column generation to solve real-world problems.
Acknowledgements

This material is based upon work supported by the Texas Advanced Technology Program under Grant no. 000512-0248-2001 and by the National Science Foundation under Grant no. DMI-0217265.

References

[1] Aneja YP, Aggarwal V, Nair KPK. Shortest chain subject to side constraints. Networks 1983;13:295–302.
[2] Beasley JE, Christofides N. An algorithm for the resource constrained shortest path problem. Networks 1989;19:379–94.
[3] Desaulniers G, Desrosiers J, Dumas Y, Marc S, Rioux B, Solomon MM, et al. Crew pairing at Air France. Eur J Oper Res 1997;97:245–59.
[4] Desrochers M. An algorithm for the shortest path problem with resource constraints. Technical Report G-88-27. GERAD, École des H.E.C., Montréal, Canada, 1988.
[5] Desrochers M, Desrosiers J, Solomon M. A new optimization algorithm for the vehicle routing problem with time windows. Oper Res 1992;40(2):342–54.
[6] Desrochers M, Soumis F. A reoptimization algorithm for the shortest path problem with time windows. Eur J Oper Res 1988;35:242–54.
[7] Desrochers M, Soumis F. A generalized permanent labeling algorithm for the shortest path problem with time windows. INFOR 1988;26:193–214.
[8] Dijkstra EW. A note on two problems in connexion with graphs. Numer Math 1959;1:269–71.
[9] Dolgui A, Guschinsky N, Levin G. A special case of transfer lines balancing by graph approach. Eur J Oper Res 2006;168:732–46.
[10] Dror M. Note on the complexity of the shortest path models for column generation in VRPTW. Oper Res 1994;42(5):977–8.
[11] Dumitrescu I, Boland N. Algorithm for the weight constrained shortest path problem. Int Trans Oper Res 2001;8:15–30.
[12] Dumitrescu I, Boland N. Improved preprocessing, labeling and scaling algorithms for the weight-constrained shortest path problem. Networks 2003;42(3):135–53.
[13] Fahle T, Junker U, Karisch SE, Kohl N, Sellmann M, Vaaben B. Constraint programming based column generation for crew assignment. J Heuristics 2002;8(1):59–81.
[14] Feillet D, Dejax P, Gendreau M, Gueguen C. An exact algorithm for the elementary shortest path problem with resource constraints: application to some vehicle routing problems. Networks 2004;44(3):216–29.
[15] Gellermann T, Sellmann M, Wright R. Shorter path constraints for the resource constrained shortest path problem. CP-AI-OR 2005, Lecture notes in computer science, vol. 3524. p. 201–16.
[16] Handler GY, Zang I. A dual algorithm for the constrained shortest path problem. Networks 1980;10:293–310.
[17] Hassin R. Approximation schemes for the restricted shortest path problem. Math Oper Res 1992;17(1):36–42.
[18] Holmberg K, Yuan D. A multicommodity network-flow problem with side constraints on paths solved by column generation. Informs J Comput 2003;15(1):42–57.
[19] Ioachim I, Gélinas S, Soumis F, Desrosiers J. A dynamic programming algorithm for the shortest path problem with time windows and linear node costs. Networks 1998;31:193–204.
[20] Jaffe JM. Algorithms for finding paths with multiple constraints. Networks 1984;14:95–116.
[21] Jaumard B, Semet F, Vovor T. A two-phase resource constrained shortest path algorithm for acyclic graphs. Les Cahiers du GERAD G-96-48. Montréal: École des Hautes Études Commerciales; 1996.
[22] Lorenz DH, Raz D. A simple efficient approximation scheme for the restricted shortest path problem. Oper Res Lett 2001;28:213–9.
[23] Mehlhorn K, Ziegelmann M. Resource constrained shortest paths. In: Paterson M, editor. Annual European symposium on algorithms (ESA 2000). Lecture notes in computer science, vol. 1879. Berlin: Springer-Verlag. p. 326–37.
[24] Mingozzi A, Boschetti MA, Ricciardelli S, Bianco L. A set partitioning approach to the crew scheduling problem. Oper Res 1999;47(6):873–88.
[25] Sellmann M. Cost-based filtering for shorter path constraints. In: Proceedings of the ninth international conference on the principles and practices of constraint programming (CP). Springer; 2003. p. 694–708.
[26] Sellmann M, Gellermann T, Wright R. Cost-based filtering for shorter path constraints. Constraints 2007;12(2):207–38.
[27] Wilhelm WE. A column-generation approach for the assembly system design problem with tool changes. Int J Flexible Manuf Syst 1999;11:177–205.
[28] Wilhelm WE. A technical review of column generation in integer programming. Optim Eng 2001;2:159–200.
[29] Wilhelm WE, Arambula I, Choudhry NN. A model to optimize picking operations on dual-head placement machines. IEEE Trans Autom Sci Eng 2006;3:1–15.
[30] Wilhelm WE, Choudhry ND, Damodaran P. A model to optimize placement operations on dual-head placement machines. Discrete Optim 2007;182:626–39.
[31] Wilhelm WE, Damodaran P, Li J. Prescribing the content and timing of product upgrades. IIE Trans 2003;35(7):647–64.
[32] Wilhelm WE, Liang D, Rao B, Warrier D, Zhu X, Bulusu S. Design of international assembly systems and their supply chains under NAFTA. Transport Res: Part E 2005;41:467–93.
[33] Wilhelm WE, Zhu X. Enabling flexibility on a dual head placement machine by optimizing platform-tray-feeder picking operations. Flex Serv Manuf J 2009;21(1–2):1–30.