Variable neighborhood search for the cost constrained minimum label spanning tree and label constrained minimum spanning tree problems

Variable neighborhood search for the cost constrained minimum label spanning tree and label constrained minimum spanning tree problems

ARTICLE IN PRESS Computers & Operations Research 37 (2010) 1952–1964 Contents lists available at ScienceDirect Computers & Operations Research journ...

257KB Sizes 1 Downloads 112 Views

ARTICLE IN PRESS Computers & Operations Research 37 (2010) 1952–1964

Contents lists available at ScienceDirect

Computers & Operations Research journal homepage: www.elsevier.com/locate/caor

Variable neighborhood search for the cost constrained minimum label spanning tree and label constrained minimum spanning tree problems Zahra Naji-Azimi a, Majid Salari a, Bruce Golden b,, S. Raghavan b, Paolo Toth a a b

DEIS, University of Bologna, Viale Risorgimento 2, 40136 Bologna, Italy Robert H. Smith School of Business, University of Maryland, College Park, MD 20742-1815, USA

a r t i c l e in f o

a b s t r a c t

Available online 13 January 2010

Given an undirected graph whose edges are labeled or colored, edge weights indicating the cost of an edge, and a positive budget B, the goal of the cost constrained minimum label spanning tree (CCMLST) problem is to find a spanning tree that uses the minimum number of labels while ensuring its cost does not exceed B. The label constrained minimum spanning tree (LCMST) problem is closely related to the CCMLST problem. Here, we are given a threshold K on the number of labels. The goal is to find a minimum weight spanning tree that uses at most K distinct labels. Both of these problems are motivated from the design of telecommunication networks and are known to be NP-complete [15]. In this paper, we present a variable neighborhood search (VNS) algorithm for the CCMLST problem. The VNS algorithm uses neighborhoods defined on the labels. We also adapt the VNS algorithm to the LCMST problem. We then test the VNS algorithm on existing data sets as well as a large-scale dataset based on TSPLIB [12] instances ranging in size from 500 to 1000 nodes. For the LCMST problem, we compare the VNS procedure to a genetic algorithm (GA) and two local search procedures suggested in [15]. For the CCMLST problem, the procedures suggested in [15] can be applied by means of a binary search procedure. Consequently, we compared our VNS algorithm to the GA and two local search procedures suggested in [15]. The overall results demonstrate that the proposed VNS algorithm is of high quality and computes solutions rapidly. On our test datasets, it obtains the optimal solution in all instances for which the optimal solution is known. Further, it significantly outperforms the GA and two local search procedures described in [15]. & 2009 Elsevier Ltd. All rights reserved.

Keywords: Minimum spanning tree problem Minimum label spanning tree problem Heuristics Mixed integer programming Variable neighborhood search Genetic algorithm

1. Introduction The minimum label spanning tree (MLST) problem was introduced by Chang and Leu [2]. In this problem, we are given an undirected graph G=(V, E) with labeled edges; each edge has a single label from the set of labels L and different edges can have the same label. The objective is to find a spanning tree with the minimum number of distinct labels. The MLST is motivated from applications in the communications sector. Since communication networks sometimes include numerous different media such as fiber optics, cable, microwave or telephone lines and communication along each edge requires a specific media type, decreasing the number of different media types in the spanning tree reduces the complexity of the communication process. The MLST problem is known to be NP-complete [2]. Several researchers have ¨ studied the MLST problem including Bruggemann et al. [1], Cerulli et al. [3], Consoli et al. [4], Krumke and Wirth [6], Wan et al. [14], and Xiong et al. [16–18].

 Corresponding author. Tel.: + 1 301 405 2232; fax: +1 301 405 8655.

E-mail address: [email protected] (B. Golden). 0305-0548/$ - see front matter & 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.cor.2009.12.013

Recently Xiong et al. [15] introduced a more realistic version of the MLST problem called the label constrained minimum spanning tree (LCMST) problem. In contrast to the MLST problem, which completely ignores edge costs, the LCMST problem takes into account the cost or weight of edges in the network (we use the term cost and weight interchangeably in this paper). The objective of the LCMST problem is to find a minimum weight spanning tree that uses at most K labels (i.e., different types of communications media). Xiong et al. [15] describe two simple local search heuristics and a genetic algorithm for solving the LCMST problem. They also describe a mixed integer programming (MIP) model to solve the problem exactly. However, the MIP models were unable to find solutions for problems with greater than 50 nodes due to excessive memory requirements. The cost constrained minimum label spanning tree (CCMLST) problem is another realistic version of the MLST problem. The CCMLST problem was introduced by Xiong et al. [15]. In contrast to the LCMST problem, there is a threshold on the cost of the minimum spanning tree (MST) while minimizing the number of labels. Thus, given a graph G= (V, E), where each edge (i, j) has a label from the set L and an edge weight cij, and a positive budget B, the goal of the CCMLST problem is to find a spanning tree with the

ARTICLE IN PRESS Z. Naji-Azimi et al. / Computers & Operations Research 37 (2010) 1952–1964

fewest number of labels whose weight does not exceed the budget B. The notion is to design a tree with the fewest number of labels while ensuring that the budget for the network design is not exceeded. (Notice that the objective function here is not the cost of the spanning tree, but rather the number of labels in the spanning tree.) Xiong et al. [15] showed that both the LCMST and the CCMLST are NP-Complete. Thus, the resolution of these problems requires heuristics. In this paper, we first focus on the CCMLST problem. We propose a variable neighborhood search (VNS) method for the CCMLST problem. The VNS algorithm uses neighborhoods defined on the labels. We also adapt the VNS method that we develop to the LCMST problem. We then compare the VNS method to the heuristics described by Xiong et al. [15]. To do so, we consider existing data sets and also design a set of nine Euclidean largescale datasets, derived from TSPLIB instances [12]. On the LCMST problem we directly compare our VNS method with the heuristic procedures of Xiong et al. [15] (which are the best known heuristics for the LCMST problem). On the CCMLST, we adapt the procedures of Xiong et al. [15] by embedding them in a binary search procedure. The VNS methods perform extremely well on both the CCMLST and LCMST problems, with respect to solution quality and computational running time. The rest of this paper is organized as follows. Section 2 describes the mathematical formulation proposed for the LCMST and CCMLST problems. Section 3 describes the VNS method for the CCMLST problem. Section 4 describes the VNS method for the LCMST problem. Section 5 describes the procedures of Xiong et al. [15] for the LCMST problem, and explains the binary search procedure. Section 6 reports on our computational experiments. Finally, Section 7 provides concluding remarks.

xij ¼



1

if arc ði; jÞ is used

0

otherwise

1953

and fijh ¼ flow of commodity h along arc ði; jÞ: The MIP formulation based on the multicommodity flow (mcf) model is as follows: X ðmcfÞ min yk ð1Þ kAL

X

subject to

xij ¼ n1

ð2Þ

ði;jÞ A A

X

i:ði;DðhÞÞ A A

X i:ði;1Þ A A

fijh r xij

ð3Þ

X

f1lh ¼ 1

8h A H

ð4Þ

X

fjlh ¼ 0

8h A H;

8j ah

ð5Þ

l:ðj;lÞ A A

8ði; jÞ A A;

xij þ xji r 1 X

8hA H

l:ð1;lÞ A A

fijh 

i:ði;jÞ A A

h fDðhÞl ¼1

l:ðDðhÞ;lÞ A A

fi1h 

X

X

h fiDðhÞ 

8hA H

ð6Þ

8ði; jÞ A E

xij r ðn1Þ  yk

ð7Þ 8kA L

ð8Þ

ði;jÞ A Ak

X

cij xij r B

ð9Þ

ði;jÞ A A

2. Mathematical formulation In this section, we provide a mixed integer programming (MIP) model for the CCMLST problem. It is based on a multicommodity network flow formulation, and is similar to the MIP model described in Xiong et al. [15] (though our notation is somewhat more compact). Further, we significantly strengthen the model by improving upon one of the constraints in their model. The main idea in the multicommodity network flow model is to direct the spanning tree away from an arbitrarily selected root node, and to use flow variables to model the connectivity requirement. To help define the multicommodity network flow model, we define a bidirected network obtained by replacing each undirected edge {i, j} by a pair of directed arcs (i, j) and (j, i). Let A denote the set of arcs, L= {1,2, y, l} the set of labels, V= {1,2, y, n} the set of nodes in the graph, and B the budget. Also, let Ak denote the set of all arcs with label k and cij be the cost of arc (i, j). Note that the cost and label of arcs (i, j) and (j, i) are identical to those of edge {i, j}. To model the fact that the spanning tree must be connected, we use the following well-known idea [8] and multicommodity network flow model. We pick node 1 as the root node (any node of the graph may be picked for this purpose). We then observe that the spanning tree on the nodes can be directed away from the root node. Consequently, we create a set of H commodities, where each commodity h has the root node as its origin and the destination (denoted by D(h)) is one of the nodes in {2, 3, y, n} (for a total of n  1 commodities). Each commodity has a supply of 1 unit of flow and a demand of 1 unit of flow. The variables in the multicommodity network flow formulation are defined as follows:  1 if label k is selected yk ¼ 0 otherwise

xij ; yk

A f0; 1g

fijh Z 0

8ði; jÞ A A;

8ði; jÞ A A; 8hA H:

8k A L

ð10Þ ð11Þ

In the objective function (1), we want to minimize the total number of labels used in the solution. Constraint (2) ensures the tree has exactly (n  1) arcs. Constraints (3)–(5) represent the flow balance constraints for the commodity flows. Constraint set (6) is a forcing constraint set. These constraints enforce the condition that if flow is sent along an arc, the arc must be included in the directed tree. Constraint set (7) ensures that either arc (i, j) or arc (j, i) can be in the solution, but not both (recall the tree must be directed away from the root node). Constraint set (8) is a forcing constraint set between arcs and labels. It says that if an arc with label k is used, then this label must be selected. Constraint (9) imposes the budget on the tree cost. Finally, constraint (10) defines the arc and label variables as binary, and constraint (11) defines the flow variables as non-negative. Note that to model the LCMST: (i) we use the left-hand side of constraint (9) as the objective, and (ii) constraint (9) is replaced P by the constraint k A L yk r K, where K denotes the limit on the number of labels permitted. As such (with the change in the objective function and constraint (9)) the multicommodity flow formulation is virtually identical to Xiong et al. [15] (we have eliminated the edge variables in the Xiong et al. [15] model and thus our notation is somewhat more compact). However, this formulation can be considerably strengthened by using the technique of constraint disaggregation (see page 185 of [5]) on constraint set (8). We replace this constraint with the stronger xij r yk

8kA L;

8ði; jÞ A Ak :

ð80 Þ

In our computational work, we use the multicommodity flow model with constraint set (80 ) to obtain lower bounds and optimal solutions on our test instances. We found that it is considerably

ARTICLE IN PRESS 1954

Z. Naji-Azimi et al. / Computers & Operations Research 37 (2010) 1952–1964

stronger than the multicommodity flow model proposed by Xiong et al. [15] that has constraint (8).

minimum spanning tree algorithms [7,11]. Consequently, our search for a solution focuses on selecting labels (as opposed to edges), and our neighborhoods as such are neighborhoods on labels. Our solutions then are described in terms of the labels they contain (as opposed to the edges they contain). Furthermore, without loss of generality, we assume MSTCOST(L)rB, because if BoMSTCOST(L) the problem is infeasible.

3. Variable neighborhood search for the CCMLST problem In this section, we develop our variable neighborhood search algorithm for the CCMLST problem. We also describe how it can be applied to the LCMST problem. Variable neighborhood search is a metaheuristic proposed by Mladenovic and Hansen [9], which explicitly applies a strategy based on dynamically changing neighborhood structures. The algorithm is very general and many degrees of freedom exist for designing variants. The basic idea is to choose a set of neighborhood structures that vary in size. These neighborhoods can be arbitrarily chosen, but usually a sequence of neighborhoods with increasing cardinality is defined. In the VNS paradigm, an initial solution is generated, then the neighborhood index is initialized, and the algorithm iterates through the different neighborhood structures looking for improvements, until a stopping condition is met. We consider VNS as a framework, and start by constructing an initial solution. We then improve upon this initial solution using local search. Then, the improvement of the incumbent solution (R) continues in a loop until the termination criterion is reached. This loop contains a shaking phase and a local search phase. The shaking phase follows the VNS paradigm. It considers a specially designed neighborhood and makes random changes to the current solution that enables us to explore neighborhoods farther away from the current solution. The local search phase considers a more restricted neighborhood set and attempts to improve upon the quality of a given solution. We now make an important observation regarding the relationship between the selected labels and the associated solution. Given a set of labels RD L, the minimum cost solution on the labels R is the minimum spanning tree computed on the graph induced by the labels in R. We denote the minimum spanning tree on the graph induced by the labels in R as MST(R) and its cost by MSTCOST(R). These two can be computed rapidly using any of the well-known

3.1. Initial solution Our procedure to construct an initial solution focuses on selecting a minimal set of labels that result in a connected graph. Let Components(R) denote the number of connected components in the graph induced by the labels in R. This can easily be computed using depth first search [13]. Our procedure adds labels to our solution in a greedy fashion. The label selected for addition to the current set of labels is the one (amongst all the labels that are not in the current set of labels) that when added results in the minimum number of connected components. Ties between labels are broken randomly. In other words, we choose a label for addition to the current set of labels R randomly from the set S ¼ ft A ðL\RÞ : min

2

3

1

Labels:

3

2

1

a

2 b

2.5

1

2.5 c

Initial graph

1

1

2

2

2

2

3

1

3 1

2.5 Components (a) = 4

2

2

3

3 2

2 1

2.5

Components (c) = 2

Components (b) = 3

2.5

ð12Þ

This continues until the selected labels result in a single component. In Fig. 1, an example illustrating the initialization method is shown. Suppose there are three labels, namely a, b, and c, in the label set. Since the number of connected components after adding label c is less than for the two other labels, we add this label to the solution. However the graph is still not connected, so we go further by repeating this procedure with the remaining labels. Both labels a and b produce the same number of components, so we select one of them randomly (label b). Before considering the cost constraint, which we need to satisfy, we have found it useful to try to improve the quality of the initial solution slightly. To this aim, we swap used and unused labels in a given solution in order to decrease the cost of a minimum spanning tree on the selected labels. We scan through the labels in the current solution. We iteratively consider all

2 1

Components ðR [ ftgÞg:

2.5

Connected graph with label b and c Fig. 1. An example illustrating the selection of labels for the initial connected subgraph.

ARTICLE IN PRESS Z. Naji-Azimi et al. / Computers & Operations Research 37 (2010) 1952–1964

unused labels and attempt to swap a given label in the current solution with an unused label if it results in an improvement (i.e., the cost of the minimum spanning tree on the graph induced by the labels decreases). As soon as an improvement is found, it is implemented and the next used label in the current solution is examined. This is illustrated with an example in Fig. 2. Consider A ¼ fb; cg we have MSTCOST(A)= 12 and MSTCOSTðfA\bg [agÞ ¼ 9. Therefore, we remove label b and add label a to the representation of our solution. At this stage, it is possible that the set of labels in the current solution does not result in a tree that satisfies the budget constraint. To find a set of labels that does, we iteratively add labels to the current set of labels by choosing the label that, when added, results in the lowest cost minimum spanning tree. In other words the label to be added is selected from S ¼ ft A ðL\RÞ : min MSTCOSTðR [ ftgÞg:

ð13Þ

and ties are broken randomly. We continue adding labels to the current solution in this fashion until we obtain a minimum spanning tree satisfying the cost constraint. (Recall since B ZMSTCOST(L) a feasible solution exists and the initialization phase will find one.) 3.2. Shaking phase The shaking phase follows the VNS paradigm and dynamically expands the neighborhood search area. Suppose R denotes the current solution (it really denotes the labels in the current solution, but as explained earlier it suffices to focus on labels). In this step, we use randomization to select a solution that is in the size k neighborhood of the solution R, i.e., Nk(R). Specifically Nk(R) is defined as the set of labels that can be obtained from R by performing a sequence of exactly k additions and/or deletions of labels. So N1(R) is the set of labels obtained from R by either adding exactly one label from R, or deleting exactly one label from R. N2(R) is the set of labels obtained from R by either adding exactly two labels, or deleting exactly two labels, or adding exactly one label and deleting exactly one label. The shaking phase may result in the selection of labels that do not result in a connected graph, or result in a minimum cost

2

2 1

2

1

1

2 1

3

MSTCOST ({b, c}) = 12

2.5

1

2.5

MSTCOST ({a, c}) = 9

Fig. 2. An example for the swap of used and unused labels.

1955

spanning tree that does not meet the budget constraint. If the set of labels results in a graph that is not connected, we add labels that are not in the current solution one by one, at random until the graph is connected. If the minimum spanning tree on the selected labels does not meet the budget constraint, we iteratively add labels to the current set of labels by choosing the label that when added results in the lowest cost minimum spanning tree. 3.3. Local search phase The local search phase consists of two parts. In the first part, the algorithm tries to swap each of the labels in the current solution with an unused one if it results in a lower minimum spanning tree cost. To this aim, it iteratively considers the labels in the solution and tests all possible exchanges of a given label with unused labels until it finds an exchange resulting in a lower MST cost. If we find such an exchange, we implement it (i.e., we ignore the remaining unused labels) and proceed to the next label in our solution. Obviously, a label remains in the solution if the algorithm cannot find a label swap resulting in an improvement. The second part of the local search phase tries to improve the quality of the solution by removing labels from the current set of labels. It iteratively, tries to remove each label. If the resulting set of labels provides a minimum spanning tree whose cost does not exceed the budget we permanently remove the label, and continue. Otherwise, the label remains in the solution. Our variable neighborhood search algorithm is outlined in Algorithm 1. 3.4. Discussion of algorithmic parameter choices We now discuss the different components of our VNS algorithm and identify the rationale for the different choices made within the algorithm. In constructing the initial solution, we first chose labels to construct a connected graph. Our choice was to choose the label that resulted in the greatest decrease in the number of connected components. An alternative is to simply add labels randomly. We conducted experiments with these 2 variants for connectivity of the initial solution, and found that choosing to add labels so that they result in the greatest decrease in the number of connected components provided significantly better solutions. Table 1 identifies the different parts of the VNS algorithm for the CCMLST, the parameter or algorithmic choices within each part, and the choice that resulted in the best solution. For example, when comparing the use of swapping against no swapping in the construction of the initial solution, swapping labels provided the best results. In Table 1, the parameter cost constraint refers to the scenario where the cost of the solution is strictly greater than B. Here, in order to reduce the cost of the solution, min avg cost considers the unused label with the smallest average cost to add to the solution. Feasibility postshaking phase refers to the part of the algorithm where feasibility

Table 1 Parameter/algorithmic choices within the VNS procedure for the CCMLST problem. Phase

Parameter varied

Values

Best value

Initial solution

Connectivity Swap Cost constraint Local search applied to initial solution

Random, min components Yes/no Random, min avg cost, min MST cost Yes/no

Min components Yes Min MST cost Yes

Feasibility post-shaking phase

Connectivity Cost constraint

Random, min components Min avg cost, min MST cost

Random Min MST cost

Local search

Swap

Yes/no

Yes

ARTICLE IN PRESS 1956

Z. Naji-Azimi et al. / Computers & Operations Research 37 (2010) 1952–1964

of the solution obtained by the shaking phase is restored. As can be seen from Table 1, the best algorithmic choices within the different parts of the VNS algorithm are the ones used in our VNS algorithm described in Algorithm 1. Algorithm 1. Variable neighborhood search algorithm for the CCMLST problem VNS for CCMLST R = j; Initialization procedure (G, R); Local search (G, R); While termination criterion not met k= 1; While kr5 R ¼ Shaking_PhaseðG; k; RÞ; While Components ðRÞ 4 1 Select at random a label u A L\R and add it to R; End While MSTCOSTðRÞ 4 B S ¼ ft A ðL\RÞ : min MSTCOSTðR [ ftgÞg; Select at random a label u A S and add it to R; End Local search ðG; RÞ; If jRj ojRj then R ¼ R and k= 1 else k=k +1; End End Initialization procedure (G, R) While Components (R) 41 S ¼ ft A ðL\RÞ : min ComponentsðR [ ftgÞg; Select at random a label uAS and add it to R; End Consider the labels iAR one by one Swap the label i with the first unused label that strictly lowers the MST cost; End While MSTCOST(R)4B S ¼ ft A ðL\RÞ : min:6trueemMSTCOSTðR [ ftgÞg; Select at random a label uAS and add it to R; End Shaking_Phase (G, k, R) For i= 1, y, k r = random(0,1); If r r0.5 then delete at random a label from R else add at random a label to R; End Return(R); Local search(G,R) Consider the labels iAR one by one Swap the label i with the first unused label that strictly lowers the MST cost; End Consider the labels iAR one by one Delete label i from R, i.e. R =R\i; If Components (R)41 or MSTCOST(R) 4B then R ¼ R [ i; End

4. Variable neighborhood search for the LCMST problem We now adapt our VNS procedure for the CCMLST problem to the LCMST problem. Recall, the objective of the LCMST problem is to find a minimum weight spanning tree that uses at most

K labels. Given a choice of labels, the cost of the tree is simply the minimum spanning tree cost on the given choice of labels. Consequently, the LCMST problem is also a problem that essentially involves selecting a set of labels for the tree. Consequently, heuristics for the LCMST problem also search for a set of K or fewer labels that minimize the cost of the spanning tree. Our framework for the VNS method for the LCMST problem is essentially the same as the CCMLST problem with a few minor changes that we now explain. Initialization: As before, we first choose a set of labels in a greedy fashion to ensure that the resulting graph is connected. However, it is easy to observe that if R is a subset of labels, then the cost of an MST on any superset T of R is less than or equal to the cost of the MST on R. In other words, if R D T then MSTCOST(R)ZMSTCOST(T). Consequently, in order to try to minimize the cost, we iteratively add labels to the initial set of labels until we have K labels. The label to be added in any iteration is selected from the set S ¼ ft A ðL\RÞ : min MSTCOSTðR [ ftgÞg, with ties broken randomly. Shaking Phase: The shaking phase is identical to the shaking phase for the VNS method for the CCMLST problem. It may result in the selection of labels that do not result in a connected graph, or result in a set with more than K labels. If the set of labels results in a graph that is not connected, we add labels that are not in the current solution one by one, at random, until the graph is connected. If the set of labels exceeds K, we iteratively delete labels from the current set of labels by choosing the label that when deleted results in the lowest cost minimum spanning tree. Local Search: The local search procedure first adds labels to a given solution until it has K labels. The additional labels are selected iteratively, each time selecting a label that provides the greatest decrease in the cost of the minimum spanning tree, until we have K labels. The local search procedure then tries to swap each of the labels in the current solution with an unused one, if it results in a lower minimum spanning tree cost. This is done in an analogous fashion to the local search procedure in the VNS method for the CCMSLT problem. The pseudo code for the VNS method for the LCMST problem is provided in Algorithm 2. Before we turn our attention to our computational experiments we must note the recent work by Consoli et al. [4] on the MLST problem that includes a VNS method for the MLST problem. We briefly compare the two VNS methods (albeit on different problems). Both of the methods follow the main structure of VNS. Consequently, at a high level they are somewhat similar, but they are different in several details. For example, in the initialization phase instead of generating a random solution as done in the paper by Consoli et al. [4] we use a more involved procedure to generate an initial solution by considering the set of labels that produces fewer components and smaller MST cost. Additionally, we use local search to improve the solution found in the initialization phase prior to applying VNS. Finally, since the objectives of the LCMST and CCMLST problems are somewhat different from the MLST problem, our local search operator is quite different from that in Consoli et al. [4]. Algorithm 2. Variable neighborhood search algorithm for the LCMST problem VNS for LCMST R= j; Initialization procedure(G,R); Local search (G,R); While termination criterion not met k= 1; While k r5 R ¼ Shaking_PhaseðG; k; RÞ;

ARTICLE IN PRESS Z. Naji-Azimi et al. / Computers & Operations Research 37 (2010) 1952–1964

While Components ðRÞ 41 Select at random a label u A L\R and add it to R; End While jRj 4K S ¼ ft A ðL\RÞ : min MSTCOSTðR\ftgÞg; Select at random a label uAS and delete it from R; End Local search ðG; RÞ; If MSTCOSTðRÞ o MSTCOSTðRÞ then R ¼ R0 and k= 1 else k =k+ 1; End End Initialization procedure(G,R) While Components (R) 41 S ¼ ft A ðL\RÞ : min ComponentsðR [ ftgÞg; Select at random a label uAS and add it to R; End While |R|oK S ¼ ft A ðL\RÞ : min MSTCOSTðR [ ftgÞg; Select at random a label uAS and add it to R. End. Shaking_Phase (G,k,R) For i =1, y, k r = random(0,1); If r r0.5 then delete at random a label from R else add at random a label to R; End Return(R); Local search(G,R) While |R| oK S ¼ ft A ðL\RÞ : min MSTCOSTðR [ ftgÞg; Select at random a label uAS and add it to R. End. Consider the labels iAR one by one Swap the label i with the first unused label that strictly lowers the MST cost; End

5. Applying heuristics for the LCMST problem to the CCMLST problem by means of binary search We first describe the heuristics of Xiong et al. [15] for the LCMST problem. We then describe how any heuristic for the LCMST problem may be applied to the CCMLST problem by using binary search. This allows us to apply the heuristics of Xiong et al. [15] to the CCMLST problem. The first proposed heuristic, LS1, by Xiong et al. [15] for the LCMST, begins with an arbitrary feasible solution, A={a1,a2, y aK} of K labels. Then, it starts a replacement loop as follows. It first attempts to replace a1 by the label in L\A that gives the greatest reduction in MST cost. In other words, it checks each label in L\A and selects the label tAL\A that results in the lowest value of MSTCOST({A\a1}[t). If it finds an improvement (i.e., MSTCOST ({A\a1}[t)oMSTCOST(A)) it sets a1 =t, otherwise it leaves a1 unchanged. Next, it repeats the procedure with label a2 attempting to replace it by the label in L\A that gives the greatest reduction in MST cost. It continues the replacement loop in this fashion until it considers replacing label aK by the label in L\A that gives the greatest reduction in MST cost. In one replacement loop all of the K labels in A are considered for replacement. This continues until no cost improvement can be made between two consecutive replacement loops [15].

1957

In the second heuristic, LS2, the algorithm starts with an arbitrary feasible solution A= {a1,a2, y, aK} of K labels. Then, it attempts to find improvements as follows. It adds a label in aK + 1AL\A to A. This results in a solution with K + 1 labels, which is not feasible. Consequently, it chooses amongst these K + 1 labels the label to delete that results in the smallest MST cost. To find the best possible improvement of this type, it searches among all possible additions of labels in L\A to A and selects the one that provides the greatest improvement in cost. The procedure continues until no further improvement is found [15]. In the GA proposed by Xiong et al. [15], a queen-bee crossover is applied. This approach has often been found to outperform more traditional crossover operators. In each generation, the best chromosome is declared the queen-bee and crossover is only allowed between the queen-bee and other chromosomes. For more details regarding crossover, mutation, etc., see [15]. Binary search is a popular algorithmic paradigm (see page 171 of [10]) that efficiently searches through an interval of integer values to find the smallest (or largest) integer that satisfies a specified property. It starts with an interval 1, y, l. In each step it checks whether the midpoint of the interval satisfies the specified property. If so, one may conclude the smallest integer that satisfies the specified property is in the lower half of the interval, otherwise the smallest integer that satisfies the specified property is in the upper half of the interval. It recursively searches through the interval in which the solution lies, until (after dlog2 le steps) the interval contains only one integer value and the procedure terminates. We now describe how to use the binary search method to apply any algorithm for the LCMST problem to the CCMLST problem. Let ALG denote any heuristic for the LCMST problem. It takes as input a graph G and a threshold K. ALG attempts to find a minimum cost spanning tree that uses at most K labels. Either it returns a feasible solution, i.e., a set of at most K labels, or it indicates that no feasible solution has been found by sending back an empty set of labels. The cost of the solution can then be determined by MSTCOST(R) where R denotes the set of labels. Note that MSTCOST(R) is infinity if R is an empty set. The details of the binary search method as applied to the CCMSLT problem are provided in Algorithm 3. The lower value for the number of labels is set to 1 and the upper value for the number of labels is set to l (the total number of labels). As is customary in binary search, ALG is executed setting the threshold on the number of labels to (lower+upper)/2. Note that since the number of labels must be integer, ALG rounds down any non-integral value of the threshold K. Essentially, anytime the cost of the tree found by ALG exceeds the budget B, we need more labels and increase the lower value to (lower+upper)/2. Anytime the cost of the tree found by ALG is within budget, we decrease the value of upper to (lower+upper)/2. Algorithm 3. Binary search method for the CCMLST problem Begin Set lower=1 and upper =l; While (upper lower)Z1 mid= (lower+upper)/2; R= ALG(G,mid); If MSTCOST(R)4B then lower= mid else upper = mid; End Output R=ALG(G,upper); End

6. Computational results In this section, we report on an extensive set of computational experiments on both the CCMLST and LCMST problems. All heuristics have been tested on a Pentium IV machine with a

ARTICLE IN PRESS 1958

Z. Naji-Azimi et al. / Computers & Operations Research 37 (2010) 1952–1964

2.61 GHz processor and 2 GB RAM, under the Windows operating system. We also use ILOG CPLEX 10.2 to solve the MIP formulation. The two parameters that are adjustable within the VNS procedure are the value of k (the size of the largest neighborhood Nk(R) in the VNS method), and Iter, the number of iterations in which the algorithm is not able to improve the best known solution (which is the termination criterion). Increasing k, increases the size of the neighborhood but also increases the running time. We found that setting k=5 provides the best results without a significant increase in running time. Additionally, as the value of Iter is increased the running time of the algorithm is increase, though the quality of the solution improves. We found that setting Iter = 10 provides the best results in a reasonable amount of running time. We now describe how we generated our datasets, and then discuss our computational experience on these datasets for both CCMLST and LCMST problems. 6.1. Datasets Xiong et al. [15] created a set of test instances for the LCMST problem. These include 37 small instances with 50 nodes or less, 11 medium-sized instances that range from 100 to 200 nodes, and one large instance with 500 nodes. All of these instances are complete graphs. We used Xiong et al.’s small and medium-sized instances for our tests on the LCMST problem. For these problems, the optimal solutions are known for the small instances, but not for any of the medium-sized instances. We also generated a set of 18 large instances that range from 500 to 1000 nodes. These were created from nine large TSP instances in TSPLIB and considered to be Euclidean (since the problems arise in the telecommunications industry, the costs of edges are generally proportional to their length). To produce a labeled graph from a TSPLIB instance, we construct a complete graph using the coordinates of the nodes in the TSPLIB instance. The number of labels in the instance is one half of the total number of nodes, and the labels are randomly assigned. For the LCMST problem, we set the threshold on the maximum number of labels as 75 and 150, creating two instances for each labeled graph created from a TSPLIB instance. We adapted the instances created by Xiong et al. [15] to the CCMLST as follows. Essentially, for the labeled graph instance, we create a set of different budget values. These budget values are selected starting from slightly more than MSTCOST(L) with increments of approximately 500 units. The small instances in Xiong et al. were somewhat limited in the sense that the number of labels is equal to the number of nodes in the graph. Consequently, using Xiong et al.’s code, we generated additional labeled graphs where the number of labels was less than the

number of nodes in the graph. The main aim of the small instances was to test the quality of the VNS method and the binary search method on instances where the optimal solution is known. Consequently, we restricted our attention on these small instances to problems where we were able to compute the optimal solution using CPLEX on our MIP formulation. In this way, we created 104 small instances (with 10 to 50 nodes) where the optimal solution is known, and 60 medium-sized instances (with 100 to 200 nodes). We adapted the TSPLIB instances to create 27 large instances. We used the labeled graph generated from the TSPLIB instance, and created three instances from each labeled graph by varying the budget value. Specifically, we used budget values of 1.3, 1.6, and 1.9 times MSTCOST(L). 6.2. Results for CCMLST instances We first discuss our results on the 191 CCMLST instances. These results are described in Tables 2–6 for the small instances, Tables 7–9 for the medium-sized instances, and Table 10 for the large instances. On the 104 small instances, the VNS method found the optimal solution in all cases (recall that the optimal solution is known in all of these instances), while LS1, LS2, and GA generated the optimal solution 100, 100, and 102 times, respectively, out of the 104 instances. The average running time of the VNS method was 0.05 s, while LS1, LS2, and GA took 0.06, 0.07, and 0.18 s, respectively. For the small and medium-sized instances, the termination criterion used was 10 iterations without an improvement. On the 60 medium-sized instances, the VNS method generated the best solution in 59 out of the 60 instances, while LS1, LS2, and GA generated the best solution 46, 50, and 50 times, respectively, out of the 60 instances. The average running time of the VNS method was 20.59 s, while LS1, LS2, and GA took 63.52, 68.75, and 62.19 s, respectively. This indicates that the VNS method finds better solutions in a greater number of instances much more rapidly than any of the three comparative procedures. For the large instances the termination criterion used was a specified running time which is shown in the tables with the computational results. On the 27 large instances, the VNS method generated the best solution in 26 out of the 27 instances, while LS1, LS2, and GA generated the best solution 19, 18, and 14 times, respectively, out of the 27 instances. The average running time of the VNS method was 667 s, while LS1, LS2, and GA took 22,816, 25,814, and 11,477 s, respectively. This clearly shows the superiority of the VNS method, especially as the instances get larger. It finds the best solution in a greater number of instances (and for almost all instances) an order of magnitude faster than any of the three comparative procedures.

Table 2 VNS, GA, LS1, and LS2 for the CCMLST problem on 10 nodes. #Labels Cost restriction

10

7000 6500 6000 5500 5000 4500 4000 3500 3000 2500

Exact method

LS1

LS2

GA

VNS

Labels Time Labels Gap from optimal (%)

Time Labels Gap from optimal (%)

Time Labels Gap from optimal (%)

Time Labels Gap from optimal (%)

Time

2 2 2 2 2 2 2 2 3 4

0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.01

0.01 0.00 0.01 0.01 0.01 0.00 0.01 0.00 0.00 0.01

0.00 0.02 0.00 0.02 0.00 0.00 0.00 0.02 0.00 0.02

0.02 0.00 0.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00

0.23 0.02 0.03 0.02 0.11 0.09 0.05 0.06 0.03 0.13

2 2 2 2 2 2 2 2 3 4

0 0 0 0 0 0 0 0 0 0

2 2 2 2 2 2 2 2 3 4

0 0 0 0 0 0 0 0 0 0

2 2 2 2 2 2 2 2 3 4

0 0 0 0 0 0 0 0 0 0

2 2 2 2 2 2 2 2 3 4

0 0 0 0 0 0 0 0 0 0

ARTICLE IN PRESS Z. Naji-Azimi et al. / Computers & Operations Research 37 (2010) 1952–1964

1959

Table 3 VNS, GA, LS1, and LS2 for the CCMLST Problem on 20 nodes. #Labels Cost restriction

Exact method

LS1

Labels Time

Labels Gap from optimal (%)

10

7500 7000 6500 6000 5500 5000 4500 4000 3500 3050

2 2 2 2 2 3 3 4 5 8

20

7500 7000 6500 6000 5500 5000 4500 4000 3500 3050

2 2 2 3 3 4 5 6 8 11

3.33 2.05 3.78 1.72 2.70 2.80 6.72 29.02 7.08 0.84

LS2

GA

VNS

Time Labels Gap from optimal (%)

Time Labels Gap from optimal (%)

Time Labels Gap from optimal (%)

Time

2 2 2 2 2 3 3 4 5 8

0 0 0 0 0 0 0 0 0 0

0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01

2 2 2 2 2 3 3 4 5 8

0 0 0 0 0 0 0 0 0 0

0.00 0.01 0.01 0.00 0.01 0.01 0.01 0.01 0.01 0.01

2 2 2 2 2 3 3 4 5 8

0 0 0 0 0 0 0 0 0 0

0.02 0.02 0.00 0.00 0.00 0.02 0.00 0.02 0.02 0.03

2 2 2 2 2 3 3 4 5 8

0 0 0 0 0 0 0 0 0 0

0.02 0.00 0.02 0.02 0.00 0.00 0.02 0.02 0.02 0.02

0.69 2 0.34 2 1.14 2 2.98 3 15.98 3 4.30 4 18.16 5 12.28 6 3.83 8 0.88 11

0 0 0 0 0 0 0 0 0 0

0.02 0.02 0.02 0.02 0.00 0.02 0.02 0.02 0.02 0.02

2 2 2 3 3 4 5 6 8 11

0 0 0 0 0 0 0 0 0 0

0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02

2 2 2 3 3 4 5 6 8 11

0 0 0 0 0 0 0 0 0 0

0.02 0.02 0.00 0.02 0.02 0.03 0.03 0.02 0.03 0.06

2 2 2 3 3 4 5 6 8 11

0 0 0 0 0 0 0 0 0 0

0.02 0.00 0.00 0.00 0.02 0.02 0.02 0.02 0.02 0.02

Table 4 VNS, GA, LS1, and LS2 for the CCMLST Problem on 30 nodes. #Labels Cost restriction

Exact method

LS1

Labels Time

Labels Gap from optimal (%)

10

8000 7500 7000 6500 6000 5500 5000 4500 4000 3500

2 2 2 2 3 3 3 4 5 8

20

8000 7500 7000 6500 6000 5500 5000 4500 4000 3500

30

8000 7500 7000 6500 6000 5500 5000 4500 4000 3500

31.61 77.45 20.59 41.84 45.59 134.80 56.22 33.39 15.42 26.23

LS2

GA

VNS

Time Labels Gap from optimal (%)

Time Labels Gap from optimal (%)

Time Labels Gap from optimal (%)

Time

2 2 2 2 3 3 3 4 5 8

0 0 0 0 0 0 0 0 0 0

0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.02 0.01 0.02

2 2 2 2 3 3 3 4 5 8

0 0 0 0 0 0 0 0 0 0

0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01

2 2 2 2 3 3 3 4 5 8

0 0 0 0 0 0 0 0 0 0

0.03 0.02 0.02 0.03 0.03 0.03 0.03 0.03 0.03 0.06

2 2 2 2 3 3 3 4 5 8

0 0 0 0 0 0 0 0 0 0

0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.03 0.02

3 3 3 4 4 5 5 6 8 14

69.92 3 132.47 3 9.69 3 31.55 4 76.02 4 111.28 5 74.59 5 46.27 6 22.90 8 177.91 14

0 0 0 0 0 0 0 0 0 0

0.02 0.02 0.02 0.03 0.02 0.03 0.03 0.03 0.03 0.04

3 3 3 4 4 5 5 6 8 14

0 0 0 0 0 0 0 0 0 0

0.02 0.02 0.03 0.02 0.02 0.03 0.03 0.03 0.03 0.04

3 3 3 4 4 5 5 6 8 14

0 0 0 0 0 0 0 0 0 0

0.08 0.08 0.08 0.09 0.08 0.09 0.09 0.09 0.13 0.17

3 3 3 4 4 5 5 6 8 14

0 0 0 0 0 0 0 0 0 0

0.02 0.02 0.02 0.02 0.03 0.03 0.04 0.03 0.04 0.03

3 4 4 4 5 6 7 8 10 15

19.12 3 33.04 4 62.60 4 16.01 4 4.56 5 45.34 6 82.30 7 95.27 8 55.84 11 385.09 15

0 0 0 0 0 0 0 0 10 0

0.03 0.03 0.05 0.03 0.03 0.06 0.05 0.06 0.09 0.09

3 4 4 4 5 6 7 8 11 15

0 0 0 0 0 0 0 0 10 0

0.03 0.03 0.03 0.05 0.05 0.06 0.05 0.08 0.08 0.09

3 4 4 4 5 6 7 8 11 15

0 0 0 0 0 0 0 0 10 0

0.06 0.08 0.08 0.08 0.08 0.13 0.13 0.13 0.13 0.30

3 4 4 4 5 6 7 8 10 15

0 0 0 0 0 0 0 0 0 0

0.05 0.03 0.05 0.05 0.05 0.05 0.06 0.06 0.06 0.09

6.3. Results for LCMST instances We now discuss our results on the 66 LCMST instances. These results are described in Table 11 for the small instances, Table 12 for the medium-sized instances and Table 13 for the large instances.

On the 37 small instances, the VNS method found the optimal solution in all 37 instances (recall that the optimal solution is known in all of these instances), while LS1, LS2, and GA generated the best solution 34, 33, and 29 times, respectively, out of the 37 instances. The average running time of the VNS method was 0.13 s while LS1,

ARTICLE IN PRESS 1960

Z. Naji-Azimi et al. / Computers & Operations Research 37 (2010) 1952–1964

Table 5 VNS, GA, LS1, and LS2 for the CCMLST Problem on 40 nodes. #Labels Cost restriction

Exact method

LS1

LS2

Labels Time

Labels Gap from optimal (%)

GA

VNS

Time Labels Gap from optimal (%)

Time Labels Gap from optimal (%)

Time Labels Gap from optimal (%)

Time

10

9000 8500 8000 7500 7000 6500

2 2 2 3 3 3

6125.77 24,107.41 74,551.36 12,550.08 1734.30 1049.59

2 2 2 3 3 3

0 0 0 0 0 0

0.01 0.01 0.01 0.01 0.01 0.02

2 2 2 3 3 3

0 0 0 0 0 0

0.01 0.01 0.01 0.01 0.01 0.01

2 2 2 3 3 3

0 0 0 0 0 0

0.03 0.03 0.03 0.06 0.05 0.05

2 2 2 3 3 3

0 0 0 0 0 0

0.03 0.02 0.03 0.02 0.03 0.03

20

9000 8500 8000 7500

3 3 4 4

3344.61 3709.23 1064.82 1475.95

3 3 4 4

0 0 0 0

0.03 0.03 0.04 0.04

3 3 4 4

0 0 0 0

0.04 0.04 0.04 0.04

3 3 4 4

0 0 0 0

0.09 0.09 0.09 0.09

3 3 4 4

0 0 0 0

0.03 0.03 0.05 0.05

40

9000 8500 8000 7500 7000 6500 6000 5500 5000

5 5 6 6 7 8 9 11 14

1514.57 5 2223.00 5 17,008.94 6 541.14 7 424.28 7 1713.71 8 2519.92 9 1437.73 11 12,402.04 14

0 0 0 17 0 0 0 0 0

0.12 0.14 0.14 0.16 0.16 0.20 0.20 0.22 0.33

5 5 6 6 7 8 9 11 14

0 0 0 0 0 0 0 0 0

0.11 0.11 0.12 0.12 0.16 0.17 0.20 0.22 0.27

5 5 6 6 7 8 9 11 14

0 0 0 0 0 0 0 0 0

0.19 0.20 0.20 0.20 0.27 0.31 0.31 0.41 0.39

5 5 6 6 7 8 9 11 14

0 0 0 0 0 0 0 0 0

0.09 0.09 0.09 0.11 0.13 0.14 0.16 0.16 0.19

Table 6 VNS, GA, LS1, and LS2 for the CCMLST Problem on 50 nodes. #Labels Cost restriction

Exact method

LS1

LS2

Labels Time

Labels Gap from optimal (%)

GA

VNS

Time Labels Gap from optimal (%)

Time Labels Gap from optimal (%)

Time Labels Gap from optimal (%)

Time

10

9500 9000 8500 8000 7500 7000 6500 6000

2 2 2 3 3 3 4 4

9798.88 8972.91 8984.60 1443.76 24,729.75 28,404.61 46,791.06 6459.06

2 2 2 3 3 3 4 4

0 0 0 0 0 0 0 0

0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02

2 2 2 3 3 3 4 4

0 0 0 0 0 0 0 0

0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02

2 2 2 3 3 3 4 4

0 0 0 0 0 0 0 0

0.6 0.6 0.6 0.6 0.6 0.8 0.8 0.9

2 2 2 3 3 3 4 4

0 0 0 0 0 0 0 0

0.02 0.02 0.02 0.03 0.03 0.03 0.05 0.05

20

9500 9000 8500 8000 7500 7000 6500

3 3 4 4 5 5 6

9216.76 1880.70 7873.89 12,730.25 7572.67 49,912.15 22,651.00

3 3 4 4 5 5 6

0 0 0 0 0 0 0

0.07 0.05 0.06 0.06 0.12 0.08 0.07

3 3 4 4 5 6 6

0 0 0 0 0 20 0

0.06 0.06 0.07 0.07 0.07 0.07 0.09

3 3 4 4 5 5 6

0 0 0 0 0 0 0

0.16 0.17 0.19 0.19 0.19 0.22 0.28

3 3 4 4 5 5 6

0 0 0 0 0 0 0

0.05 0.06 0.08 0.08 0.08 0.08 0.09

30

9500 9000 8500 8000

5 5 5 6

11,111.32 12,638.70 15,358.56 36,336.06

5 5 5 6

0 0 0 0

0.12 0.12 0.12 0.13

5 5 5 6

0 0 0 0

0.14 0.14 0.14 0.15

5 5 5 6

0 0 0 0

0.31 0.30 0.30 0.30

5 5 5 6

0 0 0 0

0.09 0.09 0.13 0.13

50

9000 8500 8000 7500 7000 6500

7 8 8 9 11 12

1322.60 7 10,455.06 8 62,948.11 9 30,317.90 9 97,384.13 11 1,19,300.76 13

0 0 13 0 0 8

0.33 0.31 0.31 0.33 0.50 0.53

7 8 9 9 11 13

0 0 13 0 0 8

0.37 0.39 0.44 0.42 0.53 0.58

7 8 8 9 11 13

0 0 0 0 0 8

0.52 0.59 0.61 0.55 0.98 0.97

7 8 8 9 11 12

0 0 0 0 0 0

0.22 0.22 0.27 0.25 0.27 0.31

LS2, and GA took 0.11, 0.11, and 0.05 s, respectively. For the small and medium-sized instances, the termination criterion used was 10 iterations without an improvement. On the 11 medium-sized instances, the VNS method generated the best solution in all 11 instances, while LS1, LS2, and GA generated the best solution 7, 8, and

3 times, respectively, out of the 11 instances. The average running time of the VNS method was 9.82 s while LS1, LS2, and GA took 35.51, 38.05, and 8.03 s, respectively. This indicates that the VNS method finds better solutions in a much greater number of instances than any of the three comparative procedures. This advantage is clear on the

ARTICLE IN PRESS Z. Naji-Azimi et al. / Computers & Operations Research 37 (2010) 1952–1964

1961

Table 7 VNS, GA, LS1, and LS2 for the CCMLST Problem on 100 nodes. # Labels

Cost restriction

LS1 Labels

LS2 Time

Labels

GA Time

Labels

VNS Time

Labels

Time

50

11,500 11,000 10,500 10,000 9500 9000 8500 8000 7500 7000

11 12 13 14 16 18 19 22 26 36

2.13 2.28 3.32 3.48 3.60 3.41 4.23 3.82 3.63 3.78

11 12 13 14 16 17 20 22 27 36

2.33 2.39 3.43 3.29 3.98 3.84 4.56 4.35 4.71 3.65

11 12 13 14 15 17 19 23 27 36

3.61 3.61 4.53 4.31 4.17 5.22 5.88 6.89 9.17 9.13

11 12 13 14 15 17 19 22 26 36

1.05 1.09 1.22 1.25 1.36 1.47 1.61 2.70 2.61 1.36

100

11,500 11,000 10,500 10,000 9500 9000 8500 8000 7500 7000

16 17 20 21 23 25 28 32 39 50

7.22 8.28 8.58 10.41 9.67 12.68 14.09 15.07 17.44 18.72

16 17 19 21 23 25 28 32 38 50

8.94 9.98 10.33 11.01 11.36 11.68 12.01 12.95 17.66 18.94

16 17 19 21 23 25 28 33 39 50

7.95 9.95 9.86 10.31 12.09 11.86 11.42 15.81 16.09 17.42

16 17 19 21 23 25 28 32 38 50

2.73 3.20 3.13 3.13 3.59 3.83 4.39 4.84 5.23 5.86

The best solutions are in bold.

Table 8 VNS, GA, LS1, and LS2 for the CCMLST problem on 150 nodes. # Labels

Cost restriction

LS1

LS2

GA

VNS

Labels

Time

Labels

Time

Labels

Time

Labels

Time

75

13,000 12,500 12,000 11,500 11,000 10,500 10,000 9500 9000 8500

17 18 20 22 25 27 31 35 41 55

14.96 14.58 23.81 24.91 22.48 22.62 18.92 25.39 25.09 18.10

18 18 20 22 25 27 31 35 41 55

12.83 12.51 17.57 18.00 19.98 20.55 20.08 28.55 22.48 19.56

17 19 20 22 24 27 30 35 41 55

17.00 21.66 21.30 22.02 22.64 35.56 35.53 39.41 35.78 59.47

17 19 20 22 24 27 30 35 41 55

4.78 4.91 5.44 6.06 10.36 6.55 7.84 7.38 7.61 5.94

150

13,000 12,500 12,000 11,500 11,000 10,500 10,000 9500 9000 8500

28 30 32 35 38 42 47 53 62 79

48.99 52.69 59.57 62.60 62.58 75.69 73.85 75.80 78.06 93.13

28 30 32 35 38 42 46 53 62 79

50.86 61.85 51.53 53.18 66.05 85.19 86.46 90.01 80.87 98.56

28 30 32 35 38 43 46 53 62 79

36.38 36.41 36.17 38.17 42.80 43.36 45.80 51.80 58.53 79.47

28 30 32 35 38 41 46 53 62 79

12.80 13.92 16.52 16.92 18.20 37.38 33.22 23.42 25.27 26.19

The best solutions are in bold.

medium-sized instances. For the large instances the termination criterion used was a specified running time which is shown in the tables with the computational results. On the 18 large instances, the VNS method generated the best solution in 12 out of the 18 instances, while LS1, LS2, and GA generated the best solution 2, 4, and 2 times, respectively, out of the 18 instances. The average running time of the VNS method was 1089 s, while LS1, LS2, and GA took 3418, 3371, and 1617 s, respectively. This suggests that the VNS method is the best method in that it rapidly finds solutions for the LCMST problem the most number of times. However, there seem to be a fair number of instances (about a third) where an alternate heuristic like LS1, LS2, or

GA obtains a superior solution. On the whole, the VNS method is clearly the best amongst these four heuristics. In order to demonstrate in a systematic way that the VNS method outperforms LS1, LS2, and GA, we applied the Wilcoxon signed rank test, a well-known nonparametric statistical test to compare VNS against each of the other three heuristics (alternatively, we might have applied the Friedman test). In each comparison, we reject the null hypothesis that both heuristics are equivalent in performance (the alternate hypothesis is that VNS is superior). Based on this analysis, we infer that VNS outperforms LS1 at a significance level of aLS1 = 0.00003167, VNS outperforms LS2

ARTICLE IN PRESS 1962

Z. Naji-Azimi et al. / Computers & Operations Research 37 (2010) 1952–1964

Table 9 VNS, GA, LS1, and LS2 for the CCMLST problem on 200 nodes. # Labels

Cost restriction

LS1 Labels

LS2 Time

Labels

GA Time

VNS

Labels

Time

Labels

Time

100

14,000 13,500 13,000 12,500 12,000 11,500 11,000 10,500 10,000 9500

22 24 26 29 32 36 40 46 55 70

37.67 39.91 55.61 59.08 66.69 54.23 70.93 72.70 81.17 73.51

22 24 26 29 32 35 40 46 55 70

46.45 46.96 63.99 60.53 69.67 60.70 80.11 75.86 87.80 65.46

22 24 26 29 32 35 40 46 55 70

50.78 45.77 59.70 64.97 62.81 61.25 83.45 123.86 138.72 194.38

22 24 26 29 32 35 40 46 55 70

13.05 14.41 15.33 16.34 17.47 19.44 20.30 39.14 22.73 18.88

200

14,000 13,500 13,000 12,500 12,000 11,500 11,000 10,500 10,000 9500

35 38 40 44 48 54 60 68 80 101

188.07 174.69 184.59 171.44 172.10 230.51 215.70 248.04 280.02 284.59

35 37 40 44 48 53 60 68 80 101

179.56 198.79 201.66 193.66 211.57 246.54 236.61 312.48 300.20 308.48

35 37 41 43 49 54 60 67 80 101

136.59 134.84 134.27 130.80 163.52 169.02 178.59 214.44 290.39 334.92

35 37 40 43 48 53 59 67 80 101

37.94 41.97 64.90 47.70 51.02 56.44 133.78 69.19 73.61 115.42

The best solutions are in bold.

Table 10 VNS, GA, LS1, and LS2 for the CCMLST problem on large datasets. # Nodes, # labels

Cost restriction

LS1

LS2

GA

VNS

Labels

Time

Labels

Time

Labels

Time

Labels

Time

Max time

532, 266

98,800 121,600 144,400

70 44 32

5182 3252 2468

70 44 32

4890 3505 3013

70 44 32

3131 2500 2157

70 44 32

111 103 81

300 300 300

574, 287

42,250 52,000 61,750

86 57 42

6830 4999 3796

86 57 42

7103 51,333 3753

86 57 42

4234 2945 2600

85 56 42

278 190 86

300 300 300

575, 287

8190 10,080 11,970

79 47 33

6697 4371 3272

79 47 33

6575 4585 3897

79 47 33

4664 2971 2695

78 47 33

233 133 70

300 300 300

654, 327

39,000 48,000 57,000

54 36 28

6839 5422 4373

55 36 28

7332 5622 4542

55 36 28

4279 3494 3117

54 35 27

203 257 199

700 700 700

657, 328

55,900 68,800 81,700

95 61 44

15,237 10,451 8139

95 61 44

16,660 10,294 8458

95 61 44

8763 6158 4789

95 61 44

603 249 183

700 700 700

666, 333

3510 4320 5130

75 49 35

9763 6815 5208

75 49 35

10,329 6894 5340

76 49 35

6466 4089 3515

75 48 35

477 647 232

700 700 700

724, 362

49,400 60,800 72,200

102 65 47

18,755 15,802 12,550

102 65 47

24,650 17,318 12,856

102 66 47

13,053 8995 7522

101 65 47

693 478 259

1000 1000 1000

783, 391

10,660 13,120 15,580

111 68 48

26,056 19,425 17,509

111 69 48

30,572 19,754 16,581

111 69 49

16,231 10,605 9539

111 69 48

975 492 358

1000 1000 1000

1000, 500

20,800,000 25,600,000 30,400,000

141 86 62

156,482 134,765 101,576

140 86 62

188,704 119,253 103,167

141 86 62

76,294 49,080 45,996

140 86 62

2923 3920 3587

4000 4000 4000

The best solutions are in bold.

at a significance level of aLS2 = 0.0001078, VNS outperforms GA at a significance level of aGA =0.00001335. Thus, VNS outperforms LS1, LS2, and GA at a significance level of 1  (1 aLS1)(1  aLS2) (1 aGA) =0.00015.

7. Conclusion In this paper, we considered both the CCMLST and LCMST problems. We developed a VNS method for both the CCMLST

ARTICLE IN PRESS Z. Naji-Azimi et al. / Computers & Operations Research 37 (2010) 1952–1964

1963

Table 11 VNS, GA, LS1, and LS2 for the LCMST problem on small datasets. # Nodes, # labels

K

MIP value

LS1

LS2

GA

VNS

Gap

Time

Gap

Time

Gap

Time

Gap

Time

20, 20

2 3 4 5 6 7 8 9 10 11

6491.35 5013.51 4534.67 4142.57 3846.50 3598.05 3436.57 3281.05 3152.05 3034.01

0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

0.02 0.04 0.05 0.05 0.05 0.05 0.05 0.04 0.05 0.05

0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

0.01 0.04 0.05 0.04 0.04 0.05 0.05 0.05 0.04 0.04

0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

0.00 0.00 0.01 0.01 0.02 0.02 0.02 0.02 0.02 0.03

0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

0.01 0.02 0.02 0.02 0.02 0.03 0.03 0.03 0.03 0.03

30, 30

3 4 5 6 7 8 9 10 11 12 13

7901.81 6431.58 5597.36 5106.94 4751.00 4473.11 4196.71 3980.99 3827.23 3702.08 3585.42

0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.52 0.00 0.00 0.00

0.04 0.06 0.07 0.11 0.12 0.07 0.08 0.09 0.09 0.09 0.09

0.00 0.00 0.00 0.00 0.46 0.00 0.00 0.52 0.00 0.23 0.00

0.05 0.05 0.06 0.07 0.07 0.07 0.11 0.16 0.13 0.09 0.10

0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.41 0.00 0.00

0.01 0.02 0.03 0.04 0.05 0.05 0.05 0.06 0.07 0.07 0.07

0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

0.05 0.05 0.06 0.06 0.08 0.09 0.10 0.12 0.14 0.13 0.14

40, 40

3 4 5 6 7 8 9 10 11 12

11,578.61 9265.42 8091.45 7167.27 6653.23 6221.63 5833.39 5547.08 5315.92 5164.14

0.00 1.72 0.00 0.00 0.13 0.00 0.00 0.00 0.00 0.00

0.05 0.06 0.09 0.11 0.12 0.29 0.26 0.16 0.18 0.35

0.00 2.56 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

0.06 0.07 0.10 0.11 0.13 0.20 0.22 0.24 0.20 0.21

0.00 1.72 0.75 2.53 0.13 0.00 0.48 0.00 0.00 0.00

0.02 0.03 0.03 0.07 0.05 0.09 0.10 0.10 0.15 0.09

0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

0.11 0.09 0.15 0.20 0.23 0.28 0.25 0.33 0.31 0.31

50, 50

3 4 5 6 7 8

14,857.09 12,040.89 10,183.95 9343.69 8594.36 7965.52

0.00 0.00 0.00 0.00 0.00 0.00

0.07 0.15 0.21 0.25 0.30 0.24

0.00 0.00 0.00 0.00 0.00 0.00

0.08 0.12 0.26 0.34 0.31 0.34

3.08 0.00 0.00 0.00 1.51 0.00

0.02 0.04 0.12 0.12 0.12 0.20

0.00 0.00 0.00 0.00 0.00 0.00

0.14 0.14 0.28 0.25 0.25 0.30

Table 12 VNS, GA, LS1, and LS2 for the LCMST problem on medium-sized datasets. # Nodes, # Labels

100, 100, 100, 150, 150, 150, 150, 200, 200, 200, 200,

50 100 100 75 75 150 150 100 100 200 200

Labels

20 20 40 20 40 20 40 20 40 20 40

LS1

LS2

GA

VNS

Cost

Time

Cost

Time

Cost

Time

Cost

Time

8308.68 10,055.85 7344.72 11,882.62 9046.71 15,427.54 10,618.58 14,365.95 10,970.94 18,951.05 12,931.46

3.92 5.89 11.36 7.47 17.42 25.33 41.67 27.19 46.23 44.61 159.56

8308.68 10,055.85 7335.61 11,846.80 9046.71 15,398.42 10,627.36 14,365.95 10,970.94 18,959.37 12,941.85

3.26 6.35 12.70 13.95 18.57 19.88 55.38 17.25 49.26 89.58 132.40

8335.75 10,138.27 7335.61 11,854.17 9047.22 15,688.78 10,728.93 14,382.65 10,970.94 18,900.25 12,987.29

2.94 1.70 2.68 4.49 12.63 3.82 7.04 10.64 12.84 13.06 16.49

8308.68 10,055.85 7335.61 11,846.80 9046.71 15,398.42 10,618.58 14,365.95 10,970.93 18,900.25 12,931.46

1.54 1.24 2.44 12.61 2.83 17.73 18.45 14.21 5.80 9.50 21.64

The best solutions are in bold.

and LCMST problems. We compared the solutions obtained by the VNS method to optimal solutions for small instances and to solutions obtained by three heuristics LS1, LS2, and GA that were previously proposed for the LCMST problem (but can be easily adapted to the CCMLST problem as well). We generated small and medium-sized instances in a similar fashion to Xiong

et al. [15], and generated a set of large instances from the TSPLIB dataset. The VNS method was clearly the best heuristic for the CCMLST instances. Of the 191 instances, it provided the best solution in 189 instances. Furthermore, for the large instances, its running time is an order of magnitude faster than those of the three other heuristics. For

ARTICLE IN PRESS 1964

Z. Naji-Azimi et al. / Computers & Operations Research 37 (2010) 1952–1964

Table 13 VNS, GA, LS1, and LS2 for the LCMST problem on large datasets. # Nodes, # labels

532, 266 574, 287 575, 287 654, 327 657, 328 666, 333 724, 362 783, 391 1000, 500

Labels

75 150 75 150 75 150 75 150 75 150 75 150 75 150 75 150 75 150

LS1

LS2

GA

VNS

Cost

Time

Cost

Time

Cost

Time

Cost

Time

Max time

95,671.13 78,418.55 44,799.25 34,521.44 8319.41 6683.95 34,751.11 30,107.18 61,970.46 47,355.97 3509.17 2821.43 56,600.13 42,874.35 12,560.76 9509.07 27,246,908.00 20,196,378.00

397 703 451 850 654 892 861 1325 2001 2430 706 1441 1374 4869 2839 2766 9100 27,869

95,617.19 78,392.49 44,650.40 34,501.63 8309.18 6683.48 34,809.59 30,092.77 61,978.58 47,351.50 3500.22 2821.70 56,377.24 42,855.57 12,539.51 9495.57 27,280,794.00 20,216,576.00

558 1011 485 944 706 996 1117 1901 1290 2624 981 2043 1435 3694 2466 3748 13,092 21,587

95,808.91 78,400.90 44,682.80 34,502.26 8335.96 6683.95 34,795.62 30,103.64 61,977.86 47,413.26 3505.82 2820.87 56,245.65 42,847.26 12,529.96 9502.96 27,428,732.00 20,258,258.00

383 627 415 753 501 822 551 1004 684 1443 663 1920 973 2273 1039 2024 4955 8085

95,562.78 78,400.90 44,633.91 34,457.48 8307.45 6683.48 34,722.55 30,090.78 61,941.07 47,397.51 3496.59 2820.63 56,510.83 42,835.26 12,542.81 9495.57 27,257,080.00 20,229,668.00

236 363 377 442 335 370 451 1108 502 1185 414 773 700 1060 875 1180 3333 5903

500 500 500 500 500 500 1200 1200 1200 1200 1200 1200 1200 1200 1200 1200 7200 7200

The best solutions are in bold.

all the 104 instances where the optimal solution was known, the VNS method obtained the optimal solution. On the LCMST instances, the VNS method was also by far the best heuristic. It provided the best solution in all of the 48 small and medium-sized instances. On the large instances, it was by far the best, providing the best solution in 12 out of the 18 instances. For all of the 37 instances where the optimal solution was known, the VNS method obtained the optimal solution. It also ran significantly faster than the three heuristics LS1, LS2, and GA. Furthermore, our statistical analysis revealed that VNS clearly outperforms the three other heuristics at a significance level of 0.00015.

Acknowledgements This work was conducted while the first two authors were visiting the University of Maryland and we thank DEIS, University of Bologna for the financial support of their stay at this university. We thank Dr. Yupei Xiong for the use of his software for the LCMST problem. We also thank the referees. References ¨ [1] Bruggemann T, Monnot J, Woeginger GJ. Local search for the minimum label spanning tree problem with bounded color classes. Operations Research Letters 2003;31:195–201. [2] Chang R-S, Leu S-J. The minimum labeling spanning trees. Information Processing Letters 1997;63(5):277–82. [3] Cerulli R, Fink A, Gentili M, Voß S. Metaheuristics comparison for the minimum labelling spanning tree problem. In: Golden B, Raghavan S, Wasil E, editors. The Next Wave in Computing, Optimization, and Decision Technologies. Berlin: Springer-Verlag; 2008. p. 93–106.

[4] Consoli S, Darby-Dowman K, Mladenovic N, Moreno-Perez JA. Greedy randomized adaptive search and variable neighbourhood search for the minimum labelling spanning tree problem. European Journal of Operational Research 2009;196(2):440–9. [5] Eiselt HA, Sandblom C-L. Integer Programming and Network Models. Springer; 2000. [6] Krumke SO, Wirth HC. On the minimum label spanning tree problem. Information Processing Letters 1998;66(2):81–5. [7] Kruskal JB. On the shortest spanning subset of a graph and the traveling salesman problem. Proceedings of the American Mathematical Society 1956;7(1):48–50. [8] Magnanti T, Wolsey L. Optimal trees. In: Ball M, Magnanti T, Monma C, Nemhauser G, editors. Network models. Handbooks in operations research and management science, Vol. 7. Amsterdam: North-Holland; 1996. p. 503–615. [9] Mladenovic N, Hansen P. Variable neighborhood search. Computers & Operations Research 1997;24:1097–100. [10] Papadimitriou CH, Steiglitz K. Combinatorial Optimization: Algorithms and Complexity. New Jersey: Prentice-Hall; 1982. [11] Prim RC. Shortest connection networks and some generalizations. Bell System Technical Journal 1957;36:1389–401. [12] Reinelt G. TSPLIB, a traveling salesman problem library. ORSA Journal on Computing 1991;3:376–84. [13] Tarjan R. Depth first search and linear graph algorithms. SIAM Journal of Computing 1972;1(2):215–25. [14] Wan Y, Chen G, Xu Y. A note on the minimum label spanning tree. Information Processing Letters 2002;84:99–101. [15] Xiong Y, Golden B, Wasil E, Chen S. The label-constrained minimum spanning tree problem. In: Raghavan S, Golden B, Wasil E, editors. Telecommunications Modeling, Policy, and Technology. Berlin: Springer; 2008. p. 39–58. [16] Xiong Y, Golden B, Wasil E. A one-parameter genetic algorithm for the minimum labeling spanning tree problem. IEEE Transactions on Evolutionary Computation 2005;9(1):55–60. [17] Xiong Y, Golden B, Wasil E. Worst-case behavior of the MVCA heuristic for the minimum labeling spanning tree problem. Operations Research Letters 2005;33(1):77–80. [18] Xiong Y, Golden B, Wasil E. Improved heuristics for the minimum label spanning tree problem. IEEE Transactions on Evolutionary Computation 2006;10(6):700–3.