An efficient local search framework for the minimum weighted vertex cover problem


Accepted Manuscript

Ruizhi Li, Shuli Hu, Haochen Zhang, Minghao Yin

PII: S0020-0255(16)30625-9
DOI: 10.1016/j.ins.2016.08.053
Reference: INS 12456

To appear in: Information Sciences

Received date: 1 July 2015
Revised date: 10 August 2016
Accepted date: 15 August 2016

Please cite this article as: Ruizhi Li , Shuli Hu , Haochen Zhang , Minghao Yin , An Efficient Local Search Framework for the Minimum Weighted Vertex Cover Problem, Information Sciences (2016), doi: 10.1016/j.ins.2016.08.053

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


An Efficient Local Search Framework for the Minimum Weighted Vertex Cover Problem

Ruizhi Li, Shuli Hu, Haochen Zhang, and Minghao Yin*

School of Computer Science and Information Technology, Northeast Normal University, Changchun 130024, China

*Corresponding author: [email protected]

Abstract: The minimum weighted vertex cover (MWVC) problem, an extension of the classical minimum vertex cover (MVC) problem, is an important NP-complete combinatorial optimization problem with a wide range of applications. The objective of this paper is to design an efficient local search algorithm to solve the MWVC problem. First, the weighted edge strategy is proposed to define the dynamic scoring strategy so that our algorithm can find different possible optimal solutions. Second, the weighted configuration checking (WCC) strategy is proposed to overcome the cycling problem in local search. By combining the WCC strategy with the scoring strategy, we design the vertex selection strategy to determine the vertex to be selected as a candidate solution component. Based on these strategies, a novel local search framework, namely diversion local search based on weighted configuration checking (DLSWCC), is presented. DLSWCC is evaluated against several state-of-the-art algorithms on various benchmark instances. Experimental results show that DLSWCC outperforms its competitors in terms of both solution quality and computational efficiency on most classical instances. Specifically, DLSWCC obtains 22 new upper bounds on 71 moderate-scale problem instances, 5 new upper bounds on 15 large-scale problem instances, and 56 new upper bounds on 56 massive graph instances.

Keywords: minimum weighted vertex cover, weighted configuration checking, scoring strategy, local search, massive graph instances

1. Introduction

The minimum vertex cover (MVC) problem is to find a minimum subset of vertices that contains at least one endpoint of each edge [17][18][43]. This problem is a core optimization problem that has been studied extensively. It has important applications in various fields such as network security, industrial machine assignment, and data aggregation [7][10][25][26][39][41][49][51][56]. As a dual problem of the maximum independent set (MIS) problem [9][20][23][35][40], the MVC problem has also been applied to social networks, pattern recognition, molecular biology, and economics [1][2][6][24][32][33][65].

The minimum weighted vertex cover (MWVC) problem can be viewed as a generalized version of the MVC problem. In the MWVC problem, each vertex is associated with a weight, and the problem is to find a vertex cover whose total weight is the smallest. Obviously, the problem reduces to the classical MVC problem when all the vertices share the same weight. The MWVC problem plays an important role in many real-world applications such as wireless communication, circuit design, and network flows [50][58][59].

As is well known, the decision version of the MVC problem is one of the prominent problems among Karp's 21 NP-complete combinatorial problems [37]. Moreover, it is NP-hard to approximate the MVC problem within any factor smaller than 1.3606 [22]. Therefore, for large and hard instances, researchers usually resort to heuristic approaches to obtain good solutions within reasonable time. In the past decade, several heuristic algorithms have been proposed to solve the MVC problem. An evolutionary approach to the MVC problem and related surveys on this type of algorithm can be found in [24]. Further, ant colony approaches have been proposed in [27]. The cover edge randomly (COVER) [51] algorithm is a recently developed iterative best improvement algorithm that uses edge weights to guide the local search. Cai et al. proposed two local search-based algorithms: EWLS and EWCC [14][15]. On the basis of these algorithms, they further proposed two heuristic strategies, two-stage exchange and edge weighting with forgetting, and introduced a new local search algorithm called NuMVC [16]. Subsequently, a vertex weighting scheme was combined with NuMVC, leading to a new algorithm called TwMVC [12].

For the weighted version of the MVC problem, researchers have devoted their efforts toward the development of heuristics for generating good or near-optimal solutions within reasonable time. The algorithm proposed in [19] starts with an empty partial solution, adds one vertex at a time, and selects the vertex with the minimum ratio between the vertex weight and the current degree in each step. In [5], Balaji et al. proposed the support ratio algorithm (SRA), which finds a minimum weighted vertex cover effectively by means of a new notion introduced in their model, the support of a vertex. In [57], Stefan applied a modified reactive tabu search approach to solve the problem. In [4], a new heuristic operator was introduced in the domain of the randomized gravitational emulation search (RGES) algorithm to maintain feasibility specifically for the vertex cover problem. Furthermore, in [54], Shyu et al. introduced a meta-heuristic based on the ant colony optimization (ACO) approach to find approximate solutions. Subsequently, in [36], Jovanovic and Tuba improved the aforementioned algorithm by introducing a pheromone correction heuristic strategy that incorporates information about the best-found solution to exclude suspicious elements in order to avoid being trapped in local optima. In [8], Bouamama proposed a simple yet efficient population-based iterative greedy algorithm for solving the MWVC problem. In [69], Zhou et al. proposed a multi-start iterative tabu search algorithm to solve the MWVC problem.

Although heuristic algorithms are largely successful in solving the MWVC problem, research on such methods remains in its nascent stages. Compared with local search algorithms for the MVC problem, which are both efficient and effective, existing MWVC algorithms scale poorly and are time-consuming, especially for large-scale instances. This may be because the MWVC problem is much harder and more complicated; thus, it is more difficult to solve from the viewpoint of algorithm design. Owing to the complicated structure of the MWVC problem, local search algorithms are more likely to revisit previously explored parts of the search space. In addition, a careful inspection of current MWVC heuristic algorithms shows that most heuristic functions are static during the local search. This can cause the algorithm to be trapped in local optima, thereby leading to poor solutions.

To address the above-mentioned issues, the present paper proposes two heuristics and introduces a novel local search framework for solving the MWVC problem. The first heuristic is a dynamic vertex scoring strategy designed for vertex selection. As is well known, the selection of vertices to be added into or removed from the candidate solution plays an important role in the local search performance. A good selection of such vertices guides the search in the right direction, whereas a bad selection misleads the algorithm. Previous scoring mechanisms that assess the selected vertices are mainly based on a static scoring strategy. In other words, the vertex score is constant throughout the search process. Such a scoring mechanism may lose its effectiveness when the local search is trapped in local optima. Therefore, this paper proposes a dynamic scoring strategy that modifies the weights of the uncovered edges whenever the search is trapped in local optima. Experimental results provide further evidence regarding the effectiveness and general applicability of this mechanism.

The second heuristic is a variant of the configuration checking (CC) strategy for handling the cycling problem during the local search. Compared with previous mechanisms, such as the tabu strategy and random walk [3][11][21][28][29][30][34][38][42][44][55][60][62], the CC strategy considers the circumstances of a solution component, rather than its direct properties, when deciding whether its value may change. This strategy has been successfully applied to various combinatorial optimization problems, such as the MVC problem, propositional satisfiability (SAT) problem, MaxSAT problem, and set cover problem, and it has inspired a series of state-of-the-art local search algorithms, such as EWCC, CCASat, and uSLC [13][15][47][63][70]. In this paper, we adapt the aforementioned strategy to solve the MWVC problem. However, direct application of the CC strategy cannot lead to a successful algorithm, because the original CC strategy is too rigid in the context of the MWVC problem. Therefore, we propose a new strategy, namely the weighted configuration checking (WCC) strategy, which is looser than the CC strategy and reaches more promising search areas. Experimental results not only show that the proposed strategy is more effective than the original CC strategy but also confirm its superiority on various types of benchmarks.

By incorporating the two strategies described above, an efficient algorithm, namely diversion local search based on weighted configuration checking (DLSWCC), is proposed to solve the MWVC problem. To test the efficiency of DLSWCC, it is compared with several state-of-the-art MWVC algorithms, namely RGES [4], ACO [54], ACO+SEE [36], PBIG [8], and MS-ITS [69], using benchmark instances originally proposed in [54]. Experimental results show that DLSWCC significantly outperforms the other algorithms in terms of solution quality and running time, and it improves some current best known solutions.

It is important to note that previous MWVC local search algorithms mainly focus on academic benchmarks. In these benchmarks, the number of vertices only takes values from 10 to 1000. However, the proliferation of the Internet, followed by various scientific advancements and the widespread deployment of sensors, has resulted in increasingly massive data sets. In many real-world applications, such as social networks, biological networks, and recommendation problems, the number of vertices usually takes values greater than 1000 (greater than one million in certain cases). Therefore, in this study, we obtained a large number of massive graph benchmarks from the Network Data Repository [52]. These graphs are modeled on the basis of real-world applications, including biological networks, collaboration networks, Facebook networks, interaction networks, infrastructure networks, Amazon recommendation networks, scientific computation networks, social networks, technological networks, and web link networks. Experimental results show that our algorithm exhibits much better performance than other local search algorithms. Specifically, our algorithm is able to obtain new upper bounds for all the massive graph benchmarks.

The remainder of this paper is organized as follows. Section 2 reviews the relevant background knowledge. Section 3 introduces the dynamic weights of edges and the dynamic scoring strategy. Section 4 explains the weighted configuration checking strategy for the MWVC problem. Section 5 discusses the design of a new vertex selection strategy. Section 6 describes the proposed efficient local search framework, namely DLSWCC. Section 7 presents and discusses the experimental results. Finally, Section 8 concludes the paper and briefly explores directions for future work.

2. Basic Definitions and Notations

In this section, we introduce some basic concepts and definitions. An undirected graph is denoted by G(V, E), where V = {v1, v2, ⋯, vn} is the set of vertices and E = {e1, e2, ⋯, em} is the set of edges. An undirected weighted graph is denoted by G(V, E, w), where each vertex vi (i = 1, 2, ⋯, n) is associated with a weight w(vi). We use N(v) = {u ∈ V | (u, v) ∈ E} to denote the set of neighbors of a vertex v, and d(v) = |N(v)| to denote the degree of v. Given a candidate solution C, we use s_{v_j} ∈ {0, 1} to denote the state of a vertex vj, where s_{v_j} = 1 implies that vj ∈ C and s_{v_j} = 0 implies that vj ∉ C. The three variants of the vertex cover problem are defined as follows.

Definition 1 (vertex cover, VC) Given an undirected graph G(V, E), where V is the set of vertices and E is the set of edges, a vertex cover is a subset C ⊆ V such that every edge in G has at least one endpoint in C.

Definition 2 (minimum vertex cover, MVC) Given an undirected graph G(V, E), where V is the set of vertices and E is the set of edges, the minimum vertex cover problem is to find the smallest vertex cover in G. The MVC problem can be formally defined as follows.

$$\text{Minimize } w(C) = \sum_{j=1}^{n} s_{v_j} \tag{1}$$

$$\text{subject to } s_{v_i} + s_{v_j} \ge 1, \quad \forall (v_i, v_j) \in E, \tag{2}$$

$$s_{v_i}, s_{v_j} \in \{0, 1\}, \quad i, j = 1, 2, \cdots, n \tag{3}$$

Equation (2) ensures that each edge is covered by at least one vertex, and Equation (3) is the integrality constraint.

Definition 3 (minimum weighted vertex cover, MWVC) Given an undirected weighted graph G(V, E, w), where V is the set of vertices, E is the set of edges, and each vertex vi has a weight w(vi), the minimum weighted vertex cover problem is to find the vertex cover with the minimum sum of weights of the constituent vertices. The MWVC problem can be formally defined as follows.

$$\text{Minimize } w(C) = \sum_{j=1}^{n} w(v_j)\, s_{v_j} \tag{4}$$

$$\text{subject to } s_{v_i} + s_{v_j} \ge 1, \quad \forall (v_i, v_j) \in E, \tag{5}$$

$$s_{v_i}, s_{v_j} \in \{0, 1\}, \quad i, j = 1, 2, \cdots, n \tag{6}$$

Equation (5) ensures that each edge is covered by at least one vertex, and Equation (6) is the integrality constraint. The problem reduces to the MVC problem when the weights of all the vertices are equal to 1.
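The formulation in Equations (4)-(6) can be checked on a small instance. The following is a minimal Python sketch, not part of the original paper: the function names and the three-vertex path graph are hypothetical and chosen only for illustration. It evaluates the covering constraint and the weighted objective for a given candidate set.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]


def is_vertex_cover(edges: Iterable[Edge], cover: Set[str]) -> bool:
    """Constraint (5): every edge has at least one endpoint in the cover."""
    return all(u in cover or v in cover for u, v in edges)


def cover_weight(weights: Dict[str, float], cover: Set[str]) -> float:
    """Objective (4): total weight of the selected vertices."""
    return sum(weights[v] for v in cover)


# Hypothetical toy instance (not from the paper): the path a - b - c.
edges = [("a", "b"), ("b", "c")]
weights = {"a": 1.0, "b": 3.0, "c": 1.0}

print(is_vertex_cover(edges, {"b"}), cover_weight(weights, {"b"}))            # True 3.0
print(is_vertex_cover(edges, {"a", "c"}), cover_weight(weights, {"a", "c"}))  # True 2.0
print(is_vertex_cover(edges, {"a"}))                                          # False
```

On this instance the smallest cover by cardinality is {b}, whereas the cover of minimum weight is {a, c}, which illustrates why the weighted problem is not solved by an MVC solution in general.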


3. Dynamic Scoring Strategy

Given a set of candidate vertices, selecting the vertices that should be added into or removed from the candidate solution plays an important role in the efficiency of the search. Previous scoring mechanisms for a selected vertex v are mainly based on a static scoring strategy. In other words, the vertex score is constant throughout the search process. Such an assessment loses its effectiveness when the local search is trapped in a local minimum. In this section, we introduce an alternative dynamic scoring strategy to measure the benefit of changing the state of a vertex during the local search. This strategy, together with the WCC strategy introduced in the next section, can be used to select a vertex as a candidate solution component. Before we introduce the dynamic scoring mechanism, we clarify the concept of dynamic edge weight.

Definition 4 (dynamic edge weight) Given an undirected graph G(V, E, w), for each edge e ∈ E, we use an edge weight function, denoted by dynamic_weight(e), associated with e, which is maintained dynamically during the local search process.

Specifically, two dynamic weight rules are introduced as follows.

Weight_Rule 1: In the initialization process, for each edge e ∈ E, dynamic_weight(e) is set to 1.

Weight_Rule 2: At the end of each loop during the local search process, each edge e ∈ E is checked to determine whether it is covered by the candidate solution. If e is uncovered, dynamic_weight(e) is increased by 1.

According to Definition 4, supposing that a candidate solution is a subset of vertices C ⊆ V, a Boolean function cover(e, C) is used to denote whether an edge e ∈ E is covered by a candidate solution C, i.e., whether at least one endpoint of e belongs to C. Then, for the MWVC problem, the score of a vertex v, denoted by score(v), can be calculated as follows.

$$\mathrm{score}(v) = \frac{\mathrm{cost}(C) - \mathrm{cost}(C')}{w(v)} \tag{7}$$

where C denotes the current solution, C' = C\{v} if v ∈ C and C' = C ∪ {v} otherwise, and w(v) denotes the weight of vertex v. Further, cost(C) is the total weight of edges not covered by C, which can be described as follows.

$$\mathrm{cost}(C) = \sum_{e \in E,\ \mathrm{cover}(e, C) = \mathrm{false}} \mathrm{dynamic\_weight}(e) \tag{8}$$
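To make Equations (7) and (8) and the two weight rules concrete, the following minimal Python sketch evaluates them directly from their definitions. It is an illustration only: the function names, the dictionary-based representation, and the toy path graph are assumptions and do not come from the paper, and the direct evaluation shown here is deliberately naive.

```python
from typing import Dict, FrozenSet, Set

Edge = FrozenSet[str]


def cost(dynamic_weight: Dict[Edge, int], cover: Set[str]) -> int:
    """Equation (8): total dynamic weight of the edges not covered by `cover`."""
    return sum(w for e, w in dynamic_weight.items() if not (e & cover))


def score(v: str, vertex_weight: Dict[str, float],
          dynamic_weight: Dict[Edge, int], cover: Set[str]) -> float:
    """Equation (7): cost reduction obtained by flipping v, per unit of w(v)."""
    flipped = cover ^ {v}  # C' = C \ {v} if v is in C, otherwise C ∪ {v}
    return (cost(dynamic_weight, cover) - cost(dynamic_weight, flipped)) / vertex_weight[v]


def apply_weight_rule_2(dynamic_weight: Dict[Edge, int], cover: Set[str]) -> None:
    """Weight_Rule 2: add 1 to the weight of every edge left uncovered."""
    for e in dynamic_weight:
        if not (e & cover):
            dynamic_weight[e] += 1


# Hypothetical toy instance (not from the paper): the path a - b - c.
vertex_weight = {"a": 1.0, "b": 2.0, "c": 1.0}
dynamic_weight = {frozenset(("a", "b")): 1, frozenset(("b", "c")): 1}  # Weight_Rule 1
cover = {"a"}  # edge (b, c) is uncovered

before = {v: score(v, vertex_weight, dynamic_weight, cover) for v in "abc"}
apply_weight_rule_2(dynamic_weight, cover)
after = {v: score(v, vertex_weight, dynamic_weight, cover) for v in "abc"}
print(before)  # {'a': -1.0, 'b': 0.5, 'c': 1.0}
print(after)   # {'a': -1.0, 'b': 1.0, 'c': 2.0}
```

After Weight_Rule 2 is applied, the scores of the two vertices incident to the uncovered edge grow, which is the effect the dynamic scoring strategy relies on to leave a local minimum.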

As can be seen from Equation (8), if the local search is trapped in a local minimum, then according to Weight_Rule 2, the weight of each uncovered edge is increased step by step; thus, the scores of the vertices incident to that edge also increase step by step. In this manner, the search will proceed in the right direction and jump out of the local minimum.

It should be noted that such a mechanism is especially important for large-scale problem instances, because it is much easier for the local search to be trapped in a local optimum for these instances. However, for large-scale instances, the efficiency of the scoring mechanism is also an important issue. It is imperative to rapidly determine the score of each vertex. In our implementation, we use a fast incremental evaluation technique based on a streamlined calculation for updating the score after each addition or removal. Specifically, given a vertex v, the initial score of v is initialized as

$$\mathrm{score}_0(v) = d(v) \tag{9}$$

where d(v) denotes the degree of v.

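Suppose that in the ith step of the local search, the score of v is denoted by score_i(v). Then, if v is added into or removed from the candidate solution, we only need to update the scores of v and its neighbors as follows.

$$\mathrm{score}_{i+1}(v) = -\mathrm{score}_i(v); \tag{10}$$

for each vertex u ∈ N(v),

$$\mathrm{score}_{i+1}(u) = \begin{cases} \mathrm{score}_i(u) - \dfrac{\mathrm{dynamic\_weight}(u, v)}{w(u)}, & (u \in C) \oplus (v \in C) = \mathrm{True} \\[2ex] \mathrm{score}_i(u) + \dfrac{\mathrm{dynamic\_weight}(u, v)}{w(u)}, & \text{otherwise} \end{cases} \tag{11}$$

Here ⊕ denotes exclusive or, and membership in C is evaluated after v has been added or removed. For other vertices v' that are not neighbors of v, the score will not be changed, i.e.,

$$\mathrm{score}_{i+1}(v') = \mathrm{score}_i(v') \tag{12}$$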

In order to further clarify the fast incremental evaluation technique, we illustrate it with an example.

Example 1 Consider the graph shown in Figure 1, which has five vertices and four edges. Suppose that the weight of each vertex is equal to 1 and the current candidate solution is C = {b, d}. Let the current weights of the edges be dynamic_weight(a,b) = 1, dynamic_weight(a,c) = 3, dynamic_weight(a,d) = 2, and dynamic_weight(d,e) = 3. Further, let the current step of the local search be the ith step. According to Equation (7), the scores of the vertices are score_i(a) = 3, score_i(b) = -1, score_i(c) = 3, score_i(d) = -5, and score_i(e) = 0. If we add vertex a into the candidate solution, we only need to update the scores of a and its neighbors. According to Equation (10), score_{i+1}(a) = -score_i(a) = -3. Vertices b, c, and d are the neighbors of vertex a, and their scores need to be updated. From Equation (11), we get score_{i+1}(b) = score_i(b) + dynamic_weight(a,b) = 0, score_{i+1}(c) = score_i(c) - dynamic_weight(a,c) = 0, and score_{i+1}(d) = score_i(d) + dynamic_weight(a,d) = -3. According to Equation (12), the scores of the other vertices need not be updated; hence, score_{i+1}(e) = score_i(e) = 0. If we remove a vertex from the candidate solution, the scores are updated similarly.
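The update in Equations (10)-(12) can be reproduced on the graph of Example 1 with a short Python sketch. The function name and data layout below are hypothetical. The starting cover is taken as {b, d}, which is the only candidate solution consistent with the scores listed in the example under Equation (7), and the condition of Equation (11) is read as "exactly one of u and v belongs to C after v has been flipped".

```python
from typing import Dict, Set


def flip_and_update(v: str, cover: Set[str], score: Dict[str, float],
                    adj: Dict[str, Dict[str, int]],
                    vertex_weight: Dict[str, float]) -> None:
    """Flip v in or out of `cover` and update scores per Equations (10)-(12).

    adj[v] maps each neighbor u of v to dynamic_weight(u, v).
    Only v and its neighbors are touched, so the cost is O(d(v)).
    """
    cover ^= {v}                      # in-place toggle of v's membership
    score[v] = -score[v]              # Equation (10)
    for u, dw in adj[v].items():      # Equation (11)
        delta = dw / vertex_weight[u]
        if (u in cover) != (v in cover):
            score[u] -= delta
        else:
            score[u] += delta
    # Equation (12): all other scores stay as they are.


# Graph and values of Example 1 (unit vertex weights).
adj = {
    "a": {"b": 1, "c": 3, "d": 2},
    "b": {"a": 1},
    "c": {"a": 3},
    "d": {"a": 2, "e": 3},
    "e": {"d": 3},
}
vertex_weight = {v: 1.0 for v in adj}
cover = {"b", "d"}
score = {"a": 3.0, "b": -1.0, "c": 3.0, "d": -5.0, "e": 0.0}

flip_and_update("a", cover, score, adj, vertex_weight)
print(score)  # {'a': -3.0, 'b': 0.0, 'c': 0.0, 'd': -3.0, 'e': 0.0}
```

The printed values agree with the scores derived in the example, and flipping a a second time restores the original scores, which is a quick sanity check of the sign convention.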

[Figure 1: the example graph before and after adding vertex a into C. Before: score_i(a) = 3, score_i(b) = -1, score_i(c) = 3, score_i(d) = -5, score_i(e) = 0. After: score_{i+1}(a) = -3, score_{i+1}(b) = 0, score_{i+1}(c) = 0, score_{i+1}(d) = -3, score_{i+1}(e) = 0.]

Fig. 1 An example of the fast incremental evaluation technique.

Proposition 1 The incremental evaluation technique is sound, and the time complexity of updating the scores after each addition or removal is linear in the maximum vertex degree.

Proof. We first prove the soundness of the incremental evaluation technique by considering the removal of a vertex from the candidate solution. Suppose that the current candidate solution is C and that, for a vertex v ∈ C, score_i(v) = k, i.e., score_i(v) = (cost(C) - cost(C'))/w(v) = k according to Equation (7), where C' = C\{v}. If we select v and remove it from the current candidate solution C, the candidate solution becomes C'. According to Equation (7), score_{i+1}(v) = (cost(C') - cost(C''))/w(v), where C'' = C' ∪ {v} = C; hence, score_{i+1}(v) = -k. For a vertex u ∈ N(v), suppose that score_i(u) = m and dynamic_weight(v,u) = n. If u ∈ C, the edge (v,u) is covered by both v and u. Once v is removed from C, the edge (v,u) is covered only by u, so removing u in a later iteration would leave (v,u) uncovered; hence, score_{i+1}(u) = score_i(u) - dynamic_weight(v,u)/w(u) = m - n/w(u). On the other hand, if u ∉ C, the edge (v,u) is covered only by v. Once v is removed from C, the edge (v,u) becomes uncovered, so adding u in a later iteration would cover it; hence, score_{i+1}(u) = score_i(u) + dynamic_weight(v,u)/w(u) = m + n/w(u). For any other vertex v' that is not a neighbor of v, removing v from C does not change the set of edges whose coverage depends on v'. Hence, the score will not be changed, i.e., score_{i+1}(v') = score_i(v').

We then analyze the time complexity of the technique. The fast incremental evaluation technique is based on Equations (10) and (11). If we add vertex v into the candidate solution, we only need to update the scores of vertex v and its neighbors. Note that score(v) is replaced by its negation according to Equation (10). The scores of the neighbors of v are updated by adding or subtracting a number, as shown in Equation (11). Thus, the worst-case time complexity is determined by updating the scores of the neighbors of v, and it depends on the number of neighbors. We can easily conclude that the complexity of the fast incremental evaluation technique is low, i.e., O(Δ(V)), where Δ(V) = max{d(v) | v ∈ V}.

4. Weighted Configuration Checking Strategy

The cycling problem, which refers to revisiting the same part of the search space during the search process, is a serious issue in local search. Recently, Cai et al. [15] proposed the configuration checking (CC) strategy, which can exploit the problem structure to prevent stochastic local search algorithms from revisiting the same scenario. In the CC strategy, the concept of configuration, which we refer to as unweighted configuration in this paper, is defined as follows.

Definition 5 (unweighted configuration) Given an undirected graph G(V, E, w) and assuming C to be the current candidate solution, the unweighted configuration of a vertex v is a vector S consisting of the states of all the vertices in N(v) under the current candidate solution.

According to the CC strategy, suppose that the local search procedure maintains a current candidate solution C. When selecting a vertex to add into C, a vertex v ∉ C whose configuration has not changed since its last removal from C, which means that the circumstance of v remains stable, should not be added back to C; otherwise, the algorithm could easily return to a scenario that it has recently faced, which is likely to result in the cycling problem.

A straightforward CC strategy for MWVC can be easily devised as follows. An array config is maintained, whose elements are indicators: config[v] = 1 implies that the configuration of vertex v has changed since the last removal of v from C; otherwise, config[v] = 0. Initially, config[v] is set to 1 for each vertex v, as each vertex is allowed to be selected initially. During the search, when a vertex v is added to the current solution, config[u] is set to 1 for each vertex u ∈ N(v). When a vertex v is removed from the current solution, config[v] is set to 0 and config[u] is set to 1 for each vertex u ∈ N(v).

However, such direct application of the CC strategy to the MWVC problem would mislead the search by restricting the addition of some promising vertices. In other words, the restriction of the original CC strategy is too rigid. Section 3 introduced a dynamic scoring mechanism, where each edge of the graph is associated with a weight. This weight is maintained dynamically throughout the local search, thereby enabling the search to jump out of a local optimum. Our strategy also considers the dynamic weight of the edge in order to avoid the cycling problem. Next, we introduce the concept of weighted configuration.

Definition 6 (weighted configuration) Given an edge-weighted undirected graph G = (V, E, w) and assuming C to be the current candidate solution, the weighted configuration of a vertex v is a two-tuple ⟨S, W⟩, where S is a vector consisting of the states of all the vertices in N(v) under the current candidate solution, and W is a vector consisting of the weights of all the incident edges of all the vertices in N(v).

According to Definition 6, we can modify the original CC strategy into a less rigid version, referred to as weighted configuration checking (WCC). This heuristic is specified as follows.

An array wconfig is maintained, whose elements act as indicators: wconfig[v] = 1 implies that the weighted configuration of vertex v has changed since the last removal of v from C; otherwise, wconfig[v] = 0. The wconfig array is maintained by the following four rules:

WCC_Rule 1: Initially, for each vertex v, wconfig[v] is initialized as 1.

WCC_Rule 2: When removing v from C, wconfig[v] is reset to 0; for each u ∈ N(v), wconfig[u] is set to 1.

WCC_Rule 3: When adding v into C, for each u ∈ N(v), wconfig[u] is set to 1.

WCC_Rule 4: When updating dynamic_weight(e), i.e., the weight of edge e, where e is incident to vertices u and v, both wconfig[u] and wconfig[v] are set to 1.
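The four rules above amount to a small amount of bookkeeping. The following Python sketch is one possible way to organize it; the class name, method names, and the toy path graph are hypothetical and not taken from the paper.

```python
from typing import Dict, Iterable


class WeightedConfigChecker:
    """Bookkeeping for the wconfig array under WCC_Rule 1 to WCC_Rule 4."""

    def __init__(self, adj: Dict[str, Iterable[str]]) -> None:
        self.adj = adj
        self.wconfig = {v: 1 for v in adj}      # WCC_Rule 1

    def on_remove(self, v: str) -> None:        # WCC_Rule 2
        self.wconfig[v] = 0
        for u in self.adj[v]:
            self.wconfig[u] = 1

    def on_add(self, v: str) -> None:           # WCC_Rule 3
        for u in self.adj[v]:
            self.wconfig[u] = 1

    def on_weight_update(self, u: str, v: str) -> None:  # WCC_Rule 4
        self.wconfig[u] = 1
        self.wconfig[v] = 1

    def may_add(self, v: str) -> bool:
        """A vertex outside C is a candidate for addition only if wconfig[v] == 1."""
        return self.wconfig[v] == 1


# Hypothetical toy instance (not from the paper): the path a - b - c.
checker = WeightedConfigChecker({"a": ["b"], "b": ["a", "c"], "c": ["b"]})
checker.on_remove("b")
print(checker.may_add("b"))      # False: nothing around b has changed yet
checker.on_weight_update("b", "c")
print(checker.may_add("b"))      # True: an incident edge weight changed
```

In the second step no neighbor of b changed state, so the unweighted CC strategy would still forbid b; WCC_Rule 4 is what releases it, which is the sense in which WCC is looser.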

Clearly, the main difference between the unweighted configuration and the weighted configuration of a vertex v is that the former only considers the states of all the vertices in N(v) under the current candidate solution, whereas the latter also considers the weights of all the incident edges of all the vertices in N(v). Therefore, a vertex v can be selected again once the state of one of its neighbors or the weight of one of its incident edges has changed. Given a vertex v, we say that v is an original CC variable (OCCV) if config[v] = 1 and v is a weighted CC variable (WCCV) if wconfig[v] = 1. It is easy to see that the following proposition holds.

Proposition 2 For a graph G(V, E) and v ∈ V, if some vertices become OCCV by picking one vertex v to be added or removed, then those vertices are also WCCV.

Proof. During the current procedure, one vertex v is selected to be added into or removed from the candidate solution. All the neighbors N(v) of this vertex become both OCCV and WCCV. In addition, when the weight of an edge e is updated, both endpoints of e become WCCV.

According to Proposition 2, we know that WCCV variables include OCCV variables. In other words, the WCC strategy is looser than the original CC strategy. Thus, the algorithm can reserve some potential vertices and be led to promising search areas. The experimental results presented in Section 7.3 confirm the effectiveness of the WCC strategy.

5. Vertex Selection Strategy

Using the dynamic scoring mechanism and the WCC strategy introduced in the previous sections, we develop the vertex selection strategy. First, we introduce the concept of age. The age of a vertex is defined as the number of search steps that have occurred since its state was last changed. Specifically, the vertex selection strategy is based on the following two rules.

Remove_Rule: For vertices in the candidate solution, select one vertex v with the greatest score(v) value. If there is more than one such vertex, break ties in favor of the oldest one, i.e., the one with the greatest value of age.

Add_Rule: For vertices not in the candidate solution with the WCC value equal to 1, select one vertex v with the greatest score(v) value. If there is more than one such vertex, break ties in favor of the oldest one, i.e., the one with the greatest value of age.

From these two rules, we can see that, when selecting a vertex to be added into the candidate solution, we need to select a vertex v with the highest score(v). Thus, the candidate solution covers as many edges as possible after this vertex is added, while as little weight as possible is added to it. On the other hand, to avoid revisiting a previous candidate solution, the WCC value of the added vertex should be 1. When there is more than one vertex with the highest score, we need to pick the oldest one, i.e., the one with the greatest value of age. For removing a vertex from the candidate solution, a similar process is followed, except that the WCC value of the selected vertex need not be 1.
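The two selection rules can be written as a pair of small functions. The sketch below is illustrative only: the function names, the dictionaries holding score, age, and wconfig, and the four-vertex data are hypothetical. Following Remove_Rule and Add_Rule as stated, only additions are filtered by the WCC value.

```python
from typing import Dict, Iterable, Optional, Set


def pick_to_remove(cover: Set[str], score: Dict[str, float],
                   age: Dict[str, int]) -> Optional[str]:
    """Remove_Rule: greatest score in C, ties broken by greatest age."""
    return max(cover, key=lambda v: (score[v], age[v]), default=None)


def pick_to_add(vertices: Iterable[str], cover: Set[str], score: Dict[str, float],
                age: Dict[str, int], wconfig: Dict[str, int]) -> Optional[str]:
    """Add_Rule: greatest score outside C with wconfig == 1, ties by greatest age."""
    candidates = [v for v in vertices if v not in cover and wconfig[v] == 1]
    return max(candidates, key=lambda v: (score[v], age[v]), default=None)


# Hypothetical data (not from the paper).
vertices = ["a", "b", "c", "d"]
cover = {"a"}
score = {"a": -2.0, "b": 3.0, "c": 3.0, "d": 5.0}
age = {"a": 1, "b": 4, "c": 7, "d": 2}
wconfig = {"a": 1, "b": 1, "c": 1, "d": 0}

print(pick_to_add(vertices, cover, score, age, wconfig))  # c
print(pick_to_remove(cover, score, age))                  # a
```

Vertex d has the highest score but is blocked by wconfig[d] = 0, and the tie between b and c is resolved in favor of the older vertex c. Comparing (score, age) tuples is one simple way to express "greatest score, then greatest age" in a single pass.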


6. Framework of DLSWCC

The DLSWCC algorithm follows the general local search procedure. First, an initial candidate solution C is

PT

constructed greedily; then, local search improvement is achieved using a perturbing method to improve the initial

CE

solution C. Let w(C) denote the objective value of the candidate solution and w(C) = ∑v∈C w(v). The upper bound

AC

(UB) of the objective value is initially set as UB = w(C). If better solutions exist, their objective values should be

smaller than UB. The objective of local search improvement is to solve a series of new problems: given the

original problem and an integer UB, find a feasible solution whose objective value is smaller than UB but which is

still able to cover all the edges in E. The candidate solution becomes infeasible when it cannot cover all the edges

in E. Our algorithm repeatedly perturbs infeasible solutions with a smaller objective value than UB. Thus, once the

initial candidate solution has been constructed, the vertices from C are first removed until C becomes an infeasible

ACCEPTED MANUSCRIPT

solution under UB. In this process, if any better solutions are found, UB and C* should be updated. The dynamic_weight of every edge is updated in each iteration when the candidate solution becomes infeasible. In this manner, the dynamic_weight of each uncovered edge is increased by 1, thus giving the "hard to cover" edges a better chance to be covered by the new C in the following iterations. On the basis of the explanation provided above, we outline our heuristic algorithm as follows.

Algorithm DLSWCC
1   initialize wconfig array according to WCC_Rule1;
2   initialize the dynamic_weight of each edge assigned as 1;
3   initialize the score of each vertex assigned as the degree of the vertex;
4   initialize the candidate solution C greedily;
5   UB = w(C);
6   C*←C;
7   iter←0;
8   while stop criterion is not satisfied do
9       while C covers all edges do
10          UB = w(C);
11          C*←C;
12          v←x with the greatest score in C, breaking ties in favor of the oldest one;
13          C←C\{v};
14          update wconfig array according to WCC_Rule2;
15      end while
16      v←x with the greatest score in C and v is not in tabu_list, breaking ties in favor of the oldest one;
17      C←C\{v};
18      update wconfig array according to WCC_Rule2;
19      clear tabu_list;
20      while C uncovers some edges do
21          v←x with the greatest score not in C and wconfig[x]==1, breaking ties in favor of the oldest one;
22          if w(C)+w(v)≥UB then break;
23          C←C∪{v};
24          update wconfig array according to WCC_Rule3;
25          dynamic_weight[e]←dynamic_weight[e]+1, for each edge e uncovered by C;
26          update wconfig array according to WCC_Rule4;
27          add v into tabu_list;
28      end while
29      iter←iter+1;
30  end while
31  return C*;

When our algorithm commences, the wconfig array should be initialized to 1 (line 1), which means that the weighted configuration checking strategy allows all the vertices to be added into the candidate solution, and the dynamic_weight of each edge is set to 1 (line 2). Next, the score of each vertex is set to the degree of the vertex (line 3). In line 4, the candidate solution C is generated with the greedy initialization method, which finds a solution to the MWVC problem by iteratively selecting one vertex with the greatest score. We calculate UB, the objective value of the candidate solution, in line 5. We use C* and iter to denote the current best solution found using DLSWCC (line 6) and the number of iterations of the local search (line 7), respectively.

After the initialization process, the main outer loop from line 8 to line 30 is executed until the time limit or maximum number of iterations is reached. When a solution is obtained, i.e., C covers all edges, UB and the best solution C* are updated by w(C) and C, respectively (lines 10, 11), which means that our algorithm needs to search for a smaller solution. Therefore, our algorithm selects a vertex in the candidate solution according to Remove_Rule (line 12) and removes the selected vertex from C (line 13) until there are some uncovered edges. The WCC array is updated by WCC_Rule2 (line 14). Furthermore, we select another vertex with the greatest score in the candidate solution, and the selected vertex should not be in the tabu_list, which means that this process avoids removing vertices that were added during the last iteration (line 16). If such a vertex is found, it is removed from the candidate solution (line 17). DLSWCC updates the WCC array according to WCC_Rule2 and clears tabu_list (lines 18, 19).

The inner loop runs from line 20 to line 28 until the candidate solution covers all the edges. Based on the Add_Rule, our algorithm picks a vertex v (line 21). If the sum of the weights of the vertices in C∪{v} is greater than or equal to UB, then the inner loop breaks (line 22). Otherwise, we add vertex v into the current candidate solution and update wconfig according to WCC_Rule3 (lines 23, 24). After adding a vertex, the weights of the uncovered edges are increased by one (line 25) and the WCC values of each vertex of the uncovered edges are updated according to WCC_Rule4 (line 26). Then, we add v into tabu_list (line 27). After the inner loop ends, the value of iter is increased by one (line 29). When the stopping criterion, i.e., the time limit or the maximum number of iterations, is reached, the best solution to the MWVC problem is returned (line 31).
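Three small building blocks of the framework are easy to state in code: the objective value w(C), the set of edges left uncovered by C, and the dynamic weight update of line 25. The following is a hedged sketch only; the function names, the edge list of tuples, and the dictionary `dynamic_weight` keyed by edge are assumptions for illustration, and the WCC rules are deliberately left out.

```python
def cover_weight(cover, weight):
    """Objective value w(C): total weight of the vertices in the cover."""
    return sum(weight[v] for v in cover)


def uncovered_edges(edges, cover):
    """Edges with neither endpoint in the cover (C is infeasible if any exist)."""
    return [(u, v) for (u, v) in edges if u not in cover and v not in cover]


def can_add(cover, v, weight, ub):
    """Mirror of the line 22 test: adding v must keep w(C) strictly below UB."""
    return cover_weight(cover, weight) + weight[v] < ub


def bump_dynamic_weights(dynamic_weight, edges, cover):
    """Line 25: increase the dynamic weight of every uncovered edge by 1."""
    for e in uncovered_edges(edges, cover):
        dynamic_weight[e] += 1
```

Recomputing `cover_weight` and `uncovered_edges` from scratch is only for clarity; an efficient implementation would maintain both incrementally.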

7. Experimental results

In this section, we present and discuss the results of experiments in which the proposed DLSWCC algorithm was applied to a large number of MWVC benchmark problem instances. Further, DLSWCC is compared with several state-of-the-art algorithms proposed in the literature.

7.1 Benchmark instances

The experiments were run on a large number of benchmark instances, each consisting of an undirected vertex-weighted graph with n vertices and m edges. These instances were classified according to their number of vertices (n) into four different classes:

1. Class SPI: a class of small-scale problem instances (SPI) including 400 instances, with n taking values of {10, 15, 20, 25}.

2. Class MPI: a class of moderate-scale problem instances (MPI) including 710 instances, with n taking values of {50, 100, 150, 200, 250, 300}.

3. Class LPI: a class of large-scale problem instances (LPI) including 15 instances, with n taking values of {500, 800, 1000}.

4. Class Massive Graph Instances: a class consisting of 56 very large instances, with n taking values greater than 1000.

The instances of the classes SPI, MPI, and LPI were originally proposed in [54]. In each class, and for each n, a whole range of instances exists, including rather sparse graphs as well as rather dense graphs. The instances of the first two classes share the following characteristics: (i) 10 problem instances are randomly generated per combination of n and m, and the results are presented as an average over the objective function values obtained for the 10 instances. (ii) The weight w(v) of each vertex v ∈ V is randomly drawn from a uniform distribution, either from the interval [20, 120] (referred to as Type I) or from the interval [1, d(v)^2] (referred to as Type II), where d(v) is the degree of vertex v. In contrast, class LPI consists of only one problem instance per combination of n and m. The vertex weights are randomly drawn from a uniform distribution from the interval [20, 120].
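The two weighting schemes can be illustrated with a short sketch. It assumes integer weights, vertices numbered 0 to n-1, and a lower bound of 1 for isolated vertices under Type II (where d(v)^2 = 0); none of these details is stated in the text, and the function name `vertex_weights` is hypothetical.

```python
import random


def vertex_weights(n, edges, kind, rng):
    """Draw vertex weights for a graph with vertices 0..n-1.

    kind == "I":  uniform integer in [20, 120]
    kind == "II": uniform integer in [1, d(v)^2], with d(v) the degree of v
                  (assumption: isolated vertices get weight 1)
    """
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    if kind == "I":
        return [rng.randint(20, 120) for _ in range(n)]
    if kind == "II":
        return [rng.randint(1, max(1, degree[v] ** 2)) for v in range(n)]
    raise ValueError("kind must be 'I' or 'II'")
```

Passing an explicit `random.Random` instance keeps instance generation reproducible for a given seed.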

Further, it should be noted that previous MWVC local search algorithms have mainly focused on academic benchmark problems. In these benchmarks, the number of vertices only takes values from 10 to 1000. However, the proliferation of the Internet, followed by various scientific advancements and the widespread deployment of sensors, has resulted in increasingly massive data sets. In many real-world applications, the number of vertices usually takes values greater than 1000 (greater than one million in some cases). Therefore, in this study, we collected a large number of massive graph instances from the Network Data Repository [52]. Some of these benchmarks have recently been used to test parallel algorithms for maximum clique and coloring problems [53][61]. The graphs in our experiments can be categorized into 10 groups: biological networks, collaboration networks, Facebook networks, interaction networks, infrastructure networks, Amazon recommendation networks, scientific computation networks, social networks, technological networks, and web link networks. The class of massive graph instances consists of one problem instance per combination of n and m. The vertex weights are randomly drawn from a uniform distribution from the interval [20, 120].

7.2 Comparison with state-of-the-art algorithms

We compared the performance of DLSWCC with that of state-of-the-art algorithms in solving the MWVC problem. The considered state-of-the-art algorithms were the randomized gravitational emulation search (RGES) algorithm [4], the ant colony optimization (ACO) algorithm [54], an improved ACO algorithm (ACO+SEE) [36], the population-based iterated greedy (PBIG) algorithm [8], and a multi-start iterated tabu search (MS-ITS) algorithm [69]. Note that the proposed algorithm was executed only once on each instance of class SPI and class MPI, and 10 times on each instance of class LPI and each massive graph instance.

We programmed the DLSWCC algorithm in C and executed it on a PC with an Intel® Xeon® E7-4830 CPU (2.13 GHz). The executable files of the above-mentioned algorithms (except MS-ITS) have not been provided; hence, we only compared the results of our algorithm with the experimental results provided for the instances of the classes SPI, MPI, and LPI. Section 7.9 compares our algorithm with MS-ITS for massive graph instances on the same computer. MS-ITS has been tested on a computer with an AMD A6-3400M APU (1.40 GHz), RGES has been tested on a PC with a PIV CPU (3.2 GHz) running Windows XP, PBIG has been executed on a cluster of PCs equipped with Intel® X3350 CPUs (2667 MHz), and ACO has been executed on a PC with an Intel Core™2 Duo CPU (4.00 GHz). It should be noted that the machines used for most of the above-mentioned algorithms are faster than the machine used in this study. For each instance, the DLSWCC algorithm was performed, and each run was terminated when a given time limit (1000 s) or maximum number of iterations (1,000,000) was reached.

7.3 Effectiveness of WCC strategy

The aim of the first experiment was to evaluate the effectiveness of the WCC strategy. In this experiment, we compared the DLSWCC algorithm with two alternative algorithms: DLSNOCC and DLSECC. DLSNOCC works without the WCC strategy, i.e., it selects the vertex with the greatest score, breaking ties in favor of the oldest one during the adding procedure without considering the WCC values. DLSECC works with a straightforward extension of the CC strategy instead of the WCC strategy. We tested the three algorithms on class LPI over 10 runs with different random seeds per instance.

In Table 1, the first two columns indicate the number of vertices (n) and the number of edges (m), and the following columns (Best, Avg) indicate the best objective values and average objective values for each algorithm. The row Avg denotes the average of each Best and Avg column, the row Worse denotes the number of objective values obtained by the algorithm that are worse than those obtained by the DLSWCC algorithm, the row Better denotes the number of objective values that are better than those obtained by the DLSWCC algorithm, and the row Equal denotes the number of objective values that equal those obtained by the DLSWCC algorithm. For each combination, we compared the best performance (Best) and the average performance (Avg) of the three algorithms. The bold values indicate the best solution values obtained among the three algorithms. For the best performance, in all 15 cases, DLSWCC obtained the best results. DLSNOCC was only able to match the results of DLSWCC in 2 cases. DLSECC was able to match the results of DLSWCC in 9 cases. For the average performance, in all 15 cases, DLSWCC obtained the best results. Further, DLSNOCC could not match any of the results, while DLSECC was able to match the results of DLSWCC in 8 cases.

From the table, we can conclude that the DLSECC algorithm performs better than the DLSNOCC algorithm. This means that a direct extension of the original CC strategy does work, because this strategy can avoid some cycling search problems during the local search. However, under the tight restriction of the original configuration mechanism, some promising search spaces will be omitted. On the other hand, the WCC strategy can achieve a good trade-off between avoiding cycling search and improving the diversity of the algorithm. Compared with DLSECC, DLSWCC shows much better performance not only in terms of the average performance but also in terms of the best performance.

Table 1 Results of Algorithm DLSNOCC, DLSECC and DLSWCC for instances of Class LPI.

                  DLSNOCC              DLSECC               DLSWCC
Nodes   Edges     Best      Avg        Best      Avg        Best      Avg
500     500       12626     12629.7    12616     12616      12616     12616
        1000      16475     16480.6    16465     16465      16465     16465
        2000      20891     20981.7    20865     20867      20863     20866.2
        5000      27247     27520.5    27241     27241      27241     27241
        10000     29573     29624.9    29573     29573      29573     29573
800     500       15053     15053      15025     15025      15025     15025
        1000      22747     22757.3    22747     22747      22747     22747
        2000      31436     31554.2    31304     31307.5    31301     31305
        5000      38722     38809.9    38553     38560      38553     38569.1
        10000     44509     44623.8    44356     44356      44351     44353.9
1000    1000      24757     24783.1    24723     24723      24723     24723
        5000      45369     45388.8    45255     45255.5    45203     45238.9
        10000     51649     51861.3    51402     51422      51378     51380.4
        15000     58208     58458.8    58007     58019      57994     57995
        20000     59890     60025.4    59678     59684      59651     59655.3
Avg               33276.8   33370.2    33187.33  33190.73   33178.9   33183.6
Worse             13        15         6         7
Better            0         0          0         0
Equal             2         0          9         8
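The Worse, Better, and Equal rows are simple counts against the DLSWCC column. As a small worked example (the function name `compare_to_reference` is illustrative), the following counts them for the Best values of DLSNOCC on the five instances with 500 nodes in Table 1; because the problem is a minimization problem, a larger objective value counts as worse.

```python
def compare_to_reference(values, reference):
    """Count how often `values` is worse (larger), better (smaller) or equal."""
    worse = sum(1 for a, r in zip(values, reference) if a > r)
    better = sum(1 for a, r in zip(values, reference) if a < r)
    equal = sum(1 for a, r in zip(values, reference) if a == r)
    return worse, better, equal


# Best objective values for the five instances with 500 nodes in Table 1.
dlsnocc_best = [12626, 16475, 20891, 27247, 29573]
dlswcc_best = [12616, 16465, 20863, 27241, 29573]

print(compare_to_reference(dlsnocc_best, dlswcc_best))  # (4, 0, 1)
```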

7.4 Effectiveness of scoring strategy

We conducted an experiment to evaluate the effectiveness of the dynamic scoring strategy introduced in Section 3. In this experiment, we compared the DLSWCC algorithm with an alternative algorithm, namely DLSWCC_STATIC, which works with a static scoring strategy, i.e., at the end of each loop during the local search process, the weight of each edge e ∈ E will not be updated if e is uncovered by the candidate solution. We tested the two algorithms on class LPI over 10 runs with different random seeds per instance.

Table 2 Results of Algorithm DLSWCC_STATIC and DLSWCC for instances of Class LPI.

                  DLSWCC_STATIC        DLSWCC
Nodes   Edges     Best      Avg        Best      Avg
500     500       12870     12930.4    12616     12616
        1000      16789     16789.0    16465     16465
        2000      21454     21508.0    20863     20866.2
        5000      27850     28136.9    27241     27241
        10000     29923     30230.6    29573     29573
800     500       15262     15262.0    15025     15025
        1000      23175     23175.0    22747     22747
        2000      32234     32235.9    31301     31305
        5000      40153     40252.0    38553     38569.1
        10000     45249     45414.6    44351     44353.9
1000    1000      25260     25348.2    24723     24723
        5000      46262     46273.8    45203     45238.9
        10000     52776     52985.3    51378     51380.4
        15000     59000     59207.6    57994     57995
        20000     60050     60460.0    59651     59655.3
Avg               33887.1   34013.9    33178.9   33183.6
Worse             15        15
Better            0         0
Equal             0         0

In Table 2, the columns Nodes, Edges, Best, and Avg have the same meanings as those in Table 1. The rows Avg, Worse, Better, and Equal also have the same meanings as those in Table 1. For each combination, we compared the best performance (Best) and the average performance (Avg) of the two algorithms. The bold values indicate the better solution values obtained between the two algorithms compared.

As discussed in Section 3, in the static scoring mechanism, the vertex score does not change throughout the search process; thus, it might be easier for the search to be trapped in a local minimum. However, in the dynamic scoring strategy, the scores of the vertices related to the uncovered edges will be changed dynamically, thereby enabling the local search to jump out of local optima. The experimental results listed in Table 2 provide further evidence regarding the effectiveness of the dynamic scoring mechanism. For both the best performance and the average performance, in all 15 cases, the algorithm with the dynamic scoring mechanism obtained better results than the algorithm with the static scoring mechanism.

We then used two representative instances (i.e., n = 1000, m = 1000 and n = 1000, m = 10000) to further investigate the influence of an important component of the proposed DLSWCC algorithm, i.e., the dynamic scoring strategy described in Section 3. The evolution of the objective value with the number of iterations is shown in Fig. 2. As can be seen in this figure, DLSWCC can jump out of local optima and obtain a better solution, while DLSWCC_STATIC is trapped in a local optimum and can only find worse solutions. This experiment clearly demonstrates the importance of the proposed dynamic scoring strategy.

[Figure: objective value (vertical axis) versus number of iterations (horizontal axis, 0 to 40000) for DLSWCC_STATIC and DLSWCC. Panels: (a) Instance 1 (n=1000, m=1000); (b) Instance 2 (n=1000, m=10000).]

Fig. 2 The evolution of the objective value with the number of iterations.

7.5 Efficiency of fast incremental evaluation technique

We now discuss and analyze the importance of the fast incremental evaluation technique of the proposed DLSWCC algorithm. The main component of the DLSWCC algorithm is the vertex selection strategy. For a local search method, it is particularly important to rapidly determine the vertex to be selected and added into or removed from the candidate solution. As described in Section 3, we propose a fast evaluation technique for updating the score of each vertex after changing the state of a vertex. In other words, once a change is performed, only the scores of some specific vertices affected by the change are updated accordingly instead of recalculating the scores of all possible vertices.
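The idea of updating only the affected scores can be sketched as follows. The sketch uses a simplified stand-in for the score, namely the total dynamic weight of the uncovered edges incident to a vertex outside the cover; the exact scoring function of DLSWCC is defined in Section 3 and may differ (for example, by taking vertex weights into account). All names (`edge_key`, `full_scores`, `add_vertex`) are hypothetical.

```python
def edge_key(u, v):
    """Canonical key for the undirected edge {u, v}."""
    return (u, v) if u < v else (v, u)


def full_scores(adj, dyn_w, cover):
    """Recompute the score of every vertex outside the cover from scratch."""
    return {
        v: sum(dyn_w[edge_key(v, x)] for x in adj[v] if x not in cover)
        for v in adj
        if v not in cover
    }


def add_vertex(u, adj, dyn_w, cover, score):
    """Add u to the cover and update only the scores of its neighbours."""
    cover.add(u)
    score.pop(u, None)
    for x in adj[u]:
        if x not in cover:
            # Edge (u, x) was uncovered before and is now covered by u.
            score[x] -= dyn_w[edge_key(u, x)]
```

The incremental update touches only the neighbours of the changed vertex, so its cost is proportional to the degree of that vertex rather than to the size of the graph.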

In order to evaluate the effectiveness of this fast evaluation technique, we carried out computational experiments to compare the performance of the DLSWCC algorithm with and without this technique (denoted by DLSWCC+F and DLSWCC-F, respectively) on two representative instances (i.e., n = 1000, m = 1000 and n = 1000, m = 10000).

For both instances, our algorithms were independently run 10 times. The evolution of the average CPU time with the number of iterations is shown in Fig. 3. Specifically, for Instance 1 (n = 1000, m = 1000) and Instance 2 (n = 1000, m = 10000), it can be clearly seen that the average CPU time of DLSWCC+F is respectively around 3 and 5 times shorter than that of DLSWCC-F. This experiment clearly demonstrates that the proposed fast incremental evaluation technique plays an important role in the efficiency of our algorithm.

[Figure: average CPU time in seconds (vertical axis) versus number of iterations (×10^3, horizontal axis, 1 to 19) for DLSWCC+F and DLSWCC-F. Panels: (a) Instance 1 (n=1000, m=1000); (b) Instance 2 (n=1000, m=10000).]

Fig. 3 Performance comparison of DLSWCC algorithm with and without the fast evaluation technique

7.6 Experimental results on SPI instances

In order to further confirm the effectiveness of our algorithm, we compared DLSWCC with other state-of-the-art algorithms in terms of solution quality and computational time for class SPI, as shown in Tables 3 and 4. The first two columns of both tables indicate the number of vertices (n) and the number of edges (m). The column OPT represents the optimal solution values (averaged over 10 instances) as originally provided in [54]. The columns AVG and TIME represent the average objective values and average computing time in seconds for each algorithm. The average computing time of ACO+SEE has not been provided in [36]. Note that the values equal to the optima are marked in bold. The row Avge represents the average of the columns AVG and TIME, while the rows Worse, Better, and Equal have the same meaning as those in Table 1.

From Tables 3 and 4, we can observe that DLSWCC, MS-ITS, and PBIG obtained similar results in terms of both AVG and TIME; moreover, these three algorithms yielded better solutions than the other algorithms (ACO and ACO+SEE). Specifically, the ACO algorithm matched the optimal solutions for 10 out of 20 instances under Type I and 6 out of 20 instances under Type II, while the ACO+SEE algorithm matched the optimal solutions for 17 out of 20 instances under Type I and 15 out of 20 instances under Type II. Further, DLSWCC, MS-ITS, and PBIG matched the optimal solutions for all the 40 instances of the two types. In addition, DLSWCC, MS-ITS, and PBIG outperformed the ACO algorithm on most of the instances in terms of computational time, each needing no more than 0.001 s.

In summary, these algorithms (except for the ant colony optimization algorithms) can obtain the best values of the instances. This might be because these instances are all relatively small in terms of the number of nodes (at most 25). Therefore, these instances are relatively easy to solve for these algorithms.

Table 3 Results for instances of class SPI (Type I).

                        ACO               ACO+SEE   PBIG              MS-ITS            DLSWCC
Nodes  Edges  OPT       AVG      TIME     AVG       AVG      TIME     AVG      TIME     AVG      TIME
10     10     284.0     284.0    0.000    284.0     284.0    0.000    284.0    0.000    284.0    0.000
       20     398.7     398.7    0.008    398.7     398.7    0.000    398.7    0.000    398.7    0.000
       30     431.3     431.3    0.003    431.3     431.3    0.000    431.3    0.000    431.3    0.000
       40     508.5     508.5    0.003    508.5     508.5    0.000    508.5    0.000    508.5    0.000
15     20     441.9     441.9    0.005    441.9     441.9    0.000    441.9    0.000    441.9    0.000
       40     570.4     574.2    0.011    570.4     570.4    0.000    570.4    0.000    570.4    0.000
       60     726.2     729      0.008    726.2     726.2    0.000    726.2    0.000    726.2    0.000
       80     807.5     814.6    0.010    807.5     807.5    0.000    807.5    0.000    807.5    0.000
       100    880.0     880.0    0.008    880.0     880.0    0.000    880.0    0.000    880.0    0.000
20     20     473.0     473.0    0.005    473.0     473.0    0.000    473.0    0.000    473.0    0.000
       40     659.3     661.4    0.016    660.3     659.3    0.000    659.3    0.001    659.3    0.000
       60     861.8     861.8    0.014    861.8     861.8    0.000    861.8    0.001    861.8    0.000
       80     898.0     905.4    0.016    899.9     898.0    0.000    898.0    0.001    898.0    0.000
       100    1026.2    1026.8   0.016    1026.2    1026.2   0.000    1026.2   0.001    1026.2   0.000
       120    1038.2    1041.5   0.017    1038.2    1038.2   0.000    1038.2   0.001    1038.2   0.000
25     40     756.6     756.6    0.019    756.6     756.6    0.000    756.6    0.000    756.6    0.000
       80     1008.1    1009.6   0.022    1008.1    1008.1   0.000    1008.1   0.001    1008.1   0.000
       100    1106.9    1107.4   0.025    1109.1    1106.9   0.000    1106.9   0.000    1106.9   0.000
       150    1264.0    1264.0   0.031    1264.0    1264.0   0.000    1264.0   0.000    1264.0   0.000
       200    1373.4    1377.7   0.030    1373.4    1373.4   0.000    1373.4   0.001    1373.4   0.000
Avge                    777.4    0.013    776       775.7    0.000    775.7    0.000    775.7    0.000
Worse                   10                3         0                 0
Better                  0                 0         0                 0
Equal                   10                17        20                20

Table 4 Results for instances of class SPI (Type II).

                        ACO               ACO+SEE   PBIG              MS-ITS            DLSWCC
Nodes  Edges  OPT       AVG      TIME     AVG       AVG      TIME     AVG      TIME     AVG      TIME
10     10     18.8      18.8     0.003    18.8      18.8     0.000    18.8     0.000    18.8     0.000
       20     51.1      51.1     0.003    51.1      51.1     0.000    51.1     0.000    51.1     0.000
       30     127.9     127.9    0.003    127.9     127.9    0.000    127.9    0.000    127.9    0.000
       40     268.3     268.3    0.010    268.3     268.3    0.000    268.3    0.000    268.3    0.000
15     20     34.7      34.7     0.005    34.7      34.7     0.000    34.7     0.000    34.7     0.000
       40     170.5     171.5    0.010    170.5     170.5    0.000    170.5    0.000    170.5    0.000
       60     360.5     360.8    0.008    360.5     360.5    0.000    360.5    0.000    360.5    0.000
       80     697.9     698.7    0.014    697.9     697.9    0.000    697.9    0.000    697.9    0.000
       100    1130.4    1137.8   0.008    1130.4    1130.4   0.001    1130.4   0.000    1130.4   0.000
20     20     32.9      33.0     0.011    32.9      32.9     0.000    32.9     0.001    32.9     0.000
       40     111.6     111.8    0.017    111.8     111.6    0.000    111.6    0.000    111.6    0.000
       60     254.1     254.4    0.016    254.1     254.1    0.000    254.1    0.001    254.1    0.000
       80     452.2     453.1    0.016    452.3     452.2    0.000    452.2    0.001    452.2    0.000
       100    775.2     775.2    0.016    775.2     775.2    0.000    775.2    0.000    775.2    0.000
       120    1123.1    1125.5   0.017    1123.1    1123.1   0.001    1123.1   0.001    1123.1   0.000
25     40     98.7      98.8     0.025    98.7      98.7     0.000    98.7     0.000    98.7     0.000
       80     372.7     373.3    0.026    373.0     372.7    0.000    372.7    0.001    372.7    0.000
       100    595.0     595.1    0.028    595.1     595.0    0.000    595.0    0.001    595.0    0.000
       150    1289.9    1291.7   0.030    1290.9    1289.9   0.000    1289.9   0.001    1289.9   0.000
       200    2709.5    2713.1   0.030    2709.5    2709.5   0.000    2709.5   0.001    2709.5   0.000
Avge                    534.7    0.015    533.8     533.8    0.000    533.8    0.000    533.8    0.000
Worse                   14                5         0                 0
Better                  0                 0         0                 0
Equal                   6                 15        20                20

7.7 Experimental results on MPI instances

The results for the problem instances of class MPI are summarized in Tables 5 and 6, in the same format as the results for the instances of class SPI in Table 3. The additional column RGES in Table 6 represents the results of the RGES algorithm, which was only applied to the Type II instances of class MPI. Note that the column OPT is missing because the optimal solutions are still unknown for the problem instances of class MPI, class LPI, and the massive graph instances.

As shown in Tables 5 and 6, DLSWCC finds solutions no worse than the previous best known solutions, except for the case with n = 200, m = 100 under Type II, and it is significantly superior in terms of running time compared to the other state-of-the-art algorithms. Specifically, RGES matched the best solutions for 2 out of 32 instances and obtained a better solution than the other algorithms for one case (n = 200, m = 100) under Type II. ACO matched the best solutions for 0 out of 39 instances under Type I and 2 out of 32 instances under Type II. ACO+SEE matched the best solutions for 0 out of 39 instances under Type I and 2 out of 32 instances under Type II. PBIG matched the best solutions for 12 out of 39 instances under Type I and 18 out of 32 instances under Type II. MS-ITS matched the best solutions for 26 out of 39 instances under Type I and 20 out of 32 instances under Type II. DLSWCC matched the best solutions for 39 out of 39 instances under Type I and 32 out of 32 instances under Type II.

More importantly, DLSWCC could dominate 22 out of 71 cases, i.e., DLSWCC obtained new upper bounds for 22 of the 71 instances. Under Type I, ACO had an average computational time of 2.338 s, PBIG had an average computational time of 2.489 s, MS-ITS had an average computational time of 0.511 s, and DLSWCC had an average computational time of 0.028 s. Under Type II, ACO had an average computational time of 3.071 s, PBIG had an average computational time of 4.230 s, MS-ITS had an average computational time of 0.452 s, and DLSWCC had an average computational time of 0.044 s. The DLSWCC algorithm is therefore faster than the other algorithms under both types.
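To put these averages side by side, the ratios to the DLSWCC time can be computed directly from the figures quoted above. The snippet below is only a piece of arithmetic on the reported numbers (the dictionary layout and the name `time_ratios` are illustrative); since the algorithms were run on different machines, the ratios are indicative rather than exact speed-ups.

```python
# Average computational times in seconds, as reported for class MPI.
avg_time = {
    "Type I": {"ACO": 2.338, "PBIG": 2.489, "MS-ITS": 0.511, "DLSWCC": 0.028},
    "Type II": {"ACO": 3.071, "PBIG": 4.230, "MS-ITS": 0.452, "DLSWCC": 0.044},
}


def time_ratios(times, reference="DLSWCC"):
    """Ratio of each algorithm's average time to the reference algorithm's."""
    base = times[reference]
    return {name: t / base for name, t in times.items() if name != reference}


for kind, times in avg_time.items():
    for name, ratio in time_ratios(times).items():
        print(f"{kind}: {name} / DLSWCC = {ratio:.1f}")
```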

A careful observation showed that, when the instances are relatively small, i.e., the number of nodes is less than 200, the PBIG and MS-ITS algorithms can still find the best known values of the instances. However, as the size of the graph increases, the previous algorithms seem unable to find the best known solutions, while our algorithm can still find the best solutions for all instances. This is because, when the size of the graph increases, it is easier for the algorithm to be trapped in a local optimum or cycling search; the proposed WCC strategy and dynamic scoring mechanism can effectively overcome such problems. Moreover, from the time efficiency aspect, DLSWCC shows the best performance among all the algorithms in solving moderate-scale problem instances under both Type I and Type II.

Table 5 Results for instances of Class MPI (Type I).

               ACO                ACO+SEE    PBIG               MS-ITS             DLSWCC
Nodes  Edges   AVG       TIME     AVG        AVG       TIME     AVG       TIME     AVG       TIME
50     50      1282.1    0.063    1280.9     1280.0    0.016    1280.0    0.001    1280.0    0.000
       100     1741.1    0.083    1740.7     1735.3    0.007    1735.3    0.002    1735.3    0.000
       250     2287.4    0.097    2280.6     2272.3    0.006    2272.3    0.002    2272.3    0.000
       500     2679.0    0.102    2669.3     2661.9    0.003    2661.9    0.002    2661.9    0.000
       750     2959.0    0.125    2957.3     2951.0    0.027    2951.0    0.002    2951.0    0.000
       1000    3211.2    0.117    3199.8     3193.7    0.019    3193.7    0.003    3193.7    0.000
100    100     2552.9    0.273    2544.0     2537.6    0.019    2534.2    0.027    2534.2    0.000
       250     3626.4    0.367    3614.9     3602.7    0.057    3601.6    0.010    3601.6    0.000
       500     4692.1    0.433    4636.4     4600.6    0.182    4600.6    0.047    4600.6    0.000
       750     5076.4    0.502    5082.8     5045.5    0.088    5045.5    0.059    5045.5    0.000
       1000    5534.1    0.456    5522.7     5509.4    0.084    5508.2    0.135    5508.2    0.000
       2000    6095.7    0.589    6068.3     6051.9    0.501    6051.9    0.011    6051.9    0.000
150    150     3684.9    0.691    3676.8     3667.3    0.088    3667.0    0.030    3666.9    0.001
       250     4769.7    0.891    4754.9     4720.3    0.116    4719.9    0.090    4719.9    0.001
       500     6224.0    1.194    6228.7     6165.7    0.294    6165.4    0.234    6165.4    0.005
       750     7014.7    1.042    6996.3     6963.7    0.522    6967.0    0.118    6956.4    0.003
       1000    7441.8    1.206    7383.6     7368.8    0.536    7359.7    0.190    7359.7    0.006
       2000    8631.2    1.103    8597.2     8562.0    0.824    8549.4    0.317    8549.4    0.010
       3000    8950.2    0.966    8940.2     8899.8    1.300    8899.8    0.080    8899.8    0.009
200    250     5588.7    1.674    5572.4     5551.9    0.157    5551.6    0.108    5551.6    0.005
       500     7259.2    2.160    7233.7     7192.4    0.547    7195.1    0.120    7191.9    0.010
       750     8349.8    2.602    8300.3     8274.5    0.664    8269.9    0.283    8269.9    0.006
       1000    9262.2    2.221    9208.4     9150.6    1.019    9150.0    0.777    9145.5    0.012
       2000    10916.5   2.437    10891.1    10831.0   2.726    10830.0   0.650    10830.0   0.024
       3000    11689.1   2.497    11680.4    11600.2   2.866    11599.6   0.596    11595.8   0.015
250    250     6197.8    2.273    6169.2     6148.7    0.390    6148.7    0.109    6148.7    0.014
       500     8538.8    4.016    8495.9     8440.7    1.344    8438.8    1.137    8436.2    0.016
       750     9869.4    4.047    9815.5     9752.8    2.447    9745.9    0.521    9745.9    0.214
       1000    10866.6   3.755    10791.0    10753.7   2.371    10752.1   0.933    10751.7   0.037
       2000    12917.7   3.942    12827.0    12757.6   3.471    12755.9   2.298    12751.5   0.033
       3000    13882.5   4.276    13830.6    13723.5   3.233    13723.3   1.766    13723.3   0.057
       5000    14801.8   3.842    14735.9    14676.7   22.508   14669.7   0.648    14669.7   0.043
300    300     7342.7    4.322    7326.6     7296.0    0.711    7295.8    0.469    7295.8    0.021
       500     9517.4    5.178    9491.9     9403.1    1.596    9410.8    0.963    9403.1    0.024
       750     11166.9   6.055    11156.5    11038.1   3.349    11032.0   0.698    11029.3   0.038
       1000    12241.7   6.231    12163.7    12108.9   3.095    12107.7   0.720    12098.5   0.04
       2000    14894.9   6.488    14834.6    14749.9   10.982   14737.7   3.230    14732.2   0.099
       3000    16054.1   6.299    15910.5    15848.2   11.636   15841.4   0.949    15840.8   0.172
       5000    17545.4   6.558    17479.8    17350.6   17.283   17342.9   1.605    17342.9   0.195
Avg            7881.0    2.338    7848.5     7806.1    2.489    7804.0    0.511    7802.8    0.028
Worse          39                 39         27                 13
Better         0                  0          0                  0
Equal          0                  0          12                 26

Table 6 Results for instances of Class MPI (Type II). ACO

150

200

AVG

AVG

TIME

83.9

0.072

83.9

83.7

0.002

100

276.2

276.2

0.097

274.4

271.2

0.003

250

1886.4

1886.8

0.111

1870.3

1853.4

0.010

500

7914.5

7915.9

0.120

7876.7

750

20134.1

20134.1

0.111

50

67.4

67.4

100

169.1

250

DLSWCC

AVG

TIME

AVG

TIME

83.7

0.005

83.7

0.000

271.2

0.004

271.2

0.000

1853.4

0.003

1853.4

0.000

7825.1

0.000

AN US

83.9

7825.1

0.009

7825.1

0.008

20087.6

20079.0

0.010

20079.0

0.003

20079.0

0.000

0.184

67.2

67.2

0.002

67.2

0.010

67.2

0.000

169.1

0.334

167.8

166.6

0.017

166.6

0.023

166.6

0.000

890.4

901.7

0.514

895.3

886.5

0.065

886.5

0.039

886.5

0.002

500

3725.3

3726.7

0.481

3707.0

3693.6

0.101

3693.6

0.031

3693.6

0.000

750

8745.5

8754.5

0.444

50

65.8

65.8

0.292

100

144.7

144.7

0.583

8742.3

8680.2

0.129

8680.2

0.194

8680.2

0.000

65.9

65.8

0.001

65.8

0.010

65.8

0.000

144.1

144.0

0.026

144.0

0.065

144.0

0.000

250

624.4

625.7

1.387

624.8

616

0.187

615.8

0.199

615.8

0.014

500

2365.2

2375.0

1.908

2358.6

2331.5

0.572

2331.5

0.177

2331.5

0.006

750

5798.6

5799.2

1.295

5707.0

5698.7

0.550

5698.5

0.244

5698.5

0.008

50

59.6

59.6

0.463

59.6

59.6

0.001

59.6

0.016

59.6

0.000

132.6

134.7

0.981

134.6

134.5

0.021

134.5

0.050

134.5

0.000

488.4

488.7

2.413

487.9

483.1

0.280

484.5

0.300

483.1

0.013

250 500

1843.6

1843.6

3.423

1818.7

1804.3

1.286

1803.9

0.375

1803.9

0.062

750

4112.8

4112.8

3.600

4077.0

4043.6

0.768

4043.5

0.452

4043.5

0.040

250

423.2

423.2

3.311

421.2

419.0

0.476

419.0

0.129

419.0

0.012

500

1457.4

1457.4

5.781

1454.3

1435.7

4.219

1434.7

0.869

1434.2

0.201

750

3315.9

3315.9

5.983

3289.4

3261.0

2.935

3256.4

0.459

3256.1

0.115

300

TIME

50

100

250

MS-ITS

AVG

100

PBIG

ACO + SEE

RGES

50

Edges

Nodes

1000

6058.2

6058.2

6.297

6040.0

5989.4

7.978

5988.2

0.259

5986.4

0.120

2000

26149.1

26149.1

4.859

25932.1

25658.5

11.809

25646.4

1.698

25636.5

0.066

5000

171917.2

171917.2

4.856

171500.7

170269.1

37.770

170269.1

1.298

170269

0.052

250

403.9

403.9

5.372

402.7

399.5

0.353

399.6

0.397

399.4

0.016

500

1239.1

1239.1

9.155

1237.3

1216.4

5.439

1217.2

0.527

1216.4

0.074

750

2678.2

2678.2

10.994

2674.1

2639.4

6.545

2640.6

0.362

2639.3

0.259

1000

4895.5

4895.5

9.045

4867.9

4796.3

15.204

4796.2

0.993

4795.0

0.203

2000

21295.2

21295.2

7.242

21107.7

20891.6

10.911

20886.4

1.834

20881.3

0.056

143243.5

143243.5

6.553

142292.6

141265.3

27.674

141226.8

3.418

141220.4

0.084

Avg

3000

13831.4

13832.6

3.071

13764.7

13663.4

4.230

13661.52

0.452

13660.6

0.044

Worse

29

30

30

14

12

Better

1

0

0

0

0

Equal

2

2

2

18

20

7.8 Experimental results on LPI instances

Table 7 compares DLSWCC with ACO+SEE, PBIG, and MS-ITS on the instances of class LPI. These instances are larger than those of classes SPI and MPI, and MS-ITS is the best previously available algorithm for them. In contrast to classes SPI and MPI, class LPI contains only a single instance per combination of n and m. Therefore, the results are given as averages over 10 runs with different random seeds per instance. For each of the four algorithms, the columns Best, Avg, and Time have the same meaning as those in Table 1. The best value among the four is indicated in bold.

As can be seen in the table, DLSWCC obtained the best results in terms of both best performance and average performance. Moreover, DLSWCC dominated the other algorithms in 5 out of 15 cases, i.e., DLSWCC obtained new upper bounds for 5 of these instances. In particular, when the number of nodes was equal to 1000, the previous algorithms could only match one best value of the instances.

We can see from the table that, as the size of the instances increases, our algorithm performs better in terms of both the best solution and the average solution. Because cycling search and local minima are more common for large-scale instances, the experimental results further indicate that the proposed WCC strategy and dynamic scoring strategy can avoid cycling search efficiently and enable the algorithm to jump out of local minima.
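The Best, Avg, and Time columns and the Worse/Better/Equal summary rows can be derived from raw run data in a few lines. The following Python sketch is illustrative only. The function names `summarize_runs` and `tally` are hypothetical, Time is assumed to be the mean running time over the runs, and the Worse/Better/Equal rows are assumed to count the instances on which a competitor's value is larger than, smaller than, or equal to the reference value (a smaller cover weight being better). The numbers in the usage lines are made up and are not taken from any table.

```python
from statistics import mean


def summarize_runs(runs):
    """Summarize repeated runs on one instance.

    runs: list of (cover_weight, seconds) pairs, one per random seed.
    Returns the best (smallest) weight, the mean weight, and the mean time.
    """
    weights = [w for w, _ in runs]
    times = [t for _, t in runs]
    return {"best": min(weights), "avg": mean(weights), "time": mean(times)}


def tally(competitor, reference, tol=1e-9):
    """Count instances where the competitor is worse, better, or equal.

    Both arguments are sequences of cover weights, one per instance,
    in the same instance order. Smaller is better (minimization).
    """
    worse = better = equal = 0
    for c, r in zip(competitor, reference):
        if abs(c - r) <= tol:
            equal += 1
        elif c > r:
            worse += 1
        else:
            better += 1
    return worse, better, equal


# Hypothetical data: two runs on one instance, three instances compared.
summary = summarize_runs([(10, 1.0), (12, 3.0)])
counts = tally([5, 3, 4], [4, 3, 5])
```

A tolerance is used in `tally` because averaged weights are floating-point values; for integer Best values it has no effect.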

Table 7 Results for instances of Class LPI

ACO + SEE

500

Best

Avg

Best

Avg

500

12675

12687.7

12616

12620.0

1000

16516

16574.9

16465

16470.1

2000

21000

21093.0

20863

20870.8

5000

27294

27585.5

27318

27428.2

10000

29573

29796.4

29573

29666.8

15049

15069.9

15025

15025.0

22792

22852.1

22747

22763.0

2000

31680

31786.9

31355

31422.6

5000

38830

38906.7

38665

38718.7

10000

44499

44691.7

44396

1000

24856

24925.4

24746

5000

45446

45588.7

45255

10000

51875

52105.0

51378

15000

58394

58654.8

20000

60010

Avg Worse Better

Avg

1.601

12623

10.240 9.283

DLSWCC

Time

Best

Avg

Time

12635.0

3.057

12616

12616.0

5.184

16480

16483.1

9.348

16465

16465.0

0.826

20863

20866.9

9.606

20863

20866.2

11.024

34.707

27241

27241.0

9.103

27241

27241.0

5.263

36.405

29573

29573.0

36.250

29573

29573.0

14.930

2.535

15046

15054.1

6.726

15025

15025.0

0.442

22760

22760.0

13.994

22747

22747.0

1.589

58.008

31309

31345.7

40.967

31301

31305.0

1.725

113.842

38553

38557.1

67.096

38553

38569.1

2.812

44397.8

96.467

44351

44359.9

93.750

44351

44353.9

0.824

24763.1

14.435

24735

24766.1

19.281

24723

24723.0

5.500

45295.4

178.311

45230

45256.9

113.739

45203

45238.9

7.912

51540.9

325.956

51378

51423

209.673

51378

51380.4

9.680

58014

58145.2

363.179

58014

58068.9

242.143

57994

57995.0

7.098

60268.2

59790

59847.9

647.563

59675

59719.9

243.324

59651

59655.3

3.333

33366

33505.8

33213.7

33265.0

126.941

33188.7

33207.4

74.803

33178.9

33183.6

5.210

14

15

8

14

9

12

0

0

0

0

0

1

1

0

7

1

6

2

Equal

Best

11.589

1000

MS-ITS Time

500 1000

800

PBIG

Edges

Nodes

7.9 Experimental results on massive graph instances

We carried out extensive experiments to evaluate DLSWCC on a broad range of real-world graphs and compared it with MS-ITS, which clearly performs better than the other algorithms on the academic instances. The two algorithms were run under the same experimental conditions. The massive graph instances are the largest ones in the entire benchmark set. This class contains a single instance per combination of n and m. Therefore, the results are given as averages over 10 runs with different random seeds per instance.

Table 8 lists the solutions of DLSWCC and MS-ITS for the massive graph instances. The first column lists the names of the instances. The next two columns indicate the number of vertices (n) and the number of edges (m). For each of the two algorithms, the columns Best, Avg, and Time have the same meaning as those in Table 1. The bold value indicates the better solution value obtained by the two algorithms. For some instances, MS-ITS failed to find a vertex cover; for these cases, the column entry for MS-ITS is marked as "N/A".

We can observe that in all the instances the number of nodes is greater than 1000 (greater than 500,000 in some cases). Although MS-ITS shows much better performance than the other algorithms on the academic instances, it cannot solve many of these graphs, as discussed previously. Of the 56 instances, MS-ITS could solve only 27. The largest instance solved by MS-ITS had 21363 nodes. Moreover, on every instance DLSWCC found a best result at least as good as that of MS-ITS, and MS-ITS matched the best result of DLSWCC in only 4 relatively small cases (graphs ca-GrQc, ia-fb-messages, ia-reality, and web-google). The experimental results provide evidence that both the WCC strategy and the dynamic scoring strategy are efficient tools for solving large-scale real-world MWVC problems, owing to their ability to avoid local minima.

Table 8 Results for massive graph instances.

MS-ITS

Graph

bio-dmela

DLSWCC

Nodes

Edges

Best

Avg

Time

Best

Avg

Time

7393

25569

149452

149556.8

2126.409

148508

148540.4

135.42

24265.0

3.81

646529

647019.1

420.94

7048010

7048225.5

792.42

685813

686344.3

489.20

1.233

29390

29390.0

3.67

N/A

N/A

6618986

6619251.8

496.89

N/A

N/A

N/A

8986085

8986982.4

1106.33

7515

28303

28303.0

8.424

28298

28298.0

0.15

4158

13422

122330

122331.5

674.020

122278

122332.5

95.51

ca-HepPh

11204

117619

372836

373069.8

10475.341

365251

365530.8

308.65

ca-MathSciNet

332689

820644

N/A

N/A

N/A

7668338

7668818.0

838.24

socfb-Berkeley13

22900

852419

N/A

N/A

N/A

1011694

1011902.5

907.12

socfb-CMU

6621

249959

296930

297032.5

1011.864

292362

292428.8

110.98

socfb-Duke14

9885

506437

458100

460907.8

1629.423

450799

450898.3

312.80

socfb-Indiana

29732

1305757

N/A

N/A

N/A

1375506

1377961.4

1193.10

socfb-MIT

6402

251230

274242

274443.0

12093.173

272431

272472.4

171.22

socfb-OR

63392

816886

N/A

socfb-Penn94

41536

1362220

N/A

socfb-Stanford3

11586

568309

506903

socfb-UCLA

20453

747604

913929

socfb-UConn

17206

604867

792021

socfb-UCSB37

14917

482215

677548

socfb-UIllinois

30795

1264421

N/A

socfb-Wisconsin87

23831

835946

ia-email-EU

32430

54397

24269

24290.0

53.883

ca-AstroPh

17903

196972

662655

662926.5

16156.917

ca-citeseer

227320

814134

N/A

N/A

N/A

ca-CondMat

21363

91286

704287

704798.5

9860.340

ca-CSphd

1882

1740

29550

29609.8

ca-dblp-2010

226413

716460

N/A

ca-dblp-2012

317080

1049866

ca-Erdos992

6100

ca-GrQc

ia-email-univ

1133

5451

ia-enron-large

33696

ia-fb-messages

1266

ia-reality

6809

ia-wiki-Talk

N/A

2114652

2116501

1169.49

N/A

N/A

1827780

1829265.1

1052.84

507561.5

4388.662

495332

495411.3

479.12

915068.0

154.893

888489

888857.8

774.81

793196.8

15843.684

771427

771744.5

638.52

678029.5

5456.765

659407

659615.9

447.50

N/A

N/A

1414900

1417140.9

1197.28

N/A

N/A

N/A

1071625

1072009.6

1081.07

N/A

N/A

N/A

48269

48269.0

5.98

32931

32933

91.708

32931

32931.0

1.49

180811

N/A

N/A

N/A

695112

695294.8

774.01

6451

32300

32316.5

44.546

32300

32300.1

2.27

7680

4894

4894

10.436

4894

4894.0

0.03

92117

360767

N/A

N/A

N/A

962030

962194.9

1194.68

4941

6594

121386

121503.3

993.332

120116

120146.5

110.40

N/A

inf-power

24265

1948

1458

bio-yeast

91813

125704

N/A

N/A

N/A

2629821

2630671.0

1195.30

sc-nasasrb

54870

1311227

N/A

N/A

N/A

3004611

3005889.1

1021.60

sc-shipsec1

140385

1707759

N/A

N/A

N/A

6843870

6844747.6

2056.61

soc-brightkite

56739

212945

N/A

N/A

N/A

1187631

1187962.3

1162.34

soc-delicious

536108

1365961

N/A

N/A

N/A

4957627

4958206.4

1720.58

soc-douban

154908

327162

N/A

N/A

N/A

515270

515288.1

1111.14

soc-epinions

26588

100120

N/A

N/A

N/A

539569

539915.5

593.16

soc-gowalla

196591

950327

N/A

N/A

N/A

4729181

4729405.5

909.89

soc-slashdot

70068

358647

N/A

N/A

N/A

1247682

1248151.4

1182.44

soc-twitter-follows

404719

713319

N/A

N/A

N/A

135811

135811.0

314.75

tech-as-caida2007

26475

53381

N/A

N/A

N/A

200511

200755.8

357.47

tech-internet-as

40164

85123

N/A

N/A

N/A

312123

312308.4

490.53

tech-p2p-gnutella

62561

147878

N/A

N/A

N/A

917822

918207.3

1058.55

rec-amazon

tech-RL-caida

190914

N/A

N/A

tech-routers-rf

2113

6632

44919

44936.5

60.996

44894

44902.3

34.89

tech-WHOIS

7476

56943

128568

128588.0

6499.892

128337

128345.3

122.98

web-arabic-2005

163598

1747269

N/A

N/A

N/A

6572535

6573003.0

1855.93

web-BerkStan

12305

19500

292693

293081.0

2304.889

286665

286871.4

290.43

web-edu

3031

6474

79499

79545.5

242.268

79078

79100.8

47.24

web-google

1299

2773

27842

27842.0

0.7585

27842

27842.0

2.56

web-indochina-2004

11358

47606

409686

409765.0

3187.944

405419

405773.4

255.25

web-sk-2005

121422

334419

N/A

N/A

N/A

3135635

3135843.5

115.32

web-spam

4767

37375

129440

129534.8

644.218

128980

128994.8

92.55

web-webbase-2001

16062

25593

144674

144718.5

399.876

144361

144444.9

186.70

Worse

52

53

Better

0

1

Equal

4

2

4203838

4204531.8

414.25
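An "N/A" entry in Table 8 corresponds to a run that returned no feasible vertex cover. A minimal sketch of such a feasibility check is given below, assuming an edge-list graph representation and a dictionary of vertex weights. The names `is_vertex_cover`, `cover_weight`, and `report` are hypothetical and are not taken from the paper or from any of the compared solvers.

```python
def is_vertex_cover(edges, cover):
    """Return True if every edge has at least one endpoint in cover."""
    cover = set(cover)
    return all(u in cover or v in cover for u, v in edges)


def cover_weight(weights, cover):
    """Total weight of the vertices in cover (each vertex counted once)."""
    return sum(weights[v] for v in set(cover))


def report(edges, weights, cover):
    """Return the cover weight, or "N/A" if no feasible cover was produced."""
    if cover is None or not is_vertex_cover(edges, cover):
        return "N/A"
    return cover_weight(weights, cover)


# Toy path graph 1 - 2 - 3 with hypothetical vertex weights.
edges = [(1, 2), (2, 3)]
weights = {1: 5, 2: 7, 3: 4}
```

On this toy graph, `{2}` and `{1, 3}` are both vertex covers, whereas `{1}` leaves the edge (2, 3) uncovered and would be reported as "N/A".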

8. Conclusion

N/A

607610

This paper proposed a new local search algorithm, namely DLSWCC, to solve the MWVC problem. The dynamic scoring strategy was proposed to enable our algorithm to find different possible optimal solutions. Further, the weighted configuration checking (WCC) strategy was introduced to overcome the cycling problem in local search. By combining the WCC strategy with the dynamic scoring strategy, we designed the vertex selection strategy to determine the vertex to be selected as a candidate solution component. In addition, DLSWCC was compared with several state-of-the-art algorithms on benchmark instances; the experimental results showed that DLSWCC is effective and efficient. For the class SPI, DLSWCC obtained optimal solutions for all 20 instances. For the class MPI, DLSWCC obtained new upper bounds for 22 of the 71 instances. For the class LPI, DLSWCC obtained new upper bounds for 5 of the 15 instances. For the massive graph instances, DLSWCC obtained new upper bounds for all 56 instances. The theoretical aspects of our approach require further investigation. A possible direction for future work is to extend the techniques used in our algorithm to other local search frameworks for the MWVC problem. Finally, because our framework has virtually no parameters, it can be easily adapted to other combinatorial problems [31][45][46][48][55][64][66][67][68].
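To make the anti-cycling idea concrete, the following Python sketch shows the generic configuration checking rule known from the unweighted vertex cover literature: a vertex that has just been removed from the candidate cover may not be added back until the state of at least one of its neighbours has changed. This is an assumption-laden illustration and not the weighted variant (WCC) of this paper, whose exact rules are defined in the earlier sections. The class name `ConfigurationChecker` and its methods are hypothetical.

```python
class ConfigurationChecker:
    """Generic configuration checking flags for vertex cover local search.

    adjacency: dict mapping each vertex to the set of its neighbours.
    conf_change[v] is True when v is currently allowed to be added.
    """

    def __init__(self, adjacency):
        self.adj = adjacency
        self.conf_change = {v: True for v in adjacency}

    def can_add(self, v):
        """A vertex may be added only if its configuration has changed."""
        return self.conf_change[v]

    def on_remove(self, v):
        """Removing v forbids re-adding it and releases its neighbours."""
        self.conf_change[v] = False
        for u in self.adj[v]:
            self.conf_change[u] = True

    def on_add(self, v):
        """Adding v changes the configuration of all its neighbours."""
        for u in self.adj[v]:
            self.conf_change[u] = True


# Toy path graph a - b - c.
adjacency = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
cc = ConfigurationChecker(adjacency)
cc.on_remove("b")
blocked_after_removal = not cc.can_add("b")
cc.on_add("a")
released_after_neighbour_move = cc.can_add("b")
```

In this toy run, vertex b is blocked immediately after its removal and becomes eligible again once its neighbour a changes state, which is what prevents the search from undoing its most recent move.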

Acknowledgements

The authors of this paper wish to extend their sincere gratitude to all the anonymous reviewers for their efforts.

This work was supported in part by NSFC (under Grant Nos. 61370156, 61403076, and 61403077) and the

Program for New Century Excellent Talents in University (NCET-13-0724).

References

[1] C. Aggarwal, J. Orlin, R. Tai, Optimized crossover for the independent set problem, Oper. Res. 45 (1997) 226–234.

[2] D.V. Andrade, M.G.C. Resende, R.F.F. Werneck, Fast local search for the maximum independent set problem, in: Proc. of WEA-08, 2008, pp. 220–234.

[3] A. Arab, A. Alfi, An adaptive gradient descent-based local search in memetic algorithm applied to optimal controller design, Inform. Sci. 299 (2015) 117-12.

[4] S.R. Balachandar, K. Kannan, A meta-heuristic algorithm for vertex covering problem based on gravity, Int. J. Math. & Statistical Sci. 1 (3) (2009) 130–136.

[5] S. Balaji, V. Swaminathan, K. Kannan, An effective algorithm for minimum weighted vertex cover problem, Int. J. Comput. & Math. Sci. 4 (2010) 34–38.

[6] V.C. Barbosa, L.C.D. Campos, A novel evolutionary formulation of the maximum independent set problem, J. Comb. Optim. 8 (4) (2004) 419–437.

[7] H. Bhasin, M. Amini, The Applicability of Genetic algorithm to Vertex Cover, International Journal of Computer Applications 123 (17) (2015) 29.

[8] S. Bouamama, C. Blum, A. Boukerram, A population-based iterated greedy algorithm for the minimum weight vertex cover problem, Appl. Soft Comput. 12 (6) (2012) 1632–1639.

[9] C. Brause, N.C. Lê, I. Schiermeyer, The maximum independent set problem in subclasses of subcubic graphs, Discrete Mathematics 338 (10) (2015) 1766–1778.

[10] B. Brešar, R. Krivoš-Belluš, G. Semanišin, P. Šparl, On the weighted k-path vertex cover problem, Discrete Applied Mathematics 177 (2014) 14–18.

[11] J. Brimberg, N. Mladenović, D. Urošević, Solving the maximally diverse grouping problem by skewed general variable neighborhood search, Inform. Sci. 295 (2015) 650–675.

[12] S.W. Cai, J. Lin, K.L. Su, Two weighting local search for minimum vertex cover, Proc. AAAI, 2015.

[13] S.W. Cai, K.L. Su, Local search for Boolean Satisfiability with configuration checking and subscore, Artif. Intell. 204 (2013) 75–98.

[14] S.W. Cai, K.L. Su, Q.L. Chen, EWLS: A New Local Search for Minimum Vertex Cover, Proc. AAAI, 2010.

[15] S.W. Cai, K.L. Su, A. Sattar, Local search with edge weighting and configuration checking heuristics for minimum vertex cover, Artif. Intell. 175 (9) (2011) 1672–1696.

[16] S.W. Cai, K.L. Su, C. Luo, A. Sattar, NuMVC: An efficient local search algorithm for minimum vertex cover, J. Artif. Intell. Res. (2013) 687–716.

[17] J.K. Chen, Y.J. Lin, G.P. Lin, J.J. Li, Z.M. Ma, The relationship between attribute reducts in rough sets and minimal vertex covers of graphs, Inform. Sci. 325 (2015) 87–97.

[18] J.K. Chen, Y.J. Lin, J.J. Li, G. Lin, Z.M. Ma, A. Tan, A rough set method for the minimum vertex cover problem of graphs, Applied Soft Computing 42 (2016) 360–367.

[19] V. Chvátal, A greedy heuristic for the set-covering problem, Math. Oper. Res. 3 (1979) 233–235.

[20] A. Coja-Oghlan, C. Efthymiou, On independent sets in random graphs, Random Structures & Algorithms 47 (3) (2015) 436–486.

[21] J.A. Delgado-Osuna, M. Lozano, C. García-Martínez, An alternative artificial bee colony algorithm with destructive–constructive neighbourhood operator for the problem of composing medical crews, Inform. Sci. 326 (2016) 215–226.

[22] I. Dinur, S. Safra, On the hardness of approximating minimum vertex cover, Ann. of Math. 162 (2) (2005) 439–486.

[23] S. Dobrev, R. Královič, R. Královič, Advice complexity of maximum independent set in sparse and bipartite graphs, Theory of Computing Systems 56 (1) (2015) 197–219.

[24] I. Evans, An evolutionary heuristic for the minimum vertex cover problem, in: Proc. of EP-98, 1998, pp. 377–386.

[25] Z. Fang, Y. Chu, K. Qiao, X. Feng, K. Xu, Combining edge weight and vertex weight for minimum vertex cover problem, in: Frontiers in Algorithmics, Springer International Publishing, 8497 (2014) 71–81.

[26] H. Fernau, F.V. Fomin, G. Philip, S. Saurabh, On the parameterized complexity of vertex cover and edge cover with connectivity constraints, Theoretical Computer Science 565 (2015) 1–15.

[27] S. Gilmour, M. Dras, Kernelization as heuristic structure for the vertex cover problem, in: ANTS Workshop, 2006, pp. 452–459.

[28] F. Glover, Tabu search–Part I, ORSA Journal on Computing 1 (1989) 190–206.

[29] F. Glover, Tabu search–Part II, ORSA Journal on Computing 2 (1990) 4–32.

[30] F. Glover, T. Ye, A.P. Punnen, G. Kochenberger, Integrating tabu search and VLSN search to develop enhanced algorithms: A case study using bipartite boolean quadratic programs, European Journal of Operational Research 241 (3) (2015) 697–707.

[31] B. Gu, V.S. Sheng, Z.J. Wang, D. Ho, S. Osman, S. Li, Incremental learning for ν-Support Vector Regression, Neural Networks 67 (2015) 140–150.

[32] Y.M. Hu, B. Yang, H.S. Wong, A weighted local view method based on observation over ground truth for community detection, Inform. Sci. (2016), doi:10.1016/j.ins.2016.03.028.

[33] X. Huang, H. Cheng, J.X. Yu, Dense community detection in multi-valued attributed networks, Inform. Sci. 314 (2015) 77–99.

[34] Y. Jin, J.K. Hao, Hybrid evolutionary search for the minimum sum coloring problem of graphs, Inform. Sci. 352 (2016) 15–34.

[35] Y. Jin, J.K. Hao, General swap-based multiple neighborhood tabu search for the maximum independent set problem, Eng. Appl. Artif. Intel. 37 (2015) 20–33.

[36] R. Jovanovic, M. Tuba, An ant colony optimization algorithm with improved pheromone correction strategy for the minimum weight vertex cover problem, Applied Soft Computing Journal 11 (8) (2011) 5360–5366.

[37] R. Karp, R.E. Miller, J.W. Thatcher, in: Complexity of Computer Computations, Plenum Press, New York, 1972.

[38] S. Kifah, S. Abdullah, An adaptive non-linear great deluge algorithm for the patient-admission problem, Inform. Sci. 295 (2015) 573–585.

[39] G. Kochenberger, M. Lewis, F. Glover, H. Wang, Exact solutions to generalized vertex covering problems: a comparison of two models, Optimization Letters 9 (7) (2015) 1331–1339.

[40] N.C. Lê, C. Brause, I. Schiermeyer, Extending the MAX Algorithm for Maximum Independent Set, Discuss. Math. Graph T. 35 (2) (2015) 365–386.

[41] R.H. Li, J.X. Yu, X. Huang, H. Cheng, Z. Shang, Measuring the impact of MVC attack in large complex networks, Inform. Sci. 278 (2014) 685–702.

[42] J. Li, Q. Pan, Solving the large-scale hybrid flow shop scheduling problem with limited buffers by a hybrid artificial bee colony algorithm, Inform. Sci. 316 (2015) 487–502.

[43] R.Z. Li, S.L. Hu, Y.Y. Wang, M.H. Yin, A local search algorithm with tabu strategy and perturbation mechanism for generalized vertex cover problem, Neural Comput. Appl. (2016), doi:10.1007/s00521-015-2172-9.

[44] X.T. Li, M.H. Yin, Modified cuckoo search algorithm with self adaptive parameter method, Inform. Sci. 298 (2015) 80–97.

[45] X.T. Li, M.H. Yin, Multiobjective binary biogeography based optimization for feature selection using gene expression data, IEEE Trans. on NanoBioscience 12 (4) (2013) 343–353.

[46] Y. Liu, C. Yang, W.K.S. Tang, C. Li, Optimal topological design for distributed estimation over sensor networks, Inform. Sci. 254 (2014) 83–97.

[47] C. Luo, S.W. Cai, K.L. Su, W. Wu, Clause states based configuration checking in local search for satisfiability, IEEE Trans. on Cybernetics 45 (5) (2015) 1014–1027.

[48] T.H. Ma, J.J. Zhou, M.L. Tang, Y. Tian, A. Al-Dhelaan, M. Al-Rodhaan, S.Y. Lee, Social network and tag sources based augmenting collaborative recommender system, IEICE Transactions on Information and Systems E98-D (4) (2015) 902–910.

[49] W. Pullan, H.H. Hoos, Dynamic local search for the maximum clique problem, J. Artif. Intell. Res. (JAIR) 25 (2006) 159–185.

[50] W. Pullan, Optimisation of unweighted/weighted maximum independent sets and minimum vertex covers, Discrete Optim. 6 (2) (2009) 214–219.

[51] S. Richter, M. Helmert, C. Gretton, A stochastic local search approach to vertex cover, in: Proc. of KI-07, 2007, pp. 412–426.

[52] R. Rossi, N. Ahmed, The network data repository with interactive graph analytics and visualization, Proc. AAAI, 2015.

[53] R. Rossi, N. Ahmed, Coloring large complex networks, Social Network Analysis and Mining (SNAM) (2014) 1–52.

[54] S.J. Shyu, P. Yin, B.M.T. Lin, An ant colony optimization algorithm for the minimum weight vertex cover problem, Ann. Oper. Res. 131 (1–4) (2004) 283–304.

[55] E. Teymourian, V. Kayvanfar, GH.M. Komaki, M. Zandieh, Enhanced intelligent water drops and cuckoo search algorithms for solving the capacitated vehicle routing problem, Inform. Sci. 334 (2016) 354–378.

[56] J. Tu, A fixed-parameter algorithm for the vertex cover P3 problem, Information Processing Letters 115 (2) (2015) 96–99.

[57] S. Voß, A. Fink, A hybridized tabu search approach for the minimum weight vertex cover problem, J. Heuristics 18 (6) (2012) 869–876.

[58] L. Wang, W. Du, Z. Zhang, X. Zhang, A PTAS for minimum weighted connected vertex cover P_3 problem in 3-dimensional wireless sensor networks, J. Comb. Optim. (2015) 1–17.

[59] L. Wang, X. Zhang, Z. Zhang, H. Broersma, A PTAS for the minimum weight connected vertex cover P3 problem on unit disk graphs, Theoretical Computer Science 571 (2015) 58–66.

[60] D. Wang, H. Xiong, D. Fang, A Neighborhood Expansion Tabu Search Algorithm Based On Genetic Factors, Open Journal of Social Sciences 4 (03) (2016) 303–308.

[61] Y.Y. Wang, S.W. Cai, M.H. Yin, Two efficient local search algorithms for maximum weight clique problem, Proc. AAAI, 2016.

[62] Y.Y. Wang, R.Z. Li, Y.P. Zhou, M.H. Yin, A path cost-based GRASP for minimum independent dominating set problem, Neural Comput. Appl. (2016), doi:10.1007/s00521-016-2324-6.

[63] Y.Y. Wang, D.T. Ouyang, L.M. Zhang, M.H. Yin, A novel local search for unicost set covering problem using hyperedge configuration checking and weight diversity, SCIENCE CHINA Info Sci. (2015), doi:10.1007/s11432-015-5377-8.

[64] X.Z. Wen, L. Shao, Y. Xue, W. Fang, A rapid learning algorithm for vehicle classification, Inform. Sci. 295 (1) (2015) 395–406.

[65] Q.H. Wu, J.K. Hao, A clique-based exact method for optimal winner determination in combinatorial auctions, Inform. Sci. 334 (2016) 103–121.

[66] Z.H. Xia, X.H. Wang, X.M. Sun, B.W. Wang, Steganalysis of least significant bit matching using multi-order differences, Security and Communication Networks 7 (8) (2014) 1283–1291.

[67] E.T. Yassen, M. Ayob, M.Z.A. Nazri, N.R. Sabar, Meta-harmony search algorithm for the vehicle routing problem with time windows, Inform. Sci. 325 (2015) 140–158.

[68] X. Zhang, X. Li, J. Wang, Local search algorithm with path relinking for single batch-processing machine scheduling problem, Neural Comput. Appl. (2016), doi:10.1007/s00521-016-2339-z.

[69] T. Zhou, Z.P. Lü, Y. Wang, J. Ding, B. Peng, Multi-start iterated tabu search for the minimum weight vertex cover problem, J. Comb. Optim. (2015) 1–17, doi:10.1007/s10878-015-9909-3.

[70] Y.P. Zhou, H.C. Zhang, R.Z. Li, J.N. Wang, Two Local Search Algorithms for Partition Vertex Cover Problem, J. Comput. Theor. Nanosci. 13 (2016) 743–751.