Accepted Manuscript
An Efficient Local Search Framework for the Minimum Weighted Vertex Cover Problem
Ruizhi Li, Shuli Hu, Haochen Zhang, Minghao Yin
PII: S0020-0255(16)30625-9
DOI: 10.1016/j.ins.2016.08.053
Reference: INS 12456
To appear in: Information Sciences
Received date: 1 July 2015
Revised date: 10 August 2016
Accepted date: 15 August 2016
Please cite this article as: Ruizhi Li, Shuli Hu, Haochen Zhang, Minghao Yin, An Efficient Local Search Framework for the Minimum Weighted Vertex Cover Problem, Information Sciences (2016), doi: 10.1016/j.ins.2016.08.053
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
An Efficient Local Search Framework for the Minimum Weighted Vertex Cover Problem
Ruizhi Li, Shuli Hu, Haochen Zhang, and Minghao Yin*
School of Computer Science and Information Technology, Northeast Normal University, Changchun 130024, China
*Corresponding author: [email protected]
Abstract: The minimum weighted vertex cover (MWVC) problem, an extension of the classical minimum vertex cover (MVC) problem, is an important NP-complete combinatorial optimization problem with a wide range of applications. The objective of this paper is to design an efficient local search algorithm to solve the MWVC problem. First, the weighted edge strategy is proposed to define the dynamic scoring strategy so that our algorithm can find different possible optimal solutions. Second, the weighted configuration checking (WCC) strategy is proposed to overcome the cycling problem in local search. By combining the WCC strategy with the scoring strategy, we design the vertex selection strategy to determine the vertex to be selected as a candidate solution component. Based on these strategies, a novel local search framework, namely diversion local search based on weighted configuration checking (DLSWCC), is presented. DLSWCC is evaluated against several state-of-the-art algorithms on various benchmark instances. Experimental results show that DLSWCC outperforms its competitors in terms of both solution quality and computational efficiency on most classical instances. Specifically, DLSWCC obtains 22 new upper bounds for 71 moderate-scale problem instances, 5 new upper bounds for 15 large-scale problem instances, and 56 new upper bounds for 56 massive graph instances.
Keywords: minimum weighted vertex cover, weighted configuration checking, scoring strategy, local search, massive graph instances
1. Introduction
The minimum vertex cover (MVC) problem is to find a minimum subset of vertices that contains at least one
endpoint of each edge [17][18][43]. This problem is a core optimization problem that has been studied extensively.
It has important applications in various fields such as network security, industrial machine assignment, and data
aggregation [7][10][25][26][39][41][49][51][56]. As a dual problem of the maximum independent set (MIS)
problem [9][20][23][35][40], the MVC problem has also been applied to social networks, pattern recognition,
molecular biology, and economics [1][2][6][24][32][33][65].
The minimum weighted vertex cover (MWVC) problem can be viewed as a generalized version of the MVC
problem. In the MWVC problem, each vertex is associated with a weight, and the problem is to find a vertex cover
whose total weight is the smallest. Obviously, the problem reduces to the classical MVC problem when all the
vertices share the same weight. The MWVC problem plays an important role in many real-world applications such
as wireless communication, circuit design, and network flows [50][58][59].
As is well known, the decision version of the MVC problem is one of the prominent problems among Karp’s
21 NP-complete combinatorial problems [37]. Moreover, it is NP-hard to approximate the MVC problem within
any factor smaller than 1.3606 [22]. Therefore, for large and hard instances, researchers usually resort to heuristic
approaches to obtain good solutions within reasonable time. In the past decade, several heuristic algorithms have
been proposed to solve the MVC problem. An evolutionary approach to the MVC problem and related surveys on
this type of algorithm can be found in [24]. Further, ant colony approaches have been proposed in [27]. The cover
edge randomly (COVER) [51] algorithm is a recently developed iterative best improvement algorithm that uses
edge weights to guide the local search. Cai et al. proposed two local search-based algorithms: EWLS and EWCC
[14][15]. On the basis of these algorithms, they further proposed two heuristic strategies—two-stage exchange and
edge weighting with forgetting—and introduced a new local search algorithm called NuMVC [16]. Subsequently, a
vertex weighting scheme was combined with NuMVC, leading to a new algorithm called TwMVC [12].
For the weighted version of the MVC problem, researchers have devoted their efforts toward the development
of heuristics for generating good or near-optimal solutions within reasonable time. The algorithm proposed in [19]
starts with an empty partial solution, adds one vertex at a time, and selects the vertex with the minimum ratio
between the vertex weight and the current degree in each step. In [5], Balaji et al. proposed the support ratio algorithm (SRA), which constructs a weighted vertex cover using the notion of the support of a vertex introduced in their model. In [57], Stefan applied a modified reactive tabu
search approach to solve the problem. In [4], a new heuristic operator was introduced in the domain of the
randomized gravitational emulation search (RGES) algorithm to maintain feasibility specifically for the vertex
cover problem. Furthermore, in [54], Shyu et al. introduced a meta-heuristic based on the ant colony optimization (ACO) approach to find approximate solutions. Subsequently, in [36], Jovanovic and Tuba improved the aforementioned algorithm by introducing a pheromone correction heuristic strategy that incorporates information
about the best-found solution to exclude suspicious elements in order to avoid being trapped in local optima. In [8],
Bouamama proposed a simple yet efficient population-based iterative greedy algorithm for solving the MWVC
problem. In [69], Zhou et al. proposed a multi-start iterative tabu search algorithm to solve the MWVC problem.
Although heuristic algorithms are largely successful in solving the MWVC problem, research on such
methods remains in its nascent stages. Compared with local search algorithms, which can solve the MVC problem
efficiently and effectively, MWVC algorithms cannot scale up and are time-consuming, especially for large-scale
instances. This may be because the MWVC problem is much harder and more complicated; thus, it is more
difficult to solve from the viewpoint of algorithm design. Owing to the complicated structure of the MWVC
problem, local search algorithms are also more likely to revisit previously explored regions of the search space. In
addition, a careful inspection of current MWVC heuristic algorithms shows that most heuristic functions are static
during the local search. This would cause the algorithm to be trapped in local optima, thereby leading to bad
solutions.
To address the above-mentioned issues, the present paper proposes two heuristics and introduces a novel local
search framework for solving the MWVC problem. The first heuristic is a dynamic vertex scoring strategy
designed for vertex selection. As is well known, the selection of vertices to be added into or removed from the
candidate solution plays an important role in the local search performance. A good selection of such vertices would
be beneficial for guiding the search in the right direction, whereas a bad selection would mislead the algorithm.
Previous scoring mechanisms that assess the selected vertices are mainly based on the static scoring strategy. In
other words, the vertex score is constant throughout the search process. Such a scoring mechanism may lose its
effectiveness when the local search is trapped in local optima. Therefore, this paper proposes a dynamic scoring strategy that dynamically modifies the edge weights of the uncovered edges when the search is trapped in local optima.
Experimental results provide further evidence regarding the effectiveness and general applicability of this
mechanism.
The second heuristic is a variant of the configuration checking (CC) strategy for handling the cycling problem
during the local search. Compared with previous mechanisms, such as the tabu strategy and random walk [3][11][21][28][29][30][34][38][42][44][55][60][62], the CC strategy decides whether the value of a solution component may be changed by considering the circumstances of that component rather than its direct properties. This
strategy has been successfully applied to various combinatorial optimization problems, such as the MVC problem,
propositional satisfiability (SAT) problem, MaxSAT problem, and set cover problem, and it has inspired a series of
state-of-the-art local search algorithms, such as EWCC, CCASat, and uSLC [13][15][47][63][70]. In this paper, we
adapt the aforementioned strategy to solve the MWVC problem. However, direct application of the CC strategy
cannot lead to a successful algorithm, because the original CC strategy is too rigid in the context of the MWVC
problem. Therefore, we propose a new strategy, namely the weighted configuration checking (WCC) strategy,
which is looser than the CC strategy and obtains more promising search areas. Experimental results not only show
that the proposed strategy is more effective than the original CC strategy but also confirm its superiority on various
types of benchmarks.
By incorporating the two strategies described above, an efficient algorithm, namely diversion local search based on weighted configuration checking (DLSWCC), is proposed to solve the MWVC problem. To test the efficiency of DLSWCC, it is compared with several state-of-the-art MWVC algorithms, namely RGES [4], ACO [54], ACO+SEE [36], PBIG [8], and MS-ITS [69], using benchmark instances originally proposed in [54]. Experimental results show that DLSWCC significantly outperforms the other algorithms in terms of solution quality and running time, and it improves some current best known solutions.
It is important to note that previous MWVC local search algorithms mainly focus on academic benchmarks.
In these benchmarks, the number of vertices only takes values from 10 to 1000. However, the proliferation of the
Internet, followed by various scientific advancements and the widespread deployment of sensors, has resulted in
increasingly massive data sets. In many real-world applications, such as social networks, biological networks, and
recommendation problems, the number of vertices usually takes values greater than 1000 (greater than one million
in certain cases). Therefore, in this study, we obtained a large number of massive graph benchmarks from the
Network Data Repository [52]. These graphs are modeled on the basis of real-world applications, including
biological networks, collaboration networks, Facebook networks, interaction networks, infrastructure networks,
Amazon recommendation networks, scientific computation networks, social networks, technological networks, and
web link networks. Experimental results show that our algorithm exhibits much better performance than other local
search algorithms. Specifically, our algorithm is able to obtain new upper bounds for all the massive graph
benchmarks.
The remainder of this paper is organized as follows. Section 2 reviews the relevant background knowledge.
Section 3 introduces the dynamic weights of edges and the dynamic scoring strategy. Section 4 explains the
weighted configuration checking strategy for the MWVC problem. Section 5 discusses the design of a new vertex
selection strategy. Section 6 describes the proposed efficient local search framework, namely DLSWCC. Section 7
presents and discusses the experimental results. Finally, Section 8 concludes the paper and briefly explores
directions for future work.
2. Basic Definitions and Notations
In this section, we introduce some basic concepts and definitions. An undirected graph is denoted by G(V, E), where V = {v1, v2, ⋯, vn} is the set of vertices and E = {e1, e2, ⋯, em} is the set of edges. An undirected weighted graph is denoted by G(V, E, w), where each vertex vi (i = 1, 2, ⋯, n) is associated with a weight w(vi). We use N(v) = {u ∈ V | (u, v) ∈ E} to denote the set of neighbors of a vertex v, and d(v) = |N(v)| to denote the degree of v. Given a candidate solution C, we use $s_{v_j} \in \{0, 1\}$ to denote the state of a vertex vj, where $s_{v_j} = 1$ implies that vj ∈ C and $s_{v_j} = 0$ implies that vj ∉ C. The three variants of the vertex cover problem are defined as follows.
Definition 1 (vertex cover, VC) Given an undirected graph G(V, E), where V is the set of vertices and E is the set of edges, a vertex cover is a subset C ⊆ V such that every edge in G has at least one endpoint in C.
Definition 2 (minimum vertex cover, MVC) Given an undirected graph G(V, E), where V is the set of vertices and E is the set of edges, the minimum vertex cover problem is to find the smallest vertex cover in G. The MVC problem can be formally defined as follows.
$$\text{Minimize } w(C) = \sum_{j=1}^{n} s_{v_j} \tag{1}$$
$$\text{subject to } s_{v_i} + s_{v_j} \ge 1, \quad \forall (v_i, v_j) \in E, \tag{2}$$
$$s_{v_i}, s_{v_j} \in \{0, 1\}, \quad i, j = 1, 2, \dots, n \tag{3}$$
Equation (2) ensures that each edge is covered by at least one vertex, and Equation (3) is the integrality constraint.
Definition 3 (minimum weighted vertex cover, MWVC) Given an undirected weighted graph G(V, E, w), where V is the set of vertices, E is the set of edges, and each vertex vi has a weight w(vi), the minimum weighted vertex cover problem is to find the vertex cover with the minimum sum of weights of the constituent vertices. The MWVC problem can be formally defined as follows.
$$\text{Minimize } w(C) = \sum_{j=1}^{n} w(v_j)\, s_{v_j} \tag{4}$$
$$\text{subject to } s_{v_i} + s_{v_j} \ge 1, \quad \forall (v_i, v_j) \in E, \tag{5}$$
$$s_{v_i}, s_{v_j} \in \{0, 1\}, \quad i, j = 1, 2, \dots, n \tag{6}$$
Equation (5) ensures that each edge is covered by at least one vertex, and Equation (6) is the integrality constraint. The problem reduces to the MVC problem when the weights of all the vertices are equal to 1.
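To make the formulation concrete, the following minimal Python sketch checks the covering constraint and evaluates the weighted objective on a toy instance. The function names and the three-vertex path are illustrative assumptions and do not come from the paper.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]


def is_vertex_cover(edges: Iterable[Edge], cover: Set[str]) -> bool:
    """Covering constraint: every edge has at least one endpoint in the cover."""
    return all(u in cover or v in cover for u, v in edges)


def cover_weight(weights: Dict[str, float], cover: Set[str]) -> float:
    """Weighted objective: total weight of the selected vertices."""
    return sum(weights[v] for v in cover)


# Hypothetical toy instance: the path a - b - c with a heavy middle vertex.
edges = [("a", "b"), ("b", "c")]
weights = {"a": 1.0, "b": 5.0, "c": 1.0}

# {b} is the smallest cover, but {a, c} is the lightest one.
for candidate in ({"b"}, {"a", "c"}, {"a"}):
    print(sorted(candidate), is_vertex_cover(edges, candidate),
          cover_weight(weights, candidate))
```

On this instance the unweighted and weighted optima differ, which illustrates why an MVC solver cannot be reused unchanged for the MWVC problem.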
3. Dynamic Scoring Strategy
Given a set of candidate vertices, selecting the vertices that should be added into or removed from the
candidate solution plays an important role in the efficiency of the search. Previous scoring mechanisms of the
selected vertex v are mainly based on a static scoring strategy. In other words, the vertex score is constant
throughout the search process. Such an assessment would lose its effectiveness when the local search is trapped in
a local minimum. In this section, we introduce an alternative dynamic scoring strategy to measure the benefit of
changing the state of a vertex during the local search. This strategy together with the WCC strategy introduced in
the next section can be used to select a vertex as a candidate solution component. Before we introduce the dynamic
scoring mechanism, we clarify the concept of dynamic edge weight.
Definition 4 (dynamic edge weight) Given an undirected graph G(V, E, w), for each edge e ∈ E, we use an edge weight function, denoted by dynamic_weight(e), associated with e, which is maintained dynamically during the local search process.
Specifically, two dynamic weight rules are introduced as follows.
Weight_Rule 1: In the initialization process, for each edge e ∈ E, dynamic_weight(e) is set to 1.
Weight_Rule 2: At the end of each loop during the local search process, each edge e ∈ E will be checked to determine whether it is covered by the candidate solution. If e is uncovered, dynamic_weight(e) will be increased by 1.
According to Definition 4, supposing that a candidate solution is a subset of vertices C ⊆ V, a Boolean function cover(e, C) is used to denote whether an edge e ∈ E is covered by a candidate solution C, i.e., whether at least one endpoint of e belongs to C. Then, for the MWVC problem, the score of a vertex v, denoted by score(v), can be calculated as follows.
$$\mathrm{score}(v) = \frac{\mathrm{cost}(C) - \mathrm{cost}(C')}{w(v)} \tag{7}$$
where C denotes the current solution, C' = C\{v} if v ∈ C and C' = C∪{v} otherwise, and w(v) denotes the weight of vertex v. Further, cost(C) is the total weight of edges not covered by C, which can be described as follows.
$$\mathrm{cost}(C) = \sum_{e \in E,\ \mathrm{cover}(e, C) = \mathrm{false}} \mathrm{dynamic\_weight}(e) \tag{8}$$
As can be seen from Equation (8), if the local search is trapped in a local minimum, according to
Weight_Rule 2, the weight of the uncovered edge will be increased incrementally; thus, the score of the vertices
related to the edge will be increased incrementally. In this manner, the search will proceed in the right direction
and jump out of the local minimum.
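The effect of Weight_Rule 2 on the scores can be illustrated with a small Python sketch that evaluates Equations (7) and (8) directly from their definitions. The helper names (`cost`, `score`, `bump_uncovered`) and the toy path graph are assumptions made for illustration; the sketch recomputes the cost from scratch and is not meant to be efficient.

```python
from typing import Dict, FrozenSet, Set

Edge = FrozenSet[str]


def cost(dyn_w: Dict[Edge, int], cover: Set[str]) -> int:
    """Equation (8): total dynamic weight of the edges not covered by cover."""
    return sum(w for e, w in dyn_w.items() if not (e & cover))


def score(v: str, dyn_w: Dict[Edge, int], vertex_w: Dict[str, float],
          cover: Set[str]) -> float:
    """Equation (7): cost reduction obtained by flipping v, per unit of w(v)."""
    flipped = cover - {v} if v in cover else cover | {v}
    return (cost(dyn_w, cover) - cost(dyn_w, flipped)) / vertex_w[v]


def bump_uncovered(dyn_w: Dict[Edge, int], cover: Set[str]) -> None:
    """Weight_Rule 2: increase the weight of every uncovered edge by 1."""
    for e in dyn_w:
        if not (e & cover):
            dyn_w[e] += 1


# Hypothetical instance: path x - y - z, all dynamic weights start at 1.
dyn_w = {frozenset(("x", "y")): 1, frozenset(("y", "z")): 1}
vertex_w = {"x": 1.0, "y": 2.0, "z": 1.0}
cover = {"x"}                      # edge (y, z) is uncovered

before = {v: score(v, dyn_w, vertex_w, cover) for v in "xyz"}
bump_uncovered(dyn_w, cover)       # the search is stuck, so (y, z) gets heavier
after = {v: score(v, dyn_w, vertex_w, cover) for v in "xyz"}
print(before)
print(after)
```

After the weight of the uncovered edge is increased, the scores of its endpoints y and z grow while the score of x is unchanged, which is the mechanism that steers the search toward covering persistently uncovered edges.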
It should be noted that such a mechanism is especially important for large-scale problem instances, because it
is much easier for the local search to be trapped in a local optimum for these instances. However, for large-scale
instances, the efficiency of the scoring mechanism is also an important issue. It is imperative to rapidly determine
the score of each vertex. In our implementation, we use a fast incremental evaluation technique based on a
streamlined calculation for updating the score after each addition or removal. Specifically, given a vertex v, the
initial score of v is set to
$$\mathrm{score}_0(v) = d(v) \tag{9}$$
where d(v) denotes the degree of v.
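Suppose that in the ith step of the local search, the score of v is denoted by score_i(v). Then, if v is added into or removed from the candidate solution, we only need to update the scores of v and its neighbors as follows.
$$\mathrm{score}_{i+1}(v) = -\mathrm{score}_i(v); \tag{10}$$
for each vertex u ∈ N(v),
$$\mathrm{score}_{i+1}(u) = \begin{cases} \mathrm{score}_i(u) - \dfrac{\mathrm{dynamic\_weight}(u,v)}{w(u)}, & (u \in C) \oplus (v \in C) = \text{True} \\[2ex] \mathrm{score}_i(u) + \dfrac{\mathrm{dynamic\_weight}(u,v)}{w(u)}, & \text{otherwise} \end{cases} \tag{11}$$
where C denotes the candidate solution after the state of v has been changed and ⊕ denotes exclusive or, i.e., the first case applies when exactly one endpoint of the edge (u, v) belongs to C. For other vertices v' that are not neighbors of v, the score will not be changed, i.e.,
$$\mathrm{score}_{i+1}(v') = \mathrm{score}_i(v') \tag{12}$$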
In order to further clarify the fast incremental evaluation technique, we shall use an example to illustrate it.
Example 1 Consider the graph shown in Figure 1, which has five vertices and four edges. Suppose that the weight of each vertex is equal to 1 and the current candidate solution is C = {b, d}. Let the current weights of the edges be dynamic_weight(a,b) = 1, dynamic_weight(a,c) = 3, dynamic_weight(a,d) = 2, and dynamic_weight(d,e) = 3. Further, let the current step of the local search be the ith step. According to Equation (7), the scores of the vertices are score_i(a) = 3, score_i(b) = -1, score_i(c) = 3, score_i(d) = -5, and score_i(e) = 0. If we add vertex a into the candidate solution, only the scores of a and its neighbors need to be updated. According to Equation (10), score_{i+1}(a) = -score_i(a) = -3. Vertices b, c, and d are the neighbors of vertex a, and their scores need to be updated. From Equation (11), we get score_{i+1}(b) = score_i(b) + dynamic_weight(a,b) = 0, score_{i+1}(c) = score_i(c) - dynamic_weight(a,c) = 0, and score_{i+1}(d) = score_i(d) + dynamic_weight(a,d) = -3. According to Equation (12), the scores of the other vertices need not be updated; hence, score_{i+1}(e) = score_i(e) = 0. If we remove a vertex from the candidate solution, the scores are updated similarly.
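The update in Example 1 can be replayed with the short Python sketch below. It is only an illustration under stated assumptions: the function name `flip_and_update` and the data layout are hypothetical, and the candidate solution is taken to be {b, d}, which is the set consistent with the scores listed in the example. The neighbor rule subtracts the weight ratio when, after the flip, exactly one endpoint of the edge is in the candidate solution, and adds it otherwise.

```python
from typing import Dict, FrozenSet, List, Set


def flip_and_update(
    v: str,
    cover: Set[str],
    score: Dict[str, float],
    adj: Dict[str, List[str]],
    dyn_w: Dict[FrozenSet[str], int],
    vertex_w: Dict[str, float],
) -> None:
    """Flip the membership of v in cover and patch the affected scores in place."""
    if v in cover:
        cover.remove(v)
    else:
        cover.add(v)
    score[v] = -score[v]                      # Equation (10)
    for u in adj[v]:                          # Equation (11)
        delta = dyn_w[frozenset((u, v))] / vertex_w[u]
        if (u in cover) != (v in cover):      # exactly one endpoint in cover
            score[u] -= delta
        else:
            score[u] += delta
    # Equation (12): every other score is left untouched.


# Data of Example 1 (unit vertex weights, candidate solution {b, d}).
adj = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a", "e"], "e": ["d"]}
dyn_w = {frozenset("ab"): 1, frozenset("ac"): 3,
         frozenset("ad"): 2, frozenset("de"): 3}
vertex_w = {v: 1.0 for v in "abcde"}
cover = {"b", "d"}
score = {"a": 3, "b": -1, "c": 3, "d": -5, "e": 0}

flip_and_update("a", cover, score, adj, dyn_w, vertex_w)
print(score)   # scores after adding a, to be compared with Example 1
```

Each call touches only v and its neighbors, so its cost is proportional to the degree of v rather than to the size of the graph.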
[Figure 1: the example graph before and after adding vertex a into C. Before: score_i(a) = 3, score_i(b) = -1, score_i(c) = 3, score_i(d) = -5, score_i(e) = 0. After: score_{i+1}(a) = -3, score_{i+1}(b) = 0, score_{i+1}(c) = 0, score_{i+1}(d) = -3, score_{i+1}(e) = 0.]
Fig. 1 An example of the fast incremental evaluation technique.
Proposition 1 The incremental evaluation technique is sound, and the time complexity of updating the scores after each addition or removal is linear in the number of neighbors of the selected vertex.
Proof. We first prove the soundness of the incremental evaluation technique by considering the removal of a vertex from the candidate solution; the case of addition is analogous. Suppose that the current candidate solution is C, v ∈ C, and score_i(v) = k, i.e., score_i(v) = (cost(C) − cost(C'))/w(v) = k according to Equation (7), where C' = C\{v}. If we select v and remove it from C, the current candidate solution becomes C'. According to Equation (7), score_{i+1}(v) = (cost(C') − cost(C''))/w(v), where C'' = C'∪{v} = C; hence, score_{i+1}(v) = -k. For a vertex u ∈ N(v), suppose that score_i(u) = m and dynamic_weight(v,u) = n. If u ∈ C, the edge (v,u) is covered by both v and u. After v is removed from C, the edge (v,u) is covered only by u, so removing u in a later iteration would leave (v,u) uncovered; hence, score_{i+1}(u) = score_i(u) − dynamic_weight(v,u)/w(u) = m − n/w(u). On the other hand, if u ∉ C, the edge (v,u) is covered only by v. After v is removed from C, the edge (v,u) becomes uncovered, so adding u in a later iteration would cover it again; hence, score_{i+1}(u) = score_i(u) + dynamic_weight(v,u)/w(u) = m + n/w(u). For any other vertex v' that is not a neighbor of v, removing v does not change the coverage of any edge incident to v'. Hence, the score will not be changed, i.e., score_{i+1}(v') = score_i(v').
We then analyze the time complexity of the technique. The fast incremental evaluation technique is based on Equations (10) and (11). If we add vertex v into the candidate solution, we only need to update the scores of vertex v and its neighbors. Note that score(v) is replaced by its negation, as given by Equation (10). The scores of the neighbors of v are updated by adding or subtracting a number, as shown in Equation (11). Thus, the worst-case time complexity is determined by updating the scores of the neighbors of v, and it depends on the number of neighbors. We can conclude that the complexity of the fast incremental evaluation technique is low, i.e., O(Δ(V)), where Δ(V) = max{d(v) | v ∈ V}. □
4. Weighted Configuration Checking Strategy
The cycling problem, which refers to revisiting the same part during the search process, is a serious issue in
local search. Recently, Cai et al. [15] proposed the configuration checking (CC) strategy, which can exploit the
problem structure to prevent stochastic local search algorithms from revisiting the same scenario. In the CC
strategy, the concept of configuration, which we refer to as unweighted configuration in this paper, is defined as
follows.
Definition 5 (unweighted configuration) Given an undirected graph G(V, E, w) and assuming C to be the
current candidate solution, the unweighted configuration of a vertex v is a vector S consisting of the states of all the
vertices in N(v) under the current candidate solution.
According to the CC strategy, supposing that the local search procedure maintains a current candidate solution
C, when selecting a vertex v to add into C, for a vertex v∉C, if the configuration of v has not been changed since
its last removal from C, which means that the circumstance of v remains stable, then v should not be added back to
C; otherwise, the algorithm could easily lead to a scenario that it has recently faced, which is likely to result in the
cycling problem.
A straightforward CC strategy for MWVC can be easily devised as follows. An array config is maintained, whose element is an indicator: config[v] = 1 implies that the configuration of vertex v has changed since the last removal of v from C; otherwise, config[v] = 0. Initially, config[v] is initialized as 1 for each vertex v, as each vertex is allowed to be selected initially. During the search, when a vertex v is added to the current solution, config[u] is set to 1 for each vertex u ∈ N(v). When a vertex v is removed from the current solution, config[v] is set to 0 and config[u] is set to 1 for each vertex u ∈ N(v).
However, such direct application of the CC strategy to the MWVC problem would mislead the search by
restricting the addition of some promising vertices. In other words, the restriction of the original CC strategy is too
rigid. Section 3 introduced a dynamic scoring mechanism, where each edge of the graph is associated with a
weight. This weight will be maintained dynamically throughout the local search, thereby enabling the search to
jump out of a local optimum. Our strategy also considers the dynamic weight of the edge in order to avoid the
cycling problem. Next, we shall introduce the concept of weighted configuration.
Definition 6 (weighted configuration) Given an edge-weighted undirected graph G = (V, E, w) and assuming C to be the current candidate solution, the weighted configuration of a vertex v is a two-tuple ⟨S, W⟩, where S is a vector consisting of the states of all the vertices in N(v) under the current candidate solution, and W is a vector consisting of the weights of all the incident edges of all the vertices in N(v).
According to Definition 6, we can modify the original CC strategy into a less rigid version, referred to as
weighted configuration checking (WCC). This heuristic is specified as follows.
An array wconfig will be maintained, whose element acts as an indicator: wconfig[v] = 1 implies that the
weighted configuration of vertex v has changed since the last removal of v from C; otherwise, wconfig[v] = 0. The
wconfig array is maintained by the following four rules:
WCC_Rule 1: Initially, for each vertex v, wconfig[v] is initialized as 1.
WCC_Rule 2: When removing v from C, wconfig[v] is reset to 0; for each u ∈ N(v), wconfig[u] is set to 1.
WCC_Rule 3: When adding v into C, for each u ∈ N(v), wconfig[u] is set to 1.
WCC_Rule 4: When updating dynamic_weight(e), i.e., the weight of edge e, where e is incident to vertices u and v, both wconfig[u] and wconfig[v] are set to 1.
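The four rules amount to a small amount of bookkeeping, which the following Python sketch makes explicit. The class name `WCC` and its method names are hypothetical and chosen for illustration only; the sketch stores the wconfig indicator in a dictionary rather than an array.

```python
from typing import Dict, Hashable, Iterable


class WCC:
    """Minimal bookkeeping for the wconfig indicator (illustrative sketch)."""

    def __init__(self, vertices: Iterable[Hashable]) -> None:
        # WCC_Rule 1: every vertex may be added initially.
        self.wconfig: Dict[Hashable, int] = {v: 1 for v in vertices}

    def on_remove(self, v: Hashable, neighbors: Iterable[Hashable]) -> None:
        # WCC_Rule 2: forbid re-adding v, release its neighbors.
        self.wconfig[v] = 0
        for u in neighbors:
            self.wconfig[u] = 1

    def on_add(self, v: Hashable, neighbors: Iterable[Hashable]) -> None:
        # WCC_Rule 3: adding v changes the configuration of its neighbors.
        for u in neighbors:
            self.wconfig[u] = 1

    def on_weight_update(self, u: Hashable, v: Hashable) -> None:
        # WCC_Rule 4: a weight change on edge (u, v) releases both endpoints.
        self.wconfig[u] = 1
        self.wconfig[v] = 1

    def may_add(self, v: Hashable) -> bool:
        return self.wconfig[v] == 1


# Hypothetical two-vertex graph with the single edge (a, b).
wcc = WCC(["a", "b"])
wcc.on_remove("a", neighbors=["b"])
print(wcc.may_add("a"))          # a was just removed
wcc.on_weight_update("a", "b")   # edge (a, b) is uncovered and gets heavier
print(wcc.may_add("a"))          # the weight change releases a again
```

In this toy run, vertex a is released by a weight update alone, without any neighbor changing its state, which is the situation in which the weighted rule is less restrictive than the unweighted one.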
Clearly, the main difference between the unweighted configuration and the weighted configuration of a vertex
v is that the latter considers not only the states of all the vertices in N(v) under the current candidate solution but
also the weights of all the incident edges of all the vertices in N(v), whereas the former only considers the states of
all the vertices in N(v). Therefore, a vertex v becomes eligible for addition into the candidate solution once the states of its neighbors or the weights of its incident edges have changed. Given a vertex v, we say that v is an original CC
variable (OCCV) if config[v] = 1 and v is a weighted CC variable (WCCV) if wconfig[v] = 1. It is easy to see that
the following proposition stands.
Proposition 2 For a graph G(V, E) and v ∈ V, if some vertices become OCCV by picking one vertex v to be added or removed, then those vertices are also WCCV.
Proof. During the current procedure, one vertex v is selected to be added into or removed from the candidate solution. All the neighbors N(v) of this vertex become both OCCV and WCCV. In addition, when the weight of an edge e is updated, both endpoints of e become WCCV. □
According to Proposition 2, we know that WCCV variables include OCCV variables. In other words, the
WCC strategy is looser than the original CC strategy. Thus, the algorithm can reserve some potential vertices and
be led to promising search areas. The experimental results presented in Section 7.3 confirm the effectiveness of the
WCC strategy.
5. Vertex Selection Strategy
Using the dynamic scoring mechanism and the WCC strategy introduced in the previous sections, we develop
the vertex selection strategy. First, we shall introduce the concept of age. The age of a vertex is defined as the
number of search steps that have occurred since its state was last changed. Specifically, the vertex selection
strategy is based on the following two rules.
Remove_Rule: For vertices in the candidate solution, select one vertex v with the greatest score(v) value. If there is more than one such vertex, break ties in favor of the oldest one, i.e., the one with the greatest value of age.
Add_Rule: For vertices not in the candidate solution with the WCC value equal to 1, select one vertex v with
the greatest score(v) value. If there is more than one such vertex, break ties in favor of the oldest one, i.e., the one
with the greatest value of age.
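Both rules reduce to a maximization over a filtered vertex set with a lexicographic key (score first, then age). The Python sketch below shows one way to express this; the function names and the sample values are hypothetical, and vertices that tie on both score and age are resolved arbitrarily because the rules do not specify that case.

```python
from typing import Dict, Iterable, Optional, Set


def pick_to_remove(cover: Set[str], score: Dict[str, float],
                   age: Dict[str, int]) -> Optional[str]:
    """Remove_Rule: greatest score in the cover, ties broken by greatest age."""
    return max(cover, key=lambda v: (score[v], age[v]), default=None)


def pick_to_add(vertices: Iterable[str], cover: Set[str],
                score: Dict[str, float], age: Dict[str, int],
                wconfig: Dict[str, int]) -> Optional[str]:
    """Add_Rule: same key, restricted to vertices outside the cover with wconfig == 1."""
    candidates = [v for v in vertices if v not in cover and wconfig[v] == 1]
    return max(candidates, key=lambda v: (score[v], age[v]), default=None)


# Hypothetical state: a and c tie on score, c has waited longer.
vertices = ["a", "b", "c", "d", "e"]
cover = {"b", "d"}
score = {"a": 3.0, "b": -1.0, "c": 3.0, "d": -5.0, "e": 0.0}
age = {"a": 2, "b": 4, "c": 7, "d": 1, "e": 9}
wconfig = {v: 1 for v in vertices}

print(pick_to_remove(cover, score, age))
print(pick_to_add(vertices, cover, score, age, wconfig))
wconfig["c"] = 0                 # c is now forbidden by configuration checking
print(pick_to_add(vertices, cover, score, age, wconfig))
```

A linear scan is used here for clarity; an implementation aimed at massive graphs would presumably maintain the candidates in a more efficient structure, which this sketch does not attempt.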
From these two rules, we can see that, when selecting a vertex to be added into the candidate solution, we
need to select a vertex v with the highest score(v). Thus, a candidate solution can cover as many edges as
possible after this vertex is added into it, and simultaneously, the least weight value is added into the candidate
solution. On the other hand, to avoid visiting the previous candidate solution, the WCC value of the added vertex
should be 1. When there is more than one vertex with the highest score, we need to pick the oldest one, i.e., the one
with the greatest value of age. For removing a vertex from the candidate solution, a similar process is followed, except that the WCC value of the selected vertex need not be 1.
6. Framework of DLSWCC
The DLSWCC algorithm follows the general local search procedure. First, an initial candidate solution C is
constructed greedily; then, local search improvement is achieved using a perturbing method to improve the initial
solution C. Let w(C) denote the objective value of the candidate solution and w(C) = ∑v∈C w(v). The upper bound
(UB) of the objective value is initially set as UB = w(C). If better solutions exist, their objective values should be
smaller than UB. The objective of local search improvement is to solve a series of new problems: given the
original problem and an integer UB, find a feasible solution whose objective value is smaller than UB but which is
still able to cover all the edges in E. The candidate solution becomes infeasible when it cannot cover all the edges
in E. Our algorithm repeatedly perturbs infeasible solutions with a smaller objective value than UB. Thus, once the
initial candidate solution has been constructed, the vertices from C are first removed until C becomes an infeasible
solution under UB. In this process, if any better solutions are found, UB and C* should be updated. The dynamic_weight of every edge is updated in each iteration when the candidate solution becomes infeasible. In this manner, the dynamic_weight of each uncovered edge is increased by 1, thus giving the "hard to cover" edges a better chance to be covered by the new C in the following iterations. On the basis of the explanation provided
above, we outline our heuristic algorithm as follows.
Algorithm DLSWCC
1   initialize wconfig array according to WCC_Rule 1;
2   initialize the dynamic_weight of each edge assigned as 1;
3   initialize the score of each vertex assigned as the degree of the vertex;
4   initialize the candidate solution C greedily;
5   UB = w(C);
6   C* ← C;
7   iter ← 0;
8   while stop criterion is not satisfied do
9       while C covers all edges do
10          UB = w(C);
11          C* ← C;
12          v ← x with the greatest score in C, breaking ties in favor of the oldest one;
13          C ← C\{v};
14          update wconfig array according to WCC_Rule 2;
15      end while
16      v ← x with the greatest score in C and v is not in tabu_list, breaking ties in favor of the oldest one;
17      C ← C\{v};
18      update wconfig array according to WCC_Rule 2;
19      clear tabu_list;
20      while C uncovers some edges do
21          v ← x with the greatest score not in C and wconfig[x] == 1, breaking ties in favor of the oldest one;
22          if w(C) + w(v) ≥ UB then break;
23          C ← C∪{v};
24          update wconfig array according to WCC_Rule 3;
25          dynamic_weight[e] ← dynamic_weight[e] + 1, for each edge e uncovered by C;
26          update wconfig array according to WCC_Rule 4;
27          add v into tabu_list;
28      end while
29      iter ← iter + 1;
30  end while
31  return C*;
When our algorithm commences, the wconfig array should be initialized to 1 (line 1), which means that the
weighted configuration checking strategy allows all the vertices to be added into the candidate solution, and the dynamic_weight of each edge is set to 1 (line 2). Next, the score of each vertex is set to the degree of the vertex (line 3). In line 4, the candidate solution C is generated with the greedy initialization method, which finds a solution to the MWVC problem by iteratively selecting one vertex with the greatest score. We calculate UB, the objective value of the candidate solution, in line 5. We use C* and iter to denote the current best solution found using DLSWCC (line 6) and the number of iterations of the local search (line 7), respectively.

After the initialization process, the main outer loop from line 8 to line 30 is executed until the time limit or maximum number of iterations is reached. When a solution is obtained, i.e., C covers all edges, UB and the best solution C* are updated by w(C) and C, respectively (lines 10, 11), which means that our algorithm needs to search for a solution with a smaller objective value. Therefore, our algorithm selects a vertex in the candidate solution according to Remove_Rule (line 12), removes the selected vertex from C (line 13), and updates the wconfig array by WCC_Rule2 (line 14); this is repeated until there are some uncovered edges. Furthermore, we select another vertex with the greatest score in the candidate solution, and the selected vertex should not be in the tabu_list, which means that vertices added in the last iteration are not immediately removed from the candidate solution again (line 16). If such a vertex is found, it is removed from the candidate solution (line 17). DLSWCC updates the wconfig array according to WCC_Rule2 and clears tabu_list (lines 18, 19).

The inner loop (lines 20 to 28) runs until the candidate solution covers all the edges. Based on the Add_Rule, our algorithm picks a vertex v (line 21). If the sum of the weights of the vertices in C∪{v} is not smaller than UB, i.e., w(C) + w(v) ≥ UB, then the inner loop breaks (line 22). Otherwise, we add vertex v into the current candidate solution and update wconfig according to WCC_Rule3 (lines 23, 24). After adding a vertex, the weights of the uncovered edges are increased by one (line 25) and the WCC values of the vertices of the uncovered edges are updated according to WCC_Rule4 (line 26). Then, we add v into tabu_list (line 27). After the inner loop ends, the value of iter is increased by one (line 29). When the stopping criterion, i.e., the time limit or the maximum number of iterations, is reached, the best solution found for the MWVC problem is returned (line 31).
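The removal phase in lines 9 to 15 of the algorithm can be illustrated with a minimal Python sketch. It is not the authors' C implementation: the helper names (cover_weight, uncovered_edges, tighten) are hypothetical, the wconfig updates are omitted, and the pick argument is only a stand-in for Remove_Rule, which is defined elsewhere in the paper.

```python
def cover_weight(cover, weight):
    """Objective value w(C): total weight of the vertices in C."""
    return sum(weight[v] for v in cover)


def uncovered_edges(cover, edges):
    """Edges with neither endpoint in C; C is feasible iff this list is empty."""
    return [(u, v) for (u, v) in edges if u not in cover and v not in cover]


def tighten(cover, weight, edges, pick):
    """Sketch of lines 9-15: while C is feasible, record it as the best
    solution (C*, UB) and remove one vertex chosen by `pick`.
    `pick` is a hypothetical stand-in for Remove_Rule."""
    cover = set(cover)
    best, ub = None, None
    while not uncovered_edges(cover, edges):
        best, ub = set(cover), cover_weight(cover, weight)
        if not cover:  # graph without edges: nothing left to remove
            break
        cover.remove(pick(cover))
    return cover, best, ub


# Path graph 0 - 1 - 2 with a cheap middle vertex.
edges = [(0, 1), (1, 2)]
weight = {0: 5, 1: 1, 2: 5}
# Stand-in rule (an assumption): drop the heaviest vertex, smallest id first.
heaviest = lambda c: max(c, key=lambda v: (weight[v], -v))
cover, best, ub = tighten({0, 1, 2}, weight, edges, heaviest)
print(cover, best, ub)
```

On this toy instance the loop ends with an infeasible C and keeps the last feasible cover as C*, which is the state from which lines 16 to 28 continue.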
7. Experimental results

In this section, we present and discuss the results of experiments conducted by applying the proposed DLSWCC algorithm to a large number of MWVC benchmark problem instances. Further, DLSWCC is compared with several state-of-the-art algorithms proposed in the literature.

7.1 Benchmark instances
The experiments were run on a large number of benchmark instances, each consisting of an undirected vertex-weighted graph with n vertices and m edges. These instances were classified according to their number of vertices (n) into four different classes:
1. Class SPI: a class of small-scale problem instances (SPI) including 400 instances, with n taking values of {10, 15, 20, 25}.
2. Class MPI: a class of moderate-scale problem instances (MPI) including 710 instances, with n taking values of {50, 100, 150, 200, 250, 300}.
3. Class LPI: a class of large-scale problem instances (LPI) including 15 instances, with n taking values of {500, 800, 1000}.
4. Class Massive Graph Instances: a class consisting of 56 very large instances, with n taking values greater than 1000.
The instances of the classes SPI, MPI, and LPI were originally proposed in [54]. In each class, and for each n, a whole range of instances exists, including rather sparse graphs as well as rather dense graphs. The instances of the first two classes share the following characteristics: (i) 10 problem instances are randomly generated per combination of n and m, and the results are presented as an average over the objective function values obtained for the 10 instances. (ii) The weight w(v) of each vertex v ∈ V is randomly drawn from a uniform distribution, either from the interval [20, 120] (referred to as Type I) or from the interval [1, d(v)^2] (referred to as Type II), where d(v) is the degree of vertex v. In contrast, class LPI consists of only one problem instance per combination of n and m. The vertex weights are randomly drawn from a uniform distribution from the interval [20, 120].
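The two weighting schemes can be made concrete with a short Python sketch. This is not the generator used for the benchmark in [54]; it is an illustration under stated assumptions: the function name random_instance is hypothetical, weights are drawn as integers, edges are sampled uniformly without repetition, and an isolated vertex under Type II receives weight 1 because the interval [1, d(v)^2] would otherwise be empty.

```python
import random


def random_instance(n, m, weight_type="I", seed=None):
    """Random undirected graph with n vertices, m distinct edges, and
    vertex weights of Type I or Type II (integer weights are an assumption)."""
    if m > n * (n - 1) // 2:
        raise ValueError("too many edges for a simple graph")
    rng = random.Random(seed)

    # Sample m distinct edges (u, v) with u < v by rejection.
    edges = set()
    while len(edges) < m:
        u, v = rng.sample(range(n), 2)
        edges.add((min(u, v), max(u, v)))

    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    if weight_type == "I":
        # Type I: uniform in [20, 120].
        weight = [rng.randint(20, 120) for _ in range(n)]
    elif weight_type == "II":
        # Type II: uniform in [1, d(v)^2]; isolated vertices get weight 1.
        weight = [rng.randint(1, max(1, degree[v] ** 2)) for v in range(n)]
    else:
        raise ValueError("weight_type must be 'I' or 'II'")
    return sorted(edges), weight


edges, weight = random_instance(50, 100, "II", seed=1)
print(len(edges), min(weight), max(weight))
```

Under Type II, high-degree vertices can be far more expensive than low-degree ones, so selecting vertices by degree alone is a poor guide on these instances.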
Further, it should be noted that previous MWVC local search algorithms have mainly focused on academic benchmark problems. In these benchmarks, the number of vertices only takes values from 10 to 1000. However, the proliferation of the Internet, followed by various scientific advancements and the widespread deployment of sensors, has resulted in increasingly massive data sets. In many real-world applications, the number of vertices is greater than 1000 (greater than one million in some cases). Therefore, in this study, we collected a large number of massive graph instances from the Network Data Repository [52]. Some of these benchmarks have recently been used to test parallel algorithms for maximum clique and coloring problems [53][61]. The graphs in our experiments can be categorized into 10 groups: biological networks, collaboration networks, Facebook networks, interaction networks, infrastructure networks, Amazon recommendation networks, scientific computation networks, social networks, technological networks, and web link networks. The class of massive graph instances consists of one problem instance per combination of n and m. The vertex weights are randomly drawn from a uniform distribution from the interval [20, 120].
7.2 Comparison with state-of-the-art algorithms
We compared the performance of DLSWCC with that of state-of-the-art algorithms in solving the MWVC
problem. The considered state-of-the-art algorithms were the randomized gravitational emulation search (RGES)
algorithm [4], the ant colony optimization (ACO) algorithm [54], an improved ACO algorithm (ACO+SEE) [36],
the population-based iterated greedy (PBIG) algorithm [8], and a multi-start iterated tabu search (MS-ITS)
algorithm [69]. Note that the proposed algorithm was executed only once on each instance of class SPI and class
MPI, and 10 times on each instance of class LPI and each massive graph instance.
We programmed the DLSWCC algorithm in C and executed it on a PC with an Intel® Xeon® E7-4830 CPU (2.13 GHz). The executable files of the above-mentioned algorithms (except MS-ITS) have not been provided; hence, for the instances of the classes SPI, MPI, and LPI, we only compared the results of our algorithm with the experimental results reported for these algorithms. Section 7.9 compares our algorithm with MS-ITS for massive graph instances on the same computer. MS-ITS has been tested on a computer with an AMD A6-3400M APU (1.40 GHz), RGES has been tested on a PC with a PIV CPU (3.2 GHz) running Windows XP, PBIG has been executed on a cluster of PCs equipped with Intel® X3350 CPUs (2667 MHz), and ACO has been executed on a PC with an Intel® Core™2 Duo CPU (4.00 GHz). It should be noted that the machines used for most of the above-mentioned algorithms are faster than the machine used in this study. For each instance, each run of the DLSWCC algorithm was terminated when a given time limit (1000 s) or maximum number of iterations (1,000,000) was reached.
7.3 Effectiveness of WCC strategy

The aim of the first experiment was to evaluate the effectiveness of the WCC strategy. In this experiment, we compared the DLSWCC algorithm with two alternative algorithms: DLSNOCC and DLSECC. DLSNOCC works without the WCC strategy, i.e., during the adding procedure it selects the vertex with the greatest score, breaking ties in favor of the oldest one, without considering the WCC values. DLSECC works with a straightforward extension of the CC strategy instead of the WCC strategy. We tested the three algorithms on class LPI over 10 runs with different random seeds per instance.

In Table 1, the first two columns indicate the number of vertices (n) and the number of edges (m), and the following columns (Best, Avg) indicate the best objective values and average objective values for each algorithm. The row Avge denotes the average of the columns Best and Avg, the row Worse denotes the number of objective values obtained by the algorithm that are worse than those obtained by the DLSWCC algorithm, the row Better denotes the number of objective values that are better than those obtained by the DLSWCC algorithm, and the row Equal denotes the number of objective values that equal those obtained by the DLSWCC algorithm. For each combination, we compared the best performance (Best) and the average performance (Avg) of the three algorithms. The bold values indicate the best solution values obtained among the three algorithms. For the best performance, in all 15 cases, DLSWCC obtained the best results. DLSNOCC was only able to match the results of DLSWCC in 2 cases. DLSECC was able to match the results of DLSWCC in 9 cases. For the average performance, in all 15 cases, DLSWCC obtained the best results. Further, DLSNOCC could not match any of the results, while DLSECC was able to match the results of DLSWCC in 8 cases.

From the table, we can conclude that the DLSECC algorithm performs better than the DLSNOCC algorithm. This means that a direct extension of the original CC strategy does work, because this strategy can avoid some cycling search problems during the local search. However, under the tight restriction of the original configuration mechanism, some promising search spaces will be omitted. On the other hand, the WCC strategy can achieve a good trade-off between avoiding cycling search and improving the diversity of the algorithm. Compared with DLSECC, DLSWCC shows much better performance not only in terms of the average performance but also in
terms of the best performance.

Table 1 Results of Algorithm DLSNOCC, DLSECC and DLSWCC for instances of Class LPI.

Nodes  Edges   DLSNOCC              DLSECC               DLSWCC
               Best      Avg        Best      Avg        Best      Avg
500    500     12626     12629.7    12616     12616      12616     12616
500    1000    16475     16480.6    16465     16465      16465     16465
500    2000    20891     20981.7    20865     20867      20863     20866.2
500    5000    27247     27520.5    27241     27241      27241     27241
500    10000   29573     29624.9    29573     29573      29573     29573
800    500     15053     15053      15025     15025      15025     15025
800    1000    22747     22757.3    22747     22747      22747     22747
800    2000    31436     31554.2    31304     31307.5    31301     31305
800    5000    38722     38809.9    38553     38560      38553     38569.1
800    10000   44509     44623.8    44356     44356      44351     44353.9
1000   1000    24757     24783.1    24723     24723      24723     24723
1000   5000    45369     45388.8    45255     45255.5    45203     45238.9
1000   10000   51649     51861.3    51402     51422      51378     51380.4
1000   15000   58208     58458.8    58007     58019      57994     57995
1000   20000   59890     60025.4    59678     59684      59651     59655.3
Avge           33276.8   33370.2    33187.33  33190.73   33178.9   33183.6
Worse          13        15         6         7          -         -
Better         0         0          0         0          -         -
Equal          2         0          9         8          -         -
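The Worse, Better, and Equal rows are simple tallies against the DLSWCC column. The following Python sketch shows the counting for the Best column of DLSNOCC, using the 15 values listed in Table 1; the function name tally is hypothetical, and smaller objective values are better because MWVC is a minimization problem.

```python
def tally(candidate, reference):
    """Count instances where `candidate` is worse (larger), better (smaller),
    or equal to `reference`; smaller objective values are better."""
    worse = sum(c > r for c, r in zip(candidate, reference))
    better = sum(c < r for c, r in zip(candidate, reference))
    equal = sum(c == r for c, r in zip(candidate, reference))
    return worse, better, equal


# Best values from Table 1, in row order (n = 500, 800, 1000).
dlsnocc_best = [12626, 16475, 20891, 27247, 29573,
                15053, 22747, 31436, 38722, 44509,
                24757, 45369, 51649, 58208, 59890]
dlswcc_best = [12616, 16465, 20863, 27241, 29573,
               15025, 22747, 31301, 38553, 44351,
               24723, 45203, 51378, 57994, 59651]

print(tally(dlsnocc_best, dlswcc_best))  # (13, 0, 2)
```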
7.4 Effectiveness of scoring strategy

We conducted an experiment to evaluate the effectiveness of the dynamic scoring strategy introduced in Section 3. In this experiment, we compared the DLSWCC algorithm with an alternative algorithm, namely DLSWCC_STATIC, which works with a static scoring strategy, i.e., at the end of each loop during the local search process, the weight of an edge e ∈ E is not updated even if e is uncovered by the candidate solution. We tested the two algorithms on class LPI over 10 runs with different random seeds per instance.
Table 2 Results of Algorithm DLSWCC_STATIC and DLSWCC for instances of Class LPI.

Nodes  Edges   DLSWCC_STATIC        DLSWCC
               Best      Avg        Best      Avg
500    500     12870     12930.4    12616     12616
500    1000    16789     16789.0    16465     16465
500    2000    21454     21508.0    20863     20866.2
500    5000    27850     28136.9    27241     27241
500    10000   29923     30230.6    29573     29573
800    500     15262     15262.0    15025     15025
800    1000    23175     23175.0    22747     22747
800    2000    32234     32235.9    31301     31305
800    5000    40153     40252.0    38553     38569.1
800    10000   45249     45414.6    44351     44353.9
1000   1000    25260     25348.2    24723     24723
1000   5000    46262     46273.8    45203     45238.9
1000   10000   52776     52985.3    51378     51380.4
1000   15000   59000     59207.6    57994     57995
1000   20000   60050     60460.0    59651     59655.3
Avge           33887.1   34013.9    33178.9   33183.6
Worse          15        15         -         -
Better         0         0          -         -
Equal          0         0          -         -
In Table 2, the columns Nodes, Edges, Best, and Avg have the same meanings as those in Table 1. The rows Avge, Worse, Better, and Equal also have the same meaning as those in Table 1. For each combination, we compared the best performance (Best) and the average performance (Avg) of the two algorithms. The bold values indicate the better solution values obtained between the two algorithms compared.

As discussed in Section 3, in the static scoring mechanism, the vertex score does not change throughout the search process; thus, it might be easier for the search to be trapped in a local minimum. However, in the dynamic scoring strategy, the scores of the vertices related to the uncovered edges change dynamically, thereby enabling the local search to jump out of local optima. The experimental results listed in Table 2 provide further evidence regarding the effectiveness of the dynamic scoring mechanism. For both the best performance and the average performance, in all 15 cases, the algorithm with the dynamic scoring mechanism obtained better results than the algorithm with the static scoring mechanism.
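The effect of the dynamic edge weights on vertex scores can be illustrated with a small Python sketch. The scoring function here is a simplification and an assumption: the score of a vertex outside C is taken as the total dynamic_weight of its uncovered incident edges, without the vertex-weight terms of the scoring function defined in Section 3. The names add_scores and bump_uncovered are hypothetical.

```python
def add_scores(cover, edges, dyn):
    """Simplified score of each vertex outside C: total dynamic weight of
    the uncovered edges that adding the vertex would cover."""
    score = {}
    for (u, v) in edges:
        if u not in cover and v not in cover:
            score[u] = score.get(u, 0) + dyn[(u, v)]
            score[v] = score.get(v, 0) + dyn[(u, v)]
    return score


def bump_uncovered(cover, edges, dyn):
    """Line 25 of the algorithm: increase dynamic_weight of each uncovered edge by 1."""
    for (u, v) in edges:
        if u not in cover and v not in cover:
            dyn[(u, v)] += 1


# Hub vertex 0 with two edges, plus a separate edge (3, 4).
edges = [(0, 1), (0, 2), (3, 4)]
dyn = {e: 1 for e in edges}

static_scores = add_scores(set(), edges, dyn)   # hub 0 scores highest
# Edge (3, 4) stays uncovered for two iterations while C = {0}.
bump_uncovered({0}, edges, dyn)
bump_uncovered({0}, edges, dyn)
dynamic_scores = add_scores(set(), edges, dyn)  # endpoints 3 and 4 now lead
print(static_scores, dynamic_scores)
```

With static weights the hub always dominates the selection; after two bumps the endpoints of the persistently uncovered edge overtake it, which is the "hard to cover" effect that the dynamic strategy relies on.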
We then used two representative instances (i.e., n = 1000, m = 1000 and n = 1000, m = 10000) to further investigate the influence of an important component of the proposed DLSWCC algorithm, i.e., the dynamic scoring strategy described in Section 3. The evolution of the objective value with the number of iterations is shown in Fig. 2. As can be seen in this figure, DLSWCC can jump out of local optima and obtain a better solution, while DLSWCC_STATIC is trapped in a local optimum and can only find worse solutions. This experiment clearly demonstrates the importance of the proposed dynamic scoring strategy.
[Figure 2: line plots of the objective value versus the number of iterations (0 to 40000) for DLSWCC_STATIC and DLSWCC; panel (a) Instance 1 (n = 1000, m = 1000), panel (b) Instance 2 (n = 1000, m = 10000).]
Fig. 2 The evolution of the objective value with the number of iterations.
7.5 Efficiency of fast incremental evaluation technique

We now discuss and analyze the importance of the fast incremental evaluation technique of the proposed DLSWCC algorithm. The main component of the DLSWCC algorithm is the vertex selection strategy. For a local search method, it is particularly important to rapidly determine the vertex to be selected and added into or removed from the candidate solution. As described in Section 3, we propose a fast evaluation technique for updating the score of each vertex after changing the state of a vertex. In other words, once a change is performed, only the scores of the specific vertices affected by the change are updated, instead of recalculating the scores of all vertices.

In order to evaluate the effectiveness of this fast evaluation technique, we carried out computational experiments to compare the performance of the DLSWCC algorithm with and without this technique (denoted by DLSWCC+F and DLSWCC-F, respectively) on two representative instances (i.e., n = 1000, m = 1000 and n = 1000, m = 10000).

For both instances, our algorithms were independently run 10 times. The evolution of the average CPU time with the number of iterations is shown in Fig. 3. Specifically, for Instance 1 (n = 1000, m = 1000) and Instance 2 (n = 1000, m = 10000), it can be clearly seen that the average CPU time of DLSWCC+F is respectively around 3 and 5 times shorter than that of DLSWCC-F. This experiment clearly demonstrates that the proposed fast incremental evaluation technique plays an important role in the efficiency of our algorithm.
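The idea behind incremental evaluation can be sketched in Python as follows. This is an illustration, not the authors' code: the score used here is the simplified edge-weight score (the change in total dynamic weight of uncovered edges if the vertex is flipped), the vertex-weight terms of the paper's scoring function are omitted, and the names build, full_scores, and flip are hypothetical. Flipping a vertex v changes only the scores of v and its neighbors, so the update costs O(d(v)) instead of a pass over the whole graph.

```python
from collections import defaultdict


def build(edges):
    """Adjacency sets and dynamic edge weights (all initialized to 1)."""
    adj, dyn = defaultdict(set), {}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
        dyn[frozenset((u, v))] = 1
    return adj, dyn


def full_scores(cover, adj, dyn):
    """Recompute every score from scratch (the slow baseline).
    S(v) = dynamic weight of edges (v, u) with u outside C;
    score(v) = S(v) if v is outside C, and -S(v) if v is inside C."""
    score = {}
    for v in adj:
        s = sum(dyn[frozenset((v, u))] for u in adj[v] if u not in cover)
        score[v] = -s if v in cover else s
    return score


def flip(v, cover, adj, dyn, score):
    """Add v to C (or remove it) and update only v and its neighbors."""
    adding = v not in cover
    if adding:
        cover.add(v)
    else:
        cover.remove(v)
    score[v] = -score[v]            # S(v) is unchanged, only the sign flips
    for u in adj[v]:
        w_e = dyn[frozenset((v, u))]
        delta = -w_e if adding else w_e      # change in S(u)
        score[u] += -delta if u in cover else delta


edges = [(0, 1), (1, 2), (2, 3), (0, 3), (1, 3)]
adj, dyn = build(edges)
dyn[frozenset((1, 3))] = 4          # pretend this edge was often uncovered
cover = set()
score = full_scores(cover, adj, dyn)
for v in [1, 3, 1, 0, 2, 3]:
    flip(v, cover, adj, dyn, score)
    assert score == full_scores(cover, adj, dyn)
print(cover, score)
```

The loop at the end checks the incremental update against a full recomputation after every move, which is a convenient way to validate such a routine before relying on it inside the search.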
[Figure 3: average CPU time in seconds versus the number of iterations (×10^3) for DLSWCC+F and DLSWCC-F; panel (a) Instance 1 (n = 1000, m = 1000), panel (b) Instance 2 (n = 1000, m = 10000).]
Fig. 3 Performance comparison of DLSWCC algorithm with and without the fast evaluation technique
7.6 Experimental results on SPI instances

In order to further confirm the effectiveness of our algorithm, we compared DLSWCC with other state-of-the-art algorithms in terms of solution quality and computational time for class SPI, as shown in Tables 3 and 4. The first two columns of both tables indicate the number of vertices (n) and the number of edges (m). The column OPT represents the optimal solution values (averaged over 10 instances) as originally provided in [54]. The columns AVG and TIME represent the average objective values and average computing time in seconds for each algorithm. The average computing time of ACO+SEE has not been provided in [36]. Note that the values equal to the optima are marked in bold. The row Avge represents the average of the columns AVG and TIME, while the rows Worse, Better, and Equal have the same meaning as those in Table 1.

From Tables 3 and 4, we can observe that DLSWCC, MS-ITS, and PBIG obtained similar results in terms of both AVG and TIME; moreover, these three algorithms yielded better solutions than the other algorithms (ACO and ACO+SEE). Specifically, the ACO algorithm matched the optimal solutions for 10 out of 20 instances under Type I and 6 out of 20 instances under Type II, while the ACO+SEE algorithm matched the optimal solutions for 17 out of 20 instances under Type I and 15 out of 20 instances under Type II. Further, DLSWCC, MS-ITS, and PBIG matched the optimal solutions for all the 40 instances of the two types. In addition, DLSWCC, MS-ITS, and PBIG outperformed the ACO algorithm on most of the instances in terms of computational time (requiring no more than 0.001 s).

In summary, these algorithms (except for the ant colony optimization algorithms) can obtain the best values of the instances. This might be because these instances are all relatively small in terms of the number of nodes (no more than 25). Therefore, these instances are relatively easy to solve for these algorithms.
Table 3 Results for instances of class SPI (Type I).

Nodes  Edges  OPT      ACO               ACO+SEE  PBIG              MS-ITS            DLSWCC
                       AVG      TIME     AVG      AVG      TIME     AVG      TIME     AVG      TIME
10     10     284.0    284.0    0.000    284.0    284.0    0.000    284.0    0.000    284.0    0.000
10     20     398.7    398.7    0.008    398.7    398.7    0.000    398.7    0.000    398.7    0.000
10     30     431.3    431.3    0.003    431.3    431.3    0.000    431.3    0.000    431.3    0.000
10     40     508.5    508.5    0.003    508.5    508.5    0.000    508.5    0.000    508.5    0.000
15     20     441.9    441.9    0.005    441.9    441.9    0.000    441.9    0.000    441.9    0.000
15     40     570.4    574.2    0.011    570.4    570.4    0.000    570.4    0.000    570.4    0.000
15     60     726.2    729      0.008    726.2    726.2    0.000    726.2    0.000    726.2    0.000
15     80     807.5    814.6    0.010    807.5    807.5    0.000    807.5    0.000    807.5    0.000
15     100    880.0    880.0    0.008    880.0    880.0    0.000    880.0    0.000    880.0    0.000
20     20     473.0    473.0    0.005    473.0    473.0    0.000    473.0    0.000    473.0    0.000
20     40     659.3    661.4    0.016    660.3    659.3    0.000    659.3    0.001    659.3    0.000
20     60     861.8    861.8    0.014    861.8    861.8    0.000    861.8    0.001    861.8    0.000
20     80     898.0    905.4    0.016    899.9    898.0    0.000    898.0    0.001    898.0    0.000
20     100    1026.2   1026.8   0.016    1026.2   1026.2   0.000    1026.2   0.001    1026.2   0.000
20     120    1038.2   1041.5   0.017    1038.2   1038.2   0.000    1038.2   0.001    1038.2   0.000
25     40     756.6    756.6    0.019    756.6    756.6    0.000    756.6    0.000    756.6    0.000
25     80     1008.1   1009.6   0.022    1008.1   1008.1   0.000    1008.1   0.001    1008.1   0.000
25     100    1106.9   1107.4   0.025    1109.1   1106.9   0.000    1106.9   0.000    1106.9   0.000
25     150    1264.0   1264.0   0.031    1264.0   1264.0   0.000    1264.0   0.000    1264.0   0.000
25     200    1373.4   1377.7   0.030    1373.4   1373.4   0.000    1373.4   0.001    1373.4   0.000
Avge          -        777.4    0.013    776      775.7    0.000    775.7    0.000    775.7    0.000
Worse         -        10       -        3        0        -        0        -        -        -
Better        -        0        -        0        0        -        0        -        -        -
Equal         -        10       -        17       20       -        20       -        -        -
Table 4 Results for instances of class SPI (Type II).

Nodes  Edges  OPT      ACO               ACO+SEE  PBIG              MS-ITS            DLSWCC
                       AVG      TIME     AVG      AVG      TIME     AVG      TIME     AVG      TIME
10     10     18.8     18.8     0.003    18.8     18.8     0.000    18.8     0.000    18.8     0.000
10     20     51.1     51.1     0.003    51.1     51.1     0.000    51.1     0.000    51.1     0.000
10     30     127.9    127.9    0.003    127.9    127.9    0.000    127.9    0.000    127.9    0.000
10     40     268.3    268.3    0.010    268.3    268.3    0.000    268.3    0.000    268.3    0.000
15     20     34.7     34.7     0.005    34.7     34.7     0.000    34.7     0.000    34.7     0.000
15     40     170.5    171.5    0.010    170.5    170.5    0.000    170.5    0.000    170.5    0.000
15     60     360.5    360.8    0.008    360.5    360.5    0.000    360.5    0.000    360.5    0.000
15     80     697.9    698.7    0.014    697.9    697.9    0.000    697.9    0.000    697.9    0.000
15     100    1130.4   1137.8   0.008    1130.4   1130.4   0.001    1130.4   0.000    1130.4   0.000
20     20     32.9     33.0     0.011    32.9     32.9     0.000    32.9     0.001    32.9     0.000
20     40     111.6    111.8    0.017    111.8    111.6    0.000    111.6    0.000    111.6    0.000
20     60     254.1    254.4    0.016    254.1    254.1    0.000    254.1    0.001    254.1    0.000
20     80     452.2    453.1    0.016    452.3    452.2    0.000    452.2    0.001    452.2    0.000
20     100    775.2    775.2    0.016    775.2    775.2    0.000    775.2    0.000    775.2    0.000
20     120    1123.1   1125.5   0.017    1123.1   1123.1   0.001    1123.1   0.001    1123.1   0.000
25     40     98.7     98.8     0.025    98.7     98.7     0.000    98.7     0.000    98.7     0.000
25     80     372.7    373.3    0.026    373.0    372.7    0.000    372.7    0.001    372.7    0.000
25     100    595.0    595.1    0.028    595.1    595.0    0.000    595.0    0.001    595.0    0.000
25     150    1289.9   1291.7   0.030    1290.9   1289.9   0.000    1289.9   0.001    1289.9   0.000
25     200    2709.5   2713.1   0.030    2709.5   2709.5   0.000    2709.5   0.001    2709.5   0.000
Avge          -        534.7    0.015    533.8    533.8    0.000    533.8    0.000    533.8    0.000
Worse         -        14       -        5        0        -        0        -        -        -
Better        -        0        -        0        0        -        0        -        -        -
Equal         -        6        -        15       20       -        20       -        -        -
7.7 Experimental results on MPI instances

The results for the problem instances of class MPI are summarized in Tables 5 and 6, in the same format as described for the instances of class SPI in Table 3. The additional column RGES in Table 6 represents the results of the RGES algorithm, which was only applied to the Type II instances of class MPI. Note that the column OPT is missing because the optimal solutions are still unknown for the problem instances of MPI, LPI, and the massive graph instances.

As shown in Tables 5 and 6, DLSWCC finds solutions no worse than the previous best known solutions, except for the case with n = 200, m = 100 under Type II, and its running times are much shorter than those reported for the other state-of-the-art algorithms. Specifically, RGES matched the best solutions for 2 out of 32 instances and obtained a better solution than the other algorithms for one case (n = 200, m = 100) under Type II. ACO matched the best solutions for 0 out of 39 instances under Type I and 2 out of 32 instances under Type II. ACO+SEE matched the best solutions for 0 out of 39 instances under Type I and 2 out of 32 instances under Type II. PBIG matched the best solutions for 12 out of 39 instances under Type I and 18 out of 32 instances under Type II. MS-ITS matched the best solutions for 26 out of 39 instances under Type I and 20 out of 32 instances under Type II. DLSWCC matched the best solutions for 39 out of 39 instances under Type I and 31 out of 32 instances under Type II.

More importantly, DLSWCC could dominate 22 out of 71 cases, i.e., DLSWCC obtained new upper bounds for 22 of the 71 instances. Under Type I, ACO had an average computational time of 2.338 s, PBIG had an average computational time of 2.489 s, MS-ITS had an average computational time of 0.511 s, and DLSWCC had an average computational time of 0.028 s. Under Type II, ACO had an average computational time of 3.071 s, PBIG had an average computational time of 4.230 s, MS-ITS had an average computational time of 0.452 s, and DLSWCC had an average computational time of 0.044 s. The DLSWCC algorithm is thus faster than the other algorithms under both types.

A careful observation showed that, when the instances are relatively small, i.e., the number of nodes is less than 200, the PBIG and MS-ITS algorithms can still find the best known values of the instances. However, as the size of the graph increases, the previous algorithms seem unable to find the best known solutions, while our algorithm can still find them. This is because, when the size of the graph increases, it is easier for an algorithm to be trapped in a local optimum or cycling search; the proposed WCC strategy and dynamic scoring mechanism can effectively overcome such problems. Moreover, from the time efficiency aspect, DLSWCC shows the best performance among all the algorithms in solving moderate-scale problem instances under both
Type I and Type II.

Table 5 Results for instances of Class MPI (Type I).

Nodes  Edges  ACO                ACO+SEE   PBIG               MS-ITS             DLSWCC
              AVG       TIME     AVG       AVG       TIME     AVG       TIME     AVG       TIME
50     50     1282.1    0.063    1280.9    1280.0    0.016    1280.0    0.001    1280.0    0.000
50     100    1741.1    0.083    1740.7    1735.3    0.007    1735.3    0.002    1735.3    0.000
50     250    2287.4    0.097    2280.6    2272.3    0.006    2272.3    0.002    2272.3    0.000
50     500    2679.0    0.102    2669.3    2661.9    0.003    2661.9    0.002    2661.9    0.000
50     750    2959.0    0.125    2957.3    2951.0    0.027    2951.0    0.002    2951.0    0.000
50     1000   3211.2    0.117    3199.8    3193.7    0.019    3193.7    0.003    3193.7    0.000
100    100    2552.9    0.273    2544.0    2537.6    0.019    2534.2    0.027    2534.2    0.000
100    250    3626.4    0.367    3614.9    3602.7    0.057    3601.6    0.010    3601.6    0.000
100    500    4692.1    0.433    4636.4    4600.6    0.182    4600.6    0.047    4600.6    0.000
100    750    5076.4    0.502    5082.8    5045.5    0.088    5045.5    0.059    5045.5    0.000
100    1000   5534.1    0.456    5522.7    5509.4    0.084    5508.2    0.135    5508.2    0.000
100    2000   6095.7    0.589    6068.3    6051.9    0.501    6051.9    0.011    6051.9    0.000
150    150    3684.9    0.691    3676.8    3667.3    0.088    3667.0    0.030    3666.9    0.001
150    250    4769.7    0.891    4754.9    4720.3    0.116    4719.9    0.090    4719.9    0.001
150    500    6224.0    1.194    6228.7    6165.7    0.294    6165.4    0.234    6165.4    0.005
150    750    7014.7    1.042    6996.3    6963.7    0.522    6967.0    0.118    6956.4    0.003
150    1000   7441.8    1.206    7383.6    7368.8    0.536    7359.7    0.190    7359.7    0.006
150    2000   8631.2    1.103    8597.2    8562.0    0.824    8549.4    0.317    8549.4    0.010
150    3000   8950.2    0.966    8940.2    8899.8    1.300    8899.8    0.080    8899.8    0.009
200    250    5588.7    1.674    5572.4    5551.9    0.157    5551.6    0.108    5551.6    0.005
200    500    7259.2    2.160    7233.7    7192.4    0.547    7195.1    0.120    7191.9    0.010
200    750    8349.8    2.602    8300.3    8274.5    0.664    8269.9    0.283    8269.9    0.006
200    1000   9262.2    2.221    9208.4    9150.6    1.019    9150.0    0.777    9145.5    0.012
200    2000   10916.5   2.437    10891.1   10831.0   2.726    10830.0   0.650    10830.0   0.024
200    3000   11689.1   2.497    11680.4   11600.2   2.866    11599.6   0.596    11595.8   0.015
250    250    6197.8    2.273    6169.2    6148.7    0.390    6148.7    0.109    6148.7    0.014
250    500    8538.8    4.016    8495.9    8440.7    1.344    8438.8    1.137    8436.2    0.016
250    750    9869.4    4.047    9815.5    9752.8    2.447    9745.9    0.521    9745.9    0.214
250    1000   10866.6   3.755    10791.0   10753.7   2.371    10752.1   0.933    10751.7   0.037
250    2000   12917.7   3.942    12827.0   12757.6   3.471    12755.9   2.298    12751.5   0.033
250    3000   13882.5   4.276    13830.6   13723.5   3.233    13723.3   1.766    13723.3   0.057
250    5000   14801.8   3.842    14735.9   14676.7   22.508   14669.7   0.648    14669.7   0.043
300    300    7342.7    4.322    7326.6    7296.0    0.711    7295.8    0.469    7295.8    0.021
300    500    9517.4    5.178    9491.9    9403.1    1.596    9410.8    0.963    9403.1    0.024
300    750    11166.9   6.055    11156.5   11038.1   3.349    11032.0   0.698    11029.3   0.038
300    1000   12241.7   6.231    12163.7   12108.9   3.095    12107.7   0.720    12098.5   0.04
300    2000   14894.9   6.488    14834.6   14749.9   10.982   14737.7   3.230    14732.2   0.099
300    3000   16054.1   6.299    15910.5   15848.2   11.636   15841.4   0.949    15840.8   0.172
300    5000   17545.4   6.558    17479.8   17350.6   17.283   17342.9   1.605    17342.9   0.195
Avge          7881.0    2.338    7848.5    7806.1    2.489    7804.0    0.511    7802.8    0.028
Worse         39        -        39        27        -        13        -        -         -
Better        0         -        0         0         -        0         -        -         -
Equal         0         -        0         12        -        26        -        -         -
Table 6 Results for instances of Class MPI (Type II).

Nodes  Edges  RGES       ACO                 ACO+SEE    PBIG                MS-ITS              DLSWCC
              AVG        AVG        TIME     AVG        AVG        TIME     AVG        TIME     AVG        TIME
50     50     83.9       83.9       0.072    83.9       83.7       0.002    83.7       0.005    83.7       0.000
50     100    276.2      276.2      0.097    274.4      271.2      0.003    271.2      0.004    271.2      0.000
50     250    1886.4     1886.8     0.111    1870.3     1853.4     0.010    1853.4     0.003    1853.4     0.000
50     500    7914.5     7915.9     0.120    7876.7     7825.1     0.009    7825.1     0.008    7825.1     0.000
50     750    20134.1    20134.1    0.111    20087.6    20079.0    0.010    20079.0    0.003    20079.0    0.000
100    50     67.4       67.4       0.184    67.2       67.2       0.002    67.2       0.010    67.2       0.000
100    100    169.1      169.1      0.334    167.8      166.6      0.017    166.6      0.023    166.6      0.000
100    250    890.4      901.7      0.514    895.3      886.5      0.065    886.5      0.039    886.5      0.002
100    500    3725.3     3726.7     0.481    3707.0     3693.6     0.101    3693.6     0.031    3693.6     0.000
100    750    8745.5     8754.5     0.444    8742.3     8680.2     0.129    8680.2     0.194    8680.2     0.000
150    50     65.8       65.8       0.292    65.9       65.8       0.001    65.8       0.010    65.8       0.000
150    100    144.7      144.7      0.583    144.1      144.0      0.026    144.0      0.065    144.0      0.000
150    250    624.4      625.7      1.387    624.8      616        0.187    615.8      0.199    615.8      0.014
150    500    2365.2     2375.0     1.908    2358.6     2331.5     0.572    2331.5     0.177    2331.5     0.006
150    750    5798.6     5799.2     1.295    5707.0     5698.7     0.550    5698.5     0.244    5698.5     0.008
200    50     59.6       59.6       0.463    59.6       59.6       0.001    59.6       0.016    59.6       0.000
200    100    132.6      134.7      0.981    134.6      134.5      0.021    134.5      0.050    134.5      0.000
200    250    488.4      488.7      2.413    487.9      483.1      0.280    484.5      0.300    483.1      0.013
200    500    1843.6     1843.6     3.423    1818.7     1804.3     1.286    1803.9     0.375    1803.9     0.062
200    750    4112.8     4112.8     3.600    4077.0     4043.6     0.768    4043.5     0.452    4043.5     0.040
250    250    423.2      423.2      3.311    421.2      419.0      0.476    419.0      0.129    419.0      0.012
250    500    1457.4     1457.4     5.781    1454.3     1435.7     4.219    1434.7     0.869    1434.2     0.201
250    750    3315.9     3315.9     5.983    3289.4     3261.0     2.935    3256.4     0.459    3256.1     0.115
250    1000   6058.2     6058.2     6.297    6040.0     5989.4     7.978    5988.2     0.259    5986.4     0.120
250    2000   26149.1    26149.1    4.859    25932.1    25658.5    11.809   25646.4    1.698    25636.5    0.066
250    5000   171917.2   171917.2   4.856    171500.7   170269.1   37.770   170269.1   1.298    170269     0.052
300    250    403.9      403.9      5.372    402.7      399.5      0.353    399.6      0.397    399.4      0.016
300    500    1239.1     1239.1     9.155    1237.3     1216.4     5.439    1217.2     0.527    1216.4     0.074
300    750    2678.2     2678.2     10.994   2674.1     2639.4     6.545    2640.6     0.362    2639.3     0.259
300    1000   4895.5     4895.5     9.045    4867.9     4796.3     15.204   4796.2     0.993    4795.0     0.203
300    2000   21295.2    21295.2    7.242    21107.7    20891.6    10.911   20886.4    1.834    20881.3    0.056
300    3000   143243.5   143243.5   6.553    142292.6   141265.3   27.674   141226.8   3.418    141220.4   0.084
Avge          13831.4    13832.6    3.071    13764.7    13663.4    4.230    13661.52   0.452    13660.6    0.044
Worse         29         30         -        30         14         -        12         -        -          -
Better        1          0          -        0          0          -        0          -        -          -
Equal         2          2          -        2          18         -        20         -        -          -
7.8 Experimental results on LPI instances
CR IP T
Table 7 compares DLSWCC with ACO+SEE, PBIG, and MS-ITS for instances of the class LPI. The
instances of this class are larger than those of class SPI and class MPI. MS-ITS is the best available algorithm for
these instances. In contrast to classes SPI and MPI, class LPI only contains a single instance per combination of n
AN US
and m. Therefore, the results are given as averages over 10 runs with different random seeds per instance. For each
of the four algorithms, the columns Best, Avg, and Time have the same meaning as those in Table 1. The best value
among the four is indicated in bold.
As can be seen in the table, DLSWCC obtained the best results in terms of both best performance and average
performance. Moreover, DLSWCC dominated the other algorithms in 5 out of 15 cases, i.e., DLSWCC obtained new upper bounds
for 5 of these instances. In particular, when the number of nodes was equal to 1000, the previous algorithms could only
match one best value of the instances.
We can see from the table that, when the size of the instances increases, our algorithm performs better in
terms of both the best solution and the average solution. Because cycling search and local minima are more
common for large-scale instances, the experimental results provide further evidence that the proposed WCC strategy and
dynamic scoring strategy can avoid cycling search efficiently and enable the algorithm to jump out of local
minima.
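The Worse/Better/Equal summary rows of Table 7 count, for each competitor, the instances on which it is worse than, better than, or equal to DLSWCC. A minimal sketch of that count is given below, using the Best values of PBIG, MS-ITS, and DLSWCC from Table 7 in row order; the function name `compare_to_reference` is an assumption introduced only for illustration.

```python
def compare_to_reference(other, reference):
    """Count instances where `other` is worse than, better than, or equal to
    `reference`. The problem is a minimization, so a larger weight is worse."""
    worse = sum(o > r for o, r in zip(other, reference))
    better = sum(o < r for o, r in zip(other, reference))
    equal = sum(o == r for o, r in zip(other, reference))
    return worse, better, equal


# Best values from Table 7 (class LPI), in row order.
PBIG = [12616, 16465, 20863, 27318, 29573, 15025, 22747, 31355, 38665, 44396,
        24746, 45255, 51378, 58014, 59790]
MS_ITS = [12623, 16480, 20863, 27241, 29573, 15046, 22760, 31309, 38553, 44351,
          24735, 45230, 51378, 58014, 59675]
DLSWCC = [12616, 16465, 20863, 27241, 29573, 15025, 22747, 31301, 38553, 44351,
          24723, 45203, 51378, 57994, 59651]

if __name__ == "__main__":
    print(compare_to_reference(PBIG, DLSWCC))    # (8, 0, 7)
    print(compare_to_reference(MS_ITS, DLSWCC))  # (9, 0, 6)
```

The printed triples agree with the Worse, Better, and Equal entries reported for the Best columns of PBIG and MS-ITS.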
Table 7 Results for instances of Class LPI
                 ACO + SEE            PBIG                            MS-ITS                          DLSWCC
Nodes   Edges    Best     Avg         Best      Avg       Time       Best      Avg       Time       Best      Avg       Time
500     500      12675    12687.7     12616     12620.0   1.601      12623     12635.0   3.057      12616     12616.0   5.184
500     1000     16516    16574.9     16465     16470.1   10.240     16480     16483.1   9.348      16465     16465.0   0.826
500     2000     21000    21093.0     20863     20870.8   9.283      20863     20866.9   9.606      20863     20866.2   11.024
500     5000     27294    27585.5     27318     27428.2   34.707     27241     27241.0   9.103      27241     27241.0   5.263
500     10000    29573    29796.4     29573     29666.8   36.405     29573     29573.0   36.250     29573     29573.0   14.930
800     500      15049    15069.9     15025     15025.0   2.535      15046     15054.1   6.726      15025     15025.0   0.442
800     1000     22792    22852.1     22747     22763.0   11.589     22760     22760.0   13.994     22747     22747.0   1.589
800     2000     31680    31786.9     31355     31422.6   58.008     31309     31345.7   40.967     31301     31305.0   1.725
800     5000     38830    38906.7     38665     38718.7   113.842    38553     38557.1   67.096     38553     38569.1   2.812
800     10000    44499    44691.7     44396     44397.8   96.467     44351     44359.9   93.750     44351     44353.9   0.824
1000    1000     24856    24925.4     24746     24763.1   14.435     24735     24766.1   19.281     24723     24723.0   5.500
1000    5000     45446    45588.7     45255     45295.4   178.311    45230     45256.9   113.739    45203     45238.9   7.912
1000    10000    51875    52105.0     51378     51540.9   325.956    51378     51423     209.673    51378     51380.4   9.680
1000    15000    58394    58654.8     58014     58145.2   363.179    58014     58068.9   242.143    57994     57995.0   7.098
1000    20000    60010    60268.2     59790     59847.9   647.563    59675     59719.9   243.324    59651     59655.3   3.333
Avg              33366    33505.8     33213.7   33265.0   126.941    33188.7   33207.4   74.803     33178.9   33183.6   5.210
Worse            14       15          8         14                   9         12
Better           0        0           0         0                    0         1
Equal            1        0           7         1                    6         2
7.9 Experimental results on massive graph instances
We carried out extensive experiments to evaluate DLSWCC on a broad range of real-world graphs and
compared it with MS-ITS, which clearly performs better than the other algorithms on the academic instances. The
two algorithms were run under the same experimental conditions. The instances of the class massive graph
instances are the largest ones in the entire benchmark set. They contain a single instance per combination of n and
m. Therefore, the results are given as averages over 10 runs with different random seeds per instance.
Table 8 lists the solutions of DLSWCC and MS-ITS for the class massive graph instances. The first column
lists the names of the instances. The next two columns indicate the number of vertices (n) and the number of edges
(m). For each of the two algorithms, the columns Best, Avg, and Time have the same meaning as those in Table 1.
Note that the bold value indicates the better solution value obtained between the two algorithms compared. For
some instances, MS-ITS failed to find a vertex cover; for these cases, the column entry for MS-ITS is marked as
"N/A".
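To make the "N/A" convention concrete, the sketch below checks whether a returned vertex set actually covers every edge and reports either its total weight or "N/A". The helper names (`is_vertex_cover`, `cover_weight`, `report`) are assumptions introduced for illustration and are not taken from either solver.

```python
def is_vertex_cover(edges, cover):
    """True if every edge has at least one endpoint in `cover`."""
    return all(u in cover or v in cover for u, v in edges)


def cover_weight(weights, cover):
    """Total weight of the vertices in `cover`."""
    return sum(weights[v] for v in cover)


def report(edges, weights, cover):
    """Return the cover weight, or "N/A" if no valid cover was produced."""
    if cover is None or not is_vertex_cover(edges, cover):
        return "N/A"
    return cover_weight(weights, cover)


if __name__ == "__main__":
    edges = [(0, 1), (1, 2)]
    weights = {0: 2, 1: 3, 2: 4}
    print(report(edges, weights, {1}))   # vertex 1 touches both edges
    print(report(edges, weights, {0}))   # edge (1, 2) is left uncovered
    print(report(edges, weights, None))  # the solver returned nothing
```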
We can observe that for all the instances, the number of nodes was greater than 1000 (greater than 500,000 in
some cases). Although MS-ITS shows much better performance than the other algorithms on the academic instances, it
cannot solve many instances of these graphs, as discussed previously. Of the 56 instances, MS-ITS could solve only 27,
and the largest instance solved by MS-ITS had 21363 nodes. Moreover, DLSWCC found best results at least as good as those of MS-ITS on every instance, and MS-ITS was able to match DLSWCC in only 4 relatively small cases
(graphs ca-GrQc, ia-fb-messages, ia-reality, and web-google) for the best performance. The experimental results
provide evidence that both the WCC strategy and the dynamic scoring strategy are efficient tools for solving
large-scale real-world MWVC problems, owing to their ability to avoid local minima.
Table 8 Results for massive graph instances.
                                          MS-ITS                               DLSWCC
Graph                 Nodes     Edges     Best      Avg        Time         Best      Avg         Time
bio-dmela             7393      25569     149452    149556.8   2126.409     148508    148540.4    135.42
bio-yeast             1458      1948      24269     24290.0    53.883       24265     24265.0     3.81
ca-AstroPh            17903     196972    662655    662926.5   16156.917    646529    647019.1    420.94
ca-citeseer           227320    814134    N/A       N/A        N/A          7048010   7048225.5   792.42
ca-CondMat            21363     91286     704287    704798.5   9860.340     685813    686344.3    489.20
ca-CSphd              1882      1740      29550     29609.8    1.233        29390     29390.0     3.67
ca-dblp-2010          226413    716460    N/A       N/A        N/A          6618986   6619251.8   496.89
ca-dblp-2012          317080    1049866   N/A       N/A        N/A          8986085   8986982.4   1106.33
ca-Erdos992           6100      7515      28303     28303.0    8.424        28298     28298.0     0.15
ca-GrQc               4158      13422     122330    122331.5   674.020      122278    122332.5    95.51
ca-HepPh              11204     117619    372836    373069.8   10475.341    365251    365530.8    308.65
ca-MathSciNet         332689    820644    N/A       N/A        N/A          7668338   7668818.0   838.24
ia-email-EU           32430     54397     N/A       N/A        N/A          48269     48269.0     5.98
ia-email-univ         1133      5451      32931     32933      91.708       32931     32931.0     1.49
ia-enron-large        33696     180811    N/A       N/A        N/A          695112    695294.8    774.01
ia-fb-messages        1266      6451      32300     32316.5    44.546       32300     32300.1     2.27
ia-reality            6809      7680      4894      4894       10.436       4894      4894.0      0.03
ia-wiki-Talk          92117     360767    N/A       N/A        N/A          962030    962194.9    1194.68
inf-power             4941      6594      121386    121503.3   993.332      120116    120146.5    110.40
rec-amazon            91813     125704    N/A       N/A        N/A          2629821   2630671.0   1195.30
sc-nasasrb            54870     1311227   N/A       N/A        N/A          3004611   3005889.1   1021.60
sc-shipsec1           140385    1707759   N/A       N/A        N/A          6843870   6844747.6   2056.61
soc-brightkite        56739     212945    N/A       N/A        N/A          1187631   1187962.3   1162.34
soc-delicious         536108    1365961   N/A       N/A        N/A          4957627   4958206.4   1720.58
soc-douban            154908    327162    N/A       N/A        N/A          515270    515288.1    1111.14
soc-epinions          26588     100120    N/A       N/A        N/A          539569    539915.5    593.16
soc-gowalla           196591    950327    N/A       N/A        N/A          4729181   4729405.5   909.89
soc-slashdot          70068     358647    N/A       N/A        N/A          1247682   1248151.4   1182.44
soc-twitter-follows   404719    713319    N/A       N/A        N/A          135811    135811.0    314.75
socfb-Berkeley13      22900     852419    N/A       N/A        N/A          1011694   1011902.5   907.12
socfb-CMU             6621      249959    296930    297032.5   1011.864     292362    292428.8    110.98
socfb-Duke14          9885      506437    458100    460907.8   1629.423     450799    450898.3    312.80
socfb-Indiana         29732     1305757   N/A       N/A        N/A          1375506   1377961.4   1193.10
socfb-MIT             6402      251230    274242    274443.0   12093.173    272431    272472.4    171.22
socfb-OR              63392     816886    N/A       N/A        N/A          2114652   2116501     1169.49
socfb-Penn94          41536     1362220   N/A       N/A        N/A          1827780   1829265.1   1052.84
socfb-Stanford3       11586     568309    506903    507561.5   4388.662     495332    495411.3    479.12
socfb-UCLA            20453     747604    913929    915068.0   154.893      888489    888857.8    774.81
socfb-UConn           17206     604867    792021    793196.8   15843.684    771427    771744.5    638.52
socfb-UCSB37          14917     482215    677548    678029.5   5456.765     659407    659615.9    447.50
socfb-UIllinois       30795     1264421   N/A       N/A        N/A          1414900   1417140.9   1197.28
socfb-Wisconsin87     23831     835946    N/A       N/A        N/A          1071625   1072009.6   1081.07
tech-as-caida2007     26475     53381     N/A       N/A        N/A          200511    200755.8    357.47
tech-internet-as      40164     85123     N/A       N/A        N/A          312123    312308.4    490.53
tech-p2p-gnutella     62561     147878    N/A       N/A        N/A          917822    918207.3    1058.55
tech-RL-caida         190914    607610    N/A       N/A        N/A          4203838   4204531.8   414.25
tech-routers-rf       2113      6632      44919     44936.5    60.996       44894     44902.3     34.89
tech-WHOIS            7476      56943     128568    128588.0   6499.892     128337    128345.3    122.98
web-arabic-2005       163598    1747269   N/A       N/A        N/A          6572535   6573003.0   1855.93
web-BerkStan          12305     19500     292693    293081.0   2304.889     286665    286871.4    290.43
web-edu               3031      6474      79499     79545.5    242.268      79078     79100.8     47.24
web-google            1299      2773      27842     27842.0    0.7585       27842     27842.0     2.56
web-indochina-2004    11358     47606     409686    409765.0   3187.944     405419    405773.4    255.25
web-sk-2005           121422    334419    N/A       N/A        N/A          3135635   3135843.5   115.32
web-spam              4767      37375     129440    129534.8   644.218      128980    128994.8    92.55
web-webbase-2001      16062     25593     144674    144718.5   399.876      144361    144444.9    186.70
Worse                                     52        53
Better                                    0         1
Equal                                     4         2
8. Conclusion
This paper proposed a new local search algorithm, namely DLSWCC, to solve the MWVC problem. The
dynamic scoring strategy was proposed to enable our algorithm to find different possible optimal solutions. Further,
the weighted configuration checking (WCC) strategy was introduced to overcome the cycling problem in local
search. By combining the WCC strategy with the dynamic scoring strategy, we designed the vertex selection
strategy to determine the vertex to be selected as a candidate solution component. In addition, DLSWCC was
compared with several state-of-the-art algorithms on benchmark instances; the experimental results showed that
DLSWCC is effective and efficient. For the class SPI, DLSWCC obtained optimal solutions for all 20 instances.
For the class MPI, DLSWCC obtained new upper bounds for 22 of the 71 instances. For the class LPI, DLSWCC
obtained new upper bounds for 5 of the 15 instances. For the class massive graph instances, DLSWCC obtained 56 new upper bounds for the 56 instances.
The theoretical aspects of our approach require further investigation. A possible direction for future work is to extend the techniques used in our algorithm to other local search frameworks for
the MWVC problem. Finally, because our framework has virtually no parameters, it can be easily adapted to other
combinatorial problems [31][45][46][48][55][64][66][67][68].
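The interplay between a dynamic, edge-weight-based score and configuration checking can be illustrated with the following minimal sketch. It is a generic simplification written for this summary, not the DLSWCC procedure itself: the function name `cc_local_search`, the score-per-weight selection rule, the unit edge-weight increment, and the starting cover are all assumptions, and the scoring and WCC rules defined earlier in the paper may differ in detail.

```python
import random


def cc_local_search(n, edges, w, steps=2000, seed=0):
    """Generic sketch (not the authors' DLSWCC): weighted vertex cover local
    search with edge weighting and configuration checking.
    Vertices are 0..n-1 and w[v] > 0 is the weight of vertex v."""
    rng = random.Random(seed)
    adj = [[] for _ in range(n)]
    for i, (a, b) in enumerate(edges):
        adj[a].append((b, i))
        adj[b].append((a, i))
    edge_w = [1] * len(edges)      # dynamic edge weights
    in_c = [True] * n              # start from the trivial cover (all vertices)
    conf = [1] * n                 # conf[v] == 1 means v may be added again
    best, best_w = set(range(n)), sum(w)

    def score(v):
        # Weight of v's edges whose other endpoint is outside the cover:
        # the cost of removing v, or the benefit of adding it.
        return sum(edge_w[i] for x, i in adj[v] if not in_c[x])

    def flip(v, add):
        in_c[v] = add
        if not add:
            conf[v] = 0            # forbid re-adding v until a neighbour changes
        for x, _ in adj[v]:
            conf[x] = 1            # v's neighbours see a new configuration

    for _ in range(steps):
        uncovered = [i for i, (a, b) in enumerate(edges)
                     if not in_c[a] and not in_c[b]]
        if not uncovered:
            cur = {v for v in range(n) if in_c[v]}
            cur_w = sum(w[v] for v in cur)
            if cur_w < best_w:
                best, best_w = cur, cur_w
            if not cur:
                break
            # Drop the vertex that costs least per unit of weight saved.
            flip(min(cur, key=lambda v: score(v) / w[v]), add=False)
            continue
        a, b = edges[rng.choice(uncovered)]
        allowed = [v for v in (a, b) if conf[v]] or [a, b]
        # Add the allowed endpoint with the highest score per unit weight.
        flip(max(allowed, key=lambda v: score(v) / w[v]), add=True)
        for i, (p, q) in enumerate(edges):
            if not in_c[p] and not in_c[q]:
                edge_w[i] += 1     # make persistently uncovered edges costlier
    return best, best_w


if __name__ == "__main__":
    print(cc_local_search(4, [(0, 1), (1, 2), (2, 3)], [5, 1, 4, 2]))
```

In this sketch a vertex that has just left the cover cannot return until one of its neighbours changes state, which is what prevents the search from immediately undoing its last move, while the growing edge weights steer the search toward edges that have stayed uncovered for a long time.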
Acknowledgements
The authors of this paper wish to extend their sincere gratitude to all the anonymous reviewers for their efforts.
This work was supported in part by NSFC (under Grant Nos. 61370156, 61403076, and 61403077) and the
Program for New Century Excellent Talents in University (NCET-13-0724).
References
[1] C. Aggarwal, J. Orlin, R. Tai, Optimized crossover for the independent set problem, Oper. Res. 45 (1997) 226–234.
[2] D.V. Andrade, M.G.C. Resende, R.F.F. Werneck, Fast local search for the maximum independent set problem, in: Proc. of WEA-08, 2008, pp. 220–234.
[3] A. Arab, A. Alfi, An adaptive gradient descent-based local search in memetic algorithm applied to optimal controller design, Inform. Sci. 299 (2015) 117-12.
[4] S.R. Balachandar, K. Kannan, A meta-heuristic algorithm for vertex covering problem based on gravity, Int. J. Math. & Statistical Sci. 1(3) (2009) 130-136.
[5] S. Balaji, V. Swaminathan, K. Kannan, An effective algorithm for minimum weighted vertex cover problem, Int. J. Comput. & Math. Sci. 4 (2010) 34-38.
[6] V.C. Barbosa, L.C.D. Campos, A novel evolutionary formulation of the maximum independent set problem, J. Comb. Optim. 8 (4) (2004) 419–437.
[7] H. Bhasin, M. Amini, The Applicability of Genetic algorithm to Vertex Cover, International Journal of Computer Applications 123(17) (2015) 29.
[8] S. Bouamama, C. Blum, A. Boukerram, A population-based iterated greedy algorithm for the minimum weight vertex cover problem, Appl. Soft Comput. 12(6) (2012) 1632-1639.
[9] C. Brause, N.C. Lê, I. Schiermeyer, The maximum independent set problem in subclasses of subcubic graphs, Discrete Mathematics 338(10) (2015) 1766-1778.
[10] B. Brešar, R. Krivoš-Belluš, G. Semanišin, P. Šparl, On the weighted k-path vertex cover problem, Discrete Applied Mathematics 177 (2014) 14-18.
[11] J. Brimberg, N. Mladenović, D. Urošević, Solving the maximally diverse grouping problem by skewed general variable neighborhood search, Inform. Sci. 295 (2015) 650-675.
[12] S.W. Cai, J. Lin, K.L. Su, Two weighting local search for minimum vertex cover, Proc. AAAI. 2015.
[13] S.W. Cai, K.L. Su, Local search for Boolean Satisfiability with configuration checking and subscore, Artif. Intell. 204 (2013) 75-98.
[14] S.W. Cai, K.L. Su, Q.L. Chen, EWLS: A New Local Search for Minimum Vertex Cover, Proc. AAAI. 2010.
[15] S.W. Cai, K.L. Su, A. Sattar, Local search with edge weighting and configuration checking heuristics for minimum vertex cover, Artif. Intell. 175(9) (2011) 1672-1696.
[16] S.W. Cai, K.L. Su, C. Luo, A. Sattar, NuMVC: An efficient local search algorithm for minimum vertex cover, J. Artif. Intell. Res. (2013) 687-716.
[17] J.K. Chen, Y.J. Lin, G.P. Lin, J.J. Li, Z.M. Ma, The relationship between attribute reducts in rough sets and minimal vertex covers of graphs, Inform. Sci. 325 (2015) 87-97.
[18] J.K. Chen, Y.J. Lin, J.J. Li, G. Lin, Z.M. Ma, A. Tan, A rough set method for the minimum vertex cover problem of graphs, Applied Soft Computing 42 (2016) 360-367.
[19] V. Chvátal, A greedy heuristic for the set-covering problem, Math. Oper. Res. 3 (1979) 233–235.
[20] A. Coja-Oghlan, C. Efthymiou, On independent sets in random graphs, Random Structures & Algorithms 47(3) (2015) 436-486.
[21] J.A. Delgado-Osuna, M. Lozano, C. García-Martínez, An alternative artificial bee colony algorithm with destructive–constructive neighbourhood operator for the problem of composing medical crews, Inform. Sci. 326 (2016) 215-226.
[22] I. Dinur, S. Safra, On the hardness of approximating minimum vertex cover, Ann. of Math. 162 (2) (2005) 439–486.
[23] S. Dobrev, R. Královič, R. Královič, Advice complexity of maximum independent set in sparse and bipartite graphs, Theory of Computing Systems 56(1) (2015) 197-219.
[24] I. Evans, An evolutionary heuristic for the minimum vertex cover problem, in: Proc. of EP-98, 1998, pp. 377–386.
[25] Z. Fang, Y. Chu, K. Qiao, X. Feng, K. Xu, Combining edge weight and vertex weight for minimum vertex cover problem, in: Frontiers in Algorithmics, Springer International Publishing, 8497 (2014) 71-81.
[26] H. Fernau, F.V. Fomin, G. Philip, S. Saurabh, On the parameterized complexity of vertex cover and edge cover with connectivity constraints, Theoretical Computer Science 565 (2015) 1-15.
[27] S. Gilmour, M. Dras, Kernelization as heuristic structure for the vertex cover problem, in: ANTS Workshop, 2006, pp. 452–459.
[28] F. Glover, Tabu search - Part I, ORSA Journal on Computing 1 (1989) 190–206.
[29] F. Glover, Tabu search - Part II, ORSA Journal on Computing 2 (1990) 4–32.
[30] F. Glover, T. Ye, A.P. Punnen, G. Kochenberger, Integrating tabu search and VLSN search to develop enhanced algorithms: A case study using bipartite boolean quadratic programs, European Journal of Operational Research 241(3) (2015) 697-707.
[31] B. Gu, V.S. Sheng, Z.J. Wang, D. Ho, S. Osman, S. Li, Incremental learning for ν-Support Vector Regression, Neural Networks 67 (2015) 140-150.
[32] Y.M. Hu, B. Yang, H.S. Wong, A weighted local view method based on observation over ground truth for community detection, Inform. Sci. (2016) doi:10.1016/j.ins.2016.03.028.
[33] X. Huang, H. Cheng, J.X. Yu, Dense community detection in multi-valued attributed networks, Inform. Sci. 314 (2015) 77-99.
[34] Y. Jin, J.K. Hao, Hybrid evolutionary search for the minimum sum coloring problem of graphs, Inform. Sci. 352 (2016) 15-34.
[35] Y. Jin, J.K. Hao, General swap-based multiple neighborhood tabu search for the maximum independent set problem, Eng. Appl. Artif. Intel. 37 (2015) 20-33.
[36] R. Jovanovic, M. Tuba, An ant colony optimization algorithm with improved pheromone correction strategy for the minimum weight vertex cover problem, Applied Soft Computing Journal 11 (8) (2011) 5360–5366.
[37] R. Karp, R.E. Miller, J.W. Thatcher, in: Complexity of Computer Computations, Plenum Press, New York, 1972.
[38] S. Kifah, S. Abdullah, An adaptive non-linear great deluge algorithm for the patient-admission problem, Inform. Sci. 295 (2015) 573-585.
[39] G. Kochenberger, M. Lewis, F. Glover, H. Wang, Exact solutions to generalized vertex covering problems: a comparison of two models, Optimization Letters 9(7) (2015) 1331-1339.
[40] N.C. Lê, C. Brause, I. Schiermeyer, Extending the MAX Algorithm for Maximum Independent Set, Discuss. Math. Graph T. 35(2) (2015) 365-386.
[41] R.H. Li, J.X. Yu, X. Huang, H. Cheng, Z. Shang, Measuring the impact of MVC attack in large complex networks, Inform. Sci. 278 (2014) 685-702.
[42] J. Li, Q. Pan, Solving the large-scale hybrid flow shop scheduling problem with limited buffers by a hybrid artificial bee colony algorithm, Inform. Sci. 316 (2015) 487-502.
[43] R.Z. Li, S.L. Hu, Y.Y. Wang, M.H. Yin, A local search algorithm with tabu strategy and perturbation mechanism for generalized vertex cover problem, Neural Comput. Appl. (2016), doi:10.1007/s00521-015-2172-9.
[44] X.T. Li, M.H. Yin, Modified cuckoo search algorithm with self adaptive parameter method, Inform. Sci. 298 (2015) 80-97.
[45] X.T. Li, M.H. Yin, Multiobjective binary biogeography based optimization for feature selection using gene expression data, NanoBioscience, IEEE Trans. on, 12(4) (2013) 343-353.
[46] Y. Liu, C. Yang, W.K.S. Tang, C. Li, Optimal topological design for distributed estimation over sensor networks, Inform. Sci. 254 (2014) 83-97.
[47] C. Luo, S.W. Cai, K.L. Su, W. Wu, Clause states based configuration checking in local search for satisfiability, Cybernetics, IEEE Trans. on, 45(5) (2015) 1014-1027.
[48] T.H. Ma, J.J. Zhou, M.L. Tang, Y. Tian, A. Al-Dhelaan, M. Al-Rodhaan, S.Y. Lee, Social network and tag sources based augmenting collaborative recommender system, IEICE Transactions on Information and Systems E98-D(4) (2015) 902-910.
[49] W. Pullan, H.H. Hoos, Dynamic local search for the maximum clique problem, J. Artif. Intell. Res. (JAIR) 25 (2006) 159–185.
[50] W. Pullan, Optimisation of unweighted/weighted maximum independent sets and minimum vertex covers, Discrete Optim. 6(2) (2009) 214-219.
[51] S. Richter, M. Helmert, C. Gretton, A stochastic local search approach to vertex cover, in: Proc. of KI-07, 2007, pp. 412–426.
[52] R. Rossi, N. Ahmed, The network data repository with interactive graph analytics and visualization, Proc. AAAI. 2015.
[53] R. Rossi, N. Ahmed, Coloring large complex networks, Social Network Analysis and Mining (SNAM), (2014) 1–52.
[54] S.J. Shyu, P. Yin, B.M.T. Lin, An ant colony optimization algorithm for the minimum weight vertex cover problem, Ann. Oper. Res. 131 (1–4) (2004) 283–304.
[55] E. Teymourian, V. Kayvanfar, GH.M. Komaki, M. Zandieh, Enhanced intelligent water drops and cuckoo search algorithms for solving the capacitated vehicle routing problem, Inform. Sci. 334 (2016) 354-378.
[56] J. Tu, A fixed-parameter algorithm for the vertex cover P3 problem, Information Processing Letters 115(2) (2015) 96-99.
[57] S. Voß, A. Fink, A hybridized tabu search approach for the minimum weight vertex cover problem, J. Heuristics 18(6) (2012) 869-876.
[58] L. Wang, W. Du, Z. Zhang, X. Zhang, A PTAS for minimum weighted connected vertex cover P_3 problem in 3-dimensional wireless sensor networks, J. Comb. Optim. (2015) 1-17.
[59] L. Wang, X. Zhang, Z. Zhang, H. Broersma, A PTAS for the minimum weight connected vertex cover P3 problem on unit disk graphs, Theoretical Computer Science 571 (2015) 58-66.
[60] D. Wang, H. Xiong, D. Fang, A Neighborhood Expansion Tabu Search Algorithm Based On Genetic Factors, Open Journal of Social Sciences 4(03) (2016) 303-308.
[61] Y.Y. Wang, S.W. Cai, M.H. Yin, Two efficient local search algorithms for maximum weight clique problem, Proc. AAAI. 2016.
[62] Y.Y. Wang, R.Z. Li, Y.P. Zhou, M.H. Yin, A path cost-based GRASP for minimum independent dominating set problem, Neural Comput. Appl. (2016) doi:10.1007/s00521-016-2324-6.
[63] Y.Y. Wang, D.T. Ouyang, L.M. Zhang, M.H. Yin, A novel local search for unicost set covering problem using hyperedge configuration checking and weight diversity, SCIENCE CHINA Info Sci. (2015) doi:10.1007/s11432-015-5377-8.
[64] X.Z. Wen, L. Shao, Y. Xue, W. Fang, A rapid learning algorithm for vehicle classification, Inform. Sci. 295(1) (2015) 395-406.
[65] Q.H. Wu, J.K. Hao, A clique-based exact method for optimal winner determination in combinatorial auctions, Inform. Sci. 334 (2016) 103-121.
[66] Z.H. Xia, X.H. Wang, X.M. Sun, B.W. Wang, Steganalysis of least significant bit matching using multi-order differences, Security and Communication Networks 7(8) (2014) 1283-1291.
[67] E.T. Yassen, M. Ayob, M.Z.A. Nazri, N.R. Sabar, Meta-harmony search algorithm for the vehicle routing problem with time windows, Inform. Sci. 325 (2015) 140-158.
[68] X. Zhang, X. Li, J. Wang, Local search algorithm with path relinking for single batch-processing machine scheduling problem, Neural Comput. Appl. (2016) doi:10.1007/s00521-016-2339-z.
[69] T. Zhou, Z.P. Lü, Y. Wang, J. Ding, B. Peng, Multi-start iterated tabu search for the minimum weight vertex cover problem, J. Comb. Optim. (2015) 1-17, doi:10.1007/s10878-015-9909-3.
[70] Y.P. Zhou, H.C. Zhang, R.Z. Li, J.N. Wang, Two Local Search Algorithms for Partition Vertex Cover Problem, J. Comput. Theor. Nanosci. 13 (2016) 743-751.