Sequential seeding strategy for social influence diffusion with improved entropy-based centrality


Journal Pre-proof Sequential seeding strategy for social influence diffusion with improved entropy-based centrality Chengzhang Ni, Jun Yang, Demei Kong

PII: S0378-4371(19)32041-2
DOI: https://doi.org/10.1016/j.physa.2019.123659
Reference: PHYSA 123659

To appear in: Physica A

Received date: 31 May 2019
Revised date: 29 September 2019

Please cite this article as: C. Ni, J. Yang and D. Kong, Sequential seeding strategy for social influence diffusion with improved entropy-based centrality, Physica A (2019), doi: https://doi.org/10.1016/j.physa.2019.123659.

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2019 Published by Elsevier B.V.


Highlights

Sequential seeding strategy for social influence diffusion with improved entropy-based centrality

Chengzhang Ni, Jun Yang, Demei Kong

• A measurement based on the heterogeneity of confidence in neighbors is developed.
• The entropy-based centrality is proposed to measure the individual’s influence.
• A sequential seeding strategy is evaluated for comparison with the single-stage strategy.
• The proposed centrality performs best among the compared centralities in the BA scale-free network.


Sequential seeding strategy for social influence diffusion with improved entropy-based centrality Chengzhang Ni, Jun Yang∗, Demei Kong


School of Management, Huazhong University of Science and Technology, Wuhan, 430074 Hubei, China

Abstract

In this paper, we investigate the centrality problem of selecting seed targets for the sequential seeding strategy in social networks. Based on the concept of entropy, we design a novel improved centrality by integrating interaction intimacy and confidence level to measure the total influence of an individual, which can be decomposed into a direct effect and an indirect effect. In addition, we formulate the sequential seeding strategy to evaluate the performance of the proposed centrality and compare it with the counterpart of the single-stage seeding strategy. Furthermore, extensive experiments are conducted for comparison with other centralities, including betweenness, closeness, degree, and eigenvector, in two empirical and four artificial social networks. By simulations, we find that the proposed entropy-based centrality is superior to the other centralities in terms of diffusion speed and influence coverage in the BA scale-free network. Parameter analysis of the sequential seeding strategy demonstrates that the proposed centrality achieves the greatest total influence coverage in the case where the individual’s confidence in each neighbor is treated equally.

Keywords: social network, influence diffusion, entropy, centrality, seeding strategy.

1. Introduction

In the past decades, the emergence of extensive social network services has made it possible to investigate and analyze social influence diffusion in real-world networks. Diffusion in the social ecology of human beings has long been a very common phenomenon, including disease spread in crowds through contact[1], information dissemination and rumor spreading on online or offline social media[2], and innovation communication in society[3]. In particular, a central topic that has received considerable attention in diffusion research is the social influence maximization problem[4], which can be interpreted as how to select a small subset of individuals in a social network as initial influential seeds such that the social influence diffusion cascade triggered from these seeds leads to an optimal coverage. One of the most popular applications related to social influence diffusion is word-of-mouth marketing of new products[5][6]. For example, an individual’s decision to adopt a new product or not partly depends on the neighbors’ attitude toward this product. The greater the influence of the neighbors who adopt the product, the larger the probability that the individual will be influenced. Therefore, for a corporation that aims to launch new products into the market, how to choose and seed the most influential individuals effectively is an important strategy for optimizing influence diffusion and controlling the seeding budget. In addition, the research results on the social influence diffusion problem have been widely applied to many business activities, ranging from public opinion analysis[7], online recommendation[8], and advertising release[9] to expert identification[10].

∗ Corresponding author. Email address: [email protected] (Jun Yang)
Preprint submitted to Physica A, September 29, 2019

As a fundamental research domain of the social influence maximization problem, the identification of influential individuals is frequently framed as quantifying the centrality of an individual in the social network. Thus, a multitude of centrality measurements have been proposed[11–26]. Among them, centrality measurements such as betweenness centrality[11], Katz centrality[12], closeness centrality[11], eccentricity[13], and the information centrality index[14] are established on path traffic flows of the social network, which can affect and mirror the social influence of individuals. In addition to the path-based centralities mentioned above, the centrality measurements in[15–26] emphasize analyzing an individual’s social influence resulting from its neighbors’ locations in the social network. Although the above-mentioned centrality measurements can identify influential individuals by exploiting the topological structure of the social network to characterize their social influence, they ignore the role of the complexity and uncertainty of the individual neighborhood structure in analyzing network centrality. There is evidence that entropy-based centrality has been applied extensively to quantify the complexity and uncertainty of networks[27–34]. Therefore, in this paper, we propose an improved entropy-based centrality measurement that considers the connection weight and the difference in the degree of confidence in neighbors.

Furthermore, in order to make social influence diffusion more efficient, how to design seeding strategies has also aroused wide research interest. Since seeking the optimal seeding strategy is known to be NP-hard, there has been considerable focus on greedy algorithms[35] and heuristic methods[4][36]. A common setting in existing research is that all seed individuals are activated during the initial phase and then spread their social influence to neighboring individuals. This is the so-called single-stage seeding strategy. However, if all seed individuals are launched at the initial time step, some seeds may be redundant because they could have been influenced naturally in the following stages. Therefore, several seeding strategies have been proposed in[37–40] that divide the seeding of individuals into sequential actions according to some rules rather than completing it at the initial stage.

Motivated by the above discussion, we propose a novel centrality measurement to evaluate individuals’ social influence based on the connection weight influence entropy and the confidence influence entropy. Considering the confidence strength among node pairs and the weights of connections, the proposed centrality is well suited to describing the social influence of an individual on its one-hop and two-hop neighbors. By exploiting the concept of entropy to measure the complexity and uncertainty of social influence, the vital influential individuals can be detected effectively. In addition, based on the proposed centrality and the independent cascade model, we design a sequential seeding strategy and compare its social influence diffusion performance with that of the single-stage seeding strategy. For comparison, four other centralities, namely degree centrality, betweenness centrality, closeness centrality, and eigenvector centrality, are also applied to these two seeding strategies. Our contributions in this paper are summarized as follows:

(1) We propose a novel centrality measurement based on the concept of entropy that incorporates the individual’s confidence level. It mines a potential pattern for evaluating an individual’s social influence and thereby identifies influential individuals to trigger the diffusion cascade.
(2) We develop a new method to characterize the relationship between social interactions and the strength of social influence. It evaluates the direct influence and the indirect influence by integrating the confidence influence entropy and the weight influence entropy to measure the impact of confidence level and interaction intimacy on social influence.
(3) We provide a comprehensive comparison of our sequential seeding strategy with the single-stage seeding strategy based on the proposed centrality and perform extensive experiments to demonstrate its effectiveness and efficiency to some extent.

The remainder of this paper is organized as follows. In Section 2, the related literature is reviewed. In Section 3, we introduce the preliminaries, namely the diffusion model and the concept of entropy. In Section 4, we propose an improved centrality measurement built on entropy-based social influence, such that seed targets chosen according to that centrality can trigger the spreading process following the widely applied independent cascade model. In addition, the proposed seeding strategies are presented. In Section 5, we compare the performance of five centralities in different networks for the sequential seeding strategy and the single-stage strategy, and conduct a parameter analysis. In Section 6, conclusions are given.

2. Literature review

To achieve maximum efficiency in social influence diffusion, the criterion of seed selection and the seeding strategy are two important factors. In this section, we review the related literature on both: the centrality used for seed selection and the arrangement of the seeding strategy.

(1) The centrality of seed selection

In social network analysis, some well-known classical centrality measurements have been applied to identify influential individuals by measuring the network structure[41]. Degree centrality is the most direct and popular measurement of node centrality in network analysis[42]. Katz centrality takes account of all paths between a pair of nodes to calculate social influence[12]. Eigenvector centrality, first proposed in[43], assigns each node in the network a relative score associated with the network adjacency matrix. The renowned PageRank of Google Search is an application based on a variant of eigenvector centrality[44]. Closeness centrality reflects the proximity of a node to other nodes in the network, whereas betweenness centrality indicates the importance of a node by the number of shortest paths passing through it[11]. Generally speaking, betweenness centrality and closeness centrality represent controllability and accessibility, respectively. In the literature comparing the performance of centralities, Hinz et al.[38] compared the performance and efficiency of four seeding strategies based on nodes’ central character, namely Hubs, Bridges, Fringes, and Random, and found that the Hubs strategy outperforms the other three. Similarly, Banerjee et al.[45] showed that individuals with a high eigenvector centrality have a greater capability to influence neighbors in the context of an available microfinance loan program.

In addition to the above centrality measurements, scholars have put forward many other improved centrality approaches. Stephenson and Zelen[46] proposed a novel centrality to calculate the “information” contained in all potential paths between pairs of nodes. Furthermore, Chen et al.[15] proposed a semi-local centrality measure as a tradeoff between the less relevant degree centrality and other time-consuming measures by exploiting the information of multiple-hop neighbors. Considering the significance of node location in a given network, Kitsak et al.[19] suggested that coreness centrality may be a more effective index for identifying the most influential spreaders. However, the original k-core centrality[47] requires global network topological information, which may lead to computational inefficiency, especially in large-scale networks. In addition, k-core centrality may fail to distinguish the many nodes that share the same k-core value. Thus, many improved k-core algorithms[22, 24, 48] have been successively proposed from different perspectives. The H-index, originally proposed to measure scholars’ academic influence as the largest number h such that a scholar has h publications each cited at least h times[26], was introduced in[47][49] to deal with the drawbacks discussed above. Subsequently, Lv et al.[50] proposed an influence evaluation model by constructing an integrated operator containing the degree metric, the H-index, and the coreness method.

In addition to the classical centrality measurements and improved centrality approaches discussed above, the information entropy techniques initially proposed by Shannon[51] have recently been extended to the quantitative analysis of influence in[27–32]. Among recent articles on centrality measures, Peng et al.[33] presented a model to quantitatively evaluate social influence in mobile social networks by introducing the concept of entropy to depict the uncertainty and complexity of social influence, focusing on friend entropy and interaction frequency entropy. Furthermore, Qiao et al.[34] proposed a re-defined entropy centrality model, which describes associations among node pairs, to measure the potential influence of an actor in communication activity. Based on this model, they proposed an extended model for directed and weighted networks in[52], which characterizes the total influence of an individual by calculating the structural entropy and the interaction frequency entropy. Although all the above information entropy-based centrality measures are proposed from the perspective of network topology[33, 34, 52], they have until now been difficult to compare conceptually. In addition, Tutzauer[53] proposed an entropy-based measure based on the ways that traffic propagates by transfer and flows through the whole network. It should be mentioned that calculating this centrality requires global information on the network structure. However, global information is usually difficult to obtain in a social network. Different from the combination of topological structure and information entropy in these works, the purpose of this paper is to propose an entropy-based measure of centrality characterized by an individual’s intimacy of connections and confidence in neighbors, each focusing on the personal emotion and local information that determine the diffusion of social influence. Specifically, a measurement that takes into account interaction intimacy and confidence level, based on the heterogeneity of confidence in neighbors, is developed, and the entropy-based calculation can be carried out with the local information of individuals. Moreover, to the best of our knowledge, no previous work has studied sequential seeding using entropy-based centrality or shed light on the tradeoff between influence coverage and seeding frequency. This paper is thus the first to compare the performance of the proposed entropy-based centrality with that of other centralities under different seeding strategies. Next, we build on the centrality measurements to develop the seeding strategies.

(2) Seeding strategies

Starting from the early seminal works[38][45], a large body of research on seeding strategies has focused on the single-stage seeding strategy, in which all chosen seed nodes are activated at the beginning of the diffusion process. Recently, some works proposed an adaptive approach in which seeds are activated over time and the centrality used for seed selection is reassessed at each seeding stage in order to obtain a higher activation rate[54][55]. This is the so-called sequential seeding strategy. Sequential seeding strategies focus on how to designate a suitable subset of individuals as seeds to trigger a cascade of influence diffusion and how to determine the appropriate seeding mechanisms at different stages of diffusion. Considering three node states, namely ‘non-active’, ‘active’, and ‘available’, Seeman et al.[54] proposed a two-stage framework for seeding actions. Tong et al.[56] presented a greedy adaptive seeding strategy and an effective heuristic algorithm based on the dynamic independent cascade model. Comparing the sequential seeding strategy with the single-stage seeding strategy, Jankowski et al.[37] demonstrated that the former is superior to the latter in terms of activation coverage. Furthermore, Jankowski et al.[57] proposed sequential seeding strategies with buffering, which can avoid selecting nodes that would be activated naturally by other nodes. Chierichetti et al.[58] studied how to determine the seeding sequence for two competitive marketers in order to maximize the expected coverage. Liu[59] proposed the push-driven cascade model, which considers limited user attention and a controlling factor over the diffusion process. More specifically, a seeding action occurs at each time step to obtain the overall activation scope. More recently, Goldenberg et al.[60] proposed a scheduled seeding method that focuses on finding not only the optimal set of seed nodes but also the right timing for the seeding actions. They extract three properties from existing diffusion models, namely stochastic dynamics, diminishing social effect, and state-dependent seeding, and apply them in the scheduled approach. Analogously, Sela et al.[55] suggested a greedy heuristic scheduled seeding approach motivated by the timing aspect of seeding, which takes into account the identification of seeds at the initial stage and the determination of seeding times over the diffusion process.

Although the above works have made major progress in methods of measuring centrality, research on entropy-based centrality is still at a nascent stage and does not yet accurately capture the fundamental rules followed by social influence diffusion. In fact, an individual’s opinion will not always be accepted by his or her neighbors, because the individual’s social influence depends largely on how much those neighbors trust him or her. Different from the literature mentioned above, this paper proposes an entropy-based centrality that integrates interaction intimacy and confidence level, and further analyzes the effect of confidence level on the sequential seeding strategy. Moreover, the impact of seeding frequency during the sequential seeding horizon on diffusion speed and activation coverage has rarely been investigated. Therefore, how different consecutive seeding actions affect the acceleration and coverage of influence diffusion is subsequently analyzed in the context of the sequential seeding strategy.

3. Preliminaries

3.1. Diffusion model

To study social influence diffusion through a social network, an appropriate diffusion model must first be set up. The literature[61] introduced different models of influence diffusion, among which the independent cascade model (ICM) and the linear threshold model (LTM) have been widely applied and lie at the core of most extended diffusion models. ICM assigns a given spreading probability to each connection, through which a node influenced in the latest time tick will affect its neighbor in the next time tick. In LTM, an inactive node becomes active when the sum of the spreading probabilities of its connections with active neighbors exceeds its threshold value. In particular, LTM operates on a node-specific threshold from the perspective of the entity, and whether a node becomes active depends on the aggregate effect of its active neighbors. In ICM, by contrast, a connection tying an active node to an inactive node is given a single chance, through which the inactive node is successfully influenced with a certain probability independently of the previous history[36]. Because only a connection-specific diffusion model is involved in this paper, we adopt the ICM to carry out experiments. It is worth emphasizing that, unlike the traditional version of ICM, we redefine the propagation probability by considering the weight of the connection between adjacent nodes. In previous studies, Gang Y. et al.[62] defined the probability with which an inactive node $i$ is affected by its active neighbor $j$ as $\tau_{ij} = (w_{ij}/w_{\max})^{\alpha}$, where $\alpha > 0$ is a tunable parameter, $w_{ij}$ is the weight of the directed connection, and $w_{\max}$ denotes the maximal weight. Another definition of infection transmission, proposed by Wang et al.[63], states the probability as $1 - (1-\gamma)^{w_{ij}}$, where $w_{ij}$ is again the weight on the edge connecting node $i$ and node $j$, and the positive parameter $\gamma$, chosen in the interval $[0, 1]$, denotes the original propagation probability. Under the realistic assumption that, in weighted social networks, the probability of a node being affected by its neighbors is related to both the original propagation probability and the edge weight, we adopt the latter form of the influence spreading probability, i.e., $p_{ij} = 1 - (1-\gamma)^{w_{ij}}$, in place of its counterpart in the traditional ICM. In the remaining sections, each node (or individual) is referred to as either active (an adopter of the social influence) or inactive.
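The two weight-dependent probabilities discussed in this subsection can be compared with a minimal Python sketch. The function names and the sample values of γ and the weights are illustrative assumptions, not taken from the paper.

```python
def weighted_icm_probability(weight: float, gamma: float) -> float:
    """Return p_ij = 1 - (1 - gamma) ** w_ij, the form adopted in the text."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    if weight < 0:
        raise ValueError("weight must be non-negative")
    return 1.0 - (1.0 - gamma) ** weight


def normalized_weight_probability(weight: float, w_max: float, alpha: float) -> float:
    """Return tau_ij = (w_ij / w_max) ** alpha, the alternative form cited in the text."""
    if alpha <= 0 or w_max <= 0 or not 0 <= weight <= w_max:
        raise ValueError("require alpha > 0, w_max > 0 and 0 <= weight <= w_max")
    return (weight / w_max) ** alpha


if __name__ == "__main__":
    gamma = 0.1  # illustrative value only
    for w in (1, 2, 5):
        # The probability grows with the edge weight and stays below 1.
        print(w, round(weighted_icm_probability(w, gamma), 4))
```

With a weight of 1 the adopted form reduces to the original propagation probability γ, and a larger weight acts like several independent transmission attempts over the same edge.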


Starting from an initial set of active nodes, each newly activated node $i$ is given a single chance to influence each currently inactive neighbor $j$, and it succeeds with probability $p_{ij} = 1 - (1-\gamma)^{w_{ij}}$. Here, the weight $w_{ij}$ intuitively represents the heterogeneity of the connections. If node $j$ has multiple newly activated neighbors, those nodes attempt to activate node $j$ in random order. If node $j$ is successfully activated by one of its neighbors $i$, $j$ turns active at the next time step. It is worth noting that, whether or not node $i$ succeeds, it makes no further attempts to influence its neighbors in subsequent phases. This process is repeated until no more nodes in the network can be activated, at which point the propagation process ends. According to these rules, the influence diffusion unfolds in discrete time steps. Based on this improved diffusion model, we next evaluate the effect of different centrality measurements on the selection of seed targets and on seeding strategies for maximal coverage of influence diffusion.
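The diffusion rules above can be sketched as a short single-stage simulation. This is a minimal illustration under stated assumptions rather than the authors' implementation: the graph is assumed to be stored as a dictionary of neighbor-to-weight dictionaries (both directions of each undirected edge), and the function name `simulate_weighted_icm` and its return values are hypothetical.

```python
import random


def simulate_weighted_icm(adjacency, seeds, gamma, rng=None):
    """Run one weighted independent-cascade diffusion.

    adjacency: dict mapping node -> dict of neighbor -> weight.
    seeds: iterable of initially active nodes.
    Returns (set of all active nodes, list of sets activated per time step).
    """
    rng = rng or random.Random()
    active = set(seeds)
    frontier = list(active)          # nodes activated in the latest step
    history = [set(active)]
    while frontier:
        # Each newly active node gets exactly one attempt per inactive neighbor.
        attempts = [(i, j) for i in frontier
                    for j in adjacency.get(i, {}) if j not in active]
        rng.shuffle(attempts)        # competing attempts happen in random order
        newly_active = set()
        for i, j in attempts:
            if j in newly_active:
                continue             # j was already activated in this step
            p = 1.0 - (1.0 - gamma) ** adjacency[i][j]
            if rng.random() < p:
                newly_active.add(j)
        active |= newly_active       # new nodes become active at the next step
        frontier = list(newly_active)
        if newly_active:
            history.append(set(newly_active))
    return active, history
```

Because a node appears in `frontier` only in the step right after its activation, it never retries a failed attempt, which matches the single-chance rule. Averaging the size of the returned active set over many runs gives an estimate of the influence coverage of a seed set.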

3.2. Information entropy

In 1948, Shannon proposed the concept of information entropy in his well-known work “A Mathematical Theory of Communication”, which pointed out that redundancy is ubiquitous in information and elaborated on how to measure the uncertainty of information in mathematical language. In general, the more information that can be transmitted in a system, the higher its entropy. Information entropy has been widely applied in many fields, such as data mining, statistical inference, and image processing.

According to the definition of Shannon’s information entropy, given a discrete random variable $X$ with a set of possible events $x_i$ whose probability of occurrence is $p(x_i)$, $i = 1, 2, \ldots, n$, the information entropy is defined as follows:

$$H(X) = H(x_1, x_2, \ldots, x_n) = -\sum_{i=1}^{n} p(x_i) \log_{10} p(x_i). \tag{1}$$

The base 10 of the logarithm in formula (1) is selected without loss of generality. Specifically, the definition has the following three basic properties: (1) the information entropy changes continuously with $p(x_i)$; (2) when all events occur with equal probability, i.e., $p(x_i) = 1/n$, $i = 1, 2, \ldots, n$, the information entropy increases monotonically as the total number of events $n$ increases, that is, the more choices there are, the greater the uncertainty involved in the result; (3) when a choice can be broken down into two consecutive choices, the entropy values before and after decomposition should be equal and the uncertainty is the same.

Many studies have investigated important aspects of information measures, including a magnitude-based information measure[64], the partition-independent graph entropy for capturing the information of graphs[65], and the information function based on the degree of a node[30]. One of the salient contributions of Shannon’s information entropy[66] is its extended application in social networks, including hypergraph partitioning[67], community detection[68], data forwarding[69], and influence measurement[70]. Therefore, information entropy techniques provide a suitable basis for formulating centrality measurements of social influence in a model-free manner. In this paper, we introduce a novel method to assess nodes’ social influence from the perspective of information entropy.
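Formula (1) and the second and third properties listed in this subsection can be checked numerically with a small Python sketch. The function name `shannon_entropy` and the sample distributions are illustrative assumptions; terms with zero probability are skipped, following the usual convention that 0 · log 0 = 0.

```python
import math


def shannon_entropy(probabilities, base=10.0):
    """Return -sum(p * log_base(p)) for a discrete distribution."""
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    # Zero-probability events contribute nothing to the entropy.
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)


if __name__ == "__main__":
    # Property (2): for equally likely events, entropy grows with n.
    for n in (2, 4, 10):
        print(n, shannon_entropy([1.0 / n] * n))

    # Property (3): splitting one choice into two consecutive choices.
    direct = shannon_entropy([1 / 2, 1 / 3, 1 / 6])
    staged = shannon_entropy([1 / 2, 1 / 2]) + 0.5 * shannon_entropy([2 / 3, 1 / 3])
    print(math.isclose(direct, staged))
```

The staged computation first chooses between two equally likely branches and then, in one branch only, between outcomes with probabilities 2/3 and 1/3, so the second entropy is weighted by the probability 1/2 of reaching that branch.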

4. Model description

The seminal work of Christakis et al.[71] found that effective influence is detectable within the scope of the two-hop local network, i.e., the influence from nodes located beyond the boundary of two-hop neighbors can be omitted. Consequently, we decompose the social network of each individual node into a two-hop sub-network to capture influence diffusion through the entire network. Now, let us consider an undirected, weighted social network $G(V, E, W)$, where $V$ denotes the finite set of nodes, $E$ is the set of undirected edges connecting pairs of nodes, and $W$ is the set of weights of the edges. Consistent with the idea, originating from wireless multi-hop networks, that nodes communicate within a limited range[72], an individual in a social network has limited social power and can impose meaningful influence only on others located in its local small world. Motivated by this, if node $i$ is connected directly with node $j$, denoted as $e_{ij} \in E$, we say that node $i$ is a one-hop neighbor of node $j$ and vice versa. Correspondingly, $i$ has a direct influence on its one-hop neighbor $j$, which is represented as $DI_i$. In the analogous case where node $i$ and node $j$ have no direct connection but share a mutual neighbor $k$, we call node $i$ and node $j$ two-hop neighbors. This means that node $i$ is able to influence node $j$ through their common neighbor $k$. For example, the contagion processes of social norms and technological innovation are largely attributed to their mediators[73]. Related examples abound in real networks. Similarly, we define this two-hop-distance influence exerted by node $i$ on node $j$ as the indirect influence, denoted as $II_i$. Based on the above discussion, we build a practical centrality measurement to evaluate the influence of each node using the concept of information entropy in the following section.
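The one-hop and two-hop neighborhoods defined above can be extracted with a few lines of Python. This is a sketch under assumptions: the graph is stored as a dictionary of neighbor-to-weight dictionaries, the helper names are hypothetical, and the toy graph is made up for illustration and is not the network of Figure 1.

```python
def one_hop_neighbors(adjacency, i):
    """Nodes directly connected to i."""
    return set(adjacency[i])


def two_hop_neighbors(adjacency, i):
    """Nodes with no direct edge to i that share at least one neighbor with i."""
    direct = one_hop_neighbors(adjacency, i)
    reachable = set()
    for k in direct:
        reachable.update(adjacency[k])   # neighbors of each mediator k
    return reachable - direct - {i}


if __name__ == "__main__":
    # Hypothetical toy graph; weights are placeholders.
    toy = {
        1: {2: 1.0, 3: 2.0},
        2: {1: 1.0, 3: 1.0, 5: 3.0},
        3: {1: 2.0, 2: 1.0, 6: 1.0},
        5: {2: 3.0},
        6: {3: 1.0},
    }
    print(one_hop_neighbors(toy, 1))   # direct influence targets of node 1
    print(two_hop_neighbors(toy, 1))   # indirect influence targets of node 1
```

Only these two sets are needed per node, which is why the centrality can be computed from local information.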

4.1. Entropy-based centrality measurement

To explain the definition of the entropy-based centrality measurement, we consider the undirected, weighted local network $G(V, E, W)$ mentioned above, whose topology is illustrated in Figure 1. In this schema, each node in the set $V$ represents an individual in a social network, $E$ denotes the set of undirected edges connecting two adjacent individuals, and the set of weights $W$ corresponds to the intimacy of the connections through which the influence flows. As shown beside each edge, $w_{ij}$ denotes the weight on the edge connecting node $i$ and node $j$. To quantify the influence of a given individual, we decompose the individual’s influence into two components, the direct influence $DI$ and the indirect influence $II$, obtained by integrating the connection weight and the confidence level into the information entropy.

To compute the direct influence, we propose a novel definition of entropy-based centrality that takes two aspects into consideration: the connection intimacy of an individual and the confidence level among its neighbors, each reflecting the personal emotion that determines the diffusion of social influence. We believe that an influence indicator becomes more effective when it accounts for the connection weight, generally referred to as the interaction intimacy: the higher the connection weight of a pair of individuals, the greater the degree of mutual influence between them. Motivated by this idea, the weight influence entropy of individual $i$, denoted as $DI_i^w$, is defined as follows:

$$DI_i^w = -\sum_{j=1}^{N_i} \frac{w_{ij}}{\sum_{k=1}^{N_i} w_{ik}} \cdot \log_{10} \frac{w_{ij}}{\sum_{k=1}^{N_i} w_{ik}} \tag{2}$$

where $N_i$ denotes the total number of neighbors of individual $i$, and $w_{ij}$ indicates the weight of the connection linking individual $i$ with its neighbor $j$.

301

Generally, the probability of an individual i transmitting influence to his or her neighbor

302

j will not always be consistent. Additionally, which neighbor is chosen as a recipient of

303

information depends on how trustworthy individual i think neighbor j. Therefore, considering 11

Journal Pre-proof

e1，4, w1，4

e1，2 , w

V1 3

1， 2

,w

, w 7，8 e 7，8

V7

e7

7， 12

， 12

， 12

,w

V12

,w

V10

, w 10 1

11 ，

1 0， V11 e 1

e11，12 , w11，12

Pr e-

9， 12

,w

7， 11

e7

e9

， 10

8， 10

， 11

,w

V9

V8 e 8

5， 10

e 7，9, w7，9

of

3， 7

6， 7

e5，10 , w

e6，7 , w

e 6，9, w6，9

3,5

V5

e3，7 , w

V6

e3,5 , w

V3

e3，6, w3，6

p ro

e4

， 6

,w e 2，3

w 2 ,5

e1,3 , w1,

V2 3 2，

e 2,5,

4， 6

V4

Figure 1: An undirected weighted social network

304

the heterogeneity of confidence in neighbors, we define the confidence level according to the

305

recipient’s relative social status measured by its degree ratio among all neighbors, where the

306

probability of confidence level Tij is given as follows:

$$T_{ij} = \frac{k_j^{\beta}}{\sum_{l \in N_i} (k_l)^{\beta}} \tag{3}$$

where $N_i$ indicates the set of individual $i$'s neighbors, $k_j$ indicates the degree of recipient $j$, and the tunable parameter $\beta$, called the confidence strength, controls the sensitivity of the confidence probability $T_{ij}$ to the recipient's degree $k_j$. The degree $k_j$ reflects the level of node heterogeneity. When $\beta > 0$, individual $i$ is more inclined to influence neighbors with a higher degree; when $\beta < 0$, it is more inclined to influence neighbors with a lower degree. In the special case $\beta = 0$, individual $i$ influences all neighbors with equal probability of confidence.

Based on the above discussion, the confidence influence entropy $DI_i^c$ of individual $i$ is defined as follows:

$$DI_i^c = -\sum_{j=1}^{N_i} T_{ij} \cdot \log_{10} T_{ij} = -\sum_{j=1}^{N_i} \frac{k_j^{\beta}}{\sum_{l \in N_i} (k_l)^{\beta}} \cdot \log_{10} \frac{k_j^{\beta}}{\sum_{l \in N_i} (k_l)^{\beta}} \tag{4}$$

As explained above, the weight influence entropy and the confidence influence entropy are the two components of the direct influence of $i$ on its one-hop neighbors, denoted as $DI_i$. We therefore define the direct influence as the weighted sum of $DI_i^w$ and $DI_i^c$:

$$DI_i = \theta_1 \cdot DI_i^w + \theta_2 \cdot DI_i^c \tag{5}$$

where $\theta_1$ and $\theta_2$ denote the weight coefficients of $DI_i^w$ and $DI_i^c$, respectively, with $\theta_1 + \theta_2 = 1$.
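To make Equations (2) to (5) concrete, the following is a minimal Python sketch of the direct influence computation. It is an illustration under stated assumptions, not the authors' implementation: the graph is represented as a dictionary of dictionaries mapping each node to its neighbors and edge weights, all weights are assumed to be strictly positive, and the function names and the toy graph are hypothetical. The default values of `beta`, `theta1`, and `theta2` follow the settings reported for the experiments.

```python
import math


def weight_entropy(graph, i):
    """Equation (2): entropy of the normalized edge weights around node i."""
    total = sum(graph[i].values())
    entropy = 0.0
    for w in graph[i].values():
        p = w / total  # share of node i's total weight carried by this edge
        entropy -= p * math.log10(p)
    return entropy


def confidence_entropy(graph, i, beta):
    """Equations (3) and (4): entropy of the confidence levels T_ij."""
    # The degree of neighbor j is the number of its own neighbors.
    powered = [len(graph[j]) ** beta for j in graph[i]]
    norm = sum(powered)
    entropy = 0.0
    for value in powered:
        t = value / norm  # confidence level T_ij
        entropy -= t * math.log10(t)
    return entropy


def direct_influence(graph, i, beta=2.0, theta1=0.5, theta2=0.5):
    """Equation (5): weighted sum of the two entropies."""
    return (theta1 * weight_entropy(graph, i)
            + theta2 * confidence_entropy(graph, i, beta))


# Hypothetical toy graph: node "a" is linked to "b" and "c" with equal weights.
toy = {
    "a": {"b": 0.5, "c": 0.5},
    "b": {"a": 0.5},
    "c": {"a": 0.5},
}
di_a = direct_influence(toy, "a")  # both entropies equal log10(2)
di_b = direct_influence(toy, "b")  # a single neighbor gives zero entropy
```

In this toy case node "a" spreads its weight and confidence evenly over two neighbors, so both entropies equal $\log_{10} 2$, while a node with a single neighbor has zero direct influence under this definition.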


Consider the case where individual $k$ is one of the two-hop neighbors of individual $i$, and let $N_{ik}$ denote the number of mutual one-hop neighbors of $i$ and $k$; that is, $N_{ik}$ two-hop paths exist between $i$ and $k$. Having quantified the direct influence on one-hop neighbors, the natural question is how to calculate the influence of $i$ on its two-hop neighbor $k$. Motivated by the relationship theory of three degrees proposed in[71], the indirect influence of individual $i$ on its two-hop neighbor $k$, denoted as $II_{ik}$, is given by:

$$II_{ik} = \frac{\sum_{j=1}^{N_{ik}} DI_i \cdot DI_j}{N_{ik}} \tag{6}$$

where $DI_i$ indicates the direct influence of $i$, and $DI_j$ indicates the direct influence of each mutual neighbor $j$ of $i$ and $k$.

330

and k, which is shown in Figure 2. Thus, the indirect influence of i on k is stated as follows:

al

331

Let us take Nik = 3 for example, and it means there are three paths between individual i

DIi · DIj + DIi · DIl + DIi · DIm 3

urn

IIik =

332

333

334

Jo

Vi

DIi DIi

DIi

Vj Vl Vm

(7)

DIj DIl

Vk

DIm

Figure 2: Three paths between i and k

Based on the above analysis, the average indirect influence of individual $i$ on all of its two-hop neighbors, denoted as $II_i$, is:

$$II_i = \frac{\sum_{k=1}^{M_i} II_{ik}}{M_i} \tag{8}$$

where $M_i$ indicates the total number of individual $i$'s two-hop neighbors. Accordingly, the indirect influence of node $v_1$ in Figure 1 can be expressed as follows:

$$II_1 = \frac{(DI_1 \cdot DI_2 + DI_1 \cdot DI_3)/2 + DI_1 \cdot DI_3 + (DI_1 \cdot DI_4 + DI_1 \cdot DI_3)/2}{3} \tag{9}$$

Finally, the total influence of individual $i$, represented by $I_i$, is:

$$I_i = \varphi_1 \cdot DI_i + \varphi_2 \cdot II_i \tag{10}$$

where the coefficients $\varphi_1$ and $\varphi_2$ stand for the weights of direct influence and indirect influence, respectively, with $\varphi_1 + \varphi_2 = 1$.
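Equations (6), (8), and (10) can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: it assumes that a two-hop neighbor of $i$ is a node at distance exactly two (neither $i$ itself nor one of its direct neighbors), which is consistent with the worked example for node $v_1$, and it takes the direct influences as a precomputed dictionary `di`. The adjacency lists and the `di` values below are hypothetical and only mimic the neighborhood of $v_1$.

```python
def two_hop_mediators(graph, i):
    """Map each two-hop neighbor k of i to the mutual neighbors j on the paths i-j-k."""
    mediators = {}
    for j in graph[i]:
        for k in graph[j]:
            if k != i and k not in graph[i]:
                mediators.setdefault(k, []).append(j)
    return mediators


def indirect_influence(graph, di, i):
    """Equations (6) and (8): average of II_ik over all two-hop neighbors k."""
    mediators = two_hop_mediators(graph, i)
    if not mediators:
        return 0.0  # no two-hop neighbors, so no indirect influence
    per_target = [
        sum(di[i] * di[j] for j in js) / len(js)  # Equation (6)
        for js in mediators.values()
    ]
    return sum(per_target) / len(per_target)  # Equation (8)


def total_influence(graph, di, i, phi1=0.5, phi2=0.5):
    """Equation (10): weighted sum of direct and indirect influence."""
    return phi1 * di[i] + phi2 * indirect_influence(graph, di, i)


# Hypothetical adjacency lists around node 1 (neighbors 2, 3, 4).
graph = {
    1: [2, 3, 4],
    2: [1, 3, 5],
    3: [1, 2, 5, 6, 7],
    4: [1, 6],
    5: [2, 3],
    6: [3, 4],
    7: [3],
}
# Hypothetical direct influences, chosen only to make the arithmetic easy.
di = {1: 1.0, 2: 0.2, 3: 0.4, 4: 0.6, 5: 0.1, 6: 0.1, 7: 0.1}

ii_1 = indirect_influence(graph, di, 1)
i_1 = total_influence(graph, di, 1)
```

With these values the three two-hop neighbors of node 1 contribute $(0.2 + 0.4)/2$, $(0.4 + 0.6)/2$, and $0.4$, which mirrors the structure of Equation (9) and gives an indirect influence of about 0.4.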

4.2. Seeding strategies

For each of the centrality measurements mentioned above, namely betweenness centrality, closeness centrality, degree centrality, eigenvector centrality, and the proposed entropy-based centrality, we rank all nodes in descending order. We then compare the simulation results of the single-stage seeding strategy with those of the sequential seeding strategy on the same social network, with identical diffusion-model parameters and an equal number of seed targets. The two seeding strategies unfold in discrete steps as follows:

Single-stage seeding strategy:

(1) At time tick $t_0$, the top $k$ most influential nodes according to a given centrality measurement are selected as the initial seed targets and are all activated, which means that they have the capacity to influence their neighbors. This marks the start of influence diffusion;

(2) At each subsequent time tick $t > t_0$, every node $i$ newly activated in the previous step has one chance to activate each of its inactive direct neighbors $j$ with probability $p_{ij} = 1 - (1 - \gamma)^{w_{ij}}$. Regardless of whether the activation succeeds, node $i$ makes no further attempts to activate its neighbors in subsequent time ticks;

(3) The influence diffusion process continues without any external support until no newly activated nodes appear at a certain time tick $T_{SN}$, referred to as the saturation time, attaining the influence coverage $C_{SN}$.

Sequential seeding strategy:

In the sequential seeding strategy, we take the same number of seed targets as in the single-stage seeding strategy, but activate them over $m$ consecutive seeding actions interspersed with diffusion stages, where $m$ is the seeding frequency. Specifically, once a diffusion stage stops, another fraction of the seed targets is selected to trigger a new diffusion process, until the total number of selected seed targets equals $k$; the seed selection for each seeding action is based on the ranking of the nodes that are still inactive. Note that the same number of seed targets is selected for each seeding action; the heuristic strategy of an uneven sequential seeding rule will be investigated in our further work.

(1) At the initial time tick $t_0$, the top $x_1$ ($x_1 < k$) most influential nodes according to a given centrality measurement are selected as seeds and activated, which triggers the influence diffusion process;

(2) At any time tick $t > t_0$ in the first diffusion stage, each newly activated node has only one chance to activate each of its inactive neighbors, with probability $p_{ij} = 1 - (1 - \gamma)^{w_{ij}}$. Once this stage stops, the only way to continue diffusion is to activate the top $x_2$ ($x_1 + x_2 \le k$) ranked nodes that are not yet activated. In each subsequent seeding action, up to the seeding frequency $m$, the next activation is suspended until the coverage produced by the current diffusion stage no longer increases, and seeding ends once all $k$ seed targets have been used;

(3) Repeat the above procedure; the whole diffusion process ends when the influence coverage reaches its final value $C_{SQ}$ at the saturation time $T_{SQ}$.

In contrast to the single-stage seeding strategy, which consists of a single activation and one diffusion process, the sequential seeding strategy is a multistage diffusion process in which each activation is followed by its corresponding diffusion stage. For each sequential seeding action, the new seeds are selected from the nodes not yet activated, which avoids spending seeds on nodes that would be activated anyway through spontaneous diffusion.
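The two seeding procedures can be sketched as a single simulation routine. This is a minimal sketch under stated assumptions, not the authors' simulator: the activation probability is read as $p_{ij} = 1 - (1 - \gamma)^{w_{ij}}$, the graph is a dictionary of dictionaries of edge weights, seeds are split evenly over the $m$ seeding actions (with any remainder going to the last one), and the function names are hypothetical. Setting `m=1` gives the single-stage strategy.

```python
import random


def activation_probability(gamma, w):
    """Probability that an active node activates a neighbor over an edge of weight w."""
    return 1.0 - (1.0 - gamma) ** w


def diffuse(graph, active, frontier, gamma, rng):
    """Run one diffusion stage until no node is newly activated; return the ticks used."""
    ticks = 0
    while frontier:
        newly = []
        for i in frontier:  # each newly activated node gets exactly one round of attempts
            for j, w in graph[i].items():
                if j not in active and rng.random() < activation_probability(gamma, w):
                    active.add(j)
                    newly.append(j)
        frontier = newly
        ticks += 1
    return ticks


def seed_and_diffuse(graph, ranking, k, m, gamma, rng):
    """Activate k seeds over m seeding actions; return (coverage, ticks)."""
    active, ticks, used = set(), 0, 0
    per_stage = k // m
    for stage in range(m):
        quota = per_stage if stage < m - 1 else k - used
        # Seeds come from the ranking restricted to nodes that are still inactive.
        seeds = [v for v in ranking if v not in active][:quota]
        active.update(seeds)
        used += len(seeds)
        ticks += diffuse(graph, active, seeds, gamma, rng)
    return len(active), ticks


# Hypothetical toy network with two disconnected pairs; gamma = 1 makes diffusion certain.
toy = {0: {1: 0.5}, 1: {0: 0.5}, 2: {3: 0.5}, 3: {2: 0.5}}
ranking = [0, 1, 2, 3]
rng = random.Random(0)
single_coverage, _ = seed_and_diffuse(toy, ranking, k=2, m=1, gamma=1.0, rng=rng)
sequential_coverage, _ = seed_and_diffuse(toy, ranking, k=2, m=2, gamma=1.0, rng=rng)
```

In this toy example the single-stage strategy spends both seeds on nodes 0 and 1, which are neighbors, and covers 2 nodes, whereas the sequential strategy reaches node 1 by diffusion, spends its second seed on node 2, and covers all 4 nodes.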

5. Evaluation of model performance

In this section, we conduct several experiments and analyze the results. Since the network structure plays a significant role in modeling influence diffusion and in validating the efficiency of the proposed centrality measurement, we conduct extensive experiments on several networks, including two empirical networks and four classical artificial networks, to compare its performance with that of other centrality measurements. The two empirical social networks are constructed from data collected on Facebook.com, where the propagation of influence, such as the US presidential campaign, the promotion of innovation policies, and the spread of fashion, is ubiquitous. Because of our limited computational capacity, we select two undirected, unweighted online collegiate networks: the Caltech Facebook Network[74] with 769 nodes and 16656 edges, and the Princeton Facebook Network[75] with 6596 nodes and 293320 edges. The two networks consist of the complete set of users and relationships extracted from the Facebook networks of the California Institute of Technology and Princeton University. Each node corresponds to an individual, and each undirected edge indicates a social connection. The network statistics are calculated using Python's NetworkX package and displayed in Table 1.

In addition, we verify the performance of the proposed centrality on four classical artificial networks: the BA scale-free network, the regular ring network, the WS small-world network, and the ER random network. These network structures are generated as follows. The BA scale-free network starts from $m_0$ fully connected nodes; at each time step, a new node with $m$ ($m < m_0$) edges is added to the existing network according to preferential attachment, i.e., the probability of connecting to an existing node $i$ is proportional to its degree $k_i$. In the simulations, we set $N = 769$ for comparison with the empirical Caltech network, and $m = 2$, so that the average degree is about 4. The regular ring network starts from a ring of $N$ nodes in which each node is symmetrically connected to its $2m$ nearest neighbors ($m$ on each side), for a total of $Nm$ edges; here we also set $N = 769$ and $m = 2$. The WS small-world network is constructed by starting from a regular ring network with 769 nodes and 4 edges per node, and then rewiring each edge with probability $p = 0.3$. Similarly, we generate the ER random network with 769 nodes and average degree $\langle k \rangle = 4$, achieved by rewiring each edge with probability $p = 0.3$. After establishing these network structures, we assign each edge a weight drawn at random from the uniform distribution on the interval from 0 to 1, for simplicity.
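Since the network statistics are computed with NetworkX, the four artificial networks can plausibly be generated with the same package. The sketch below is one possible setup and not the authors' script. Two points are assumptions: NetworkX's `barabasi_albert_graph` uses its own small initial graph rather than $m_0$ fully connected nodes, and the ER network is drawn here with `gnm_random_graph` using exactly $Nm$ edges so that the average degree is 4, which replaces the rewiring description given in the text. The function name `build_artificial_networks` and the seed value are hypothetical.

```python
import random

import networkx as nx


def build_artificial_networks(n=769, m=2, p=0.3, seed=0):
    """Generate the four artificial networks and assign uniform random edge weights."""
    networks = {
        # Preferential attachment with m edges per new node.
        "BA": nx.barabasi_albert_graph(n, m, seed=seed),
        # A Watts-Strogatz graph with rewiring probability 0 is a regular ring lattice.
        "ring": nx.watts_strogatz_graph(n, 2 * m, 0.0, seed=seed),
        # Ring lattice with each edge rewired with probability p.
        "WS": nx.watts_strogatz_graph(n, 2 * m, p, seed=seed),
        # Random graph with n * m edges, i.e. average degree 2m (assumption, see lead-in).
        "ER": nx.gnm_random_graph(n, n * m, seed=seed),
    }
    rng = random.Random(seed)
    for graph in networks.values():
        for u, v in graph.edges():
            graph[u][v]["weight"] = rng.random()  # uniform on [0, 1)
    return networks


networks = build_artificial_networks()
average_degrees = {
    name: 2 * g.number_of_edges() / g.number_of_nodes() for name, g in networks.items()
}
```

Fixing the seed keeps the generated structure static across the repeated simulations, which matches the use of one generated network per comparison.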


Table 1: Basic information of two real-world social networks

| Network | Nodes | Edges | Density | Maximum degree | Average degree | Assortativity | Clustering coefficient | Maximum k-core |
|---|---|---|---|---|---|---|---|---|
| Caltech | 769 | 16656 | 0.0564 | 248 | 43 | -0.0653 | 0.4093 | 36 |
| Princeton | 6596 | 293320 | 0.0135 | 628 | 88 | 0.0911 | 0.2369 | 77 |

5.1. Model comparison

To compare the proposed model in detail with the other centrality measurements mentioned above under the single-stage and the sequential seeding strategies, we repeat the numerical simulation 100 times on the same network for each comparison and take the average influence coverage, which serves as an indicator of the centrality's effectiveness. In each simulation, regardless of the seeding strategy, we select the $k = 10$ most influential nodes as seed targets in total. The results are presented in Figure 3.

Figure 3 illustrates how the influence coverage of the single-stage and sequential seeding strategies varies over time under different centralities in the six networks, with the parameters set as follows: $\beta = 2$, $\gamma = 0.4$, $\theta_1 = \theta_2 = 0.5$, $\varphi_1 = \varphi_2 = 0.5$, $m = 5$. The subgraphs in the left column correspond to the single-stage seeding strategy and those in the right column to the sequential seeding strategy. Our main purpose is to compare the performance of the seeding strategies under different centralities. As shown in Figure 3, the influence coverage increases with the time step, as expected. For a given centrality measurement, the initial propagation speed of the single-stage seeding strategy exceeds that of the sequential seeding strategy, but the latter generally achieves a larger influence coverage. Therefore, there is a trade-off between coverage and speed for seeding strategies. The coverage of each centrality varies with the seeding strategy and the network. For example, in the BA scale-free network the influence coverage of the five centralities is almost identical under the single-stage seeding strategy (Figure 3(a)), whereas the entropy-based centrality performs best under the sequential seeding strategy (Figure 3(b)). Further, in the left column, the curves generated in the ER random network (Figure 3(g)) and in the Caltech (Figure 3(i)) and Princeton (Figure 3(k)) social networks almost overlap for the different centralities. This means that the performance of the entropy-based centrality is similar to that of the other four classic centralities in those networks, and suggests that the structure of the two chosen empirical social networks may be analogous to that of the ER random network. Similarly, for the sequential seeding strategy in the ER random network (Figure 3(h)) and in the Caltech (Figure 3(j)) and Princeton (Figure 3(l)) social networks, the curves of the different centralities almost coincide with each other. However, the performance ranking of the five centralities differs among the BA scale-free network (Figure 3(b)), the regular network (Figure 3(d)), and the WS small-world network (Figure 3(f)). This indicates that the network structure has a significant impact on the performance of the sequential seeding strategy. Although eigenvector centrality is significantly more effective than all other centralities in the regular network (Figure 3(d)), it performs worse than closeness centrality and the proposed entropy-based centrality in the BA scale-free network (Figure 3(b)).

Figure 3: The influence coverage of single-stage seeding strategy vs. sequential seeding strategy based on five centralities in different networks. Panels: (a) single-stage seeding in BA scale-free network; (b) sequential seeding in BA scale-free network; (c) single seeding in regular network; (d) sequential seeding in regular network; (e) single seeding in WS small-world network; (f) sequential seeding in WS small-world network; (g) single seeding in ER random network; (h) sequential seeding in ER random network; (i) single seeding in Caltech collegiate network; (j) sequential seeding in Caltech collegiate network; (k) single seeding in Princeton collegiate network; (l) sequential seeding in Princeton collegiate network.

Among the four typical types of artificial networks (the Caltech and Princeton networks can in effect be grouped with the ER random network according to the above analysis), we observe an anomaly in the ER random network: performance is consistent under the single-stage and the sequential seeding strategies. As shown in Figure 3(g), 3(h), 3(i), 3(j), 3(k), and 3(l), the curves generated from the five centralities almost overlap for both seeding strategies. One plausible explanation is that the randomness of the network structure makes the five centralities nearly indistinguishable, and that propagation is so rapid that the influence coverage reaches its maximum value in the first diffusion stage of the sequential seeding strategy, which then differs little from the single-stage seeding strategy. In addition, Figure 3 shows that, for a given centrality, the sequential seeding strategy attains a larger influence coverage than the single-stage seeding strategy but takes longer to do so. Note that under the single-stage strategy some seed targets could have been influenced naturally by their neighbors and therefore need not have been selected as initial seeds. There is thus a trade-off between influence coverage and time consumed.

In summary, in terms of the influence coverage in the BA scale-free network (Figure 3(a), 3(b)), the sequential seeding strategy performs better than the single-stage seeding strategy regardless of the centrality used, and the proposed entropy-based centrality performs best. In the regular network under the sequential seeding strategy (Figure 3(d)), both the diffusion speed and the influence coverage are higher for eigenvector centrality than for all other centralities, whose performance is almost the same. However, the influence coverage of the proposed entropy-based centrality is still slightly the highest under the single-stage seeding strategy (Figure 3(c)). Comparing the two strategies in terms of influence coverage in Figure 3(e) and 3(f), the proposed centrality is inferior to degree centrality, which performs best. The curves in the remaining subgraphs (Figure 3(g), 3(h), 3(i), 3(j), 3(k), 3(l)) are roughly the same, since their underlying network structures are of the same type. As can be observed from the left column of subgraphs for the BA scale-free network (Figure 3(a)), the regular network (Figure 3(c)), and the WS small-world network (Figure 3(e)), although the entropy-based centrality accelerates slowly under the single-stage seeding strategy, its final influence coverage reaches a higher level than that of the other centralities. The situation is different for the sequential seeding strategy in the right column of subgraphs (Figure 3(b), 3(d), 3(f)). In the BA scale-free network (Figure 3(b)), for example, the coverage of the entropy-based centrality rises to the maximum value with the highest acceleration, whereas neither its acceleration nor its coverage stands out in the regular network (Figure 3(d)) or the WS small-world network (Figure 3(f)).

5.2. Parameter analysis


To systematically investigate how model parameters affect the performance of the sequential seeding strategy, we conduct experiments that tune one parameter at a time and record the influence coverage. In light of the optimal performance of the entropy-based centrality under the sequential seeding strategy in the BA scale-free network, we continue to use this network as the controlled test platform. In the following simulations, we focus on two model parameters: the confidence strength $\beta$, which reflects an individual's sensitivity to the degree heterogeneity of its neighbors, and the seeding frequency $m$, which determines the trade-off between influence coverage and time cost. All other parameters remain the same: $\gamma = 0.4$, $\theta_1 = \theta_2 = 0.5$, $\varphi_1 = \varphi_2 = 0.5$. The baseline values of the confidence strength and the seeding frequency are $\beta = 2$ and $m = 5$; when we study the effect of one parameter, the other is kept at its baseline value.

Figure 4: The non-monotonic behavior of influence coverage as the confidence strength increases

Figure 4 shows the influence coverage of the proposed entropy-based centrality measurement as a function of the confidence strength $\beta$ in the BA scale-free network under the sequential seeding strategy. The value next to each data point indicates the corresponding time consumed by the sequential seeding strategy. The confidence strength captures how sensitive a node's confidence in a neighbor is to that neighbor's degree. Our original purpose is to observe whether there is a monotonic relationship between the confidence strength $\beta$ and the influence coverage. As $\beta$ increases from $-5$ to $5$ in steps of 1, the influence coverage fluctuates noticeably. For instance, for $\beta \le 0$, as $\beta$ increases from $-5$ to $0$ the influence coverage first drops sharply from 262 to 243, then fluctuates within a small range, and finally rises quickly from 251 to 278 as $\beta$ goes from $-1$ to $0$. Similar behavior is observed in the right part of the curve. In short, the influence coverage reaches its maximum at $\beta = 0$. What causes this oscillation? Consider the definition of the confidence influence entropy: when $\beta = 0$, a node influences all of its neighbors with equal probability, i.e., the distribution of confidence over neighboring nodes is most uniform ($T_{ij} = 1/N_i$ in Equation 3), so the confidence influence entropy attains its largest value. This also corresponds to the fact that the more elements a system contains, the greater the uncertainty about any one outcome. The results mean that when an individual trusts all neighbors equally, that is, when node heterogeneity has little effect on the confidence level, the proposed centrality achieves the largest total influence coverage.
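The uniformity argument can be checked numerically for the entropy term alone. The sketch below evaluates the confidence influence entropy of Equation (4) for one node over the same range of $\beta$ values, using a hypothetical list of neighbor degrees; it illustrates only that the entropy peaks at $\beta = 0$, and says nothing by itself about the simulated influence coverage.

```python
import math


def confidence_entropy(neighbor_degrees, beta):
    """Equation (4) for one node, given the degrees of its neighbors."""
    powered = [k ** beta for k in neighbor_degrees]
    norm = sum(powered)
    return -sum((v / norm) * math.log10(v / norm) for v in powered)


# Hypothetical neighbor degrees of a single node (illustrative values only).
degrees = [1, 2, 4, 8]

# Sweep beta from -5 to 5 in steps of 1, as in the parameter analysis.
entropies = {beta: confidence_entropy(degrees, beta) for beta in range(-5, 6)}
best_beta = max(entropies, key=entropies.get)
```

For these degrees the maximum is attained at `best_beta == 0`, where every $T_{ij}$ equals $1/4$ and the entropy equals $\log_{10} 4$; any nonzero $\beta$ concentrates confidence on high-degree or low-degree neighbors and lowers the entropy.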


Figure 5: Results from sequential seeding strategies with four kinds of seeding frequency in the BA scale-free network

Figure 5 presents the influence coverage of the proposed entropy-based centrality measurement over time $t$ for different seeding frequencies $m$ in the BA scale-free network. The beginning of the curves is labeled and magnified in Figure 5 so that the four seeding strategies with different seeding frequencies can be distinguished visually. In the numerical simulations, each data point is the average of 100 realizations on the generated BA scale-free network. The statistics of the simulation results are given in Figure 6. Because the network structure is static, the standard deviation of each simulation result is very small relative to the network scale, which indicates that the simulation results are robust. As in Figure 3, the parameters are set as follows: $\beta = 2$, $\gamma = 0.4$, $\theta_1 = \theta_2 = 0.5$, $\varphi_1 = \varphi_2 = 0.5$, and the total number of seeds is again 10. It is worth noting that the red dotted curve for seeding frequency $m = 2$, where the sequential seeding strategy is performed in two stages with five seed targets used in each stage, behaves differently from the others. Considering the scale-free nature of the BA network, we speculate that the network structure leads to this unexpected phenomenon. Generally speaking, the BA scale-free network contains a small number of nodes with extremely large degree and a vast majority of nodes with very small degree. Correspondingly, the nodes sorted by entropy-based centrality have similar characteristics: a small number of nodes have significant influence and most nodes have ordinary influence. In this critical case of seeding frequency $m = 2$, some nodes with extremely large influence are activated naturally, so that subsequent seed targets can only be selected among ordinary nodes, which slows the diffusion.

Figure 6: Standard deviation of the 100 realizations on the generated BA scale-free network in Figure 5

As can be seen from the results, the influence coverage increases as the seeding frequency $m$ increases. On the other hand, the acceleration of influence diffusion decreases as $m$ increases. That is, there is a trade-off between the coverage and the acceleration of influence diffusion. These results have several management implications. A company with a strong desire to capture market share quickly should distribute its product samples to individuals in as short a time as possible. Conversely, a company that cares less about how fast it gains market share and more about market scale should distribute samples in as many stages as possible. In other words, under the same marketing budget, the more sample-distribution actions a company carries out, the greater the market share it ultimately obtains.

6. Conclusion

In this paper, we present an improved centrality measurement built on the concept of entropy, which takes into account the connection weight and the confidence level. We propose a new method to quantify the influence entropy of each individual, comprising the weight influence entropy and the confidence influence entropy, and then design seeding strategies. Using a widely adopted, realistic diffusion model to capture the influence diffusion process, we focus on the performance of the sequential seeding strategy compared with that of the single-stage seeding strategy. First, we compare the influence coverage of several types of centrality measurements under the single-stage seeding strategy and the sequential seeding strategy, respectively. Then, based on the proposed entropy-based centrality, we investigate the effect of the confidence strength $\beta$ on the influence coverage under the sequential seeding strategy in the BA scale-free network. Finally, we observe the performance of sequential seeding strategies with the entropy-based centrality and describe the relationship between the influence coverage and the seeding frequency $m$. It is worth emphasizing that a critical situation arises from the specific structure of the BA scale-free network. By examining the topology of the social network closely, we can better choose a suitable seeding strategy to obtain the best diffusion effect. Simulation results show that the proposed entropy-based centrality is superior to the other centralities in terms of diffusion speed and influence coverage in the BA scale-free network. Parameter analysis of the sequential seeding strategy demonstrates that the proposed centrality achieves the greatest total influence coverage when an individual's confidence in each neighbor is equal.

As for future work, more sequential seeding strategies based on the entropy-based centrality will be investigated. For example, finding an appropriate compromise between sequential seeding strategies and the cost of diffusion is a significant issue. Moreover, seeding strategies on dynamic social networks are a promising research direction. Taking this one step further, we expect that our findings can inspire the study of further properties of seeding strategies and can be extended to real-world networks.

Acknowledgment

The authors are grateful to the anonymous reviewers and the editor for their valuable comments and suggestions. This research is supported by National Science Foundation of China (No. 71672065).

References

[1] M. E. J. Newman, Spread of epidemic disease on networks, Physical Review E 66 (2002) 016128.

[2] D. Zinoviev, V. Duong, H. Zhang, A game theoretical approach to modeling information dissemination in social networks, arXiv preprint arXiv:1006.5493 (2010).

[3] C. Mast, S. Huck, A. Zerfass, Innovation communication, Innovation Journalism 2 (2005) 165.

[4] W. Chen, Y. Wang, S. Yang, Efficient influence maximization in social networks, in: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2009, pp. 199–208.

[5] Y. Chen, J. Xie, Online consumer review: word-of-mouth as a new element of marketing communication mix, Management Science 54 (2008) 477–491.

[6] V. Mahajan, E. Muller, R. A. Kerin, Introduction strategy for new products with positive and negative word-of-mouth, Management Science 30 (1984) 1389–1404.

[7] E. D’Andrea, P. Ducange, A. Bechini, A. Renda, F. Marcelloni, Monitoring the public opinion about the vaccination topic from tweets analysis, Expert Systems with Applications 116 (2019) 209–226.

[8] Z. Zhao, H. Lu, D. Cai, X. He, Y. Zhuang, User preference learning for online social recommendation, IEEE Transactions on Knowledge and Data Engineering 28 (2016) 2522–2534.

[9] J. Lim, T. Li, The optimal advertising-allocation rules for sequentially released products: the case of the motion picture industry, Journal of Advertising Research 58 (2018) 228–239.

[10] M. Rafiei, A. A. Kardan, A novel method for expert finding in online communities based on concept map and pagerank, Human-centric Computing and Information Sciences 5 (2015) 10.

[11] U. Brandes, S. P. Borgatti, L. C. Freeman, Maintaining the duality of closeness and betweenness centrality, Social Networks 44 (2016) 153–159.

[12] J. Zhao, T.-H. Yang, Y. Huang, P. Holme, Ranking candidate disease genes from gene expression and protein interaction: a katz-centrality based approach, PloS One 6 (2011) e24306.

[13] P. Hage, F. Harary, Eccentricity and centrality in networks, Social Networks 17 (1995) 57–63.

[14] I. Poulakakis, G. F. Young, L. Scardovi, N. E. Leonard, Information centrality and ordering of nodes for accuracy in noisy decision-making networks, IEEE Transactions on Automatic Control 61 (2015) 1040–1045.

[15] D. Chen, L. Lü, M.-S. Shang, Y.-C. Zhang, T. Zhou, Identifying influential nodes in complex networks, Physica A: Statistical Mechanics and its Applications 391 (2012) 1777–1787.

[16] C. Gao, D. Wei, Y. Hu, S. Mahadevan, Y. Deng, A modified evidential methodology of identifying influential nodes in weighted networks, Physica A: Statistical Mechanics and its Applications 392 (2013) 5490–5500.

[17] T. Petermann, P. De Los Rios, Role of clustering and gridlike ordering in epidemic spreading, Physical Review E 69 (2004) 066116.

[18] D.-B. Chen, H. Gao, L. Lü, T. Zhou, Identifying influential nodes in large-scale directed networks: the role of clustering, PloS One 8 (2013) e77455.

[19] M. Kitsak, L. K. Gallos, S. Havlin, F. Liljeros, L. Muchnik, H. E. Stanley, H. A. Makse, Identification of influential spreaders in complex networks, Nature Physics 6 (2010) 888.

[20] A. Zeng, C.-J. Zhang, Ranking spreaders by decomposing complex networks, Physics Letters A 377 (2013) 1031–1035.

[21] S. Pei, L. Muchnik, J. S. Andrade Jr, Z. Zheng, H. A. Makse, Searching for superspreaders of information in real-world social media, Scientific Reports 4 (2014) 5547.

[22] J.-G. Liu, Z.-M. Ren, Q. Guo, Ranking the spreading influence in complex networks, Physica A: Statistical Mechanics and its Applications 392 (2013) 4154–4159.

[23] Q. Hu, Y. Gao, P. Ma, Y. Yin, Y. Zhang, C. Xing, A new approach to identify influential

646

spreaders in complex networks, in: International Conference on Web-Age Information

647

Management, Springer, 2013, pp. 99–104.

649

p ro

648

of

645

[24] B. Min, F. Liljeros, H. A. Makse, Finding influential spreaders from human activity beyond network location, PloS one 10 (2015) e0136831.

[25] Y. Liu, M. Tang, T. Zhou, Y. Do, Improving the accuracy of the k-shell method by

651

removing redundant links: from a perspective of spreading dynamics, Scientific reports

652

5 (2015) 13172.

656

657

658

659

660

661

662

[27] S. Cao, M. Dehmer, Y. Shi, Extremality of degree-based graph entropies, Information Sciences 278 (2014) 22–33.

al

655

of the National academy of Sciences 102 (2005) 16569–16572.

[28] Z. Chen, M. Dehmer, Y. Shi, Bounds for degree-based network entropies, Applied Mathematics and Computation 265 (2015) 983–993.

urn

654

[26] J. E. Hirsch, An index to quantify an individual’s scientific research output, Proceedings

[29] A. G. Nikolaev, R. Razib, A. Kucheriya, On efficient use of entropy centrality for social network analysis and community detection, Social Networks 40 (2015) 154–162. [30] S. Cao, M. Dehmer, Degree-based entropies of networks revisited, Applied Mathematics and Computation 261 (2015) 141–147.

Jo

653

Pr e-

650

663

[31] T. Nie, Z. Guo, K. Zhao, Z.-M. Lu, Using mapping entropy to identify node centrality

664

in complex networks, Physica A: Statistical Mechanics and its Applications 453 (2016)

665

290–297.

666

667

[32] L. Fei, Y. Deng, A new method to identify influential nodes based on relative entropy, Chaos, Solitons & Fractals 104 (2017) 257–267. 28

Journal Pre-proof

669

670

671

672

673

[33] S. Peng, A. Yang, L. Cao, S. Yu, D. Xie, Social influence modeling using information theory in mobile social networks, Information Sciences 379 (2017) 146–159. [34] T. Qiao, W. Shan, C. Zhou, How to identify the most powerful node in complex networks? a novel entropy centrality approach, Entropy 19 (2017) 614. [35] Y. Ni, L. Xie, Z.-Q. Liu, Minimizing the expected complete influence time of a social network, Information Sciences 180 (2010) 2514–2527.

of

668

[36] D. Kempe, J. Kleinberg, É. Tardos, Maximizing the spread of influence through a so-

675

cial network, in: Proceedings of the ninth ACM SIGKDD international conference on

676

Knowledge discovery and data mining, ACM, 2003, pp. 137–146.

p ro

674

[37] J. Jankowski, P. Bródka, P. Kazienko, B. K. Szymanski, R. Michalski, T. Kajdanow-

678

icz, Balancing speed and coverage by sequential seeding in complex networks, Scientific

679

reports 7 (2017) 891.

682

683

684

685

empirical comparison, Journal of Marketing 75 (2011) 55–71. [39] Y. Ni, Sequential seeding to optimize influence diffusion in a social network, Applied Soft Computing 56 (2017) 730–737.

al

681

[38] O. Hinz, B. Skiera, C. Barrot, J. U. Becker, Seeding strategies for viral marketing: an

[40] J. Jankowski, B. K. Szymanski, P. Kazienko, R. Michalski, P. Bródka, Probing limits of information spread with sequential seeding, Scientific reports 8 (2018) 13996.

urn

680

Pr e-

677

[41] S. P. Borgatti, Centrality and network flow, Social networks 27 (2005) 55–71.

687

[42] Q. Liu, T. Hong, Sequential seeding for spreading in complex networks: influence of

688

the network topology, Physica A: Statistical Mechanics and its Applications 508 (2018)

689

10–17.

690

691

692

693

Jo

686

[43] P. Bonacich, Factoring and weighting approaches to status scores and clique identification, Journal of mathematical sociology 2 (1972) 113–120. [44] T. L. Griffiths, M. Steyvers, A. Firl, Google and the mind: Predicting fluency with pagerank, Psychological Science 18 (2007) 1069–1076. 29

Journal Pre-proof

694

695

696

697

[45] A. Banerjee, A. G. Chandrasekhar, E. Duflo, M. O. Jackson, The diffusion of microfinance, Science 341 (2013) 1236498. [46] K. Stephenson, M. Zelen, Rethinking centrality: Methods and examples, Social networks 11 (1989) 1–37. [47] K. L. Calvert, E. W. Zegura, M. J. Donahoo, Core selection methods for multicast routing,

699

in: Proceedings of Fourth International Conference on Computer Communications and

700

Networks-IC3N’95, IEEE, 1995, pp. 638–642.

p ro

of

698

701

[48] H. Q. Cheng, Y. Y. Shen, M. P. Fei, G. Yang, Y. Zhang, X. C. Xiao, A new approach to

702

identify influential spreaders in complex networks, ActaPhys. Sin 62 (2013) 140101.

703

[49] P. Basaras, D. Katsaros, L. Tassiulas, Detecting influential spreaders in complex, dynamic

707

708

709

710

711

712

713

714

Pr e-

706

[50] L. Lü, T. Zhou, Q.-M. Zhang, H. E. Stanley, The h-index of a network node and its relation to degree and coreness, Nature communications 7 (2016) 10168. [51] C. E. Shannon, A mathematical theory of communication, Bell system technical journal 27 (1948) 379–423.

[52] T. Qiao, W. Shan, G. Yu, C. Liu, A novel entropy-based centrality approach for identi-

al

705

networks, Computer 46 (2013) 24–29.

fying vital nodes in weighted networks, Entropy 20 (2018) 261. [53] F. Tutzauer, Entropy as a measure of centrality in networks characterized by path-transfer

urn

704

flow, Social networks 29 (2007) 249–265. [54] L. Seeman, Y. Singer, Adaptive seeding in social networks, in: 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, IEEE, 2013, pp. 459–468. [55] A. Sela, I. Ben-Gal, A. S. Pentland, E. Shmueli, Improving information spread through

716

a scheduled seeding approach, in: Proceedings of the 2015 IEEE/ACM International

717

Conference on Advances in Social Networks Analysis and Mining 2015, ACM, 2015, pp.

718

629–632.

719

720

Jo

715

[56] G. Tong, W. Wu, S. Tang, D.-Z. Du, Adaptive influence maximization in dynamic social networks, IEEE/ACM Transactions on Networking (TON) 25 (2017) 112–125. 30

Journal Pre-proof

721

[57] J. Jankowski, P. Bródka, R. Michalski, P. Kazienko, Seeds buffering for information

722

spreading processes, in: International Conference on Social Informatics, Springer, 2017,

723

pp. 628–641.

725

[58] F. Chierichetti, J. Kleinberg, A. Panconesi, How to schedule a cascade in an arbitrary graph, SIAM Journal on Computing 43 (2014) 1906–1920.

of

724

[59] S. Lin, Q. Hu, F. Wang, S. Y. Philip, Steering information diffusion dynamically against

727

user attention limitation, in: 2014 IEEE International Conference on Data Mining, IEEE,

728

2014, pp. 330–339.

p ro

726

[60] D. Goldenberg, A. Sela, E. Shmueli, Timing matters: influence maximization in so-

730

cial networks through scheduled seeding, IEEE Transactions on Computational Social

731

Systems 5 (2018) 621–638.

Pr e-

729

732

[61] D. Kempe, J. Kleinberg, É. Tardos, Influential nodes in a diffusion model for social

733

networks, in: International Colloquium on Automata, Languages, and Programming,

734

Springer, 2005, pp. 1127–1138.

736

[62] Y. Gang, Z. Tao, W. Jie, F. Zhong-Qian, W. Bing-Hong, Epidemic spread in weighted scale-free networks, Chinese Physics Letters 22 (2005) 510.

al

735

[63] W. Wang, M. Tang, H.-F. Zhang, H. Gao, Y. Do, Z.-H. Liu, Epidemic spreading on

738

complex networks with general degree and weight distributions, Physical Review E 90

739

(2014) 042803.

741

742

743

744

745

[64] D. Bonchev, N. Trinajstić, Information theory, distance matrix, and molecular branching, The Journal of Chemical Physics 67 (1977) 4517–4533. [65] M. Dehmer, Information processing in complex networks: graph entropy and information

Jo

740

urn

737

functionals, Applied Mathematics and Computation 201 (2008) 82–94. [66] J. L. Proops, Entropy, information and confusion in the social sciences, Journal of Interdisciplinary Economics 1 (1987) 225–242.

31

Journal Pre-proof

746

[67] W. Yang, G. Wang, M. Z. A. Bhuiyan, K.-K. R. Choo, Hypergraph partitioning for social

747

networks based on information entropy modularity, Journal of Network and Computer

748

Applications 86 (2017) 59–71. [68] J. D. Cruz, C. Bothorel, F. Poulet, Entropy based community detection in augmented

750

social networks, in: 2011 International Conference on computational aspects of social

751

networks (CASoN), IEEE, 2011, pp. 163–168.

753

[69] P. Yuan, H. Ma, H. Fu, Hotspot-entropy based data forwarding in opportunistic social

p ro

752

of

749

networks, Pervasive and Mobile Computing 16 (2015) 136–154.

[70] S. He, X. Zheng, D. Zeng, K. Cui, Z. Zhang, C. Luo, Identifying peer influence in online

755

social networks using transfer entropy, in: Pacific-Asia Workshop on Intelligence and

756

Security Informatics, Springer, 2013, pp. 47–61.

759

760

761

762

763

764

works and human behavior, Statistics in medicine 32 (2013) 556–577. [72] B. Latré, B. Braem, I. Moerman, C. Blondia, P. Demeester, A survey on wireless body area networks, Wireless Networks 17 (2011) 1–18.

[73] B. Wejnert, Integrating models of diffusion of innovations: A conceptual framework,

al

758

[71] N. A. Christakis, J. H. Fowler, Social contagion theory: examining dynamic social net-

Annual review of sociology 28 (2002) 297–326. [74] A. L. Traud, E. D. Kelsic, P. J. Mucha, M. A. Porter, Comparing community structure

urn

757

Pr e-

754

to characteristics in online collegiate social networks, SIAM review 53 (2011) 526–543. [75] C. L. Staudt, Y. Marrakchi, H. Meyerhenke, Detecting communities around seed nodes

766

in complex networks, in: 2014 IEEE International Conference on Big Data (Big Data),

767

IEEE, 2014, pp. 62–69.

Jo

765

32

Declaration of interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
