Multi-criteria node criticality assessment framework for critical infrastructure networks


Journal Pre-proof

Multi-Criteria Node Criticality Assessment Framework for Critical Infrastructure Networks
Luca Faramondi, Gabriele Oliva, Roberto Setola

PII: S1874-5482(20)30002-0
DOI: https://doi.org/10.1016/j.ijcip.2020.100338
Reference: IJCIP 100338
To appear in: International Journal of Critical Infrastructure Protection
Received date: 26 October 2019
Revised date: 30 December 2019
Accepted date: 28 January 2020

Please cite this article as: Luca Faramondi, Gabriele Oliva, Roberto Setola, Multi-Criteria Node Criticality Assessment Framework for Critical Infrastructure Networks, International Journal of Critical Infrastructure Protection (2020), doi: https://doi.org/10.1016/j.ijcip.2020.100338

© 2020 Published by Elsevier B.V.

Multi-Criteria Node Criticality Assessment Framework for Critical Infrastructure Networks

Luca Faramondi, Gabriele Oliva, Roberto Setola
Unit of Automatic Control, Department of Engineering, Università Campus Bio-Medico di Roma, Rome, Italy

Abstract

Spotting criticalities in Critical Infrastructure networks is a crucial task in order to implement effective protection strategies against exogenous or malicious events. Yet, most of the approaches in the literature focus on specific aspects (e.g., presence of hubs, minimum paths), and there is a need to identify tradeoffs among importance metrics that are typically clashing with each other. In this paper we propose an approach for the assessment of criticalities which combines multi-criteria decision making techniques and topological/dynamical centrality measures. In particular, we resort to the Sparse Analytic Hierarchy Process (SAHP) technique to calculate the relevance of the different metrics, based on pairwise comparisons of the metrics by Subject Matter Experts (SMEs), and to merge the different metrics into a holistic indicator of node criticality/importance that takes all of them into account. With the aim to experimentally demonstrate the potential of the proposed approach, we consider a case study related to the Central London Tube network. According to the experimental results, the proposed aggregated ranking exhibits negligible correlation with the single metrics being aggregated, thus suggesting that the proposed approach effectively combines the different metrics into a new perspective.

Keywords: Critical Infrastructure Protection, Vulnerability Identification, Multi-Criteria Decision-Making, Analytic Hierarchy Process, Centrality, Critical node detection

∗Corresponding author. Email address: [email protected] (Luca Faramondi)

Preprint submitted to Journal of LaTeX Templates, January 31, 2020

1. Introduction

Critical Infrastructures are vulnerable to both man-made and natural disasters (e.g., see [1, 2, 3]). Due to the potentially catastrophic consequences of such disruptions, it is extremely important to plan for their protection (e.g., see [4, 5]). In particular, since most of such systems have a geographically dispersed but tightly interrelated structure, it is quite convenient to consider infrastructures in terms of networks, where the subsystems are represented by nodes, interconnected by edges that model their relations (e.g., connection, flow or goods exchange, cyber dependency, etc.). Assessing the importance/criticality of infrastructures' subsystems or composing elements is a fundamental task in order to implement effective protection strategies, e.g., raising security at specific locations whose disruption would cause particularly severe consequences. For instance, in [6] a criticality assessment methodology is developed that aims to integrate existing security plans and risk assessments performed in isolated infrastructures; in [7] a methodology is developed that explicitly considers dynamic, time-based dependency analysis; in [8] an approach is proposed in which CI stakeholders create a common understanding of the causes and effects of a described threat scenario; in [9], a framework is provided for vulnerability assessment of gas pipeline-road networks due to gas pipeline failures.

In this view, the importance/criticality of a subsystem may depend not just on the characteristics of the subsystem itself, but also on the complex web of connections and relations that intertwine the composing elements [10, 11]. Notably, the assessment of the importance of nodes in a network has been a popular topic in the literature, and several metrics have been developed, each focusing on a specific aspect (e.g., degree, maximal flow, shortest path) [12, 13, 14, 15, 16, 17, 18]. For instance, in [14] a new metric of centrality is introduced for smart grids, which allows for identifying vulnerabilities and predicting brownouts and blackouts; in [18] a methodology to identify vulnerabilities in water distribution networks is developed. Moreover, in several cases, information is available only for a subset of the elements, e.g., in the case of water distribution networks where only few measurement points are available [19, 20], or when few control devices are scattered across the network [21, 22].

In this paper, we aim at addressing this issue by providing a multi-criteria framework that is able to combine several metrics of node importance (possibly defined over a subset of the nodes) into an aggregated measure; such a measure may represent the basis for developing a tool for CI operators aimed at supporting the decisions related to the prioritization of protection investments, considering a plethora of metrics at the same time with a holistic perspective. In more detail, we consider two complementary levels of analysis. As a first step, we compare the different metrics by resorting to the experience of Subject Matter Experts (SMEs), who analyze pairs of alternatives; based on these comparisons, we are able to calculate the importance of each metric for the SMEs. Then, we combine the different metrics taking into account the importance of each metric. Both tasks are done via the Sparse Analytic Hierarchy Process (SAHP) method [23, 24, 25], which extends the traditional Analytic Hierarchy Process (AHP) [26] to the case of partial information. In more detail, we use the SAHP methodology in order to handle several criteria at once (i.e., the metrics being aggregated), each weighted according to the previously computed importance values. From a technical side, we formulate our problem in terms of a least-squares optimization problem and we provide a closed form for the global optimal solution. Note that this paper extends our early work in [27] to the case of different weights for the different metrics, possibly defined over different graphs having the same set of nodes (e.g., we consider different sets of edges, each conveying specific information such as structural interconnection, flow or other dependencies) and possibly defined over subsets of the nodes.

It should be noted that the problem of aggregating rankings has raised some interest in previous research: in [28] Kendall and Hausdorff distances are used to compare rankings and a median-based approach is used to identify an overall ranking; in [29] interval ordinal rankings are considered; in [30] (and references therein) the bucket order problem is considered, i.e., finding an agreement based on several ranking matrices with ordinal information; in [31] centrality measures are combined to devise a control strategy that minimizes control energy in networked dynamical systems. Notice that, in [4], the authors quantify the correlation of centrality measures with risk levels in Dependency Risk Graphs and provide a heuristic algorithm to recursively select a subset of nodes based on the centrality measure with the highest correlation.

With respect to previous literature, the aggregated ranking hereby proposed has a number of benefits: (i) it can be applied to incomplete metrics that are not defined over each node; (ii) being the result of a least-squares minimization problem, it represents the optimal tradeoff among the considered metrics; (iii) it provides a numerical characterization of the criticality of each node; (iv) it does not rely on prior knowledge on disruption risks and impacts; (v) from a technical standpoint, it is not computationally expensive, as it consists in solving a system of n linear equations with n unknowns, where n is the number of nodes in the network¹.

The remainder of this paper is organized as follows: Section 2 provides some background notation and preliminary definitions; Section 3 develops the proposed multi-criteria framework; Section 4 provides a validation of the proposed approach with respect to a case study involving the Central London Tube network; finally, Section 5 collects some conclusive remarks and future work directions.

¹ From the user point of view, the most time-consuming tasks to be performed within the proposed approach are the computation of the numerical values according to the different metrics being considered and the pairwise comparison of the metrics by SMEs (see Section 3 for details on these tasks). Once this information has been gathered, the proposed method boils down to calculations for which closed forms are given in the paper (see Section 3), which involve only matrix multiplication and the computation of the Moore-Penrose pseudoinverse of a matrix.


2. Preliminaries

In this section we provide useful notation and preliminaries that will be extensively used in the remainder of the paper.

2.1. Graph Theory

In this paper we denote vectors via boldface letters, while matrices are represented with an uppercase letter. Moreover, with the notation W_ij we denote the (i, j)-th entry of a matrix W and with q_i the i-th entry of a vector q. Let G = {V, E} be a graph with n nodes V = {v_1, …, v_n} and e edges E ⊆ V × V \ {(v_i, v_i) | v_i ∈ V}, where (v_i, v_j) ∈ E represents the existence of a link from node v_i to node v_j. A graph is said to be undirected if (v_i, v_j) ∈ E whenever (v_j, v_i) ∈ E, and is said to be directed otherwise. In the following we will consider only undirected graphs. A graph is connected if for each pair of nodes v_i, v_j there is a path over G that connects them. Let the neighborhood N_i of a node v_i be the set of nodes v_j that are connected to v_i via an edge (v_i, v_j) ∈ E. The degree d_i of a node v_i is the number of its neighbors, i.e., d_i = |N_i|. The adjacency matrix A of a graph G = {V, E} with n nodes is the n × n matrix such that A_ij = 1 if (v_j, v_i) ∈ E and A_ij = 0 otherwise. The Laplacian matrix associated to a graph G is the n × n matrix L(A) with the following structure:

$$L_{ij}(A) = \begin{cases} d_i, & \text{if } i = j \\ -1, & \text{if } i \neq j \text{ and } (v_i, v_j) \in E \\ 0, & \text{otherwise.} \end{cases}$$
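The adjacency and Laplacian matrices defined above translate directly into code; a minimal sketch in Python, where the edge list is an arbitrary illustration (not taken from the paper):

```python
import numpy as np

def laplacian(n, edges):
    """Build the Laplacian L(A) of an undirected graph from its edge list.

    Off-diagonal entries are -1 for connected pairs; each diagonal
    entry is the degree d_i of the corresponding node.
    """
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = 1.0
        A[j, i] = 1.0  # undirected: store both orientations
    return np.diag(A.sum(axis=1)) - A

# Example: a path graph v0 - v1 - v2 - v3 (hypothetical edge list)
L = laplacian(4, [(0, 1), (1, 2), (2, 3)])
print(L)
```

By construction each row of L(A) sums to zero, reflecting that the all-ones vector lies in the null space of the Laplacian of any graph, a fact exploited later when the pseudoinverse is used.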

2.2. Kendall's Correlation

Given two pairs of values (a_i, b_i) and (a_j, b_j), we say they are concordant if both a_i > a_j and b_i > b_j, or if both a_i < a_j and b_i < b_j; similarly, the pairs are discordant if a_i > a_j and b_i < b_j, or if a_i < a_j and b_i > b_j. If a_i = a_j or b_i = b_j the pairs are neither concordant nor discordant. Given two vectors a ∈ R^n and b ∈ R^n, the Kendall correlation [32] τ is defined as

$$\tau = \frac{C - P}{n(n-1)/2},$$

where C and P are the numbers of concordant and discordant pairs (a_i, b_i) and (a_j, b_j), respectively. When b is a permutation of the components of a, the Kendall correlation τ can be interpreted as a measure of the degree of shuffling of b with respect to a, ranging between minus one and one. In this sense τ = 1 implies a = b, while τ = −1 represents the fact that b is in reverse order with respect to a. The closer τ is to (minus) one, therefore, the more the two rankings are (anti-)correlated, while the closer τ is to zero, the more the two rankings are independent.

2.3. Analytic Hierarchy Process

The Analytic Hierarchy Process (AHP), introduced by Thomas Saaty [26], is an effective tool for dealing with complex decision making, and may aid the decision maker in setting priorities and making the best decision. By reducing complex decisions to a series of pairwise comparisons, and then synthesizing the results, the AHP helps to capture both subjective and objective aspects of a decision.

Consider a set of n alternatives, and suppose that each alternative i is associated to an unknown positive value w_i > 0, which represents its utility or value. Within AHP, we aim at finding the unknown values w_i based on an estimation of the ratios w_i/w_j between each pair of alternatives, collected in the n × n matrix W. Such a setting is typical in contexts involving human decision-makers, who are usually more comfortable providing relative comparisons among the utilities of the different alternatives (e.g., "Alternative X is twice better than alternative Y"), rather than directly assessing the value of each alternative (i.e., "The value of alternative X is α"). Briefly, AHP is a way to compute the weights w_i given the ratio estimates collected in W. In particular, the approach proposed by Saaty relies on the fact that, in the ideal case when W_ij is exactly equal to the ratio w_i/w_j, the dominant eigenvector of W is exactly the vector w = [w_1, …, w_n]^T, up to a scaling factor. However, real-world data is typically affected by inconsistencies; for instance, alternative A is twice better than B and B is twice better than C, but A is three times better than C, i.e., the preferences are not transitive. In this case, there is no vector w such that W_ij = w_i/w_j, and we need to resort to approximations or compromises.
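In the ideal, perfectly consistent case, Saaty's eigenvector construction can be sketched as follows; the 3 × 3 matrix below is a made-up consistent example (true utilities w = (2, 1, 0.5), not data from the paper):

```python
import numpy as np

# Consistent pairwise-comparison matrix: W[i, j] = w_i / w_j
w_true = np.array([2.0, 1.0, 0.5])
W = np.outer(w_true, 1.0 / w_true)

# The dominant eigenvector of W recovers w up to a scaling factor
vals, vecs = np.linalg.eig(W)
v = np.real(vecs[:, np.argmax(np.real(vals))])
w_est = v / v.sum()  # normalize so the estimated weights sum to one

print(w_est)  # proportional to (2, 1, 0.5)
```

With inconsistent comparisons the matrix is no longer rank one and the eigenvector is only a compromise, which is what motivates the least-squares formulation used in the next section.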

3. Multi-Criteria Criticality Detection Framework

As discussed in the introduction, classical approaches and metrics for node criticality/importance assessment in networks are typically focused on a single aspect of the system (e.g., the hubs, the nodes that belong to several minimum paths, the nodes that, if removed, disconnect the network, etc.). In this section, we develop a framework to combine several metrics into a holistic indicator of node criticality. To this end, we will resort to the Sparse Analytic Hierarchy Process (SAHP) methodology (and, specifically, to the Logarithmic Least Squares (LLS) approach) at two different levels: on one side, we will use the methodology to assess the importance of the different metrics with the help of SMEs; then, based on the importance of each metric, we will develop a multi-criteria decision making framework where the metrics are combined, again, using the SAHP approach.

In more detail, the proposed Multi-Criteria Vulnerability Detection (MCVD) method is conceptually composed of four stages (as synthesized in Figure 1): the metrics set definition stage, the metric relevance estimation stage, the values computation stage and the measures aggregation stage.

[Figure 1: Logical scheme of the steps that compose the proposed MCVD methodology.]

A brief explanation of each phase is as follows:

I) Metrics Set Definition: this stage aims at identifying the set of metrics of interest. In particular, given the set of n nodes of interest, we consider several topologies at once (e.g., structural graph, flow graph, interdependency graph, etc.) and m metrics M_1, …, M_m to be used in the MCVD process; we reiterate that each metric M_i provides a numerical value for the importance of each node in the set, based on a specific property evaluated considering one of the particular graphs (e.g., the degree in terms of number of connections in the structural graph, or the weighted degree in terms of total flow passing through the node, computed over the flow graph).

II) Metric Relevance Estimation: within this stage, we aim at ranking the relevance, for the specific infrastructure, of the different metrics identified in the previous step. In other terms, we aim at computing positive weights w_i to be associated to each metric M_i, where w_i quantifies the importance of the i-th metric, according to the preferences specified by an SME².

III) Values Computation: this step is devoted to actually computing the value associated to each node according to the different metrics. The output of this phase consists of m vectors (r^(1), …, r^(m)), each representing the value associated to each node according to a specific metric (i.e., r_j^(i) is the value of the j-th node according to the i-th metric).

IV) Measures Aggregation: notice that the numerical values associated to the same node according to the different metrics can be incoherent and quite heterogeneous, and thus hard to combine into a single value (e.g., combining a metric in [0, 1] and another metric ranging from 0 to 2^n would require normalizations that would defy the purpose of a simple weighted average). To overcome this problem, we consider the ratios r_j^(i)/r_k^(i) that model the relative importance of node j over node k according to the i-th metric. Based on such ratios, and on the importance w_i associated to the i-th metric, we compute the holistic utilities z_i for each node according to an extension of SAHP. The proposed extension is discussed later, in Section 3.2.

² As better illustrated later, the approach can be easily generalized to the case when multiple SMEs are involved in the process.

Having presented the four stages of the proposed approach, we conclude the section with an in-depth discussion of the Metric Relevance Estimation and Measures Aggregation stages.

3.1. Metric Relevance Estimation Stage

This stage aims at assessing the utility w_i of each metric, based on information provided by an SME; the main tool adopted is the SAHP methodology. Notice that standard AHP requires information on all pairs of alternatives; this poses a heavy burden on the SMEs, rendering the technique quite impractical when the number of alternatives is large. To overcome this issue, several techniques able to handle missing comparisons have been proposed in the literature [33, 24, 34], i.e., W is a sparse matrix with W_ij = 0 when no comparison is available for the pair i, j. In this view, an effective way to represent the available information is to assume a graph-theoretical perspective, where the alternatives play the role of nodes, while the availability of a nonzero entry W_ij corresponds to an edge between alternative i and alternative j. In order to reconstruct the utilities of the alternatives, it is sufficient that the graph G obtained as described above is connected (the graph is undirected as we assume that W_ji = W_ij^(-1)).


Among other approaches to solve this problem, one of the most effective ones (see, for instance, [34], where several approaches are compared against artificially generated perturbations) is the Logarithmic Least-Squares (LLS) approach, where one aims at finding the vector w* that solves the following optimization problem.

Problem 1. Find w* ∈ R^m_+ that solves

$$\mathbf{w}^* = \arg\min_{\mathbf{x}\in\mathbb{R}^m_+} \frac{1}{2}\sum_{i=1}^{m} \sum_{j\in N_i} \left(\ln(W_{ij}) - \ln\left(\frac{x_i}{x_j}\right)\right)^2. \tag{1}$$

An effective strategy to solve the above (constrained) problem is to operate the substitution y = ln(x), where ln(·) is the component-wise logarithm, so that Eq. (1) can be rearranged as

$$\mathbf{w}^* = \exp\left(\arg\min_{\mathbf{y}\in\mathbb{R}^m} \frac{1}{2}\sum_{i=1}^{m} \sum_{j\in N_i} \left(\ln(W_{ij}) - y_i + y_j\right)^2\right), \tag{2}$$

where exp(·) is the component-wise exponential. Let us define

$$\kappa(\mathbf{y}) = \frac{1}{2}\sum_{i=1}^{m} \sum_{j\in N_i} \left(\ln(W_{ij}) - y_i + y_j\right)^2;$$

because of the substitution y = ln(x), the problem becomes convex and unconstrained, and its global minimum is in the form w* = exp(y*), where y* satisfies ∂κ(y)/∂y_i |_{y=y*} = 0, i.e.,

$$\sum_{j\in N_i} \left(\ln(W_{ij}) - y_i^* + y_j^*\right) = 0, \quad \forall i = 1,\dots,m.$$

Let us consider the m × m matrix P such that P_ij = ln(W_ij) if W_ij > 0 and P_ij = 0 otherwise; we can express the above conditions in compact form as

$$L(A)\,\mathbf{y}^* = P\,\mathbf{1}_m, \tag{3}$$

where L(A) is the Laplacian matrix associated to the graph G, considering an adjacency matrix A with unitary weights, i.e., A_ij ∈ {0, 1}. Notice that, since by hypothesis G is undirected and connected, the Laplacian matrix L(A) has rank m − 1 [35]. Therefore, we approximate the solution by computing

$$\mathbf{y}^* = L(A)^{\dagger} P\,\mathbf{1}_m,$$

where L(A)† is the Moore-Penrose left pseudoinverse of L(A).

In the case multiple SMEs are involved, the proposed solution can be easily generalized, as illustrated in [36, 37], to aggregate the different information provided by each SME. We omit the details of such a procedure here, as this paper focuses on how to merge different metrics (rather than how to merge different opinions related to the same domain).
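The closed form above (pseudoinverse of the Laplacian applied to the row sums of P) is a few lines of code; a minimal sketch, using a small hypothetical sparse comparison matrix in which the pair (2, 3) has no comparison:

```python
import numpy as np

def lls_weights(W):
    """Logarithmic Least-Squares weights from a sparse pairwise matrix W.

    W[i, j] > 0 estimates w_i / w_j; W[i, j] == 0 means "no comparison".
    Solves L(A) y = P 1 via the Moore-Penrose pseudoinverse, then w = exp(y).
    The comparison graph is assumed connected.
    """
    A = (W > 0).astype(float)
    np.fill_diagonal(A, 0.0)
    L = np.diag(A.sum(axis=1)) - A          # Laplacian of the comparison graph
    P = np.zeros_like(W)
    P[W > 0] = np.log(W[W > 0])             # P_ij = ln(W_ij) where available
    y = np.linalg.pinv(L) @ P.sum(axis=1)   # minimum-norm solution of L y = P 1
    w = np.exp(y)
    return w / w.sum()                      # normalized for readability

# Hypothetical consistent data: w1 = 2 w2 = 4 w3, with one missing comparison
W = np.array([[1.0,  2.0, 4.0],
              [0.5,  1.0, 0.0],
              [0.25, 0.0, 1.0]])
print(lls_weights(W))  # ≈ [4/7, 2/7, 1/7]
```

Since the pseudoinverse returns the minimum-norm solution, the reconstructed y* sums to zero; any common additive shift of y* (i.e., common scaling of w*) leaves the ratios unchanged, so the final normalization is purely cosmetic.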

3.2. Measures Aggregation Stage

As discussed above, the aim of this stage is the aggregation of multiple metrics M_1, …, M_m, considering the importance weights w_1, …, w_m of each metric. Specifically, for each metric M_i we consider the vector r^(i) ∈ R^n, which collects the numerical values for the nodes according to the i-th metric (in this view, r_a^(i) represents the numerical value of node a according to the i-th metric). However, especially when some of the metrics are available only for a subset of the nodes (e.g., flows measured only at a small fraction of the nodes), the different values are not easily combined via a weighted average, also due to the different scales of each metric. To overcome this limitation, in this section we consider the ratios r_j^(i)/r_k^(i), which model the relative preference of node j over node k according to the i-th metric. By casting the metrics in terms of ratios, we adopt an SAHP approach; specifically, we extend the SAHP approach to the case of multiple information sources (i.e., the different metrics) and heterogeneous importance of each source (i.e., using the weights w_i).

In more detail, for each metric M_i, let ε_j^(i) = 1 if the metric is available at node j and ε_j^(i) = 0 otherwise. We convert the associated ranking vector r^(i) into an n × n matrix R^(i) such that the (a, b)-th entry R_ab^(i) is in the form

$$R^{(i)}_{ab} = \begin{cases} r_a^{(i)}/r_b^{(i)}, & \text{if } \varepsilon_a^{(i)} = 1 \text{ and } \varepsilon_b^{(i)} = 1 \\ 0, & \text{otherwise.} \end{cases}$$

In other words, R_ab^(i) models the relative utility or importance of the a-th node over the b-th one according to the i-th metric, provided that the i-th metric is defined/available at both nodes a and b.

Notice that there might be entries r_b^(i) = 0 for which the ratios would be undefined; in order to overcome this issue, we treat zero-valued entries as a lack of information, i.e., we assume ε_b^(i) = 0 whenever r_b^(i) = 0. In order to represent the available information according to each metric, we follow a graph-theoretical perspective. Specifically, for each metric M_i we consider a graph G^(i) = {V, E^(i)} over the set of nodes such that (v_a, v_b) ∉ E^(i) whenever R_ab^(i) = 0, and (v_a, v_b) ∈ E^(i) otherwise. In other words, G^(i) is a graph where nodes for which no information is available (or such that the associated value is zero) are isolated, while the remaining nodes belong to a complete sub-graph. Notice that, in the following, we assume that the graph G = {V, E} with

$$E = \bigcup_{i=1}^{m} E^{(i)}$$

is connected; in other words, the union of the edges describing the ratios according to the different metrics must guarantee that each pair of nodes can be compared, either directly (via a ratio) or via a path involving ratios. Notice that, in order to avoid considering excessively large and small ratios, it is possible to treat entries r_a^(i) ≪ 1 as zero, provided that the overall graph G remains connected. By considering the matrices R^(i), we aim at finding the aggregated ranking vector r* ∈ R^n_+ (i.e., r* has positive entries) that solves the following problem.

Problem 2. Find r* ∈ R^n_+ that solves

$$\mathbf{r}^* = \arg\min_{\mathbf{r}\in\mathbb{R}^n_+} f(\mathbf{r}) = \sum_{i=1}^{m} w_i \sum_{a=1}^{n} \sum_{b \,|\, R^{(i)}_{ab}\neq 0} \left(\ln(R^{(i)}_{ab}) - \ln(r_a) + \ln(r_b)\right)^2, \tag{4}$$

where the weights w_i are the results of Problem 1, solved in the second stage of the MCVD procedure. The above problem aims at finding the vector r* such that the logarithm of the ratio of its components is the least-squares compromise among the logarithms of the corresponding ratios R_ab^(i). In other words, Problem 2 aims at finding the relevance r_a, to be assigned to the a-th node, such that the ratios r_a*/r_b* minimize the deviation with respect to the ratios R_ab^(i) for the m considered metrics. In order to solve this problem, which is in general non-convex and may have non-unique solutions, we aim at finding a vector z* such that r* = exp(z*), where exp(·) is the component-wise exponential; in other words, we aim at solving the following unconstrained problem.

Problem 3. Find z* ∈ R^n that solves

$$\mathbf{z}^* = \arg\min_{\mathbf{z}\in\mathbb{R}^n} g(\mathbf{z}) = \frac{1}{2}\sum_{i=1}^{m} w_i \sum_{a=1}^{n} \sum_{b \,|\, R^{(i)}_{ab}\neq 0} \left(\ln(R^{(i)}_{ab}) - z_a + z_b\right)^2. \tag{5}$$

The above problem is easily solved in closed form. Specifically, being an unconstrained convex problem, the minimum is attained at z* such that, for all a ∈ {1, …, n}, it holds ∂g(z)/∂z_a |_{z=z*} = 0, i.e.,

$$-\sum_{i=1}^{m} w_i \sum_{b \,|\, R^{(i)}_{ab}\neq 0} \left(\ln(R^{(i)}_{ab}) - z_a + z_b\right) + \sum_{i=1}^{m} w_i \sum_{b \,|\, R^{(i)}_{ba}\neq 0} \left(\ln(R^{(i)}_{ba}) - z_b + z_a\right) = 0,$$

which, since R_ab^(i) = (R_ba^(i))^(-1), is equivalent to writing

$$-2\sum_{i=1}^{m} w_i \sum_{b \,|\, R^{(i)}_{ab}\neq 0} \left(\ln(R^{(i)}_{ab}) - z_a + z_b\right) = 0,$$

i.e.,

$$\sum_{i=1}^{m} w_i \sum_{b \,|\, R^{(i)}_{ab}\neq 0} (z_a - z_b) = \sum_{i=1}^{m} w_i \sum_{b \,|\, R^{(i)}_{ab}\neq 0} \ln(R^{(i)}_{ab}).$$

By stacking the above equation for all a, since by construction the term Σ_{b | R_ab^(i) ≠ 0} (z_a − z_b) is equal to the a-th component of the vector L(A^(i))z, where L(A^(i)) is the Laplacian matrix defined with respect to the adjacency matrix A^(i) of the graph G^(i), we have that the optimal z* satisfies

$$\sum_{i=1}^{m} w_i L(A^{(i)})\,\mathbf{z}^* = \sum_{i=1}^{m} w_i \ln(R^{(i)})\,\mathbf{1}_n,$$

where ln(R^(i)) is the n × n matrix collecting the logarithm of the corresponding entries of R^(i) if the entry is positive, and zero otherwise. Note that the matrices L(A^(i)) are singular by construction [35]; hence, in order to find z*, we resort to the approximation

$$\mathbf{z}^* = \left(\sum_{i=1}^{m} w_i L(A^{(i)})\right)^{\dagger} \sum_{i=1}^{m} w_i \ln(R^{(i)})\,\mathbf{1}_n,$$

where (·)† denotes the Moore-Penrose left pseudoinverse.

3.3. Illustrative Example

In order to illustrate the effectiveness of the proposed approach, we consider a scenario where n = 8 nodes have to be ranked according to m = 4 metrics, whose values are reported in Table 1; notice that, in this example, we consider metrics that are not available for each node.

Table 1: Values for each of the eight nodes with respect to the four metrics considered in the example.

id   M1    M2    M3    M4
1    1.0   0.6   -     -
2    0.4   -     -     0.1
3    0.1   -     -     -
4    -     -     0.6   0.1
5    -     -     -     0.5
6    1.0   0.4   -     0.2
7    -     0.6   0.2   -
8    -     0.2   0.4   -

During the Metric Relevance Estimation Stage, we calculate the weights that encode the importance of each metric, considering the ratio matrix

$$W = \begin{bmatrix} 1 & 1 & 0 & 1/2 \\ 1 & 1 & 7 & 0 \\ 0 & 1/7 & 1 & 0 \\ 2 & 0 & 0 & 1 \end{bmatrix},$$

which collects the relative importance of a subset of the pairs of metrics according to an SME. Based on such information, and by solving Problem 1, we obtain the weights

$$\mathbf{w}^* = [0.241,\ 0.241,\ 0.034,\ 0.482]^T.$$

At this point, during the Measures Aggregation Stage, we solve Problem 3 with respect to the weights w*, thus obtaining

$$\mathbf{r}^* = [0.163,\ 0.065,\ 0.015,\ 0.070,\ 0.337,\ 0.134,\ 0.154,\ 0.061]^T.$$

[Figure 2: Comparison of the holistic metric obtained according to the proposed approach (upper plot) against the result of a weighted average where lack of information is treated as zero (central plot) and the result of a weighted average where only available/nonzero information is considered for each node (lower plot). The result for each plot is normalized so that the sum over all nodes is equal to one.]
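The two stages of this example can be reproduced numerically. The sketch below is our own reconstruction of the closed forms of Section 3 (Problem 1 for the metric weights, Problem 3 for the aggregated ranking) applied to the ratio matrix W and the Table 1 data:

```python
import numpy as np

def lls(L, b):
    """Minimum-norm solution of the singular linear system L y = b."""
    return np.linalg.pinv(L) @ b

# Stage 2 (Problem 1): metric weights from the SME ratio matrix W
W = np.array([[1.0, 1.0, 0.0, 0.5],
              [1.0, 1.0, 7.0, 0.0],
              [0.0, 1/7, 1.0, 0.0],
              [2.0, 0.0, 0.0, 1.0]])
A = (W > 0).astype(float)
np.fill_diagonal(A, 0.0)
P = np.zeros_like(W)
P[W > 0] = np.log(W[W > 0])
w = np.exp(lls(np.diag(A.sum(axis=1)) - A, P.sum(axis=1)))
w /= w.sum()
print(np.round(w, 3))  # ≈ the paper's w* = [0.241, 0.241, 0.034, 0.482]

# Stage 4 (Problem 3): aggregated ranking from the Table 1 values
nan = np.nan
vals = np.array([[1.0, 0.4, 0.1, nan, nan, 1.0, nan, nan],   # M1
                 [0.6, nan, nan, nan, nan, 0.4, 0.6, 0.2],   # M2
                 [nan, nan, nan, 0.6, nan, nan, 0.2, 0.4],   # M3
                 [nan, 0.1, nan, 0.1, 0.5, 0.2, nan, nan]])  # M4
n = vals.shape[1]
Lw = np.zeros((n, n))
b = np.zeros(n)
for wi, row in zip(w, vals):
    known = np.flatnonzero(~np.isnan(row))
    for a in known:
        for c in known:
            if a != c:  # one term per available ratio R_ac = r_a / r_c
                Lw[a, a] += wi
                Lw[a, c] -= wi
                b[a] += wi * np.log(row[a] / row[c])
r = np.exp(lls(Lw, b))
r /= r.sum()
print(np.round(r, 3))
```

Up to rounding, the printed vectors should correspond to the w* and r* reported above; note that both stages reduce to a single pseudoinverse, consistently with the computational remarks in the introduction.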

Figure 2 provides a comparison of the proposed holistic metric r* (upper plot) against the result of a weighted average where lack of information is treated as zero (central plot) and the result of a weighted average where only available/nonzero information is considered for each node (lower plot). According to the figure, the results are completely different from both a numerical and an ordinal point of view. In particular, we observe that the fifth node is categorized as the most important according to the proposed approach, while it is the third most important according to the two weighted averages. Notice that r* is the optimal solution to Problem 2 and the corresponding objective function is equal to 0.2866; however, the objective function in the case of the weighted average where lack of information is treated as zero corresponds to a value of 4.4282 (i.e., about +1554%), and the weighted average over the available pieces of information results in an objective function equal to 1.6304 (i.e., about +568%). Interestingly, the Kendall correlation τ (see Section 2.2) between the proposed holistic metric and the weighted average (treating lack of information as zero) is moderate, being equal to 0.6429, while the Kendall correlation τ between r* and the weighted average (considering only available entries for each node) drops to −0.0714, i.e., there is a total lack of correlation. Overall, this example supports the conclusion that the proposed approach handles missing information in a way that is more satisfactory than a weighted average; in fact, in the proposed approach each piece of information has either a direct or an indirect (e.g., via paths) effect on the estimation process, while a weighted average focuses on the single metric.

assessment via MCVD approach we consider three complementary graph representations of the Central London Tube, over the same set of n = 50 nodes (i.e., the stations) • Structural Graph G1 : in the structural graph, the edges model the direct interconnections between stations. Notice that the edges are bidi-

260

rectional, and we represent this by a pair of directed edges in opposite directions for each connection. The graph features 178 edges and the edges have unitary associated weight. The graph is reported in Figure 4. Notice that we keep track of the subway line each edge belongs to in the actual tube map; this will turn useful later while we consider specific in16

265

dices that take into account the amount of different lines passing through each node. • Travel-time Graph G2 : the travel-time graph has the same structure as the structural graph, but the edges convey information on the (average) time required to move from a station to a neighboring one (see Figure 5

270

where edge thickness is proportional to the travel time). We model this via directed edges with associated weights that represent the travel time associated to each link. Notice that the graph is asymmetric in that, in general, the weight associated to the edge from i to j may be different from the weight of the edge from j to i).

275

• In/Out Flow Graph G3 : the In/Out Flow graph provides information on the (average) daily flow of passengers that enter as station i and leave at station j. Differently from G1 and G2 , the In/Out Flow graph is a full graph, where the weight associated to each edge models the flow. For representation reasons, we show G3 in Figure 6 via a tile plot, where the color

280

of the tile at the i-th row and j-th column reflects the flow of passengers entering at the i-th station and leaving at the j-th one; specifically, we resort to a black-white heatmap, where black tiles correspond to low flow and the white color is used for high flow tiles.
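As a concrete sketch, the three representations can be encoded as weighted edge dictionaries. The fragment below uses a hypothetical 3-station excerpt with illustrative weights (the station names are real, but the travel times and flows are made up for illustration):

```python
# Hypothetical 3-station fragment of the network; values are illustrative only.
stations = ["Oxford Circus", "Bond Street", "Green Park"]

# G1: structural graph, pairs of opposite directed edges with unit weight.
G1 = {("Oxford Circus", "Bond Street"): 1, ("Bond Street", "Oxford Circus"): 1,
      ("Oxford Circus", "Green Park"): 1, ("Green Park", "Oxford Circus"): 1}

# G2: travel-time graph, same structure but possibly asymmetric times (minutes).
G2 = {("Oxford Circus", "Bond Street"): 2.0, ("Bond Street", "Oxford Circus"): 2.5,
      ("Oxford Circus", "Green Park"): 3.0, ("Green Park", "Oxford Circus"): 3.0}

# G3: in/out flow graph, complete: daily passengers entering at i and leaving at j.
G3 = {(i, j): 1000.0 for i in stations for j in stations if i != j}

# Degree in G1 (incoming plus outgoing edges), as used by metric M1 below.
degree = {s: sum(1 for (i, j) in G1 if i == s or j == s) for s in stations}
print(degree["Oxford Circus"])  # 4: two bidirectional connections
```

The same node set underlies all three dictionaries, mirroring the fact that the three graphs of the case study share the 50 stations as nodes.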

4.1. Metrics Set Definition

With respect to the aforementioned graphs, we consider m = 12 different metrics M1, . . . , M12, including both traditional measures and indicators aimed at assessing the effect of attacks or disruptions on specific nodes. Specifically, we consider (see [38] and references therein for details):

• Node Degree (M1): the sum of the outgoing and incoming edges at each node, computed over the structural graph G1.

• Betweenness (M2): measures how often a node belongs to the shortest paths between any pair of nodes, computed over the travel-time graph G2.


Figure 3: Central London tube map.


Figure 4: Structural graph G1 .


Figure 5: Travel-time graph G2. Edge thickness is proportional to the travel time.

More precisely, for a specific node u, the betweenness is defined as

b_u = Σ_{s,t ≠ u} N_st^(u) / N_st,

where N_st^(u) is the number of shortest paths between nodes s and t passing through node u and N_st is the total number of shortest paths between nodes s and t. Notice that, G2 being a weighted graph, the shortest paths depend on the travel time.
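The definition above can be implemented directly by enumerating all simple paths, keeping the minimum-weight ones, and counting those passing through u. The brute-force sketch below runs on a toy weighted graph with illustrative travel times (practical implementations would use Brandes' algorithm instead):

```python
from itertools import permutations

# Toy directed weighted graph (travel times); illustrative values only.
W = {("A", "B"): 1.0, ("B", "A"): 1.0,
     ("B", "C"): 1.0, ("C", "B"): 1.0,
     ("A", "C"): 3.0, ("C", "A"): 3.0}
nodes = ["A", "B", "C"]

def all_paths(s, t, visited=None):
    """Enumerate all simple directed paths from s to t."""
    visited = visited or [s]
    if s == t:
        yield visited
        return
    for (i, j) in W:
        if i == s and j not in visited:
            yield from all_paths(j, t, visited + [j])

def betweenness(u):
    """b_u = sum over s,t != u of N_st^(u) / N_st (shortest weighted paths)."""
    total = 0.0
    for s, t in permutations([v for v in nodes if v != u], 2):
        paths = list(all_paths(s, t))
        cost = lambda p: sum(W[(p[k], p[k + 1])] for k in range(len(p) - 1))
        best = min(cost(p) for p in paths)
        shortest = [p for p in paths if cost(p) == best]
        through_u = [p for p in shortest if u in p]
        total += len(through_u) / len(shortest)
    return total

print(betweenness("B"))  # B lies on the only shortest A-C and C-A paths -> 2.0
```

Note that the direct A-C edge (cost 3) is longer than the route via B (cost 2), so B intercepts both shortest paths, exactly as the weighted definition prescribes.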

• Hubs & Authorities (M3 and M4, respectively): these metrics, computed over the structural graph G1, are defined together in a recursive way: the "hub score" of a node is the sum of the "authority scores" of its neighbors, and vice versa. Such values can be regarded as the left (hubs) and right (authorities) singular vectors corresponding to the largest singular value of the adjacency matrix of the graph [39].



Figure 6: In/Out Flow Graph G3. The tile at the i-th row and j-th column represents the average daily number of passengers who enter at station i and leave at station j, according to a grayscale colormap where large flows are shown in white and low flows in black.

• Closeness (M5): this metric is based on the inverse of the sum of the distances from a node to all other nodes in the graph. Specifically, the closeness is defined as c_u = A_u/(C_u(n − 1)), where A_u is the number of nodes reachable from node u (not counting u), n is the number of nodes in the graph, and C_u is the sum of the distances from node u to all reachable nodes (if the node is isolated then C_u = 0). We compute this measure with respect to the travel-time graph G2.

• Eigenvector Centrality (M6): this metric uses the eigenvector corresponding to the largest eigenvalue of the graph adjacency matrix. The scores are normalized so that the sum of all values is equal to 1. We compute this measure with respect to the structural graph G1.

• Entries (M7): the estimated number of passengers who daily start their trips at the station, computed using the In/Out flow graph G3.

• Exits (M8): the estimated number of passengers who daily end their trips at the station, computed using the In/Out flow graph G3.

• Lines (M9): the number of different railway lines that cross the station, computed over the structural graph G1 by counting, for each node, the number of different lines its incident incoming and outgoing edges belong to.

• Critical Index (M10): this metric, presented in [11], highlights the nodes that are critical to the connectivity of the network. It derives from the solutions in the Pareto front of a multi-objective optimization problem that adopts an attacker perspective in order to identify the nodes most important for connectivity. We compute this measure with respect to the structural graph G1.

• Node Vulnerability (M11): this metric considers the loss of efficiency due to the disruption of a node i. Let Λ be the network efficiency, defined as the average of the reciprocals of the lengths of the shortest paths in the graph. For a node i, the node vulnerability is defined as NV(i) = Λ(0) − Λ(i), where Λ(0) is the topological efficiency before the disruption and Λ(i) is the topological efficiency after the removal of node i from the graph [16]. We compute this measure with respect to the structural graph G1.

• Passenger Flow Influence (M12): as defined in [12], this metric represents the total amount of flow affected when node i is removed from the graph, considering the sum of the generated, attracted, and intercepted flows. We compute this measure with respect to the In/Out flow graph G3.
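As an illustration of the disruption-oriented indices, the node vulnerability NV(i) = Λ(0) − Λ(i) can be sketched on a toy 3-node path graph (hypothetical; the paper applies it to the full tube network):

```python
# Toy undirected path graph A - B - C, stored as directed edge pairs.
edges = {("A", "B"), ("B", "A"), ("B", "C"), ("C", "B")}
nodes = ["A", "B", "C"]

def shortest(s, t, nds):
    """Breadth-first shortest hop count from s to t within nds; None if unreachable."""
    frontier, dist, seen = [s], 0, {s}
    while frontier:
        if t in frontier:
            return dist
        frontier = [j for i in frontier for (a, j) in edges
                    if a == i and j in nds and j not in seen]
        seen |= set(frontier)
        dist += 1
    return None

def efficiency(nds):
    """Average of reciprocal shortest-path lengths over ordered pairs (Lambda)."""
    pairs = [(s, t) for s in nds for t in nds if s != t]
    total = 0.0
    for s, t in pairs:
        d = shortest(s, t, nds)
        if d:
            total += 1.0 / d
    return total / len(pairs)

lam0 = efficiency(nodes)  # efficiency Lambda(0) of the intact graph
nv = {u: lam0 - efficiency([v for v in nodes if v != u]) for u in nodes}
print(max(nv, key=nv.get))  # "B": removing the middle node disconnects A and C
```

Removing the cut node B drops the efficiency to zero, yielding the largest vulnerability, while removing a leaf barely harms (and here, after renormalization, even raises) the average efficiency of the surviving pairs.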

Notice that, in this case study, all terms r_a^(i) > 0. Notice further that metrics M1, M3, M4 and M6 are classical centrality measures based only on topological aspects of the system represented by the structural graph G1. Similarly, the metric M9, being the number of different lines incident to a node, essentially accounts for the topological structure, but treats the edges of the graph G1 differently according to the line they belong to. Conversely, metrics M2 and M5 are computed based on the travel-time graph G2. The metrics M7 and M8, instead, consider the daily passenger flows described by the graph in Figure 6. Finally, the last three metrics M10, M11 and M12 are respectively based on G1, G2, and G3 and associate to each node an importance based on the effects caused by its disconnection from the infrastructure.

4.2. Metric Relevance Estimation

Let us now discuss how to assign a numerical importance value wi to each of the 12 metrics considered in this case study. As discussed in the previous section, the proposed approach is based on interviews with an SME, using the

approach in Problem 1 (alternatively, multiple decision makers could be considered at once using the approach proposed in Problem 3). In more detail, in this case study a security manager with specific competence in railway infrastructure was given a brief description of the different indicators and was asked to provide his preferences on the set of metrics; such preferences were provided by comparing pairs of metrics and by specifying how many times metric i was preferred to metric j. Notice that, in order to avoid biases, the SME was not informed about the specific case study, since he only had to compare the relevance of the different metrics. Figure 7 shows the result of the preference assessment procedure; specifically, we represent the available information in terms of a graph³ where the metrics play the role of the nodes, while each available comparison is represented by an edge; for each edge, a numerical preference value is provided. Notably, the SME provided just 32 out of 132 possible comparisons; moreover, each alternative is compared with at most five other alternatives. In spite of such a lack of information, the LLS sparse AHP approach is able to provide an estimate of the importance wi of each metric. Such values are shown in Figure 7, where the color of the interior of the nodes (i.e., the metrics) is given by the blue-yellow heatmap reported in the figure (important metrics are shown in yellow, less important metrics in blue). Notice that, according to Figure 7, the procedure assigns the highest importance to metrics M7 and M8 (daily entries and exits for each station), and intermediate importance to metrics related to the number of lines, flow, and the effects of an attack on the nodes; conversely, more traditional topological measures of importance are assigned smaller importance.

Figure 7: Evaluation of the ratios among the metrics. The edge thickness is proportional to the relative evaluation among the metrics, while the result of the sparse AHP (the absolute evaluation) is represented by the node color.

³ The graph representation here is functional to calculating the weights, but it has no relation to the graphs G1, G2 and G3 described above.

4.3. Values Computation

Figure 8 shows the importance of the nodes according to each of the metrics selected above (for simplicity, we show the structural topology G1, where the interior of the nodes is colored according to a blue-yellow heatmap). According to the figure, it is evident that each metric captures a different aspect of the importance of the nodes. For the sake of completeness, in Table 2 we show the Kendall correlation index among the ordinal rankings obtained according to the 12 metrics.

According to Table 2, high correlation coefficients among the metrics M3, M4, and M6 (Hubs, Authorities, Eigenvector Centrality) can be noted. The evaluations based on the first two metrics M1 and M2 (Node Degree and Betweenness) also provide two rankings characterized by a moderately high correlation (τ = 0.47). A similar correlation is observed between M9 and M12 (Lines and Passenger Flow Influence). However, the vast majority of the entries in Table 2 are close to zero, emphasizing that different metrics measure different aspects of node importance.

The data in Table 3, where the five most important stations according to each metric are shown, support the conclusion that most of the considered metrics measure heterogeneous aspects of node importance. Even though the results differ, it can be noted that Green Park station is listed among the most important stations according to all metrics but M7, M8, and M9 (Entries, Exits, and Railway Lines); similarly, Oxford Circus is listed as very important according to all metrics but M7 and M9 (Entries and Railway Lines).
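The weight-estimation step of Section 4.2 relies on a logarithmic least squares (LLS) formulation over the sparse graph of pairwise comparisons. A minimal sketch on three hypothetical metrics (toy comparison values, not the SME's; the normal equations reduce to a Laplacian system on the comparison graph, which must be connected) is:

```python
import math

n = 3
# a[(i, j)] ~ w_i / w_j; only some pairs are available (sparse comparisons).
a = {(0, 1): 2.0, (1, 2): 3.0}   # no direct comparison between 0 and 2

# Minimize sum over available (i,j) of (ln a_ij - x_i + x_j)^2, with x_i = ln w_i.
# Normal equations: L x = b, where L is the comparison-graph Laplacian.
L = [[0.0] * n for _ in range(n)]
b = [0.0] * n
for (i, j), v in a.items():
    L[i][i] += 1.0; L[j][j] += 1.0
    L[i][j] -= 1.0; L[j][i] -= 1.0
    b[i] += math.log(v); b[j] -= math.log(v)

# L is singular (weights are defined up to scale): fix x_0 = 0 and solve the
# remaining (n-1)x(n-1) system by Gauss-Jordan elimination.
M = [row[1:] + [b[k]] for k, row in enumerate(L)][1:]
for c in range(n - 1):
    p = max(range(c, n - 1), key=lambda r: abs(M[r][c]))
    M[c], M[p] = M[p], M[c]
    for r in range(n - 1):
        if r != c:
            f = M[r][c] / M[c][c]
            M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
x = [0.0] + [M[k][-1] / M[k][k] for k in range(n - 1)]

w = [math.exp(v) for v in x]
s = sum(w)
w = [v / s for v in w]
print([round(v, 3) for v in w])  # consistent toy data: w = (6, 3, 1)/10
```

Because the toy comparisons are mutually consistent, the LLS solution reproduces them exactly; with noisy or inconsistent SME judgments it returns the least-squares compromise instead.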

Figure 8: Node relevance evaluation according to the 12 metrics. The evaluations are normalized in the range [0, 1].

Moreover, notice that, consistently with the results in Table 2, the perfect correlation (τ = 1) among the metrics M3, M4, and M6 (Hubs, Authorities, Eigenvector Centrality) is confirmed by the rankings in Table 3 (i.e., for these metrics the five most important stations coincide).

4.4. Measures Aggregation

The last stage of the MCVD process consists in the aggregation of the multiple evaluations, taking into account the relative importance of the different metrics via the weight coefficients w1, . . . , w12 (see Figure 7) by solving Problem 3; the result of the procedure is represented in Figure 9. According to the figure, the proposed aggregated measure appears as a tradeoff between the different metrics; this is confirmed by Table 4, where we report the Kendall correlation

Table 2: Kendall rank correlation coefficients among the node importance rankings evaluated via M1 . . . M12.

      M2     M3      M4      M5      M6      M7      M8      M9      M10     M11     M12
M1    0.47   -0.022  -0.022  0.24    -0.022  -0.022  0.2     -0.11   0.11    0.16    0.24
M2           0.24    0.24    0.067   0.24    0.067   -0.16   0.067   0.022   -0.11   0.33
M3                   1.0     -0.33   1.0     0.022   0.24    0.022   -0.29   0.29    0.2
M4                           -0.33   1.0     0.022   0.24    0.022   -0.29   0.29    0.2
M5                                   -0.33   0.022   0.16    -0.067  -0.022  -0.24   -0.067
M6                                           0.022   0.24    0.022   -0.29   0.29    0.2
M7                                                   -0.29   0.2     -0.11   0.022   -0.16
M8                                                           -0.47   -0.16   0.33    -0.11
M9                                                                   -0.11   0.022   0.47
M10                                                                          -0.47   -0.29
M11                                                                                  0.11

index among the ranking obtained according to the proposed aggregated metric and the rankings obtained according to M1, . . . , M12. It can be noted that the maximum correlation obtained over the whole set of nodes is about 0.1 and is associated with the comparison between the aggregated metric and M9 (Railway Lines) and M10 (Critical Index). Overall, the above results suggest that the proposed index, by aggregating different metrics, assigns a criticality to the nodes that cannot be exhaustively explained by any of the original metrics. In fact, by looking at Tables 3 and 6, it can be noted that the most influential nodes according to the proposed aggregated metric are indeed the union of the most influential nodes according to the different metrics. Notice that, according to the rankings in Table 3, the three most important stations of each metric appear in the first 10 positions of the aggregated ranking in Table 6; the only exception is Green Park station, which is in the 30th position of the aggregated ranking. Note that this station does not appear in the first five positions of the rankings computed according to the metrics M7 and M8 (Entries and Exits), which are associated with the highest weights at the end of the second stage (w7 = w8 = 0.22), as shown in Figure 7. With reference to the rankings in Table 3 associated with these two metrics, notice that all five most important stations appear in the first ten positions of the aggregated ranking; the only exception is London Bridge station, which does not appear in the first positions (48th in the aggregated ranking).

Figure 9: Node importance according to the aggregated evaluation.
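All of the rank comparisons above (Tables 2, 4, and 5) rely on Kendall's τ [32], which can be computed directly from concordant and discordant pairs. A self-contained sketch on hypothetical score vectors (no ties assumed):

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall rank correlation between two score vectors (no ties assumed)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / n_pairs

# Hypothetical node scores under two metrics (illustrative values only).
metric_a = [0.9, 0.5, 0.3, 0.1]
metric_b = [0.8, 0.6, 0.2, 0.4]   # the last two nodes swap order
print(kendall_tau(metric_a, metric_a))  # 1.0: identical rankings
print(kendall_tau(metric_a, metric_b))  # one discordant pair out of six
```

Rankings with ties, as produced by low-descriptiveness metrics such as the node degree, require a tie-aware variant (e.g., τ-b); the sketch above implements the basic τ of [32].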

5. Metric Incidence in the Aggregation Process

In this section we analyze how the ranking computed with respect to a specific metric Mi correlates with the aggregated ranking, depending on the weight wi. To this end, we perform a simulation campaign, considering 100 random networks composed of 30 nodes. For the sake of simplicity, we analyze just a subset of metrics, and specifically: node degree, betweenness, hubs, closeness, and eigenvector centrality.

The formulation of Problem 2 within the MCVD framework suggests that the correlation between the ranking obtained according to Mi and the aggregated index is proportional to the weight wi. With the aim of investigating this relation, in Figure 10 we report the Kendall correlation coefficient between the rankings computed according to the five selected metrics and the aggregated ranking computed via the MCVD procedure, for different choices of the weights. More precisely, for each metric Mi we consider increasingly high values of the weight wi ∈ (0, 1] while the other weights are reduced accordingly; specifically, we set wj = (1 − wi)/4 for all j ≠ i (so that w1 + . . . + w5 = 1). The results in Figure 10 suggest that, for each metric, the Kendall correlation with the aggregated index increases as the importance of the metric grows; however, we observe that each metric exhibits a different behavior when wi > 0.5. For some metrics, the proposed approach, despite high associated weights, produces an aggregated ranking characterized by a low correlation with the metric-specific ranking. This phenomenon is particularly evident for the node degree: in this case, the final value of the Kendall correlation coefficient τ is 0.312 for wi = 1 (while one would expect τ to be close to one). The only ranking that matches the aggregated one exactly (τ = 1) is obtained by considering the nodes in terms of the hubs centrality measure.

In order to further investigate this phenomenon, we consider an index νi that takes into account the descriptiveness of each metric, i.e.,

νi = |r(i)|_U / n,

where |r(i)|_U denotes the number of unique values assumed by the entries of the vector r(i), while n is the number of nodes in the network. In this way, νi is close to one if the i-th metric associates many different importance values to the nodes in the graph, while νi is close to 0 for metrics with a smaller number of different values. In Figure 11, for each metric, we report the average Kendall correlation coefficient attained for wi = 1 and the corresponding index νi. The results in Figure 11 suggest that a poorly descriptive ranking (i.e., small values of νi) cannot dominate the aggregated ranking, due to its inability to discriminate node importance with several different values.
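The weight-sweep protocol above can be sketched as follows; a plain weighted sum is used here as a stand-in aggregator (an assumption for illustration; the paper aggregates by solving Problem 2 instead), and the descriptiveness index ν is computed as defined above.

```python
def sweep_weights(i, steps=5, m=5):
    """Yield weight vectors with w_i in (0, 1] and w_j = (1 - w_i)/(m - 1)."""
    for k in range(1, steps + 1):
        wi = k / steps
        yield [wi if j == i else (1 - wi) / (m - 1) for j in range(m)]

def nu(scores, n):
    """Descriptiveness: fraction of unique values assumed by a metric."""
    return len(set(scores)) / n

# Hypothetical per-node scores of 5 metrics on a 4-node toy network.
scores = [[4, 3, 2, 1],          # degree-like: integer scores
          [0.9, 0.1, 0.5, 0.2],
          [0.3, 0.8, 0.1, 0.6],
          [0.2, 0.2, 0.2, 0.2],  # totally non-descriptive metric
          [0.7, 0.4, 0.9, 0.1]]

for w in sweep_weights(i=1):
    # Stand-in aggregation: weighted sum of the metric scores per node.
    agg = [sum(w[m] * scores[m][v] for m in range(5)) for v in range(4)]
# After the last step w = [0, 1, 0, 0, 0]: the aggregation equals metric 1 itself.
print(agg == scores[1])
print(nu(scores[3], 4))  # 0.25: a near-constant metric cannot shape the ranking
```

Under the linear stand-in, wi = 1 reproduces the metric exactly; the paper's finding is precisely that the MCVD aggregation does not behave this way for low-ν metrics such as the node degree.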


Figure 10: Metric influence for increasing value of wi .

Figure 11: Average Kendall correlation attained for wi = 1 and descriptiveness index νi for each metric.

6. Conclusions

In this paper we develop a novel approach to combine different metrics, possibly available only for a subset of the nodes, that characterize the importance of the nodes (possibly based on different topologies) into a holistic indicator of criticality. This is done by converting the metrics into ratio matrices and by implementing a multi-criteria framework based on the Logarithmic Least Squares Sparse Analytic Hierarchy Process. Moreover, Subject Matter Experts are involved in the process in order to quantify the importance of each metric. The proposed holistic indicator can be the basis for the implementation of tools supporting CI stakeholders and practitioners in prioritizing security investments. In order to show the potential of the proposed approach, we provide an illustrative example where we compare our approach with different weighted averaging schemes, and we consider a case study involving the Central London Tube network. The illustrative example shows that the proposed approach is particularly effective in the case of partial or incomplete data, being remarkably different in terms of objective function and rank correlation from weighted averages. Moreover, according to the case study, the proposed methodology is able to synthesize an aggregated measure of node importance that represents a good trade-off among the different metrics and that assigns large relevance to the most influential nodes according to the single indices; yet, the resulting criticality cannot be exhaustively explained by any of the original metrics, suggesting that the proposed approach indeed provides new information. Overall, the proposed methodology allows the decision maker to prioritize the protection of selected nodes, based on several criteria at once and considering his/her subjective preferences among the criteria. Notice that, although the proposed case study focuses on a single infrastructure, the application of the framework to more complex scenarios involving several interconnected infrastructures is straightforward, provided that suitable information is available to compute several importance metrics for the infrastructures' composing elements (e.g., topologies and/or flows among the elements of the single infrastructures and connections with other infrastructures); in fact, the weight computation based on information provided by an SME is substantially independent from the actual calculation of the metrics. However, it would be highly beneficial to consider multiple SMEs at the same time, possibly with different backgrounds and points of view. In this view, future work will mainly be aimed at assessing the effect of the involvement of multiple SMEs in the decision process, with the aim of verifying whether a consensus among the decision-makers is reached or whether groups with different opinions emerge. Moreover, we will extend the framework to also consider edge importance.


References

[1] C. W. Anderson, J. R. Santos, Y. Y. Haimes, A risk-based input–output methodology for measuring the effects of the August 2003 northeast blackout, Economic Systems Research 19 (2) (2007) 183–204.

[2] O. P. Popova, P. Jenniskens, V. Emel'yanenko, A. Kartashova, E. Biryukov, S. Khaibrakhmanov, V. Shuvalov, Y. Rybnov, A. Dudorov, V. I. Grokhovsky, et al., Chelyabinsk airburst, damage assessment, meteorite recovery, and characterization, Science 342 (6162) (2013) 1069–1073.

[3] R. Setola, A. Sforza, V. Vittorini, C. Pragliola, Railway Infrastructure Security, Springer, 2015.

[4] G. Stergiopoulos, P. Kotzanikolaou, M. Theocharidou, D. Gritzalis, Risk mitigation strategies for critical infrastructures based on graph centrality analysis, International Journal of Critical Infrastructure Protection 10 (2015) 34–44.

[5] V. Rosato, L. Issacharoff, F. Tiriticco, S. Meloni, S. Porcellinis, R. Setola, Modelling interdependent infrastructures using interacting dynamical models, International Journal of Critical Infrastructures 4 (1-2) (2008) 63–79.

[6] M. Theoharidou, P. Kotzanikolaou, D. Gritzalis, A multi-layer criticality assessment methodology based on interdependencies, Computers & Security 29 (6) (2010) 643–658.

[7] G. Stergiopoulos, P. Kotzanikolaou, M. Theocharidou, G. Lykou, D. Gritzalis, Time-based critical infrastructure dependency analysis for large-scale and cross-sectoral failures, International Journal of Critical Infrastructure Protection 12 (2016) 46–60.

[8] H. Seppänen, P. Luokkala, Z. Zhang, P. Torkki, K. Virrantaus, Critical infrastructure vulnerability: a method for identifying the infrastructure service failure interdependencies, International Journal of Critical Infrastructure Protection 22 (2018) 25–38.

[9] K. Liu, M. Wang, W. Zhu, J. Wu, X. Yan, Vulnerability analysis of an urban gas pipeline network considering pipeline-road dependency, International Journal of Critical Infrastructure Protection 23 (2018) 79–89.

[10] R. Setola, How to measure the degree of interdependencies among critical infrastructures, International Journal of System of Systems Engineering 2 (1) (2010) 38–59.

[11] L. Faramondi, G. Oliva, S. Panzieri, F. Pascucci, M. Schlueter, M. Munetomo, R. Setola, Network structural vulnerability: a multiobjective attacker perspective, IEEE Transactions on Systems, Man, and Cybernetics: Systems (99) (2018) 1–14.

[12] X. Chen, Critical nodes identification in complex systems, Complex & Intelligent Systems 1 (1-4) (2015) 37–56.

[13] L. Lü, D. Chen, X.-L. Ren, Q.-M. Zhang, Y.-C. Zhang, T. Zhou, Vital nodes identification in complex networks, Physics Reports 650 (2016) 1–63.

[14] P. Chopade, M. Bikdash, New centrality measures for assessing smart grid vulnerabilities and predicting brownouts and blackouts, International Journal of Critical Infrastructure Protection 12 (2016) 29–45.

[15] D. F. Rueda, E. Calle, J. L. Marzo, Robustness comparison of 15 real telecommunication networks: structural and centrality measurements, Journal of Network and Systems Management 25 (2) (2017) 269–289.

[16] S. Starita, A. E. Amideo, M. P. Scaparra, Assessing urban rail transit systems vulnerability: metrics vs. interdiction models, in: International Conference on Critical Information Infrastructures Security, Springer, 2017, pp. 144–155.

[17] S. Starita, M. P. Scaparra, Passenger railway network protection: a model with variable post-disruption demand service, Journal of the Operational Research Society 69 (4) (2018) 603–618.

[18] F. Wang, X.-z. Zheng, N. Li, X. Shen, Systemic vulnerability assessment of urban water distribution networks considering failure scenario uncertainty, International Journal of Critical Infrastructure Protection 26 (2019) 100299.

[19] C. Ciaponi, E. Creaco, A. Di Nardo, M. Di Natale, C. Giudicianni, D. Musmarra, G. F. Santonastaso, Reducing impacts of contamination in water distribution networks: a combined strategy based on network partitioning and installation of water quality sensors, Water 11 (6) (2019) 1315.

[20] A. Di Nardo, M. Di Natale, R. Gargano, C. Giudicianni, R. Greco, G. F. Santonastaso, Performance of partitioned water distribution networks under spatial-temporal variability of water demand, Environmental Modelling & Software 101 (2018) 128–136.

[21] C. Alcaraz, J. Lopez, A cyber-physical systems-based checkpoint model for structural controllability, IEEE Systems Journal 12 (4) (2017) 3543–3554.

[22] C. Alcaraz, E. E. Miciolino, S. Wolthusen, Multi-round attacks on structural controllability properties for non-complete random graphs, in: Information Security, Springer, 2015, pp. 140–151.

[23] G. Crawford, The geometric mean procedure for estimating the scale of a judgement matrix, Mathematical Modelling 9 (3-5) (1987) 327–334.

[24] G. Oliva, R. Setola, A. Scala, Sparse and distributed analytic hierarchy process, Automatica 85 (2017) 211–220.

[25] S. Bozóki, V. Tsyganok, The (logarithmic) least squares optimality of the arithmetic (geometric) mean of weight vectors calculated from all spanning trees for incomplete additive (multiplicative) pairwise comparison matrices, International Journal of General Systems 48 (4) (2019) 362–381.

[26] T. L. Saaty, A scaling method for priorities in hierarchical structures, Journal of Mathematical Psychology 15 (3) (1977) 234–281.

[27] G. Oliva, A. E. Amideo, S. Starita, R. Setola, M. P. Scaparra, Aggregating centrality rankings: a novel approach to detect critical infrastructure vulnerabilities, in: Critical Information Infrastructures Security (S. Nadjm-Tehrani, Ed.), Springer, 2020.

[28] R. Fagin, R. Kumar, M. Mahdian, D. Sivakumar, E. Vee, Comparing and aggregating rankings with ties, in: Proceedings of the Twenty-Third ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, ACM, 2004, pp. 47–58.

[29] E. Dopazo, M. L. Martínez-Céspedes, Rank aggregation methods dealing with ordinal uncertain preferences, Expert Systems with Applications 78 (2017) 103–109.

[30] J. A. Aledo, J. A. Gámez, A. Rosete, Utopia in the solution of the bucket order problem, Decision Support Systems 97 (2017) 69–80.

[31] G. Lindmark, C. Altafini, Combining centrality measures for control energy reduction in network controllability problems, in: 2019 18th European Control Conference (ECC), IEEE, 2019, pp. 1518–1523.

[32] M. G. Kendall, A new measure of rank correlation, Biometrika 30 (1/2) (1938) 81–93.

[33] S. Bozóki, J. Fülöp, L. Rónyai, On optimal completion of incomplete pairwise comparison matrices, Mathematical and Computer Modelling 52 (1-2) (2010) 318–333.

[34] M. Menci, G. Oliva, M. Papi, R. Setola, A. Scala, On optimal completion of incomplete pairwise comparison matrices, in: Proceedings of the 2018 European Control Conference, 2018.

[35] C. Godsil, G. Royle, Algebraic Graph Theory, Graduate Texts in Mathematics, Springer, New York.

[36] G. Oliva, A. Scala, R. Setola, P. Dell'Olmo, Opinion-based optimal group formation, Omega 89 (2019) 164–176.

[37] G. Oliva, R. Setola, A. Scala, P. Dell'Olmo, Sparse analytic hierarchy process: an experimental analysis, Soft Computing 23 (9) (2019) 2887–2898.

[38] A. Borodin, G. O. Roberts, J. S. Rosenthal, P. Tsaparas, Link analysis ranking: algorithms, theory, and experiments, ACM Transactions on Internet Technology (TOIT) 5 (1) (2005) 231–297.

[39] M. Benzi, E. Estrada, C. Klymko, Ranking hubs and authorities using matrix functions, Linear Algebra and its Applications 438 (5) (2013) 2447–2474.

Table 3: Five most important stations of the Central London Tube according to each metric.

Node Degree (M1): Oxford Circus, Green Park, Bond Street, Baker Street, Bank Monument
Betweenness (M2): Green Park, Oxford Circus, St Pancras, Holborn, Tottenham Court Road
Hubs (M3): Oxford Circus, Green Park, Piccadilly Circus, Bond Street, Tottenham Court Road
Authorities (M4): Oxford Circus, Green Park, Piccadilly Circus, Bond Street, Tottenham Court Road
Closeness (M5): Oxford Circus, Tottenham Court Road, Bond Street, Green Park, Leicester Square
Eigenvector Centrality (M6): Oxford Circus, Green Park, Piccadilly Circus, Bond Street, Tottenham Court Road
Entries (M7): Victoria, Waterloo, Bank Monument, St Pancras, London Bridge
Exits (M8): Victoria, Waterloo, Bank Monument, St Pancras, Oxford Circus
Railway Lines (M9): Bank Monument, Baker Street, Embankment, Moorgate, Paddington
Critical Index (M10): Bank Monument, St Pancras, Embankment, Oxford Circus, Green Park
Node Vulnerability (M11): Oxford Circus, Bond Street, Green Park, Baker Street, Holborn
Passenger Flow Influence (M12): Bank Monument, Oxford Circus, St Pancras, Green Park, Waterloo

Table 4: Kendall rank correlation between the aggregated ranking and the rankings computed according to M1 . . . M12, considering all the nodes in the network.

            M1      M2     M3      M4     M5      M6      M7     M8      M9     M10    M11     M12
Aggregated  -0.023  0.074  -0.005  0.005  -0.159  -0.002  0.051  -0.058  0.108  0.090  -0.033  0.026

Table 5: Kendall rank correlation between the aggregated ranking and the rankings computed according to M1 . . . M12, considering the 10 most important nodes.

            M1      M2      M3      M4      M5      M6      M7     M8      M9     M10    M11     M12
Aggregated  -0.022  -0.022  -0.244  -0.244  -0.066  -0.244  0.111  -0.289  0.289  0.156  -0.067  0.111

Table 6: Ten most important stations of the Central London Tube according to the aggregated ranking.

1. Oxford Circus
2. Bank Monument
3. St Pancras
4. Victoria
5. Waterloo
6. Bond Street
7. Baker Street
8. Tottenham Court Road
9. Piccadilly Circus
10. Embankment