On the centrality in a directed graph


SOCIAL SCIENCE RESEARCH, 2, 371-378 (1973)

U. J. NIEMINEN

Finnish Academy, Helsinki, Finland

The concept of structural centrality in a weakly connected digraph is considered. Some requirements for the pointcentrality and for the centrality index are proposed and a construction satisfying the requirements is given. Several examples are calculated.

1. INTRODUCTION AND BASIC CONCEPTS

Much work has been done to construct a two-stage index measuring the degree of centralization of a graph. Regrettably, two important questions remain open: what does the concept of centrality mean in a structure described by a graph, and what kinds of phenomena should the constructed pointcentralities and centrality indices measure? First, we construct a pointcentrality measure that satisfies a set of general centrality requirements and correlates with the well-known Bavelas pointcentrality. Then a centrality index is defined that measures the same phenomenon as the Bavelas index does according to Flament (1963). In Section 3, we consider a way to define the concept of centrality in a graph and construct an index based on the defined ideas. The considerations here concentrate on the topological properties of a graph, in contrast to the ideas of Mackenzie (1966), which concern the rate of information flow in each channel of the graph.

Besides Flament (1963), Sabidussi (1966) has criticized the Bavelas index and its improvement, called the Beauchamp index (Beauchamp, 1965). Further, Sabidussi has constructed an axiom system to test to what extent the known indices satisfy the requirements of the system. The system is applicable to undirected graphs, and since we shall consider directed graphs, which need not even be strongly connected, we must be content with much less exact notions illustrating the concept of centrality in a communication graph.

The basic concepts used in this paper are those defined in the book of Harary et al. (1965), and we shall first recall the most important of them. A directed graph (DG) is a pair (P(DG), L(DG)) of sets, where P(DG) = {x, y, z, ...} is the set of points of DG and L(DG) that of lines. L(DG) consists of the ordered pairs (x,y) of points in DG so that $(x,y) \in L(DG)$ if there is a directed line from x to y in DG. A path is a sequence $x_0, x_1, x_2, \ldots, x_n$ of distinct points of DG so that $(x_{i-1}, x_i) \in L(DG)$ for every value of i, i = 1, ..., n, and a semipath is a sequence $x_0, x_1, \ldots, x_n$ so that $(x_{i-1}, x_i)$ or $(x_i, x_{i-1})$ is a line in DG for every i. A cycle of DG is a path where $x_0 = x_n$, and a semicycle is defined analogously to a semipath in DG. The length of a path (a cycle) $x_0, x_1, \ldots, x_n$ in DG is n, i.e., every line is of length one. d(x,y) is the distance between two points x and y in DG, i.e., the length of the shortest path from x to y in DG, if such a path exists. Note that d(x,y) need not equal d(y,x), that d(x,x) = 0, and that $d(x,y) = \infty$ if there is no path from x to y in DG.

A directed graph, briefly digraph, DG is weakly connected if for any two points $x, y \in P(DG)$ there is a semipath in DG connecting x and y, and DG is strongly connected if for any two points x and y of DG there is a path from x to y and another from y to x in DG. DG has no loops and no multiple lines if $(x,x) \notin L(DG)$ for any point x of DG, and (x,y) is the only line of DG from x to y, respectively. A digraph is finite if its sets of points and lines are finite. In this paper we shall consider only structures and communication networks described by a finite, weakly connected digraph without loops and multiple lines. One says that a point y is reachable from a point x in DG if there is a path from x to y in DG. If A is any set, |A| will denote the number of elements in A. The outdegree of a point x in DG is the number of lines in DG from x to another point of DG; the indegree is defined analogously. A wheel $W_p$ of p points is a weakly connected digraph of a specific kind, where a point x has outdegree p - 1 and all the others have outdegree zero (see Fig. 1). According to intuition, $W_p$ is the most centralized weakly connected digraph having p points.

Copyright © 1973 by Seminar Press, Inc. All rights of reproduction in any form reserved.

2. A VARIETY FOR THE BAVELAS INDEX

2.1. On Pointcentrality

A pointcentrality c(x) is a function defined on the point set P(DG) of a digraph DG measuring the centrality of a point x in DG. There seem to be two things which correspond to the centrality of a point x in a digraph DG: the total distance from x to other points and the ability to reach a large number of points. In the following we present some general requirements for a pointcentrality based on these two most used concepts. The requirements here are slight modifications of those proposed by Harary (see, e.g., Harary et al. (1965), p. 189) for the status of a point in an organization.

Fig. 1. The wheel of p points.

Consider a weakly connected digraph DG of p points. Call v an immediate subordinate of x if the line (x,v) is in DG, and in general, call v a subordinate of x if v is reachable from x. The sequence $(e_1, e_2, \ldots, e_{p-1})$, where $e_i$ is the number of points at distance i from x in DG, is the subordinate vector of x. It is useful to require a pointcentrality c(x) to have the following properties.

(1) c(x) is an integer.
(2) c(x) is 0 if and only if x has no subordinates.
(3) If the subordinate vector of y is obtained from that of x by adding one subordinate at any distance from y, the centrality of y is greater than that of x.
(4) If the subordinate vector of y is obtained from that of x by decreasing the distance of any subordinate, the centrality of y is greater than that of x.

The widely used pointcentrality $s(x) = \sum_{y \in RP(x)} d(x,y)$, where RP(x) is the set of all vertices reachable from x in the given digraph DG, does not satisfy the fourth condition, since decreasing the distance of a subordinate lowers s(x) instead of raising it. On the other hand, the ordinary Bavelas pointcentrality

$$b(x) = \Bigl[\sum_{x \in P(DG)} s(x)\Bigr] \Bigl[\sum_{y \in RP(x)} d(x,y)\Bigr]^{-1}$$

fails in the first, second, and third condition, where the second and third are the most serious.
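The notions used so far (distance, reachable set, subordinate vector, and the total distance s(x)) can be sketched in a few lines of Python. This is only an illustration: the function names, the edge-list representation, and the two small example digraphs are assumptions made here, not part of the paper.

```python
from collections import deque

def distances(lines, x):
    """Shortest directed path lengths d(x, v) for every v reachable from x (BFS)."""
    succ = {}
    for a, b in lines:
        succ.setdefault(a, []).append(b)
    dist = {x: 0}
    queue = deque([x])
    while queue:
        a = queue.popleft()
        for b in succ.get(a, []):
            if b not in dist:
                dist[b] = dist[a] + 1
                queue.append(b)
    return dist  # the keys form RP(x); x itself is included with d(x, x) = 0

def subordinate_vector(lines, points, x):
    """(e_1, ..., e_{p-1}), where e_i is the number of points at distance i from x."""
    dist = distances(lines, x)
    p = len(points)
    return tuple(sum(1 for d in dist.values() if d == i) for i in range(1, p))

def s(lines, x):
    """Total distance s(x) from x to the points reachable from x."""
    return sum(distances(lines, x).values())

# Hypothetical example data: the wheel W_5 with hub "x", and a directed path.
points = ["x", "y", "z", "u", "w"]
wheel = [("x", v) for v in points[1:]]
path = [("x", "y"), ("y", "z"), ("z", "u"), ("u", "w")]

print(subordinate_vector(wheel, points, "x"))  # (4, 0, 0, 0)
print(subordinate_vector(path, points, "x"))   # (1, 1, 1, 1)
print(s(wheel, "x"), s(path, "x"))             # 4 10
```

In both examples x has four subordinates, but in the wheel they are all closer to x, and s(x) drops from 10 to 4. This is the behavior that conflicts with requirement (4).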

2.2. A Pointcentrality Construction

In the following we construct a pointcentrality, denoted by h(x), which is based on both of the central concepts mentioned before: the total distance from x to other points and the ability to reach a large number of points. Let DG be a weakly connected digraph and let $x \in P(DG)$. It is worth noting that x always belongs to the set RP(x) in DG. We define the function h(x) on P(DG) as follows:

$$h(x) = \begin{cases} \sum_{v \in RP(x)} \bigl(|RP(x)| - d(x,v)\bigr), & \text{if } |RP(x)| \ge 2, \\ 0, & \text{if } RP(x) = \{x\}. \end{cases}$$

Consider whether the function h(x) satisfies the requirements (1)-(4) of Section 2.1. Clearly h(x) is an integer. Moreover, h(x) = 0 if and only if x has no subordinates in DG, since $|RP(x)| > d(x,v)$ for any $v \in RP(x)$ if $|RP(x)| \ge 2$.


Hence (1) and (2) are valid for h(x). Assume that the subordinate vector of a point y is obtained from that of x by adding one subordinate z at any distance from y in DG. Then $|RP(y)| = |RP(x)| + 1$, and thus $h(y) = h(x) + |RP(x)| + |RP(y)| - d(y,z)$, where $|RP(y)| > d(y,z)$. Hence h(x) satisfies (3). The validity of (4) is obvious.

It is interesting to consider the pointcentrality concept for a point in terms of the number of paths involved in joining x to its reachable vertices. This consideration gives a connection between the functions h(x) and s(x). $|RP(x)| - 1$ is the total number of such paths, since exactly one path is required for each of the vertices in $RP(x) - \{x\}$. Let $z \in RP(x) - \{x\}$. The path from x to z determines a total of d(x,z) shortest-distance paths, so that $|RP(x)| - d(x,z) = |RP(x)| - 1 - (d(x,z) - 1)$ is the number of paths used to reach vertices from x, except the d(x,z) - 1 paths that lie on the shortest path from x to z. Thus h(x) assigns to z the weight of the path to z plus the number of all other paths from x to a vertex of $RP(x) - \{x\}$, except the d(x,z) - 1 paths lying on the path to z. s(x) utilizes different weights at each vertex z: it counts all the paths lying along the shortest path from x to z, of which there are exactly d(x,z). There is a contrast between the two functions. At any vertex z, each function includes the one path to z; then h(x) adds the number of paths that do not lie on the path from x to z, whereas s(x) adds the number of paths that do lie on the path to z. Clearly

$$h(x) + s(x) = \sum_{v} \bigl[|RP(x)| - d(x,v)\bigr] + \sum_{v} d(x,v) = |RP(x)|^2, \quad v \in RP(x).$$

Hence these two functions operate in a complementary way. This property shows that the Bavelas pointcentrality b(x) = S/s(x), where $S = \sum_x s(x)$, $x \in P(DG)$, operates in a rather satisfactory way, though it does not satisfy the requirements (1)-(4) mentioned before. Generally, the function b(x) has been applied to strongly connected digraphs only, and then the requirement (2) is somewhat meaningless.
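The construction of h(x) and its complementary relation to s(x) can be checked numerically. The following is a minimal sketch, assuming h(x) is the sum of |RP(x)| - d(x,v) over all v in RP(x) (with x itself included) and is 0 when x has no subordinates; the function names and the small example digraph are hypothetical.

```python
from collections import deque

def distances(lines, x):
    """Shortest directed path lengths from x to every reachable point (BFS)."""
    succ = {}
    for a, b in lines:
        succ.setdefault(a, []).append(b)
    dist = {x: 0}
    queue = deque([x])
    while queue:
        a = queue.popleft()
        for b in succ.get(a, []):
            if b not in dist:
                dist[b] = dist[a] + 1
                queue.append(b)
    return dist

def s(lines, x):
    """Total distance from x to the points of RP(x)."""
    return sum(distances(lines, x).values())

def h(lines, x):
    """Pointcentrality h(x): sum of |RP(x)| - d(x, v) over v in RP(x), or 0."""
    dist = distances(lines, x)
    size = len(dist)  # |RP(x)|, counting x itself
    if size < 2:
        return 0      # x has no subordinates
    return sum(size - d for d in dist.values())

# Hypothetical weakly connected digraph on the points a, b, c, d, e.
lines = [("a", "b"), ("b", "c"), ("a", "d"), ("e", "a")]

for x in "abcde":
    size = len(distances(lines, x))
    if size >= 2:
        # The identity h(x) + s(x) = |RP(x)|^2 holds whenever x has subordinates.
        assert h(lines, x) + s(lines, x) == size ** 2

print(h(lines, "a"), s(lines, "a"))  # 12 4
print(h(lines, "e"), s(lines, "e"))  # 17 8
```

Note that the identity is only checked for points with at least one subordinate, because h(x) is set to 0 by definition when RP(x) = {x}.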
Consider finally what kind of neighborhood a point x having the maximum value of h(x) on a digraph of p points must have. Now $h(x) = |RP(x)|^2 - \sum_v d(x,v)$, $v \in RP(x)$, and clearly h(x) obtains its maximum if $|RP(x)| = p$ and $\sum_v d(x,v) = p - 1$. Hence in a weakly connected digraph DG of p points, a point x has maximum centrality if DG has a wheel $W_p$ as its partial graph and x is the hub of the wheel.
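As a worked check of this maximum, write r = |RP(x)|. Every point other than x lies at distance at least one, so the distance sum is at least r - 1, and the bound below follows. The symbol r and the numerical case p = 5 are introduced here only for illustration.

```latex
h(x) \;=\; r^2 - \sum_{v \in RP(x)} d(x,v)
     \;\le\; r^2 - (r - 1)
     \;\le\; p^2 - (p - 1),
\qquad 1 \le r \le p,
\qquad\text{e.g. } p = 5:\quad 5^2 - 4 = 21 .
```

Equality requires r = p and every other point at distance exactly one from x, which is the hub of a wheel.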

2.3. A Centrality Index

Bavelas (1950) has defined the centrality index B(DG) of a digraph as follows: $B(DG) = \sum_x b(x)$, $x \in P(DG)$. Flament has shown that the minimum index of centrality is obtained if and only if b(x) = b(y) for every pair of points x, y in DG, and he concludes that the Bavelas index measures specifically the degree of disparity between the points of a digraph. But there is a simpler and more immediate scheme to measure that phenomenon, namely the dispersion, denoted D(DG),

$$D(DG) = \sum_{x} \bigl(h_0 - h(x)\bigr), \quad x \in P(DG), \quad \text{where } h_0 = \max_{x \in P(DG)} \{h(x)\}.$$

In a strongly connected digraph DG having p points the connection between the functions h(x) and s(x), derived above, implies that $h(x) = p^2 - s(x)$, whence

$$D(DG) = \sum_{x} \bigl(h_0 - h(x)\bigr) = \sum_{x} \bigl(s(x) - s_0\bigr), \quad \text{where } s_0 = \min_{x \in P(DG)} \{s(x)\}.$$
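The dispersion and its two equivalent forms in the strongly connected case can be illustrated with a short Python sketch. It assumes D(DG) is the sum of h_0 - h(x) with h_0 the maximum of h(x), and that s_0 is the minimum of s(x); the function names and the example digraphs are hypothetical.

```python
from collections import deque

def distances(lines, x):
    """Shortest directed path lengths from x to every reachable point (BFS)."""
    succ = {}
    for a, b in lines:
        succ.setdefault(a, []).append(b)
    dist = {x: 0}
    queue = deque([x])
    while queue:
        a = queue.popleft()
        for b in succ.get(a, []):
            if b not in dist:
                dist[b] = dist[a] + 1
                queue.append(b)
    return dist

def s(lines, x):
    return sum(distances(lines, x).values())

def h(lines, x):
    dist = distances(lines, x)
    size = len(dist)
    return 0 if size < 2 else sum(size - d for d in dist.values())

def dispersion(lines, points):
    """D(DG): sum over all points of h_0 - h(x), with h_0 the maximum of h."""
    values = [h(lines, x) for x in points]
    h0 = max(values)
    return sum(h0 - v for v in values)

# Hypothetical strongly connected digraph: a directed 4-cycle with one chord.
points = ["a", "b", "c", "d"]
cycle = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("a", "c")]

s_values = [s(cycle, x) for x in points]
s0 = min(s_values)
print(dispersion(cycle, points))       # 5
print(sum(v - s0 for v in s_values))   # 5, the equivalent form in terms of s(x)

# The wheel W_5 (hub "x"), which is only weakly connected.
wheel_points = ["x", "y", "z", "u", "w"]
wheel = [("x", v) for v in wheel_points[1:]]
print(dispersion(wheel, wheel_points))  # 84
```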

Clearly D(DG) obtains its maximum in a digraph of p points having a structure isomorphic to the wheel $W_p$. This does not, however, imply that D(DG), and the Bavelas index as well, measure the centrality of a digraph in the topological sense of the word.

3. ON THE CENTRALITY OF A DIGRAPH

3.1. Some General Lines

A centrality index C(DG) is a function defined on the family of weakly connected, finite digraphs depending on the pointcentrality c(x) and on the structure of DG. In the following we define some things which seem to be characteristic of the concept of structural centrality in a digraph. Only Sabidussi (1966) has given a definition for the structural centrality of a digraph, but his definitions seem to be appropriate only to strongly connected digraphs. There is agreement about the most centralized structure: it is the wheel $W_p$. The concept of a decentralized structure, however, is contradictory. The Bavelas index implies that the digraph $DG_1$ is as centralized as $DG_2$ (see Fig. 2), and according to the Beauchamp index, $DG_1$ is more centralized than $DG_2$, whereas Mackenzie (1966) considers the digraph $DG_1$ as the most decentralized digraph. According to Sabidussi's opinion, $DG_1$ is highly centralized, $DG_2$ hardly at all.

Especially in the case of weakly connected digraphs, a direct way of measuring the centrality of a digraph, like the sum of the pointcentralities, seems to be biased: in the case of ordinary pointcentralities the digraph $DG_1$ (see Fig. 2) is more centralized than the wheel $W_5$. The other direct ways give ambiguous results analogous to the previous one. The centrality of a digraph DG depends very closely on the local relations between the points in DG not measured by the pointcentrality. For that reason we introduce the concept of the local centrality of a point x in DG. This gives the basis for some general, descriptive properties of centrality in a digraph.

Fig. 2. Two decentralized digraphs of five points. A line without any direction between two points x and y implies the existence of the directed lines (x,y) and (y,x).

Let c(x) be the given pointcentrality. The local pointset of a point x in a digraph DG with respect to the pointcentrality c(x), denoted by LP(x,c(x)), is the following set of points:

$$LP(x,c(x)) = \{\, y : (x,y) \in L(DG) \text{ and } c(y) < c(x) \,\}.$$

The local centrality of a point x, denoted by LC(x), is

$$LC(x) = \sum_{y} \bigl(c(x) - c(y)\bigr), \quad y \in LP(x,c(x)) \ne \emptyset,$$

and if $LP(x,c(x)) = \emptyset$, then LC(x) = 0.

Let DG be a weakly connected digraph of p points. The sequence $(c(x_1),n_1), (c(x_2),n_2), (c(x_3),n_3), \ldots, (c(x_r),n_r)$ of pairs, where $c(x_1) > c(x_2) > \cdots > c(x_r)$, divides the pointset P(DG) into r point-disjoint, nonempty classes, where the class of pointcentrality $c(x_i)$ has $n_i$ points, i = 1, ..., r. The centrality of the whole digraph depends more closely on the local centralities of the points in the classes of large pointcentrality $c(x_i)$ than on those of small pointcentrality. Hence we require the following. If the local centrality of a point y in the class determined by $c(x_u)$ increases by an amount a, the local centrality of a point z in the class determined by $c(x_t)$ decreases by the same amount a, and the other local centralities remain unaltered, then the centrality of DG increases if $c(x_u) > c(x_t)$ and decreases if $c(x_u) < c(x_t)$. Further, if the number $n_1$ increases and the local centralities remain unaltered in DG, the centrality of DG increases, and similarly, if the number $n_1$ and the local centralities remain unaltered but $c(x_1)$ increases, then the centrality of DG increases. The last requirement implies that the digraph $DG_1$ is more centralized than $DG_2$ (see Fig. 2). Note that the requirements above do not imply the comparability of each two digraphs with respect to a measure satisfying the requirements. Hence the result given by a measure is an approximation in many cases.

3.2. A Centrality Index and the Local Centrality

Let h(x) be the given pointcentrality, DG a weakly connected digraph of p points, and $(h_1,n_1), (h_2,n_2), (h_3,n_3), \ldots, (h_r,n_r)$ the sequence of pairs dividing P(DG) into r classes having the same value $h_i$ of pointcentrality, i = 1, ..., r. We define the centrality index, denoted by H(DG), as follows:

$$H(DG) = \sum_{x \in P(DG)} \Bigl[ LC(x) \Big/ \sum_{i=1}^{i(x)} n_i \Bigr] + \tfrac{1}{3}\bigl[h_1/(p+1-n_1)\bigr],$$

where $i(x) = i_0$ when $h(x) = h_{i_0}$, and $x \in P(DG)$. The coefficient before the term $[h_1/(p+1-n_1)]$ depends on the pointcentrality used, and in the case c(x) = h(x) the value 1/3 gives results which are consistent with the requirements above and with our intuition about centrality in a digraph.

Clearly, if $LC(x_u)$ increases by an amount a, $LC(x_t)$ decreases by the same amount, and all other local centralities remain unaltered, then the centrality of DG increases if $h(x_u) > h(x_t)$, since $\sum_{i=1}^{i(x_t)} n_i > \sum_{i=1}^{i(x_u)} n_i$, and it decreases if $h(x_u) < h(x_t)$ for a similar reason. If the local centralities remain unaltered and the number $n_1$ increases, $p+1-n_1$ decreases and hence H(DG) increases. Similarly, if $h_1$ increases, the increase of the term $h_1/(p+1-n_1)$ implies the increase of H(DG). Thus the index H(DG) satisfies the descriptive requirements of the previous section. In terms of Flament (1963), H(DG) measures first the degree of disparity between the points connected by a line of length one, and second the relative rate of the points having the maximum value $h_1$ of h(x) in DG multiplied by $h_1$.

Consider finally the centrality of some specified, weakly connected digraphs of five points measured by H(DG) (see Fig. 3). In $W_5$, RP(u), RP(w), RP(z), RP(y) contain only the points u, w, z, y, respectively, and hence h(u) = h(w) = h(z) = h(y) = 0. RP(x) = {x, y, z, u, w}, |RP(x)| = 5, and thus h(x) = 4(5 - 1) + 5 = 21. The classification of $P(W_5)$ is determined by the sequence (21,1), (0,4). Further, $LP(y) = LP(z) = LP(u) = LP(w) = \emptyset$, implying LC(y) = LC(z) = LC(u) = LC(w) = 0, but LP(x) = {y, z, u, w} and thus LC(x) = 4 · 21 = 84. According to the formula for H(DG), $H(W_5) = 84 + (1/3) \cdot (21/(6-1)) = 85.4$. For the other digraphs of Fig. 3 we do not follow the calculations throughout, but give only the value of H(DG). We obtain: $H(DG_1) = 40.1$, $H(DG_2) = 15.4$, $H(DG_3) = 14.6$, $H(DG_4) = 9.8$, $H(DG_5) = 9.8$, $H(DG_6) = 7.0$, $H(DG_7) = 5.3$, $H(DG_8) = 5.3$, and $H(DG_9) = 3.5$. Further, the digraphs of three points in Fig. 4 have the centralities $H(DG_1) = 2.8$ and $H(DG_2) = 2.3$. Naturally the results above depend strongly on the pointcentrality used.
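The worked value for the wheel can be reproduced with a short Python sketch. It assumes the index has the form: sum over all points of LC(x) divided by n_1 + ... + n_{i(x)}, plus (1/3) · h_1/(p+1-n_1), which is a reading of the printed formula that agrees with the W_5 calculation in the text. All function names are hypothetical.

```python
from collections import deque

def distances(lines, x):
    """Shortest directed path lengths from x to every reachable point (BFS)."""
    succ = {}
    for a, b in lines:
        succ.setdefault(a, []).append(b)
    dist = {x: 0}
    queue = deque([x])
    while queue:
        a = queue.popleft()
        for b in succ.get(a, []):
            if b not in dist:
                dist[b] = dist[a] + 1
                queue.append(b)
    return dist

def h(lines, x):
    dist = distances(lines, x)
    size = len(dist)
    return 0 if size < 2 else sum(size - d for d in dist.values())

def local_centrality(lines, c, x):
    """LC(x): sum of c(x) - c(y) over lines (x, y) with c(y) < c(x); 0 if none."""
    return sum(c[x] - c[y] for a, y in lines if a == x and c[y] < c[x])

def centrality_index(lines, points, c, coeff=1 / 3):
    """Assumed form of H(DG) for a pointcentrality given as a dict c."""
    p = len(points)
    levels = sorted(set(c.values()), reverse=True)           # h_1 > h_2 > ...
    sizes = [sum(1 for x in points if c[x] == v) for v in levels]  # n_1, n_2, ...
    cumulative, total = [], 0
    for n in sizes:
        total += n
        cumulative.append(total)                             # n_1 + ... + n_i
    rank = {v: i for i, v in enumerate(levels)}
    first = sum(local_centrality(lines, c, x) / cumulative[rank[c[x]]]
                for x in points)
    return first + coeff * levels[0] / (p + 1 - sizes[0])

# The wheel W_5 with hub "x".
points = ["x", "y", "z", "u", "w"]
wheel = [("x", v) for v in points[1:]]
c = {x: h(wheel, x) for x in points}

print(c["x"], local_centrality(wheel, c, "x"))        # 21 84
print(round(centrality_index(wheel, points, c), 1))   # 85.4
```

Passing the pointcentrality as a dictionary keeps the index independent of how c(x) is computed, so another pointcentrality can be substituted without changing `centrality_index`.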
Fig. 3. A set of weakly connected digraphs with five points.

Fig. 4. Two digraphs of three points.

A very simple pointcentrality is the outdegree of a point, denoted by o(x). If we put c(x) = o(x) and use the formula of H(DG) to compute the values of the centrality index, denoted in this case by O(DG), we obtain the following results for the digraphs of Fig. 3: $O(W_5) = 16.3$, $O(DG_?) = 9.3$, $O(DG_?) = 4.2$, $O(DG_4) = 0.3$, $O(DG_9) = 1.2$, $O(DG_2) = 12.3$, $O(DG_?) = 5.7$, $O(DG_?) = 1.3$, $O(DG_?) = 1.0$, and $O(DG_?) = 0.7$. The coefficient of the term $o_1/(p+1-n_1)$, where $o_1$ is the maximum outdegree of DG, was 1/3. Note that the pointcentrality o(x) does not satisfy the requirements (1)-(4) for the pointcentrality proposed in Section 2.1. For applications and further information on the usage of the theoretical material of this paper we refer to the articles of Glanzer and Glaser (1959, 1961).
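The outdegree variant can be checked for the wheel in the same way. The sketch below assumes the index has the same form as H(DG), with local centralities divided by the cumulative class sizes and the last term weighted by 1/3; the function names and the two tiny digraphs used to illustrate the failure of requirement (3) are hypothetical.

```python
def outdegree(lines, points):
    """o(x): number of lines leaving each point."""
    deg = {x: 0 for x in points}
    for a, _ in lines:
        deg[a] += 1
    return deg

def centrality_index(lines, points, c, coeff=1 / 3):
    """Assumed form of the index for a pointcentrality given as a dict c."""
    p = len(points)
    levels = sorted(set(c.values()), reverse=True)
    sizes = [sum(1 for x in points if c[x] == v) for v in levels]
    cumulative, total = [], 0
    for n in sizes:
        total += n
        cumulative.append(total)
    rank = {v: i for i, v in enumerate(levels)}
    first = 0.0
    for x in points:
        # Local centrality of x with respect to c.
        lc = sum(c[x] - c[y] for a, y in lines if a == x and c[y] < c[x])
        first += lc / cumulative[rank[c[x]]]
    return first + coeff * levels[0] / (p + 1 - sizes[0])

# The wheel W_5 with hub "x".
points = ["x", "y", "z", "u", "w"]
wheel = [("x", v) for v in points[1:]]
o = outdegree(wheel, points)
print(round(centrality_index(wheel, points, o), 1))  # 16.3

# Adding a subordinate at distance two leaves o(x) unchanged,
# which is why the outdegree cannot satisfy requirement (3).
before = outdegree([("x", "y")], ["x", "y", "z"])
after = outdegree([("x", "y"), ("y", "z")], ["x", "y", "z"])
print(before["x"], after["x"])  # 1 1
```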

ACKNOWLEDGMENT

I wish to give my sincere thanks to the referees for their valuable suggestions and comments.

REFERENCES

Bavelas, A. (1950), "Communication patterns in task-oriented groups," Journal of the Acoustical Society of America 22, 725-730.

Beauchamp, M. (1965), "An improved index for centrality," Behavioral Science 10, 161-163.

Flament, C. (1963), Applications of graph theory to group structure, pp. 50-52, Prentice-Hall, Englewood Cliffs, N.J.

Glanzer, M., and Glaser, R. (1959), "Techniques for the study of group structure and behavior: I. Analysis of structure," Psychological Bulletin 56, 317-332.

Glanzer, M., and Glaser, R. (1961), "Techniques for the study of group structure and behavior: II. Empirical studies of the effects of structure in small groups," Psychological Bulletin 58, 1-27.

Harary, F., Norman, R. Z., and Cartwright, D. (1965), Structural Models: An Introduction to the Theory of Directed Graphs, Wiley, New York.

Mackenzie, K. (1966), "Structural centrality in communication networks," Psychometrika 31, 17-25.

Sabidussi, G. (1966), "The centrality index of a graph," Psychometrika 31, 581-603.